So I was searching for a way to represent protobufs in a human-readable
format. After lots of googling I found that there is a built-in module called
text_format, which does just what I need: it converts protobufs to and from a
human-readable format that looks quite similar to JSON. But it is not actually
valid JSON, as the JSON-supported types don't match protobuf's, and the syntax
is slightly different. Pb text is fine for reading, but very few tools
actually support it. For example, if you need some kind of xpath analog to
search inside protobufs, you will be quickly disappointed, as there is no such
thing freely available (though on some forums Google developers mentioned that
they have one, but they can't or don't want to share it). So I decided to try
to convert protobuf to JSON myself.
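For reference, here is a minimal sketch of the text_format round trip mentioned above. It assumes the protobuf Python package is installed, and it uses FileDescriptorProto (a message type that ships with the library) purely as a stand-in for your own generated message class.

```python
from google.protobuf import descriptor_pb2, text_format

# Stand-in message: any generated protobuf class would work the same way.
msg = descriptor_pb2.FileDescriptorProto(name="demo.proto", package="demo")
msg.dependency.append("other.proto")

# Protobuf -> human-readable text. It resembles JSON but is not valid JSON.
text = text_format.MessageToString(msg)
print(text)

# Text -> protobuf: parse the text back into a fresh message.
parsed = text_format.Parse(text, descriptor_pb2.FileDescriptorProto())
print(parsed == msg)
```

The output is easy to read, but it has to be parsed with text_format itself, which is the tooling limitation described above.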
There are a bunch of not-so-popular pb<->json converters out there but, as it turned out, they all have the same bug related to handling an optional field with an empty repeated field inside. Here is what I mean:
message Bar {
  repeated int32 baz = 1;
}
message Foo {
  optional Bar bar = 1;
}
Even if baz contains 0 entries, it is still there, so bar should be present
too.
Those pb<->json converters do convert pb to json correctly, so Foo foo looks
like:
{
"bar" : {}
}
But when converting back, they just miss it. The repeated baz is represented
by a Python list, so if baz has no entries (baz == []) the converter has
nothing to copy into foo.bar, and protobuf will think that you didn't set
foo.bar at all.
So, if you convert pb->json->pb->json you will see:
{
}
Which indicates that protobuf just dropped your optional field (that should be set) with an empty repeated field inside.
In C, you have a has_field flag to mark that the field is present, so it is
pretty straightforward. But in Python there is no such field to set, and a
brief investigation into pb methods didn't reveal anything appropriate. But
after a bit of digging into the text_format sources I found a method called
SetInParent() that does the same thing the has_* field does in C. So if you do
foo.bar.SetInParent(), it will set the has_bar flag, and after converting
pb -> json -> pb -> json you will get:
{
"bar" : {}
}
Which is correct.
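The whole round trip can be sketched in Python as follows. This is an illustration under assumptions, not the code of any existing converter: pb_to_dict, dict_to_pb, build_foo_class and the mark_present flag are hypothetical names, the converter only handles singular messages and repeated scalars, and Foo and Bar are built at runtime so the snippet runs without protoc (normally you would import the generated classes instead).

```python
from google.protobuf import descriptor_pb2, descriptor_pool, message_factory
from google.protobuf.message import Message


def build_foo_class():
    """Build Foo/Bar at runtime (stand-in for protoc-generated classes)."""
    F = descriptor_pb2.FieldDescriptorProto
    fdp = descriptor_pb2.FileDescriptorProto(name="foo_bar_sketch.proto")
    bar = fdp.message_type.add(name="Bar")
    bar.field.add(name="baz", number=1, type=F.TYPE_INT32,
                  label=F.LABEL_REPEATED)
    foo = fdp.message_type.add(name="Foo")
    foo.field.add(name="bar", number=1, type=F.TYPE_MESSAGE,
                  type_name=".Bar", label=F.LABEL_OPTIONAL)
    pool = descriptor_pool.DescriptorPool()
    pool.AddSerializedFile(fdp.SerializeToString())
    desc = pool.FindMessageTypeByName("Foo")
    # Newer protobuf releases expose GetMessageClass; older ones GetPrototype.
    if hasattr(message_factory, "GetMessageClass"):
        return message_factory.GetMessageClass(desc)
    return message_factory.MessageFactory(pool).GetPrototype(desc)


def pb_to_dict(msg):
    """Hypothetical pb -> dict step; ListFields() yields only set fields."""
    result = {}
    for field, value in msg.ListFields():
        if isinstance(value, Message):
            result[field.name] = pb_to_dict(value)
        elif isinstance(value, (str, bytes, int, float, bool)):
            result[field.name] = value
        else:
            result[field.name] = list(value)  # repeated scalar field
    return result


def dict_to_pb(data, msg, mark_present):
    """Hypothetical dict -> pb step; mark_present toggles the fix."""
    for name, value in data.items():
        if isinstance(value, dict):
            sub = getattr(msg, name)
            if mark_present:
                sub.SetInParent()  # mark the submessage as set, even if empty
            dict_to_pb(value, sub, mark_present)
        elif isinstance(value, list):
            getattr(msg, name).extend(value)
        else:
            setattr(msg, name, value)
    return msg


Foo = build_foo_class()
original = Foo()
original.bar.SetInParent()           # bar is present, baz is empty
as_dict = pb_to_dict(original)

naive = dict_to_pb(as_dict, Foo(), mark_present=False)
fixed = dict_to_pb(as_dict, Foo(), mark_present=True)
print(pb_to_dict(naive))   # {}
print(pb_to_dict(fixed))   # {'bar': {}}
```

The only difference between the two runs is the SetInParent() call, and HasField("bar") can be used to check the result on each message.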