So I was searching for a way to represent protobufs in a human-readable format. After lots of googling I found that protobuf ships with a built-in module called text_format, which does just what I need: it converts protobufs to and from a human-readable format that looks quite similar to JSON. But it is not valid JSON, as the types JSON supports don’t match protobuf’s, and the syntax is slightly different. Pb text is fine for reading, but very few tools actually support it. For example, if you need some kind of XPath analog to search inside protobufs, you will be quickly disappointed, as there is no such thing freely available (though on some forums Google developers mentioned that they have one, but they can’t or don’t want to share it). So I decided to try converting protobuf to JSON myself.
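To give an idea of what text_format looks like in practice, here is a minimal sketch. It assumes the protobuf Python package is installed and, to avoid needing compiled classes, uses the bundled descriptor_pb2.DescriptorProto message as a stand-in for a real message type.

```python
from google.protobuf import descriptor_pb2, text_format

# Stand-in message (assumption): any generated protobuf class would work here.
msg = descriptor_pb2.DescriptorProto(name="Foo")
msg.field.add(name="bar", number=1)

# Protobuf -> human-readable text.
text = text_format.MessageToString(msg)
print(text)

# Human-readable text -> protobuf.
parsed = text_format.Parse(text, descriptor_pb2.DescriptorProto())
print(parsed == msg)
```

The printed text uses `name: value` pairs and braces for nested messages, which is why it resembles JSON without being parseable as JSON.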

There are a bunch of not-so-popular pb<->json converters out there but, as it turned out, they all have the same bug: they mishandle an optional message field whose only content is an empty repeated field. Here is what I mean:

message Bar {
  repeated int32 baz = 1;
}

message Foo {
  optional Bar bar = 1;
}

Even if baz has zero entries, a bar that has been set on foo is still present, and a converter should preserve that.

Those converters handle the pb -> json direction correctly, so a Foo foo with bar set and baz empty looks like:

{
  "bar" : {}
}

But when converting back, they just miss it. Repeated baz is represented by a Python list, so if baz has no entries (baz == []) the converter has nothing to write into foo.bar, and protobuf will think that you didn’t set foo.bar at all. So, if you convert pb->json->pb->json you will see:

{
}

Which indicates that protobuf just dropped your optional field (that should be set) with an empty repeated field inside.
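A minimal sketch of how this goes wrong, assuming the protobuf Python package is installed. The function naive_dict_to_pb is a hypothetical converter written for illustration, not code from any particular library, and descriptor_pb2.DescriptorProto stands in for Foo: its optional options submessage plays the role of bar, and that submessage contains only optional and repeated fields.

```python
from google.protobuf import descriptor_pb2


def naive_dict_to_pb(data, msg):
    """Hypothetical converter: copy a parsed JSON dict into msg."""
    for name, value in data.items():
        if isinstance(value, dict):
            # Nested message: recurse. If the dict is empty, the recursion
            # touches nothing, so the submessage is never marked as set.
            naive_dict_to_pb(value, getattr(msg, name))
        elif isinstance(value, list):
            # Repeated scalar field.
            getattr(msg, name).extend(value)
        else:
            setattr(msg, name, value)
    return msg


# The JSON {"options": {}} is the stand-in for {"bar": {}}.
foo = naive_dict_to_pb({"options": {}}, descriptor_pb2.DescriptorProto())
print(foo.HasField("options"))
```

Reading the submessage with getattr does not mark it as present, and the empty dict gives the loop nothing to assign, so the field is silently lost.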

In C++, the generated code keeps a has_* bit that marks the field as present, so it is pretty straightforward. But in Python there is no such field to set, and a brief investigation into the pb methods didn’t reveal anything appropriate. After a bit of digging into the text_format sources, though, I found a method called SetInParent() that does the same thing the has_* bit does in C++. So if you call foo.bar.SetInParent(), it will mark bar as present, and after converting pb -> json -> pb -> json you will get:

{
  "bar" : {}
}

Which is correct.
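Here is a minimal sketch of a converter that applies this, assuming the protobuf Python package is installed. The function dict_to_pb is a hypothetical helper for illustration, and descriptor_pb2.DescriptorProto again stands in for Foo, with its optional options submessage playing the role of bar.

```python
from google.protobuf import descriptor_pb2


def dict_to_pb(data, msg):
    """Hypothetical converter: copy a parsed JSON dict into msg."""
    for name, value in data.items():
        if isinstance(value, dict):
            child = getattr(msg, name)
            # Mark the submessage as present even if the dict is empty.
            child.SetInParent()
            dict_to_pb(value, child)
        elif isinstance(value, list):
            # Repeated scalar field.
            getattr(msg, name).extend(value)
        else:
            setattr(msg, name, value)
    return msg


# The JSON {"options": {}} is the stand-in for {"bar": {}}.
foo = dict_to_pb({"options": {}}, descriptor_pb2.DescriptorProto())
print(foo.HasField("options"))

# Presence also survives a binary round trip.
restored = descriptor_pb2.DescriptorProto.FromString(foo.SerializeToString())
print(restored.HasField("options"))
```

Calling SetInParent() before recursing means presence no longer depends on whether anything inside the submessage gets assigned.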