Python library does not deserialise maps with extra serialised fields #20109

Open
@TrentHouliston

Description

The Python version of the protobuf deserialiser doesn't deserialise map fields properly if the serialised map entries contain additional fields.

Suppose you have two messages defined like so:

message A {
    map<int32, string> a = 2;
}

message B {
    message BB {
        optional int32 key = 1;
        optional string value = 2;
    }

    repeated BB a = 2;
}

According to the protobuf spec, these two messages should be encoded identically on the wire. However, in the Python decoder at least, the behaviour differs if the entry message carries an additional field, like:

message C {
    message CC {
        optional int32 key = 1;
        optional string value = 2;
        optional int32 other = 3;
    }

    repeated CC a = 2;
}

When data serialised as C is decoded as A, the other field should simply be ignored, as it would be when decoding as B. Instead, the entire key/value pair is not decoded and is silently dropped.
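To make the wire data concrete, here is a minimal standard-library sketch that hand-encodes one entry with and without the extra field. The helper names (`encode_varint`, `tag`, `encode_entry`, `encode_repeated_field`) are hypothetical and not part of the protobuf package, the sample values are made up, and the outer field number 2 is taken from message C. It only handles non-negative integers.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """Encode a field tag: (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_entry(key: int, value: str, other=None) -> bytes:
    """Encode key = 1 (varint), value = 2 (length-delimited), optional other = 3 (varint)."""
    raw = value.encode("utf-8")
    body = tag(1, 0) + encode_varint(key)
    body += tag(2, 2) + encode_varint(len(raw)) + raw
    if other is not None:
        body += tag(3, 0) + encode_varint(other)
    return body


def encode_repeated_field(field_number: int, entries) -> bytes:
    """Wrap each entry as a length-delimited field with the given number."""
    out = b""
    for entry in entries:
        out += tag(field_number, 2) + encode_varint(len(entry)) + entry
    return out


# Hypothetical sample data: one entry, 1 -> "one".
plain = encode_repeated_field(2, [encode_entry(1, "one")])
extra = encode_repeated_field(2, [encode_entry(1, "one", other=7)])

print(plain.hex())  # 1207080112036f6e65
print(extra.hex())  # 1209080112036f6e651807
```

The two byte strings differ only in the entry length prefix and the trailing `18 07` (field 3, value 7), so a decoder that skips unknown fields inside the entry would see the same key and value in both.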

The expected result is that each map entry is decoded as normal with the extra field ignored, since this would match the "maps are equivalent to a repeated virtual message" behaviour.
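The expected behaviour can be sketched with a small hand-rolled decoder that treats each map entry as a nested message and skips any field other than key and value. This is an illustration using only the standard library, not the protobuf package's implementation. The function names are hypothetical, the input bytes are a made-up entry (1 -> "one", plus field 3 = 7) under field number 2, and sign handling for negative int32 keys is omitted.

```python
def read_varint(buf: bytes, pos: int):
    """Read a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_fields(buf: bytes):
    """Yield (field_number, wire_type, value) for every field in buf."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 1:  # fixed 64-bit
            value = buf[pos:pos + 8]
            pos += 8
        elif wire_type == 5:  # fixed 32-bit
            value = buf[pos:pos + 4]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


def decode_map(buf: bytes, map_field_number: int) -> dict:
    """Decode a map<int32, string> field, ignoring unknown fields in each entry."""
    result = {}
    for number, wire_type, payload in read_fields(buf):
        if number != map_field_number or wire_type != 2:
            continue  # not a map entry
        key, value = 0, ""  # defaults if a field is absent
        for entry_number, entry_type, entry_value in read_fields(payload):
            if entry_number == 1 and entry_type == 0:
                key = entry_value
            elif entry_number == 2 and entry_type == 2:
                value = entry_value.decode("utf-8")
            # Any other field (such as "other = 3") is skipped, not fatal.
        result[key] = value
    return result


# Entry with an extra field 3, as message C would produce it.
data = bytes.fromhex("1209080112036f6e651807")
decoded = decode_map(data, 2)
print(decoded)  # {1: 'one'}
```

With this approach the extra field costs nothing but a skipped iteration, and the entry still lands in the resulting map.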

Encoding and decoding the same data using protoc displays the expected behaviour, so at the very least, the Python implementation is inconsistent with it.

What version of protobuf and what language are you using?
protoc --version: libprotoc 3.21.4
pip show protobuf: Version: 5.29.3

Language: Python

What operating system (Linux, Windows, ...) and version?
Linux/MacOS

What runtime / compiler are you using (e.g., python version or gcc version)
Python 3.10.5
