    Fix a Base64Field regression on Postgres leading to bad stored data.

    Review Request #11341 — Created Dec. 21, 2020 and submitted

    Information

    Djblets
    release-2.0.x

    Reviewers

    Base64Field in Djblets 2.0 was updated for Python 2/3 string
    compatibility, but it ended up returning the wrong string type when
    preparing data for the database. It generated Base64 content, which
    comes in the form of a byte string, and returned that for storage.
    SQLite and MySQL handled this fine, but Postgres stored a literal \x
    in its place (the two characters backslash and x, not an escape
    sequence). Loading that data again then failed with an error about
    invalid Base64 content.
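
To make the failure mode concrete, here is a minimal sketch using only Python's standard base64 module, not the Djblets code. BAD_STORED_VALUE and try_decode() are hypothetical names chosen for illustration; the stored value is taken from the description above.

```python
import base64
import binascii

# On Python 3, Base64 encoding always yields a byte string.
encoded = base64.b64encode(b'hello')
assert isinstance(encoded, bytes)

# Hypothetical stand-in for the value read back from Postgres, per the
# description: a backslash followed by "x" (two characters), not an
# escape sequence.
BAD_STORED_VALUE = '\\x'


def try_decode(stored):
    """Return the decoded bytes, or None if the content is not valid Base64."""
    try:
        return base64.b64decode(stored)
    except (binascii.Error, TypeError):
        return None
```

Here try_decode(BAD_STORED_VALUE) returns None because the decode raises, which corresponds to the "invalid Base64 content" error seen on load.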

    The correct thing to do is to return a Unicode string. On Python 2,
    either byte strings or Unicode strings work as a return value, so
    long as the database backend can handle the content. On Python 3, the
    expected string type seems to depend heavily on the backend. Since
    we're storing pure ASCII content (a Base64-encoded value), a Unicode
    string is the safest option.
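
The idea of returning a Unicode string can be sketched as follows. This is an illustration for Python 3 only, under the assumption that the raw value arrives as bytes or text; prepare_for_db() is a hypothetical helper, not the field's actual method.

```python
import base64


def prepare_for_db(value):
    """Encode a raw value as Base64 and return it as a Unicode string.

    Hypothetical helper for illustration; not the Djblets implementation.
    """
    if value is None:
        return None

    if isinstance(value, str):
        # Assumption: text input is encoded as UTF-8 before Base64 encoding.
        value = value.encode('utf-8')

    # b64encode() returns bytes. The Base64 alphabet is pure ASCII, so
    # decoding as ASCII is always safe and yields a Unicode string.
    return base64.b64encode(value).decode('ascii')
```

Because the result is always a str, the database backend never has to decide how to adapt a byte string.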

    Djblets 2.0 has already gone live, and some customers testing an
    install of Review Board 4.0 hit breakage due to this bug, so we need
    to stay compatible with the bad stored data. We now look for it when
    pulling from the database and, if we see it, turn it back into an
    empty string.
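
A minimal sketch of that compatibility check, assuming the bad value is exactly the two-character string described earlier. normalize_stored_value() and BAD_EMPTY_VALUE are hypothetical names; the real check in Djblets may differ.

```python
# Assumed bad value written by the regression: backslash followed by "x".
BAD_EMPTY_VALUE = '\\x'


def normalize_stored_value(stored):
    """Return the stored Base64 text, mapping the known-bad value to ''.

    Hypothetical helper for illustration; not the Djblets implementation.
    """
    if stored is None:
        return None

    if isinstance(stored, bytes):
        # Some backends may hand back bytes; Base64 content is pure ASCII.
        stored = stored.decode('ascii')

    if stored == BAD_EMPTY_VALUE:
        # Data stored by the regression: treat it as an empty string.
        return ''

    return stored
```

Valid Base64 content passes through unchanged, so the check only affects rows written while the bug was present.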

    Unit tests pass on all versions of Python.

    Manually tested the failure condition (storing an empty byte string) on
    SQLite, MySQL, and Postgres. Verified it only failed on Postgres.

    Manually tested the fix on all three databases, along with the
    workaround for loading the previously stored bad data.

    Summary: Fix a Base64Field regression on Postgres leading to bad stored data.
    ID: 54aa6c601304e22974a641b5219249c338ddac91
    david
    Ship It!
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-2.0.x (5acdba5)