
    Always treat diffs in commits as byte strings.

    Review Request #9148 — Created Aug. 23, 2017 and submitted — Latest diff uploaded

    Information

    Review Board
    release-2.5.x
    51ddcbd...

    Commit.diff used to accept any Unicode or byte string it was given,
    and some code expected a Unicode string, unconditionally encoding the
    contents as a UTF-8 byte string. If a hosting service set the diff as
    a byte string, and that diff contained non-ASCII (Unicode) content,
    the unconditional encode could lead to a crash.

    Now, Commit.diff always stores a byte string, handling the encoding
    from Unicode if needed. Callers can and should now always treat this as
    a byte string.
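
As a rough illustration of the behavior described above, here is a minimal sketch of a property that normalizes its value to a byte string on assignment. The class below is a hypothetical stand-in written for this example, not Review Board's actual Commit implementation, and the choice of a property setter is an assumption about how such normalization might be done.

```python
class Commit(object):
    """Hypothetical stand-in for a commit record (illustration only)."""

    def __init__(self, diff=None):
        # Route the initial value through the setter so it is normalized.
        self.diff = diff

    @property
    def diff(self):
        """The diff contents, always a byte string (or None if unset)."""
        return self._diff

    @diff.setter
    def diff(self, value):
        # Encode Unicode text as UTF-8; leave byte strings untouched so
        # they are never re-encoded.
        if value is not None and not isinstance(value, bytes):
            value = value.encode('utf-8')

        self._diff = value


# Unicode input is encoded once; byte string input passes through as-is.
from_text = Commit(u'+caf\u00e9\n')
from_bytes = Commit(b'+caf\xc3\xa9\n')
```

With this shape, callers never need to check the type of the diff or encode it themselves, which is the source of the double-encoding problem described above.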

    This fixes a crash when posting existing commits from Bitbucket with
    Unicode content.

    Unit tests pass.

    Manually tested with the customer's case that the diff content is no
    longer improperly re-encoded and no longer causes a crash.