    Always treat diffs in commits as byte strings.

    Review Request #9148 — Created Aug. 23, 2017 and submitted

    Information

    Review Board
    release-2.5.x
    51ddcbd...

    Commit.diff used to accept either Unicode or byte strings, and some
    code expected a Unicode string, unconditionally encoding the contents
    as a UTF-8 byte string. If a hosting service set the diff as a byte
    string and the diff contained non-ASCII content, that encode step
    could crash.

    Now, Commit.diff always stores a byte string, handling the encoding
    from Unicode if needed. Callers can and should now always treat this as
    a byte string.
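
The normalization described above can be sketched as a property setter that converts on assignment. This is a minimal illustration written for Python 3 (where `str` is Unicode text), not the actual change to reviewboard/scmtools/core.py: the class body, the `TypeError` for other types, and the constructor signature are assumptions made for the example.

```python
class Commit(object):
    """Hypothetical stand-in for a commit record (not Review Board's class)."""

    def __init__(self, id='', diff=b''):
        self.id = id
        # Goes through the property setter below, so the stored value
        # is always bytes regardless of what the caller passed in.
        self.diff = diff

    @property
    def diff(self):
        return self._diff

    @diff.setter
    def diff(self, value):
        if isinstance(value, str):
            # Unicode text: encode exactly once, here, as UTF-8.
            value = value.encode('utf-8')
        elif not isinstance(value, bytes):
            # Assumption: anything else is rejected rather than coerced.
            raise TypeError('diff must be bytes or str, not %s'
                            % type(value).__name__)

        self._diff = value


# Callers can pass either type and always read back bytes.
from_text = Commit(diff='caf\u00e9')
from_bytes = Commit(diff=b'caf\xc3\xa9')
assert from_text.diff == from_bytes.diff
```

Because the conversion happens in one place, code that reads the diff never needs to encode it again, which is the step that previously crashed.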

    This fixes a crash when posting existing commits from Bitbucket with
    Unicode content.

    Unit tests pass.

    Manually tested that the diff content is no longer improperly
    re-encoded and that the customer's case no longer crashes.


    david
    reviewboard/scmtools/core.py (Diff revision 1)

      Can we emit a warning in this case? We really shouldn't be passing in unicode to this function.

      1. We do in GitHub and rbgateway. I want to clean those up separately, and then I'm fine making this a warning.
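
For reference, the warning being discussed could look roughly like the sketch below. It is written for Python 3, and the helper name `to_diff_bytes`, the warning category, and the message are all hypothetical; nothing here is taken from the actual change.

```python
import warnings


def to_diff_bytes(value):
    """Return value as bytes, warning when the caller passed Unicode text.

    Hypothetical helper, named for illustration only.
    """
    if isinstance(value, str):
        # stacklevel=2 attributes the warning to the caller, which is
        # the code that still needs to be cleaned up.
        warnings.warn('diff was passed as a Unicode string; pass bytes '
                      'instead.', RuntimeWarning, stacklevel=2)
        return value.encode('utf-8')

    return value
```

Callers that already pass bytes are unaffected, so the warning would only surface in the code paths that have not been cleaned up yet.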

    david
    1. Ship It!
    chipx86
    Review request changed
    Status: Completed
    Change Summary: Pushed to release-2.5.x (23fbc2f)