Fix unicode handling in git diff parsing.

Review Request #5524 — Created Feb. 21, 2014 and submitted

Information

Review Board
release-2.0.x
2d5688e...

Reviewers

We still had some lingering type problems distinguishing between unicode and
bytes when parsing git diffs. This meant that diffs that included UTF-8
characters would cause UnicodeDecodeErrors when type coercion tried to
interpret UTF-8 encoded text as ASCII. One example of this is the diff attached
to /r/3749/.
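
The failure mode can be reproduced in a few lines. This is a minimal sketch,
not Review Board's code: under Python 2, mixing a byte string with a unicode
object triggers an implicit ASCII decode, and the explicit decode('ascii')
below stands in for that coercion so the example also runs on Python 3. The
sample diff line and the helper name coerce_like_python2 are made up for
illustration.

```python
# -*- coding: utf-8 -*-
# Hypothetical diff line containing a non-ASCII character, as UTF-8 bytes.
diff_line = u'+caf\u00e9 = True\n'.encode('utf-8')


def coerce_like_python2(data):
    """Mimic Python 2's implicit bytes -> unicode coercion (ASCII codec)."""
    return data.decode('ascii')


try:
    coerce_like_python2(diff_line)
    error = None
except UnicodeDecodeError as e:
    # The ASCII codec rejects the first byte of the UTF-8 sequence (0xc3).
    error = e

print(type(error).__name__)
```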

This change makes sure that during the diff parse, we're always treating the
text as bytes. While this is theoretically bad for non-utf8 encodings, it's no
worse than we've always had, and makes it so we can actually parse these diffs.
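
As a rough illustration of the "always bytes" approach, the sketch below
splits a git diff into per-file chunks without ever decoding it. It is not the
parser touched by this change: split_git_diff is a hypothetical helper, the
sample diff is invented, and real diff headers need more careful handling than
a simple prefix match.

```python
def split_git_diff(diff_data):
    """Split a git diff into (header_line, chunk) pairs, all as bytes.

    Hypothetical helper: nothing is decoded, so the file's encoding never
    matters during the parse.
    """
    if not isinstance(diff_data, bytes):
        raise TypeError('diff_data must be bytes, not text')

    marker = b'diff --git '
    files = []

    # splitlines(True) keeps line endings so chunks can be rejoined exactly.
    for line in diff_data.splitlines(True):
        if line.startswith(marker):
            files.append([line.rstrip(b'\r\n'), b''])

        if files:
            files[-1][1] += line

    return [(header, chunk) for header, chunk in files]


sample = (b'diff --git a/menu.txt b/menu.txt\n' +
          u'+caf\u00e9\n'.encode('utf-8'))
parsed = split_git_diff(sample)
```

Because every literal in the helper is a bytes literal, no implicit coercion
can occur, and the chunk handed back is byte-for-byte what was passed in. Any
decoding for display can then happen once, at the edge, with an explicit
encoding.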

Loaded /r/3749/. Saw the page instead of a traceback. Verified that the raw
diff returned the right utf-8 encoded patch.

chipx86
  1. Ship It!

david
Review request changed
Status:
Completed
Change Summary:
Pushed to release-2.0.x (95598e4).