Fix unicode handling in git diff parsing.
Review Request #5524 — Created Feb. 21, 2014 and submitted — Latest diff uploaded
We still had some lingering type problems distinguishing between unicode and
bytes when parsing git diffs. This meant that diffs that included utf-8
characters would cause UnicodeDecodeErrors when type coercion tried to
interpret utf-8 encoded text as ascii. One example of this is the diff attached
to /r/3749/.
This change makes sure that during the diff parse, we're always treating the
text as bytes. While this is theoretically bad for non-utf-8 encodings, it's no
worse than what we've always had, and it means we can actually parse these diffs.
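As a rough illustration of the bytes-only approach described above, here is a minimal sketch in Python 3. It is not the actual parser code from this change: the function name parse_diff_git_header, the sample path, and the simplified header handling (no spaces or quoting in paths) are all assumptions made for the example. It first shows the ascii decode failing on utf-8 data, then parses the same line while keeping everything as bytes.

```python
# Hypothetical sketch, not the real Review Board parser.

def parse_diff_git_header(line):
    """Return (orig_name, new_name) as bytes from a 'diff --git' line.

    Assumes a simple header with no spaces or quoting in the paths.
    Returns None if the line is not a recognizable header.
    """
    if not isinstance(line, bytes):
        # Refuse text strings so bytes and unicode are never mixed.
        raise TypeError('diff lines must be bytes')

    prefix = b'diff --git '
    if not line.startswith(prefix):
        return None

    rest = line[len(prefix):].rstrip(b'\r\n')
    orig, sep, new = rest.partition(b' b/')
    if not sep or not orig.startswith(b'a/'):
        return None

    # Strip the 'a/' prefix; the 'b/' prefix was consumed by partition().
    return orig[2:], new


# A header line containing a utf-8 encoded non-ascii character (e-acute).
raw = 'diff --git a/docs/caf\u00e9.txt b/docs/caf\u00e9.txt\n'.encode('utf-8')

# Interpreting the utf-8 bytes as ascii fails, which is the kind of error
# the description mentions.
try:
    raw.decode('ascii')
except UnicodeDecodeError:
    ascii_failed = True
else:
    ascii_failed = False

# Staying in bytes avoids any decoding during the parse.
names = parse_diff_git_header(raw)
```

Because the filenames come back as bytes, any decoding for display would happen at a later, explicit step, where the encoding can be chosen deliberately.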
Loaded /r/3749/. Saw the page instead of a traceback. Verified that the raw
diff returned the right utf-8 encoded patch.