Improve our handling of file encodings with diffs.
Review Request #5161 — Created Dec. 24, 2013 and submitted
The way that we dealt with diff and file contents was broken with regard to
special encodings. This was made worse by the unicode transition; in
particular, I noticed that the diff validation resource would raise an
exception if the diff had unicode characters in it. Fixing that inspired me to
fix the rest of our encoding issues.

We now decode strings to unicode before manipulating them, and then encode them
back to bytes afterward using the same encoding that we used to decode. This
means that we're no longer splitting lines or doing regex replacements on bytes,
only on unicode objects.
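
The decode, manipulate, re-encode round trip described above can be sketched roughly as follows. This is only an illustration, not the code from this change: the helper names `decode_with_fallback` and `strip_trailing_whitespace` and the default encoding list are hypothetical, chosen to show the pattern.

```python
import re


def decode_with_fallback(data, encodings):
    """Try each candidate encoding in order.

    Returns (text, encoding_used). Hypothetical helper, not part of
    the actual change.
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue

    raise ValueError('Unable to decode data using any of: %s'
                     % ', '.join(encodings))


def strip_trailing_whitespace(data, encodings=('utf-8', 'iso-8859-15')):
    """Strip trailing spaces and tabs from each line of a byte string.

    The default encoding list is an assumption made for this sketch.
    """
    # Decode once, remembering which encoding succeeded.
    text, enc = decode_with_fallback(data, encodings)

    # Line splitting and regex replacement happen on unicode text,
    # never on the raw bytes.
    lines = [re.sub(r'[ \t]+$', '', line) for line in text.split('\n')]

    # Encode back to bytes with the same encoding used to decode.
    return '\n'.join(lines).encode(enc)
```

Re-encoding with the encoding that decoded successfully keeps bytes outside the edited regions unchanged, which matters when the result has to match the original file or diff byte for byte.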
- Ran unit tests.
- Uploaded a diff containing unicode text via the "New Review Request" form.
Description | From | Last Updated |
---|---|---|
Could be simplified to: `return self.encoding.split(',') or ['iso-8859-15']` | chipx86 | |