Fix preserving extra headers and data in diffs.

Review Request #3344 — Created Sept. 23, 2012 and submitted


Review Board


Fix preserving extra headers and data in diffs.

A long-standing bug in our diffs, which especially affects Git users, is
that our diff parser would lose information before the diffs and even
lose some within the diffs.

Diffs formatted with 'git format-patch' contained headers on the
author, description, diffstats, and more, prior to the first
'diff --git' line. We just ignored this, meaning that users who then
tried to get their patch out of Review Board would find it all missing.

That has been changed in both the standard diff parser and Git's pretty
custom parser. We now store this "preamble" and make sure the first file
contains it at the beginning of its payload.

We also no longer hide the "Binary files have changed" part of a Git
diff, which was accidentally being left out.

As part of this, we're no longer trying so hard to remove stuff we're
afraid patch can't handle. patch can handle a lot of things, so we're
just going to include it all. The thought being that if some tool is
generating content in a diff, at some point they'd have to have tested
that with patch. If it turns out something breaks as a result of this,
we can incrementally bring back filtering.

Unit tests have been updated for this change, as many of them were
expecting not to see certain things, or for smaller payloads. A new unit
test testing all the headers in a standard git format-patch has also
been added.
Unit tests all pass.

I built a diff for this change using git format-patch and uploaded it
to my dev server manually. It parsed fine, viewed fine, and when I
downloaded the raw diff, it contained all the same data I uploaded.
Description From Last Updated

Might be nice to document what this returns now that it is 3 values. I know the calling code has …

SM smacleod
  2. reviewboard/scmtools/ (Diff revision 1)
    Show all issues
    Might be nice to document what this returns now that it is 3 values. I know the calling code has descriptive names, but the method itself is a little cryptic in the return statements.
  3. reviewboard/scmtools/ (Diff revision 1)
    This could be bracketed instead of the \ on each line. Same with the others below. There is the drawback that if a comma slips in you get a tuple though...
    1. Fair point. I don't love what we were doing before, but might as well start doing this now.
  1. Ship It!
  1. Ship It!
Review request changed