Expand diff parsing for additional metadata and multiple changes.
Review Request #11744 — Created July 26, 2021 and submitted
Historically, our diff parsing infrastructure was built around a base
set of manual line-by-line parsing rules, which could be specialized.
The result would be a list of files. This was expanded a bit in 4.0 and
subsequent releases to collect and handle additional data (such as being
able to populateFileDiff.extra_data
), but the core design remained.With DiffX coming, we need more. This change expands our diff parsing
support.New
ParsedDiff
andParsedDiffChange
classes were added, which
represent information on a diff as a whole, and a change within it,
respectively. These both provide support for populatingextra_data
,
which will be set inDiffSet
andDiffCommit
when parsing a diff.A new
BaseDiffParser
has been introduced as the new base class, which
DiffParser
subclasses. The new base class does very little, and is
more of an interface. It contains no built-in parsing logic. This will
allow for more specialized parsers with their own logic.
BaseDiffParser
andDiffParser
now provide aparse_diff()
method,
which is the new method to call to parse a diff. This replaces
parse()
, which still exists inDiffParser
.For new subclasses of
BaseDiffParser
, aparse_diff()
implementation
will be responsible for parsing into a series ofParsedDiff
,
ParsedDiffChange
, andParsedDiffFile
objects.For subclasses of
DiffParser
,parse()
can still be overridden. The
new defaultDiffParser.parse_diff()
will set up the new objects as
parsed_diff
andparsed_diff_change
attributes.This new model will make it easier to implement DiffX, and will give us
some good capabilities down the road. For the moment, there are still
limitations based on our existing process of uploading and parsing
diffs. The main one being that, while aParsedDiff
(and later a DiffX)
can in theory have multiple changes, it's not actually allowed through
our parsing. Clients of the API will need to provide diffs with only a
single change, until we're able to implement the appropriate changes
in the diff upload API and related code (probably post-4.0 sometime).
Unit tests pass on Python 2 and 3.
Tested posting diffs for review, without any issues.
- Change Summary:
-
- Removed a duplicate assertion.
- Fixed a variable access.
- Removed an unused import.
- Commits:
-
Summary ID 049886e6eb19d624326f59a7432adf5aa8520b97 d25cf267a2964095fa3bde32795b19c1eba1cf26 - Diff:
-
Revision 2 (+1164 -156)
- Change Summary:
-
- Improved the deprecation handling in
ParsedDiffFile.__init__
. - Added unit tests for
ParsedDiff
,ParsedDiffChange
, andParsedDiffFile
.
- Improved the deprecation handling in
- Commits:
-
Summary ID d25cf267a2964095fa3bde32795b19c1eba1cf26 d5e07d83714ec06c81be11a876cdb4914f3412f1 - Diff:
-
Revision 3 (+1320 -160)