Add a diff parser for DiffX files.

Review Request #11765 — Created July 31, 2021 and submitted — Latest diff uploaded

Information

Review Board
release-4.0.x

This introduces DiffXParser, a new diff parser that parses DiffX files
using the pydiffx module. SCMTools can provide this as the diff parser
if they permit DiffX files to be used (whether they use DiffX natively
or look for a #diffx: header in get_parser()).
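
As a rough illustration of the header check mentioned above, the following
standalone sketch decides whether a payload looks like a DiffX file. The
names looks_like_diffx and choose_parser_name are hypothetical and are not
part of Review Board or pydiffx; a real SCMTool would return a parser
instance from get_parser() rather than a label.

```python
DIFFX_MAGIC = b'#diffx:'


def looks_like_diffx(data: bytes) -> bool:
    """Return whether the payload begins with a DiffX main header."""
    return data.startswith(DIFFX_MAGIC)


def choose_parser_name(data: bytes) -> str:
    """Pick a parser label for the payload (illustrative only).

    A real get_parser() implementation would construct and return a
    parser object here instead of a string.
    """
    return 'diffx' if looks_like_diffx(data) else 'native'


if __name__ == '__main__':
    print(choose_parser_name(b'#diffx: encoding=utf-8, version=1.0\n'))
    print(choose_parser_name(b'diff --git a/README b/README\n'))
```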

This parser first turns DiffX file contents into a series of DOM (DiffX
Object Model) objects, which are then turned into ParsedDiff objects.
The ParsedDiff objects are populated with all the information needed
to reconstruct the DiffX file perfectly (options, preambles, metadata,
and diff contents).

That reconstruction happens when downloading a raw diff. With normal
diff parsers, that reconstruction just builds up a byte string
containing the diff content for each file (which may include metadata or
special lines for that diff variant).

With DiffXParser, the content is instead built up using the parsed
information stored in extra_data, along with the actual diff content
(the content of a #...diff: section). It's assembled back into a DOM
and then serialized.
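
To make the parse-then-reassemble idea concrete, here is a toy round trip
written against the standard library only. It does not use pydiffx and is
not the code in this change: it flattens a DiffX-like payload into a list
of section dictionaries (rather than a real DOM) and serializes them back.
The function names and the sample payload are assumptions for illustration.

```python
import re

# "#" + nesting dots + section name + ":" + optional "key=value, ..." options.
HEADER_RE = re.compile(rb'^#(\.*)([a-z]+):\s*(.*)$')


def parse_sections(data: bytes) -> list:
    """Split DiffX-like bytes into a flat list of section dicts."""
    sections = []
    pos = 0
    while pos < len(data):
        end = data.index(b'\n', pos)
        match = HEADER_RE.match(data[pos:end])
        if match is None:
            raise ValueError('expected a section header at byte %d' % pos)
        dots, name, raw_options = match.groups()
        options = {}
        for pair in filter(None, raw_options.split(b', ')):
            key, value = pair.split(b'=', 1)
            options[key.decode('ascii')] = value.decode('ascii')
        pos = end + 1
        # Content sections declare a length; container sections do not.
        length = int(options.get('length', 0))
        sections.append({
            'level': len(dots),
            'name': name.decode('ascii'),
            'options': options,
            'content': data[pos:pos + length],
        })
        pos += length
    return sections


def serialize_sections(sections: list) -> bytes:
    """Rebuild the byte stream from the stored section information."""
    out = []
    for section in sections:
        options = dict(section['options'])
        if 'length' in options:
            # Recompute the length so edited content stays consistent.
            options['length'] = str(len(section['content']))
        header = '#%s%s:' % ('.' * section['level'], section['name'])
        if options:
            header += ' ' + ', '.join('%s=%s' % kv for kv in options.items())
        out.append(header.encode('ascii') + b'\n' + section['content'])
    return b''.join(out)


META = b'{"path": "file.txt"}\n'
DIFF = b'--- file.txt\n+++ file.txt\n@@ -1 +1 @@\n-old\n+new\n'
SAMPLE = b''.join([
    b'#diffx: encoding=utf-8, version=1.0\n',
    b'#.change:\n',
    b'#..file:\n',
    b'#...meta: format=json, length=%d\n' % len(META),
    META,
    b'#...diff: length=%d\n' % len(DIFF),
    DIFF,
])

if __name__ == '__main__':
    assert serialize_sections(parse_sections(SAMPLE)) == SAMPLE
```

Because the options, metadata, and diff content are all kept, the sample
serializes back to the same bytes. That is the property the description
relies on when a raw diff is downloaded.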

All the extracted data is also exposed to the API, SCMTool
implementations, and extensions. It's all stored in namespaces within a
diffx key in the extra_data fields in DiffSet (for the main DiffX
section), DiffCommit (for change sections), and FileDiff (for file
sections).

Along with this, a diff's encoding is extracted and stored in a
top-level encoding key in FileDiff.extra_data (an existing key that
aids in diff processing).
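
The layout described above might look roughly like the following plain
dictionaries. Only the diffx key and the top-level encoding key come from
this description; the namespace names inside diffx (options, preamble,
metadata) and the get_diffx_info helper are assumptions made for the
example, not the names the change necessarily uses.

```python
# Hypothetical extra_data contents for each model; namespace names inside
# 'diffx' are illustrative assumptions.
diffset_extra_data = {
    'diffx': {
        'options': {'encoding': 'utf-8', 'version': '1.0'},
    },
}

diffcommit_extra_data = {
    'diffx': {
        'preamble': 'Fix a typo in the README.\n',
        'metadata': {'author': 'Example Author'},
    },
}

filediff_extra_data = {
    # Top-level key, alongside (not inside) the 'diffx' namespaces.
    'encoding': 'utf-8',
    'diffx': {
        'metadata': {'path': 'README'},
    },
}


def get_diffx_info(extra_data: dict, namespace: str, default=None):
    """Look up one namespace under the 'diffx' key, if it is present."""
    return extra_data.get('diffx', {}).get(namespace, default)


if __name__ == '__main__':
    print(get_diffx_info(diffset_extra_data, 'options'))
    print(get_diffx_info(filediff_extra_data, 'preamble', default=''))
    print(filediff_extra_data['encoding'])
```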

Unit tests pass on Python 2 and 3.