Summary

Add a diff parser for DiffX files.

Review Request #11765 — Created July 31, 2021 and submitted Aug. 3, 2021, 9:10 p.m.

Information

Owner

chipx86

Repository

Review Board

Branch

release-4.0.x

Reviewers

Groups

reviewboard

Description

This introduces DiffXParser, a new diff parser that parses DiffX
files using the pydiffx module. SCMTools can provide this as their diff
parser if they permit DiffX files, whether they use DiffX natively or
look for a #diffx: header in get_parser().
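
As a rough illustration of the header check mentioned above, a tool could test whether the uploaded content begins with the DiffX header before choosing a parser. This is only a sketch: `looks_like_diffx` and `pick_parser_name` are hypothetical helpers, and the returned labels are placeholders rather than real Review Board class names.

```python
# Every DiffX file starts with a "#diffx:" header line.
DIFFX_MAGIC = b'#diffx:'


def looks_like_diffx(data: bytes) -> bool:
    """Return whether the diff content begins with a DiffX header."""
    return data.startswith(DIFFX_MAGIC)


def pick_parser_name(data: bytes) -> str:
    """Return a label for the parser a tool might choose.

    The labels are placeholders for illustration only. A real
    get_parser() would return a parser instance instead.
    """
    return 'diffx' if looks_like_diffx(data) else 'legacy'
```

A tool that uses DiffX natively would skip the check and always return the DiffX parser.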

This parser first turns DiffX file contents into a series of DOM (DiffX
Object Model) objects, which are then turned into ParsedDiff objects.
The ParsedDiff objects are populated with all the information needed
to reconstruct the DiffX file perfectly (options, preambles, metadata,
and diff contents).

That reconstruction happens when downloading a raw diff. With normal
diff parsers, that reconstruction just builds up a byte string
containing the diff content for each file (which may include metadata or
special lines for that diff variant).

With DiffXParser, the content is instead built from the parsed
information stored in extra_data, along with the actual diff content
(the content of a #...diff: section). It's assembled back into a DOM and
then serialized.
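
The difference between the two reconstruction strategies can be sketched roughly as follows. Everything here is a simplified stand-in: `StoredFile`, `rebuild_plain`, and `rebuild_structured` are hypothetical, the `metadata` namespace name is assumed, and the serialized output is a made-up format rather than real DiffX syntax (the actual change goes through the pydiffx DOM).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StoredFile:
    """Hypothetical stand-in for a stored per-file diff record."""

    diff: bytes
    extra_data: Dict[str, Any] = field(default_factory=dict)


def rebuild_plain(files: List[StoredFile]) -> bytes:
    """Normal parsers: concatenate each file's stored diff bytes."""
    return b''.join(f.diff for f in files)


def rebuild_structured(files: List[StoredFile]) -> bytes:
    """DiffX-style: rebuild a tree from stored info, then serialize it.

    The "meta key=value" lines are invented for this sketch and are
    NOT the real DiffX wire format.
    """
    # Step 1: reassemble a tree of nodes from the stored information.
    tree = [
        {
            'meta': f.extra_data.get('diffx', {}).get('metadata', {}),
            'diff': f.diff,
        }
        for f in files
    ]

    # Step 2: serialize the tree back to bytes.
    out = []
    for node in tree:
        for key in sorted(node['meta']):
            line = 'meta %s=%s\n' % (key, node['meta'][key])
            out.append(line.encode('utf-8'))
        out.append(node['diff'])

    return b''.join(out)
```

The point of the second approach is that the output depends on the stored structured data, not only on the raw diff bytes, so nothing outside the diff hunks is lost.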

All the extracted data is also exposed to the API, SCMTool
implementations, and extensions. It's all stored in namespaces within a
diffx key in the extra_data fields in DiffSet (for the main DiffX
section), DiffCommit (for change sections), and FileDiff (for file
sections).
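
For extension or SCMTool authors, reading that data might look something like the sketch below. It uses plain dictionaries in place of the real models, and the namespace names (`options`, `metadata`) and the `get_diffx_info` helper are assumptions for illustration; only the `diffx` key and the section-to-model mapping come from the description above.

```python
from typing import Any, Dict

# Which model's extra_data holds which DiffX section, per the description.
SECTION_TO_MODEL = {
    'main': 'DiffSet',
    'change': 'DiffCommit',
    'file': 'FileDiff',
}


def get_diffx_info(extra_data: Dict[str, Any], namespace: str) -> Any:
    """Return one namespace from the ``diffx`` key, or None if absent."""
    return extra_data.get('diffx', {}).get(namespace)


# The namespace names below are assumed, not taken from the change.
example_extra_data = {
    'diffx': {
        'options': {'encoding': 'utf-8'},
        'metadata': {'path': 'README'},
    },
}
```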

Along with this, a diff's encoding is extracted and stored in a
top-level encoding key in FileDiff.extra_data (an existing,
established key that aids in diff processing).
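
One way a consumer could make use of that key is to decode the stored diff bytes with it, falling back to a default when it is missing. `decode_diff` is a hypothetical helper, and the UTF-8 fallback is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Dict


def decode_diff(diff: bytes, extra_data: Dict[str, Any],
                default: str = 'utf-8') -> str:
    """Decode diff bytes using the top-level ``encoding`` key, if set."""
    encoding = extra_data.get('encoding') or default
    return diff.decode(encoding)
```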

Testing Done

Unit tests pass on Python 2 and 3.

Commits

Summary	ID
Add a diff parser for DiffX files.	6fa0dc48b6413fe8e163704166df793eb44e9955

flake8 passed.

JSHint passed.

Change Summary:


Moved the default normalize_diff_filename into BaseDiffParser.
Bumped the pydiffx dependency to 1.0.

Commits:

	Summary	ID
	Add a diff parser for DiffX files.	6811473ad8925b9b3369eaf3f5d8ad5bb5dca363
	Add a diff parser for DiffX files.	6fa0dc48b6413fe8e163704166df793eb44e9955

Diff:

Revision 2 (+5424 -44)

	reviewboard/dependencies.py
	reviewboard/diffviewer/parser.py
	reviewboard/diffviewer/tests/test_diffx_parser.py

Checks run (2 succeeded)

flake8 passed.

JSHint passed.

Ship it!

Status: Completed
Change Summary: Pushed to release-4.0.x (cdd0a55)