Deprecate and replace DiffParser.get_orig_commit_id().

Review Request #11757 — Created July 29, 2021 and submitted

Review Board

DiffParser.get_orig_commit_id() was an old hack originally intended
for solving a complex issue with parent diffs for Mercurial. To help
understand it, we need to dive into the history of how parent diffs,
source revisions, and file lookups worked.

Originally, when parsing a diff + parent diff, we'd set the source
revision for a FileDiff to that of the source of the file for the
parent diff, enabling us to look up the right soure file. We'd then
apply the parent diff and then the diff on top of it. If a file didn't
exist in a parent diff, we could just use the source revision in the
main diff, since we could assume it would be present in the repository.

The problem at that point was that, if the SCM used a commit revision to
look up files, and the file wasn't present in the parent diff, we
wouldn't have a suitable revision to use. To address this,
DiffParser.get_orig_commit_id() was introduced. This would allow the
diff parser to capture the parent revision, and then we could use that
captured revision from the parent diff as the source revision for files.

More recently, we stopped assigning a parent diff's source revision to
the FileDiff, and instead preserved the main diff's source revisions.
We'd then store parent diff revisions and filenames in extra_data.
This was part of an effort to fix some file lookup issues and file
matching involving parent diffs. However, we still needed
get_orig_commit_id() to get those parent commit IDs, and we were still
storing them as the FileDiff's source revision.

That brings us to this change.

Now that we have the new parsed diff objects (ParsedDiff and
ParsedDiffChange), we have a new avenue for storing useful information
during parsing. This change adds commit_id and parent_commit_id to
ParsedDiffChange, which can be optionally set by the parser. It also
adds a capability flag, uses_commit_ids_as_revisions, to ParsedDiff,
which can also be set.

If this flag is set, and commit ID information is recorded, then it will
be used as the parent source revision when building the FileDiffs.
This works just like get_orig_commit_id(), except we're no longer
storing the value in extra_data like other parent source revisions.

get_orig_commit_id() is now considered deprecated, scheduled to be
removed in Review Board 5.0. DiffParser will check for a value after
parsing and, if found, will set it and the
uses_commit_ids_as_revisions flag, emitting a deprecation warning in
the process.

This is a big cleanup that addresses one of the worst remnants of our
older diff parsing code, and opens the door to allowing a DiffX parser
to provide the same parent revision behavior.

Unit tests passed.

Tested posting a Mercurial diff with a parent diff. Verified that the
result was correct and contained the right revision metadata. This test
also instrumented the compatibility for get_orig_commit_id(), since
the Mercurial implementation has not been updated.

Deprecate and replace DiffParser.get_orig_commit_id().
  1. Ship It!
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-4.0.x (b972f20)