• 
      

    Expand diff parsing for additional metadata and multiple changes.

    Review Request #11744 — Created July 26, 2021 and submitted

    Information

    Review Board
    release-4.0.x

    Reviewers

    Historically, our diff parsing infrastructure was built around a base
    set of manual line-by-line parsing rules, which could be specialized.
    The result would be a list of files. This was expanded a bit in 4.0 and
    subsequent releases to collect and handle additional data (such as being
    able to populate FileDiff.extra_data), but the core design remained.

    With DiffX coming, we need more. This change expands our diff parsing
    support.

    New ParsedDiff and ParsedDiffChange classes were added, which
    represent information on a diff as a whole, and a change within it,
    respectively. These both provide support for populating extra_data,
    which will be set in DiffSet and DiffCommit when parsing a diff.

    A new BaseDiffParser has been introduced as the new base class, which
    DiffParser subclasses. The new base class does very little, and is
    more of an interface. It contains no built-in parsing logic. This will
    allow for more specialized parsers with their own logic.

    BaseDiffParser and DiffParser now provide a parse_diff() method,
    which is the new method to call to parse a diff. This replaces
    parse(), which still exists in DiffParser.

    For new subclasses of BaseDiffParser, a parse_diff() implementation
    will be responsible for parsing into a series of ParsedDiff,
    ParsedDiffChange, and ParsedDiffFile objects.

    For subclasses of DiffParser, parse() can still be overridden. The
    new default DiffParser.parse_diff() will set up the new objects as
    parsed_diff and parsed_diff_change attributes.

    This new model will make it easier to implement DiffX, and will give us
    some good capabilities down the road. For the moment, there are still
    limitations based on our existing process of uploading and parsing
    diffs. The main one being that, while a ParsedDiff (and later a DiffX)
    can in theory have multiple changes, it's not actually allowed through
    our parsing. Clients of the API will need to provide diffs with only a
    single change, until we're able to implement the appropriate changes
    in the diff upload API and related code (probably post-4.0 sometime).

    Unit tests pass on Python 2 and 3.

    Tested posting diffs for review, without any issues.

    Summary ID
    Expand diff parsing for additional metadata and multiple changes.
    Historically, our diff parsing infrastructure was built around a base set of manual line-by-line parsing rules, which could be specialized. The result would be a list of files. This was expanded a bit in 4.0 and subsequent releases to collect and handle additional data (such as being able to populate `FileDiff.extra_data`), but the core design remained. With DiffX coming, we need more. This change expands our diff parsing support. New `ParsedDiff` and `ParsedDiffChange` classes were added, which represent information on a diff as a whole, and a change within it, respectively. These both provide support for populating `extra_data`, which will be set in `DiffSet` and `DiffCommit` when parsing a diff. A new `BaseDiffParser` has been introduced as the new base class, which `DiffParser` subclasses. The new base class does very little, and is more of an interface. It contains no built-in parsing logic. This will allow for more specialized parsers with their own logic. `BaseDiffParser` and `DiffParser` now provide a `parse_diff()` method, which is the new method to call to parse a diff. This replaces `parse()`, which still exists in `DiffParser`. For new subclasses of `BaseDiffParser`, a `parse_diff()` implementation will be responsible for parsing into a series of `ParsedDiff`, `ParsedDiffChange`, and `ParsedDiffFile` objects. For subclasses of `DiffParser`, `parse()` can still be overridden. The new default `DiffParser.parse_diff()` will set up the new objects as `parsed_diff` and `parsed_diff_change` attributes. This new model will make it easier to implement DiffX, and will give us some good capabilities down the road. For the moment, there are still limitations based on our existing process of uploading and parsing diffs. The main one being that, while a `ParsedDiff` (and later a DiffX) can in theory have multiple changes, it's not actually allowed through our parsing. Clients of the API will need to provide diffs with only a single change, until we're able to implement the appropriate changes in the diff upload API and related code (probably post-4.0 sometime).
    d5e07d83714ec06c81be11a876cdb4914f3412f1
    Description From Last Updated

    F821 undefined name 'parser'

    reviewbotreviewbot

    F821 undefined name 'parsed_diff_file'

    reviewbotreviewbot

    F401 'reviewboard.scmtools.core.Revision' imported but unused

    reviewbotreviewbot

    F821 undefined name 'parent_diff_change'

    reviewbotreviewbot
    Checks run (1 failed, 1 succeeded)
    flake8 failed.
    JSHint passed.

    flake8

    chipx86
    Review request changed
    Change Summary:
    • Removed a duplicate assertion.
    • Fixed a variable access.
    • Removed an unused import.
    Commits:
    Summary ID
    Expand diff parsing for additional metadata and multiple changes.
    Historically, our diff parsing infrastructure was built around a base set of manual line-by-line parsing rules, which could be specialized. The result would be a list of files. This was expanded a bit in 4.0 and subsequent releases to collect and handle additional data (such as being able to populate `FileDiff.extra_data`), but the core design remained. With DiffX coming, we need more. This change expands our diff parsing support. New `ParsedDiff` and `ParsedDiffChange` classes were added, which represent information on a diff as a whole, and a change within it, respectively. These both provide support for populating `extra_data`, which will be set in `DiffSet` and `DiffCommit` when parsing a diff. A new `BaseDiffParser` has been introduced as the new base class, which `DiffParser` subclasses. The new base class does very little, and is more of an interface. It contains no built-in parsing logic. This will allow for more specialized parsers with their own logic. `BaseDiffParser` and `DiffParser` now provide a `parse_diff()` method, which is the new method to call to parse a diff. This replaces `parse()`, which still exists in `DiffParser`. For new subclasses of `BaseDiffParser`, a `parse_diff()` implementation will be responsible for parsing into a series of `ParsedDiff`, `ParsedDiffChange`, and `ParsedDiffFile` objects. For subclasses of `DiffParser`, `parse()` can still be overridden. The new default `DiffParser.parse_diff()` will set up the new objects as `parsed_diff` and `parsed_diff_change` attributes. This new model will make it easier to implement DiffX, and will give us some good capabilities down the road. For the moment, there are still limitations based on our existing process of uploading and parsing diffs. The main one being that, while a `ParsedDiff` (and later a DiffX) can in theory have multiple changes, it's not actually allowed through our parsing. Clients of the API will need to provide diffs with only a single change, until we're able to implement the appropriate changes in the diff upload API and related code (probably post-4.0 sometime).
    049886e6eb19d624326f59a7432adf5aa8520b97
    Expand diff parsing for additional metadata and multiple changes.
    Historically, our diff parsing infrastructure was built around a base set of manual line-by-line parsing rules, which could be specialized. The result would be a list of files. This was expanded a bit in 4.0 and subsequent releases to collect and handle additional data (such as being able to populate `FileDiff.extra_data`), but the core design remained. With DiffX coming, we need more. This change expands our diff parsing support. New `ParsedDiff` and `ParsedDiffChange` classes were added, which represent information on a diff as a whole, and a change within it, respectively. These both provide support for populating `extra_data`, which will be set in `DiffSet` and `DiffCommit` when parsing a diff. A new `BaseDiffParser` has been introduced as the new base class, which `DiffParser` subclasses. The new base class does very little, and is more of an interface. It contains no built-in parsing logic. This will allow for more specialized parsers with their own logic. `BaseDiffParser` and `DiffParser` now provide a `parse_diff()` method, which is the new method to call to parse a diff. This replaces `parse()`, which still exists in `DiffParser`. For new subclasses of `BaseDiffParser`, a `parse_diff()` implementation will be responsible for parsing into a series of `ParsedDiff`, `ParsedDiffChange`, and `ParsedDiffFile` objects. For subclasses of `DiffParser`, `parse()` can still be overridden. The new default `DiffParser.parse_diff()` will set up the new objects as `parsed_diff` and `parsed_diff_change` attributes. This new model will make it easier to implement DiffX, and will give us some good capabilities down the road. For the moment, there are still limitations based on our existing process of uploading and parsing diffs. The main one being that, while a `ParsedDiff` (and later a DiffX) can in theory have multiple changes, it's not actually allowed through our parsing. Clients of the API will need to provide diffs with only a single change, until we're able to implement the appropriate changes in the diff upload API and related code (probably post-4.0 sometime).
    d25cf267a2964095fa3bde32795b19c1eba1cf26

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    chipx86
    david
    1. Ship It!
    2. 
        
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-4.0.x (e2b6dcc)