
    Add a utility parser for hunks in a diff.

    Review Request #11740 — Created July 21, 2021 and submitted

    Information

    DiffX
    master

    Reviewers

    This introduces pydiffx.utils.unified_diffs, which contains a
    get_unified_diff_hunks method. This method iterates through a byte
    string, returning a list of information on each hunk found in the
    string, up until either the end of the string or the first occurrence of
    something other than a hunk.

    The following general information is returned:

    • The number of lines from the provided list of lines that have been
      processed to return hunk data.
    • The total numbers of inserts and deletes found across all hunks.

    For each hunk:

    • The number of lines of context before/after the changed lines in the
      hunk.
    • The header context (usually a function/class after the @@ ... @@).

    For each side (original/modified) of each hunk:

    • The 0-based line number in the file where the start of the hunk
      should map to.
    • The number of lines in the file represented by the hunk.
    • The 0-based line numbers in the file where the first and last change
      in the hunk occurs.
    • The total number of lines changed in the hunk.
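
    As a rough illustration of the per-hunk and per-side data listed
    above, here is a minimal standalone sketch. It does not use or
    reproduce the pydiffx implementation: summarize_hunk, _parse_side, and
    all dictionary keys are hypothetical names chosen for this example,
    and the treatment of a zero-length side is an assumption.

```python
import re

# "@@ -start[,count] +start[,count] @@ optional header context"
HUNK_HEADER_RE = re.compile(
    br'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$')


def _parse_side(start, count):
    """Build the per-side record from the header's start/count fields."""
    count = 1 if count is None else int(count)  # An omitted count means 1.
    start = int(start)

    # Headers are 1-based. Assumption: for a zero-length side the header
    # names the preceding line, so the 0-based position is unchanged.
    if count > 0:
        start -= 1

    return {'start': start, 'num_lines': count, 'first_changed': None,
            'last_changed': None, 'num_changed': 0}


def summarize_hunk(header, body):
    """Summarize one hunk from its header line and its body lines (bytes)."""
    m = HUNK_HEADER_RE.match(header)

    if m is None:
        raise ValueError('Not a hunk header: %r' % (header,))

    sides = {
        'orig': _parse_side(m.group(1), m.group(2)),
        'modified': _parse_side(m.group(3), m.group(4)),
    }
    pos = {name: side['start'] for name, side in sides.items()}
    context_before = context_after = 0

    for line in body:
        prefix = line[:1]

        if prefix == b' ':
            # Context lines advance both sides.
            pos['orig'] += 1
            pos['modified'] += 1

            if sides['orig']['num_changed'] or sides['modified']['num_changed']:
                context_after += 1
            else:
                context_before += 1

            continue
        elif prefix == b'-':
            name = 'orig'
        elif prefix == b'+':
            name = 'modified'
        else:
            raise ValueError('Unexpected line in hunk: %r' % (line,))

        side = sides[name]

        if side['first_changed'] is None:
            side['first_changed'] = pos[name]

        side['last_changed'] = pos[name]
        side['num_changed'] += 1
        pos[name] += 1
        context_after = 0  # Only trailing context counts as "after".

    return {'context': m.group(5), 'context_before': context_before,
            'context_after': context_after, 'orig': sides['orig'],
            'modified': sides['modified']}
```

    For a header of b'@@ -10,5 +10,6 @@ def foo():' the sketch maps both
    sides to 0-based line 9, and a removed line two lines into the hunk is
    reported at line 11 on the original side.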

    This can be told to ignore junk between headers, which is helpful for
    gathering stats across an entire diff file.

    It will raise a MalformedHunkError if a hunk is malformed (for example,
    if the hunk ends prematurely or contains garbage).
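
    The two behaviors above (skipping junk between hunks and rejecting
    malformed hunks) might be sketched as follows. This is an independent
    illustration, not the pydiffx code: scan_hunks, its ignore_junk
    parameter, the returned keys, and HunkFormatError (a stand-in for
    MalformedHunkError) are all hypothetical.

```python
import re

# Captures only the per-side line counts from a hunk header.
HEADER_RE = re.compile(br'^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@')


class HunkFormatError(Exception):
    """Hypothetical stand-in for the MalformedHunkError described above."""


def scan_hunks(lines, ignore_junk=False):
    """Count inserts and deletes across the hunks in a list of byte lines."""
    i = 0
    processed = 0
    inserts = deletes = 0

    while i < len(lines):
        m = HEADER_RE.match(lines[i])

        if m is None:
            if ignore_junk:
                i += 1  # Skip file headers or other non-hunk content.
                continue

            break  # Stop at the first thing that is not a hunk.

        # Lines still expected on each side; an omitted count means 1.
        orig_left, mod_left = (1 if g is None else int(g)
                               for g in m.groups())
        i += 1

        while orig_left > 0 or mod_left > 0:
            if i >= len(lines):
                raise HunkFormatError('Hunk ended prematurely')

            prefix = lines[i][:1]

            if prefix == b' ':
                orig_left -= 1
                mod_left -= 1
            elif prefix == b'-':
                orig_left -= 1
                deletes += 1
            elif prefix == b'+':
                mod_left -= 1
                inserts += 1
            elif prefix != b'\\':  # Allow "\ No newline at end of file".
                raise HunkFormatError('Garbage in hunk: %r' % (lines[i],))

            if orig_left < 0 or mod_left < 0:
                raise HunkFormatError('Hunk has more lines than declared')

            i += 1

        # A trailing "\ No newline at end of file" belongs to this hunk.
        if i < len(lines) and lines[i][:1] == b'\\':
            i += 1

        processed = i

    return {'num_processed_lines': processed, 'total_inserts': inserts,
            'total_deletes': deletes}
```

    Without ignore_junk, the scan stops at the first non-hunk line after a
    hunk, which suits parsing a single file's hunks. With it, the scan
    continues past file headers, which suits gathering totals across a
    whole diff.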

    This method is based on a similar method we have in Review Board, but
    with some improvements to parsing, strictness, and results. It will be
    used by the DiffX DOM class in an upcoming change to calculate stats
    for the generated DiffX file.

    Unit tests pass on Python 2 and 3.

    Built the docs and checked that they rendered and linked correctly.
    Checked for spelling errors.

    Summary ID
    Add a utility parser for hunks in a diff.
    e025a21b02008b0dedb8f929f6e3ec94ab43d76a
    Description From Last Updated

    E501 line too long (80 > 79 characters)

    reviewbot
    Checks run (1 failed, 1 succeeded)
    flake8 failed.
    JSHint passed.

    flake8

    david
    1. Ship It!
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to master (7c118b4)