• 
      

    Optimize and improve diff line-related operations, and get_lines().

    Review Request #11574 — Created April 5, 2021 and submitted

    Information

    ReviewBot
    release-3.0.x

    Reviewers

    We've had a long-standing TODO item for optimizing how line matching
    works. We previously had to do a full row scan of all chunks of a diff
    in order to find a line, for each line being commented on. This was
    slow, and it was often done several times per diff.

    The logic for the scanning was also repeated in various ways a few
    times, and was about to be repeated again.

    Part of this change redoes this logic, giving us a single function for
    iterating through lines, with an optional start line. That start line is
    found through a binary search of chunks, and a relative offset into the
    matching chunk's lines.

    The other parts introduce a new method and a fix to an existing method.

    The new get_lines() can be used to fetch a range of lines from the
    original or modified file. This will be useful for the shellcheck tool
    in an upcoming change.

    The _is_modified() function has been revised to consider a range of
    lines to be modified if any lines within it are modified. Previously,
    the entire range had to be considered modified, and this could lead to
    tools choosing not to comment on a line because some part of the range
    wasn't modified.

    All unit tests pass on Python 2.7 and 3.x.

    Tested this code along with some new logic coming to shellcheck.

    Summary ID
    Optimize and improve diff line-related operations, and get_lines().
    We've had a long-standing TODO item for optimizing how line matching works. We previously had to do a full row scan of all chunks of a diff in order to find a line, for each line being commented on. This was slow, and it was often done several times per diff. The logic for the scanning was also repeated in various ways a few times, and was about to be repeated again. Part of this change redoes this logic, giving us a single function for iterating through lines, with an optional start line. That start line is found through a binary search of chunks, and a relative offset into the matching chunk's lines. The other parts introduce a new method and a fix to an existing method. The new `get_lines()` can be used to fetch a range of lines from the original or modified file. This will be useful for the shellcheck tool in an upcoming change. The `_is_modified()` function has been revised to consider a range of lines to be modified if any lines within it are modified. Previously, the entire range had to be considered modified, and this could lead to tools choosing not to comment on a line because some part of the range wasn't modified.
    fb90c8b886eb8d8b37a4f050b3178d1a774d100a
    Description From Last Updated

    F401 'bisect' imported but unused

    reviewbotreviewbot
    Checks run (1 failed, 1 succeeded)
    flake8 failed.
    JSHint passed.

    flake8

    david
    1. Ship It!
    2. 
        
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-3.0.x (fb3a33d)