
    Optimize and improve diff line-related operations, and add get_lines().

    Review Request #11574 — Created April 5, 2021 and submitted — Latest diff uploaded

    Information

    ReviewBot
    release-3.0.x


    We've had a long-standing TODO item for optimizing how line matching
    works. Previously, finding a line meant scanning every row of every
    chunk in a diff, and that scan was repeated for each line being
    commented on. This was slow, and it was often done several times per
    diff.

    The logic for the scanning was also repeated in various ways a few
    times, and was about to be repeated again.

    Part of this change redoes this logic, giving us a single function for
    iterating through lines, with an optional start line. That start line is
    found through a binary search of chunks, and a relative offset into the
    matching chunk's lines.
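
    As a rough illustration of the lookup described above, here is a
    minimal sketch. It is not the code in this change. The chunk layout
    (a dict with a hypothetical "first_line" key and a "lines" list of
    consecutive lines) and the names find_chunk_index() and iter_lines()
    are assumptions made for the example.

```python
def find_chunk_index(chunks, line_num):
    """Binary search for the chunk containing line_num.

    Assumes (hypothetically) that chunks are sorted, do not overlap, and
    that each chunk is a dict with "first_line" and "lines" keys.
    Returns None if no chunk contains the line.
    """
    lo, hi = 0, len(chunks) - 1

    while lo <= hi:
        mid = (lo + hi) // 2
        first = chunks[mid]["first_line"]
        last = first + len(chunks[mid]["lines"]) - 1

        if line_num < first:
            hi = mid - 1
        elif line_num > last:
            lo = mid + 1
        else:
            return mid

    return None


def iter_lines(chunks, start_line=None):
    """Yield (line_num, text) pairs, optionally beginning at start_line."""
    start_index, offset = 0, 0

    if start_line is not None:
        start_index = find_chunk_index(chunks, start_line)

        if start_index is None:
            return

        # Relative offset into the matching chunk's lines.
        offset = start_line - chunks[start_index]["first_line"]

    for chunk in chunks[start_index:]:
        first = chunk["first_line"]

        for i in range(offset, len(chunk["lines"])):
            yield first + i, chunk["lines"][i]

        # Only the first chunk visited is entered part-way through.
        offset = 0
```

    With this layout, starting at a given line costs O(log n) in the
    number of chunks plus the lines actually yielded, rather than a scan
    from the top of the diff.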

    The other parts introduce a new method and a fix to an existing method.

    The new get_lines() can be used to fetch a range of lines from the
    original or modified file. This will be useful for the shellcheck tool
    in an upcoming change.
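
    To make the idea of fetching a range from either side of the diff
    concrete, here is a simplified sketch. The signature and the row
    layout (a tuple of original line number, original text, modified
    line number, modified text, with None where a line is absent on one
    side) are assumptions for illustration and may not match the method
    in this change. For brevity it scans rows linearly instead of using
    the binary search described earlier.

```python
def get_lines(rows, first_line, num_lines=1, original=False):
    """Return the text of a range of lines from one side of a diff.

    rows is a hypothetical list of
    (orig_line_num, orig_text, mod_line_num, mod_text) tuples, in file
    order. A line number is None when that row has no line on that side.
    """
    last_line = first_line + num_lines - 1
    result = []

    for orig_num, orig_text, mod_num, mod_text in rows:
        if original:
            num, text = orig_num, orig_text
        else:
            num, text = mod_num, mod_text

        if num is None or num < first_line:
            # Row is absent on this side, or before the requested range.
            continue

        if num > last_line:
            break

        result.append(text)

    return result
```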

    The _is_modified() method has been revised to consider a range of
    lines to be modified if any line within it is modified. Previously,
    every line in the range had to be modified, and this could lead to
    tools choosing not to comment on a line because some part of the
    range wasn't modified.
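
    The behavior change amounts to switching from an "all" check to an
    "any" check over the range. The sketch below shows both side by
    side. The function names and the modified_lines set are hypothetical
    stand-ins, not the actual implementation.

```python
def is_modified_any(modified_lines, first_line, num_lines=1):
    """New behavior: the range counts as modified if any line in it is."""
    return any(
        line in modified_lines
        for line in range(first_line, first_line + num_lines)
    )


def is_modified_all(modified_lines, first_line, num_lines=1):
    """Old behavior: every line in the range had to be modified."""
    return all(
        line in modified_lines
        for line in range(first_line, first_line + num_lines)
    )
```

    For example, with modified lines {5, 6}, a comment spanning lines
    4 through 6 is accepted by the first check and rejected by the
    second.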

    All unit tests pass on Python 2.7 and 3.x.

    Tested this code along with some new logic coming to shellcheck.
