• 
      

    Greatly improve performance of the diff parser.

    Review Request #8880 — Created April 6, 2017 and submitted

    Information

    Review Board
    release-2.5.x
    d8c19be...

    Reviewers

    The diff parser had some major performance problems that didn't really
    affect standard diffs, but were absolutely noticeable for multi-megabyte
    diffs.

    The primary performance problems had to do with the building of the diff
    data strings. We were concatenating strings, which is not fast or
    memory-efficient. To solve this, a breaking change to the diff parser
    API had to be made (which is unlikely to affect too many people out
    there). We no longer work with the data string, but instead call methods
    to prepend or append data to the diff's ParsedDiffFile object (renamed
    from File, while we're breaking stuff anyway -- further changes will be
    coming in for parsing/attribute/doc improvements). These methods in turn
    work on a cStringIO, which is a much faster way of performing the string
    building.

    Along with this, there were many places where we did conditional lookups
    or range checks to determine whether to process one or more lines of
    content. These checks were preventing us from unintentionally indexing
    past the end of the diff. Since in each case it's more likely than not
    that the current line will be somewhere before the end of the diff, the
    parser now attempts the index unconditionally, and handles alternative
    logic if hitting an IndexError.

    Prior to these speedups, a 13MB diff took so long to parse that I gave
    up trying to time it. After, it took a few seconds. This should
    certainly help lighten the load on busy Review Board servers, and may
    even make general uploading of diffs faster in many real-world cases.

    All unit tests passed.

    Uploaded a very large diff. Saw that it parsed correctly in seconds.

    Description From Last Updated

    'django.utils.six.moves.range' imported but unused

    reviewbotreviewbot

    Change to single quotes?

    daviddavid

    While we're in here can we stop shadowing the builtin file?

    daviddavid

    Change to single quotes?

    daviddavid

    Let's change this variable name too.

    daviddavid
    Checks run (1 failed, 2 succeeded)
    JSHint passed.
    PEP8 Style Checker passed.
    Pyflakes failed.

    Pyflakes

    chipx86
    david
    1. Just a few formatting tweaks.

    2. reviewboard/diffviewer/parser.py (Diff revision 2)
       
       
      Show all issues

      Change to single quotes?

      1. I have another change that will be altering this line anyway. I'm going to drop this one for now, minimize the damage in this change.

    3. reviewboard/diffviewer/parser.py (Diff revision 2)
       
       
      Show all issues

      While we're in here can we stop shadowing the builtin file?

    4. reviewboard/diffviewer/parser.py (Diff revision 2)
       
       
      Show all issues

      Change to single quotes?

    5. reviewboard/diffviewer/parser.py (Diff revision 2)
       
       
      Show all issues

      Let's change this variable name too.

    6. 
        
    chipx86
    david
    1. Ship It!
    2. 
        
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-2.5.x (de3b797)