Fix the diff viewer to handle very large files.
Review Request #3402 — Created Oct. 7, 2012 and submitted
Fix the diff viewer to handle very large files. Previously, very large files (10,000+ lines) would greatly slow down Review Board and take forever to render. This slowdown was entirely in the rendering step, rather than being in the diffing or patching steps. The reason for the slowdown is that we executed a fragment of a Django template (consisting of variable resolution, template tags, and filters) per line of the diff. This, it turns out, is very slow. Variable lookups (and possible escaping) is slow. Template tag and filter execution is slow. Copying of contexts is slow. To make matters worse, the context we supply for diffs contains data that can be quite large (chunk, file of chunks, etc.), which ends up getting copied a bunch for different tags. While we may want to audit this for other templates, fixing it for the diff lines is still slow due to how templates are processed. So what we're now doing is defining printf-style templates for the lines (and conditional parts of lines), processing some data in code, and then piecing them together for output. This creates a massive speedup. a 44,000 line change I tested with took around a minute to render before, bringing the server to its knees at the same time. Now, the total render time for the page (including all the rest of the diff viewer page) is now only a few seconds.
Tested with my usual scattering of diffs. Verified that line numbers were correct, as were the coloring, expansion, and collapsing. Tested with a nearly 2MB patch containing over 44,000 lines of changes in the diff viewer. Saw it speed by (not that the browser was very happy about it).
Description | From | Last Updated |
---|---|---|
Tiny optimization: chunk_lines = chunk['lines'] chunk_index = chunk['index'] Multiple references. |
ME medanat | |
The brackets are not required here, are you using them for clarity? |
ME medanat | |
Is the intermediate list result needed? Can't you just concatenate to a string immediately instead of appending to the list … |
ME medanat | |
Tiny optimization, maybe add here: moved_line = len(line) > 8 and line[8] and on line 291: if moved_line: |
ME medanat | |
Checking for i == 0 twice. could move second if-statement body into first if-statement. |
ME medanat | |
show_collapse = i == 0 and standalone |
ME medanat |
ME
-
Again, a bunch of possibly unnecessary optimizations that might make the code less readable. Worth mentioning though.
-
-
Is the intermediate list result needed? Can't you just concatenate to a string immediately instead of appending to the list result then joining? So result = '' and on line 341 result += line_fmt % context return result
-