Optimize diff processing for Subversion.
Review Request #12640 — Created Sept. 25, 2022 and submitted
When generating Subversion diffs, RBTools does a lot of diff processing:
- Ensure that added/deleted empty files are present
- Ensure renamed files have the right information
- Convert relative paths to absolute paths
- Filter for any excluded files
Each step of this would iterate over the previous diff and then generate
a new one. If the diff had, say, 10,000 lines, we'd parse and then
re-build those 10,000 lines in almost every one of those stages, before
finally returning the result. This was slow.
This change converts each processing stage to a generator, allowing one
pass through the diff. Each stage will iterate through and yield lines
for the next stage, the result of which is iteratively joined into a
final byte string.
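As a rough illustration of the generator approach described above, here is a minimal sketch of two chained stages. The function names (make_paths_absolute, drop_excluded, process_diff) and the simplified "Index: " handling are assumptions made for this example, not the actual RBTools functions.

```python
from typing import FrozenSet, Iterable, Iterator

INDEX_PREFIX = b'Index: '


def make_paths_absolute(lines: Iterable[bytes],
                        base: bytes) -> Iterator[bytes]:
    """Yield lines, rooting each 'Index: ' path at base (hypothetical stage)."""
    for line in lines:
        if line.startswith(INDEX_PREFIX):
            path = line[len(INDEX_PREFIX):]
            yield INDEX_PREFIX + base + b'/' + path.lstrip(b'/')
        else:
            yield line


def drop_excluded(lines: Iterable[bytes],
                  excluded: FrozenSet[bytes]) -> Iterator[bytes]:
    """Skip every file section whose path is in excluded (hypothetical stage)."""
    skipping = False

    for line in lines:
        if line.startswith(INDEX_PREFIX):
            path = line[len(INDEX_PREFIX):].rstrip(b'\r\n')
            skipping = path in excluded

        if not skipping:
            yield line


def process_diff(diff: bytes, base: bytes,
                 excluded: FrozenSet[bytes]) -> bytes:
    """Chain the stages lazily, then join the result once."""
    lines = iter(diff.splitlines(True))  # keep line endings
    lines = make_paths_absolute(lines, base)
    lines = drop_excluded(lines, excluded)

    # Each line flows through every stage before the next line is read,
    # so no intermediate diff is rebuilt between stages.
    return b''.join(lines)
```

Calling process_diff(diff, b'/trunk', frozenset({b'/trunk/b.txt'})) on a diff with sections for a.txt and b.txt would return only the a.txt section, with its path rewritten to /trunk/a.txt.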
The stages themselves have been optimized a bit. We used to perform
repeated lookups on SVNClient for the same attributes (compiled
regexes and constants) on each loop. Those are now pulled out of the
loop.
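The kind of lookup hoisting described here can be sketched as follows. DiffScanner, REVISION_RE, and HEADER_PREFIXES are hypothetical names used only for illustration. They are not attributes of the real SVNClient.

```python
import re
from typing import Iterable, List


class DiffScanner:
    """Hypothetical stand-in for a client class holding shared constants."""

    REVISION_RE = re.compile(br'\(revision (\d+)\)')
    HEADER_PREFIXES = (b'--- ', b'+++ ')

    def find_revisions(self, lines: Iterable[bytes]) -> List[int]:
        # Resolve the attributes once, before the loop, instead of paying
        # for the self.X lookups on every iteration.
        search = self.REVISION_RE.search
        prefixes = self.HEADER_PREFIXES

        revisions = []

        for line in lines:
            if line.startswith(prefixes):
                m = search(line)

                if m:
                    revisions.append(int(m.group(1)))

        return revisions
```

The saving per iteration is small, but it adds up when the loop body runs once for every line of a large diff.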
A potential bug was also found and fixed that could have led to an
infinite loop during diff generation when processing empty files. If a
very specific (unlikely) set of conditions were met (deleted empty file,
revision values of 0 for the diff operation, and no
the file in svn info), we'd repeat the loop, but with no advancement.
That'd cause the same condition to be hit over and over. We now advance
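The general shape of this class of bug can be shown with a small, generic sketch. It is not the RBTools code: skip_unresolvable and its arguments are made up for the example, and the real conditions involve empty files and svn info output rather than a simple set lookup.

```python
from typing import List, Set


def skip_unresolvable(lines: List[bytes], resolvable: Set[bytes]) -> List[bytes]:
    """Copy lines, dropping 'Index: ' entries whose path is not resolvable."""
    prefix = b'Index: '
    result = []
    i = 0

    while i < len(lines):
        line = lines[i]

        if line.startswith(prefix):
            path = line[len(prefix):].strip()

            if path not in resolvable:
                # Advance before continuing. Without this increment, the
                # loop would re-read the same line and hit this branch
                # again on every iteration, never terminating.
                i += 1
                continue

        result.append(line)
        i += 1

    return result
```

Any early `continue` in an index-driven loop needs its own advancement step, which is easy to miss when the branch is only reachable under rare conditions.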
Unit tests pass.
Fixed an issue that could occur with garbage data in the empty file processor.
Revision 2 (+174 -174)
Checks run (1 failed, 1 succeeded)