Parallelize Perforce diff generation

Review Request #10554 — Created May 14, 2019 and updated

Information

RBTools
master
d87838e...

Reviewers

Parallelize Perforce diff generation

This change tweaks PerforceClient.diff to delegate diff generation to a
thread pool. This is motivated by the fact that Perforce diff generation
involves roundtrips to a Perforce server, and can be quite slow when
working with large changesets.

Resolves bug 4806.

  • nosetests rbtools/clients/tests/test_p4.py
  • Before this change, posting a review update for a change with 179 deleted
    files took over four minutes. After the change, my wait time is down to
    55s.
Description From Last Updated

Probably better to define this as its own function, rather than defining inline. Maybe generalize the name, call it _diff_file(). …

chipx86chipx86

You'll need double backticks to be valid ReStructuredText.

chipx86chipx86

Our diff generator won't handle the None: part right. You can just include the possibility of this in the description …

chipx86chipx86

Can you explicitly return None? We go for explicit rather than implicit here. Same applies below.

chipx86chipx86

Blank line in-between statements and blocks.

chipx86chipx86
Checks run (2 succeeded)
flake8 passed.
JSHint passed.
chipx86
  1. Hey Ryan! Sorry I haven't done a review on this until now.

    Very neat idea. It might be worth us exploring this for other types of repositories.

    One thing that stands out is that the order of files in the diff are probably not going to be preserved the way that it's currently generating them. Am I right about that? Would you be able to have this keep track of order?

  2. 
      
chipx86
  1. 
      
  2. rbtools/clients/perforce.py (Diff revision 1)
     
     
    Show all issues

    Probably better to define this as its own function, rather than defining inline. Maybe generalize the name, call it _diff_file().

    There's a performance hit to defining inline functions (likely won't matter much here, but it's bitten us before), and I could see doing something interesting down the road with having a standardized function that just diffs a single file, and have a standardized way to parallelize all the calls needed to those, simplifying what needs to be done in each implementation.

  3. rbtools/clients/perforce.py (Diff revision 1)
     
     
    Show all issues

    You'll need double backticks to be valid ReStructuredText.

  4. rbtools/clients/perforce.py (Diff revision 1)
     
     
     
     
     
     
    Show all issues

    Our diff generator won't handle the None: part right. You can just include the possibility of this in the description ("This may be None if there's an issue generating the diff. In that case, a warning will likely be emitted.").

  5. rbtools/clients/perforce.py (Diff revision 1)
     
     
    Show all issues

    Can you explicitly return None? We go for explicit rather than implicit here.

    Same applies below.

  6. rbtools/clients/perforce.py (Diff revision 1)
     
     
     
    Show all issues

    Blank line in-between statements and blocks.

  7. 
      
Loading...