• 
      

    Optimize diff/post performance by caching 'svn info' results

    Review Request #7199 — Created April 13, 2015 and submitted

    Information

    RBTools
    master
    2fcdf37...

    Reviewers

    rbt diff and post operations make numerous calls to 'svn info' with
    paths to various files. For any given modified file, 'svn info'
    is called 4 + N times. This breaks down as:

    • 3 calls from convert_to_absolute_paths() with the full filename
      to process the new file line, original file line, and Index line
    • 1 call from find_copyfrom() for the full filename
    • N calls from find_copyfrom() for each sub-path component in the
      filename

    By introducing caching on results from 'svn info' the number of
    calls per file can be decreased to 1 + M, where M is always <= N.
    For files sharing a common directory hierarchy, M will be equal
    to N for the first file processed and 0 for all subsequent files.

    Consider a contrived example with two modified files at:
    A/B/C/foo.txt
    A/B/C/bar.txt

    The original code make 14 total calls to 'svn info'. Measuring
    timing on a CentOS 7 box with time -p reveals an average run
    time of 0.742s. The updated code makes 5 total calls to
    'svn info' and runtime is measured at 0.586s. This is a decrease
    in runtime of 21%.

    * Performed various 'rbt diff' executions with debug enabled to 
      verify expected decrease in calls to 'svn info'.
    * Ran unit tests.
    Description From Last Updated

    How about we do something like: if path not in self._svn_info_cache: result = ... if result is None: self._svn_info_cache[path] = …

    brenniebrennie

    Remove this line.

    brenniebrennie

    Col: 26 E222 multiple spaces after operator

    reviewbotreviewbot
    reviewbot
    1. Tool: Pyflakes
      Processed Files:
          rbtools/clients/svn.py
      
      
      
      Tool: PEP8 Style Checker
      Processed Files:
          rbtools/clients/svn.py
      
      
    2. 
        
    brennie
    1. 
        
    2. rbtools/clients/svn.py (Diff revision 1)
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
      Show all issues

      How about we do something like:

      if path not in self._svn_info_cache:
          result = ...
      
          if result is None:
              self._svn_info_cache[path] = None
          else:
              self._svn_info_cache[path] = {}
      
              for info in result:
                  # ...
      
      return self._svn_info_cache[path]
      

      This leaves us with a single return path.
      This will have

      1. Ignore that last sentence fragment.

    3. 
        
    gmyers
    reviewbot
    1. Tool: Pyflakes
      Processed Files:
          rbtools/clients/svn.py
      
      
      
      Tool: PEP8 Style Checker
      Processed Files:
          rbtools/clients/svn.py
      
      
    2. rbtools/clients/svn.py (Diff revision 2)
       
       
      Show all issues
      Col: 26
       E222 multiple spaces after operator
      
    3. 
        
    brennie
    1. 
        
    2. rbtools/clients/svn.py (Diff revision 2)
       
       
      Show all issues

      Remove this line.

    3. 
        
    gmyers
    reviewbot
    1. Tool: Pyflakes
      Processed Files:
          rbtools/clients/svn.py
      
      
      
      Tool: PEP8 Style Checker
      Processed Files:
          rbtools/clients/svn.py
      
      
    2. 
        
    gmyers
    reviewbot
    1. Tool: PEP8 Style Checker
      Processed Files:
          rbtools/clients/svn.py
      
      
      
      Tool: Pyflakes
      Processed Files:
          rbtools/clients/svn.py
      
      
    2. 
        
    gmyers
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to master (2747e8f)