Improve resolving of file names from git diff files when paths contain whitespace

Review Request #4347 — Created July 19, 2013 and discarded

Information

Review Board

Reviewers

Improves resolving of file names from git diff files when paths contain whitespace.
Does not entirely solve the issue as it might incorrectly present file name with " b/" in path e.g.
diff --git a/repo/dir b/file.txt b/repo/dir b/file.txt
but still even in that case the output will be much better as previous "split()" just lost part of the information
current:
- origFile = repo/dir 
- newFile = file.txt
after fix:
- origFile = repo/dir
- newFile = file.txt b/repo/dir b/file.txt

I dont think having " b/" in path is an often case :)

Further improvement could be to read the file names from +++ and --- lines in diff_line contains more than one occurrence of " b/"
Runs on our production instance
Description From Last Updated

This isn't proper Python. It should be: if not diff_line.startswith('a/'): Blank line before this.

chipx86chipx86

Blank line before/after this. Use single quotes when possible.

chipx86chipx86

No parens.

chipx86chipx86

Single quotes.

chipx86chipx86
reviewbot
  1. This is a review from Review Bot.
      Tool: PEP8 Style Checker
      Processed Files:
        reviewboard/scmtools/git.py
      Ignored Files:
    
    
  2. 
      
reviewbot
  1. This is a review from Review Bot.
      Tool: Pyflakes
      Processed Files:
        reviewboard/scmtools/git.py
      Ignored Files:
    
    
  2. 
      
SW
chipx86
  1. Giving a couple pointers on our style requirements, but don't focus on it too much, as the code's going to need to change.
    
    First off, it feels very iffy for me for this to deny " b/" as a valid part of the filename. I get that it may not be common, but I'd be curious how others deal with this.
    
    That said, I'm willing to accept that we may not have enough information at this location to do the right thing.
    
    So, if we do decide that that's a valid tradeoff, I don't believe parsing strings is the way to do this. The majority of this could be done with a regex, like: r'^diff --git a/(.+) b/(.+)$', and then it'd be easy to pull out the two paths.
    
    I'm going to need to see a few unit tests that cover various situations, so that this doesn't regress, and to know for sure that it solves the cases.
    1. Well, as you probably noticed I'm not a python developer :) I'll probably leave it as is then (good enough solution for my purposes, no time or abilities to invest in proper fix).
      Thanks for comments, I'll try to remember about that next time I have some python related work.
  2. reviewboard/scmtools/git.py (Diff revision 1)
     
     
    Show all issues
    This isn't proper Python. It should be:
    
    if not diff_line.startswith('a/'):
    
    Blank line before this.
  3. reviewboard/scmtools/git.py (Diff revision 1)
     
     
    Show all issues
    Blank line before/after this.
    
    Use single quotes when possible.
  4. reviewboard/scmtools/git.py (Diff revision 1)
     
     
    Show all issues
    No parens.
  5. reviewboard/scmtools/git.py (Diff revision 1)
     
     
    Show all issues
    Single quotes.
  6. 
      
SW
Review request changed
Status:
Discarded
Change Summary:
Abandoned.