Fix unicode-related errors when parsing XML outputs from SVN

Review Request #7224 — Created April 17, 2015 and submitted

Information

RBTools
master
ca72d8c...

Reviewers

There are two instances where svn is invoked with the --xml option to produce an XML document:
1. 'svn log' is called to extract a specific revision number
2. 'svn status' is called when attempting to detect a changelist.

For either of these cases, if a non-UTF8 character is present in the resultant XML document then an error ensues. This could be due to a non-UTF8 character in a commit message, or in a filename in a changelist. Case 1 was reported as issue 3844.

The fix here is simple, and that is to add results_unicode=False for both underlying calls to svn.

I've also added unit tests to exercise both of these cases. Both tests failed prior to the fix. To support case 1 it was necessary to introduce a non-UTF8 character in to the svn log. The easiest approach here was to update an existing commit message in the testdata repo to contain such a character, and the message for commit r2 has been updated to support this.

While testing I noticed that the error message when case 1 fails was quite misleading. For instance 'rbt diff 2' (where r2 has a non-UTF8 character in the commit log message) would produce:

CRITICAL: "2" does not appear to be a valid revision or changelist name

This is confusing because 2 is in fact a valid revision, and it does not give any indication that the error is actually due to an XML parsing failure because of a non-UTF8 character. The exception raised by the ElementTree XML parser is of type UnicodeEncodeError, which is derived from ValueError. But, in _convert_symbolic_revision(), where this exception is produced, a ValueError already has a specific purpose of indicating any failure to determine the revision number. Now, a ValueError from ElementTree is explicitly caught and a generic SCMError, with a more informative error message, is raised instead.

Ran unit tests.  Confirmed that new unit tests fail prior to the fix.
Description From Last Updated

Is this still going to end up with CRITICAL: foo does not appear to be a valid revision" ?

brenniebrennie
reviewbot
  1. Tool: PEP8 Style Checker
    Processed Files:
        rbtools/clients/tests.py
        rbtools/clients/svn.py
    
    Ignored Files:
        rbtools/clients/testdata/svn-repo/db/revprops/0/2
    
    
  2. 
      
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        rbtools/clients/tests.py
        rbtools/clients/svn.py
    
    Ignored Files:
        rbtools/clients/testdata/svn-repo/db/revprops/0/2
    
    
  2. 
      
brennie
  1. 
      
  2. rbtools/clients/svn.py (Diff revision 1)
     
     
    Show all issues

    Is this still going to end up with
    CRITICAL: foo does not appear to be a valid revision" ?

    1. So now it will be:

      ERROR: Failed to parse svn log - 'ascii' codec can't encode character u'\xe9' in position 184: ordinal not in range(128).
      CRITICAL: "2" does not appear to be a valid revision or changelist name
      

      Perhaps it would be more preferable to explicitly raise some other exception here, but I wasn't sure and that's why I went with the logging approach.

    2. I think a different error should be raised, but I'm not exactly sure what. https://reviews.reviewboard.org/r/7153/ adds a generic SCMError, but that hasn't landed yet. You could add that error yourself to errors.py and then use that?

  3. 
      
gmyers
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        rbtools/clients/tests.py
        rbtools/clients/svn.py
        rbtools/clients/errors.py
    
    Ignored Files:
        rbtools/clients/testdata/svn-repo/db/revprops/0/2
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        rbtools/clients/tests.py
        rbtools/clients/svn.py
        rbtools/clients/errors.py
    
    Ignored Files:
        rbtools/clients/testdata/svn-repo/db/revprops/0/2
    
    
  2. 
      
brennie
  1. Ship It!
  2. 
      
gmyers
Review request changed
Status:
Completed
Change Summary:
Pushed to release-0.7.x (ec1a47d)