Implement a XML Review UI that allow user to perform actions on XML files

Review Request #9615 — Created Feb. 10, 2018 and discarded

Information

Review Board
master

Reviewers

XML files is handled as an uneditble attachments in the current
version.
I implemented a new class XMLReviewUI that
enhence Review board's ability to process XML file.

Features:
- Smart indentation for the file
- Diff for the file will be shown in render and source format

Known issues:
- Due to the parser library issue, the XML Render cannot render
the text of an element that are after a child element.
(e.g. <parent>
<child>
</child>
info
</parent> )
info cannot be render in the example above.

Added and ran unit test with:
- Properly formatted XML files (attributes/children).
- Handle Highly-nested XML files.

Tested the development environment with:
- File diff system.
- Properly formatted XML files (attributes/children).
- Handle Highly-nested XML files.

Description From Last Updated

Unlike JSON files, order matters in XML files. For example, imagine this: <h1>This is a section</h1> <p>Some content.</p> <h1>Here's another …

chipx86chipx86

These blank lines should remain. Check our style guide on Notion for where blank lines should go.

chipx86chipx86

Can you add a comment about why the score is 1 here, based on what we've diagnosed?

chipx86chipx86

os is a Python Standard Library module, so it should go in an import group after __future__ but before django. …

chipx86chipx86

"XMLReviewUITests"

chipx86chipx86

No blank line between class and the docstring. We also tend to include the full module path in the docstring …

chipx86chipx86

Helper functions in unit tests should be private (prefix with a _) and put at the end of the class.

chipx86chipx86

Typo, and this doesn't follow our documentation guidelines.

chipx86chipx86

Ideally, we wouldn't include data files, since that makes it harder to have self-contained tests and to provide variations on …

chipx86chipx86

Each component of os.path.join should be a single path entry. No . or /. file_diretory must also be its own …

chipx86chipx86

Blank line between these.

chipx86chipx86

You want as little as possible in this with block. Confine it to the SimpleUploadedFile creation.

chipx86chipx86

Docstrings should be in the form of: """Testing <class>.<method> <condition>""" Like: """Testing XMLReviewUI.get_rendered_lines orders tag results""" Same in other tests.

chipx86chipx86

This is all part of the Python Standard Library, so it should be in the same import group.

chipx86chipx86

Alphabetical order. This would be: from django.utils import six from django.utils.six.moves import ... from django.utils.translation import ...

chipx86chipx86

This is part of the Python Standard Library, and should go in the appropriate import group.

chipx86chipx86

Lines are wrapping way too soon.

chipx86chipx86

This must describe the results of the method (a Yields section). See Notion for info on writing docs.

chipx86chipx86

Blank line between these.

chipx86chipx86

We don't use the inline if .. else .. form. This should be expanded into a standard if/else.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Blank line here.

chipx86chipx86

Docstrings must have a single-line summary. IT cnanot wrap. Others are wrapping too soon.

chipx86chipx86

This is going to show up wrong in the docs. If you want to show inline code, use double backticks …

chipx86chipx86

This will need a docstring in our standard format. So will other methods in this class.

chipx86chipx86

Double prefixes should never be used in Python, except for operators (like __eq__) or other special methods (__str__, __repr__). Private …

chipx86chipx86

This needs a docstring. It's also in the wrong casing format. find_attributes would be more correct.

chipx86chipx86

Blank line here.

chipx86chipx86

These can be combined into one conditional.

chipx86chipx86

Blank line here.

chipx86chipx86

This also needs a docstring and should be add_indent.

chipx86chipx86

W291 trailing whitespace

reviewbotreviewbot

All your testing happens in the get xml file function, but base on the nasme, that should be returning an …

RO rodgerwa

You don't verify if any exceptions are thrown here. Rather than waiting for a KeyError or TypeError when in etree_to_dict, …

RO rodgerwa

I think some of your variable names (like t, dc and dc) could be more descriptive. For example, I'm assuming …

JT jtrang

This docstring should explain the inputs expected and return values. You also need to say what type is expected for …

RO rodgerwa

Add a comment for why len(value) ==1: is important.

RO rodgerwa

Is it possible for strip() to return none?

RO rodgerwa

You have attrib and attrib_in, both of these side by side are a little confusing. Better names are required.

RO rodgerwa

You should add a comment to explain what is happening here.

RO rodgerwa

Similar comment to above. The variable o could have a more descriptive name.

JT jtrang

Along with providing a better name for the variable, you should add what args and kwargs are representing in your …

RO rodgerwa

Considering this is an IO Stream, is there possibility of not having a value by the end or have a …

RO rodgerwa

You'll need to document the expected results and inputs.

RO rodgerwa

Should say Add indention on new line with an indent of the current depth plus one.

RO rodgerwa

what if the wrong type is passed in?

RO rodgerwa

I understand the _ is for localization? How exactly does that lookup work?

rpolyanorpolyano

Perhaps consider using enums for this and text

rpolyanorpolyano

Not sure this is needed. Why not call self._fixname directly?

rpolyanorpolyano

W291 trailing whitespace

reviewbotreviewbot

It may be more clear if you put key[0] == "text" (or better yet use an enum) and add this …

rpolyanorpolyano

%s may round things (is this what you want?). Consider using %r if you don't want rounding. Have a look …

rpolyanorpolyano
RI
RI
RI
chipx86
  1. 
      
  2. Show all issues

    Unlike JSON files, order matters in XML files. For example, imagine this:

    <h1>This is a section</h1>
    <p>Some content.</p>
    
    <h1>Here's another section</h1>
    <p>More content.</p>
    <code>Some sample code.</code>
    

    If you order them, then you get:

    <code>Some sample code.</code>
    <h1>This is a section</h1>
    <h1>Here's another section</h1>
    <p>Some content.</p>
    <p>More content.</p>
    

    That doesn't make any sense.

    We also shouldn't order attributes. There may be a reason the author has put them that way.

    When working with XML, it's good to show the XML in a form that makes it easier to read the document, in a tree form, but we don't want to otherwise alter that content.

  3. reviewboard/attachments/mimetypes.py (Diff revision 4)
     
     
     
     
     
     
     
     
     
    Show all issues

    These blank lines should remain. Check our style guide on Notion for where blank lines should go.

  4. reviewboard/reviews/ui/base.py (Diff revision 4)
     
     
    Show all issues

    Can you add a comment about why the score is 1 here, based on what we've diagnosed?

  5. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
    Show all issues

    os is a Python Standard Library module, so it should go in an import group after __future__ but before django. See the style guideline on Notion for import groups.

  6. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
     
     
    Show all issues

    "XMLReviewUITests"

  7. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
     
    Show all issues

    No blank line between class and the docstring.

    We also tend to include the full module path in the docstring for unit tests, so the full one for XMLReviewUI.

    The blank line should go after.

  8. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
    Show all issues

    Helper functions in unit tests should be private (prefix with a _) and put at the end of the class.

  9. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
    Show all issues

    Typo, and this doesn't follow our documentation guidelines.

  10. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
     
    Show all issues

    Ideally, we wouldn't include data files, since that makes it harder to have self-contained tests and to provide variations on the tests. It's not even needed in this case, since SimpleUploadedFile takes the data directly. This function should take the data, and that should be provided in calls to the method instead of opening a file.

    1. I am confused about how to use SimpleUploadedFile() to takes the data directly. Is that means to define the data as a String?

    2. I fixed it as
      file_path = os.path.join(os.path.dirname(__file__), 'testdata', filename)

  11. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
    Show all issues

    Each component of os.path.join should be a single path entry. No . or /. file_diretory must also be its own parameter -- no concatenation, since that's the point of os.path.join.

  12. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line between these.

  13. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
     
     
     
     
     
    Show all issues

    You want as little as possible in this with block. Confine it to the SimpleUploadedFile creation.

  14. reviewboard/reviews/ui/tests.py (Diff revision 4)
     
     
     
     
    Show all issues

    Docstrings should be in the form of:

    """Testing <class>.<method> <condition>"""
    

    Like:

    """Testing XMLReviewUI.get_rendered_lines orders tag results"""
    

    Same in other tests.

  15. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
     
     
    Show all issues

    This is all part of the Python Standard Library, so it should be in the same import group.

    1. By same import group do you mean put them together without empty line?

  16. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
     
    Show all issues

    Alphabetical order. This would be:

    from django.utils import six
    from django.utils.six.moves import ...
    from django.utils.translation import ...
    
  17. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    This is part of the Python Standard Library, and should go in the appropriate import group.

  18. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
     
    Show all issues

    Lines are wrapping way too soon.

  19. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    This must describe the results of the method (a Yields section). See Notion for info on writing docs.

  20. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line between these.

  21. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    We don't use the inline if .. else .. form. This should be expanded into a standard if/else.

  22. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  23. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  24. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  25. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  26. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  27. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  28. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  29. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  30. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  31. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
     
     
     
     
     
    Show all issues

    Docstrings must have a single-line summary. IT cnanot wrap.

    Others are wrapping too soon.

  32. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    This is going to show up wrong in the docs. If you want to show inline code, use double backticks around it.

    This format being shown here, though, is a bit confusing, and seems more like an implementation detail. It's better to describe this and then provide an example of a value separately.

    1. I have move this comment out as a note to my code, do I still need add double backticks around it ?

  33. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    This will need a docstring in our standard format.

    So will other methods in this class.

    1. I was coping the XMLEncoderAdapter class ~/djblets/djblets/webapi/encoders.py and it does not has a doc tring.
      I add doc string in my class, though.

  34. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    Double prefixes should never be used in Python, except for operators (like __eq__) or other special methods (__str__, __repr__).

    Private methods should also go after all public methods.

    1. I look over the stack overflow and I think it is ok to use double prefixes as all it does is to
      avoid your method to be overridden by a subclass or accessed accidentally.
      (https://stackoverflow.com/questions/1301346/what-is-the-meaning-of-a-single-and-a-double-underscore-before-an-object-name)

      The parent class (~/djblets/djblets/webapi/encoders.py) used the double prefixes, So I thought I should use
      it at here too.

      May you tell me why we should never use double prefixes in Python?

      I changed it to single prefix.

  35. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    This needs a docstring.

    It's also in the wrong casing format. find_attributes would be more correct.

    1. I copy the name of this function directly from ~/djblets/djblets/webapi/encoders.py
      in the purpose of overwrite it.

      Maybe the name in the encoders.py class need to be changed too.

    2. I was able to changed this function because find_attributes() created by myself.

  36. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  37. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    These can be combined into one conditional.

  38. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
     
    Show all issues

    Blank line here.

  39. reviewboard/reviews/ui/xmlui.py (Diff revision 4)
     
     
    Show all issues

    This also needs a docstring and should be add_indent.

    1. I copy the name of this function directly from ~/djblets/djblets/webapi/encoders.py
      in the purpose of overwrite it.

      Maybe the name in the encoders.py class need to be changed too.

    2. I was not able to change this function because I utilized some functions that are in the parent class that calling addIndent().

  40. 
      
RI
Review request changed
Change Summary:

Fixed issue indicated by Christian

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

RI
RI
JT
  1. 
      
  2. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    I think some of your variable names (like t, dc and dc) could be more descriptive. For example, I'm assuming dc means "dictionary children", but it's not immediately obvious.

  3. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
    Show all issues

    Similar comment to above. The variable o could have a more descriptive name.

  4. 
      
RO
  1. I don't have to much insight on what you are working on but I feel that there are a few things you can do to make this part of the code more maintainable and readable for others.

  2. reviewboard/reviews/ui/tests.py (Diff revision 6)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    All your testing happens in the get xml file function, but base on the nasme, that should be returning an xml file of sorts, not testing for nested XML.

    1. I had different features -- included sorted features implemented before, but they got rejected because XML files are different from the JSON files while beautifing them -- their orders matter. So by now, only the nested xml test is needed.
      I coulld not come with a good solution on how to test either a file has been beautified or not, though.

  3. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
     
     
    Show all issues

    You don't verify if any exceptions are thrown here. Rather than waiting for a KeyError or TypeError when in etree_to_dict, you should get an IOError and handle that.

    1. I am not sure what do you mean by here: the IOError will be thrown to the except clause in the below and being handled. so the etree_to_dict won't be called in this case.

  4. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     

    Is this too broad for the types of exceptions beeing handled?

    1. hmm as there are always unexpected behavior better cover all than not covered; On the other hand I don't think there is any error that needed to be handled seperately

  5. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    flow is really odd and there are no comments to explain what is happening in each block.

    You could use intermediate functions to make it obvious too.

  6. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    This docstring should explain the inputs expected and return values.

    You also need to say what type is expected for t because it makes the rest of this code a little unclear as we don't know what attributes are guaranteed to exist.

  7. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
    Show all issues

    Add a comment for why len(value) ==1: is important.

  8. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     

    Why is it value[0]? Add comments to explain.

  9. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
    Show all issues

    Is it possible for strip() to return none?

    1. I dont see the possibilities from the document https://docs.python.org/2/library/string.html

  10. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
     
     

    Why are you creating a new dict and adding t.tag to it? Is there any benefit over adding text to t.tag?

    1. It is a order Dictionary so the order of input matters.
      python did not support me to altering the input order so I have to manually do it

  11. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     

    It isn't obvious that this will only be run if t.attrib and t.children are None

  12. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     

    confusing operation. You are reassigning an input to the return value of a function to which its intial input was the value you are reassigning.

    Maybe use an intermediate?

  13. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
    Show all issues

    You have attrib and attrib_in, both of these side by side are a little confusing. Better names are required.

  14. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
    Show all issues

    You should add a comment to explain what is happening here.

  15. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
    Show all issues

    Along with providing a better name for the variable, you should add what args and kwargs are representing in your docstring.

    1. I am actually not sure about this; this is a function overwritting; as you can tell the args and kwargs are never used.

  16. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
     
     
     
     
     
    Show all issues

    Considering this is an IO Stream, is there possibility of not having a value by the end or have a possible IO error?

  17. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     
    Show all issues

    You'll need to document the expected results and inputs.

  18. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
     

    More index operations that aren't clear why.

    Docstrings could give insight on this.

  19. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
    Show all issues

    Should say
    Add indention on new line with an indent of the current depth plus one.

  20. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     

    You should explain why key[1] is important here.

  21. reviewboard/reviews/ui/xmlui.py (Diff revision 6)
     
     
    Show all issues

    what if the wrong type is passed in?

  22. 
      
RI
Review request changed
rpolyano
  1. 
      
  2. reviewboard/reviews/ui/xmlui.py (Diff revision 7)
     
     
    Show all issues

    I understand the _ is for localization? How exactly does that lookup work?

    1. I import
      from django.utils.translation import ugettext as _
      and use _ to represent this class

  3. reviewboard/reviews/ui/xmlui.py (Diff revision 7)
     
     
    Show all issues

    Perhaps consider using enums for this and text

    1. the key is a variable and the single "attribute" string not being used anywhere else.

  4. reviewboard/reviews/ui/xmlui.py (Diff revision 7)
     
     
    Show all issues

    Not sure this is needed. Why not call self._fixname directly?

    1. Because fixname is used in later function

  5. reviewboard/reviews/ui/xmlui.py (Diff revision 7)
     
     
    Show all issues

    It may be more clear if you put key[0] == "text" (or better yet use an enum) and add this as an and clause to the if above it to reduce nesting

  6. reviewboard/reviews/ui/xmlui.py (Diff revision 7)
     
     
    Show all issues

    %s may round things (is this what you want?). Consider using %r if you don't want rounding. Have a look at this blog post for some more info.

    1. %s means string formatting syntax and that's what I want

  7. 
      
RI
RI
david
Review request changed
Status:
Discarded