Go Tool for Review Bot

Review Request #11240 — Created Oct. 22, 2020 and submitted

jblazusi
ReviewBot
master
61c94b0...
reviewbot, students
ceciliawei

This tool's ability to format was removed and placed into another tool,
that tool is tentatively known as the GofmtTool. This was done because,
go fmt is rather lightweight and does not rely on other files in order
to carry out its function. However, go test and go vet require more
information in order to work consistently. In particular they need
access to the entire repository so that patched code can be fully
analyzed against the package code. Therefore GofmtTool inherits from
Tool, whereas GoTool inherits from RepositoryTool and is generally
a much more computationally heavy tool.

Manual testing was done to confirm that go test is working correctly.
As of now it just creates general comments based on which tests have
failed.
Manual testing was done to confirm that go vet is working and known
issues about failures when used against test files have been resolved.


Description From Last Updated

Go/Go Tools?

daviddavid

gofmt still exists, but I believe the modern recommendation is to use go fmt. Can we do that here to ...

daviddavid

Perhaps "This file contains formatting errors and should be run through go fmt" ? Also, shouldn't this be checking the ...

daviddavid

Since we have the output already in a variable, it's kind of silly to write it out to a file ...

daviddavid

The imports should be in alphabetical order. So import json should be placed before import logging.

ceciliaweiceciliawei

Using the string's "split" method isn't portable (for example, windows doesn't use "/" as a path splitter). That said, there's ...

daviddavid

It looks like we're running this for every single file, which means that if you have multiple changed files in ...

daviddavid

Let's pull the result of dest_file.lower() out into a variable so we don't have to call it twice in these ...

chipx86chipx86

Blank line between statements and the start of new blocks.

chipx86chipx86

Same here.

chipx86chipx86

This will scan the entirety of packages every iteration of files. If packages is a set, this will be faster, ...

chipx86chipx86

This can fail, so we'll want to check for exceptions, just as we do further down.

chipx86chipx86

It's generally better to use %-formatted strings to join in variables, as this is faster in Python.

chipx86chipx86

We prefer %-formatted strings, rather than .format(), as it's technically faster and more consistent with the rest of our codebase. ...

chipx86chipx86

When spanning lines, we prefer the % on the line with the variables, as it gives more room for the ...

chipx86chipx86

No blank line here.

chipx86chipx86

Can you incorporate the package name in here, or something to help debug this if this comes up in production?

chipx86chipx86

Same as above regarding %-formatted strings.

chipx86chipx86

Let's use single quotes here, since the inner string doesn't use them. We prefer single quotes in Python strings wherever ...

chipx86chipx86

If we compile this regex before we do the outer loop, it'll speed this part up.

chipx86chipx86

While common in Python, we shouldn't ever override _. That's because _ is commonly used as an alias to ugettext ...

chipx86chipx86

Rather than putting this into a variable and pulling the indexes out, let's just unpack with: filename, line_num = \ ...

chipx86chipx86

You can put the format argument on the same line, since it fits. No need for parens there either, since ...

chipx86chipx86

No blank line here.

chipx86chipx86

Same comment as above regarding having identifying information in the message.

chipx86chipx86

Things within here are getting pretty deeply nested. Maybe break out the test and vet implementations into their own methods?

daviddavid

This should be wrapped in a try/except.

daviddavid

Would it be possible to add a comment showing an example of what the JSON output from go vet looks ...

daviddavid

This is potentially running a lot of times, since it's deeply nested inside several loops. At the top level, can ...

daviddavid
jblazusi
david
  1. 
      
  2. bot/reviewbot/tools/gotool.py (Diff revision 2)
     
     

    Go/Go Tools?

    1. My mistake, I was not exactly sure whether they were separate or not at the time. It is fixed now.

  3. bot/reviewbot/tools/gotool.py (Diff revision 2)
     
     

    It looks like you're leaning towards making the different uses of go optional, which is good. We should probably default to true for everything except go test, which should default to false (test suites can often be cumbersome and long-running).

    1. That is a good idea, I will make sure to implement it that way.

  4. 
      
jblazusi
jblazusi
david
  1. 
      
  2. bot/reviewbot/tools/gotool.py (Diff revision 3)
     
     

    gofmt still exists, but I believe the modern recommendation is to use go fmt. Can we do that here to be more future-proof?

    1. I believe that this is generally true, however, it is important to note that go fmt runs gofmt -l -w. The -w flag is responsible for overwriting the file, although I do not think that this is an issue, especially since patched files are temporary. But I do think that it is worth noting in case, whomever takes over the project is experiencing issues relating to overwriting a file.

      In short, I have updated the command to use go fmt and this can be found in the new GofmtTool CR.

  3. bot/reviewbot/tools/gotool.py (Diff revision 3)
     
     

    Perhaps "This file contains formatting errors and should be run through go fmt" ?

    Also, shouldn't this be checking the output or return value from the format command? Seems like it just unconditionally adds the comment now (maybe that's still part of the WIP?)

    1. I addressed the message in the new GofmtTool. I originally did not setup conditionals, since I wanted to make sure I was getting the correct output, so this is an artifact from early testing.

  4. bot/reviewbot/tools/gotool.py (Diff revision 3)
     
     
     
     
     
     
     
     
     

    Since we have the output already in a variable, it's kind of silly to write it out to a file and load it back in. We can just do:

    try:
        json_data = json.loads(cleaned_output)
    except Exception as e:
        ...
    
    1. This is definitely a blunder. I was a so caught up in using files from previous tools, that I completely forgot about just passing in the output string.

  5. 
      
jblazusi
jblazusi
ceciliawei
  1. 
      
  2. bot/reviewbot/tools/gotool.py (Diff revision 4)
     
     

    The imports should be in alphabetical order. So import json should be placed before import logging.

    1. Thank you, I will make sure I make this update in previous and future CRs.

  3. 
      
david
  1. 
      
  2. bot/reviewbot/tools/gotool.py (Diff revision 4)
     
     

    Using the string's "split" method isn't portable (for example, windows doesn't use "/" as a path splitter). That said, there's both a portable and easier way to do this:

    package = os.path.dirname(path)
    
    1. I was not aware of this being a portability issue, thank you so much for the advice. I have updated my code to use the os.path module in other areas as well.

  3. bot/reviewbot/tools/gotool.py (Diff revision 4)
     
     

    It looks like we're running this for every single file, which means that if you have multiple changed files in a given package, we'll add the test failures multiple times.

    Instead of overriding handle_file, can we instead override handle_files, and use the file list to build a list of changed packages? We can then run the tests once per package.

    1. Excellent, this is exactly what I had in mind when I was optimizing my FBInfer Tool. However, the advice of building a list of changed packages was a very useful starting point.

  4. 
      
jblazusi
chipx86
  1. 
      
  2. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     
     
     
     
     
     

    Let's pull the result of dest_file.lower() out into a variable so we don't have to call it twice in these checks.

  3. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     

    Blank line between statements and the start of new blocks.

  4. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     

    Same here.

  5. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     

    This will scan the entirety of packages every iteration of files. If packages is a set, this will be faster, though then we lose the ordering. Perhaps that's okay, and we can sort the results when iterating it?

    If so, we can safely .add() into a set without needing to check existence first.

    1. That's clever, I am not sure why I did not think of using a set before. Further down, we loop through every package so the order is not relevant.

  6. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     
     
     
     
     
     
     
     
     

    This can fail, so we'll want to check for exceptions, just as we do further down.

    1. I totally agree with this. Although I think that it is worth mentioning that the other review bot tools do not have try/except blocks for the execute command.

  7. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    It's generally better to use %-formatted strings to join in variables, as this is faster in Python.

    1. I did not know that there was a difference in performance, that is good to know.

  8. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    We prefer %-formatted strings, rather than .format(), as it's technically faster and more consistent with the rest of our codebase.

    This can also be combiend with the previous line:

    formatted_output = '[%s]' % ','.join(gotest_output)
    
  9. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    When spanning lines, we prefer the % on the line with the variables, as it gives more room for the strings and helps more clearly indicate that we're formatting variables in.

  10. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     
     

    No blank line here.

  11. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    Can you incorporate the package name in here, or something to help debug this if this comes up in production?

  12. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    Same as above regarding %-formatted strings.

  13. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    Let's use single quotes here, since the inner string doesn't use them. We prefer single quotes in Python strings wherever possible.

  14. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    If we compile this regex before we do the outer loop, it'll speed this part up.

    1. I tried my best to fix this. However, I am not that familiar with regex, so I would appreciate it if you took another look at my change.

    2. Looks like you did it correctly.

    3. Perfect, I will go ahead and mark this issue as fixed.

  15. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    While common in Python, we shouldn't ever override _. That's because _ is commonly used as an alias to ugettext or ugettext_lazy, and this can have unintentional side-effects.

    Also for items, we need to use six.iteritems() to get consistent behavior between Python 2 and 3.

    1. I had no idea, thank you for letting me know. Is this common across most python programs, or just reviewboard in particular?
      If we are not using the variable, such as the situation I am in, what should I use instead of _?

    2. Christian answered this question in the slack:
      "naming is better than not, so that it's self-documenting. Avoiding unpacking those variables is even better. Fewer things for Python to deal with, less a maintainer has to worry about"

  16. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     

    Rather than putting this into a variable and pulling the indexes out, let's just unpack with:

    filename, line_num = \
        os.path.basename(key['posn']).split(':', 2)
    

    Could you also put a comment above this showing the general format of what we should expect here, so it's documented?

  17. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     
     

    You can put the format argument on the same line, since it fits. No need for parens there either, since we're not building a tuple (only one arg, and also (message) is equivalent to message since there's no tuple indicator, like a trailing comma or a second value).

  18. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     
     
     

    No blank line here.

  19. bot/reviewbot/tools/gotool.py (Diff revision 5)
     
     

    Same comment as above regarding having identifying information in the message.

  20. 
      
jblazusi
jblazusi
david
  1. 
      
  2. bot/reviewbot/tools/gotool.py (Diff revision 7)
     
     

    Things within here are getting pretty deeply nested. Maybe break out the test and vet implementations into their own methods?

  3. bot/reviewbot/tools/gotool.py (Diff revision 7)
     
     
     
     
     
     
     
     

    This should be wrapped in a try/except.

  4. bot/reviewbot/tools/gotool.py (Diff revision 7)
     
     
     
     
     
     
     
     

    Would it be possible to add a comment showing an example of what the JSON output from go vet looks like? That way people looking at the code can see how it maps to what it's parsing.

  5. bot/reviewbot/tools/gotool.py (Diff revision 7)
     
     
     

    This is potentially running a lot of times, since it's deeply nested inside several loops.

    At the top level, can we create a new dict that maps the patched file path back to f? That way we can just index into it instead of looping every time.

  6. 
      
jblazusi
jblazusi
david
  1. Ship It!
  2. 
      
jblazusi
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-2.0.x (acb0cf2)
Loading...