Parse out testing_done field from commit messages

Review Request #9890 — Created April 24, 2018 and updated

mandeep
RBTools
master
3ca9c57...
rbtools
brennie

Previously, one had to manually add comments in the testing done section
online for any testing they did for a given patch. However, now if one
was to post something with "Testing done:" in the commit message it will
split that out into the Testing Done section when posting.

Ran unit tests.

Created a test repository on a local server and posted the following review
requests testing cases with:
1. Summary, Description and Testing Done
2. Summary and Description

With a previous Summary, Description and Testing Done:
1. Update Summary
2. Update Description
3. Update Testing Done

With a previous Summary and Description (no Testing Done):
1. Add Testing Done

Testing different ways of 'Testing Done' inputs:
1. 'testing done'
2. 'testing done:'
3. 'Testing done'
4. 'Testing done:'
5. 'testing Done'
6. 'testing Done:'
7. 'tEstIng DOnE:'

  • 0
  • 0
  • 111
  • 0
  • 111
Description From Last Updated
mandeep
mandeep
brennie
  1. 
      
  2. Summary suggestion:

    "Parse out testing_done field from commit messages"

  3. rbtools/clients/__init__.py (Diff revision 1)
     
     
     
     
     

    Can you re-flow this as:

    result = {
        'key': 'value',
    }
    
  4. rbtools/clients/__init__.py (Diff revision 1)
     
     

    How about: "Determine if the commit message contains testing done information."

  5. rbtools/clients/__init__.py (Diff revision 1)
     
     

    You can add:

    only_testing_done = i == 1
    

    and remove the computation for it below. This will be a cheaper computation.

    We will also need to set it to False before in the else statement.

  6. rbtools/clients/__init__.py (Diff revision 1)
     
     
     
     

    Comments should never have first-person statements. They are not the justification for your changes -- the commit message is.

    You can explain something is, but not why something changed. An explanation of why something changed doesn't make sense when you're looking at a file -- it only makes sense when you're looking at the commit as a whole. And when you're doing that, you can read the commit message for justification.

  7. rbtools/clients/__init__.py (Diff revision 1)
     
     

    These sorts of comments are unnecessary because they can be inferred from the if statements.

  8. rbtools/clients/__init__.py (Diff revision 1)
     
     

    Comments should be full sentences: they should begin with a capital letter and end with a period.

  9. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     

    Missing module-level docstring.

  10. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     

    Docstring should be of the format: "Unit tests for rbtools.clients.SCMClient"

  11. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     
     

    Docstring should be of the form

    """Testing SCMClient.get_commit_message when ..."""
    

    If it goes over 79 characters and wraps, the trailing """ should be on its own line.

    Same below.

  12. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     

    ALL_CAPS is reserved for module-level and class-level constants. This would be fine as commit_message

  13. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     
     
     
     
     
     

    Single triple-quotes ('''). Double quotes are reserved for docstrings only.

    Same below.

  14. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     

    Only module-level and class-level constnats should be ALL_CAPS. This is fine as commit_message.

  15. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     
     

    We're using these immediately, so I don't know that this is necessary.

    Same below.

  16. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     

    We are doing this in every test. We can do:

    def setUp(self):
        super(SCMBaseClientTests, self).setUp()
    
        self.client = SCMClient()
    

    and use self.client in all the tests. We like to keep all common set-up / teardown logic in these methods. See unittest.TestCase docs for information about setUp, etc.

  17. rbtools/clients/tests/test_base.py (Diff revision 1)
     
     
     

    Instead of this, how about:

    self.assertEqual(
        client.get_commit_message([]),
        {
            'summary': lines[0],
            'description': '',
            'testing_done': lines[3],
        })
    

    That way we test all three at once and test that there are no other keys returned.

    Same below.

  18. rbtools/commands/post.py (Diff revision 1)
     
     

    "commit message"

    Missing a period.

  19. rbtools/commands/post.py (Diff revision 1)
     
     

    "commit message"

    Missing a period.

  20. rbtools/commands/post.py (Diff revision 1)
     
     

    Is this still a todo?

  21. rbtools/commands/post.py (Diff revision 1)
     
     

    Undo this.

  22. rbtools/commands/post.py (Diff revision 1)
     
     
     

    "summary, description, or testing done"

  23. rbtools/commands/post.py (Diff revision 1)
     
     
     

    instead of saying the field names over again, how about:

    "override the options for each guessed field" (in lieu of "get the ... options")

  24. rbtools/commands/post.py (Diff revision 1)
     
     
     

    Leftover debugging? :)

  25. rbtools/commands/post.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     
     
     
     

    We're doing this thrice, so we should probably refactor this into a method.

  26. rbtools/utils/match_score.py (Diff revision 1)
     
     

    "Return" over get.

    Can you add args/ returns to this docstring?

    On second thought, you may want to hold off on these docstring changes since I believe David has a pending change that updates all docstrings in RBTools to bring them up to spec. Once that lands, you can rebase off of it.

  27. rbtools/utils/match_score.py (Diff revision 1)
     
     

    ``summary``, ``description``, and ``testing_done``
    
  28. rbtools/utils/match_score.py (Diff revision 1)
     
     

    :py:class:`difflib.SequenceMatcher`
    
  29. rbtools/utils/match_score.py (Diff revision 1)
     
     

    :py:class:`Score`
    
  30. rbtools/utils/match_score.py (Diff revision 1)
     
     
     
     
     
     
     
     
     

    I assume {field}_pair is a tuple, so can we do:

    summary_score = SequenceMatcher(None, *summary_pair).ratio()
    description_score = SequenceMatcher(None, *description_pair).ratio()
    testing_done_score = SequenceMatcher(None, *testing_done_pair).ratio()
    

    Ideally, we could also do this in a dict comprehension and splat into the Score intializer:

    raw_scores = {
        field_name: SequenceMatcher(None, *field_pair).ratio()
        for field_name, field_pair in (
            ('summary_score', summary_pair),
            ('description_score', description_pair),
            ('testing_done_score', testing_done_pair),
        )
    }
    
    return Score(**raw_scores)
    
  31. rbtools/utils/review_request.py (Diff revision 1)
     
     

    This needs to be updated with args, returns, etc but needs to be based of David's docstring change.

  32. rbtools/utils/review_request.py (Diff revision 1)
     
     

    Leftover testing code? :)

  33. 
      
mandeep
brennie
  1. 
      
  2. rbtools/clients/__init__.py (Diff revision 2)
     
     
     
     
     

    Can we format this as:

    result = {
        'summary': lines[0],
        'description': b'',
        'testing_done': b'',
    }
    
  3. rbtools/clients/__init__.py (Diff revision 2)
     
     
     

    Blnank line between these.

  4. rbtools/clients/__init__.py (Diff revision 2)
     
     

    This comment is redundant.

  5. rbtools/clients/__init__.py (Diff revision 2)
     
     

    This needs to be in the if r.match(line) block above.

  6. rbtools/clients/__init__.py (Diff revision 2)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    Instead of doing .strip() on each line, how about we remove those calls and do:

    return {
        key: value.strip()
        for key, value in six.iteritems(result)
    }
    
  7. rbtools/clients/__init__.py (Diff revision 2)
     
     

    We've already set summary.

  8. rbtools/clients/__init__.py (Diff revision 2)
     
     

    We've already set summary.

  9. rbtools/clients/__init__.py (Diff revision 2)
     
     

    We've already set summary.

  10. rbtools/clients/__init__.py (Diff revision 2)
     
     

    We've already set summary.

  11. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     
     
     
     
     
     
     
     
     
     
     
     

    Module docstrings need to be first, e.g.

    """Unit tests for rbtools.clients.SCMClient."""
    
    from __future__ import unicode_literals
    
    # ....
    
  12. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  13. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  14. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  15. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  16. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  17. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  18. rbtools/clients/tests/test_base.py (Diff revision 2)
     
     

    No period.

  19. rbtools/commands/post.py (Diff revision 2)
     
     

    Mind adding a comment same as above to indicate what this is doing? Also those comments are missing periods, could you add them?

  20. rbtools/commands/post.py (Diff revision 2)
     
     

    No blank line here.

  21. rbtools/commands/post.py (Diff revision 2)
     
     
     

    Leftover debugging?

    1. No, this is from before for debugging purposes with the command rbt post --debug.

  22. rbtools/commands/post.py (Diff revision 2)
     
     
     

    Docstring summaries should be written in the imperitive mood (i.e., they give an order or command). In this case, it would be "Check" over "Checks", e.g.

    Check if the guess value for a given field is True or False
    

    However, we could also write this as:

    Determine whether or not we should guess a given field.
    

    Also missing args/returns.

  23. rbtools/utils/match_score.py (Diff revision 2)
     
     

    Can you revert changes to this file? See my comment in rbtools/utils/review_request.py for the reasoning.

  24. rbtools/utils/review_request.py (Diff revision 2)
     
     
     
     
     
     
     

    Can you revert the changes to this file? I don't think we want to match on testing done, since a LOT of changes have "ran unit tests" as testing done, we would get lots of false positives.

  25. 
      
brennie
  1. 
      
  2. rbtools/commands/post.py (Diff revision 2)
     
     

    Revert this too.

  3. 
      
mandeep
mandeep
brennie
  1. 
      
  2. Can you write your testing done without "I" statements, e.g.

    "Ran unit tests."
    "Created a test repository..."

  3. rbtools/clients/__init__.py (Diff revision 3)
     
     
     

    "is description and [i:] is testing done"

  4. rbtools/clients/__init__.py (Diff revision 3)
     
     
     

    This is unnecessary with the above assignment.

  5. rbtools/clients/__init__.py (Diff revision 3)
     
     

    We are doing this when we assign result above.

  6. rbtools/clients/__init__.py (Diff revision 3)
     
     

    WIth the above comment, this would be elif not summary

  7. rbtools/clients/tests/test_base.py (Diff revision 3)
     
     
     
     

    Docstrings should be in the following format:

    """Single line summary.
    
    Multi-line description.
    """
    

    This can just be

    """Unit tests for rbtools.clients.SCMClient."""
    
  8. rbtools/clients/tests/test_base.py (Diff revision 3)
     
     
     

    Blank line between these.

  9. rbtools/commands/post.py (Diff revision 3)
     
     

    unicode.

  10. rbtools/commands/post.py (Diff revision 3)
     
     

    Missing type: bool

  11. rbtools/commands/post.py (Diff revision 3)
     
     
  12. rbtools/commands/post.py (Diff revision 3)
     
     

    ) on previous line

  13. rbtools/commands/post.py (Diff revision 3)
     
     

    ) on previous line

  14. rbtools/commands/post.py (Diff revision 3)
     
     

    ) on previous line

  15. rbtools/commands/post.py (Diff revision 3)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    Instead of enumerating all these cases, how about:

                        if guess_summary:
                            self.options.summary = guessed_summary
    
                        if guess_description:
                            if not guess_summary:
                                # If we're guessing the description but not the summary
                                # (for example, if --summary was included), we probably
                                # don't want to strip off the summary line of the
                                # commit message.
                                if guessed_description.startswith(guessed_summary):
                                    self.options.description = guessed_description
                                else:
                                    self.options.description = \
                                        guessed_summary + '\n\n' + guessed_description
                            else:
                                self.options.description = guessed_description
    
                        if guess_testing_done:
                            self.options.testing_done = guessed_testing_done
    
  16. rbtools/utils/match_score.py (Diff revision 3)
     
     

    Can you revert all changes to this file?

    git checkout master -- rbtools/utils/match_score.py
    
  17. rbtools/utils/review_request.py (Diff revision 3)
     
     

    Can you revert all changes to this file?

    git checkout master -- rbtools/utils/review_request.py
    
  18. 
      
mandeep
brennie
  1. Just two comments. This is looking really good! :)

  2. rbtools/clients/__init__.py (Diff revision 4)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    We can optimize this to only check for a testing done when we have a description. We also don't need to enumerate all cases if we do the work of parsing out the testing done when we find it.

    if len(lines) >= 3 and lines[0] and not lines[1]:
        # We have a description. This may also contain a testing done.
        description = lines[2:]
    
        for i, line in enumerate(description):
            if r.match(line):
                result['testing_done'] = '\n'.join(description[i:])
                description = description[:i - 1]
    
        result['description'] = '\n'.join(description)
    

    This code will be a lot easier to follow and understand.

  3. rbtools/clients/__init__.py (Diff revision 4)
     
     

    Why is this lines[0:] now? This used to be lines[1:]. As it stands, i doesn't match up with the actual line number so tests should be failing.

    1. To account for the case where we only testing done, I changed enumerate[(lines[1:], 1) to enumerate[(lines[0:], 1). This was a problem I ran into with the old code.

  4. 
      
mandeep
mandeep
brennie
  1. 
      
  2. rbtools/clients/__init__.py (Diff revision 6)
     
     

    # We have a clear "Summary\n\nDescription"-style commit message so we can split out the description.
    
  3. rbtools/clients/__init__.py (Diff revision 6)
     
     
     
     

    What if len(lines) == 2 and lines[1] is not empty, e.g.

    This is a summary
    this is a description
    

    This should end up with

    result = {
        'description': "This is a summary.\nThis is a description.',
        'summary': 'This is a summary.',
        'testing_done': '',
    }
    
    1. Created a unit test for this and it produced the result above.

  4. rbtools/clients/__init__.py (Diff revision 6)
     
     

    Commends should be a full sentence, beginning with a capital letter and ending with a period.

  5. rbtools/clients/__init__.py (Diff revision 6)
     
     
     

    # We do not have a clear "Summary\n\nDescription"-style commit message so we assume everyting is the description.
    
  6. 
      
mandeep
brennie
  1. 
      
  2. What about the case of a commit message like:

    Testing done:
    foo
    
  3. What happens in the case of a commit message like:

    A commit message
    
    Testing done:
    Testing done:
    I did tests
    
  4. rbtools/clients/__init__.py (Diff revision 7)
     
     

    Your code below does enumerate on description. However, it is a blank string in the case where we only have summary and our loop expects a list!

    I suggest we do description = [] here and do the following on lines 298--304:

    if description:
        for i, line in enumerate(description):
            if r.match(line):
                # ...
                break
    
        result['description'] = '\n'.join(description)
    

    This way we skip the case where description is a

    1. ... an empty list.

  5. rbtools/clients/__init__.py (Diff revision 7)
     
     

    We should break here.

  6. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    "SCMself.client"?

  7. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    "SCMself.client"?

  8. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    "SCMself.client"?

  9. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    "SCMself.client"?

  10. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    "SCMself.client"?

  11. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    "SCMself.client"?

  12. rbtools/clients/tests/test_base.py (Diff revision 7)
     
     

    No period.

  13. 
      
mandeep
brennie
  1. 
      
  2. rbtools/clients/__init__.py (Diff revision 8)
     
     

    "a summary."

  3. rbtools/clients/__init__.py (Diff revision 8)
     
     

    I know I suggested the max() solution, but let's make this more clear by doing the following:

    if i == 0:
        # The entire description is the testing done, so
        # the resulting description will be empty.
        description = []
    else:
        description = description[:i - 1]
    
  4. rbtools/commands/post.py (Diff revision 8)
     
     

    , after description

  5. rbtools/commands/post.py (Diff revision 8)
     
     
     
     
     
     

    Re-flow these to take full advantage of the 79 character limit.

  6. rbtools/commands/post.py (Diff revision 8)
     
     

    Let's call this field_value

  7. rbtools/commands/post.py (Diff revision 8)
     
     

    "Whether or not we are posting a new review request."

  8. rbtools/commands/post.py (Diff revision 8)
     
     

    Whether or not the given field should be guessed.
    
  9. 
      
mandeep
brennie
  1. Ship It!
  2. 
      
chipx86
  1. 
      
  2. rbtools/clients/__init__.py (Diff revision 9)
     
     
     
     
     
     
     

    Since this method is being modified with new results, the documentation should be updated to use the modern docstring format. You'd need to make the following changes:

    1. Change "Returns" to "Return" in the summary (we're moving to that style everywhere else).
    2. Add an "Args" section detailing what revisions is.
    3. Add a "Returns" section detailing what's in the dictionary and what string types to expect.

    See https://www.notion.so/reviewboard/Writing-Codebase-Documentation-e16312b5f061437cb73cbfa369ac3cb5

  3. rbtools/clients/__init__.py (Diff revision 9)
     
     
     

    I don't believe b'' is correct. We're working with Unicode strings everywhere else, and setting them when we set result['description'] below.

  4. rbtools/clients/__init__.py (Diff revision 9)
     
     
     

    This should go in the second if block below, instead of pass. We don't need to default it to anything above.

  5. rbtools/clients/__init__.py (Diff revision 9)
     
     
     

    Blank line between these.

  6. rbtools/clients/__init__.py (Diff revision 9)
     
     
     

    Blank line between these.

  7. rbtools/clients/__init__.py (Diff revision 9)
     
     
     
     
     

    I'd rather we explicitly strip as we set above, and just return result here. That way, we're not doing any unnecessary rebuilding of a dictionary, and we're very clear with what's going into it.

  8. rbtools/clients/tests/test_base.py (Diff revision 9)
     
     

    These should all be test_get_commit_message_with_.... The get_ part is missing from the test names.

  9. rbtools/clients/tests/test_base.py (Diff revision 9)
     
     
     
     
     
     
     

    The dedenting and alignment and everything makes this a bit less ideal for readability. Can you change all these to this form:

    commit_message = (
        b'This is the summary.\n'
        b'\n'
        b'Testing Done:\n'
        b'....\n'
    )
    

    That may seem more "noisy" (the b prefixes and \n), but with unit tests it's very important to clearly see the data you're working with, and so it's an advantage in this case. We know where every newline is and where the start of every line is, and code doesn't start to run into other code below it due to the indentation.

  10. rbtools/clients/tests/test_base.py (Diff revision 9)
     
     
     
     

    Can you compare against the explicit strings we're expecting? That helps with readability and prevents issues down the road as logic changes.

    This comment applies to all tests.

  11. rbtools/commands/post.py (Diff revision 9)
     
     
     

    There's an alignment issue here.

    It's also lacking the extended_help from the other --guess-* flags.

  12. rbtools/commands/post.py (Diff revision 9)
     
     

    -g really should be implying all the guess options.

  13. rbtools/commands/post.py (Diff revision 9)
     
     

    It's better to start with a "truthy" version first, inversing this if/else. So:

    if guess_summary:
        ...
    else:
        ...
    
  14. rbtools/commands/post.py (Diff revision 9)
     
     
     
     

    Can you change this to do:

    '%s\n\n%s' % (guessed_summary, guessed_description)
    
  15. 
      
mandeep
brennie
  1. 
      
  2. You seem to have posted two changes here.

  3. rbtools/clients/__init__.py (Diff revision 10)
     
     
     

    Can we reword this as:

    The first line of the commit messages of the given revisions will be used as the summary.
    

    Can we also mention this will attempt to find a Testing Done section?

  4. rbtools/clients/__init__.py (Diff revision 10)
     
     
     

    Can we reformat this as:

                        result['testing_done'] = \
                            '\n'.join(description[i + 1:]).strip()
    
  5. 
      
mandeep
Review request changed

Commit:

-95c7a5f1005d1be149fafbd461993689d413b93a
+3ca9c57bbfa05d30313b944d7522a7eb2775fc8d

Diff:

Revision 11 (+314 -43)

Show changes

Checks run (2 succeeded)

flake8 passed.
JSHint passed.
Loading...