Change how commit message parsing and diff calls work.

Review Request #5457 — Created Feb. 11, 2014 and submitted

Information

RBTools
master
ae38c8f...

Reviewers

The SCMClients had some confusing responsibilities where they attempted
to handle the guessed summary/description logic in the diff methods,
getting and setting data in the options object. This logic had no
business being in the diff method, or in SCMClients for that matter, and
should have exclusively been in rbt post.

On top of that, we had some variations in how the commit summary and
description were fetched and processed in the different SCMClients. This
led to differences in behavior and subtle bugs. For example, Mercurial
was grabbing the summary from the wrong end of a commit range, and
Bazaar was providing extraneous information for the descriptions.

Now, instead of extract_summary() and extract_description() methods,
SCMClients only need to implement get_raw_commit_message(). This returns
the full set of commit messages for the provided revisions. The base
SCMClient class then provides a get_commit_message() that calls that and
splits the result into a summary and a description.

The description also no longer includes the summary line, preventing the
first line of the Description field from mirroring the Summary field
unnecessarily.

rbt post is then responsible for calling this when it needs to generate
the guessed content or match an existing review request in order to
update it.

Since both get_commit_message() and diff() need parsed revisions, it no
longer makes sense for diff() to call parse_revision_spec(). That's now
the job of the caller, which results in diff() being just a bit simpler.

At this point, the get_diff() utility function is no longer necessary
either, so it's been removed.

Now that rbt post owns more of the logic, it can also be smarter about
when things are called. Previously, if using -u, we'd parse the
revision spec twice (once in diff() and once when generating the
summary/description for matching).

We also would make at least two calls to get the summary and description
(more, if using -g and the description was blank).

rbt post now caches the parsed revision spec, and the commit message,
preventing us from having to run as many costly commands.

This all results in a cleaner design with less responsibility in
SCMClient, fewer usages of the options object, and faster posting.

Posted an update to this change using -g -r. Then followed it up
with additional updates as I tweaked some comments, using -u and
-g -u.

Tested posting with -g on Bazaar, Git and Mercurial in the following scenarios:

  • Single line summary.
  • Multi-line summary.
  • Single line summary + blank line + description

In the first two, the whole commit message was used for the description.
For the third, only the description was used.

Tested posting updates to changes using -u on Bazaar, Git and Mercurial.

Tested also with -g -u.

Unit tests passed. Discovered while updating tests that Mercurial's summary
guessing read from the wrong end of a commit range, and fixed that. Verified
by hand.

No errors or warnings from PyFlakes.

Description From Last Updated

How about prefixing this method name with _?

daviddavid

Debug code?

daviddavid
david
  1. Can you clarify what this is trying to do? Your description talks a lot about the actual changes but not what the goal is.

  2. 
      
chipx86
david
  1. OK, I think I'm going to give you the same comment that I gave on https://reviews.reviewboard.org/r/5436/:

    Because other SCMs have the same issue (git, bazaar, perforce, etc.), I think I'd prefer that this be done in 'rbt post' by checking if the first line of self.options.description is the same as self.options.summary.

    1. I wouldn't mind doing that except it seems extremely silly to go out of our way in Git to say "give me the summary and the description and concatenate them together" only to say "Now get rid of that summary line. I don't want it."

    2. Allow me to change your mind: It's a single, simple implementation in one place instead of complex SCM-specific implementations in each client. I'm also pretty sure your "non-silly" way in git actually regresses posting multiple revisions, because it won't have the summaries for any of the commits.

    3. Okay, fair enough.

    4. So I'm still not sure that rbt post is the ideal place for this. The purpose of the function is to extract the description, and I feel that the caller should be able to trust that the value returned is the correct description without having to perform any operations on it. By putting it into rbt post, any other clients that want this same logic would have to re-implement it.

      Here's my proposal. Let me know what you think.

      1. Rename the current extract_description methods to extract_commit_message.
      2. Add a base extract_description that calls extract_commit_message and performs the operation.

      What do you think?

    5. That sounds good, but I'd like it to also include a refactor of how extract_* is used. Right now they exist as "public" API on the clients for use by 'rbt post -u', but it's up to the individual clients to notice that the user is using -g/--guess-* and set self.options.summary and self.options.description. If you do this, we should have the command interpret the guess flags, and just have the clients provide the interface to the SCM and not the behavior.

    6. Will do. I'd like to separate this into two patches though, since I think that work will be a little more invasive. I'll update this change to go through the new function, and then will follow it up with a change that gets rid of all the SCMClient summary and description setting code. Sound okay?

    7. OK, but please make sure to test guessing and 'rbt post -u' with all affected clients.

  2. 
      
chipx86
david
  1. Can you test posting changes for specific files (in svn, for example), and perforce paths?

    1. I have a change I'm putting up that adds back raising of InvalidRevisionSpecError to the user instead of masking it, but otherwise, it looks like behavior has not changed.

      I did run into some issues while trying to test this. These are issues with the current master, and not with my change, though my change inherits the same issues:

      • Using -I with Subversion doesn't do anything if posting without a revision or posting a changelist. It only works when specifying a revision.
      • I can't get Perforce path posting to work at all. p4 tries to parse it as a revision, fails, returns an error code, and causes rbt to quit. I don't know how to test this properly, but it seems broken.
  2. rbtools/clients/__init__.py (Diff revision 3)
     
     
     
    Show all issues

    How about prefixing this method name with _?

    1. I actually meant to change that docstring. It was originally designed a bit differently. I think it's fine now to have this be public API for callers that want the full-on raw commit description, and not a parsed version.

  3. 
      
chipx86
david
  1. 
      
  2. rbtools/clients/perforce.py (Diff revisions 3 - 4)
     
     
     
     
    Show all issues

    Debug code?

    1. Oops, yeah. I'll remove it.

  3. 
      
chipx86
david
  1. Ship It!

  2. 
      
chipx86
Review request changed
Status:
Completed