Add a feature to automatically suggest reviewers for a given review request

Review Request #5740 — Created April 25, 2014 and submitted

Information

Review Board
master
ee498ff...

Reviewers

This change is, for the most part, the same as Natasha's change at /r/5029. I had to port it over to the new fields code, and make a handful of other small changes to get it to work.

My plan is to push this to a branch off master, where we can play with it and do some performance testing.


 
Description From Last Updated

""" on the next line.

chipx86chipx86

We should be using format_html. Would also be nice to use dictionary args for formatting.

chipx86chipx86

Should use format_html here too.

chipx86chipx86

Should use single quotes.

chipx86chipx86

Swap these.

chipx86chipx86

Blank line after this.

chipx86chipx86

Blank line between these.

chipx86chipx86

How about post_field_html instead of extra_html? Would be nice to be explicit there, in case we decide to allow for …

chipx86chipx86

If we don't mind a little extra '/' stripping, this can be: diffed_files = [ fd.source_file.lstrip('/') for fd in diffset.files.all() …

chipx86chipx86

Alignment looks weird.

chipx86chipx86

Let's swap these conditions. rr_id != self.pk is a faster check.

chipx86chipx86

This can be: user, user_timestamp, related_review_request = all_reviews[rev]

chipx86chipx86

I don't know if it's faster to use six.iteritems here, but since this will be the equivalent of iteritems on …

chipx86chipx86

This is formatted awkwardly. How about: sorted_users = [ user for user, weight in ... ]

chipx86chipx86

Can we pull out .suggest-user once, instead of re-querying every time?

chipx86chipx86

With the exception of selector above, all of these are in alphabetical order. Can we move these (and selector) to …

chipx86chipx86

Should just use a pre-fetched element.

chipx86chipx86

This looks to be doing too much. If we kept this logic, then this could just be: this._$suggestedPeople .empty() .append($allPeople); …

chipx86chipx86

We fetch all the diffsets in review_detail. It'd be nice to make use of those to grab the latest, if …

chipx86chipx86

Col: 80 E251 unexpected spaces around keyword / parameter equals

reviewbotreviewbot

This terrifies me still. It's a large join. We should do some testing against large MySQL databases before this goes …

chipx86chipx86

Col: 26 E127 continuation line over-indented for visual indent

reviewbotreviewbot

Col: 80 E501 line too long (80 > 79 characters)

reviewbotreviewbot

list comprehension redefines 'user' from line 195

reviewbotreviewbot

list comprehension redefines 'weight' from line 206

reviewbotreviewbot
chipx86
  1. I didn't do a full review, since I'm pretty exhausted right now, but here's a few things that I noticed from a cursory glance.

  2. reviewboard/reviews/builtin_fields.py (Diff revision 1)
     
     
    Show all issues

    """ on the next line.

  3. reviewboard/reviews/builtin_fields.py (Diff revision 1)
     
     
     
    Show all issues

    We should be using format_html. Would also be nice to use dictionary args for formatting.

  4. reviewboard/reviews/builtin_fields.py (Diff revision 1)
     
     
     
     
     
     
     
     
    Show all issues

    Should use format_html here too.

  5. reviewboard/reviews/builtin_fields.py (Diff revision 1)
     
     
    Show all issues

    Should use single quotes.

  6. Show all issues

    Swap these.

  7. Show all issues

    Blank line after this.

  8. Show all issues

    Blank line between these.

  9. Show all issues

    How about post_field_html instead of extra_html? Would be nice to be explicit there, in case we decide to allow for custom HTML before the <span>.

  10. 
      
david
david
chipx86
  1. 
      
  2. reviewboard/reviews/models/base_review_request_details.py (Diff revision 2)
     
     
     
     
     
     
    Show all issues

    If we don't mind a little extra '/' stripping, this can be:

    diffed_files = [
        fd.source_file.lstrip('/')
        for fd in diffset.files.all()
    ]
    
  3. Don't we have some normalization somewhere now for this?

  4. Show all issues

    Alignment looks weird.

  5. Show all issues

    Let's swap these conditions. rr_id != self.pk is a faster check.

  6. Show all issues

    This can be:

    user, user_timestamp, related_review_request = all_reviews[rev]
    
  7. Show all issues

    I don't know if it's faster to use six.iteritems here, but since this will be the equivalent of iteritems on Python 3.x, we might as well use it for 2.x for consistency.

  8. Show all issues

    This is formatted awkwardly. How about:

    sorted_users = [
        user
        for user, weight in ...
    ]
    
  9. Show all issues

    Can we pull out .suggest-user once, instead of re-querying every time?

    1. I don't see a good place to stash that except in a global variable, which seems worse. This also only happens once per inline editor open/close, which isn't really a performance problem.

    2. Aren't all the uses within this same class? I don't see any others. Doesn't seem like the variable would have to be global at all.

  10. Show all issues

    With the exception of selector above, all of these are in alphabetical order. Can we move these (and selector) to be consistent in that way?

  11. Show all issues

    Should just use a pre-fetched element.

    1. Same comments as above.

  12. reviewboard/static/rb/js/views/reviewRequestEditorView.js (Diff revision 2)
     
     
     
     
     
     
    Show all issues

    This looks to be doing too much. If we kept this logic, then this could just be:

    this._$suggestedPeople
        .empty()
        .append($allPeople);
    

    However, I'm not really sure why we need to do all this work. I wasn't sure why we were, but I suspect it's because we insert those ',' characters. If so, we're just doing way too much work here.

    What I'd suggest is not doing any of this, and instead just list the visible users as standard elements. Then, in CSS, we can do something like:

    .suggest-user:after {
        content: ", "
    }
    
    .suggest-user:last-child:after {
        content: ""
    }
    
  13. 
      
david
reviewbot
  1. Tool: Pyflakes
    Processed Files:
        reviewboard/reviews/builtin_fields.py
        reviewboard/reviews/models/base_review_request_details.py
        reviewboard/reviews/fields.py
    
    Ignored Files:
        reviewboard/templates/reviews/review_request_box.html
        reviewboard/static/rb/css/pages/reviews.less
        reviewboard/static/rb/js/views/reviewRequestEditorView.js
    
    
    
    Tool: PEP8 Style Checker
    Processed Files:
        reviewboard/reviews/builtin_fields.py
        reviewboard/reviews/models/base_review_request_details.py
        reviewboard/reviews/fields.py
    
    Ignored Files:
        reviewboard/templates/reviews/review_request_box.html
        reviewboard/static/rb/css/pages/reviews.less
        reviewboard/static/rb/js/views/reviewRequestEditorView.js
    
    
  2. Show all issues
    Col: 80
     E251 unexpected spaces around keyword / parameter equals
    
  3. Show all issues
    Col: 26
     E127 continuation line over-indented for visual indent
    
  4. Show all issues
    Col: 80
     E501 line too long (80 > 79 characters)
    
  5. Show all issues
     list comprehension redefines 'user' from line 195
    
  6. Show all issues
     list comprehension redefines 'weight' from line 206
    
  7. 
      
chipx86
  1. 
      
  2. Show all issues

    We fetch all the diffsets in review_detail. It'd be nice to make use of those to grab the latest, if available, to reduce a query.

    I'm very concerned about the impact this feature will have to load times, so any query reduction we can do is very important.

    1. I'm really not sure how to plumb that through (this method is called from the field), plus this is also visible on the diff viewer page.

    2. review_detail already ends up calling ReviewRequest.get_diffsets(), which will cache the list in ReviewRequest._diffsets. What I'd suggest is to have get_latest_diffset just return get_diffsets()[-1]. It'll fetch a little bit extra in the case where get_diffsets() has not been called, but both the diff viewer and review request page will already have the diffsets.

  3. Show all issues

    This terrifies me still. It's a large join. We should do some testing against large MySQL databases before this goes in. I'd like to see whether it's faster to do a large join, or to split it into a couple different queries.

    One thought on optimization:

    We're going to fetch a lot of user data, and a lot of it may be the same data. Instead of grabbing 'user', let's just grab the ID, and then after we dedup later, we can fetch the data we need from those user IDs. Same for the reviews__user.

    1. I really, really want to get this pushed to a branch so that it's not moldering in my local clone. I don't intend to merge this into master until we've done a lot of testing and optimization.

    2. Ok.

  4. 
      
david
Review request changed
Status:
Completed
Change Summary:
Pushed to dev/suggested-reviewers (0b5d243)