Make accessibility checks more robust for Reviews and Comments queries

Review Request #12424 — Created June 29, 2022 and submitted

maubin
Review Board
release-5.0.x
reviewboard

Recently, it was discovered that when querying for reviews there may be
duplicate Q(public=True)'s that appear in the queries to fetch the reviews.
The cause of this duplication lies in ReviewManager's _query method, which
handles the logic for querying for reviews. If an unauthenticated or anonymous
user was querying for the reviews, the public query field would be set twice,
once to the public value that was passed and once to True, with the latter
ensuring that only public reviews get returned to unauthenticated/anonymous
users. This however makes the query less efficient and can lead to the wrong
results being included in the returned queryset if a False value for public
was passed in.

To fix this, we ensure that the public query field is only set once, for any
type of user, in ReviewManager._query. This fix was applied to
CommentManager._query as well.

This change also adds a public parameter to ReviewManager.accessible to
ensure that the public parameter in ReviewManager._query handles all of
the logic surrounding public and unpublished reviews. Previously, you would
have to set the public query field using the extra_query parameter in
ReviewManager.accessible, which could also lead to duplicate or
conflicting Q(public= queries.

Finally, this change also improves the tests for the ReviewManager and
CommentManager. The tests are more clear, organized and well documented,
they use assertQueries, and have more comprehensive test coverage surrounding
access control.

  • Added new unit tests for ReviewManager.accessible with setting the
    public parameter.
  • Ran all unit tests with success.
Summary
Make accessibility checks more robust for Reviews and Comments queries
Description From Last Updated

How should we check that the diffset related to the review is accessible by the user? I see the logic …

maubinmaubin

I think we can have a more performant query if this becomes something like: q &= ( Q(user=user) | (Q(public=True) …

chipx86chipx86

Seems like a good performance improvement. Let's document these conditions for an empty result with a comment.

chipx86chipx86

So going through the other changes for some duplicate object fixes I'm putting together, I think we need to take …

chipx86chipx86

For booleans, we allow ('1', 'true', 'True'). Not ideal to repeat this everywhere, but let's call that a problem for …

chipx86chipx86

This should be review__public, right? If so, and no tests failed, this means we're missing a test.

chipx86chipx86

Imports should be in alphabetical order.

chipx86chipx86

Comments should generally end with a period.

chipx86chipx86

Blank line before new blocks (same below).

chipx86chipx86

Should I consolidate these common queries into a variable or something? Then use the variable instead of repeating the full …

maubinmaubin

For multi-line lists, we generally do: [ review1, review2, ..., review5, ..., ] Or, when appropriate, one item per row …

chipx86chipx86

Given this is one Q, we can just put this on the same line. Same in other places.

chipx86chipx86

Another example of where we want periods at the end of the sentence. Really applies to all such comments.

chipx86chipx86

trailing whitespace Column: 38 Error code: W291

reviewbotreviewbot

Missing a trailing comma (we always want to terminate multi-column list items with a trailing comma to minimize line changes …

chipx86chipx86
chipx86
  1. 
      
  2. reviewboard/reviews/managers.py (Diff revision 1)
     
     
     
     

    I think we can have a more performant query if this becomes something like:

    q &= (
        Q(user=user) |
        (Q(public=True) &
         (repo_q &
          Q(review_request__target_people=user) |
          group_q))
    )
    

    That allows the query planner to have to consider only one set of rows for a public=True, and short-circuits that if we know the user matches.

    It does mean repeating some of that above, but we can just take the (repo_q ...) bits and put that into a reusable acl_check_q or something.

  3. reviewboard/reviews/managers.py (Diff revision 1)
     
     
     

    Seems like a good performance improvement. Let's document these conditions for an empty result with a comment.

  4. reviewboard/reviews/tests/test_review_manager.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     
     
     

    So going through the other changes for some duplicate object fixes I'm putting together, I think we need to take the opportunity to better organize these tests so it's explicitly clear what data we're working with.

    As such, with these tests (and ideally we'd update the others to match, but let's start here), let's have a pattern of putting a comment above each distinct group of related data explaining what access policy that's testing. For example:

    # Published review on a publicly-accessible review request.
    review1 = self.create_review(review_request, publish=True)
    
    ...
    
    # Draft review on a publicly-accessible review request.
    ...
    
    # Published review on a private repository the user has access to.
    ...
    
    # Published review on a private repository the user does not have access to.
    ...
    

    etc.

    That way it's very easy for us to see what data we're working with, and it's very easy to know where the code should live.

    Bonus points for a comment above each assertQuerysetEqual that says what we're testing for, and usage of assertQueries() as part of that to ensure we're building exactly the queries we want.

  5. reviewboard/webapi/resources/root_review.py (Diff revision 1)
     
     
     

    For booleans, we allow ('1', 'true', 'True').

    Not ideal to repeat this everywhere, but let's call that a problem for another day.

  6. 
      
maubin
maubin
  1. 
      
  2. How should we check that the diffset related to the review is accessible by the user? I see the logic for it in ReviewRequest._is_diffset_accessible_by(user, diffset) but how can we efficiently do this in the Managers through a query?

    1. There's no way to, because this isn't database-bound. The computations are Python code, which may do things like check a third-party service.

      That sort of check has to be done separately from accessible results.

  3. reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    Should I consolidate these common queries into a variable or something? Then use the variable instead of repeating the full queries

    1. I've debated this myself. We could. We had a similar situation for the Cliosoft SOS tests in Power Pack, and what I ended doing there is providing a class of "rule" properties/functions that generated expected results for a particular type of operation.

      In this case, many of these sorts of queries would be repeated outside of an individual test suite.

      So maybe we coudl do something similar to the SOS tests, but centralized in something like reviewboard.testing.query_rules. A class per thing, like GroupQueryRules with properties for common static rules and functions for common dynamic rules (that take in, say, a user ID).

      I can see some benefits to that as we scale out these sorts of queries. Might be worth the tradeoff of added complexity of the properties/functions.

    2. I like that idea. For now I will continue writing out the full queries, and sometime in the future I can revisit this and think about it more. Btw, what class are you referring to in the Cliosoft SOS tests?

  4. 
      
chipx86
  1. This is a lot of work! Thanks for doing this :)

    Some small nits and one thing that may be lacking a test.

  2. reviewboard/reviews/managers.py (Diff revision 2)
     
     

    This should be review__public, right?

    If so, and no tests failed, this means we're missing a test.

    1. That's right, however this code is never going to be reached when using the CommentManager.accessible method because we always set public=None. This wasn't the case for the ReviewManager.accessible since we allow public to be None, False or True, and had tests checking this. For the CommentManager, I could add tests for checking CommentManager.from_user(superuser, public=True) and public=False which would trigger this code, but that seems kind of random. Should I test CommentManager._query directly instead?

    2. If we have code to build this part of the query, we'll want test coverage for it. However, if there's no chance it can ever be executed through normal means, then it's probably not worth building this part of the query.

  3. Imports should be in alphabetical order.

  4. Comments should generally end with a period.

  5. Blank line before new blocks (same below).

  6. For multi-line lists, we generally do:

    [
        review1, review2, ...,
        review5, ...,
    ]
    

    Or, when appropriate, one item per row (generally more appropriate when we may be operating more on the list over time, to minimize changes to the code).

  7. reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)
     
     
     
     

    Given this is one Q, we can just put this on the same line.

    Same in other places.

  8. Another example of where we want periods at the end of the sentence.

    Really applies to all such comments.

  9. 
      
maubin
Review request changed

Change Summary:

  • Applied feedback from last review.
  • Tuned up the CommentManager tests.

Description:

   

Recently, it was discovered that when querying for reviews there may be

    duplicate Q(public=True)'s that appear in the queries to fetch the reviews.
    The cause of this duplication lies in ReviewManager's _query method, which
    handles the logic for querying for reviews. If an unauthenticated or anonymous
    user was querying for the reviews, the public query field would be set twice,
    once to the public value that was passed and once to True, with the latter
    ensuring that only public reviews get returned to unauthenticated/anonymous
    users. This however makes the query less efficient and can lead to the wrong
    results being included in the returned queryset if a False value for public
    was passed in.

   
   

To fix this, we ensure that the public query field is only set once, for any

    type of user, in ReviewManager._query. This fix was applied to
    CommentManager._query as well.

   
   

This change also adds a public parameter to ReviewManager.accessible to

    ensure that the public parameter in ReviewManager._query handles all of
    the logic surrounding public and unpublished reviews. Previously, you would
    have to set the public query field using the extra_query parameter in
    ReviewManager.accessible, which could also lead to duplicate or
    conflicting Q(public= queries.

  +
  +

Finally, this change also improves the tests for the ReviewManager and

  + CommentManager. The tests are more clear, organized and well documented,
  + they use assertQueries, and have more comprehensive test coverage surrounding
  + access control.

Commits:

Summary
-
Make accessibility checks more robust for Reviews and Comments queries
+
Make accessibility checks more robust for Reviews and Comments queries

Diff:

Revision 3 (+4512 -616)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
chipx86
  1. Awesome, home stretch. One tiny code styling nit (in a handful of places), but everything else looks ready to go.

  2. Missing a trailing comma (we always want to terminate multi-column list items with a trailing comma to minimize line changes if we add to the list).

    This applies to other queries as well.

  3. 
      
maubin
david
  1. Ship It!
  2. 
      
chipx86
  1. Ship It!
  2. 
      
maubin
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-5.0.x (7128e09)
Loading...