This change has been marked as completed.

Describe the completed change (optional):

Pushed to release-5.0.x (7128e09)

Summary

Make accessibility checks more robust for Reviews and Comments queries

Review Request #12424 — Created June 29, 2022 and submitted 2 years, 11 months ago

Information

Owner

maubin*

Repository

Review Board

Branch

release-5.0.x

Bugs

Depends On

Reviewers

Groups

reviewboard

People

Description*

Recently, it was discovered that when querying for reviews there may be

duplicate Q(public=True)'s that appear in the queries to fetch the reviews.

The cause of this duplication lies in ReviewManager's _query method, which

handles the logic for querying for reviews. If an unauthenticated or anonymous

user was querying for the reviews, the public query field would be set twice,

once to the public value that was passed and once to True, with the latter

ensuring that only public reviews get returned to unauthenticated/anonymous

users. This however makes the query less efficient and can lead to the wrong

results being included in the returned queryset if a False value for public

was passed in.
To fix this, we ensure that the public query field is only set once, for any

type of user, in ReviewManager._query. This fix was applied to

CommentManager._query as well.
This change also adds a public parameter to ReviewManager.accessible to

ensure that the public parameter in ReviewManager._query handles all of

the logic surrounding public and unpublished reviews. Previously, you would

have to set the public query field using the extra_query parameter in

ReviewManager.accessible, which could also lead to duplicate or 

conflicting Q(public= queries.
Finally, this change also improves the tests for the ReviewManager and 

CommentManager. The tests are more clear, organized and well documented,

they use assertQueries, and have more comprehensive test coverage surrounding

access control.

Testing Done


Added new unit tests for ReviewManager.accessible with setting the

public parameter. 
Ran all unit tests with success.

Commits

Summary	ID
Make accessibility checks more robust for Reviews and Comments queries	321568393659002f0d6d41524cdf9e1a4e9c19cd

Issues

Description	From	Last Updated
How should we check that the diffset related to the review is accessible by the user? I see the logic …	maubin	3 years ago
I think we can have a more performant query if this becomes something like: q &= ( Q(user=user) \| (Q(public=True) …	chipx86	3 years ago
Seems like a good performance improvement. Let's document these conditions for an empty result with a comment.	chipx86	3 years ago
So going through the other changes for some duplicate object fixes I'm putting together, I think we need to take …	chipx86	3 years ago
For booleans, we allow ('1', 'true', 'True'). Not ideal to repeat this everywhere, but let's call that a problem for …	chipx86	3 years ago
This should be review__public, right? If so, and no tests failed, this means we're missing a test.	chipx86	3 years ago
Imports should be in alphabetical order.	chipx86	3 years ago
Comments should generally end with a period.	chipx86	3 years ago
Blank line before new blocks (same below).	chipx86	3 years ago
Should I consolidate these common queries into a variable or something? Then use the variable instead of repeating the full …	maubin	3 years ago
For multi-line lists, we generally do: [ review1, review2, ..., review5, ..., ] Or, when appropriate, one item per row …	chipx86	3 years ago
Given this is one Q, we can just put this on the same line. Same in other places.	chipx86	3 years ago
Another example of where we want periods at the end of the sentence. Really applies to all such comments.	chipx86	3 years ago
trailing whitespace Column: 38 Error code: W291	reviewbot	3 years ago
Missing a trailing comma (we always want to terminate multi-column list items with a trailing comma to minimize line changes …	chipx86	3 years ago
There are no open issues

flake8 passed.

JSHint passed.

reviewboard/reviews/managers.py (Diff revision 1)

The issue has been resolved. Show all issues

I think we can have a more performant query if this becomes something like:

q &= (
    Q(user=user) |
    (Q(public=True) &
     (repo_q &
      Q(review_request__target_people=user) |
      group_q))
)


That allows the query planner to have to consider only one set of rows for a public=True, and short-circuits that if we know the user matches.
It does mean repeating some of that above, but we can just take the (repo_q ...) bits and put that into a reusable acl_check_q or something.

reviewboard/reviews/managers.py (Diff revision 1)

The issue has been resolved. Show all issues

Seems like a good performance improvement. Let's document these conditions for an empty result with a comment.

reviewboard/reviews/tests/test_review_manager.py (Diff revision 1)

The issue has been resolved. Show all issues

So going through the other changes for some duplicate object fixes I'm putting together, I think we need to take the opportunity to better organize these tests so it's explicitly clear what data we're working with.
As such, with these tests (and ideally we'd update the others to match, but let's start here), let's have a pattern of putting a comment above each distinct group of related data explaining what access policy that's testing. For example:

# Published review on a publicly-accessible review request.
review1 = self.create_review(review_request, publish=True)

...

# Draft review on a publicly-accessible review request.
...

# Published review on a private repository the user has access to.
...

# Published review on a private repository the user does not have access to.
...


etc.
That way it's very easy for us to see what data we're working with, and it's very easy to know where the code should live.
Bonus points for a comment above each assertQuerysetEqual that says what we're testing for, and usage of assertQueries() as part of that to ensure we're building exactly the queries we want.

reviewboard/webapi/resources/root_review.py (Diff revision 1)

The issue has been resolved. Show all issues

For booleans, we allow ('1', 'true', 'True').

Not ideal to repeat this everywhere, but let's call that a problem for another day.

Change Summary:

The following changes were made to the ReviewManager tests
- Uses assertQueries instead of assertNumQueries
- Organized and added comments to the tests

Commits:

	Summary	ID
	Make accessibility checks more robust for Reviews and Comments queries	4fb012c68dda0657cfc1b00636d751405baf6a8a
	Make accessibility checks more robust for Reviews and Comments queries	bf75c8ca6cfd33c66bc646a65510865999ff9b7e

Diff:

Revision 2 (+2618 -300)

Show changes

	reviewboard/datagrids/tests.py
	reviewboard/reviews/managers.py
	reviewboard/reviews/tests/test_review_manager.py
	reviewboard/webapi/resources/root_review.py

Checks run (2 succeeded)

flake8 passed.

JSHint passed.

The issue has been resolved. Show all issues

How should we check that the diffset related to the review is accessible by the user? I see the logic for it in ReviewRequest._is_diffset_accessible_by(user, diffset) but how can we efficiently do this in the Managers through a query?

chipx86 3 years ago

There's no way to, because this isn't database-bound. The computations are Python code, which may do things like check a third-party service.

That sort of check has to be done separately from accessible results.

reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)

The issue has been resolved. Show all issues

Should I consolidate these common queries into a variable or something? Then use the variable instead of repeating the full queries

chipx86 3 years ago

I've debated this myself. We could. We had a similar situation for the Cliosoft SOS tests in Power Pack, and what I ended doing there is providing a class of "rule" properties/functions that generated expected results for a particular type of operation.
In this case, many of these sorts of queries would be repeated outside of an individual test suite.
So maybe we coudl do something similar to the SOS tests, but centralized in something like reviewboard.testing.query_rules. A class per thing, like GroupQueryRules with properties for common static rules and functions for common dynamic rules (that take in, say, a user ID).
I can see some benefits to that as we scale out these sorts of queries. Might be worth the tradeoff of added complexity of the properties/functions.

maubin 3 years ago

I like that idea. For now I will continue writing out the full queries, and sometime in the future I can revisit this and think about it more. Btw, what class are you referring to in the Cliosoft SOS tests?

This is a lot of work! Thanks for doing this :)

Some small nits and one thing that may be lacking a test.

reviewboard/reviews/managers.py (Diff revision 2)

The issue has been resolved. Show all issues

This should be review__public, right?

If so, and no tests failed, this means we're missing a test.

maubin 3 years ago

That's right, however this code is never going to be reached when using the CommentManager.accessible method because we always set public=None. This wasn't the case for the ReviewManager.accessible since we allow public to be None, False or True, and had tests checking this. For the CommentManager, I could add tests for checking CommentManager.from_user(superuser, public=True) and public=False which would trigger this code, but that seems kind of random. Should I test CommentManager._query directly instead?

chipx86 3 years ago

If we have code to build this part of the query, we'll want test coverage for it. However, if there's no chance it can ever be executed through normal means, then it's probably not worth building this part of the query.

reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)
The issue has been resolved. Show all issues
```
Imports should be in alphabetical order.
```
reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)
The issue has been resolved. Show all issues
```
Comments should generally end with a period.
```
reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)
The issue has been resolved. Show all issues
```
Blank line before new blocks (same below).
```

reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)

The issue has been resolved. Show all issues

For multi-line lists, we generally do:

[
    review1, review2, ...,
    review5, ...,
]

Or, when appropriate, one item per row (generally more appropriate when we may be operating more on the list over time, to minimize changes to the code).

reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)
The issue has been resolved. Show all issues
```
Given this is one Q, we can just put this on the same line.
Same in other places.
```

reviewboard/reviews/tests/test_review_manager.py (Diff revision 2)

The issue has been resolved. Show all issues

Another example of where we want periods at the end of the sentence.

Really applies to all such comments.

Change Summary:


Applied feedback from last review.
Tuned up the CommentManager tests.

Description:

		Recently, it was discovered that when querying for reviews there may be
		duplicate `Q(public=True)`'s that appear in the queries to fetch the reviews.
		The cause of this duplication lies in `ReviewManager`'s `_query` method, which
		handles the logic for querying for reviews. If an unauthenticated or anonymous
		user was querying for the reviews, the `public` query field would be set twice,
		once to the `public` value that was passed and once to `True`, with the latter
		ensuring that only public reviews get returned to unauthenticated/anonymous
		users. This however makes the query less efficient and can lead to the wrong
		results being included in the returned queryset if a `False` value for `public`
		was passed in.

		To fix this, we ensure that the `public` query field is only set once, for any
		type of user, in `ReviewManager._query`. This fix was applied to
		`CommentManager._query` as well.

		This change also adds a `public` parameter to `ReviewManager.accessible` to
		ensure that the `public` parameter in `ReviewManager._query` handles all of
		the logic surrounding public and unpublished reviews. Previously, you would
		have to set the `public` query field using the `extra_query` parameter in
		`ReviewManager.accessible`, which could also lead to duplicate or
		conflicting `Q(public=` queries.
	+
	+	Finally, this change also improves the tests for the `ReviewManager` and
	+	`CommentManager`. The tests are more clear, organized and well documented,
	+	they use `assertQueries`, and have more comprehensive test coverage surrounding
	+	access control.

Commits:

	Summary	ID
	Make accessibility checks more robust for Reviews and Comments queries	bf75c8ca6cfd33c66bc646a65510865999ff9b7e
	Make accessibility checks more robust for Reviews and Comments queries	edfbeab3d8c44767d124900a020a30af93b5cbf1

Diff:

Revision 3 (+4512 -616)

Show changes

	reviewboard/datagrids/tests.py
	reviewboard/reviews/managers.py
	reviewboard/reviews/tests/test_comment_manager.py
	reviewboard/reviews/tests/test_review_manager.py
	reviewboard/webapi/resources/root_review.py

Checks run (1 failed, 1 succeeded)

flake8 failed.

JSHint passed.

flake8

reviewboard/reviews/tests/test_comment_manager.py (Diff revision 3)
The issue has been resolved. Show all issues
```
trailing whitespace

Column: 38
Error code: W291
```

Change Summary:


Removed trailing whitespace
Added test for CommentManager filtering public reviews

Commits:

	Summary	ID
	Make accessibility checks more robust for Reviews and Comments queries	edfbeab3d8c44767d124900a020a30af93b5cbf1
	Make accessibility checks more robust for Reviews and Comments queries	f2598f4dbf985ed743e96b2cd0631e3f8d4ee8b8

Diff:

Revision 4 (+4602 -614)

Show changes

	reviewboard/datagrids/tests.py
	reviewboard/reviews/managers.py
	reviewboard/reviews/tests/test_comment_manager.py
	reviewboard/reviews/tests/test_review_manager.py
	reviewboard/webapi/resources/root_review.py

Checks run (2 succeeded)

flake8 passed.

JSHint passed.

Awesome, home stretch. One tiny code styling nit (in a handful of places), but everything else looks ready to go.

reviewboard/reviews/tests/test_comment_manager.py (Diff revisions 3 - 4)

The issue has been resolved. Show all issues

Missing a trailing comma (we always want to terminate multi-column list items with a trailing comma to minimize line changes if we add to the list).

This applies to other queries as well.

Change Summary:


Added trailing commas to lists and sets

Commits:

	Summary	ID
	Make accessibility checks more robust for Reviews and Comments queries	f2598f4dbf985ed743e96b2cd0631e3f8d4ee8b8
	Make accessibility checks more robust for Reviews and Comments queries	321568393659002f0d6d41524cdf9e1a4e9c19cd

Diff:

Revision 5 (+4602 -614)

Show changes

	reviewboard/datagrids/tests.py
	reviewboard/reviews/managers.py
	reviewboard/reviews/tests/test_comment_manager.py
	reviewboard/reviews/tests/test_review_manager.py
	reviewboard/webapi/resources/root_review.py

Checks run (2 succeeded)

flake8 passed.

JSHint passed.

Ship it!

```
Ship It!
```

Ship it!

```
Ship It!
```

Status:: Completed
Change Summary:: Pushed to release-5.0.x (7128e09)