dan.casares got a fish trophy!
Search: Add faceted fields to ReviewRequestIndex and UserIndex
Review Request #15051 — Created May 13, 2026 and updated
Extends
ReviewRequestIndexandUserIndexwith new Elasticsearch-specific
fields required by the faceted search engine. These fields are ignored by
Whoosh.
ReviewRequestIndexgains:
-status,branch—CharField(faceted=True)for status and branch
filters and aggregations
-repository_id,submitter_id—IntegerFieldFKs for the repository
and author searchable filters
-date_created,date_updated—datetimefields for creation and
last-updated range filters
-target_reviewer_ids,reviewer_user_ids,commenter_user_ids—
multi-value integer fields for reviewer and commenter filters
-has_ship_it,has_reviews,has_issues_open,has_file_attachment,
has_depends_on—BooleanFieldfor option-count term queries
-comment_issue_statuses,file_attachment_types— multi-value fields
for issue status and file type filtersThe review-based boolean fields (
has_ship_it,has_reviews,
reviewer_user_ids) use the prefetched_public_reviewsattribute when
available duringrebuild_indexand fall back to direct DB queries during
on-the-fly indexing.has_issues_openandcommenter_user_idsalways use
direct DB queries since comment volumes can be very large.
UserIndexgains:
-is_active,is_staff—BooleanField(faceted=True)for user status
and role filters
-group_ids— multi-value field for group membership filter
-date_joined,last_login— datetime fields for date range filters
-username_display— stored keyword copy of username for terms
aggregations
UserIndex.index_querysetnow indexes all users regardless of their
is_activestatus (theis_active=Truefilter has been removed).
Inactive users now appear in search results by default and can be filtered
out using theis_activefaceted field.A new
GroupIndexis introduced to support the groups search tab. It
indexes:
-has_open_review_requests,in_active_reviews—BooleanFieldfor
filtering groups by review request activity
-member_count—IntegerFieldfor filtering groups by membership size
-invite_only—BooleanFieldused by the permission filter to exclude
groups the searching user cannot accessA
rebuild_indexis required after upgrading to populate these new fields.
Ran full test suite.
Verified that eachnew prepare_*method onReviewRequestIndexand
UserIndexreturns the correct value, including fallback behaviour for
on-the-fly indexing when prefetches are not available. Confirmed that
index_queryseton both indexes does not introduce N+1 queries for any of the
new fields.
| Summary | ID |
|---|---|
| af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 | |
| acfe5b5d4e8f92028f0507025c4e5d3de8475d1c | |
| 7f11411ff3865d6f1121c5583921a97d4319061d |
| Description | From | Last Updated |
|---|---|---|
|
Omg sorry I had a brain fart... we just release Review Board 8, all the "Version Added"s should say Review … |
|
|
|
line too long (81 > 79 characters) Column: 80 Error code: E501 |
|
|
|
'typing.Sequence' imported but unused Column: 1 Error code: F401 |
|
|
|
line break before binary operator Column: 13 Error code: W503 |
|
|
|
line break before binary operator Column: 13 Error code: W503 |
|
|
|
line break before binary operator Column: 13 Error code: W503 |
|
|
|
line too long (80 > 79 characters) Column: 80 Error code: E501 |
|
|
|
local variable 'review' is assigned to but never used Column: 9 Error code: F841 |
|
|
|
local variable 'reviewer' is assigned to but never used Column: 9 Error code: F841 |
|
|
|
Can you add a blank line between these. |
|
|
|
For this and any other class attributes that are newly added, can you add a Version Added to its docstring, … |
|
|
|
For our class variable docstrings, we like to have a one-sentence summary, then a newline, then a description if needed: … |
|
|
|
Was the previous description for this supposed to be deleted, does it no longer apply? |
|
|
|
We still like to keep arguments on one line per argument, e.g: def prepare_groups( self, user: RBUser, ) -> str: … |
|
|
|
It seems like this is meant to use user.username_display? Is username_display even needed or can we just use username? |
|
|
|
For lists that are expected to be immutable, where we're just reading from them and not modifying, we use Sequence[int] … |
|
|
|
We already import django.contrib.auth.models.User at the top of the test file. Let's use that instead of re-importing here. Update this … |
|
|
|
This description got moved here from the UserIndex. |
|
|
|
For every new method that was added, let's add a Version Changed: 8.0 to the docstring. It comes before the … |
|
|
|
Can you add a blank line before and after the if block. Same below. |
|
|
|
Same comment about using Sequence and everywhere else where we return list. |
|
|
|
Do these need to be imported here for some reason? If not we can import them at the top of … |
|
|
|
Instead of hardcoding mimetypes here, which can lead to some getting forgotten (for example pdfs can also have an application/x-pdf … |
|
|
|
line too long (81 > 79 characters) Column: 80 Error code: E501 |
|
|
|
'typing.Sequence' imported but unused Column: 1 Error code: F401 |
|
|
|
line too long (82 > 79 characters) Column: 80 Error code: E501 |
|
|
|
line break before binary operator Column: 13 Error code: W503 |
|
|
|
line break before binary operator Column: 13 Error code: W503 |
|
|
|
line break before binary operator Column: 13 Error code: W503 |
|
|
|
line too long (80 > 79 characters) Column: 80 Error code: E501 |
|
|
|
local variable 'review' is assigned to but never used Column: 9 Error code: F841 |
|
|
|
line break before binary operator Column: 28 Error code: W503 |
|
|
|
line too long (80 > 79 characters) Column: 80 Error code: E501 |
|
|
|
line too long (82 > 79 characters) Column: 80 Error code: E501 |
|
- Commits:
-
Summary ID Author db34923f5d331db3d4fa24ba08831e46adff5815 DanielCasaresIglesias f65a481a34533f3c8214a101dcdc89181664accd DanielCasaresIglesias - Diff:
-
Revision 2 (+1690 -242)
- Commits:
-
Summary ID Author f65a481a34533f3c8214a101dcdc89181664accd DanielCasaresIglesias e0cdd4eeba17442551f980d44a39b7266afe6be1 DanielCasaresIglesias - Diff:
-
Revision 3 (+1688 -242)
Checks run (2 succeeded)
-
A good start. Most of these are formatting comments, can you apply these comments to your other changes too.
-
-
For this and any other class attributes that are newly added, can you add a Version Added to its docstring, for example:
#: Keyword copy of username for ES ``terms`` aggregations. #: #: Version Added: #: 8.0 -
For our class variable docstrings, we like to have a one-sentence summary, then a newline, then a description if needed:
#: Summary. #: #: Optional description if things need more explanation.Can you reformat this and the rest of the docstrings.
-
-
We still like to keep arguments on one line per argument, e.g:
def prepare_groups( self, user: RBUser, ) -> str:Can you revert this change to go back to that format, here and below.
-
It seems like this is meant to use
user.username_display? Isusername_displayeven needed or can we just useusername? -
For lists that are expected to be immutable, where we're just reading from them and not modifying, we use
Sequence[int]instead oflist. -
We already import
django.contrib.auth.models.Userat the top of the test file. Let's use that instead of re-importing here. Update this here and below. -
-
For every new method that was added, let's add a
Version Changed: 8.0to the docstring. It comes before the Args section.
-
-
-
Do these need to be imported here for some reason? If not we can import them at the top of the file.
-
Instead of hardcoding mimetypes here, which can lead to some getting forgotten (for example pdfs can also have an
application/x-pdftype), lets build these programmatically using our Review UIs. For example:from reviewboard.attachments.mimetypes import TextMimetype code_mimetypes = set(TextMimetype.supported_mimetypes) ... elif any(mime.startswith(p) for p in code_mimetypes)Also you don't have access to this codebase, but we recently fleshed out document review in the new RB release, so instead of a
pdftype can you switch it todocument, and then for the mimetypes you can do this import:from rbpowerpack.docreview import (OFFICE_DOC_SUPPORTED_MIMETYPES, PDF_SUPPORTED_MIMETYPES)And then do a set union (the two variables are already sets).
- Change Summary:
-
Fixed the issues highlighted by Michelle.
I also added indexing for groups.
- Description:
-
Extends
ReviewRequestIndexandUserIndexwith new Elasticsearch-specificfields required by the faceted search engine. These fields are ignored by Whoosh. ReviewRequestIndexgains:~ - status,branch—CharField(faceted=True)for status and branch filters~ and aggregations ~ - repository_id,submitter_id—IntegerFieldFKs for the repository and author~ searchable filters ~ - status,branch—CharField(faceted=True)for status and branch~ filters and aggregations ~ - repository_id,submitter_id—IntegerFieldFKs for the repository~ and author searchable filters - date_created,date_updated—datetimefields for creation andlast-updated range filters - target_reviewer_ids,reviewer_user_ids,commenter_user_ids—multi-value integer fields for reviewer and commenter filters - has_ship_it,has_reviews,has_issues_open,has_file_attachment,has_depends_on—BooleanFieldfor option-count term queries~ - comment_issue_statuses,file_attachment_types— multi-value fields for~ issue status and file type filters ~ - comment_issue_statuses,file_attachment_types— multi-value fields~ for issue status and file type filters The review-based boolean fields (
has_ship_it,has_reviews,~ eviewer_user_ids) use the prefetched_public_reviewsattribute when~ reviewer_user_ids) use the prefetched_public_reviewsattribute whenavailable during rebuild_indexand fall back to direct DB queries duringon-the-fly indexing. has_issues_openandcommenter_user_idsalways usedirect DB queries since comment volumes can be very large. UserIndexgains:~ - is_active,is_staff—BooleanField(faceted=True)for user status and~ role filters ~ - is_active,is_staff—BooleanField(faceted=True)for user status~ and role filters - group_ids— multi-value field for group membership filter- date_joined,last_login— datetime fields for date range filters~ - username_display— stored keyword copy of username for terms aggregations~ - username_display— stored keyword copy of username for terms+ aggregations + + UserIndex.index_querysetnow indexes all users regardless of their+ is_activestatus (theis_active=Truefilter has been removed).+ Inactive users now appear in search results by default and can be filtered + out using the is_activefaceted field.+ + A new
GroupIndexis introduced to support the groups search tab. It+ indexes: + - has_open_review_requests,in_active_reviews—BooleanFieldfor+ filtering groups by review request activity + - member_count—IntegerFieldfor filtering groups by membership size+ - invite_only—BooleanFieldused by the permission filter to exclude+ groups the searching user cannot access A
rebuild_indexis required after upgrading to populate these new fields. - Commits:
-
Summary ID Author e0cdd4eeba17442551f980d44a39b7266afe6be1 DanielCasaresIglesias a151ae824e8077accaafb535845df75aec48133c Daniel Casares-Iglesias - Diff:
-
Revision 4 (+2158 -212)
Checks run (1 failed, 1 succeeded)
flake8
- Change Summary:
-
Fix N+1 queries
prepare_author**was callinguser.get_profile(cached_only=True), which checks a custom cache thatselect_relateddoes not populate, causing a full profile fetch per review request. Replaced withuser.has_private_profile(), which reads from Django'sselect_relatedcache.
Added prefetches toindex_querysetforsubmitter(with profile),reviews,file_attachments, anddepends_onto support the newprepare_*methods without N+1 queries. Published reviews are cached to_public_reviewsand shared across multiple preparers; file attachments and depends-on results are cached to_file_attachmentsand_depends_onrespectively.
Add faceted search fields**
Added index fields andprepare_*methods to support Elasticsearch faceted filtering: review ship-it and reviewer data, comment issue statuses and commenters, file attachment presence and type, dependency tracking, and various scalar fields (status,branch,repository_id,submitter_id,date_created,date_updated). Preparers fall back to direct DB queries when prefetch caches are unavailable, for compatibility with on-the-fly indexing. - Commits:
-
Summary ID a151ae824e8077accaafb535845df75aec48133c af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 a6e6ab5f569ca3776b8e7d2352c7bfb77f5d1026 - Diff:
-
Revision 5 (+2534 -252)
- Commits:
-
Summary ID af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 a6e6ab5f569ca3776b8e7d2352c7bfb77f5d1026 af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 0686569d45ac47e39e24fa327c2ee6141b64adb2 - Diff:
-
Revision 6 (+2538 -254)
Checks run (2 succeeded)
- Testing Done:
-
~ Verified that each
new prepare_*method onReviewRequestIndexand~ Ran full test suite.
+ Verified that each new prepare_*method onReviewRequestIndexandUserIndexreturns the correct value, including fallback behaviour foron-the-fly indexing when prefetches are not available. Confirmed that index_queryseton both indexes does not introduce N+1 queries for any of thenew fields.
-
I'm gonna give this and your other changes a deeper review of the implementation over the next couple days, I need to learn a bit more about search and Elasticsearch. For now can you go over your other changes and apply all the formatting feedback that I gave here.
-
Omg sorry I had a brain fart... we just release Review Board 8, all the "Version Added"s should say Review Board 9.0.
- Change Summary:
-
Removed
.filter(is_active=True)from the queryset so inactive users appear in results by default. Users can remove them through the filters instead. - Commits:
-
Summary ID af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 0686569d45ac47e39e24fa327c2ee6141b64adb2 af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 2d243e676d8165518a3c86c1aab15a4f728b2beb - Diff:
-
Revision 7 (+2542 -260)
Checks run (2 succeeded)
- Change Summary:
-
Updated styling to match standard.
- Commits:
-
Summary ID af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 2d243e676d8165518a3c86c1aab15a4f728b2beb af3e918c0bfcb3ee69c09b8f18ab8a661ab0a7b1 acfe5b5d4e8f92028f0507025c4e5d3de8475d1c 7f11411ff3865d6f1121c5583921a97d4319061d - Diff:
-
Revision 8 (+3147 -325)