• 
      

    Fix a search indexing performance regression with ACL diff checks.

    Review Request #14165 — Created Sept. 12, 2024 and submitted

    Information

    Review Board
    release-7.x

    Reviewers

    When indexing review requests, we determine if a review request is
    accessible via review_request.is_accessible_by(). Part of this checks
    the files in the diffsets for accessibility, using ACL diff checks. This
    ends up querying the list of diffsets for each review request and then
    each file within it, which drastically slows down diff indexing.

    We now employ a couple layers of protection against this:

    1. ReviewRequest.get_diffsets() will now use the prefetch_related()
      and select_related() caches to determine if it needs to even
      perform a new query, or if it can short-cut the result. This saves
      the per-diffset and prefetched-files queries.

    2. ReviewRequest._are_diffs_accessible_by() no longer even calls
      get_diffsets() if there aren't any FileDiffACLHooks registered,
      which will be the common case.

    This ensures we don't do any more work than we need to do at any stage
    of these checks, and makes a major difference in indexing performance.

    All unit tests pass.

    Added SQL-level debugging for the database backend and performed a
    search index. With get_diffsets() being invoked for every diff, I
    verified that it used the prefetch cache, avoiding new queries. With
    ACL hooks factored in, I verified it never even got to the diffset
    query code without faking hooks being available.

    Summary ID
    Fix a search indexing performance regression with ACL diff checks.
    When indexing review requests, we determine if a review request is accessible via `review_request.is_accessible_by()`. Part of this checks the files in the diffsets for accessibility, using ACL diff checks. This ends up querying the list of diffsets for each review request and then each file within it, which drastically slows down diff indexing. We now employ a couple layers of protection against this: 1. `ReviewRequest.get_diffsets()` will now use the `prefetch_related()` and `select_related()` caches to determine if it needs to even perform a new query, or if it can short-cut the result. This saves the per-diffset and prefetched-files queries. 2. `ReviewRequest._are_diffs_accessible_by()` no longer even calls `get_diffsets()` if there aren't any `FileDiffACLHook`s registered, which will be the common case. This ensures we don't do any more work than we need to do at any stage of these checks, and makes a major difference in indexing performance.
    15968ceda0cfdacd5df7f8b56007b9ab1cd5e532
    Description From Last Updated

    Nit: You could add a comment in the form of # Query description. # # X queries: # # 1. …

    maubinmaubin
    maubin
    1. 
        
    2. Show all issues

      Nit: You could add a comment in the form of

      # Query description.
      #
      # X queries:
      #
      # 1. Operation description
      # ...
      

      above the query list. Same below.

    3. 
        
    david
    1. Ship It!
    2. 
        
    chipx86
    david
    1. Ship It!
    2. 
        
    maubin
    1. Ship It!
    2. 
        
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-7.x (e05199e)