Add FileDiffACLsHook to allow people to hook up repository ACLs.

Review Request #11857 — Created Oct. 18, 2021 and submitted

Information

Review Board
release-4.0.x

Reviewers

One common request we've had over the years is to mirror repository
access control into Review Board. This is complicated by a few factors.
Review Board has always used a single account to access the remote
repository, and it caches aggressively to provide reasonable interactive
performance, so we can't just rely on the repository access directly. In
addition, it's not uncommon for repository users/usernames to not match
up with Review Board users/usernames (especially when repositories are
hosted on SaaS services).

In order to make this possible, we're adding a new extension hook,
FileDiffACLsHook. This hook is invoked when a user attempts to access
a review request, and is run for each DiffSet/FileDiff. The result
of this is cached per DiffSet to keep reasonable performance overall.
People can then create an extension implementing this hook to do
whatever access control they want, whether that's querying the
repository or referencing external config files.

  • Implemented a sample extension that queried p4 protect information
    to do access control for Perforce repositories. This implementation is
    visible in the documentation for the hook.
  • Ran unit tests.
Summary ID
Add FileDiffACLsHook to allow people to hook up repository ACLs.
One common request we've had over the years is to mirror repository access control into Review Board. This is complicated by a few factors. Review Board has always used a single account to access the remote repository, and it caches aggressively to provide reasonable interactive performance, so we can't just rely on the repository access directly. In addition, it's not uncommon for repository users/usernames to not match up with Review Board users/usernames (especially when repositories are hosted on SaaS services). In order to make this possible, we're adding a new extension hook, `FileDiffACLsHook`. This hook is invoked when a user attempts to access a review request, and is run for each `DiffSet`/`FileDiff`. The result of this is cached per `DiffSet` to keep reasonable performance overall. People can then create an extension implementing this hook to do whatever access control they want, whether that's querying the repository or referencing external config files. Testing Done: - Implemented a sample extension that queried `p4 protect` information to do access control for Perforce repositories. This implementation is visible in the documentation for the hook. - Ran unit tests. Reviewed at https://reviews.reviewboard.org/r/11857/
e6f74bb9a28c096decad5617047ee89accd39625
Description From Last Updated

"hosting" services.

chipx86chipx86

Each of these will need to be full reference paths, or Sphinx will fail to map and will probably emit …

chipx86chipx86

False or True?

chipx86chipx86

Can we require **kwargs (or document it at least) to make future expansion possible without breaking things?

chipx86chipx86

In 4.0 (maybe late 3.0.x?), we added scmtool_id, which is more stable and will be needed for registries. This should …

chipx86chipx86

Two blank lines.

chipx86chipx86

Can you place this alphabetically in the hook list?

chipx86chipx86

As above, let's add **kwargs to any hook methods.

chipx86chipx86

The pattern we use for similar hooks is to return a boolean if we have an explicit result, or None …

chipx86chipx86

Shouldn't this be checking results?

chipx86chipx86

Blank line between these.

chipx86chipx86

Leftover debugging?

chipx86chipx86

I saw the comment about letting the hook just take a diffset (I think that's maybe a worthwhile change — …

chipx86chipx86

This is a little difficult to read. How about returning the result as a variable, and then checking the variable?

chipx86chipx86

E712 comparison to False should be 'if cond is False:' or 'if not cond:'

reviewbotreviewbot

Blank line between these.

chipx86chipx86

Should that be extension: "%s" or extension "%s"? I see that the ReviewRequestApprovalHook version has the former, but I wonder …

chipx86chipx86

Just to check (and we should explicitly document this everywhere): Do all hooks need to say either True or None …

chipx86chipx86

Can we have unit tests that test through the hook mechanism? I feel we'd want to ensure that, for N …

chipx86chipx86

E501 line too long (80 > 79 characters)

reviewbotreviewbot

F841 local variable 'diffset' is assigned to but never used

reviewbotreviewbot

Private functions should go after public functions.

chipx86chipx86

Private functions should go after public functions.

chipx86chipx86

This function is missing an explicit return at the end. That'd result in a None, but docs say to expect …

chipx86chipx86

I think the only thing left I'd want to see is a change to this paragraph to hammer in the …

chipx86chipx86
masseyk
  1. Thanks for this change! A suggestion follows:

  2. reviewboard/extensions/hooks.py (Diff revision 1)
     
     

    As written, the is_accessible method will be called once for each filediff in each diffset, per the code in review_request.py:

             for filediff in diffset.files.all():
                ic(filediff)
                for hook in FileDiffACLHook.hooks:
                    try:
                        if not hook.is_accessible(diffset, filediff, user):
    

    I think it would be more flexible, though, if the hook were called once per diffset instead.

    For a Perforce example, suppose a company organizes its ACLs based on the first 3 levels in the depotFile path:

    //<depot>/<project>/<branch>/...
    

    That is, within a given depot, project, and branch, if a user has access to one file, they have access to all files.

    And suppose that a review has a fileset containing 100 filediffs:

    //depot/foo/main/f00.c
    //depot/foo/main/f01.c
    ...
    //depot/foo/main/f98.c
    //depot/foo/main/f99.c
    

    In this case, the hook function would only need to test a single depot path, //depot/foo/main/..., with "p4 protects" instead of testing all 100 paths to get the result.

    And in cases where there is no existing ACL pattern that can be used to optimize the tests, the hook method can still iterate over the filediffs in the diffset.

  3. 
      
chipx86
  1. 
      
  2. Show all issues

    "hosting" services.

  3. Show all issues

    Each of these will need to be full reference paths, or Sphinx will fail to map and will probably emit warnings.

  4. Show all issues

    False or True?

  5. Show all issues

    Can we require **kwargs (or document it at least) to make future expansion possible without breaking things?

  6. Show all issues

    In 4.0 (maybe late 3.0.x?), we added scmtool_id, which is more stable and will be needed for registries. This should check that and compare against 'perforce'.

  7. Show all issues

    Two blank lines.

  8. docs/manual/extending/index.rst (Diff revision 1)
     
     
     
     
    Show all issues

    Can you place this alphabetically in the hook list?

  9. reviewboard/extensions/hooks.py (Diff revision 1)
     
     
    Show all issues

    As above, let's add **kwargs to any hook methods.

  10. reviewboard/extensions/hooks.py (Diff revision 1)
     
     
     
     
     
    Show all issues

    The pattern we use for similar hooks is to return a boolean if we have an explicit result, or None if we don't. Can we stick to that here? Make the "I don't know, someone else check" result just be None?

  11. Show all issues

    Shouldn't this be checking results?

    1. This is just testing sandboxing. Functionality is covered by other tests.

  12. Show all issues

    Blank line between these.

  13. Show all issues

    Leftover debugging?

  14. reviewboard/reviews/models/review_request.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    I saw the comment about letting the hook just take a diffset (I think that's maybe a worthwhile change — will discuss in that comment), so this comment depends on the outcome of that a bit, but...

    We should avoid inline functions here. The function gets redefined on every call to the parent function. I think we should instead just move this out into the class.

  15. reviewboard/reviews/models/review_request.py (Diff revision 1)
     
     
     
     
     
    Show all issues

    This is a little difficult to read. How about returning the result as a variable, and then checking the variable?

  16. 
      
david
Review request changed
Commits:
Summary ID
Add FileDiffACLsHook to allow people to hook up repository ACLs.
One common request we've had over the years is to mirror repository access control into Review Board. This is complicated by a few factors. Review Board has always used a single account to access the remote repository, and it caches aggressively to provide reasonable interactive performance, so we can't just rely on the repository access directly. In addition, it's not uncommon for repository users/usernames to not match up with Review Board users/usernames (especially when repositories are hosted on SaaS services). In order to make this possible, we're adding a new extension hook, `FileDiffACLsHook`. This hook is invoked when a user attempts to access a review request, and is run for each `DiffSet`/`FileDiff`. The result of this is cached per `DiffSet` to keep reasonable performance overall. People can then create an extension implementing this hook to do whatever access control they want, whether that's querying the repository or referencing external config files. Testing Done: - Implemented a sample extension that queried `p4 protect` information to do access control for Perforce repositories. This implementation is visible in the documentation for the hook. - Ran unit tests.
548cbf935825de3c7605e49c2a629d0852288bf1
Add FileDiffACLsHook to allow people to hook up repository ACLs.
One common request we've had over the years is to mirror repository access control into Review Board. This is complicated by a few factors. Review Board has always used a single account to access the remote repository, and it caches aggressively to provide reasonable interactive performance, so we can't just rely on the repository access directly. In addition, it's not uncommon for repository users/usernames to not match up with Review Board users/usernames (especially when repositories are hosted on SaaS services). In order to make this possible, we're adding a new extension hook, `FileDiffACLsHook`. This hook is invoked when a user attempts to access a review request, and is run for each `DiffSet`/`FileDiff`. The result of this is cached per `DiffSet` to keep reasonable performance overall. People can then create an extension implementing this hook to do whatever access control they want, whether that's querying the repository or referencing external config files. Testing Done: - Implemented a sample extension that queried `p4 protect` information to do access control for Perforce repositories. This implementation is visible in the documentation for the hook. - Ran unit tests.
6762b040281c2d3ebec676acf9d9f726b14ab642

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

david
david
chipx86
  1. The design looks sound to me. Just a few comments.

  2. Show all issues

    Blank line between these.

  3. Show all issues

    Should that be extension: "%s" or extension "%s"?

    I see that the ReviewRequestApprovalHook version has the former, but I wonder if that was a typo back then.

  4. reviewboard/reviews/models/review_request.py (Diff revision 3)
     
     
     
     
    Show all issues

    Just to check (and we should explicitly document this everywhere): Do all hooks need to say either True or None for the result to be True, or do we want to allow any True to win?

    ReviewRequestApprovalHook chains results, so that a hook can do effectively if prev_approved and .... We don't have that design for this hook (should we?), so there isn't a precedent here.

    I don't honestly know what the right design is, so this is really just a question and a request to make it crystal-clear in docs/comments, specifically describing what happens with multiple hooks (important to note that hooks can be called in any order).

    1. I think all have to be True or None. I could see having one extension that checks repository ACLs, and another that checks some other state (maybe something like whether the user is a contractor). Any False should win.

      I don't know that chaining results gives us anything useful in this case.

  5. reviewboard/reviews/tests/test_review_request.py (Diff revision 3)
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    Can we have unit tests that test through the hook mechanism? I feel we'd want to ensure that, for N hooks (maybe 2, 3) that we get the right results for a combination of True, False, and None results.

  6. 
      
david
Review request changed
Commits:
Summary ID
Add FileDiffACLsHook to allow people to hook up repository ACLs.
One common request we've had over the years is to mirror repository access control into Review Board. This is complicated by a few factors. Review Board has always used a single account to access the remote repository, and it caches aggressively to provide reasonable interactive performance, so we can't just rely on the repository access directly. In addition, it's not uncommon for repository users/usernames to not match up with Review Board users/usernames (especially when repositories are hosted on SaaS services). In order to make this possible, we're adding a new extension hook, `FileDiffACLsHook`. This hook is invoked when a user attempts to access a review request, and is run for each `DiffSet`/`FileDiff`. The result of this is cached per `DiffSet` to keep reasonable performance overall. People can then create an extension implementing this hook to do whatever access control they want, whether that's querying the repository or referencing external config files. Testing Done: - Implemented a sample extension that queried `p4 protect` information to do access control for Perforce repositories. This implementation is visible in the documentation for the hook. - Ran unit tests.
b7ec9fed7f520bbff7b878d719bd5dbfeb8f7615
Add FileDiffACLsHook to allow people to hook up repository ACLs.
One common request we've had over the years is to mirror repository access control into Review Board. This is complicated by a few factors. Review Board has always used a single account to access the remote repository, and it caches aggressively to provide reasonable interactive performance, so we can't just rely on the repository access directly. In addition, it's not uncommon for repository users/usernames to not match up with Review Board users/usernames (especially when repositories are hosted on SaaS services). In order to make this possible, we're adding a new extension hook, `FileDiffACLsHook`. This hook is invoked when a user attempts to access a review request, and is run for each `DiffSet`/`FileDiff`. The result of this is cached per `DiffSet` to keep reasonable performance overall. People can then create an extension implementing this hook to do whatever access control they want, whether that's querying the repository or referencing external config files. Testing Done: - Implemented a sample extension that queried `p4 protect` information to do access control for Perforce repositories. This implementation is visible in the documentation for the hook. - Ran unit tests.
a3ddb46130e2572e42441b4738aef65ff88f5a8c

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

david
david
chipx86
  1. All the logic and docs look good to me. Only a couple small things (function ordering, missing return) and I'm happy.

  2. Show all issues

    Private functions should go after public functions.

  3. Show all issues

    Private functions should go after public functions.

  4. reviewboard/reviews/models/review_request.py (Diff revision 6)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    This function is missing an explicit return at the end. That'd result in a None, but docs say to expect True or False.

  5. 
      
david
chipx86
  1. 
      
  2. Show all issues

    I think the only thing left I'd want to see is a change to this paragraph to hammer in the expectations of the implementation. Right now, the requirements are all in one big paragraph, and I just know people are going to mentally skip something.

    Can we make the "If the user does not have ..." text onward each its own bullet point, like:

    The hook is expected to return:
    
    * ``True`` if the user does have access to all the files.
    * ``False`` if the user does not have access to one or more files.
    * ``None`` if the hook did not check the diff (for example, it might only check a single SCM or hosting service type, and ignore any others)
    
    If any hook returns ``False``, the user will not have access to the diff.
    
  3. 
      
david
Review request changed
Status:
Completed
Change Summary:
Pushed to release-4.0.x (3afe364)