Create token generator classes and a registry for them

Review Request #12415 — Created June 24, 2022 and submitted

Information

Djblets
release-3.x

Reviewers

Currently, Djblets can only generate SHA1 tokens and token generation is
baked into djblets.webapi.managers.WebAPITokenManager. As part of our move to
using an improved token format (API Tokens v2), we need a way to generate
different types of tokens.

This change introduces token generators, which handle the logic for token
generation and allows for multiple ways of generating tokens. There are three
classes implemented:
1. A base class.
2. A class for generating our legacy SHA1 tokens.
3. A class for generating our new token format that has a vendor prefix and
checksum.

This also adds a registry for the token generators.

  • Created unit tests for the token generators.
  • Ran all unit tests in djblets.secrets.tests.
Summary ID
Create Token Generator classes and registry
069535f539beaea702acdbf986f55404f71da379
Description From Last Updated

line too long (80 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

'django.contrib.auth.models.User' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

line too long (80 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

'django.db.models' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

'django.db.models' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

Docstrings in this file should be "Testing ..."

daviddavid

Docstrings in this file should be "Testing ..." You can also put the """ on the same line for this …

daviddavid

I also mentioned it in the review for /r/12419, but we should instantiate the registry as an attribute of this …

daviddavid

Typo: NotImeplementedError -> NotImplementedError.

daviddavid

There's an extra blank line here.

daviddavid

Typo: NotImeplementedError -> NotImplementedError.

daviddavid

There's not actually a **kwargs in the method signature.

daviddavid

line too long (81 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

line too long (80 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

line too long (89 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

For these tests, you can probably just put the data within the call. Helps keep the construction and parameters together. …

chipx86chipx86

This can be consolidated: token = ( '_%s' % self.token_generator.create_token(...)[1:] )

chipx86chipx86

Typo: "charcter" -> "character"

chipx86chipx86

This can be consolidated: token = ( '!%s' % self.token_generator.create_token(...)[1:] )

chipx86chipx86

This should have a doc comment.

chipx86chipx86

Missing , optional

chipx86chipx86

This method is missing a Returns.

chipx86chipx86

This should make it clear what the return type should be if a determination could not be made.

chipx86chipx86

Missing , optional

chipx86chipx86

Let's say "legacy SHA1 tokens".

chipx86chipx86

Let's say "legacy SHA1 tokens".

chipx86chipx86

These should be in alphabetical order.

chipx86chipx86

This method doesn't seem to ever return None. However, it does raise an exception, which should be documented.

chipx86chipx86

Blank line between these.

chipx86chipx86

Localized text can't be formatted within the _(...). The % must be after.

chipx86chipx86

No need for the outer parens. Strings generally concatenate faster if joining via a format string, rather than +. So: …

chipx86chipx86

The in check requires a loop through charset for each character in token. Two things can be done to speed …

chipx86chipx86

For Registry, get() will raise an exception if the entry is not found.

chipx86chipx86

We usually use snake_case for identifiers.

chipx86chipx86

Doc comments are in the form of: #: Summary #: #: ... Same conventions as docstrings. These will get processed …

chipx86chipx86

token_info probably shouldn't be optional here, should it? If so, then the keys within must also be optional, and the …

chipx86chipx86

For Raises, the description must be indented.

chipx86chipx86

Let's avoid the repeated self.CHARSET attribute lookup by pulling that out first.

chipx86chipx86

Rather than re-assigning, you can do: checksum = ( self._base62_encode(...) .zfill(...) )

chipx86chipx86

This can be one statement.

chipx86chipx86

Same note about token_info. This will apply to all other classes too.

chipx86chipx86

Same as with the SHA1 generator, we should avoid a search of [0-9a-f_] for every character in the hash. We …

chipx86chipx86

We access self.CHARSET many times, each involving an attribute lookup, which isn't free. Let's pull that out into a local …

chipx86chipx86

Slightly faster as: return ''.join(reversed(result)) This loops through result once. If using .reverse() instead, it loops twice, once to reverse …

chipx86chipx86

Blank line between these.

chipx86chipx86

The formatting has to happen after _(...).

chipx86chipx86

For these tests, let's feed in a hard-coded token. This gives us clear test data to work with, and avoids …

chipx86chipx86

Should be fully lowercase (snake_case is typically lowercase).

chipx86chipx86

As this is a caller-time problem, we probably don't need to log anything. The exception will be fine. Same with …

chipx86chipx86

Small nit. We usually keep the % on the line with the argument.

chipx86chipx86

If this is missing content, we probably just want to say "Nope, not valid for this generator" and return False.

chipx86chipx86
Checks run (1 failed, 1 succeeded)
flake8 failed.
JSHint passed.

flake8

maubin
Review request changed

Change Summary:

  • Adds a registry for token generators
  • Add tests for KeyErrors

Summary:

-[WIP] Create Token Generator Classes
+[WIP] Create token generator classes and a registry for them

Description:

~  

Currently, Djblets can only generate SHA1 tokens. As part of our move to using

~   an improved token format, we need a way to generate different types of tokens.

  ~

Currently, Djblets can only generate SHA1 tokens and token generation is

  ~ baked into djblets.webapi.managers.WebAPITokenManager. As part of our move to
  + using an improved token format (API Tokens v2), we need a way to generate
  + different types of tokens.

   
~  

This change introduces Token Generator classes, with three classes implemented:

~   1. A Base class
~   2. A class for our legacy SHA1 tokens
~   3. A class for our new token format that has a vendor prefix and checksum.

  ~

This change introduces token generators, which handle the logic for token

  ~ generation and allows for multiple ways of generating tokens. There are three
  ~ classes implemented:
  ~ 1. A base class.
  + 2. A class for generating our legacy SHA1 tokens.
  + 3. A class for generating our new token format that has a vendor prefix and
  + checksum.

  +
  +

This also adds a registry for the token generators.

Commits:

Summary ID
Create Token Generator classes
548aa507d38225ad8858092b4997b1cbe494729e
Create Token Generator classes and registry
3766327e9807f31917daddee7eb196ce721b77a7

Depends On:

+12341 - Update the BaseWebAPIToken model for API Tokens v2

Diff:

Revision 2 (+978)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
maubin
maubin
maubin
david
  1. 
      
  2. Docstrings in this file should be "Testing ..."

  3. Docstrings in this file should be "Testing ..."

    You can also put the """ on the same line for this one.

  4. djblets/secrets/token_generators/__init__.py (Diff revision 5)
     
     
     
     
     
     

    I also mentioned it in the review for /r/12419, but we should instantiate the registry as an attribute of this module. You can use djblets.registries.importer.lazy_import_registry for this.

  5. Typo: NotImeplementedError -> NotImplementedError.

  6. There's an extra blank line here.

  7. Typo: NotImeplementedError -> NotImplementedError.

  8. There's not actually a **kwargs in the method signature.

  9. 
      
maubin
Review request changed

Change Summary:

  • Fixed test docstrings.
  • Made the token generator registry an attribute of the token generators module.

Testing Done:

   
  • Created unit tests for the token generators.
~  
  • Ran all unit tests with success.
  ~
  • Ran all unit tests in djblets.secrets.tests.
-  
  • Still need to add unit tests for token generators' validate_token method.

Commits:

Summary ID
Create Token Generator classes and registry
658228f96385f1af8073c6edb06fa3fa93da394f
Create Token Generator classes and registry
c407408eb185f52c364fba90678a6f58f8b33016

Diff:

Revision 6 (+1144)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
david
  1. Ship It!
  2. 
      
chipx86
  1. 
      
  2. djblets/secrets/tests/test_legacy_sha1_token_generator.py (Diff revision 7)
     
     
     
     
     
     
     

    For these tests, you can probably just put the data within the call. Helps keep the construction and parameters together.

    Same below.

  3. This can be consolidated:

    token = (
        '_%s'
        % self.token_generator.create_token(...)[1:]
    )
    
  4. Typo: "charcter" -> "character"

  5. This can be consolidated:

    token = (
        '!%s'
        % self.token_generator.create_token(...)[1:]
    )
    
  6. This should have a doc comment.

  7. Missing , optional

  8. djblets/secrets/token_generators/base.py (Diff revision 7)
     
     
     

    This method is missing a Returns.

  9. djblets/secrets/token_generators/base.py (Diff revision 7)
     
     
     

    This should make it clear what the return type should be if a determination could not be made.

    1. I could return None (or False?) whenever the determination cannot be made, but I'm confused on what this actually means and whether we'd need a value other than True and False. In what case would a token generator not be able to determine whether the given token could have been generated by it?

    2. My original thought had to do with secret scanning, where a SHA1 isn't necessarily a token (it could be anything), but secret scanning probably wouldn't end up depending on this method anyway, so scratch that. A bool is probably fine.

  10. djblets/secrets/token_generators/base.py (Diff revision 7)
     
     
     
     

    Missing , optional

  11. Let's say "legacy SHA1 tokens".

  12. Let's say "legacy SHA1 tokens".

  13. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
     
     
     
     
     
     
     

    These should be in alphabetical order.

  14. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
     
     
     
     

    This method doesn't seem to ever return None.

    However, it does raise an exception, which should be documented.

  15. Blank line between these.

  16. Localized text can't be formatted within the _(...). The % must be after.

  17. No need for the outer parens.

    Strings generally concatenate faster if joining via a format string, rather than +. So:

    prefix = '%s%s%s' % (prefix, str(attempt), timezone.now().isoformat())
    
  18. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
     
     
     
     
     

    The in check requires a loop through charset for each character in token. Two things can be done to speed that up:

    charset = set('0123456789abcdef')
    

    Or, probably better, define a regex for what a SHA1 token can be: [0-9A-Fa-f]{40}

    This would allow for a single optimized check.

  19. djblets/secrets/token_generators/registry.py (Diff revision 7)
     
     
     
     
     

    For Registry, get() will raise an exception if the entry is not found.

  20. We usually use snake_case for identifiers.

  21. Doc comments are in the form of:

    #: Summary
    #:
    #: ...
    

    Same conventions as docstrings. These will get processed by Sphinx.

  22. token_info probably shouldn't be optional here, should it?

    If so, then the keys within must also be optional, and the function must check for them.

  23. For Raises, the description must be indented.

  24. Let's avoid the repeated self.CHARSET attribute lookup by pulling that out first.

  25. Rather than re-assigning, you can do:

    checksum = (
        self._base62_encode(...)
        .zfill(...)
    )
    
  26. This can be one statement.

  27. Same note about token_info.

    This will apply to all other classes too.

  28. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
     
     
     
     
     
     

    Same as with the SHA1 generator, we should avoid a search of [0-9a-f_] for every character in the hash.

    We also want the cheapest check first, which would be the length check, since that's stored along with a string.

    So I'd suggest a regex just for the hash part on the class-level, as simply ^_[0-9a-f]+$, and a check like:

    return (len(token) == 255 and
            HASH_RE.match(token[len(token_type):]))
    
  29. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
     
     
     
     
     
     
     
     
     

    We access self.CHARSET many times, each involving an attribute lookup, which isn't free. Let's pull that out into a local variable.

  30. Slightly faster as:

    return ''.join(reversed(result))
    

    This loops through result once. If using .reverse() instead, it loops twice, once to reverse the list, the second time to join items.

  31. Blank line between these.

  32. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
     
     
     
     
     
     

    The formatting has to happen after _(...).

  33. 
      
maubin
maubin
chipx86
  1. Just a few things left that I noticed this round. Almost there!

  2. djblets/secrets/tests/test_legacy_sha1_token_generator.py (Diff revision 9)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    For these tests, let's feed in a hard-coded token. This gives us clear test data to work with, and avoids side effects from create_token changes or regressions.

    This applies to the other generator tests as well.

  3. Should be fully lowercase (snake_case is typically lowercase).

  4. As this is a caller-time problem, we probably don't need to log anything. The exception will be fine.

    Same with the other generator.

  5. Small nit. We usually keep the % on the line with the argument.

  6. If this is missing content, we probably just want to say "Nope, not valid for this generator" and return False.

  7. 
      
maubin
chipx86
  1. Ship It!
  2. 
      
maubin
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-3.x (4d31e46)
Loading...