Create token generator classes and a registry for them

Review Request #12415 — Created June 24, 2022 and submitted

Information

Djblets
release-3.x

Reviewers

Currently, Djblets can only generate SHA1 tokens and token generation is
baked into djblets.webapi.managers.WebAPITokenManager. As part of our move to
using an improved token format (API Tokens v2), we need a way to generate
different types of tokens.

This change introduces token generators, which handle the logic for token
generation and allows for multiple ways of generating tokens. There are three
classes implemented:
1. A base class.
2. A class for generating our legacy SHA1 tokens.
3. A class for generating our new token format that has a vendor prefix and
checksum.

This also adds a registry for the token generators.

  • Created unit tests for the token generators.
  • Ran all unit tests in djblets.secrets.tests.
Summary ID
Create Token Generator classes and registry
069535f539beaea702acdbf986f55404f71da379
Description From Last Updated

line too long (80 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

'django.contrib.auth.models.User' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

line too long (80 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

'django.db.models' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

'django.db.models' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

Docstrings in this file should be "Testing ..."

daviddavid

Docstrings in this file should be "Testing ..." You can also put the """ on the same line for this …

daviddavid

I also mentioned it in the review for /r/12419, but we should instantiate the registry as an attribute of this …

daviddavid

Typo: NotImeplementedError -> NotImplementedError.

daviddavid

There's an extra blank line here.

daviddavid

Typo: NotImeplementedError -> NotImplementedError.

daviddavid

There's not actually a **kwargs in the method signature.

daviddavid

line too long (81 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

line too long (80 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

line too long (89 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

For these tests, you can probably just put the data within the call. Helps keep the construction and parameters together. …

chipx86chipx86

This can be consolidated: token = ( '_%s' % self.token_generator.create_token(...)[1:] )

chipx86chipx86

Typo: "charcter" -> "character"

chipx86chipx86

This can be consolidated: token = ( '!%s' % self.token_generator.create_token(...)[1:] )

chipx86chipx86

This should have a doc comment.

chipx86chipx86

Missing , optional

chipx86chipx86

This method is missing a Returns.

chipx86chipx86

This should make it clear what the return type should be if a determination could not be made.

chipx86chipx86

Missing , optional

chipx86chipx86

Let's say "legacy SHA1 tokens".

chipx86chipx86

Let's say "legacy SHA1 tokens".

chipx86chipx86

These should be in alphabetical order.

chipx86chipx86

This method doesn't seem to ever return None. However, it does raise an exception, which should be documented.

chipx86chipx86

Blank line between these.

chipx86chipx86

Localized text can't be formatted within the _(...). The % must be after.

chipx86chipx86

No need for the outer parens. Strings generally concatenate faster if joining via a format string, rather than +. So: …

chipx86chipx86

The in check requires a loop through charset for each character in token. Two things can be done to speed …

chipx86chipx86

For Registry, get() will raise an exception if the entry is not found.

chipx86chipx86

We usually use snake_case for identifiers.

chipx86chipx86

Doc comments are in the form of: #: Summary #: #: ... Same conventions as docstrings. These will get processed …

chipx86chipx86

token_info probably shouldn't be optional here, should it? If so, then the keys within must also be optional, and the …

chipx86chipx86

For Raises, the description must be indented.

chipx86chipx86

Let's avoid the repeated self.CHARSET attribute lookup by pulling that out first.

chipx86chipx86

Rather than re-assigning, you can do: checksum = ( self._base62_encode(...) .zfill(...) )

chipx86chipx86

This can be one statement.

chipx86chipx86

Same note about token_info. This will apply to all other classes too.

chipx86chipx86

Same as with the SHA1 generator, we should avoid a search of [0-9a-f_] for every character in the hash. We …

chipx86chipx86

We access self.CHARSET many times, each involving an attribute lookup, which isn't free. Let's pull that out into a local …

chipx86chipx86

Slightly faster as: return ''.join(reversed(result)) This loops through result once. If using .reverse() instead, it loops twice, once to reverse …

chipx86chipx86

Blank line between these.

chipx86chipx86

The formatting has to happen after _(...).

chipx86chipx86

For these tests, let's feed in a hard-coded token. This gives us clear test data to work with, and avoids …

chipx86chipx86

Should be fully lowercase (snake_case is typically lowercase).

chipx86chipx86

As this is a caller-time problem, we probably don't need to log anything. The exception will be fine. Same with …

chipx86chipx86

Small nit. We usually keep the % on the line with the argument.

chipx86chipx86

If this is missing content, we probably just want to say "Nope, not valid for this generator" and return False.

chipx86chipx86
Checks run (1 failed, 1 succeeded)
flake8 failed.
JSHint passed.

flake8

maubin
Review request changed
Change Summary:
  • Adds a registry for token generators
  • Add tests for KeyErrors
Summary:
[WIP] Create Token Generator Classes
[WIP] Create token generator classes and a registry for them
Description:
~  

Currently, Djblets can only generate SHA1 tokens. As part of our move to using

~   an improved token format, we need a way to generate different types of tokens.

  ~

Currently, Djblets can only generate SHA1 tokens and token generation is

  ~ baked into djblets.webapi.managers.WebAPITokenManager. As part of our move to
  + using an improved token format (API Tokens v2), we need a way to generate
  + different types of tokens.

   
~  

This change introduces Token Generator classes, with three classes implemented:

~   1. A Base class
~   2. A class for our legacy SHA1 tokens
~   3. A class for our new token format that has a vendor prefix and checksum.

  ~

This change introduces token generators, which handle the logic for token

  ~ generation and allows for multiple ways of generating tokens. There are three
  ~ classes implemented:
  ~ 1. A base class.
  + 2. A class for generating our legacy SHA1 tokens.
  + 3. A class for generating our new token format that has a vendor prefix and
  + checksum.

  +
  +

This also adds a registry for the token generators.

Commits:
Summary ID
Create Token Generator classes
548aa507d38225ad8858092b4997b1cbe494729e
Create Token Generator classes and registry
3766327e9807f31917daddee7eb196ce721b77a7

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
maubin
maubin
maubin
david
  1. 
      
  2. Show all issues

    Docstrings in this file should be "Testing ..."

  3. Show all issues

    Docstrings in this file should be "Testing ..."

    You can also put the """ on the same line for this one.

  4. djblets/secrets/token_generators/__init__.py (Diff revision 5)
     
     
     
     
     
     
    Show all issues

    I also mentioned it in the review for /r/12419, but we should instantiate the registry as an attribute of this module. You can use djblets.registries.importer.lazy_import_registry for this.

  5. Show all issues

    Typo: NotImeplementedError -> NotImplementedError.

  6. Show all issues

    There's an extra blank line here.

  7. Show all issues

    Typo: NotImeplementedError -> NotImplementedError.

  8. Show all issues

    There's not actually a **kwargs in the method signature.

  9. 
      
maubin
Review request changed
Change Summary:
  • Fixed test docstrings.
  • Made the token generator registry an attribute of the token generators module.
Testing Done:
   
  • Created unit tests for the token generators.
~  
  • Ran all unit tests with success.
  ~
  • Ran all unit tests in djblets.secrets.tests.
-  
  • Still need to add unit tests for token generators' validate_token method.
Commits:
Summary ID
Create Token Generator classes and registry
658228f96385f1af8073c6edb06fa3fa93da394f
Create Token Generator classes and registry
c407408eb185f52c364fba90678a6f58f8b33016

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
david
  1. Ship It!
  2. 
      
chipx86
  1. 
      
  2. djblets/secrets/tests/test_legacy_sha1_token_generator.py (Diff revision 7)
     
     
     
     
     
     
     
    Show all issues

    For these tests, you can probably just put the data within the call. Helps keep the construction and parameters together.

    Same below.

  3. Show all issues

    This can be consolidated:

    token = (
        '_%s'
        % self.token_generator.create_token(...)[1:]
    )
    
  4. Show all issues

    Typo: "charcter" -> "character"

  5. Show all issues

    This can be consolidated:

    token = (
        '!%s'
        % self.token_generator.create_token(...)[1:]
    )
    
  6. Show all issues

    This should have a doc comment.

  7. Show all issues

    Missing , optional

  8. djblets/secrets/token_generators/base.py (Diff revision 7)
     
     
     
    Show all issues

    This method is missing a Returns.

  9. djblets/secrets/token_generators/base.py (Diff revision 7)
     
     
     
    Show all issues

    This should make it clear what the return type should be if a determination could not be made.

    1. I could return None (or False?) whenever the determination cannot be made, but I'm confused on what this actually means and whether we'd need a value other than True and False. In what case would a token generator not be able to determine whether the given token could have been generated by it?

    2. My original thought had to do with secret scanning, where a SHA1 isn't necessarily a token (it could be anything), but secret scanning probably wouldn't end up depending on this method anyway, so scratch that. A bool is probably fine.

  10. djblets/secrets/token_generators/base.py (Diff revision 7)
     
     
     
     
    Show all issues

    Missing , optional

  11. Show all issues

    Let's say "legacy SHA1 tokens".

  12. Show all issues

    Let's say "legacy SHA1 tokens".

  13. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
     
     
     
     
     
     
     
    Show all issues

    These should be in alphabetical order.

  14. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
     
     
     
     
    Show all issues

    This method doesn't seem to ever return None.

    However, it does raise an exception, which should be documented.

  15. Show all issues

    Blank line between these.

  16. Show all issues

    Localized text can't be formatted within the _(...). The % must be after.

  17. Show all issues

    No need for the outer parens.

    Strings generally concatenate faster if joining via a format string, rather than +. So:

    prefix = '%s%s%s' % (prefix, str(attempt), timezone.now().isoformat())
    
  18. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
     
     
     
     
     
    Show all issues

    The in check requires a loop through charset for each character in token. Two things can be done to speed that up:

    charset = set('0123456789abcdef')
    

    Or, probably better, define a regex for what a SHA1 token can be: [0-9A-Fa-f]{40}

    This would allow for a single optimized check.

  19. djblets/secrets/token_generators/registry.py (Diff revision 7)
     
     
     
     
     
    Show all issues

    For Registry, get() will raise an exception if the entry is not found.

  20. Show all issues

    We usually use snake_case for identifiers.

  21. Show all issues

    Doc comments are in the form of:

    #: Summary
    #:
    #: ...
    

    Same conventions as docstrings. These will get processed by Sphinx.

  22. Show all issues

    token_info probably shouldn't be optional here, should it?

    If so, then the keys within must also be optional, and the function must check for them.

  23. Show all issues

    For Raises, the description must be indented.

  24. Show all issues

    Let's avoid the repeated self.CHARSET attribute lookup by pulling that out first.

  25. Show all issues

    Rather than re-assigning, you can do:

    checksum = (
        self._base62_encode(...)
        .zfill(...)
    )
    
  26. Show all issues

    This can be one statement.

  27. Show all issues

    Same note about token_info.

    This will apply to all other classes too.

  28. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
     
     
     
     
     
     
    Show all issues

    Same as with the SHA1 generator, we should avoid a search of [0-9a-f_] for every character in the hash.

    We also want the cheapest check first, which would be the length check, since that's stored along with a string.

    So I'd suggest a regex just for the hash part on the class-level, as simply ^_[0-9a-f]+$, and a check like:

    return (len(token) == 255 and
            HASH_RE.match(token[len(token_type):]))
    
  29. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
     
     
     
     
     
     
     
     
     
    Show all issues

    We access self.CHARSET many times, each involving an attribute lookup, which isn't free. Let's pull that out into a local variable.

  30. Show all issues

    Slightly faster as:

    return ''.join(reversed(result))
    

    This loops through result once. If using .reverse() instead, it loops twice, once to reverse the list, the second time to join items.

  31. Show all issues

    Blank line between these.

  32. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
     
     
     
     
     
     
    Show all issues

    The formatting has to happen after _(...).

  33. 
      
maubin
maubin
chipx86
  1. Just a few things left that I noticed this round. Almost there!

  2. djblets/secrets/tests/test_legacy_sha1_token_generator.py (Diff revision 9)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    For these tests, let's feed in a hard-coded token. This gives us clear test data to work with, and avoids side effects from create_token changes or regressions.

    This applies to the other generator tests as well.

  3. Show all issues

    Should be fully lowercase (snake_case is typically lowercase).

  4. Show all issues

    As this is a caller-time problem, we probably don't need to log anything. The exception will be fine.

    Same with the other generator.

  5. Show all issues

    Small nit. We usually keep the % on the line with the argument.

  6. Show all issues

    If this is missing content, we probably just want to say "Nope, not valid for this generator" and return False.

  7. 
      
maubin
chipx86
  1. Ship It!
  2. 
      
maubin
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.x (4d31e46)