• 
      

    Create token generator classes and a registry for them

    Review Request #12415 — Created June 24, 2022 and submitted

    Information

    Djblets
    release-3.x

    Reviewers

    Currently, Djblets can only generate SHA1 tokens and token generation is
    baked into djblets.webapi.managers.WebAPITokenManager. As part of our move to
    using an improved token format (API Tokens v2), we need a way to generate
    different types of tokens.

    This change introduces token generators, which handle the logic for token
    generation and allows for multiple ways of generating tokens. There are three
    classes implemented:
    1. A base class.
    2. A class for generating our legacy SHA1 tokens.
    3. A class for generating our new token format that has a vendor prefix and
    checksum.

    This also adds a registry for the token generators.

    • Created unit tests for the token generators.
    • Ran all unit tests in djblets.secrets.tests.
    Summary ID
    Create Token Generator classes and registry
    069535f539beaea702acdbf986f55404f71da379
    Description From Last Updated

    line too long (80 > 79 characters) Column: 80 Error code: E501

    reviewbotreviewbot

    'django.contrib.auth.models.User' imported but unused Column: 1 Error code: F401

    reviewbotreviewbot

    line too long (80 > 79 characters) Column: 80 Error code: E501

    reviewbotreviewbot

    'django.db.models' imported but unused Column: 1 Error code: F401

    reviewbotreviewbot

    'django.db.models' imported but unused Column: 1 Error code: F401

    reviewbotreviewbot

    Docstrings in this file should be "Testing ..."

    daviddavid

    Docstrings in this file should be "Testing ..." You can also put the """ on the same line for this …

    daviddavid

    I also mentioned it in the review for /r/12419, but we should instantiate the registry as an attribute of this …

    daviddavid

    Typo: NotImeplementedError -> NotImplementedError.

    daviddavid

    There's an extra blank line here.

    daviddavid

    Typo: NotImeplementedError -> NotImplementedError.

    daviddavid

    There's not actually a **kwargs in the method signature.

    daviddavid

    line too long (81 > 79 characters) Column: 80 Error code: E501

    reviewbotreviewbot

    line too long (80 > 79 characters) Column: 80 Error code: E501

    reviewbotreviewbot

    line too long (89 > 79 characters) Column: 80 Error code: E501

    reviewbotreviewbot

    For these tests, you can probably just put the data within the call. Helps keep the construction and parameters together. …

    chipx86chipx86

    This can be consolidated: token = ( '_%s' % self.token_generator.create_token(...)[1:] )

    chipx86chipx86

    Typo: "charcter" -> "character"

    chipx86chipx86

    This can be consolidated: token = ( '!%s' % self.token_generator.create_token(...)[1:] )

    chipx86chipx86

    This should have a doc comment.

    chipx86chipx86

    Missing , optional

    chipx86chipx86

    This method is missing a Returns.

    chipx86chipx86

    This should make it clear what the return type should be if a determination could not be made.

    chipx86chipx86

    Missing , optional

    chipx86chipx86

    Let's say "legacy SHA1 tokens".

    chipx86chipx86

    Let's say "legacy SHA1 tokens".

    chipx86chipx86

    These should be in alphabetical order.

    chipx86chipx86

    This method doesn't seem to ever return None. However, it does raise an exception, which should be documented.

    chipx86chipx86

    Blank line between these.

    chipx86chipx86

    Localized text can't be formatted within the _(...). The % must be after.

    chipx86chipx86

    No need for the outer parens. Strings generally concatenate faster if joining via a format string, rather than +. So: …

    chipx86chipx86

    The in check requires a loop through charset for each character in token. Two things can be done to speed …

    chipx86chipx86

    For Registry, get() will raise an exception if the entry is not found.

    chipx86chipx86

    We usually use snake_case for identifiers.

    chipx86chipx86

    Doc comments are in the form of: #: Summary #: #: ... Same conventions as docstrings. These will get processed …

    chipx86chipx86

    token_info probably shouldn't be optional here, should it? If so, then the keys within must also be optional, and the …

    chipx86chipx86

    For Raises, the description must be indented.

    chipx86chipx86

    Let's avoid the repeated self.CHARSET attribute lookup by pulling that out first.

    chipx86chipx86

    Rather than re-assigning, you can do: checksum = ( self._base62_encode(...) .zfill(...) )

    chipx86chipx86

    This can be one statement.

    chipx86chipx86

    Same note about token_info. This will apply to all other classes too.

    chipx86chipx86

    Same as with the SHA1 generator, we should avoid a search of [0-9a-f_] for every character in the hash. We …

    chipx86chipx86

    We access self.CHARSET many times, each involving an attribute lookup, which isn't free. Let's pull that out into a local …

    chipx86chipx86

    Slightly faster as: return ''.join(reversed(result)) This loops through result once. If using .reverse() instead, it loops twice, once to reverse …

    chipx86chipx86

    Blank line between these.

    chipx86chipx86

    The formatting has to happen after _(...).

    chipx86chipx86

    For these tests, let's feed in a hard-coded token. This gives us clear test data to work with, and avoids …

    chipx86chipx86

    Should be fully lowercase (snake_case is typically lowercase).

    chipx86chipx86

    As this is a caller-time problem, we probably don't need to log anything. The exception will be fine. Same with …

    chipx86chipx86

    Small nit. We usually keep the % on the line with the argument.

    chipx86chipx86

    If this is missing content, we probably just want to say "Nope, not valid for this generator" and return False.

    chipx86chipx86
    Checks run (1 failed, 1 succeeded)
    flake8 failed.
    JSHint passed.

    flake8

    maubin
    Review request changed
    Change Summary:
    • Adds a registry for token generators
    • Add tests for KeyErrors
    Summary:
    [WIP] Create Token Generator Classes
    [WIP] Create token generator classes and a registry for them
    Description:
    ~  

    Currently, Djblets can only generate SHA1 tokens. As part of our move to using

    ~   an improved token format, we need a way to generate different types of tokens.

      ~

    Currently, Djblets can only generate SHA1 tokens and token generation is

      ~ baked into djblets.webapi.managers.WebAPITokenManager. As part of our move to
      + using an improved token format (API Tokens v2), we need a way to generate
      + different types of tokens.

       
    ~  

    This change introduces Token Generator classes, with three classes implemented:

    ~   1. A Base class
    ~   2. A class for our legacy SHA1 tokens
    ~   3. A class for our new token format that has a vendor prefix and checksum.

      ~

    This change introduces token generators, which handle the logic for token

      ~ generation and allows for multiple ways of generating tokens. There are three
      ~ classes implemented:
      ~ 1. A base class.
      + 2. A class for generating our legacy SHA1 tokens.
      + 3. A class for generating our new token format that has a vendor prefix and
      + checksum.

      +
      +

    This also adds a registry for the token generators.

    Commits:
    Summary ID
    Create Token Generator classes
    548aa507d38225ad8858092b4997b1cbe494729e
    Create Token Generator classes and registry
    3766327e9807f31917daddee7eb196ce721b77a7

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    maubin
    maubin
    maubin
    maubin
    david
    1. 
        
    2. Show all issues

      Docstrings in this file should be "Testing ..."

    3. Show all issues

      Docstrings in this file should be "Testing ..."

      You can also put the """ on the same line for this one.

    4. djblets/secrets/token_generators/__init__.py (Diff revision 5)
       
       
       
       
       
       
      Show all issues

      I also mentioned it in the review for /r/12419, but we should instantiate the registry as an attribute of this module. You can use djblets.registries.importer.lazy_import_registry for this.

    5. Show all issues

      Typo: NotImeplementedError -> NotImplementedError.

    6. Show all issues

      There's an extra blank line here.

    7. Show all issues

      Typo: NotImeplementedError -> NotImplementedError.

    8. Show all issues

      There's not actually a **kwargs in the method signature.

    9. 
        
    maubin
    Review request changed
    Change Summary:
    • Fixed test docstrings.
    • Made the token generator registry an attribute of the token generators module.
    Testing Done:
       
    • Created unit tests for the token generators.
    ~  
    • Ran all unit tests with success.
      ~
    • Ran all unit tests in djblets.secrets.tests.
    -  
    • Still need to add unit tests for token generators' validate_token method.
    Commits:
    Summary ID
    Create Token Generator classes and registry
    658228f96385f1af8073c6edb06fa3fa93da394f
    Create Token Generator classes and registry
    c407408eb185f52c364fba90678a6f58f8b33016

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    maubin
    david
    1. Ship It!
    2. 
        
    chipx86
    1. 
        
    2. djblets/secrets/tests/test_legacy_sha1_token_generator.py (Diff revision 7)
       
       
       
       
       
       
       
      Show all issues

      For these tests, you can probably just put the data within the call. Helps keep the construction and parameters together.

      Same below.

    3. Show all issues

      This can be consolidated:

      token = (
          '_%s'
          % self.token_generator.create_token(...)[1:]
      )
      
    4. Show all issues

      Typo: "charcter" -> "character"

    5. Show all issues

      This can be consolidated:

      token = (
          '!%s'
          % self.token_generator.create_token(...)[1:]
      )
      
    6. Show all issues

      This should have a doc comment.

    7. Show all issues

      Missing , optional

    8. djblets/secrets/token_generators/base.py (Diff revision 7)
       
       
       
      Show all issues

      This method is missing a Returns.

    9. djblets/secrets/token_generators/base.py (Diff revision 7)
       
       
       
      Show all issues

      This should make it clear what the return type should be if a determination could not be made.

      1. I could return None (or False?) whenever the determination cannot be made, but I'm confused on what this actually means and whether we'd need a value other than True and False. In what case would a token generator not be able to determine whether the given token could have been generated by it?

      2. My original thought had to do with secret scanning, where a SHA1 isn't necessarily a token (it could be anything), but secret scanning probably wouldn't end up depending on this method anyway, so scratch that. A bool is probably fine.

    10. djblets/secrets/token_generators/base.py (Diff revision 7)
       
       
       
       
      Show all issues

      Missing , optional

    11. Show all issues

      Let's say "legacy SHA1 tokens".

    12. Show all issues

      Let's say "legacy SHA1 tokens".

    13. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
       
       
       
       
       
       
       
      Show all issues

      These should be in alphabetical order.

    14. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
       
       
       
       
      Show all issues

      This method doesn't seem to ever return None.

      However, it does raise an exception, which should be documented.

    15. Show all issues

      Blank line between these.

    16. Show all issues

      Localized text can't be formatted within the _(...). The % must be after.

    17. Show all issues

      No need for the outer parens.

      Strings generally concatenate faster if joining via a format string, rather than +. So:

      prefix = '%s%s%s' % (prefix, str(attempt), timezone.now().isoformat())
      
    18. djblets/secrets/token_generators/legacy_sha1.py (Diff revision 7)
       
       
       
       
       
      Show all issues

      The in check requires a loop through charset for each character in token. Two things can be done to speed that up:

      charset = set('0123456789abcdef')
      

      Or, probably better, define a regex for what a SHA1 token can be: [0-9A-Fa-f]{40}

      This would allow for a single optimized check.

    19. djblets/secrets/token_generators/registry.py (Diff revision 7)
       
       
       
       
       
      Show all issues

      For Registry, get() will raise an exception if the entry is not found.

    20. Show all issues

      We usually use snake_case for identifiers.

    21. Show all issues

      Doc comments are in the form of:

      #: Summary
      #:
      #: ...
      

      Same conventions as docstrings. These will get processed by Sphinx.

    22. Show all issues

      token_info probably shouldn't be optional here, should it?

      If so, then the keys within must also be optional, and the function must check for them.

    23. Show all issues

      For Raises, the description must be indented.

    24. Show all issues

      Let's avoid the repeated self.CHARSET attribute lookup by pulling that out first.

    25. Show all issues

      Rather than re-assigning, you can do:

      checksum = (
          self._base62_encode(...)
          .zfill(...)
      )
      
    26. Show all issues

      This can be one statement.

    27. Show all issues

      Same note about token_info.

      This will apply to all other classes too.

    28. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
       
       
       
       
       
       
      Show all issues

      Same as with the SHA1 generator, we should avoid a search of [0-9a-f_] for every character in the hash.

      We also want the cheapest check first, which would be the length check, since that's stored along with a string.

      So I'd suggest a regex just for the hash part on the class-level, as simply ^_[0-9a-f]+$, and a check like:

      return (len(token) == 255 and
              HASH_RE.match(token[len(token_type):]))
      
    29. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
       
       
       
       
       
       
       
       
       
      Show all issues

      We access self.CHARSET many times, each involving an attribute lookup, which isn't free. Let's pull that out into a local variable.

    30. Show all issues

      Slightly faster as:

      return ''.join(reversed(result))
      

      This loops through result once. If using .reverse() instead, it loops twice, once to reverse the list, the second time to join items.

    31. Show all issues

      Blank line between these.

    32. djblets/secrets/token_generators/vendor_checksum.py (Diff revision 7)
       
       
       
       
       
       
      Show all issues

      The formatting has to happen after _(...).

    33. 
        
    maubin
    maubin
    chipx86
    1. Just a few things left that I noticed this round. Almost there!

    2. djblets/secrets/tests/test_legacy_sha1_token_generator.py (Diff revision 9)
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
      Show all issues

      For these tests, let's feed in a hard-coded token. This gives us clear test data to work with, and avoids side effects from create_token changes or regressions.

      This applies to the other generator tests as well.

    3. Show all issues

      Should be fully lowercase (snake_case is typically lowercase).

    4. Show all issues

      As this is a caller-time problem, we probably don't need to log anything. The exception will be fine.

      Same with the other generator.

    5. Show all issues

      Small nit. We usually keep the % on the line with the argument.

    6. Show all issues

      If this is missing content, we probably just want to say "Nope, not valid for this generator" and return False.

    7. 
        
    maubin
    chipx86
    1. Ship It!
    2. 
        
    maubin
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-3.x (4d31e46)