• 
      

    Update the BaseWebAPIToken model for API Tokens v2

    Review Request #12341 — Created June 6, 2022 and submitted

    Information

    Djblets
    release-3.x

    Reviewers

    This change is the first in a series of changes for improving our API Tokens
    (which we'll refer to as API Tokens v2). Currently, Djblets generates tokens
    using SHA1, which has the following disadvantages:
    - Tokens aren't able to be validated or identified from other SHA1 tokens.
    - Can't use secret scanning to detect leaked tokens
    - Aren't as secure as some modern counterparts.

    In response, we are moving towards a new token format that contain a
    vendor/category prefix and can be validated through a checksum. Additionally,
    we want to be able to support token expiration and invalidation, which we
    currently don't do. Since we are moving away from an old token format, we also
    need support for token deprecation.

    In this change we modify the API tokens model to allow the possibility for
    new token types, expiration, and invalidation. The following changes are made
    to the BaseWebAPIToken model:
    - Adding expires field and an is_expired method.
    - Adding valid, invalid_date and invalid_reason fields.
    - Adding token_generator_id field.
    - Adding last_used field.
    - Updating max length of token to 255 characters.

    The token authentication backends were also updated to block authentication
    if an expired or invalid token was used. We also added help texts for all
    of the BaseWebAPIToken fields for better clarity.

    • Ran unit tests in djblets.webapi.tests
    • Manually tested token authentication when making requests with curl
      and rbt api-get
    Summary ID
    Update the BaseWebAPIToken model for API Tokens v2
    25b540b95771d6599931010a52a7db61ca93ac88
    Description From Last Updated

    I think we just want timezone.now(). This is the modern equivalent of our old get_tz_aware_utcnow().

    chipx86chipx86

    By default, save() will save every single field in the model. If there are two threads dealing with an API …

    chipx86chipx86

    Same note about saving.

    chipx86chipx86

    Blank line before comments.

    chipx86chipx86

    We haven't been consistent with this, but we should aim to have localized help_text= set for every field. Since that's …

    chipx86chipx86

    255 should be fine.

    chipx86chipx86

    'django.utils.timezone' imported but unused Column: 1 Error code: F401

    reviewbotreviewbot

    continuation line over-indented for hanging indent Column: 13 Error code: E126

    reviewbotreviewbot

    This is being set after we save. I assume we'll want to set this before the parent save call above.

    chipx86chipx86

    Things can go wrong when comparing dates/times. 1 second seems fine, but what if we hit a leap second while …

    chipx86chipx86

    It looks like you've pulled on release-3.x but not rebased, which is why your diff is now undoing these changes. …

    daviddavid

    This could use djblets.db.fields.ModificationTimestampField in order to avoid having to babysit it in save()

    daviddavid

    Add a blank line in here (datetime is standard lib, django is third party).

    daviddavid

    Can you move this to the third-party import section?

    daviddavid

    'datetime.timedelta' imported but unused Column: 1 Error code: F401

    reviewbotreviewbot

    I feel like this is something we should do per-test-method rather than for everything in the entire suite. I think …

    daviddavid

    This is missing a "Version Added".

    chipx86chipx86

    Might be more clear as "... the token is invalid or expired, and any appropriate HTTP headers to send to …

    chipx86chipx86

    This should be "str" now.

    chipx86chipx86

    Since we're defining a new method, it might be a good opportunity to replace this with a dictionary, for better …

    chipx86chipx86

    Any % values must be specified outside of a _(...). Otherwise they'll modify the string that needs to be localized.

    chipx86chipx86

    We should specify which user this pertains to.

    chipx86chipx86

    Same comment about localized string formatting.

    chipx86chipx86

    We always either override everything in here, or return None. Rather than pre-define it, let's just return { ... } …

    chipx86chipx86

    Same comment about localized string formatting.

    chipx86chipx86

    We should specify which user this pertains to.

    chipx86chipx86

    For safety, this should ddo LOGIN_FAILED.headers.copy(). That avoids any state leakage (a problem we've had before in the API).

    chipx86chipx86

    For something like this, formatting works a bit better as: return_dict['error_message'] = ( _('...') % ... ) That gives both …

    chipx86chipx86

    Rather than format and set a default, let's move this to an else below.

    chipx86chipx86

    As above, this becomes more clear when the form is: return_dict['error_message'] = ( _('This token is invalid. It became invalid …

    chipx86chipx86

    When logging, we don't want to use %. Instead, pass the value as a positional parameter.

    chipx86chipx86

    Similar comments above apply here.

    chipx86chipx86

    This should contain a Version Added.

    chipx86chipx86

    We've been moving to a better format. This can be: result = ( super(...) .valdilate_credentials(...) ) Though, being Python 3, …

    chipx86chipx86

    This can be simplified to: return self.expires is not None and timezone.now() >= self.expires

    chipx86chipx86

    Plain imports should be listed before from imports.

    chipx86chipx86

    Let's switch this to an import kgb, and then access these as attributes from kgb. This is the pattern we're …

    chipx86chipx86

    For unit tests, we don't want to use _(...).

    chipx86chipx86

    Same note about import kgb.

    chipx86chipx86

    This should be assertTrue

    chipx86chipx86

    line too long (82 > 79 characters) Column: 80 Error code: E501

    reviewbotreviewbot

    continuation line over-indented for visual indent Column: 36 Error code: E127

    reviewbotreviewbot

    Dictionaries should (almost) always be in multi-line format, with keys alphabetized and on their own line (not on the { …

    chipx86chipx86

    Same here.

    chipx86chipx86

    Missing a trailing comma.

    chipx86chipx86

    Let's actually move these into the assertion, so it's not scattered: self.assertEqual( result, ( False, '...', { # Explicit headers …

    chipx86chipx86

    No need for the .copy() here, since we know it won't be modified in a test. Although, we want to …

    chipx86chipx86

    I'm torn on this. My gut is telling me we don't want to bake this into the API. I think, …

    chipx86chipx86

    We don't want to localize here. Test results should almost always compare against fixed state. Generating dynamic state can mask …

    chipx86chipx86

    'djblets.webapi.errors.LOGIN_FAILED' imported but unused Column: 1 Error code: F401

    reviewbotreviewbot

    These should still hard-code the exact string we expect, rather than running it through any Django functions. Otherwise we're just …

    chipx86chipx86
    maubin
    chipx86
    1. 
        
    2. Show all issues

      I think we just want timezone.now(). This is the modern equivalent of our old get_tz_aware_utcnow().

    3. Show all issues

      By default, save() will save every single field in the model. If there are two threads dealing with an API token at once, one will overwrite the other's data.

      Best to always be explicit:

      webapi_token.save(update_fields=('last_used',))
      
    4. Show all issues

      Same note about saving.

      1. Actually I don't think I need to update the last_used field here because the oauth2_provider.models.AccessToken model doesn't have a last_used or an equivalent field. Will take this out.

    5. djblets/webapi/models.py (Diff revision 1)
       
       
       
      Show all issues

      Blank line before comments.

    6. djblets/webapi/models.py (Diff revision 1)
       
       
       
       
       
       
       
       
       
       
       
       
      Show all issues

      We haven't been consistent with this, but we should aim to have localized help_text= set for every field.

      Since that's long, the best formatting for fields are:

      field = models.FooField(
          param,
          param,
          help_text=...)
      
    7. djblets/webapi/models.py (Diff revision 1)
       
       
      Show all issues

      255 should be fine.

    8. 
        
    maubin
    Review request changed
    Change Summary:
    • Added help_text to token fields and reformatted the fields
    • Removed updating the last_used field for OAuth2 tokens since that field doesnt exist for those tokens
    • Modified BaseWebAPIToken default test to check for the new fields too
    • Modified WebAPITokenAuthBackend's tests to check if last_used field gets updated upon successful authentication
    • Updates the last_used field in WebAPITokenAuthBackend
    • Updates the last_updated field in whenever an API token object is saved and that save is from an update
    Description:
    ~  

    As part of our move to API Tokens v2, we need to open API tokens for new token

    ~   types, expiration, and invalidation. The following changes are made to the
    ~   BaseWebApiToken model:

      ~

    This change is the first in a series of changes for improving our API Tokens

      ~ (which we'll refer to as API Tokens v2). Currently, Djblets generates tokens
      ~ using SHA1, which has the following disadvantages:
      + - Tokens aren't able to be validated or identified from other SHA1 tokens.
      + - Can't use secret scanning to detect leaked tokens
      + - Aren't as secure as some modern counterparts.

       
    ~  
    • Adding expires field
    ~  
    • Adding valid, invalid_date and invalid_reason fields
    ~  
    • Adding token_generator_id field
    ~  
    • Adding last_used field
    ~  
    • Updating max length of token to 255 characters.
      ~

    In response, we are moving towards a new token format that contain a

      ~ vendor/category prefix and can be validated through a checksum. Additionally,
      ~ we want to be able to support token expiration and invalidation, which we
      ~ currently don't do. Since we are moving away from an old token format, we also
      ~ need support for token deprecation.

      +
      +

    In this change we modify the API tokens model to allow the possibility for

      + new token types, expiration, and invalidation. The following changes are made
      + to the BaseWebAPIToken model:
      + - Adding expires field
      + - Adding valid, invalid_date and invalid_reason fields
      + - Adding token_generator_id field
      + - Adding last_used field
      + - Updating max length of token to 255 characters.

      +
      +

    We also add help_text for all the BaseWebAPIToken fields for better

      + clarity.

    Commits:
    Summary ID
    Updating the BaseWebAPIToken model for API Tokens v2
    512cab638671fa0635701720c7ba37830a68de10
    Updating the BaseWebAPIToken model for API Tokens v2
    62dcc5f4eeadab14fbf86d3dbaba35301b532e9d

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    maubin
    chipx86
    1. 
        
    2. djblets/webapi/models.py (Diff revision 3)
       
       
       
      Show all issues

      This is being set after we save. I assume we'll want to set this before the parent save call above.

    3. djblets/webapi/tests/test_api_token.py (Diff revision 3)
       
       
       
       
       
      Show all issues

      Things can go wrong when comparing dates/times. 1 second seems fine, but what if we hit a leap second while this runs? Or something causes the process to stall? This sort of stuff unfortunately happens.

      A better solution we've found is to hard-code when "now" is, using spies:

      self.spy_on(timezone.now, op=kgb.SpyOpReturn(...))
      

      Then we can safely compare the values with strict equal checks.

    4. 
        
    maubin
    david
    1. 
        
    2. README.rst (Diff revision 4)
       
       
       
       
      Show all issues

      It looks like you've pulled on release-3.x but not rebased, which is why your diff is now undoing these changes. Can you rebase and update?

    3. 
        
    maubin
    david
    1. 
        
    2. djblets/webapi/models.py (Diff revision 5)
       
       
       
       
      Show all issues

      This could use djblets.db.fields.ModificationTimestampField in order to avoid having to babysit it in save()

    3. djblets/webapi/tests/test_api_token.py (Diff revision 5)
       
       
       
      Show all issues

      Add a blank line in here (datetime is standard lib, django is third party).

    4. djblets/webapi/tests/test_api_token.py (Diff revision 5)
       
       
      Show all issues

      Can you move this to the third-party import section?

    5. 
        
    maubin
    Review request changed
    Change Summary:
    • Spy on timezone.now in Token Auth authentication test to see if last_used gets updated
    • Fixed imports ordering issues
    • Uses ModificationTimeStampField for token's last_updated field
    Commits:
    Summary ID
    Updating the BaseWebAPIToken model for API Tokens v2
    bbdcc6ad1c1cce018b116c95eac2de22b3b7c9ce
    Updating the BaseWebAPIToken model for API Tokens v2
    ace466caf3cfa24acc2a8ef4204c9401037b468b

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    maubin
    david
    1. 
        
    2. djblets/webapi/tests/test_api_token.py (Diff revision 7)
       
       
      Show all issues

      I feel like this is something we should do per-test-method rather than for everything in the entire suite.

      I think I'd also be happier if we initialized it with some fake timestamp value. Right now in something like test_last_updated, if no time has elapsed between when the token is created and when it's saved, there's no reason to think the timestamp would change. Instead, to be sure everything is working independent of timing, we could do this:

      • create token
      • spy on timezone.now with a fake timestamp
      • save
      • check timestamp against fake value
      1. Sure, it makes sense to move it so that its per-test-method.

        For the 2nd problem, that's why I set the time_added and last_updated fields to timezone.now() - timedelta(hours=1) when I create the token. That way when we check to see if last_updated is timezone.now(), it'll fail if it's still timezone.now() - timedelta(hours=1). I think this gives the same effect of what you're looking for?

      2. Ah, you're right. I'm happy after you just move it to be per-method.

    3. 
        
    maubin
    david
    1. Ship It!
    2. 
        
    maubin
    chipx86
    1. 
        
    2. Show all issues

      This is missing a "Version Added".

    3. Show all issues

      This should be "str" now.

    4. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
       
      Show all issues

      Since we're defining a new method, it might be a good opportunity to replace this with a dictionary, for better readability and future expansion.

      1. Agreed. Should I then also update WebAPIAuthBackend.authenticate and the other methods that share this return type?

      2. That'd be API-breaking. We can just consume it as a dict and turn it into a tuple there. Opens the door to later moving those to a dict more easily if we find it necessary.

    5. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
      Show all issues

      Any % values must be specified outside of a _(...). Otherwise they'll modify the string that needs to be localized.

    6. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
      Show all issues

      We should specify which user this pertains to.

    7. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
       
       
      Show all issues

      Same comment about localized string formatting.

    8. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
      Show all issues

      Same comment about localized string formatting.

    9. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
      Show all issues

      We should specify which user this pertains to.

    10. Show all issues

      This should contain a Version Added.

    11. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
       
       
       
      Show all issues

      We've been moving to a better format. This can be:

      result = (
          super(...)
          .valdilate_credentials(...)
      )
      

      Though, being Python 3, the super can just be super(), probably.

    12. djblets/webapi/models.py (Diff revision 9)
       
       
       
       
       
      Show all issues

      This can be simplified to:

      return self.expires is not None and timezone.now() >= self.expires
      
    13. Show all issues

      Plain imports should be listed before from imports.

    14. Show all issues

      Let's switch this to an import kgb, and then access these as attributes from kgb. This is the pattern we're adopting elsewhere.

    15. Show all issues

      For unit tests, we don't want to use _(...).

    16. djblets/webapi/tests/test_api_token.py (Diff revision 9)
       
       
      Show all issues

      Same note about import kgb.

    17. djblets/webapi/tests/test_api_token.py (Diff revision 9)
       
       
      Show all issues

      This should be assertTrue

    18. 
        
    maubin
    Review request changed
    Commits:
    Summary ID
    Update the BaseWebAPIToken model for API Tokens v2
    69cae8216db36df3004dddde3fd3b4a7d88b9e19
    Update the BaseWebAPIToken model for API Tokens v2
    9963d9cb06b5a00d089a6964ec3b420d124de200

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    maubin
    chipx86
    1. 
        
    2. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
      Show all issues

      Might be more clear as "... the token is invalid or expired, and any appropriate HTTP headers to send to the client."

    3. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
       
       
       
      Show all issues

      We always either override everything in here, or return None. Rather than pre-define it, let's just return { ... } directly where we need it.

      For any conditional values, let's set a variable and then use that as the value in the return { ... }.

    4. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
      Show all issues

      For safety, this should ddo LOGIN_FAILED.headers.copy(). That avoids any state leakage (a problem we've had before in the API).

    5. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
       
       
      Show all issues

      For something like this, formatting works a bit better as:

      return_dict['error_message'] = (
          _('...')
          % ...
      )
      

      That gives both the message and the formatting arguments breathing room. Changes to one don't impact the other, and it's obvious how to reformat for any new changes.

    6. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
       
       
       
       
      Show all issues

      Rather than format and set a default, let's move this to an else below.

    7. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
       
       
       
       
      Show all issues

      As above, this becomes more clear when the form is:

      return_dict['error_message'] = (
          _('This token is invalid. It became invalid %s.'
            'It was made invalid because: %s')
          % (webapi_token.invalid_date,
             webapi_token.invalid_reason)
      )
      

      So that being said, I'd like to change the message. Something more like:

      'This API token became invalid on %s: %s'
      

      This is just a bit less technical, flows a bit better, and makes the reason more part of the error message itself. An example then becomes:

      "This API token became invalid on 2023-06-14 12:13:14.123456: We've reset tokens due to a potential compromise of our internal systems."

      Similarly, the above error should be This API token became invalid on %s..

      We should probably format the date here as well, rather than using the (above) default representation that has too much precision.

      Now we're not at all consistent with this (we mostly do this right in templates via the |date filter), but Django does have some localization code. We can do:

      from django.utils.formats import localize
      
      '... %s ...' % localize(webapi_token.invalid_date)
      

      This gives us a string like July 28, 2022, 6:18 a.m. (matching bits of our UI), which users can customize if they really want to via settings.DATETIME_FORMAT (not that we advertise that).

    8. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
       
       
      Show all issues

      When logging, we don't want to use %. Instead, pass the value as a positional parameter.

    9. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
       
       
       
       
       
       
       
       
      Show all issues

      Similar comments above apply here.

    10. 
        
    maubin
    chipx86
    1. 
        
    2. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 11 - 12)
       
       
      Show all issues

      Dictionaries should (almost) always be in multi-line format, with keys alphabetized and on their own line (not on the { or } lines):

      return {
          'error_message': None,
          'headers': None,
      }
      
    3. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 11 - 12)
       
       
       
      Show all issues

      Same here.

    4. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 11 - 12)
       
       
      Show all issues

      Missing a trailing comma.

    5. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 11 - 12)
       
       
       
       
       
       
      Show all issues

      Let's actually move these into the assertion, so it's not scattered:

      self.assertEqual(
          result,
          (
              False,
              '...',
              {
                  # Explicit headers here
              }
          ))
      

      Same below.

    6. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 11 - 12)
       
       
      Show all issues

      No need for the .copy() here, since we know it won't be modified in a test.

      Although, we want to be clear on expectations, so we know what regresses from a change. So this should hard-code any headers to check against.

      Same below.

    7. 
        
    maubin
    Review request changed
    Commits:
    Summary ID
    Update the BaseWebAPIToken model for API Tokens v2
    141b505d35d961d923f74bfe3d1ccd0b69eafb26
    Update the BaseWebAPIToken model for API Tokens v2
    1771899497ed3539da1f5561d83288174ba9b821

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    maubin
    maubin
    maubin
    chipx86
    1. 
        
    2. djblets/webapi/models.py (Diff revisions 12 - 15)
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
      Show all issues

      I'm torn on this. My gut is telling me we don't want to bake this into the API.

      I think, since this is a workaround for wanting to update a timestamp without triggering default behavior, we should work around it there.

      We have two approaches we could take for that:

      1. We could use save_base() instead of save(). This can be used just like save() for our purposes. It just bypasses overridden save logic, and some deferred model and field sanity checks that won't apply in our case.

      2. Alternatively, we could do api_token_model.objects.filter(pk=token.pk).update(...) to update the database copy without triggering any functionality (including things like the standard pre_save/post_save signals).

      Since we probably aren't interested in making this save entirely invisible — we just want to avoid webapi_token_updated — we should probably go with the first option.

    3. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 12 - 15)
       
       
       
      Show all issues

      We don't want to localize here. Test results should almost always compare against fixed state. Generating dynamic state can mask bugs.

      We can assume a default localization when running tests. If the assumption were to be wrong, then we'd have a test environment/setup issue, which we don't want to mask.

    4. 
        
    maubin
    chipx86
    1. 
        
    2. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 15 - 16)
       
       
       
      Show all issues

      These should still hard-code the exact string we expect, rather than running it through any Django functions. Otherwise we're just testing Django's functionality using Django's functionality. So let's make this whole thing a plain string, no function calls. If the values ever change, it highlights a test environment problem, or a change in Django that we need to be aware of.

    3. 
        
    maubin
    chipx86
    1. Ship It!
    2. 
        
    maubin
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-3.x (5aa0a27)