Update the BaseWebAPIToken model for API Tokens v2

Review Request #12341 — Created June 6, 2022 and submitted

Information

Djblets
release-3.x

Reviewers

This change is the first in a series of changes for improving our API Tokens
(which we'll refer to as API Tokens v2). Currently, Djblets generates tokens
using SHA1, which has the following disadvantages:
- Tokens aren't able to be validated or identified from other SHA1 tokens.
- Can't use secret scanning to detect leaked tokens
- Aren't as secure as some modern counterparts.

In response, we are moving towards a new token format that contain a
vendor/category prefix and can be validated through a checksum. Additionally,
we want to be able to support token expiration and invalidation, which we
currently don't do. Since we are moving away from an old token format, we also
need support for token deprecation.

In this change we modify the API tokens model to allow the possibility for
new token types, expiration, and invalidation. The following changes are made
to the BaseWebAPIToken model:
- Adding expires field and an is_expired method.
- Adding valid, invalid_date and invalid_reason fields.
- Adding token_generator_id field.
- Adding last_used field.
- Updating max length of token to 255 characters.

The token authentication backends were also updated to block authentication
if an expired or invalid token was used. We also added help texts for all
of the BaseWebAPIToken fields for better clarity.

  • Ran unit tests in djblets.webapi.tests
  • Manually tested token authentication when making requests with curl
    and rbt api-get
Summary ID
Update the BaseWebAPIToken model for API Tokens v2
25b540b95771d6599931010a52a7db61ca93ac88
Description From Last Updated

I think we just want timezone.now(). This is the modern equivalent of our old get_tz_aware_utcnow().

chipx86chipx86

By default, save() will save every single field in the model. If there are two threads dealing with an API …

chipx86chipx86

Same note about saving.

chipx86chipx86

Blank line before comments.

chipx86chipx86

We haven't been consistent with this, but we should aim to have localized help_text= set for every field. Since that's …

chipx86chipx86

255 should be fine.

chipx86chipx86

'django.utils.timezone' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

continuation line over-indented for hanging indent Column: 13 Error code: E126

reviewbotreviewbot

This is being set after we save. I assume we'll want to set this before the parent save call above.

chipx86chipx86

Things can go wrong when comparing dates/times. 1 second seems fine, but what if we hit a leap second while …

chipx86chipx86

It looks like you've pulled on release-3.x but not rebased, which is why your diff is now undoing these changes. …

daviddavid

This could use djblets.db.fields.ModificationTimestampField in order to avoid having to babysit it in save()

daviddavid

Add a blank line in here (datetime is standard lib, django is third party).

daviddavid

Can you move this to the third-party import section?

daviddavid

'datetime.timedelta' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

I feel like this is something we should do per-test-method rather than for everything in the entire suite. I think …

daviddavid

This is missing a "Version Added".

chipx86chipx86

Might be more clear as "... the token is invalid or expired, and any appropriate HTTP headers to send to …

chipx86chipx86

This should be "str" now.

chipx86chipx86

Since we're defining a new method, it might be a good opportunity to replace this with a dictionary, for better …

chipx86chipx86

Any % values must be specified outside of a _(...). Otherwise they'll modify the string that needs to be localized.

chipx86chipx86

We should specify which user this pertains to.

chipx86chipx86

Same comment about localized string formatting.

chipx86chipx86

We always either override everything in here, or return None. Rather than pre-define it, let's just return { ... } …

chipx86chipx86

Same comment about localized string formatting.

chipx86chipx86

We should specify which user this pertains to.

chipx86chipx86

For safety, this should ddo LOGIN_FAILED.headers.copy(). That avoids any state leakage (a problem we've had before in the API).

chipx86chipx86

For something like this, formatting works a bit better as: return_dict['error_message'] = ( _('...') % ... ) That gives both …

chipx86chipx86

Rather than format and set a default, let's move this to an else below.

chipx86chipx86

As above, this becomes more clear when the form is: return_dict['error_message'] = ( _('This token is invalid. It became invalid …

chipx86chipx86

When logging, we don't want to use %. Instead, pass the value as a positional parameter.

chipx86chipx86

Similar comments above apply here.

chipx86chipx86

This should contain a Version Added.

chipx86chipx86

We've been moving to a better format. This can be: result = ( super(...) .valdilate_credentials(...) ) Though, being Python 3, …

chipx86chipx86

This can be simplified to: return self.expires is not None and timezone.now() >= self.expires

chipx86chipx86

Plain imports should be listed before from imports.

chipx86chipx86

Let's switch this to an import kgb, and then access these as attributes from kgb. This is the pattern we're …

chipx86chipx86

For unit tests, we don't want to use _(...).

chipx86chipx86

Same note about import kgb.

chipx86chipx86

This should be assertTrue

chipx86chipx86

line too long (82 > 79 characters) Column: 80 Error code: E501

reviewbotreviewbot

continuation line over-indented for visual indent Column: 36 Error code: E127

reviewbotreviewbot

Dictionaries should (almost) always be in multi-line format, with keys alphabetized and on their own line (not on the { …

chipx86chipx86

Same here.

chipx86chipx86

Missing a trailing comma.

chipx86chipx86

Let's actually move these into the assertion, so it's not scattered: self.assertEqual( result, ( False, '...', { # Explicit headers …

chipx86chipx86

No need for the .copy() here, since we know it won't be modified in a test. Although, we want to …

chipx86chipx86

I'm torn on this. My gut is telling me we don't want to bake this into the API. I think, …

chipx86chipx86

We don't want to localize here. Test results should almost always compare against fixed state. Generating dynamic state can mask …

chipx86chipx86

'djblets.webapi.errors.LOGIN_FAILED' imported but unused Column: 1 Error code: F401

reviewbotreviewbot

These should still hard-code the exact string we expect, rather than running it through any Django functions. Otherwise we're just …

chipx86chipx86
maubin
chipx86
  1. 
      
  2. Show all issues

    I think we just want timezone.now(). This is the modern equivalent of our old get_tz_aware_utcnow().

  3. Show all issues

    By default, save() will save every single field in the model. If there are two threads dealing with an API token at once, one will overwrite the other's data.

    Best to always be explicit:

    webapi_token.save(update_fields=('last_used',))
    
  4. Show all issues

    Same note about saving.

    1. Actually I don't think I need to update the last_used field here because the oauth2_provider.models.AccessToken model doesn't have a last_used or an equivalent field. Will take this out.

  5. djblets/webapi/models.py (Diff revision 1)
     
     
     
    Show all issues

    Blank line before comments.

  6. djblets/webapi/models.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    We haven't been consistent with this, but we should aim to have localized help_text= set for every field.

    Since that's long, the best formatting for fields are:

    field = models.FooField(
        param,
        param,
        help_text=...)
    
  7. djblets/webapi/models.py (Diff revision 1)
     
     
    Show all issues

    255 should be fine.

  8. 
      
maubin
Review request changed
Change Summary:
  • Added help_text to token fields and reformatted the fields
  • Removed updating the last_used field for OAuth2 tokens since that field doesnt exist for those tokens
  • Modified BaseWebAPIToken default test to check for the new fields too
  • Modified WebAPITokenAuthBackend's tests to check if last_used field gets updated upon successful authentication
  • Updates the last_used field in WebAPITokenAuthBackend
  • Updates the last_updated field in whenever an API token object is saved and that save is from an update
Description:
~  

As part of our move to API Tokens v2, we need to open API tokens for new token

~   types, expiration, and invalidation. The following changes are made to the
~   BaseWebApiToken model:

  ~

This change is the first in a series of changes for improving our API Tokens

  ~ (which we'll refer to as API Tokens v2). Currently, Djblets generates tokens
  ~ using SHA1, which has the following disadvantages:
  + - Tokens aren't able to be validated or identified from other SHA1 tokens.
  + - Can't use secret scanning to detect leaked tokens
  + - Aren't as secure as some modern counterparts.

   
~  
  • Adding expires field
~  
  • Adding valid, invalid_date and invalid_reason fields
~  
  • Adding token_generator_id field
~  
  • Adding last_used field
~  
  • Updating max length of token to 255 characters.
  ~

In response, we are moving towards a new token format that contain a

  ~ vendor/category prefix and can be validated through a checksum. Additionally,
  ~ we want to be able to support token expiration and invalidation, which we
  ~ currently don't do. Since we are moving away from an old token format, we also
  ~ need support for token deprecation.

  +
  +

In this change we modify the API tokens model to allow the possibility for

  + new token types, expiration, and invalidation. The following changes are made
  + to the BaseWebAPIToken model:
  + - Adding expires field
  + - Adding valid, invalid_date and invalid_reason fields
  + - Adding token_generator_id field
  + - Adding last_used field
  + - Updating max length of token to 255 characters.

  +
  +

We also add help_text for all the BaseWebAPIToken fields for better

  + clarity.

Commits:
Summary ID
Updating the BaseWebAPIToken model for API Tokens v2
512cab638671fa0635701720c7ba37830a68de10
Updating the BaseWebAPIToken model for API Tokens v2
62dcc5f4eeadab14fbf86d3dbaba35301b532e9d

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
chipx86
  1. 
      
  2. djblets/webapi/models.py (Diff revision 3)
     
     
     
    Show all issues

    This is being set after we save. I assume we'll want to set this before the parent save call above.

  3. djblets/webapi/tests/test_api_token.py (Diff revision 3)
     
     
     
     
     
    Show all issues

    Things can go wrong when comparing dates/times. 1 second seems fine, but what if we hit a leap second while this runs? Or something causes the process to stall? This sort of stuff unfortunately happens.

    A better solution we've found is to hard-code when "now" is, using spies:

    self.spy_on(timezone.now, op=kgb.SpyOpReturn(...))
    

    Then we can safely compare the values with strict equal checks.

  4. 
      
maubin
david
  1. 
      
  2. README.rst (Diff revision 4)
     
     
     
     
    Show all issues

    It looks like you've pulled on release-3.x but not rebased, which is why your diff is now undoing these changes. Can you rebase and update?

  3. 
      
maubin
david
  1. 
      
  2. djblets/webapi/models.py (Diff revision 5)
     
     
     
     
    Show all issues

    This could use djblets.db.fields.ModificationTimestampField in order to avoid having to babysit it in save()

  3. djblets/webapi/tests/test_api_token.py (Diff revision 5)
     
     
     
    Show all issues

    Add a blank line in here (datetime is standard lib, django is third party).

  4. djblets/webapi/tests/test_api_token.py (Diff revision 5)
     
     
    Show all issues

    Can you move this to the third-party import section?

  5. 
      
maubin
Review request changed
Change Summary:
  • Spy on timezone.now in Token Auth authentication test to see if last_used gets updated
  • Fixed imports ordering issues
  • Uses ModificationTimeStampField for token's last_updated field
Commits:
Summary ID
Updating the BaseWebAPIToken model for API Tokens v2
bbdcc6ad1c1cce018b116c95eac2de22b3b7c9ce
Updating the BaseWebAPIToken model for API Tokens v2
ace466caf3cfa24acc2a8ef4204c9401037b468b

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
david
  1. 
      
  2. djblets/webapi/tests/test_api_token.py (Diff revision 7)
     
     
    Show all issues

    I feel like this is something we should do per-test-method rather than for everything in the entire suite.

    I think I'd also be happier if we initialized it with some fake timestamp value. Right now in something like test_last_updated, if no time has elapsed between when the token is created and when it's saved, there's no reason to think the timestamp would change. Instead, to be sure everything is working independent of timing, we could do this:

    • create token
    • spy on timezone.now with a fake timestamp
    • save
    • check timestamp against fake value
    1. Sure, it makes sense to move it so that its per-test-method.

      For the 2nd problem, that's why I set the time_added and last_updated fields to timezone.now() - timedelta(hours=1) when I create the token. That way when we check to see if last_updated is timezone.now(), it'll fail if it's still timezone.now() - timedelta(hours=1). I think this gives the same effect of what you're looking for?

    2. Ah, you're right. I'm happy after you just move it to be per-method.

  3. 
      
maubin
david
  1. Ship It!
  2. 
      
maubin
chipx86
  1. 
      
  2. Show all issues

    This is missing a "Version Added".

  3. Show all issues

    This should be "str" now.

  4. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
     
    Show all issues

    Since we're defining a new method, it might be a good opportunity to replace this with a dictionary, for better readability and future expansion.

    1. Agreed. Should I then also update WebAPIAuthBackend.authenticate and the other methods that share this return type?

    2. That'd be API-breaking. We can just consume it as a dict and turn it into a tuple there. Opens the door to later moving those to a dict more easily if we find it necessary.

  5. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
    Show all issues

    Any % values must be specified outside of a _(...). Otherwise they'll modify the string that needs to be localized.

  6. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
    Show all issues

    We should specify which user this pertains to.

  7. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
     
     
    Show all issues

    Same comment about localized string formatting.

  8. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
    Show all issues

    Same comment about localized string formatting.

  9. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
    Show all issues

    We should specify which user this pertains to.

  10. Show all issues

    This should contain a Version Added.

  11. djblets/webapi/auth/backends/api_tokens.py (Diff revision 9)
     
     
     
    Show all issues

    We've been moving to a better format. This can be:

    result = (
        super(...)
        .valdilate_credentials(...)
    )
    

    Though, being Python 3, the super can just be super(), probably.

  12. djblets/webapi/models.py (Diff revision 9)
     
     
     
     
     
    Show all issues

    This can be simplified to:

    return self.expires is not None and timezone.now() >= self.expires
    
  13. Show all issues

    Plain imports should be listed before from imports.

  14. Show all issues

    Let's switch this to an import kgb, and then access these as attributes from kgb. This is the pattern we're adopting elsewhere.

  15. Show all issues

    For unit tests, we don't want to use _(...).

  16. djblets/webapi/tests/test_api_token.py (Diff revision 9)
     
     
    Show all issues

    Same note about import kgb.

  17. djblets/webapi/tests/test_api_token.py (Diff revision 9)
     
     
    Show all issues

    This should be assertTrue

  18. 
      
maubin
Review request changed
Commits:
Summary ID
Update the BaseWebAPIToken model for API Tokens v2
69cae8216db36df3004dddde3fd3b4a7d88b9e19
Update the BaseWebAPIToken model for API Tokens v2
9963d9cb06b5a00d089a6964ec3b420d124de200

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
chipx86
  1. 
      
  2. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
    Show all issues

    Might be more clear as "... the token is invalid or expired, and any appropriate HTTP headers to send to the client."

  3. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
     
     
     
    Show all issues

    We always either override everything in here, or return None. Rather than pre-define it, let's just return { ... } directly where we need it.

    For any conditional values, let's set a variable and then use that as the value in the return { ... }.

  4. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
    Show all issues

    For safety, this should ddo LOGIN_FAILED.headers.copy(). That avoids any state leakage (a problem we've had before in the API).

  5. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
     
     
    Show all issues

    For something like this, formatting works a bit better as:

    return_dict['error_message'] = (
        _('...')
        % ...
    )
    

    That gives both the message and the formatting arguments breathing room. Changes to one don't impact the other, and it's obvious how to reformat for any new changes.

  6. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
     
     
     
     
    Show all issues

    Rather than format and set a default, let's move this to an else below.

  7. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
     
     
     
     
    Show all issues

    As above, this becomes more clear when the form is:

    return_dict['error_message'] = (
        _('This token is invalid. It became invalid %s.'
          'It was made invalid because: %s')
        % (webapi_token.invalid_date,
           webapi_token.invalid_reason)
    )
    

    So that being said, I'd like to change the message. Something more like:

    'This API token became invalid on %s: %s'
    

    This is just a bit less technical, flows a bit better, and makes the reason more part of the error message itself. An example then becomes:

    "This API token became invalid on 2023-06-14 12:13:14.123456: We've reset tokens due to a potential compromise of our internal systems."

    Similarly, the above error should be This API token became invalid on %s..

    We should probably format the date here as well, rather than using the (above) default representation that has too much precision.

    Now we're not at all consistent with this (we mostly do this right in templates via the |date filter), but Django does have some localization code. We can do:

    from django.utils.formats import localize
    
    '... %s ...' % localize(webapi_token.invalid_date)
    

    This gives us a string like July 28, 2022, 6:18 a.m. (matching bits of our UI), which users can customize if they really want to via settings.DATETIME_FORMAT (not that we advertise that).

  8. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
     
     
    Show all issues

    When logging, we don't want to use %. Instead, pass the value as a positional parameter.

  9. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 9 - 11)
     
     
     
     
     
     
     
     
    Show all issues

    Similar comments above apply here.

  10. 
      
maubin
chipx86
  1. 
      
  2. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 11 - 12)
     
     
    Show all issues

    Dictionaries should (almost) always be in multi-line format, with keys alphabetized and on their own line (not on the { or } lines):

    return {
        'error_message': None,
        'headers': None,
    }
    
  3. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 11 - 12)
     
     
     
    Show all issues

    Same here.

  4. djblets/webapi/auth/backends/api_tokens.py (Diff revisions 11 - 12)
     
     
    Show all issues

    Missing a trailing comma.

  5. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 11 - 12)
     
     
     
     
     
     
    Show all issues

    Let's actually move these into the assertion, so it's not scattered:

    self.assertEqual(
        result,
        (
            False,
            '...',
            {
                # Explicit headers here
            }
        ))
    

    Same below.

  6. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 11 - 12)
     
     
    Show all issues

    No need for the .copy() here, since we know it won't be modified in a test.

    Although, we want to be clear on expectations, so we know what regresses from a change. So this should hard-code any headers to check against.

    Same below.

  7. 
      
maubin
Review request changed
Commits:
Summary ID
Update the BaseWebAPIToken model for API Tokens v2
141b505d35d961d923f74bfe3d1ccd0b69eafb26
Update the BaseWebAPIToken model for API Tokens v2
1771899497ed3539da1f5561d83288174ba9b821

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

maubin
maubin
maubin
chipx86
  1. 
      
  2. djblets/webapi/models.py (Diff revisions 12 - 15)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    Show all issues

    I'm torn on this. My gut is telling me we don't want to bake this into the API.

    I think, since this is a workaround for wanting to update a timestamp without triggering default behavior, we should work around it there.

    We have two approaches we could take for that:

    1. We could use save_base() instead of save(). This can be used just like save() for our purposes. It just bypasses overridden save logic, and some deferred model and field sanity checks that won't apply in our case.

    2. Alternatively, we could do api_token_model.objects.filter(pk=token.pk).update(...) to update the database copy without triggering any functionality (including things like the standard pre_save/post_save signals).

    Since we probably aren't interested in making this save entirely invisible — we just want to avoid webapi_token_updated — we should probably go with the first option.

  3. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 12 - 15)
     
     
     
    Show all issues

    We don't want to localize here. Test results should almost always compare against fixed state. Generating dynamic state can mask bugs.

    We can assume a default localization when running tests. If the assumption were to be wrong, then we'd have a test environment/setup issue, which we don't want to mask.

  4. 
      
maubin
chipx86
  1. 
      
  2. djblets/webapi/tests/test_api_auth_backend.py (Diff revisions 15 - 16)
     
     
     
    Show all issues

    These should still hard-code the exact string we expect, rather than running it through any Django functions. Otherwise we're just testing Django's functionality using Django's functionality. So let's make this whole thing a plain string, no function calls. If the values ever change, it highlights a test environment problem, or a change in Django that we need to be aware of.

  3. 
      
maubin
chipx86
  1. Ship It!
  2. 
      
maubin
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.x (5aa0a27)