Add support for encryption of cache keys and cached data.

Review Request #12379 — Created June 16, 2022 and submitted

Information

Djblets
release-3.x

Reviewers

When we cache data, we typically do it in plain text. Cache keys are
readable (aside from any hashing we need to do for length purposes), and
data is generally raw (unless we're compressing/chunking large data).
This is pretty standard, and so long as a caching server is protected,
it's safe enough.

However, if a caching server is at all accessible by anything other than
the application (not particularly uncommon), then people may have free
rein to read cached data and even set new data. This is a problem if
the cached data is at all sensitive, or if access restrictions can be
circumvented by modifying the cached data.

Some security best practices guides go so far as to recommend encrypting
all data stored in cache. Whether that's necessary depends on the
applications using the cache.

This change enables Djblets to encrypt select data or all data cached
through our caching functions.

cache_memoize(), cache_memoize_iter(), and make_cache_key() now
all accept a use_encryption flag (defaults to False) and an optional
encryption_key (defaults to the standard AES encryption key derived
from settings.SECRET_KEY).

With use_encryption=True, cache keys are stored as SHA256 HMAC digests
of the normalized cache key. This prevents anyone reading the cache from
knowing what any given key represents, making key poisoning more
difficult.
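
As a rough illustration of the key hashing described above, here is a
minimal sketch using only Python's standard library. The helper name
hashed_cache_key(), the example key string, and the secret are all
hypothetical. This is not the Djblets implementation, only the general
technique of replacing a readable key with a SHA256 HMAC digest.

```python
import hashlib
import hmac


def hashed_cache_key(normalized_key: str, secret: bytes) -> str:
    """Return a hex SHA256 HMAC digest of a normalized cache key.

    Hypothetical helper for illustration only.
    """
    return hmac.new(secret, normalized_key.encode('utf-8'),
                    hashlib.sha256).hexdigest()


# Assumed stand-ins for the real key material and cache key.
secret = b'example-secret'
digest = hashed_cache_key('example.com:user-profile:42', secret)
```

The digest is stable for a given key and secret, so lookups still work,
but it reveals nothing readable about what the entry represents.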

Data is then AES-encrypted. If decryption fails for any reason (e.g.,
the data was encrypted using a different key, wasn't encrypted at all,
or was tampered with), then the cached data is considered to be invalid
and will be regenerated.
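
The fallback behavior above can be sketched as follows. Python's
standard library has no AES, so this sketch substitutes an HMAC-tagged
blob for real encryption: seal() and unseal() are hypothetical stand-ins
that do not hide the data and only reproduce the "fails on a wrong key
or tampering" property. memoize() is likewise a hypothetical name, not
the Djblets function.

```python
import hashlib
import hmac
from typing import Callable, Dict

TAG_LEN = 32  # Size of a SHA256 HMAC tag in bytes.


def seal(data: bytes, key: bytes) -> bytes:
    """Stand-in for encryption: prefix the data with an HMAC tag."""
    return hmac.new(key, data, hashlib.sha256).digest() + data


def unseal(blob: bytes, key: bytes) -> bytes:
    """Stand-in for decryption: raise ValueError on a tag mismatch."""
    tag, data = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, data, hashlib.sha256).digest()

    if not hmac.compare_digest(tag, expected):
        raise ValueError('Cached data failed verification.')

    return data


def memoize(cache: Dict[str, bytes], cache_key: str, key: bytes,
            compute: Callable[[], bytes]) -> bytes:
    """Return cached data, regenerating it if missing or unreadable."""
    blob = cache.get(cache_key)

    if blob is not None:
        try:
            return unseal(blob, key)
        except ValueError:
            pass  # Treat unreadable data as a cache miss.

    data = compute()
    cache[cache_key] = seal(data, key)
    return data


cache: Dict[str, bytes] = {}
calls = []


def compute() -> bytes:
    calls.append(1)
    return b'fresh data'


key = b'example-key'
first = memoize(cache, 'report', key, compute)   # Miss: computes, stores.
second = memoize(cache, 'report', key, compute)  # Hit: read from cache.
cache['report'] = b'written by someone else'     # Simulate tampering.
third = memoize(cache, 'report', key, compute)   # Invalid: recomputes.
```

The caller never sees the bad entry. It gets freshly computed data, and
the cache is overwritten with a valid copy.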

This supports large data and compression as well. For large data, we
make use of aes_encrypt_iter() to progressively generate pieces of
encrypted data. This means that the entire cached set of chunks is part
of one encryption payload, rather than encrypting each chunk
individually. If any chunk fails to decrypt or is missing, the entirety
of the data is invalidated.
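
To illustrate the all-or-nothing handling of chunks, here is a small
sketch in which one payload (standing in for the already-encrypted
bytes) is split across several cache entries. The function names and
the "key-N" naming scheme are assumptions made for this example, not
the layout Djblets actually uses.

```python
from typing import Dict, Optional


def store_chunked(cache: Dict[str, bytes], key: str, payload: bytes,
                  chunk_size: int) -> None:
    """Split one payload into chunks, storing the chunk count at `key`."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    cache[key] = str(len(chunks)).encode('ascii')

    for i, chunk in enumerate(chunks):
        cache[f'{key}-{i}'] = chunk


def load_chunked(cache: Dict[str, bytes], key: str) -> Optional[bytes]:
    """Reassemble the payload, or return None if any chunk is missing."""
    count_raw = cache.get(key)

    if count_raw is None:
        return None

    chunks = []

    for i in range(int(count_raw)):
        chunk = cache.get(f'{key}-{i}')

        if chunk is None:
            return None  # One missing chunk invalidates everything.

        chunks.append(chunk)

    return b''.join(chunks)


cache: Dict[str, bytes] = {}
payload = bytes(range(10))
store_chunked(cache, 'big', payload, chunk_size=4)
loaded = load_chunked(cache, 'big')

del cache['big-1']  # Simulate an evicted chunk.
after_eviction = load_chunked(cache, 'big')
```

In a real setup, the reassembled payload would then go through
decryption as a whole, so a corrupted chunk would also cause the entire
result to be discarded.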

There's a lot of flexibility given to the caller and the administrator.
Callers can choose whether to enable encryption for each cache
operation, and whether that operation should use a custom key.

Administrators who want extra security can force all Djblets-based
caching operations to use encryption (unless a cache operation
specifically opts out of encryption) by setting:

DJBLETS_CACHE_FORCE_ENCRYPTION = True

Or they can choose an explicit encryption key by setting:

DJBLETS_CACHE_DEFAULT_ENCRYPTION_KEY = '...'

All unit tests pass in Djblets and Review Board.

Made use of this in some in-progress code, and saw the encrypted
cache keys. Had no trouble with cache usage or invalidation.

Summary ID
Add support for encryption of cache keys and cached data.
347e358563fdd2b97d635502b2c8bbc68fcce122
Description | From | Last Updated
Change to "This defaults to the value in" | maubin |
Add a "Raises" section to the docstring | maubin |
chipx86
maubin
  1. djblets/cache/backend.py (Diff revision 2)

     Change to "This defaults to the value in"

  2. djblets/cache/backend.py (Diff revision 2)

     Add a "Raises" section to the docstring

chipx86
david
  1. Ship It!
chipx86
Review request changed
Status:
Completed
Change Summary:
Pushed to release-3.x (7af206f)