    Add a safer method of constructing cache keys.

    Review Request #14477 — Created June 25, 2025 and submitted

    Information

    Djblets
    release-5.x

    Reviewers

    Cache keys sometimes need to involve input that comes from another
    source, such as a value in a database. While we're pretty careful about
    how we compose these keys, there's always a risk of a cache key
    poisoning vulnerability.

    That sort of bug could manifest as a key structured like:

    path:to:<value>[:optional_param]
    

    Where value could potentially be my_value:forced_param.
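The collision is easy to demonstrate in plain Python, using the naive `':'.join` composition this change guards against:

```python
# Naive composition: join raw parts with ':' and no escaping.
# An attacker-controlled value containing ':' makes two logically
# different keys collide.
parts_trusted = ['path', 'to', 'my_value', 'forced_param']
parts_poisoned = ['path', 'to', 'my_value:forced_param']

key_trusted = ':'.join(parts_trusted)
key_poisoned = ':'.join(parts_poisoned)

print(key_trusted == key_poisoned)  # True: both are 'path:to:my_value:forced_param'
```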

    To avoid this sort of issue, we need to escape the values going into
    the key. The safest way of doing that consistently is to leave it up to
    a central method.

    All of Djblets's caching methods now accept a cache key as a sequence of
    strings. This includes cache_memoize, cache_memoize_iter, and
    make_cache_key.

    Each key in the sequence will be escaped, first converting any %
    characters to %25 (escape code for %) and then any : characters to
    %3A.
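The escaping order matters: `%` must be handled first so the `%` introduced by `%3A` isn't itself re-escaped. A minimal sketch of the behavior described above (the helper names here are illustrative, not the Djblets API):

```python
def escape_key_part(part):
    # Escape '%' first, so that a literal ':' escaped to '%3A'
    # isn't then re-escaped to '%253A'.
    return part.replace('%', '%25').replace(':', '%3A')


def join_key_parts(parts):
    # Escape each part, then join with ':' as the separator.
    return ':'.join(escape_key_part(part) for part in parts)


print(join_key_parts(['path', 'to', 'my_value:forced_param']))
# path:to:my_value%3Aforced_param
```

With escaping in place, the poisoned value above can no longer masquerade as an extra key segment.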

    This is opt-in. Callers passing a plain string won't see any difference
    in behavior. Those keys will be used as-is.

    It does mean that callers passing a sequence whose parts have already
    been normalized, or which contain escapable characters, may get back an
    unexpected key.

    Going forward, we'll want to make use of these sequences rather than
    hard-coding strings, in order to ensure safety of all keys.

    Unit tests pass.

    Made use of this in some in-progress code.

    Summary ID
    Add a safer method of constructing cache keys.
    48d7ea0d2d65b79b96fc86598a7e2a1dad9f2e70
    maubin
    1. djblets/cache/backend.py (Diff revision 1)

       Need to update the type here too.
    david
    1. djblets/cache/backend.py (Diff revision 1)
      We have a mix of tuples and lists here. While it all works out because of the * unpacking at the end, I think it would be nicer to be consistent and just use lists.

      1. I wanted to dig in and make sure the numbers matched my understanding of this. We're in a hot path (this code is accessed a lot, 115 times in just one Dashboard request), and I knew I was introducing a small performance penalty in this change by dealing with escaping and joining, so my goal was to keep performance as close to the original as possible.

        Tuples are faster and smaller than lists most of the time, and typically measure as such.

        There's a caveat to that: list comprehensions are faster than doing a tuple(x for x in y), because the latter sets up a generator and then constructs a tuple around it, and there's an expense to this (I wish there were native tuple comprehension syntax). tuple([x for x in y]) ends up being faster, but then you might as well use a list in that case.
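That claim is easy to check with timeit (a sketch; absolute numbers will vary by machine and Python version):

```python
import timeit

setup = 'data = list(range(10))'

# tuple(generator): builds a generator, then a tuple around it.
t_gen = timeit.timeit('tuple(x for x in data)', setup=setup, number=100_000)

# List comprehension: a single specialized construction path.
t_listcomp = timeit.timeit('[x for x in data]', setup=setup, number=100_000)

# tuple([list comprehension]): often beats tuple(generator).
t_wrapped = timeit.timeit('tuple([x for x in data])', setup=setup,
                          number=100_000)

print(t_gen, t_listcomp, t_wrapped)
```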

        But if you're just building up a static tuple, or accessing a tuple, it's faster. At least in theory.

        Decided to measure it to be sure.

        Measurements

        These tests were run on my 2021 M1 Max. I ran each timeit once, threw out the first result (bytecode compilation), and ran it 4 more times, getting the average.

        Tests were run with:

        # String keys
        timeit.timeit('make_cache_key("a:b:c")', setup='from djblets.cache.backend import make_cache_key')
        
        # List keys
        timeit.timeit('make_cache_key(["a", "b", "c"])', setup='from djblets.cache.backend import make_cache_key')
        
        # Tuple keys
        timeit.timeit('make_cache_key(("a", "b", "c"))', setup='from djblets.cache.backend import make_cache_key')
        

        Here's what I got:

        Original code

        Keys     Time (seconds)
        String   3.333815235999888

        Tuple-wrapped keys

        Keys     Time (seconds)
        String   3.4278556352501255
        List     3.6817527916664403
        Tuple    3.659096694666611

        List-wrapped keys

        Keys     Time (seconds)
        String   3.4579977917501306
        List     4.244010402999872
        Tuple    4.055572395750005

        Results

        Overall, tuple-wrapped keys were faster than list-wrapped keys, and passing in tuples was faster than passing in lists.

        Tuple-wrapped keys were also much closer in performance to the original code. The list-based results were up to ~1 second slower across tests.

        That's for 10,000 requests, which is about ~86 concurrent Dashboard views at 115 cache requests per view. It's not seconds of performance gain, but it's 0.1-0.2s of loss for tuples vs. 0.1-0.9s for lists, at least in the tests I ran.

        (I re-ran the battery of tests a few times, but got roughly the same results. There's a margin for error, but the results told the same story.)

        I also tried alternative approaches, like building norm_key as a string and then using f-strings or % to build the final result. It was slower than the current method with list-wrapped keys. Interestingly, in order of speed, %-based formatting is the fastest (0.14s across 10,000 tests), followed by f-strings (0.17s), followed by .format (0.22s).
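The formatting comparison can be reproduced with something like the following (the exact expressions used in the original runs aren't shown, so these operands are stand-ins):

```python
import timeit

setup = 'prefix = "example.com"; norm_key = "path:to:my_value"'

# %-based formatting
t_percent = timeit.timeit('"%s:%s" % (prefix, norm_key)', setup=setup,
                          number=100_000)

# f-strings
t_fstring = timeit.timeit('f"{prefix}:{norm_key}"', setup=setup,
                          number=100_000)

# str.format
t_format = timeit.timeit('"{}:{}".format(prefix, norm_key)', setup=setup,
                         number=100_000)

print(t_percent, t_fstring, t_format)
```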

        Anyway, given that, I'm confident tuples are the right choice for this code.

    chipx86
    david
    1. Ship It!
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-5.x (f962c15)