Treat cache synchronization state as expired when there are errors.

Review Request #13980 — Created June 17, 2024 and submitted

Information

Djblets
release-5.x

Djblets 5 introduced handling for cache backend errors, to avoid outages
when the cache server is down or misconfigured. However, it had a flaw:
if GenerationSynchronizer hit a backend failure while checking for state
expiration, it would conclude that the state had not expired.

This prevented cache server changes from taking effect. If the cache was
misconfigured, all servers/threads/processes would inherit the broken
state and then assume it never expired, preventing any fixes from being
loaded.

Now, we assume the state is invalid in this case and force a refresh, in
the hope of loading corrected state.
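
The behavior described above can be sketched roughly as follows. This is
a minimal illustration, not the actual Djblets code: FlakyCache,
SketchSynchronizer, and their attributes are hypothetical stand-ins, and
the real GenerationSynchronizer may be structured differently.

```python
import logging

logger = logging.getLogger(__name__)


class FlakyCache:
    """Hypothetical in-memory cache that can simulate backend failures."""

    def __init__(self):
        self.data = {}
        self.broken = False

    def get(self, key, default=None):
        if self.broken:
            raise ConnectionError('cache backend unavailable')

        return self.data.get(key, default)


class SketchSynchronizer:
    """Illustrative stand-in for a generation-based synchronizer."""

    def __init__(self, cache, cache_key, sync_gen):
        self.cache = cache
        self.cache_key = cache_key
        self.sync_gen = sync_gen

    def is_expired(self):
        try:
            current_gen = self.cache.get(self.cache_key)
        except Exception:
            # A backend failure tells us nothing about the stored
            # generation, so treat the local state as expired and let
            # the caller reload it.
            logger.warning('Cache error checking "%s"; assuming expired',
                           self.cache_key)
            return True

        return current_gen is None or current_gen != self.sync_gen
```

With this shape, a broken backend makes every expiration check return
True, so callers keep retrying the reload until the cache settings are
corrected, instead of holding on to the broken state indefinitely.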

We also now always set a sync_gen state when attempting to
fetch-or-create a generation value, rather than keeping a stale value
(which may be tied to bad state).
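
A rough sketch of the "always set sync_gen" idea follows, under the same
caveats: the class and method names are hypothetical, and only the
add()/get() semantics mirror Django's cache API (add() stores a value
only when the key is missing).

```python
import time


class FlakyCache:
    """Hypothetical in-memory cache that can simulate backend failures."""

    def __init__(self):
        self.data = {}
        self.broken = False

    def _check(self):
        if self.broken:
            raise ConnectionError('cache backend unavailable')

    def add(self, key, value):
        # Store the value only if the key is not already present.
        self._check()

        if key in self.data:
            return False

        self.data[key] = value
        return True

    def get(self, key, default=None):
        self._check()
        return self.data.get(key, default)


class SketchSynchronizer:
    """Illustrative stand-in, not the real GenerationSynchronizer."""

    def __init__(self, cache, cache_key):
        self.cache = cache
        self.cache_key = cache_key
        self.sync_gen = None

    def refresh(self):
        candidate = int(time.time())

        try:
            self.cache.add(self.cache_key, candidate)
            stored = self.cache.get(self.cache_key)
        except Exception:
            stored = None

        # Always overwrite the local value, even on failure, so a stale
        # generation tied to bad state is never kept.
        self.sync_gen = candidate if stored is None else stored

        return self.sync_gen
```

The design point is that refresh() assigns sync_gen on every path. If
the backend fails, the local value becomes a fresh candidate rather than
whatever was fetched earlier.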

This doesn't fully solve issues with broken cache state, as Django
doesn't sandbox cache backend failures at all, and will break during
session loading (and other places). However, it does make a fix
possible: if some process can update the siteconfig cache settings
through another means, and the cache settings state is reset before a
user session or other impacted Django code runs, the corrected settings
can be loaded.

Unit tests pass.

Summary ID
Treat cache synchronization state as expired when there are errors.
3feb67d1d217e191c652887b493f2964ca407016
david
  1. Ship It!
maubin
  1. Ship It!
chipx86
Review request changed
Status: Completed
Change Summary: Pushed to release-5.x (63a2cdf)