
    Treat cache synchronization state as expired when there are errors.

    Review Request #13980 — Created June 17, 2024 and submitted

    Information

    Djblets
    release-5.x

    Reviewers

    Djblets 5 introduced handling for cache backend errors, attempting to
    avoid outages if the cache server was down or misconfigured. However, it
    had a flaw: if GenerationSynchronizer hit a backend failure while
    checking for state expiration, it would conclude that the state had
    not expired.

    This prevented cache server changes from taking effect. If the cache
    was misconfigured, all servers/threads/processes would inherit the
    broken state and then assume it never expired, preventing any fixes
    from being loaded.

    Now, we assume the state is invalid in this case and force a refresh,
    in the hope of loading corrected state.
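
    As a rough illustration of the "treat errors as expired" idea (not the
    actual Djblets code), the check might look something like the sketch
    below. FlakyCache and SyncStateSketch are hypothetical stand-ins
    invented for this example; only the sync_gen name comes from the
    description above.

```python
class FlakyCache:
    """Hypothetical in-memory cache that can be made to fail on demand."""

    def __init__(self):
        self.data = {}
        self.broken = False

    def get(self, key, default=None):
        if self.broken:
            raise ConnectionError('cache backend unavailable')

        return self.data.get(key, default)


class SyncStateSketch:
    """Hypothetical holder for a locally remembered generation value."""

    def __init__(self, cache, key):
        self.cache = cache
        self.key = key
        self.sync_gen = None

    def is_expired(self):
        try:
            current = self.cache.get(self.key)
        except Exception:
            # The backend failed, so the local state can't be verified.
            # Report it as expired so the caller reloads its settings.
            return True

        return current is None or current != self.sync_gen
```

    With this shape, a caller that sees is_expired() return True would
    reload its configuration, which is what gives a corrected cache
    setting a chance to be picked up.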

    We also now always set a sync_gen state when attempting to
    fetch-or-create a generation value, rather than keeping a stale value
    (which may be tied to bad state).
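
    A minimal sketch of the "always set sync_gen" behavior follows, under
    the assumption that the cache exposes add() and get() methods.
    DictCacheSketch, BrokenCacheSketch, GenHolderSketch, and the
    fallback_gen parameter are all hypothetical names for this example
    and are not taken from the Djblets source.

```python
class DictCacheSketch:
    """Hypothetical working cache backed by a dictionary."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        # Store the value only if the key is not already present.
        if key in self.data:
            return False

        self.data[key] = value
        return True

    def get(self, key, default=None):
        return self.data.get(key, default)


class BrokenCacheSketch:
    """Hypothetical cache whose backend always fails."""

    def add(self, key, value):
        raise ConnectionError('cache backend unavailable')

    def get(self, key, default=None):
        raise ConnectionError('cache backend unavailable')


class GenHolderSketch:
    def __init__(self, cache, key):
        self.cache = cache
        self.key = key
        self.sync_gen = None

    def fetch_or_create(self, fallback_gen):
        try:
            self.cache.add(self.key, fallback_gen)
            value = self.cache.get(self.key)
        except Exception:
            value = None

        # Assign unconditionally, so a stale sync_gen from an earlier
        # (possibly bad) state is never kept after a failed lookup.
        self.sync_gen = fallback_gen if value is None else value

        return self.sync_gen
```

    The point of the unconditional assignment is that a failed lookup
    replaces the old value instead of leaving it in place.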

    This doesn't fully solve issues with broken cache state, as Django
    doesn't sandbox cache backend failures at all, and will break during
    session loading (and other places). However, it does make a fix
    possible: if some process can update the siteconfig cache settings
    through another means, and that cache settings state is reset before a
    user session or other impacted Django code runs, the corrected
    configuration can be loaded.

    Unit tests pass.

    Summary ID
    Treat cache synchronization state as expired when there are errors.
    3feb67d1d217e191c652887b493f2964ca407016
    david
    1. Ship It!
    maubin
    1. Ship It!
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-5.x (63a2cdf)