Attempt to solve deadlock regressions in RelationCounterField.

Review Request #9706 — Created Feb. 26, 2018 and submitted

Information

Djblets
release-1.0.x
645e20e...

RelationCounterField recently received a handful of changes regarding
how it manages state, and with that came more complexity around the
locks. After deploying the updates to our servers, we began to notice
that Apache would stop responding after a period of time.

While hard to diagnose, the problem is likely caused by the process of
modifying weak reference state. Doing so may cause old weak references
to drop, and their destruction handlers can then attempt to modify state
again. The code that sets new state and the code that resets state each
attempt to grab a lock, so a thread that triggered a reset while setting
state was likely deadlocking, with the stuck threads eventually starving
the thread pool.
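
The suspected failure mode can be illustrated with a minimal Python
sketch. It assumes the lock behaves like a plain threading.Lock (the
description above does not name the lock type), and the helper name
can_reacquire is hypothetical. A non-blocking acquire is used so the
sketch reports the problem rather than hanging.

```python
import threading


def can_reacquire(lock):
    """Return True if the thread holding ``lock`` can acquire it again."""
    with lock:
        # Non-blocking attempt: a blocking acquire here is what would hang.
        acquired = lock.acquire(False)

        if acquired:
            lock.release()

        return acquired


# A plain lock refuses a second acquire from the thread that holds it.
plain_lock_reentrant = can_reacquire(threading.Lock())
```

With a blocking acquire in place of the non-blocking one, the thread
would wait forever on a lock that only it can release.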

As an attempt at solving this, we're now using re-entrant locks, which
allow a thread to grab its own lock multiple times without deadlocking,
while still preventing other threads from grabbing the lock. The state
storage code can now grab a lock and, if it triggers a state reset, the
reset code can grab that lock again without problems.
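
As a rough sketch of this pattern (not the actual RelationCounterField
code), the following uses threading.RLock with a weak reference callback
that re-enters the lock. The names StateStore, Instance, set_state,
reset_state, and demo are hypothetical, and the sketch relies on
CPython's reference counting to run the callback as soon as the old
object is dropped.

```python
import threading
import weakref


class Instance(object):
    """Hypothetical stand-in for an object tracked by weak reference."""


class StateStore(object):
    """Hypothetical state holder guarded by a re-entrant lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self._current = None
        self._watchers = []
        self.reset_count = 0

    def set_state(self, instance):
        with self._lock:
            self._watchers.append(weakref.ref(instance, self._on_dropped))

            # Rebinding drops the last strong reference to the previous
            # instance. On CPython, its weak reference callback runs
            # right here, while this thread still holds the lock.
            self._current = instance

    def _on_dropped(self, ref):
        self.reset_state()

    def reset_state(self):
        # Re-entering is safe with RLock; a plain Lock would block here.
        with self._lock:
            self.reset_count += 1


def demo():
    store = StateStore()
    store.set_state(Instance())
    store.set_state(Instance())  # Drops the first instance under the lock.

    return store.reset_count
```

Other threads still have to wait for the lock as before. Only the thread
that already owns it is allowed back in.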

This is not a guaranteed fix for the problem, but given the time it
takes for Apache to fail, we should be able to learn quickly whether
this solves it.

On top of this, we were hitting spurious AttributeErrors during thread
shutdown, since some of the classes and methods that the weak reference
destruction handler needs to call had already gone away or been set to
None by that point. We now bullet-proof this a bit by ignoring any
AttributeErrors raised in the destruction handler.
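
One way to sketch this kind of guard is shown below, assuming the
handler is an ordinary weak reference callback. The names Tracked,
make_safe_callback, handler, and _registry are hypothetical and only
simulate a global that has already been cleared at shutdown. This is not
the code from the change itself.

```python
import weakref


class Tracked(object):
    """Hypothetical tracked object."""


# Simulates a module global that has already been set to None at shutdown.
_registry = None

calls = []


def handler(ref):
    calls.append("called")

    # Raises AttributeError, since None has no discard() method.
    _registry.discard(ref)


def make_safe_callback(func):
    """Wrap a weak reference callback so that AttributeErrors are ignored."""
    def _callback(ref):
        try:
            func(ref)
        except AttributeError:
            # Names the handler relies on may already be gone or None.
            pass

    return _callback


obj = Tracked()
ref = weakref.ref(obj, make_safe_callback(handler))

# On CPython, the callback runs here and the AttributeError is swallowed.
del obj
```

Only AttributeError is ignored in this sketch, so other exception types
raised by the handler are not hidden by the wrapper.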

Unit tests pass.

This will have to be tested in production.

david
  1. Ship It!
chipx86
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-1.0.x (c925c0f)