Add a management command for finding large diffs.

Review Request #12849 — Created Feb. 26, 2023 and submitted


Review Board


When diagnosing performance problems in production, we often ask
companies to run a script we provide that scans for large diffs. This
helps determine whether users are uploading massive diffs that slow down
the server through excessive repository checks or other issues.

To simplify this process going forward, Review Board 5.0.3+ will ship
with a new find-large-diffs management command. This is based on our
existing script, with a few notable differences:

  1. It can scan back a given number of days, either instead of or in
    addition to an ID range.

  2. For each review request, it tracks the largest diffset separately
    for each category: file count, diff size, and parent diff size.

  3. It outputs the results in CSV, easing processing. This output also
    includes user IDs, timestamps, and the max sizes and related diffset
    IDs for each category.
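
The three differences above can be illustrated with a minimal sketch. This
is not the code from this change: the DiffSetInfo record, the find_largest
and to_csv helpers, the argument names, and the CSV column names are all
hypothetical, and the ID range is assumed to apply to review request IDs.
It only shows the general idea of filtering by a day window and/or an ID
range, keeping a separate maximum per category for each review request,
and writing the result as CSV.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DiffSetInfo:
    """Hypothetical stand-in for one diffset row (not Review Board's model)."""
    diffset_id: int
    review_request_id: int
    user_id: int
    timestamp: datetime
    file_count: int
    diff_size: int
    parent_diff_size: int


# The categories that are tracked independently of each other.
CATEGORIES = ('file_count', 'diff_size', 'parent_diff_size')


def find_largest(diffsets, now, num_days=None, min_id=None, max_id=None):
    """Return one row per review request with the max of each category."""
    cutoff = None
    if num_days is not None:
        cutoff = now - timedelta(days=num_days)

    results = {}

    for d in diffsets:
        # The day window and the ID range can be used alone or together.
        if cutoff is not None and d.timestamp < cutoff:
            continue
        if min_id is not None and d.review_request_id < min_id:
            continue
        if max_id is not None and d.review_request_id > max_id:
            continue

        row = results.setdefault(d.review_request_id, {
            'review_request_id': d.review_request_id,
            'user_id': d.user_id,
            'last_updated': d.timestamp,
        })
        row['last_updated'] = max(row['last_updated'], d.timestamp)

        # Each category remembers its own maximum and the diffset it came
        # from, so different diffsets can "win" different categories.
        for cat in CATEGORIES:
            value = getattr(d, cat)
            if value > row.get('max_' + cat, -1):
                row['max_' + cat] = value
                row['max_' + cat + '_diffset_id'] = d.diffset_id

    return [results[key] for key in sorted(results)]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    fieldnames = ['review_request_id', 'user_id', 'last_updated']
    for cat in CATEGORIES:
        fieldnames += ['max_' + cat, 'max_' + cat + '_diffset_id']

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

In this sketch, a review request whose first diffset has the most files and
whose second diffset has the largest diff would report a different diffset
ID in each of those categories, which is the point of tracking them
individually.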

Full documentation is provided in the management commands docs.

Ran each combination of arguments locally, checking the results and
comparing them against my database.

Built the docs and read through them, checking for formatting issues,
spelling errors, bad links, and build errors.