Add a management command for finding large diffs.

Review Request #12849 — Created Feb. 26, 2023 and submitted

Review Board

When diagnosing performance problems in production, we often need
companies to run a script we provide that scans for large diffs. This
tells us whether users are uploading massive diffs that slow down the
server through excess repository checks or other issues.

To simplify this process going forward, Review Board 5.0.3+ will ship
with a new find-large-diffs management command. This is based on our
existing script, with a few notable differences:

  1. It can scan back a given number of days, instead of or along with an
    ID range.

  2. It tracks the largest diffsets on a review request individually for
    file counts, diff sizes, and parent diff sizes.

  3. It outputs the results in CSV, easing processing. This output also
    includes user IDs, timestamps, and the max sizes and related diffset
    IDs for each category.
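
As a rough illustration of items 2 and 3, the per-category tracking and
CSV output could be sketched as below. This is not the code from the
change itself: the DiffSetInfo class, the find_largest() and write_csv()
helpers, and the CSV column names are all hypothetical, and the real
command's columns (which also include user IDs and timestamps) may
differ.

```python
import csv
from dataclasses import dataclass
from typing import Dict, Iterable, TextIO, Tuple


@dataclass(frozen=True)
class DiffSetInfo:
    """Hypothetical summary of one diffset (not Review Board's model)."""

    diffset_id: int
    review_request_id: int
    num_files: int
    diff_size: int
    parent_diff_size: int


# The categories tracked independently for each review request.
CATEGORIES = ('num_files', 'diff_size', 'parent_diff_size')

# Maps review request ID -> category -> (max value, diffset ID).
Results = Dict[int, Dict[str, Tuple[int, int]]]


def find_largest(diffsets: Iterable[DiffSetInfo]) -> Results:
    """Record the largest value and its diffset ID for each category."""
    results: Results = {}

    for diffset in diffsets:
        maxes = results.setdefault(diffset.review_request_id, {})

        for category in CATEGORIES:
            value = getattr(diffset, category)

            # Each category keeps its own winner, so one review request
            # may point at different diffsets for different categories.
            if category not in maxes or value > maxes[category][0]:
                maxes[category] = (value, diffset.diffset_id)

    return results


def write_csv(results: Results, stream: TextIO) -> None:
    """Write one CSV row per review request (assumed column names)."""
    writer = csv.writer(stream)

    header = ['review_request_id']
    for category in CATEGORIES:
        header += [f'max_{category}', f'max_{category}_diffset_id']
    writer.writerow(header)

    for review_request_id in sorted(results):
        row = [review_request_id]
        for category in CATEGORIES:
            value, diffset_id = results[review_request_id][category]
            row += [value, diffset_id]
        writer.writerow(row)
```

Tracking each category separately matters because the diffset with the
most files on a review request is not necessarily the one with the
largest diff or parent diff.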

Full documentation is provided in the management commands docs.

Ran each combination of arguments locally, checking the results and
comparing them to my database.

Built the docs and read through them, checking for formatting issues,
spelling errors, bad links, and build errors.


'typing.Dict' imported but unused Column: 1 Error code: F401

Checks run (1 failed, 1 succeeded)
flake8 failed.
JSHint passed.


  1. Once Review Bot is happy, I am too.

  1. Ship It!
Review request changed

Status: Closed (submitted)

Change Summary:

Pushed to release-5.0.x (f355c91)