This change has been marked as completed.

Describe the completed change (optional):

Summary

Optimize our datagrid queries when sorting data.

Review Request #883 — Created June 7, 2009 and submitted

Information

Owner

chipx86*

Repository

Navi (deprecated)

Branch

trunk

Bugs

Depends On

Reviewers

Groups

reviewboard

People

Description*

Optimize our datagrid queries when sorting data.

We were doing one big query for the data in the datagrid, but when
sorting, it could take a long time to fetch the results. In MySQL (our
test case for this) we were hitting several slowdowns, as it needed to
set up a temporary table and do a filesort on the data.

To improve the speed, we now query for the IDs only, factoring out all
other parts of the provided query. We can then perform a second query,
fetching the data we need, by querying specifically for entries with
those IDs, after all sorting has been applied.

A side-effect of this optimization is that things may be dropped from
the initial query. This isn't a problem for most uses, but if subqueries
are used and the datagrid cells are expected to be able to access this
data, that data will end up stripped away and the cells will fail.

This can be worked around by either turning off optimize_sorts for the
datagrid, or overriding post_process_queryset to specify those extra
subqueries (the latter being significantly better for larger datasets).

Testing Done

Tested with all the datagrids in Review Board, and saw significant performance increases, especially in conjunction with the Review Board changes in /r/884.

Description	From	Last Updated
There are no open issues

This mostly looks pretty good. My comments are mostly about documentation.

/trunk/djblets/djblets/datagrid/grids.py (Diff revision 1)

I'm not really sure what this comment means.

/trunk/djblets/djblets/datagrid/grids.py (Diff revision 1)

Can you mention in here that this function is meant to be overridden? It took me a while to figure out that I had to look at your other changeset for this to make sense.

Change Summary:

Fixed the sort order of the results. This was broken, since the database doesn't actually keep results in the ID list provided (nor should it). We now process this ourselves after getting the results.

Also enhanced the comments to describe what we're doing a bit better, and to explain how post_process_queryset() is useful.

Diff:

Revision 2 (+69 -6)

Show changes

/trunk/djblets/djblets/datagrid/grids.py

Ship it!

/trunk/djblets/djblets/datagrid/grids.py (Diff revisions 1 - 2)

This should just be "if id_list"

chipx86 16 years, 1 month ago

Right. Originally I was thinking that we wanted to still handle this logic if id_list was [], but that was really before I knew exactly what my logic was going to be.

It seems like there should be a more elegant way to do the ID sorting, but it's not jumping out at me right away. I'll continue thinking about it, but this is good as-is.

chipx86 16 years, 1 month ago

Same. I found some of this logic on a random message board from someone else doing this. This works for now but if you have a better method, please feel free :)

Thanks for doing these!