Optimize our datagrid queries when sorting data.
Review Request #883 — Created June 7, 2009 and submitted
Optimize our datagrid queries when sorting data. We were doing one big query for the data in the datagrid, but when sorting, it could take a long time to fetch the results. In MySQL (our test case for this) we were hitting several slowdowns, as it needed to set up a temporary table and do a filesort on the data. To improve the speed, we now query for the IDs only, factoring out all other parts of the provided query. We can then perform a second query, fetching the data we need, by querying specifically for entries with those IDs, after all sorting has been applied. A side-effect of this optimization is that things may be dropped from the initial query. This isn't a problem for most uses, but if subqueries are used and the datagrid cells are expected to be able to access this data, that data will end up stripped away and the cells will fail. This can be worked around by either turning off optimize_sorts for the datagrid, or overriding post_process_queryset to specify those extra subqueries (the latter being significantly better for larger datasets).
Tested with all the datagrids in Review Board, and saw significant performance increases, especially in conjunction with the Review Board changes in /r/884.
- Change Summary:
-
Fixed the sort order of the results. This was broken, since the database doesn't actually keep results in the ID list provided (nor should it). We now process this ourselves after getting the results. Also enhanced the comments to describe what we're doing a bit better, and to explain how post_process_queryset() is useful.
- Diff:
-
Revision 2 (+69 -6)