    Better control search indexing of datagrids.

    Review Request #14362 — Created March 4, 2025 and submitted

    Information

    Djblets
    release-5.x

    Reviewers

    Some search index bots have a tendency to get stuck on datagrid pages.
    They navigate through pages, click sort/unsort and column edit links,
    and end up generating mass amounts of URLs to index.

    This change works to control this through sane defaults and full opt-out
    of search indexing.

    The sort and column choice links now use role="button" and
    rel="nofollow noindex". The role="button" should prevent indexing
    by itself, but we use both to cover bases.
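
A minimal sketch of what such a link might look like when rendered. The helper name `render_control_link` is hypothetical and only illustrates the attributes described above; Djblets renders these links through its own templates.

```python
from html import escape


def render_control_link(url: str, label: str) -> str:
    """Render a sort or column-choice link that crawlers should skip.

    Hypothetical helper for illustration only.
    """
    return (
        f'<a href="{escape(url, quote=True)}" role="button" '
        f'rel="nofollow noindex">{escape(label)}</a>'
    )


# Example: a link that re-sorts the grid by a column.
print(render_control_link('?sort=-last_updated&page=1', 'Last Updated'))
```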

    Pagination buttons now use the same nofollow noindex for the "Last"
    link, to prevent jumping to the end of the list where queries are most
    expensive.

    Other pagination buttons set rel to first for the first page, last
    for the last page, next for the next page, and prev for the previous
    page. Not all engines honor these values (Google ignores them), but
    some do, and the hints can help those engines with their indexing
    choices.
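
The rel choices for pagination links could be sketched roughly as follows. The function name is hypothetical, and it is an assumption here that the "Last" link carries both the `last` relation and the `nofollow noindex` tokens; the description does not spell out how the two combine.

```python
# Tokens used to keep crawlers away from a link.
NO_INDEX = 'nofollow noindex'


def pagination_rel(kind: str) -> str:
    """Return the rel value for a pagination link.

    ``kind`` is one of 'first', 'prev', 'next' or 'last'.
    Hypothetical helper for illustration only.
    """
    if kind not in ('first', 'prev', 'next', 'last'):
        raise ValueError(f'unknown pagination link kind: {kind}')

    if kind == 'last':
        # Assumption: the relation name and the crawler-blocking
        # tokens are combined on the "Last" link.
        return f'last {NO_INDEX}'

    return kind
```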

    Datagrid pages define a canonical URL that excludes columns, sort,
    and others, helping to de-index previously-indexed pages with these
    options set, and reducing the indexing queue.
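
One way to derive such a canonical URL, as a sketch using only the standard library. The tuple of excluded arguments is an assumption: the description names `columns` and `sort` and mentions "others" without listing them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query arguments treated as display-only (assumed list, see above).
DISPLAY_ONLY_ARGS = ('columns', 'sort')


def canonical_url(url: str) -> str:
    """Return the URL with display-only query arguments removed."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DISPLAY_ONLY_ARGS
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The result would then be emitted in a `<link rel="canonical">` tag on the page, so that differently sorted views of the same list all point at one URL.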

    Standard query string arguments for sorting/unsorting now use sort order
    instead of dictionary order, keeping URLs stable.
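
A sketch of the idea, reading "sort order" as "sorted by argument name" (an assumption about the exact ordering used). The function name is hypothetical.

```python
from urllib.parse import urlencode


def stable_query_string(args: dict) -> str:
    """Build a query string whose argument order does not depend on
    the order in which the arguments were inserted."""
    return urlencode(sorted(args.items()))


# Both calls produce the same string, so crawlers see a single URL.
print(stable_query_string({'sort': 'summary', 'page': '2'}))
print(stable_query_string({'page': '2', 'sort': 'summary'}))
```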

    Search indexing can also be disabled entirely for a datagrid by setting
    allow_search_indexing = False. In this case, all pagination links will
    use rel="nofollow noindex", and the datagrid pages will set the same in
    a <meta> tag. This is particularly useful when there are multiple
    datagrids that ultimately cover subsets of a larger list of items.
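
The opt-out behavior can be sketched with a stand-in class. `GridSketch`, `robots_meta`, and `link_rel` are hypothetical names, not the Djblets API; only the `allow_search_indexing` attribute comes from the description, and the exact `<meta>` markup is an assumption.

```python
class GridSketch:
    """Hypothetical stand-in for a datagrid class; not the Djblets API."""

    # Mirrors the attribute named in the description.
    allow_search_indexing = True

    def robots_meta(self) -> str:
        """Return a robots <meta> tag, or '' when indexing is allowed."""
        if self.allow_search_indexing:
            return ''

        # Assumption: the exact markup. The description only says the
        # page sets the same values in a <meta> tag.
        return '<meta name="robots" content="nofollow, noindex">'

    def link_rel(self, default_rel: str) -> str:
        """Return the rel value for a pagination link."""
        if self.allow_search_indexing:
            return default_rel

        return 'nofollow noindex'


class SubsetGrid(GridSketch):
    """A grid covering a subset of a larger list; opts out of indexing."""

    allow_search_indexing = False
```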

    In the process, a small optimization was made to the code determining if
    there should be a First Page or Last Page link shown. We were searching
    the entire list of pages for page 1 or page <length>, which was
    unnecessary. We now just check the first and last page numbers in the
    display list, respectively.
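
The optimized check amounts to something like the sketch below, assuming the display list is a sorted run of page numbers. The function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple


def needs_first_last_links(shown_pages: Sequence[int],
                           num_pages: int) -> Tuple[bool, bool]:
    """Return (show_first, show_last) for a window of page numbers.

    Only the two ends of the window are inspected, rather than
    searching the whole list for page 1 or page ``num_pages``.
    """
    if not shown_pages:
        return False, False

    show_first = shown_pages[0] != 1
    show_last = shown_pages[-1] != num_pages

    return show_first, show_last
```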

    Unit tests pass.

    Tested this with Review Board's datagrids, checking the resulting
    HTML with search indexing on and off.

    Summary ID
    Better control search indexing of datagrids.
    6bc288b34ff4fa8be5e30107d58d16e4e4bd55d0
    Description From Last Updated

    Seems like a set would be better here.

    david

    Blank lines between bullet points is weird.

    david
    david
    1. djblets/datagrid/grids.py (Diff revision 1)

      Seems like a set would be better here.

      1. Where would a set come in? We're operating off of a QueryDict, which gives us .pop() for deleting. The tuple is just items to delete, so a set doesn't buy us anything.

    2. djblets/datagrid/grids.py (Diff revision 1)

      Blank lines between bullet points is weird.

      1. I find it so much easier to read that way, especially as they become multi-paragraph.

    david
    1. Ship It!
    maubin
    1. Ship It!
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-5.x (26772ac)