• 
      

    Add a method to check if a mimetype is supported and improve guess_mimetype.

    Review Request #14310 — Created Jan. 27, 2025 and submitted

    Information

    Review Board
    release-7.1.x

    Reviewers

    This adds a method for checking whether a mimetype is supported or not.
    This is useful for code that deals with uploading binary files, to help
    decide which files should be uploaded. We have similar logic for this in
    RBTools. Valid and invalid mimetypes are saved to the cache and are
    checked against when available, which should help speed things up when
    the caller is checking a sequence of mimetypes.

    This change also fixes a fault with guess_mimetype. The function runs
    the file command in a subprocess to determine the given file's mimetype.
    The function doesn't properly close and clean up the subprocess's resources
    (the file descriptors for stdin and stdout). In the Review Board server
    process which runs for a long time, we're relying on Python's garbage
    collector to close everything. The garbage collector is not deterministic
    and it could take a while before the resources are cleaned up. It's better
    to deal with these ourselves to avoid any unexpected side effects. This
    change makes sure the resources get cleaned up when they're no longer used.

    • Ran unit tests.
    • Used in the Bitbucket Server Pull Request integration code in Power
      Pack.
    • Saw the ResourceWarning: unclosed file coming from guess_mimetype
      go away in the logs.
    Summary ID
    Add a method to check if a mimetype is supported and improve guess_mimetype.
    This adds a method for checking whether a mimetype is supported or not. This is useful for code that deals with uploading binary files, to help decide which files should be uploaded. We have similar logic for this in RBTools. Valid and invalid mimetypes are saved to the cache and are checked against when available, which should help speed things up when the caller is checking a sequence of mimetypes. This change also fixes a fault with `guess_mimetype`. The function runs the `file` command in a subprocess to determine the given file's mimetype. The function doesn't properly close and clean up the subprocess's resources (the file descriptors for stdin and stdout). In the Review Board server process which runs for a long time, we're relying on Python's garbage collector to close everything. The garbage collector is not deterministic and it could take a while before the resources are cleaned up. It's better to deal with these ourselves to avoid any unexpected side effects. This change makes sure the resources get cleaned up when they're no longer used.
    dd965637ff55487c8ee49a120c19ef6aa04185ab
    Description From Last Updated

    Maybe indent this as: ... stdin=subprocess.PIPE, ) as p: assert p.stdin is not None ... to make the block clearer?

    daviddavid

    This blank line should go away now.

    daviddavid

    I feel like this might be simplified if we did individual cache entries with a key like review-uis:supported-mimetype:{mimetype} and then …

    daviddavid

    It might be nice to have some in-memory caching (like a global) for this, so we don't have to re-parse …

    daviddavid

    Blank line after this.

    daviddavid

    I feel like this file would be better served using cache_memoize(), since that's designed for this kind of use case.

    chipx86chipx86
    david
    1. 
        
    2. reviewboard/attachments/mimetypes.py (Diff revision 1)
       
       
      Show all issues

      Maybe indent this as:

          ...
          stdin=subprocess.PIPE,
      ) as p:
          assert p.stdin is not None
          ...
      

      to make the block clearer?

    3. reviewboard/attachments/mimetypes.py (Diff revision 1)
       
       
      Show all issues

      This blank line should go away now.

    4. reviewboard/attachments/mimetypes.py (Diff revision 1)
       
       
       
       
       
       
      Show all issues

      I feel like this might be simplified if we did individual cache entries with a key like review-uis:supported-mimetype:{mimetype} and then just stored a True or False.

    5. reviewboard/attachments/mimetypes.py (Diff revision 1)
       
       
       
       
       
       
      Show all issues

      It might be nice to have some in-memory caching (like a global) for this, so we don't have to re-parse the mime types if this gets called repeatedly.

    6. Show all issues

      Blank line after this.

    7. 
        
    maubin
    david
    1. Ship It!
    2. 
        
    chipx86
    1. 
        
    2. reviewboard/attachments/mimetypes.py (Diff revision 2)
       
       
      Show all issues

      I feel like this file would be better served using cache_memoize(), since that's designed for this kind of use case.

    3. 
        
    maubin
    david
    1. Ship It!
    2. 
        
    maubin
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-7.1.x (5bcb517)