[WIP] Support exploring archive files attached to reviews.

Review Request #11402 — Created Jan. 24, 2021 and updated

mfliu
Review Board
master
reviewboard, students
ReviewBoard currently supports attaching individual files for review,
but while some filetypes can be opened in-browser and examined, archive
files can only be downloaded. This commit adds a new review UI that
allows archives to be browsed in ReviewBoard without downloading the
file. Users can navigate the archived files, see metadata, and download
individual files.

Unit tests for get_js_model_data using empty archives, simple archives with files, and archives with nested directories.

Summary
[WIP] Support exploring archive files attached to reviews.

Description From Last Updated

Michael, your project seems so interesting for sure! The minor and general slips that I've seen are for the import ...

CL clarissaudrey

Have you given some thought to how wide the indent would be for each directory level contained within the zip? ...

mderosemderose

It seems like there is a lot of room for the table to grow within the File tab. Some questions ...

mderosemderose

I think this is a good idea. I think a seperate download button will make things a bit clearer for ...

mderosemderose

It should match any other tables that exist in review board. An example would be the dashboard that displays all ...

mderosemderose

Maybe a drop down/popup that appears when you hover over the file name?

mderosemderose

This model will have all of the data that the view will display. The templates will json-encode the return value ...

daviddavid

F401 'tarfile.TarFile' imported but unused

reviewbotreviewbot

Should be using single quotes

ryankangryankang

These should all be single quotes as well

ryankangryankang

single quotes in the print statement for strings

ryankangryankang

Let's make sure that everything after base here stays in alphabetical order.

daviddavid

This should inherit from object

daviddavid

This should inherit from object

daviddavid

I'm sure it's already on your radar, but this is kind of backwards. We should check explicitly for application/x-tar when ...

daviddavid

We can probably leave this out for now. Unless you're planning on using backbone routing to add a bunch of ...

daviddavid

F821 undefined name 'FileNotFoundError'

reviewbotreviewbot

What were you thinking you might use for this extraction? Are you considering a third party service or are you ...

mderosemderose

For my own understanding, why are mime types guessed here? I've never heard of that before.

mderosemderose

Why is "wah" added to the end of this?

mderosemderose

There is some white space on lines 7,8 and 12 that needs to be removed.

mderosemderose

I think this whitespace needs to be removed

mderosemderose
david
  1. 
      
  2. This model will have all of the data that the view will display. The templates will json-encode the return value from your ArchiveReviewUI.get_js_model_data() method and pass that into the constructor for your model here (which is how data gets from the Python on the server into JS on the client). In this case you'd probably have some kind of tree data representing the archive contents.

    The view then takes the data stored in the model and displays it.

  3. 
      
mfliu
Review request changed

Change Summary:

Update the Zip archive parsing to use intermediary class hierarchy instead of directly parsing to a Python dictionary

Commits:

Summary
-
[WIP] Support exploring archive files attached to reviews.
+
[WIP] Support exploring archive files attached to reviews.

Diff:

Revision 2 (+488)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

ryankang
  1. 
      
  2. reviewboard/reviews/ui/archive.py (Diff revision 2)
     
     

    Should be using single quotes

  3. reviewboard/reviews/ui/archive.py (Diff revision 2)
     
     

    These should all be single quotes as well

  4. reviewboard/reviews/ui/archive.py (Diff revision 2)
     
     

    single quotes in the print statement for strings

    1. These are just for debugging, and will be removed in the actual review-request

  5. 
      
mfliu
mfliu
david
  1. 
      
  2. reviewboard/reviews/ui/__init__.py (Diff revision 4)
     
     
     
     
     

    Let's make sure that everything after base here stays in alphabetical order.

  3. reviewboard/reviews/ui/archive.py (Diff revision 4)
     
     

    This should inherit from object

  4. reviewboard/reviews/ui/archive.py (Diff revision 4)
     
     

    This should inherit from object

  5. reviewboard/reviews/ui/archive.py (Diff revision 4)
     
     
     
     
     
     
     

    I'm sure it's already on your radar, but this is kind of backwards. We should check explicitly for application/x-tar when instantiating the TarArchiveFormat, and the else clause should raise.

    Inn fact, we might want to have a dict mapping mimetype to the format class so it can just be something like:

    try:
        archiveFormat = self.formats[self.obj.mimetype]
        archive = archiveFormat(self.obj.filename, file_bytes)
        data['archive_hierarchy'] = archive.to_dict()
    except KeyError:
        raise NotImplementedError
    
    1. I added the formats dict mappings, which works quite well. I did want to check what the expected behaviour for unsupported archive formats is. Should they not be handled by the review UI at all (in which case I just omit the mimetypes from the dict), or should they be handled but error out somehow?

    2. I think we still want to catch the KeyError (just for bullet-proofing), but we should also only put things into supported_mimetypes if the review UI can handle them.

  6. We can probably leave this out for now. Unless you're planning on using backbone routing to add a bunch of separate URLs that are all served by the same view, we won't need this.

  7. 
      
mfliu
mfliu
mfliu
mfliu
mderose
  1. 
      
  2. Have you given some thought to how wide the indent would be for each directory level contained within the zip? You may have to make the default width of the Name column match the string length of the deepest file with the added indents (i.e. if there are 3 levels of directories then the minimum width of the name column should be the width of the longest filename is this directory level with three indents). If you want the Name column width to be fixed for all zips, tarballs or other archive files then you may need to speak with the mentors to figure out the average number of directory levels in zip files to calculate the average number of indents that will be required).

    If you are interested in considering an alternate way of displaying the different directory nesting levels you might want to consider subsituting the indent with the relative path for the file as shown below.

    test.zip
    file.text
    directory_name
    directory_name/other_file.text

    1. If it's worthy for you to consider, I have another suggestion for the style of the nesting, which hopefully wouldn't be too complicated to visualize. I was thinking, with yours and Matthew's suggestions, there might be some empty or repetitive lines. If we consider a case where the directory name is really long or if the nesting is really deep, the users would be shown empty spaces (because the indentation is getting larger and larger while the space is limited) or the users would be shown the same line (e.g directory_level0/directory_level1/directory_level2... and imagine directory_level2 has many components). Perhaps a collapsible directory would fit this scenario better, so the users can expand/collapse each directory and see the components inside? And when the user collapsed it, it would hide all directory and/or files inside?

  3. It seems like there is a lot of room for the table to grow within the File tab. Some questions you may want to think about:
    -Will the name column have a max width? If yes how will that max width be calculated relative to the window size and file tab size.
    -How will you display file names (with or without indents) that exceed the name column width? Will you use a scroll bar?
    -If you use a scroll bar where will that be place to avoid text being covered (either column name or file names)? Do the scroll bars used in the rest of Review Board work with your design?

  4. I think this is a good idea. I think a seperate download button will make things a bit clearer for users that haven't used this feature before. You'll just need to take into consideration where this download button should be placed and how easily it can be found at different screen widths.

    1. I agree with Matthew regarding the separate download button. Just a bit of consideration, maybe add one extra checkbox at the very top where user can "select all" by clicking that one box? It sounds tedious having to check the box one by one if the users want to download all files?

  5. It should match any other tables that exist in review board. An example would be the dashboard that displays all review requests but before you start implementing this it might be worth asking one of the mentors if there is another table that would be a better reference for you to use.

  6. Maybe a drop down/popup that appears when you hover over the file name?

  7. 
      
mfliu
Review request changed

Change Summary:

Add support for downloading individual files with in an archive

Commits:

Summary
-
[WIP] Support exploring archive files attached to reviews.
+
[WIP] Support exploring archive files attached to reviews.

Diff:

Revision 8 (+800)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

mfliu
mderose
  1. Your work looks really interesting. I have some questions that are just for my own knowledge and also to get some clarifications on a few things. I also found a few white spaces that need to be removed. I'm not sure if you're there yet but you may want to update your mockup just so that when you want to start implementing you have a clear design that you can work from. It might save you time instead of having to design and implement at the same time. Keep up the great work!

    1. Thanks Matthew! I still haven't fully decided on what the frontend design will be, so I'm saving the mock-up update until later. Currently, I'm still trying to work on fleshing out all of the features that this review UI should have such as supporting more archive types and adding review comments.

  2. reviewboard/reviews/ui/archive.py (Diff revision 9)
     
     

    What were you thinking you might use for this extraction? Are you considering a third party service or are you going to code it yourself?

    1. This function gets overridden by subclasses to support extracting a single file from the archive. I'll put a note in the docstring to make that clearer. The same Python packages that allow us to read the archive hierarchies also allow us to extract single files, so this function is more or less using the same data as _get_tree

  3. reviewboard/reviews/ui/archive.py (Diff revision 9)
     
     

    For my own understanding, why are mime types guessed here? I've never heard of that before.

    1. When we send data via HTTP to the client, the user's browser needs to know what kind of data has just been sent so it can best decide how to handle it. This is how your browser tells you whether the file you just downloaded was a text file, video, image, etc. If we can't tell what the file's mimetype is (since it isn't always possible to determine), we use the default application/octet-stream which represents a raw sequence of bytes.

  4. reviewboard/reviews/ui/archive.py (Diff revision 9)
     
     

    Why is "wah" added to the end of this?

    1. Oops, you weren't supposed to see that.

  5. There is some white space on lines 7,8 and 12 that needs to be removed.

  6. I think this whitespace needs to be removed

  7. 
      
mfliu
CL
  1. 
      
    1. Sorry for the weird formatting, my bad!

  2. Michael, your project seems so interesting for sure! The minor and general slips that I've seen are for the import statements, where you need to sort it alphabetically, e.g

    ```from reviewboard.reviews.ui.archive import ArchiveReviewUI

     should come before the 
    ```from reviewboard.reviews.ui.base import register_ui
    

    or the from xxx import C,A,B
    should be from xxx import A,B,C.

    1. Thanks Clarissa! I've fixed the import statement ordering in archive.py, but I think David mentioned that only the imports after base should be in order for __init__.py.

  3. 
      
mfliu
mfliu
mfliu
Review request changed

Testing Done:

  +

Unit tests for get_js_model_data using empty archives, simple archives with files, and archives with nested directories.

Loading...