Detect and handle collapsible minified file diffs

Review Request #10162 — Created Sept. 22, 2018 and updated

Information

Review Board
release-4.0.x
0dcef66...

Reviewers

Implementation of diff file name extension checking for .min and content
checking for a small number of long lines in a file. If either is true then
it is likely a minified file. The minified file is then collapsed similarly
to deleted files and can be expanded to show the content of the file.

In diffviewer/diffutils.py, get_diff_files() performs the file extension
check then in populate_diff_chunks() if the file didn't have a .min
extension the content check is performed.

If the file was previously marked as a min by it's extension and if it has
less than 20 lines total then the lines are length checked to be greater
than 500 characters and if true remarked as minified files.

The diff_file_fragment.html then replaces then diff content with a minified
file message and a show content button that with a event listener in
diffReviewableModel.es6.js and diffViewerPageView.es6.js loads in the
minified file contents to the fragment.

test_diffutils.py python test written for testing minification testing.
test_minified_file_detection creates a minified diff object and verifies
the diffutils.py populate_diff_chunks() minification content check.

Ran Python Unit Tests.

Ran Javascript Jasmine Unit Tests.


Description From Last Updated

Unfortunately, Review Board never really has the files on the OS disk to read. The depot_filename that you've called this …

brenniebrennie

Please add a screenshot as a file attachment.

daviddavid

Can you detail what testing you've done?

daviddavid

W293 blank line contains whitespace

reviewbotreviewbot

W291 trailing whitespace

reviewbotreviewbot

E501 line too long (84 > 79 characters)

reviewbotreviewbot

W291 trailing whitespace

reviewbotreviewbot

E501 line too long (84 > 79 characters)

reviewbotreviewbot

F841 local variable 'temp' is assigned to but never used

reviewbotreviewbot

F841 local variable 'temp2' is assigned to but never used

reviewbotreviewbot

E501 line too long (92 > 79 characters)

reviewbotreviewbot

E261 at least two spaces before inline comment

reviewbotreviewbot

E265 block comment should start with '# '

reviewbotreviewbot

W293 blank line contains whitespace

reviewbotreviewbot

E265 block comment should start with '# '

reviewbotreviewbot

E265 block comment should start with '# '

reviewbotreviewbot

E265 block comment should start with '# '

reviewbotreviewbot

E261 at least two spaces before inline comment

reviewbotreviewbot

F401 'pdb' imported but unused

reviewbotreviewbot

E302 expected 2 blank lines, found 1

reviewbotreviewbot

E303 too many blank lines (2)

reviewbotreviewbot

Col: 1 Use the function form of "use strict".

reviewbotreviewbot

Col: 87 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 153 Missing semicolon.

reviewbotreviewbot

Col: 154 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 67 'module' is not defined.

reviewbotreviewbot

E303 too many blank lines (2)

reviewbotreviewbot

E501 line too long (83 > 79 characters)

reviewbotreviewbot

Col: 3 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 9 Unclosed string.

reviewbotreviewbot

Col: 481 Unclosed string.

reviewbotreviewbot

Col: 2 Missing semicolon.

reviewbotreviewbot

Col: 1 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 1 Unclosed string.

reviewbotreviewbot

Col: 67 'module' is not defined.

reviewbotreviewbot

Col: 154 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 153 Missing semicolon.

reviewbotreviewbot

Col: 87 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 1 Use the function form of "use strict".

reviewbotreviewbot

Col: 1 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 2 Missing semicolon.

reviewbotreviewbot

Col: 481 Unclosed string.

reviewbotreviewbot

Col: 9 Unclosed string.

reviewbotreviewbot

Col: 3 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 1 Unclosed string.

reviewbotreviewbot

Col: 1 Use the function form of "use strict".

reviewbotreviewbot

Col: 87 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 153 Missing semicolon.

reviewbotreviewbot

Col: 154 Expected an assignment or function call and instead saw an expression.

reviewbotreviewbot

Col: 67 'module' is not defined.

reviewbotreviewbot

Nit: undo all these removed blank lines.

brenniebrennie

We don't do multi-line comments with docstrings. Instead we do multiple single-line comments with #.

brenniebrennie

Nit: blank line between statement and block.

brenniebrennie

Undo.

brenniebrennie

This will need to be a new class so we can use a separate event handler.

brenniebrennie

E702 multiple statements on one line (semicolon)

reviewbotreviewbot

"minified_bool" isn't a good variable name (I can infer the type from the assignment), plus we only use it once. …

daviddavid

Can we add a comment explaining what the heuristics are?

daviddavid

This blank line isn't necessary.

daviddavid

Can you add a doc comment for this?

daviddavid

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

W191 indentation contains tabs

reviewbotreviewbot

E101 indentation contains mixed spaces and tabs

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E101 indentation contains mixed spaces and tabs

reviewbotreviewbot

E501 line too long (87 > 79 characters)

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E501 line too long (89 > 79 characters)

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E501 line too long (89 > 79 characters)

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E501 line too long (87 > 79 characters)

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E501 line too long (89 > 79 characters)

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E501 line too long (89 > 79 characters)

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E128 continuation line under-indented for visual indent

reviewbotreviewbot

E124 closing bracket does not match visual indentation

reviewbotreviewbot

W293 blank line contains whitespace

reviewbotreviewbot

Could use some rewording for clarity, like "Checks the number of lines in the chunk to determine if they are …

shovenshoven

Should there be an additional linebreak here to separate the two paragraphs?

gojeffchogojeffcho

This comment is a bit unclear, particularly the second part of it. I think the commenting might end up being …

gojeffchogojeffcho

I believe this should have an extra line break since the docstring is on a single line.

gojeffchogojeffcho

Tiny nit: two spaces in front of trans should probably be one. Also this line should be indented to be …

shovenshoven

W291 trailing whitespace

reviewbotreviewbot

Nitpick: I had to re-read this comment a few times to try and make sense of what it was saying. …

skaefer143skaefer143
Checks run (1 failed, 1 succeeded)
flake8 failed.
JSHint passed.

flake8

mike_conley
  1. 
      
  2. reviewboard/diffviewer/diffutils.py (Diff revision 1)
     
     

    One thing you might find useful is to use PDB (the Python Debugger), in tandem with print statements like this.

    You can make PDB break in your program by putting in:

    import pdb; pdb.set_trace()
    

    where you want the break to occur. Then, execute the steps to hit that line of code, and switch to your server window - you'll notice that the server has become a debugger. You can then next (n) through, continue (c) and print (p) things.

    1. I see, I'll try that next then, thanks!

  3. 
      
cdkushni
Review request changed

Commit:

-2df11efd2b0f40d52c337c4a6f0c855ecc0bb8d6
+22b8e752f07904bc9fd6cc3c2dd82818b388c045

Diff:

Revision 2 (+43)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

cdkushni
Review request changed

Summary:

-(WIP)Set up base file extension parsing to check for '.min.' extension before sending a bool with the result to the get_diff_files() data return. Still need to do the file line length checking.
+(WIP) checks if files are minified and adds the result to the file metadata.

Description:

~  

(WIP)Set up base file extension parsing to check for '.min.' extension before sending a bool with the result to the get_diff_files() data return. Still need to do the file line length checking.

  ~

(WIP)Set up base file extension parsing to check for '.min.' extension before

  + sending a bool with the result to the get_diff_files() data return.
  + Still need to do the file line length checking.

Commit:

-22b8e752f07904bc9fd6cc3c2dd82818b388c045
+6a45a10294cb2660791e935b55045497a42de8ef

Diff:

Revision 3 (+58)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

cdkushni
brennie
  1. 
      
  2. Show all issues

    Unfortunately, Review Board never really has the files on the OS disk to read. The depot_filename that you've called this function with is the name of the file in the repository. To get the contents of this file, you would need to use get_patched_file.

    However, we do not want to do that just to see if the file is modified as this function is called by endpoints just to retrieve metadata about files in a diff.

    The JavaScript reposonsible for loading file contents for display in the DiffViewer is RB.DiffViewerPageView (in reviewboard/static/rb/js/pages/views/diffViewerPageView.es6.js). Specifically in queueLoadDiff, which calls RB.DiffReviewableModel.getRenderedDiff (in reviewboard/static/rb/js/diffviewer/models/diffReviewableModel.es6.js).

    The Django view for actually rendering the content (that the above JS model calls into) is reviewboard.diffviewer.views.DiffFragmentView (in reviewboard/diffviewer/views.py). This is where we will want to do the check call this function with (1) the filename and (2) the old/new file contents. Then we can take into account similar logic to how the contents of deleted files are requested to show the contents of minified files.

    I realize this is a lot to take in, but I'm happy to answer any questions (esp tomorrow in person) about this. I've spent quite a lot of time in the DiffViewer's JS recently, so I'm fairly familiar with how it all sort of fits together.

  3. reviewboard/diffviewer/diffutils.py (Diff revision 3)
     
     

    I find it easy to do import pdb; pdb.set_trace() wherever it is im using pdb. That way I don't forget to remove imports :)

  4. 
      
cdkushni
Review request changed

Commit:

-6a45a10294cb2660791e935b55045497a42de8ef
+97fd115f4b9339687d4f2a9f2864a47a1ca11ec7

Diff:

Revision 4 (+39)

Show changes

Checks run (2 failed)

flake8 failed.
JSHint failed.

flake8

JSHint

cdkushni
Review request changed

Summary:

-(WIP) checks if files are minified and adds the result to the file metadata.
+(WIP) checks if files are minified and overwrites the diff fragment display

Description:

   

Set up a function get_file_isminified(filename) in diffutils.py

    to check the file name and the file contents on if it is a minified file.
    This function is called and returns a boolean which is passed to the file
~   metadata in get_diff_files().

  ~ metadata in get_diff_files().
  + In diffviewer/renderer.py: If the file was previously marked as a min by
  + it's extension and if it has less than 20 lines total then the lines are
  + length checked to be greater than 800 characters and if true remarked as
  + minified files.
  + The diff_file_fragment.html then replaces then diff content with a minified
  + file message instead of displaying the contents.

Commit:

-97fd115f4b9339687d4f2a9f2864a47a1ca11ec7
+58a91172940a508410da813f9acc555268051520

Diff:

Revision 5 (+60 -5)

Show changes

Checks run (2 failed)

flake8 failed.
JSHint failed.

flake8

JSHint

cdkushni
Review request changed

Checks run (1 failed, 1 succeeded)

flake8 passed.
JSHint failed.

JSHint

cdkushni
brennie
  1. 
      
  2. reviewboard/diffviewer/renderers.py (Diff revision 7)
     
     
    Show all issues

    Nit: undo all these removed blank lines.

  3. reviewboard/diffviewer/renderers.py (Diff revision 7)
     
     
     
     
     
     
     
    Show all issues

    We don't do multi-line comments with docstrings. Instead we do multiple single-line comments with #.

  4. reviewboard/diffviewer/renderers.py (Diff revision 7)
     
     
     
    Show all issues

    Nit: blank line between statement and block.

  5. Show all issues

    Undo.

  6. Show all issues

    This will need to be a new class so we can use a separate event handler.

  7. 
      
cdkushni
Review request changed

Commit:

-80c23f9baef6c3273d78fab01bdf66aa2e39552a
+d792e26553ae3f9bb0ab780d2ac8c5926dc14212

Diff:

Revision 8 (+21 -6)

Show changes

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

cdkushni
cdkushni
cdkushni
cdkushni
david
  1. 
      
  2. Show all issues

    Please add a screenshot as a file attachment.

  3. Show all issues

    Can you detail what testing you've done?

  4. reviewboard/diffviewer/diffutils.py (Diff revision 9)
     
     
     
     
     
    Show all issues

    "minified_bool" isn't a good variable name (I can infer the type from the assignment), plus we only use it once. Let's just do it inline:

    f = {
        'minified': 'min' in depot_filename.split('.'),
        ...
    }
    
  5. reviewboard/diffviewer/diffutils.py (Diff revision 9)
     
     
    Show all issues

    Can we add a comment explaining what the heuristics are?

  6. reviewboard/diffviewer/views.py (Diff revision 9)
     
     
    Show all issues

    This blank line isn't necessary.

  7. Show all issues

    Can you add a doc comment for this?

  8. 
      
cdkushni
Review request changed

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

cdkushni
Review request changed

Testing Done:

   

DiffUtils python test written for testing minification testing.

  +
  +

Ran Python Unit Tests.

  +
  +

Ran Javascript Jasmine Unit Tests.

Commit:

-189fc75a38e3a1e14aa30bdf1ac7ee9510eb009c
+1fd06cc6dfd8a2aa9216e5deba6cb8e7e40291c0

Diff:

Revision 11 (+146 -8)

Show changes

Added Files:

Checks run (1 failed, 1 succeeded)

flake8 failed.
JSHint passed.

flake8

cdkushni
cdkushni
cdkushni
shoven
  1. 
      
  2. reviewboard/diffviewer/diffutils.py (Diff revision 12)
     
     
    Show all issues

    Could use some rewording for clarity, like "Checks the number of lines in the chunk to determine if they are short and not already categorized as minified files." (Based on your code it seems that you mean the number of lines in the chunk, not the length of the lines in the chunk so clarifying this in your comment would be helpful.)

    Could be more clear on what "short" means (give a number)? It seems that you are saying a file is minified if the number of lines is less than 20?

  3. Show all issues

    Tiny nit: two spaces in front of trans should probably be one.

    Also this line should be indented to be nested within the <td> element.

  4. 
      
gojeffcho
  1. 
      
  2. reviewboard/diffviewer/diffutils.py (Diff revision 12)
     
     
    Show all issues

    Should there be an additional linebreak here to separate the two paragraphs?

  3. reviewboard/diffviewer/diffutils.py (Diff revision 12)
     
     
     
     
     
    Show all issues

    This comment is a bit unclear, particularly the second part of it. I think the commenting might end up being unnecessary if this nested structure can be cleaned up to be more immediately readable.

  4. Show all issues

    I believe this should have an extra line break since the docstring is on a single line.

  5. 
      
cdkushni
Review request changed
cdkushni
Review request changed
skaefer143
  1. 
      
  2. reviewboard/diffviewer/diffutils.py (Diff revision 14)
     
     
     
     
     
     
     
    Show all issues

    Nitpick: I had to re-read this comment a few times to try and make sense of what it was saying. Could you try to make it a bit more clear? Maybe something like:

    "It checks the file's chunk data to determine if it is possibly a minified file or not. If there are less than 20 lines, and more than 500 characters in a line, it is determined there is a minified script in the file."

    Some thing like that? Correct my logic if it's wrong haha

  3. 
      
Loading...