Testing Done: |
|
---|
Reduce diff storage by hashing diff uploads
Review Request #2618 — Created Sept. 25, 2011 and submitted
Reduce diff storage by hashing diff uploads Diff files that already exist in the database will no longer be double-stored. Diffs are now hashed on upload and correspond to a new hash->binary table. Existing table data is not hashed and will remain for backwards compatibility. Evolutions have been made to create the new table and rename existing fields, so that model-logic can override fields. Test data has also been modified for new field name compatibility.
Ran fill-database and all data was hashed accordingly. Manually posted reviews and posted parent diffs. Created a unit test to check if hashes match.
Description | From | Last Updated |
---|---|---|
These are both built-in Python modules, so no blank line. |
chipx86 | |
I'm not really sure "Binary" is quite the term we want. FileDiffData might be better. Then "hash" and "data" fields. |
chipx86 | |
Docstrings are in the format of: """One-line summary Option multi-line description. """ It should always be a proper sentence, punctuation … |
chipx86 | |
Alignment problem. There's no need for \, since the parens take care of it. |
chipx86 | |
Same here. |
chipx86 | |
Blank line between these. A comment describing the next block of code should be separated. Also, comments should have proper … |
chipx86 | |
This can be: self.diff_hash, is_new = \ FileDiffBinary.objects.get_or_create(binary_hash=hashkey, defaults={ 'binary': diff }) |
chipx86 | |
Same logic can be applied here for the get_or_create. |
chipx86 | |
I?think that there should a corresponding py file for that? |
ZH zhenli | |
For consistencies sake, can we call that "kwargs"? |
mike_conley | |
Space after the if block |
mike_conley | |
Managers go in managers.py. |
chipx86 | |
It'd be nice to pull defaults out, so you don't repeat yourself (and also, don't assume defaults is set): defaults … |
chipx86 | |
Space before \ |
chipx86 | |
Put the comment inside the else: |
chipx86 | |
Two blank lines here |
david | |
Imports should be alphabetized. |
david |
-
-
reviewboard/diffviewer/models.py (Diff revision 1) These are both built-in Python modules, so no blank line.
-
reviewboard/diffviewer/models.py (Diff revision 1) I'm not really sure "Binary" is quite the term we want. FileDiffData might be better. Then "hash" and "data" fields.
-
reviewboard/diffviewer/models.py (Diff revision 1) Docstrings are in the format of: """One-line summary Option multi-line description. """ It should always be a proper sentence, punctuation and all.
-
reviewboard/diffviewer/models.py (Diff revision 1) Alignment problem. There's no need for \, since the parens take care of it.
-
-
reviewboard/diffviewer/models.py (Diff revision 1) Blank line between these. A comment describing the next block of code should be separated. Also, comments should have proper punctuation (period at the end).
-
reviewboard/diffviewer/models.py (Diff revision 1) This can be: self.diff_hash, is_new = \ FileDiffBinary.objects.get_or_create(binary_hash=hashkey, defaults={ 'binary': diff })
-
reviewboard/diffviewer/models.py (Diff revision 1) Same logic can be applied here for the get_or_create.
-
-
reviewboard/diffviewer/evolutions/__init__.py (Diff revision 1) I think it might be better to call it “add_diff_hash".
-
-
reviewboard/reviews/fixtures/test_reviewrequests.json (Diff revision 1) Looks like, after the first line, the who block is identical, but still marked by ReviewBoard as different. Is it a bug?
-
-
reviewboard/diffviewer/evolutions/__init__.py (Diff revision 2) I?think that there should a corresponding py file for that?
Change Summary:
Removed the AutoKey from FileDiffData and added a custom manager for the Base64Field AutoKey workaround
Diff: |
Revision 4 (+361 -134) |
---|
Change Summary:
Removed the AutoKey from FileDiffData and added a custom manager for the Base64Field AutoKey workaround (Disregard revision 4)
Diff: |
Revision 5 (+168 -91) |
---|
-
David: This looks really good! Just the two tiny nits I found below. Once those are fixed, I'll happily give this my ship-it. Thanks, -Mike
-
reviewboard/diffviewer/models.py (Diff revision 5) For consistencies sake, can we call that "kwargs"?
-
-
Couple small things. Do unit tests still run?
-
-
reviewboard/diffviewer/models.py (Diff revision 6) It'd be nice to pull defaults out, so you don't repeat yourself (and also, don't assume defaults is set): defaults = kwargs.get('defaults', {}) if defaults and defaults['binary']: defaults['binary'] = ...
-
-
Change Summary:
Moved FileDiffDataManager to managers.py, added a unit test for hashes, cleaned up tests.py with pep8.
Diff: |
Revision 7 (+201 -95) |
---|
Testing Done: |
|
---|