Testing Done: |
|
---|
Reduce diff storage by hashing diff uploads
Review Request #2618 — Created Sept. 25, 2011 and submitted
Information | |
---|---|
ddruska | |
Review Board | |
Reviewers | |
reviewboard | |
Reduce diff storage by hashing diff uploads Diff files that already exist in the database will no longer be double-stored. Diffs are now hashed on upload and correspond to a new hash->binary table. Existing table data is not hashed and will remain for backwards compatibility. Evolutions have been made to create the new table and rename existing fields, so that model-logic can override fields. Test data has also been modified for new field name compatibility.
Ran fill-database and all data was hashed accordingly. Manually posted reviews and posted parent diffs. Created a unit test to check if hashes match.
Description | From | Last Updated |
---|---|---|
These are both built-in Python modules, so no blank line. |
|
|
I'm not really sure "Binary" is quite the term we want. FileDiffData might be better. Then "hash" and "data" fields. |
|
|
Docstrings are in the format of: """One-line summary Option multi-line description. """ It should always be a proper sentence, punctuation … |
|
|
Alignment problem. There's no need for \, since the parens take care of it. |
|
|
Same here. |
|
|
Blank line between these. A comment describing the next block of code should be separated. Also, comments should have proper … |
|
|
This can be: self.diff_hash, is_new = \ FileDiffBinary.objects.get_or_create(binary_hash=hashkey, defaults={ 'binary': diff }) |
|
|
Same logic can be applied here for the get_or_create. |
|
|
I?think that there should a corresponding py file for that? |
ZH zhenli | |
For consistencies sake, can we call that "kwargs"? |
|
|
Space after the if block |
|
|
Managers go in managers.py. |
|
|
It'd be nice to pull defaults out, so you don't repeat yourself (and also, don't assume defaults is set): defaults … |
|
|
Space before \ |
|
|
Put the comment inside the else: |
|
|
Two blank lines here |
|
|
Imports should be alphabetized. |
|
-
-
reviewboard/diffviewer/models.py (Diff revision 1) These are both built-in Python modules, so no blank line.
-
reviewboard/diffviewer/models.py (Diff revision 1) I'm not really sure "Binary" is quite the term we want. FileDiffData might be better. Then "hash" and "data" fields.
-
reviewboard/diffviewer/models.py (Diff revision 1) Docstrings are in the format of: """One-line summary Option multi-line description. """ It should always be a proper sentence, punctuation and all.
-
reviewboard/diffviewer/models.py (Diff revision 1) Alignment problem. There's no need for \, since the parens take care of it.
-
-
reviewboard/diffviewer/models.py (Diff revision 1) Blank line between these. A comment describing the next block of code should be separated. Also, comments should have proper punctuation (period at the end).
-
reviewboard/diffviewer/models.py (Diff revision 1) This can be: self.diff_hash, is_new = \ FileDiffBinary.objects.get_or_create(binary_hash=hashkey, defaults={ 'binary': diff })
-
reviewboard/diffviewer/models.py (Diff revision 1) Same logic can be applied here for the get_or_create.
-
-
reviewboard/diffviewer/evolutions/__init__.py (Diff revision 1) I think it might be better to call it “add_diff_hash".
-
-
reviewboard/reviews/fixtures/test_reviewrequests.json (Diff revision 1) Looks like, after the first line, the who block is identical, but still marked by ReviewBoard as different. Is it a bug?
-
-
reviewboard/diffviewer/evolutions/__init__.py (Diff revision 2) I?think that there should a corresponding py file for that?
Change Summary:
Removed the AutoKey from FileDiffData and added a custom manager for the Base64Field AutoKey workaround
Diff: |
Revision 4 (+361 -134) |
---|
Change Summary:
Removed the AutoKey from FileDiffData and added a custom manager for the Base64Field AutoKey workaround (Disregard revision 4)
Diff: |
Revision 5 (+168 -91) |
---|
-
David: This looks really good! Just the two tiny nits I found below. Once those are fixed, I'll happily give this my ship-it. Thanks, -Mike
-
reviewboard/diffviewer/models.py (Diff revision 5) For consistencies sake, can we call that "kwargs"?
-
-
Couple small things. Do unit tests still run?
-
-
reviewboard/diffviewer/models.py (Diff revision 6) It'd be nice to pull defaults out, so you don't repeat yourself (and also, don't assume defaults is set): defaults = kwargs.get('defaults', {}) if defaults and defaults['binary']: defaults['binary'] = ...
-
-
Change Summary:
Moved FileDiffDataManager to managers.py, added a unit test for hashes, cleaned up tests.py with pep8.
Diff: |
Revision 7 (+201 -95) |
---|
Testing Done: |
|
---|