Added cap to filename header
Review Request #8646 — Created Jan. 20, 2017 and submitted
Clients occasionally dump a ton of data in the filename e-mail headers, which many SMTP servers reject.
To combat this, I have added a limit to the number of characters (8192 characters) that these filename headers may contain. Once this limit is reached, additional filename headers are not added.
Sent e-mail through gmail servers; cap worked correctly.
Unit test was added that checks for correct logger warning, as well as testing the correct overflow behavior.
Description | From | Last Updated |
---|---|---|
Does this have a bug number? |
david | |
Please flesh out your description to include more info on what the bug was and how you fixed it. See … |
david | |
We're also going to need a unit test that tests overflowing this header and checking for warnings. One of the … |
chipx86 | |
Your summary should mention that it is an e-mail header because HTTP also has headers. |
brennie | |
'formataddr' imported but unused |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 E101 indentation contains mixed spaces and tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 80 E501 line too long (83 > 79 characters) |
reviewbot | |
undefined name 'counter' |
reviewbot | |
local variable 'counter' is assigned to but never used |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W191 indentation contains tabs |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 80 E501 line too long (86 > 79 characters) |
reviewbot | |
Col: 29 E701 multiple statements on one line (colon) |
reviewbot | |
Col: 13 E701 multiple statements on one line (colon) |
reviewbot | |
Col: 9 E122 continuation line missing indentation or outdented |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 80 E501 line too long (86 > 79 characters) |
reviewbot | |
Col: 29 E701 multiple statements on one line (colon) |
reviewbot | |
Col: 13 E701 multiple statements on one line (colon) |
reviewbot | |
Col: 9 E122 continuation line missing indentation or outdented |
reviewbot | |
It looks like your editor created this. Please remove it from your change. |
david | |
'formataddr' imported but unused |
reviewbot | |
Please put a blank line between these. |
david | |
Please add a comment explaining why we do this complex stuff. |
david | |
We should iterate this as for key, value in six.iteritems(headers): |
david | |
These lines could all be combined. |
david | |
Please put a blank line between these. |
david | |
We should have a comment explaining what this "2" is for. We should also include some extra padding in here … |
david | |
It might be nice to pull out 'X-ReviewBoard-Diff-For' into a constant string. We can also compute the length of it … |
david | |
Please put a blank line between these. |
david | |
It might be nice to pull this out into a constant. |
david | |
This could be current_header_length = newlength, since we already did that same computation. |
david | |
After a block, please put a newline. Can we add a log statement to the break case to record that … |
david | |
We only want 2 blank lines between sections at the top level of a file. |
chipx86 | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 5 E266 too many leading '#' for block comment |
reviewbot | |
Col: 5 E303 too many blank lines (3) |
reviewbot | |
Col: 80 E501 line too long (123 > 79 characters) |
reviewbot | |
Only one # for these. Make sure to wrap at <= 79 characters. |
chipx86 | |
Col: 5 E266 too many leading '#' for block comment |
reviewbot | |
Col: 80 E501 line too long (115 > 79 characters) |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
I suggest storing this as a constant at the top-level of the module. Constants are also generally uppercase. This one … |
chipx86 | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 9 E266 too many leading '#' for block comment |
reviewbot | |
Col: 80 E501 line too long (106 > 79 characters) |
reviewbot | |
\r\n, not /r/n. |
chipx86 | |
Col: 9 E266 too many leading '#' for block comment |
reviewbot | |
So-called "magic numbers" (the 6) should always have a constant. |
chipx86 | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 5 E303 too many blank lines (2) |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 9 E122 continuation line missing indentation or outdented |
reviewbot | |
No blank line here. |
chipx86 | |
Col: 56 E701 multiple statements on one line (colon) |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
No blank line. |
chipx86 | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 9 E122 continuation line missing indentation or outdented |
reviewbot | |
Col: 13 E701 multiple statements on one line (colon) |
reviewbot | |
Col: 80 E501 line too long (103 > 79 characters) |
reviewbot | |
Make sure to test this. There's no logger. You also need to provide more information in the log message. As … |
chipx86 | |
Col: 80 E501 line too long (120 > 79 characters) |
reviewbot | |
Not sure we want to put a header in here for this? I don't know whether Warning is a standard … |
chipx86 | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
'formataddr' imported but unused |
reviewbot | |
Can you document these? Refer to https://www.notion.so/reviewboard/Writing-Python-Docs-845c696271194d85b69503246db1d0e6#574a663ebae8453b8b014bd9ad841ce3 |
chipx86 | |
Comments should wrap at around character 79. |
chipx86 | |
No blank line here. |
chipx86 | |
No parens needed. |
chipx86 | |
This should also explicitly say that this is when sending e-mail. Something like: "Unable to store all filenames in X-ReviewBoard-Diff-For … |
chipx86 | |
'formataddr' imported but unused |
reviewbot | |
Almost there, but not quite right. This should be using the #: syntax as shown in my link, and there … |
chipx86 | |
I have a couple comments in this code block, but they're more to help learn about our code style. I … |
chipx86 | |
You can fit more per line. |
chipx86 | |
These should be the same statement. If you need to wrap lines, use parens, like: current_header_length += (len(key) + len(value) … |
chipx86 | |
I would actually reverse this. Try: if current_header_length >= DIFF_FOR_HEADER_MAX_LENGTH: logging... break headers.appendlist(...) |
chipx86 | |
Two important things here: 1) This isn't an error, it's a warning. We should use logging.warning. 2) """ should never … |
chipx86 | |
You don't need to (and shouldn't do) str(...). The %s will take care of it automatically. |
chipx86 | |
Col: 72 W291 trailing whitespace |
reviewbot | |
Col: 68 E226 missing whitespace around arithmetic operator |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 70 W291 trailing whitespace |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Blank line between these two. Can we also call the constant MAX_FILENAME_HEADERS_LENGTH? |
david | |
This should just be a regular string, not a raw string (no 'r' prefix). It should also use single quotes. |
david | |
typos: "headrs", "addin". There's also an extra space at the beginning of this line. |
david | |
extra "g" that got chopped off from the previous line. |
david | |
This should wrap on word boundaries rather than breaking words: logging.warning('Unable to store all filenames in ' 'X-ReviewBoard-Diff-For headers when … |
david | |
logging.X() will do a format operation, so these can be passed in as additional arguments rather than doing your own … |
david | |
Add a blank line between these two. |
david | |
local variable 'from_email' is assigned to but never used |
reviewbot | |
Please use single-quotes around 'X' |
david | |
local variable 'filediff' is assigned to but never used |
reviewbot | |
Leftover debug code? |
david | |
Please wrap the string on word boundaries. |
david | |
local variable 'from_email' is assigned to but never used |
reviewbot | |
local variable 'filediff' is assigned to but never used |
reviewbot | |
Col: 33 E128 continuation line under-indented for visual indent |
reviewbot | |
Col: 33 E128 continuation line under-indented for visual indent |
reviewbot | |
The contents of the docstring should be in the doc comments below. No sense splitting it up or repeating it. |
brennie | |
Looks like it got a bit mangled at the end. |
brennie | |
This should not be a raw string. This will be a string that is four characters long: >>> r'\r\n' '\\r\\n' … |
brennie | |
Colon instead of semicolon. |
brennie | |
This will take less lines and be easier to read as: logging.warning( '.... ... ...' '... ... ...', ...) |
brennie | |
Undo this change. |
brennie | |
So we are going to generate a 100 character string 400 times instead of just doing it once. Instead we … |
brennie | |
Can we just rename this filename? |
brennie | |
If we are always adding at least one character and we are truncating to the last 100, we can technically … |
brennie | |
"and" should not be split over two lines. |
brennie | |
Colon instead of semicolon here. |
brennie | |
You shouldn't need a u' prefix. We use from __future__ import unicode_literals at the start of each file, which turns … |
brennie | |
spy.called is equivalent to len(spy.calls) > 0. If spy.called is False the above assertions will raise an exception. This should … |
brennie | |
Since this is a raw string, it will have length 4. Do not use a raw string (i.e., remove the … |
brennie | |
Undo this change. The line is unnecessary but should be fixed in a different patch. |
brennie | |
local variable 'from_email' is assigned to but never used |
reviewbot | |
Col: 26 E703 statement ends with a semicolon |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
So str is not actually the string type for Python 2: unicode is. However, using %-formatting will call the relevant … |
brennie | |
Col: 49 W291 trailing whitespace |
reviewbot | |
Col: 70 W291 trailing whitespace |
reviewbot | |
Col: 77 W291 trailing whitespace |
reviewbot | |
You don't need the backticks in comments because they will never be rendered. |
brennie | |
'formataddr' imported but unused |
reviewbot | |
Col: 33 E128 continuation line under-indented for visual indent |
reviewbot | |
Col: 33 E128 continuation line under-indented for visual indent |
reviewbot | |
Col: 33 E128 continuation line under-indented for visual indent |
reviewbot | |
local variable 'from_email' is assigned to but never used |
reviewbot | |
Col: 1 W293 blank line contains whitespace |
reviewbot | |
Col: 49 W291 trailing whitespace |
reviewbot | |
Col: 70 W291 trailing whitespace |
reviewbot | |
Col: 77 W291 trailing whitespace |
reviewbot | |
This should just be a comment -- it needn't be a doc comment (#:). |
brennie | |
Can you surround X-Reviewboard-Diff-For with double backticks? |
brennie | |
Likewise here? |
brennie | |
And the whole header thing here too. |
brennie | |
local variable 'from_email' is assigned to but never used |
reviewbot | |
This comment isn't entirely correct (we no longer have DIFF_FOR_HEADER_MAX_LENGTH), plus it's duplicating stuff on the individual doc comments for … |
david | |
There's an extra set of parens around the log text that can be removed. |
david | |
local variable 'from_email' is assigned to but never used |
reviewbot | |
Did you run this code? The %ss should have been filled in. |
david | |
You can fit more per line. |
chipx86 | |
Blank line before comments. |
chipx86 | |
The ordering is a bit weird. I'd kinda suggest comparing the entirety of args at once. |
chipx86 |
- Commit:
-
efe3c0fda9132bbf411ab7e613a47b8b8830293da8daf5ca5c4cf17e821edc83c6f71fa83f005619
- Diff:
-
Revision 2 (+23 -2)
-
Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py Ignored Files: .project
-
-
-
-
-
- Commit:
-
a8daf5ca5c4cf17e821edc83c6f71fa83f0056193f255a0d2feb677d3cc87bcb6a4964e3bcfa7596
- Diff:
-
Revision 3 (+22 -2)
-
Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py Ignored Files: .project
-
-
-
-
- Commit:
-
3f255a0d2feb677d3cc87bcb6a4964e3bcfa7596980fb57f1cd0e9b0fbce6c50e9d8a9b8d01effcd
- Diff:
-
Revision 4 (+23 -2)
-
-
-
Please flesh out your description to include more info on what the bug was and how you fixed it. See https://www.reviewboard.org/docs/codebase/dev/writing-good-descriptions/ for a guide.
-
-
-
-
-
-
-
We should have a comment explaining what this "2" is for. We should also include some extra padding in here to accomodate newlines (can you check to see if emails use \r\n or just \n?)
-
It might be nice to pull out 'X-ReviewBoard-Diff-For' into a constant string. We can also compute the length of it once, outside the loop, and then just add the length of the filename inside.
-
-
-
-
After a block, please put a newline.
Can we add a log statement to the
break
case to record that it was clipped? It might be nice to also include a header in this case explaining that the file list is clipped.
- Description:
-
~ Added length limit for filename headers.
~ Clients occasionally dump a ton of data in the filename headers, which many SMTP servers reject.
+ + To combat this, I have added a limit to the number of characters (8192 characters) that these filename headers may contain. Once this limit is reached, additional filename headers are not added.
- Bugs:
-
- Commit:
980fb57f1cd0e9b0fbce6c50e9d8a9b8d01effcda98ae924e1c17d14c0214a0f381fb19b0ebbb17b- Diff:
Revision 5 (+28 -2)
-
-
-
-
I suggest storing this as a constant at the top-level of the module.
Constants are also generally uppercase. This one should be more specific about being for this header. Maybe:
DIFF_FOR_HEADER_MAX_LENGTH
. -
-
-
-
-
Make sure to test this. There's no
logger
.You also need to provide more information in the log message. As it is, if a server has thousands of users using it and keeps getting this, the admin will have no idea what review request or user this is being triggered for.
At a minimum, this should include the review request display ID (
review_request.display_id
). -
Not sure we want to put a header in here for this? I don't know whether
Warning
is a standard or not, but I'm not seeing it.
- Commit:
-
a98ae924e1c17d14c0214a0f381fb19b0ebbb17b8a90085c1f4ec4920c03e4df1d551475b6b22b75
- Diff:
-
Revision 6 (+30 -2)
-
-
Can you document these? Refer to https://www.notion.so/reviewboard/Writing-Python-Docs-845c696271194d85b69503246db1d0e6#574a663ebae8453b8b014bd9ad841ce3
-
-
-
-
This should also explicitly say that this is when sending e-mail. Something like:
"Unable to store all filenames in X-ReviewBoard-Diff-For header when sending e-mail for review request %s: The header size exceeds the limit of %s. Remaining headers have been removed."
We also use format strings, rather than string concatenation. This is faster and allows for localization if we want it (we don't for logs, but if this were an exception or something sent to the UI, we would).
- Groups:
- Commit:
-
8a90085c1f4ec4920c03e4df1d551475b6b22b754847c83bb5e3bedd0dde21dc0062833f4f41a291
- Diff:
-
Revision 7 (+36 -2)
- Commit:
-
4847c83bb5e3bedd0dde21dc0062833f4f41a291c88aeeab008541240b070c95868f9367963c6e9b
- Diff:
-
Revision 8 (+36 -3)
-
Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py Tool: Pyflakes Processed Files: reviewboard/notifications/email.py
-
-
We're also going to need a unit test that tests overflowing this header and checking for warnings. One of the mentors there can guide you on this.
-
Almost there, but not quite right. This should be using the
#:
syntax as shown in my link, and there should be a comment per constant.Also, when you have literals like the "\r\n", you should wrap them in double backticks:
``\r\n``
-
I have a couple comments in this code block, but they're more to help learn about our code style. I have a bigger comment for this whole block, though.
Nuke it.
The reason is that we're not even done with headers by this stage. There's more that will happen further down, more in the mail client, etc. We have the limit set above as 8KB instead of something like 32KB because of the fact that there's going to be other headers put in in places.
So the maximum length and current length here is intended to be solely for this header. We're allowing 8KB for this header alone, leaving 24KB for the rest of the headers. Should be more than enough.
-
-
These should be the same statement. If you need to wrap lines, use parens, like:
current_header_length += (len(key) + len(value) + HEADER_ADDITIONAL_CHARACTERS_LENGTH)
-
I would actually reverse this. Try:
if current_header_length >= DIFF_FOR_HEADER_MAX_LENGTH: logging... break headers.appendlist(...)
-
Two important things here:
1) This isn't an error, it's a warning. We should use
logging.warning
.
2)"""
should never be used for anything but docstrings. This needs to use single quoted strings, formatted like:logging.warning('Unable to store all filenames ..... ' '................................... ' '................')
-
- Commit:
-
c88aeeab008541240b070c95868f9367963c6e9b25f4ffac12c4e368cde9e403d0f020ef3a280a02
- Diff:
-
Revision 9 (+37 -4)
- Commit:
-
25f4ffac12c4e368cde9e403d0f020ef3a280a026196ae691fe4e6b23bfd219b9913c101548584b5
- Diff:
-
Revision 10 (+37 -4)
- Commit:
-
6196ae691fe4e6b23bfd219b9913c101548584b53ffbe3c2bcd387cafdad37d8f81bfb4fc90e0497
- Diff:
-
Revision 11 (+37 -4)
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py
- Summary:
-
Added cap to filename headers. [WIP]Added cap to filename headers
- Testing Done:
-
Sent e-mail through gmail servers; cap worked correctly.
+ + Unit test was added that checks for correct logger warning, as well as testing the correct overflow behavior.
- Commit:
-
3ffbe3c2bcd387cafdad37d8f81bfb4fc90e04979a326f0cfb72f0ab253ccbd0d664f86f5f38e3cc
-
Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
-
-
-
-
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
-
-
-
This should just be a regular string, not a raw string (no 'r' prefix). It should also use single quotes.
-
-
-
This should wrap on word boundaries rather than breaking words:
logging.warning('Unable to store all filenames in ' 'X-ReviewBoard-Diff-For headers when ' 'sending e-mail for review request %s: ' ...
Make sure there's a space at the end of each of those lines so when they're concatenated, it still looks right.
-
logging.X()
will do a format operation, so these can be passed in as additional arguments rather than doing your own format. -
-
-
-
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
- Summary:
-
Added cap to filename headers [WIP]Added cap to filename header
- Commit:
-
aca35b8291dcc1621c5f6784e4f3f91d7f633d1fa3299b3c811cb20e619551356a0d310543d2cc2b
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
-
The contents of the docstring should be in the doc comments below. No sense splitting it up or repeating it.
-
-
This should not be a raw string. This will be a string that is four characters long:
>>> r'\r\n' '\\r\\n' >>> len(r'\r\n') 4
Also, single quotes.
-
-
This will take less lines and be easier to read as:
logging.warning( '.... ... ...' '... ... ...', ...)
-
-
So we are going to generate a 100 character string 400 times instead of just doing it once. Instead we can generate the prefix once and then append the number in the loop:
prefix = 'x' * 1000 for i in range(400): filename = '%s%s' % (prefix, i)
It is also important to note that string concatenation (with
+
) has bad performance. It's not super bad, but if you're doing it 100 times it will take a long time. Using interpolation with%
is much faster in this case. It is better to use%
-formatting in general as we use it almost everywhere. -
-
If we are always adding at least one character and we are truncating to the last 100, we can technically use
'X' * 99
. You may want to document this section to explain why you're truncating the file name. -
-
-
You shouldn't need a
u'
prefix.We use
from __future__ import unicode_literals
at the start of each file, which turns'x'
intou'x'
behind the scenes. -
spy.called
is equivalent tolen(spy.calls) > 0
.If
spy.called
isFalse
the above assertions will raise an exception. This should probably come before those assertions.
- Summary:
-
Added cap to filename headerAdded cap to filename header [WIP]
- Description:
-
~ Clients occasionally dump a ton of data in the filename headers, which many SMTP servers reject.
~ Clients occasionally dump a ton of data in the filename e-mail headers, which many SMTP servers reject.
To combat this, I have added a limit to the number of characters (8192 characters) that these filename headers may contain. Once this limit is reached, additional filename headers are not added.
- Commit:
-
632b30cc1c3d5f6dcee604d7e6aec73092085b770543526136029ca672d67281b19e24a7d6493817
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
-
-
-
-
-
-
Since this is a raw string, it will have length 4. Do not use a raw string (i.e., remove the
r
.).Also use single quotes.
-
-
So
str
is not actually the string type for Python 2:unicode
is. However, using%
-formatting will call the relevant type conversion methods, e.g. when using%s
for anint
it will callunicode()
. (Or you could use%d
-- same difference.TL;DR: you can just do
(prefix, i)
-
-
Tool: Pyflakes Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py Tool: PEP8 Style Checker Processed Files: reviewboard/notifications/email.py reviewboard/notifications/tests.py
-
-
-
-
-
-
-
-
-
- Branch:
-
release-2.5.xmaster
- Commit:
-
d1e71f679b78ed4ff0ca5d64526e9fd6b082fb237a9e403fbfa3457fb144d3b02f30c022a69fb401