flake8
-
bot/reviewbot/tools/credentials_check.py (Diff revision 1) Show all issues -
-
-
-
-
Review Request #10223 — Created Oct. 11, 2018 and discarded
Information | |
---|---|
ammar | |
ReviewBot | |
master | |
Reviewers | |
reviewbot, students | |
Previously, Review Bot did not check for credentials that may have been
accidentally included in the commit. A human reviewer would have to
look out for them, but we hoped to move more of this task's burden
to Review Bot.A new Credentials Check tool has been added to Review Bot which
looks for various key files, other sensitive files and inline embedded
AWS credentials to make sure these are not pushed to the repository.
Manual tests (correctly finds and creates issues on lines with the
credentials, or marks the first line of a file type (including
file types specified from options tab) that should not have been
included e.g. .pem files).
Description | From | Last Updated |
---|---|---|
Can you wrap your description and testing done at 72 Chars? |
|
|
Is this a WIP? Your description has "WIP: Add options..." If this is WIP please put it in the summary |
|
|
In your change description: "ReviewBot" -> "Review Bot" (3x) |
|
|
E501 line too long (83 > 79 characters) |
![]() |
|
E501 line too long (84 > 79 characters) |
![]() |
|
E501 line too long (85 > 79 characters) |
![]() |
|
E501 line too long (80 > 79 characters) |
![]() |
|
E501 line too long (84 > 79 characters) |
![]() |
|
E501 line too long (89 > 79 characters) |
![]() |
|
typo: "credntialscheck" |
|
|
typo: "credntialscheck" |
|
|
Missing module-level docstring |
|
|
Module imports should be formatted as: from __future__ import ... # Python STDLib imports # 3rd party imports # Imports … |
|
|
Instead of having multiple credential regexes, we can make this into a single regular expression: compiled_credential_pattern = re.compile( '|'.join( '(%s)' … |
|
|
Single quotes here and throughout |
|
|
Missing trailing comma. This regex doesn't do what you want it to becuase of the leading [, which makes everything … |
|
|
Instead of doing this here, we should do it in CredentialsCheckTool.__init__ so that theyre not sitting here taking up memory … |
|
|
Docstrings should be of the form: """Single line summary. Multi-line description. """ |
|
|
We use the Oxford comma, so there should be a comma after private keys. |
|
|
How about just "Review a single file."? |
|
|
Blank line between these. |
|
|
This will not detect files named id_rsa becuase it is not an extension. You will need a separate set of … |
|
|
We should word this as "may be a security risk" because if its a public key --PEM files can be … |
|
|
Comments should be complete sentences: they should begin with a capital letter and end with a period. |
|
|
What exceptions are we hoping to catch? We should be very specific about what we expect so that other exceptions … |
|
|
Blank line between these. |
|
|
Since pattern is a compiled regular expression, you can just do pattern.match(line) |
|
|
You're going to need to add a entrypoint for your tool. |
|
|
:file:`.pem` :file:`id_rsa` |
|
|
How about: This tool is built into ReviewBot. There is no separate installation step required. |
|
|
Typo: "credntialscheck" |
|
|
One more blank line here. |
|
|
Can you add a docstring here? |
|
|
No blank line here. |
|
|
Can you wrap these in parens to make it clear that this is supposed to be multiple lines? e.g. python … |
|
|
E128 continuation line under-indented for visual indent |
![]() |
|
It looks like these wouldn't catch cases where the value was enclosed in quotes? |
|
|
Let's put one per line and sort them all alphabetically. |
|
|
Should probably be "Including this file ..." (files -> file) |
|
|
We should be able to use regex matching against bytestrings, so we can skip the detection/decoding here. We just need … |
|
|
Do we want to use .search() instead of .match()? |
|
|
I feel like we should be more verbose about what the problem might be (for example, "Potential disclosure of private … |
|
|
A link to this needs to be added to docs/reviewbot/tools/index.rst |
|
|
This reads a little funky. How about "Improper credentials can include things such as AWS keys hardcoded in source or … |
|
|
E124 closing bracket does not match visual indentation |
![]() |
|
E124 closing bracket does not match visual indentation |
![]() |
|
Can you insert this in alphabetical order? |
|
|
Can you insert this in alphabetical order? |
|
|
These regexes won't work in a few cases: Shell scripts with AWS_SECRET_KEY=... i.e., without quotes. Single-quoted string values We can … |
|
|
Single quotes. |
|
|
Single quotes around AWS_SECRET_KEY |
|
|
Single quotes here |
|
|
Trailing space, but it should also have a period. Also, comma after e.g.. |
|
|
Missing args/kwargs |
|
|
https://docs.python.org/2/library/os.path.html#os.path.splitext |
|
|
.iteritems() is Python2 only. You'll want to do: import six # ... for name, pattern in six.iteritems(self.compiled_re): |
|
|
This should line up with the string above, e.g. ('... ' '... '), |
|
|
RBTools depends on six but if we're using it directly, we should have our own dependency on it. (I mention … |
|
|
Trailing whitespace. |
|
|
Can you insert this in alphabetical order? |
|
|
Can you insert this in alphabetical order? |
|
|
six is a third-party library, so it should go in it's own "section": import re import six from reviewbot.tools import … |
|
|
Formatting here could be a little nicer. If you put the parens on their own lines, then the strings will … |
|
|
This should use six.iteritems. Also, this can use a dict comprehension to do it all in one go: super(CredentialsCheckTool, self).__init__() … |
|
|
In this case I think it's probably better to just ignore the line length warning. |
|
|
ReviewBot -> Review Bot |
|
|
ReviewBot -> Review Bot |
|
|
E501 line too long (88 > 79 characters) |
![]() |
|
Should be in alphabetical order? |
|
|
Will this fit as: for risk_name, pattern in six.iteritems( self._compiled_re): # ... |
|
|
E501 line too long (88 > 79 characters) |
![]() |
|
F821 undefined name 'compiled_pattern' |
![]() |
|
E501 line too long (88 > 79 characters) |
![]() |
|
E501 line too long (88 > 79 characters) |
![]() |
|
Add another blank line here. |
|
|
E501 line too long (88 > 79 characters) |
![]() |
bot/reviewbot/tools/credentials_check.py (Diff revision 1) |
---|
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 2 (+86) |
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Module imports should be formatted as:
from __future__ import ... # Python STDLib imports # 3rd party imports # Imports from this package
e.g.
from __future__ import unicode_literals import re import chardet from reviewbot.tools import Tool
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Instead of having multiple credential regexes, we can make this into a single regular expression:
compiled_credential_pattern = re.compile( '|'.join( '(%s)' % pattern for pattern in credential_patterns ) )
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Missing trailing comma.
This regex doesn't do what you want it to becuase of the leading
[
, which makes everything up to]
a character this regex will match.e.g.
>>> import re >>> x = re.compile(r"[AWS_SECRET_KEY\s*=\s*[A-Za-z0-9/+=]{40}") >>> m = x.match('SSSSSSSSSSSS = %s' % 'A' * 40) >>> bool(m) True
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Instead of doing this here, we should do it in
CredentialsCheckTool.__init__
so that theyre not sitting here taking up memory if the tool isn't used. e.g.class CredentialsCheckTool(Tool): def __init__(self, *args, **kwargs): super(CredentialsCheckTool, self).__init__(*args, **kwargs) self.compiled_re = [ re.compile(regex) for regex in credential_patterns ]
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Docstrings should be of the form:
"""Single line summary. Multi-line description. """
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
We use the Oxford comma, so there should be a comma after
private keys
.
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
This will not detect files named
id_rsa
becuase it is not an extension. You will need a separate set of file names (id_rsa
,id_dsa
,id_ecdsa
) versus file extensions (p12
,pem
,ppk
,key
).
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
We should word this as "may be a security risk" because if its a public key --PEM files can be encoded public keys -- it totally isn't.
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Comments should be complete sentences: they should begin with a capital letter and end with a period.
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
I would organize this as:
try: encoding = chardet.detect( f.patched_file_contents)['encoding'] if encoding: contents = f.patched_file_contents.decode( encoding, 'strict') else: # We can't do any more for this file. return except (TypeError, UnicodeError, ValueError): return lines = contents.split('\n') for line_number, line in enumerate(lines, 1): for pattern in compiled_credential_patterns: if pattern.match(line): f.comment(...)
This lets us use explicit control flow (return) when we are done instead of keeping track of boolean state.
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Does
bytes.decode(encoding, 'strict')
work in Python 3?
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
What exceptions are we hoping to catch? We should be very specific about what we expect so that other exceptions that we aren't expecting aren't also captured.
Looking at the source of
chardet.detect
, it looks like it only raisesTypeError
.bytes.decode(..., 'strict')
seems to raiseValueError
andUnicodeError
.
bot/reviewbot/tools/credentials_check.py (Diff revision 2) |
---|
Since
pattern
is a compiled regular expression, you can just dopattern.match(line)
docs/reviewbot/tools/credentialscheck.rst (Diff revision 2) |
---|
How about:
This tool is built into ReviewBot. There is no separate installation step required.
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 3 (+101) |
Is this a WIP? Your description has "WIP: Add options..."
If this is WIP please put it in the summary
bot/setup.py (Diff revision 3) |
---|
Can you wrap these in parens to make it clear that this is supposed to be multiple lines?
e.g.
python ('credentialscheck = ' '...'),
This will help in not accidentally adding a comma there in the future.
Description: |
|
||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Testing Done: |
|
||||||||||||||||||||||||||||||||||||
Commit: |
|
||||||||||||||||||||||||||||||||||||
Diff: |
Revision 4 (+120) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 5 (+120) |
bot/reviewbot/tools/credentials_check.py (Diff revision 5) |
---|
It looks like these wouldn't catch cases where the value was enclosed in quotes?
bot/reviewbot/tools/credentials_check.py (Diff revision 5) |
---|
Let's put one per line and sort them all alphabetically.
bot/reviewbot/tools/credentials_check.py (Diff revision 5) |
---|
Should probably be "Including this file ..." (files -> file)
bot/reviewbot/tools/credentials_check.py (Diff revision 5) |
---|
We should be able to use regex matching against bytestrings, so we can skip the detection/decoding here. We just need to make sure that the patterns use
br'...'
bot/reviewbot/tools/credentials_check.py (Diff revision 5) |
---|
Do we want to use
.search()
instead of.match()
?
bot/reviewbot/tools/credentials_check.py (Diff revision 5) |
---|
I feel like we should be more verbose about what the problem might be (for example, "Potential disclosure of private Amazon AWS keys")
docs/reviewbot/tools/credentialscheck.rst (Diff revision 5) |
---|
A link to this needs to be added to
docs/reviewbot/tools/index.rst
docs/reviewbot/tools/credentialscheck.rst (Diff revision 5) |
---|
This reads a little funky. How about "Improper credentials can include things such as AWS keys hardcoded in source or private key files."
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 6 (+111) |
bot/reviewbot/tools/credentials_check.py (Diff revision 6) |
---|
E124 closing bracket does not match visual indentation
Diff: |
Revision 7 (+111) |
---|
bot/reviewbot/tools/credentials_check.py (Diff revision 7) |
---|
E124 closing bracket does not match visual indentation
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 8 (+112) |
bot/reviewbot/tools/credentials_check.py (Diff revision 8) |
---|
These regexes won't work in a few cases:
- Shell scripts with
AWS_SECRET_KEY=...
i.e., without quotes.- Single-quoted string values
We can get around this with:
{ 'AWS_KEY': br'''(?:AWS_KEY|AWS_ACCESS_KEY|AWS_ACCESS_KEY_ID)\s*=\s*(?P<quote>["']?)[A-Z0-9]{20}(?P=quote)''' }
bot/reviewbot/tools/credentials_check.py (Diff revision 8) |
---|
Trailing space, but it should also have a period.
Also, comma after
e.g.
.
bot/reviewbot/tools/credentials_check.py (Diff revision 8) |
---|
https://docs.python.org/2/library/os.path.html#os.path.splitext
bot/reviewbot/tools/credentials_check.py (Diff revision 8) |
---|
.iteritems()
is Python2 only. You'll want to do:import six # ... for name, pattern in six.iteritems(self.compiled_re):
bot/setup.py (Diff revision 8) |
---|
RBTools depends on
six
but if we're using it directly, we should have our own dependency on it. (I mention usingsix
in another comment).
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 9 (+117) |
bot/reviewbot/tools/credentials_check.py (Diff revision 9) |
---|
six is a third-party library, so it should go in it's own "section":
import re import six from reviewbot.tools import Tool
bot/reviewbot/tools/credentials_check.py (Diff revision 9) |
---|
Formatting here could be a little nicer. If you put the parens on their own lines, then the strings will line up a bit nicer. We also don't need to use the triple quotes (better to just escape any inner
'
characters), but we do need to have thebr
prefix on each line, even though they get concatenated.'AWS_KEY': ( br'...' br'...' ), 'AWS_SECRET_KEY': ( br'...' br'...' ),
bot/reviewbot/tools/credentials_check.py (Diff revision 9) |
---|
This should use six.iteritems. Also, this can use a dict comprehension to do it all in one go:
super(CredentialsCheckTool, self).__init__() self.compiled_re = { name: re.compile(pattern) for name, pattern in six.iteritems(credential_patterns) }
bot/setup.py (Diff revision 9) |
---|
In this case I think it's probably better to just ignore the line length warning.
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 10 (+123) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 11 (+123) |
bot/reviewbot/tools/credentials_check.py (Diff revision 11) |
---|
Will this fit as:
for risk_name, pattern in six.iteritems( self._compiled_re): # ...
Branch: |
|
||||
---|---|---|---|---|---|
Commit: |
|
||||
Diff: |
Revision 12 (+122) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 13 (+122) |
Two tiny nits:
Description: |
|
|||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Commit: |
|
|||||||||||||||||||||||||||||||||
Diff: |
Revision 14 (+123) |