• 
      

    Add a code safety checker for some trojan source code attacks.

    Review Request #11906 — Created Jan. 4, 2022 and submitted

    Information

    Review Board
    release-5.0.x

    Reviewers

    This code safety checker looks for zero-width spaces and bi-directional
    text in lines of code, flagging them when they appear and putting a
    banner at the top of the diff.

    Certain bi-directional Unicode characters can be used together to make
    malicious code display one way and execute another way. For example,
    code can appear to be inside of a comment, but instead be inside of a
    string, opening up opportunities to circumvent access control checks or
    other logic. This is CVE-2021-42574.

    Similarly, zero-width spaces can make code appear one way and execute
    another way. Languages that allow for Unicode characters in function
    names or variable names may accept zero-width spaces as a legitimate
    character in a name. To reviewers, an identifier with a zero-width space
    would appear the same as an identifier without one. This can cause, for
    instance, state checks to be circumvented.

    If either issue is found in code, the file alert template will show a
    section for the vulnerability with examples, so reviewers know the kind
    of risks that could be hiding in the code. They're also given a link to
    the CVE.

    This does not currently address CVE-2021-42694, which enables attacks
    similar to the zero-width space attach, but through homoglyphs
    (separate Unicode characters that resemble another character, like an
    "H"). That will be tackled separately.

    Note that there are legitimate situations in which these characters may
    appear. Depending on user feedback, we may want to offer options for
    disabling these checks. At the moment, we're following what other tools
    are doing and unconditionally checking the code.

    Unit tests pass on Python 2 and 3.

    Tested with a wide collection of trojan source files available at
    https://github.com/nickboucher/trojan-source/

    Summary ID
    Add a code safety checker for some trojan source code attacks.
    This code safety checker looks for zero-width spaces and bi-directional text in lines of code, flagging them when they appear and putting a banner at the top of the diff. Certain bi-directional Unicode characters can be used together to make malicious code display one way and execute another way. For example, code can appear to be inside of a comment, but instead be inside of a string, opening up opportunities to circumvent access control checks or other logic. This is CVE-2021-42574. Similarly, zero-width spaces can make code appear one way and execute another way. Languages that allow for Unicode characters in function names or variable names may accept zero-width spaces as a legitimate character in a name. To reviewers, an identifier with a zero-width space would appear the same as an identifier without one. This can cause, for instance, state checks to be circumvented. If either issue is found in code, the file alert template will show a section for the vulnerability with examples, so reviewers know the kind of risks that could be hiding in the code. They're also given a link to the CVE. This does not currently address CVE-2021-42694, which enables attacks similar to the zero-width space attach, but through homoglyphs (separate Unicode characters that resemble another character, like an "H"). That will be tackled separately. Note that there are legitimate situations in which these characters may appear. Depending on user feedback, we may want to offer options for disabling these checks. At the moment, we're following what other tools are doing and unconditionally checking the code.
    dbaa4d5f211b92091b1c1fe0692d579199e8eb39

    Description From Last Updated

    E501 line too long (80 > 79 characters)

    reviewbotreviewbot

    E501 line too long (80 > 79 characters)

    reviewbotreviewbot

    F401 're' imported but unused

    reviewbotreviewbot

    F841 local variable 'checks_map' is assigned to but never used

    reviewbotreviewbot

    F841 local variable 'checks_map' is assigned to but never used

    reviewbotreviewbot

    Shouldn't this be switched to SafeString as well?

    maubinmaubin

    Does this need to be back-ticked? Also should add a type (list of str)

    maubinmaubin
    Checks run (1 failed, 1 succeeded)
    flake8 failed.
    JSHint passed.

    flake8

    chipx86
    chipx86
    chipx86
    Review request changed
    Change Summary:
    • Reworked the Unicode scanning to use a more flexible loop for iteration and line changing, in preparation for confusables support.
    • Shortened the result IDs, to reduce cache space.
    • Removed the hard-coded Unicode titles, in favor of unicodedata.name().
    • Reordered some unit test functions.
    Commits:
    Summary ID
    Add a code safety checker for some trojan source code attacks.
    This code safety checker looks for zero-width spaces and bi-directional text in lines of code, flagging them when they appear and putting a banner at the top of the diff. Certain bi-directional Unicode characters can be used together to make malicious code display one way and execute another way. For example, code can appear to be inside of a comment, but instead be inside of a string, opening up opportunities to circumvent access control checks or other logic. This is CVE-2021-42574. Similarly, zero-width spaces can make code appear one way and execute another way. Languages that allow for Unicode characters in function names or variable names may accept zero-width spaces as a legitimate character in a name. To reviewers, an identifier with a zero-width space would appear the same as an identifier without one. This can cause, for instance, state checks to be circumvented. If either issue is found in code, the file alert template will show a section for the vulnerability with examples, so reviewers know the kind of risks that could be hiding in the code. They're also given a link to the CVE. This does not currently address CVE-2021-42694, which enables attacks similar to the zero-width space attach, but through homoglyphs (separate Unicode characters that resemble another character, like an "H"). That will be tackled separately. Note that there are legitimate situations in which these characters may appear. Depending on user feedback, we may want to offer options for disabling these checks. At the moment, we're following what other tools are doing and unconditionally checking the code.
    11a4139d174c7bb75ded5a2fcf523bd76de41c08
    Add a code safety checker for some trojan source code attacks.
    This code safety checker looks for zero-width spaces and bi-directional text in lines of code, flagging them when they appear and putting a banner at the top of the diff. Certain bi-directional Unicode characters can be used together to make malicious code display one way and execute another way. For example, code can appear to be inside of a comment, but instead be inside of a string, opening up opportunities to circumvent access control checks or other logic. This is CVE-2021-42574. Similarly, zero-width spaces can make code appear one way and execute another way. Languages that allow for Unicode characters in function names or variable names may accept zero-width spaces as a legitimate character in a name. To reviewers, an identifier with a zero-width space would appear the same as an identifier without one. This can cause, for instance, state checks to be circumvented. If either issue is found in code, the file alert template will show a section for the vulnerability with examples, so reviewers know the kind of risks that could be hiding in the code. They're also given a link to the CVE. This does not currently address CVE-2021-42694, which enables attacks similar to the zero-width space attach, but through homoglyphs (separate Unicode characters that resemble another character, like an "H"). That will be tackled separately. Note that there are legitimate situations in which these characters may appear. Depending on user feedback, we may want to offer options for disabling these checks. At the moment, we're following what other tools are doing and unconditionally checking the code.
    5105919086bcf387da8c551632147b954d92a44d

    Checks run (1 failed, 1 succeeded)

    flake8 failed.
    JSHint passed.

    flake8

    chipx86
    david
    1. Ship It!
    2. 
        
    chipx86
    david
    1. Ship It!
    2. 
        
    maubin
    1. Ship It!

    2. 
        
    maubin
    1. 
        
    2. Show all issues

      Shouldn't this be switched to SafeString as well?

    3. 
        
    chipx86
    maubin
    1. 
        
    2. Show all issues

      Does this need to be back-ticked? Also should add a type (list of str)

      1. It does in this form. I'll update it to use the newer Keys: section.

    3. 
        
    chipx86
    maubin
    1. Ship It!
    2. 
        
    david
    1. Ship It!
    2. 
        
    chipx86
    Review request changed
    Status:
    Completed
    Change Summary:
    Pushed to release-5.0.x (3823d88)