    Add support in the Trojan Source Checker to disable confusable checks.

    Review Request #12784 — Created Jan. 11, 2023 and submitted

    Information

    Repository: Review Board
    Branch: release-5.0.x

    This updates the Trojan Source code safety checker to take arguments to
    turn off Unicode confusable checks as a whole, or to disable them for
    specific Unicode aliases.

    To allow this to work, the build-confusables.py script now provides a
    mapping of confusable alias names to numeric indexes, and a reverse of
    that map. The numeric indexes are then referenced in the confusable
    character map. The checker can use this to quickly determine which
    alias a particular character belongs to, and to skip the check for that
    character if the alias was passed in the call.
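
    As a rough illustration of the lookup described above, here is a minimal
    sketch. The argument names check_confusables and
    confusable_aliases_allowed are taken from the review comments on this
    request; the table names, the table contents, and the find_confusables()
    function are hypothetical stand-ins, and the generated
    _unicode_confusables.py module may be laid out differently.

```python
# Hypothetical tables, shaped like the description: alias names map to
# numeric indexes, and a reverse list maps indexes back to names.
ALIASES = ['CYRILLIC', 'GREEK']                              # index -> alias
ALIAS_INDEXES = {name: i for i, name in enumerate(ALIASES)}  # alias -> index

# Confusable character -> (alias index, common character it resembles).
CONFUSABLES = {
    '\u0430': (0, 'a'),  # CYRILLIC SMALL LETTER A
    '\u03bf': (1, 'o'),  # GREEK SMALL LETTER OMICRON
}


def find_confusables(text, check_confusables=True,
                     confusable_aliases_allowed=()):
    """Return (offset, char, alias) for each flagged confusable character."""
    if not check_confusables:
        # Confusable checks are turned off as a whole.
        return []

    # Resolve the allowed alias names to indexes once, so the per-character
    # test is a cheap set lookup. Unknown alias names are ignored.
    allowed = {
        ALIAS_INDEXES[name]
        for name in confusable_aliases_allowed
        if name in ALIAS_INDEXES
    }

    results = []

    for offset, ch in enumerate(text):
        entry = CONFUSABLES.get(ch)

        if entry is not None and entry[0] not in allowed:
            results.append((offset, ch, ALIASES[entry[0]]))

    return results
```

    Storing a small integer per character rather than the alias string keeps
    the generated map compact, and the reverse list is only needed when
    reporting which alias a flagged character belongs to.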

    There's no user-level customization at this stage. This is purely an
    update to the safety checker code. Additional work is still needed to
    allow customization of these settings and to provide it at all the right
    stages of diff generation.

    There's also a fix for a bug encountered when posting this change, which
    caused a crash when Python failed to identify the name of a Unicode
    character.
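
    For context on that crash: unicodedata.name() raises ValueError when a
    character has no name, unless a default is supplied. The sketch below
    shows one way to guard against it. The safe_char_name() helper and the
    'UNKNOWN' fallback are assumptions for illustration, not necessarily how
    this change fixes it.

```python
import unicodedata


def safe_char_name(ch, fallback='UNKNOWN'):
    """Return the Unicode name of ch, or fallback when Python has none."""
    # Without the second argument, unicodedata.name() raises ValueError for
    # characters that have no name (control characters, for example).
    return unicodedata.name(ch, fallback)
```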

    Unit tests pass on all supported versions of Python.

    Tested this along with in-progress changes to allow customization of
    which confusables are checked.

    Summary ID
    Add support in the Trojan Source Checker to disable confusable checks.
    58190502fe5bbf9b0ac27c58a298e99f948923b6
    david

      Looks like there's a bug displaying the diff for reviewboard/codesafety/_unicode_confusables.py

      1. Yep, I mentioned that in the description. The fix is included in this change.

        It's just generated output. Nothing that really needs to be reviewed itself.

      contrib/internal/build-confusables.py (Diff revision 1)

      < 3.9

    maubin

      reviewboard/codesafety/checkers/trojan_source.py (Diff revision 1)

      Need to add check_confusables and confusable_aliases_allowed to the args section.

      Should probably add check_confusables and confusable_aliases_allowed to the args section here too.

    chipx86

    maubin
      Ship It!

    david
      Ship It!

    chipx86
    Review request changed
    Status: Completed
    Change Summary: Pushed to release-5.0.x (ed18f57)