Add support in the Trojan Source Checker to disable confusable checks.

Review Request #12784 — Created Jan. 11, 2023 and submitted — Latest diff uploaded

Information

Review Board
release-5.0.x

This updates the Trojan Source code safety checker to take arguments to
turn off Unicode confusable checks as a whole, or to disable them for
specific Unicode aliases.

To allow this to work, the build-confusables.py script now provides a
mapping of confusable alias names to numeric indexes, and a reverse of
that map. The numeric indexes are then referenced in the confusable
character map. The checker can use this to quickly determine which
alias a particular character belongs to, and to skip that character if
the alias was excluded in the call.
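
As a rough illustration of the lookup described above, here is a minimal
sketch. The names (ALIASES, ALIAS_TO_INDEX, CONFUSABLES, find_confusables)
and the sample data are hypothetical and are not taken from the change
itself. They only assume a generated alias list, its reverse map, and a
character map that stores alias indexes.

```python
# Hypothetical generated data, shaped like the maps described above.
ALIASES = ['CYRILLIC', 'GREEK']  # index -> alias name
ALIAS_TO_INDEX = {name: i for i, name in enumerate(ALIASES)}  # alias -> index

# Confusable character -> (alias index, the common character it resembles).
CONFUSABLES = {
    '\u0430': (0, 'a'),  # CYRILLIC SMALL LETTER A
    '\u03bf': (1, 'o'),  # GREEK SMALL LETTER OMICRON
}


def find_confusables(text, check_confusables=True, exclude_aliases=()):
    """Return (position, char, alias, lookalike) for each confusable found."""
    if not check_confusables:
        # Confusable checks are turned off as a whole.
        return []

    # Resolve excluded alias names to indexes once, so that the per-character
    # test below is a cheap integer set lookup.
    excluded = {
        ALIAS_TO_INDEX[name]
        for name in exclude_aliases
        if name in ALIAS_TO_INDEX
    }

    results = []

    for pos, ch in enumerate(text):
        entry = CONFUSABLES.get(ch)

        if entry is None:
            continue

        alias_index, lookalike = entry

        if alias_index in excluded:
            continue

        results.append((pos, ch, ALIASES[alias_index], lookalike))

    return results
```

With this sketch, find_confusables('p\u0430ss') reports the Cyrillic
character at position 1, while passing exclude_aliases=['CYRILLIC'] or
check_confusables=False yields an empty list.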

There's no user-level customization at this stage. This is purely an
update to the safety checker code. Additional work is still needed to
allow customization of these settings and to provide them at all the
right stages of diff generation.

This also fixes a crash, encountered when posting this change, that
occurred when Python failed to identify the name of a Unicode
character.
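
For context, Python's unicodedata.name() raises ValueError when a
character has no name, unless a default is passed. The sketch below shows
one way to guard against that. The helper name safe_char_name and the
"U+XXXX" fallback are assumptions for illustration, not necessarily what
this change does.

```python
import unicodedata


def safe_char_name(ch):
    """Return the Unicode name for ch, or a U+XXXX label if it has none."""
    # Passing a default avoids the ValueError raised for unnamed characters
    # (for example, control characters such as NUL).
    name = unicodedata.name(ch, None)

    if name is None:
        return 'U+%04X' % ord(ch)

    return name
```

Here, safe_char_name('\u0430') returns 'CYRILLIC SMALL LETTER A', and
safe_char_name('\x00') returns 'U+0000' instead of raising.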

Unit tests pass on all supported versions of Python.

Tested this along with in-progress changes to allow customization of
which confusables are checked.
