    Tree Sitter: Add Lua pattern converter.

    Review Request #14516 — Created July 26, 2025 and updated

    Information

    Review Board
    master

    Reviewers

    For doing syntax highlighting with tree sitter, we're going to be
    shipping many of the highlights files from the nvim-treesitter package.
    These highlights files are often much, much better than the ones that
    are shipped with the grammars themselves, but they do have a few things
    that need to be worked around.

    The most complicated of these is that neovim has custom `lua-match?` and
    `not-lua-match?` predicates in its query implementation that allow
    queries to use Lua pattern matching.

    I initially implemented my own custom predicate that used the `luapatt`
    library to handle these, but while that works, for extensive
    highlighting, it involves performing huge numbers of function calls from
    the C code in py-tree-sitter to the predicate callable, adding
    significant overhead.

    This change adds a function that can translate (most) Lua patterns to
    Python regexes. To my knowledge, there are only two features of Lua
    patterns which are not supported: the non-greedy `*-` operator, and the
    balanced parentheses `%b` class. Neither of these is used anywhere in
    any of the queries files that we need.
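For readers unfamiliar with the idea, here is a minimal sketch of what such a translation can look like. It is not the code under review: the function name `lua_pattern_to_regex`, the ASCII-only class table, and the decision to raise `ValueError` on anything outside a small subset (including the unescaped `-` quantifier, `%b`, `%f`, and back-references) are all assumptions made for illustration.

```python
import re

# ASCII bodies for a few Lua %-classes, written so they can sit inside a
# Python character set. (Hypothetical table; Lua's real classes are
# locale-dependent, and %p, %c and %g are left out of this sketch.)
_CLASS_BODIES = {
    "a": "A-Za-z",
    "d": "0-9",
    "l": "a-z",
    "u": "A-Z",
    "s": r" \t\n\r\f\v",
    "w": "A-Za-z0-9",
    "x": "A-Fa-f0-9",
}


def lua_pattern_to_regex(pattern: str) -> str:
    """Translate a small subset of Lua pattern syntax to a Python regex."""
    out = []
    in_set = False
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "%":
            i += 1
            if i >= n:
                raise ValueError("pattern ends with a bare '%'")
            nxt = pattern[i]
            body = _CLASS_BODIES.get(nxt.lower())
            if body is not None and nxt.islower():
                # %d -> [0-9]; inside a set only the body is emitted.
                out.append(body if in_set else f"[{body}]")
            elif body is not None and not in_set:
                # Upper-case letter means the complement, e.g. %D.
                out.append(f"[^{body}]")
            elif nxt.isalnum():
                # %b, %f, %1, complements inside sets, unknown classes.
                raise ValueError(f"unsupported Lua pattern item: %{nxt}")
            else:
                # % followed by punctuation is an escaped literal.
                out.append(re.escape(nxt))
        elif in_set:
            if ch == "]":
                in_set = False
                out.append("]")
            elif ch in "\\[":
                out.append("\\" + ch)  # literal in Lua, special in Python
            else:
                out.append(ch)  # includes '-' used as a range
        elif ch == "[":
            in_set = True
            out.append("[")
            if i + 1 < n and pattern[i + 1] == "^":
                out.append("^")
                i += 1
            if i + 1 < n and pattern[i + 1] == "]":
                out.append(r"\]")  # a leading ']' is a literal in Lua
                i += 1
        elif ch == "^" and i == 0:
            out.append(r"\A")  # Lua anchors only at the very start
        elif ch == "$" and i == n - 1:
            out.append(r"\Z")  # and only at the very end
        elif ch == ".":
            out.append("(?s:.)")  # Lua's '.' also matches newlines
        elif ch in "*+?()":
            out.append(ch)  # same meaning in both syntaxes
        elif ch == "-":
            raise ValueError("non-greedy '-' quantifier is out of scope here")
        else:
            out.append(re.escape(ch))  # e.g. '{', '|', '\\' are literals in Lua
        i += 1
    if in_set:
        raise ValueError("unterminated '[' set")
    return "".join(out)


# Hypothetical pattern for an all-caps identifier.
regex = re.compile(lua_pattern_to_regex("^[A-Z][A-Z%d_]*$"))
print(bool(regex.search("MAX_SIZE")))  # True
print(bool(regex.search("max_size")))  # False
```

The point of translating once, ahead of time, is that the resulting regex can be compiled and reused, rather than evaluating a Lua pattern through a Python callable for every candidate node.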

    Testing Done:

    • Ran unit tests.
    • Used this with processed queries to verify that queries that formerly
      used `lua-match?` worked correctly with the rewritten regexes.
    Summary ID
    Tree Sitter: Add Lua pattern converter.
    tttwozowskpntwnxnyuzlxxknpxpnqyl
    david
    david
    Review request changed
    Change Summary:

    Add to coderef.

    Commits:
    Summary ID
    Tree Sitter: Add Lua pattern converter.
    22a8d42a582a199532c434834d3f721200f595f8
    Tree Sitter: Add Lua pattern converter.
    tttwozowskpntwnxnyuzlxxknpxpnqyl

    Checks run (2 succeeded)

    flake8 passed.
    JSHint passed.