Tree Sitter: Add Lua pattern converter.

Review Request #14516 — Created July 26, 2025 and updated

Information

Review Board
master

Reviewers

For syntax highlighting with Tree-sitter, we're going to ship many of
the highlights files from the nvim-treesitter package. These files are
often much better than the ones shipped with the grammars themselves,
but they have a few quirks that need to be worked around.

The most complicated of these is that Neovim's query implementation has
custom lua-match? and not-lua-match? predicates that allow queries to
use Lua pattern matching.

I initially implemented a custom predicate that used the luapatt
library to handle these. That works, but for extensive highlighting it
requires a huge number of function calls from the C code in
py-tree-sitter into the predicate callable, which adds significant
overhead.

This change adds a function that can translate (most) Lua patterns to
Python regexes. To my knowledge, only two features of Lua patterns are
not supported: the non-greedy - quantifier (the lazy counterpart of *)
and the balanced-delimiter %b item. Neither is used in any of the
query files that we need.
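
As a rough illustration of the kind of translation described above (this
is not the code in this change), a minimal sketch might look like the
following. The name lua_pattern_to_regex and its helpers are
hypothetical, and only a subset of Lua character classes is covered. The
sketch rejects %b and the lazy - quantifier to mirror the limitations
listed above. Rejecting %f, back-references, and complemented classes
inside sets is a simplification of the sketch only.

```python
import re

# Lua classes mapped to regex set contents (ASCII only; an assumption).
_CLASSES = {
    "a": "A-Za-z", "d": "0-9", "l": "a-z", "u": "A-Z",
    "w": "A-Za-z0-9", "x": "0-9A-Fa-f", "s": r" \t\n\r\f\v",
}


def _class_body(letter):
    """Return regex set contents for a lowercase Lua class letter."""
    try:
        return _CLASSES[letter]
    except KeyError:
        raise ValueError(f"unsupported Lua class %{letter}") from None


def _escape_item(pattern, i, in_set):
    """Translate the %-escape at pattern[i]; return (regex, next_index)."""
    if i + 1 >= len(pattern):
        raise ValueError("pattern ends with a lone %")
    ch = pattern[i + 1]
    if not ch.isalnum():
        return re.escape(ch), i + 2  # %. %( %% and friends are literals
    if ch in "bf" or ch.isdigit():
        raise ValueError(f"%{ch} is not handled by this sketch")
    if ch.islower():
        body = _class_body(ch)
        return (body if in_set else f"[{body}]"), i + 2
    if in_set:
        raise ValueError(f"%{ch} inside a set is not handled by this sketch")
    return f"[^{_class_body(ch.lower())}]", i + 2  # %D, %S, ... complements


def _set_item(pattern, i):
    """Translate the [set] starting at pattern[i]; return (regex, next_index)."""
    out = ["["]
    i += 1
    if i < len(pattern) and pattern[i] == "^":
        out.append("^")
        i += 1
    first = True  # in Lua, a ']' directly after '[' or '[^' is a literal
    while i < len(pattern) and (first or pattern[i] != "]"):
        first = False
        if pattern[i] == "%":
            piece, i = _escape_item(pattern, i, in_set=True)
        elif pattern[i] in "\\[]^":
            piece, i = "\\" + pattern[i], i + 1
        else:
            piece, i = pattern[i], i + 1  # plain char, including range '-'
        out.append(piece)
    if i >= len(pattern):
        raise ValueError("unterminated [set]")
    return "".join(out) + "]", i + 1


def lua_pattern_to_regex(pattern):
    """Translate a (restricted) Lua pattern into a Python regex string."""
    out = []
    i, n = 0, len(pattern)
    if pattern.startswith("^"):
        out.append(r"\A")  # a leading ^ anchors at the start of the subject
        i = 1
    while i < n:
        ch = pattern[i]
        if ch == "$" and i == n - 1:
            out.append(r"\Z")  # a trailing $ anchors at the very end
            i += 1
            continue
        if ch in "()":
            out.append(ch)  # capture groups carry over unchanged
            i += 1
            continue
        if ch == "%":
            item, i = _escape_item(pattern, i, in_set=False)
        elif ch == "[":
            item, i = _set_item(pattern, i)
        elif ch == ".":
            item, i = "(?s:.)", i + 1  # Lua's '.' also matches newlines
        else:
            item, i = re.escape(ch), i + 1
        # Quantifiers only apply to the single-character item just parsed.
        if i < n and pattern[i] in "*+?-":
            if pattern[i] == "-":
                raise ValueError("the lazy '-' quantifier is not handled")
            item += pattern[i]
            i += 1
        out.append(item)
    return "".join(out)
```

With this sketch, ^[%l_][%w_]*$ becomes \A[a-z_][A-Za-z0-9_]*\Z, which
could then be substituted into a query in place of the Lua pattern.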

  • Ran unit tests.
  • Used this with processed queries to verify that queries that formerly
    used lua-match? worked correctly with the rewritten regexes.
Summary ID
Tree Sitter: Add Lua pattern converter.
22a8d42a582a199532c434834d3f721200f595f8
david
Review request changed
Change Summary:
  • Pass tuples for pytest parametrize arg names.
  • Remove some duplicate test cases.
Commits:
Summary ID
Tree Sitter: Add Lua pattern converter.
3f126b9a7c71614ba2ef634a9a05b2a3a98e4500
Tree Sitter: Add Lua pattern converter.
22a8d42a582a199532c434834d3f721200f595f8

Checks run (2 succeeded)

flake8 passed.
JSHint passed.