Add regex support by lungarella-raffaele · Pull Request #265 · rescript-lang/tree-sitter-rescript

lungarella-raffaele · 2026-01-29T20:15:37Z

Fixes #255

I took inspiration from (in practice, copied) the regex rules in https://github.com/tree-sitter/tree-sitter-javascript. ReScript seems to have the same regex syntax as JS, though I couldn't find anything official to confirm it.

I tested it with the Zed extension and everything works fine. Testing on a file with these contents:

let r = /\\n/g

Tree before:

(source_file [0, 0] - [1, 0]
  (ERROR [0, 0] - [0, 14]
    (value_identifier [0, 4] - [0, 5])
    (ERROR [0, 11] - [0, 12])
    (ERROR [0, 13] - [0, 14])))

Tree after:

(source_file [0, 0] - [1, 0]
  (let_declaration [0, 0] - [0, 14]
    (let_binding [0, 4] - [0, 14]
      pattern: (value_identifier [0, 4] - [0, 5])
      body: (regex [0, 8] - [0, 14]
        pattern: (regex_pattern [0, 9] - [0, 12])
        flags: (regex_flags [0, 13] - [0, 14])))))
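
To make the node ranges above easier to check by hand, here is a minimal sketch of a scanner that splits a regex literal into pattern and flags spans. It is plain JavaScript and not the grammar from this PR; `scanRegexLiteral` is a hypothetical helper, and the escape and character-class handling is an assumption modelled loosely on the tree-sitter-javascript rule.

```javascript
// Hypothetical helper: scan a regex literal starting at src[start].
// Returns half-open [start, end) column ranges, or null if no literal is found.
function scanRegexLiteral(src, start) {
  if (src[start] !== "/") return null;
  let i = start + 1;
  let inClass = false; // inside [...] a "/" does not close the literal
  while (i < src.length) {
    const ch = src[i];
    if (ch === "\n") return null; // literals may not span lines
    if (ch === "\\") {
      // An escape consumes the next character, whatever it is.
      if (i + 1 >= src.length || src[i + 1] === "\n") return null;
      i += 2;
      continue;
    }
    if (ch === "[") inClass = true;
    else if (ch === "]") inClass = false;
    else if (ch === "/" && !inClass) break;
    i += 1;
  }
  // Unterminated literal, or an empty pattern ("//" starts a comment).
  if (i >= src.length || i === start + 1) return null;
  const patternEnd = i;
  let j = i + 1;
  while (j < src.length && /[a-z]/.test(src[j])) j += 1; // permissive flags
  return {
    regex: [start, j],
    pattern: [start + 1, patternEnd],
    flags: j > i + 1 ? [i + 1, j] : null,
  };
}

// String.raw keeps both backslashes, so this is the 14-character test file.
const src = String.raw`let r = /\\n/g`;
console.log(scanRegexLiteral(src, 8));
```

For the test file this yields the same columns as the tree above: 8 to 14 for the literal, 9 to 12 for the pattern, and 13 to 14 for the flags.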

Copilot

Pull request overview

This PR adds support for regex literal syntax (/pattern/flags) to the ReScript tree-sitter grammar, fixing issue #255 where ReScript 12's new regex literal format was not being parsed correctly. The implementation is based on the tree-sitter-javascript approach and replaces the need for the older %re() syntax.

Changes:

- Added regex grammar rules with pattern and flags support to grammar.js
- Updated syntax highlighting and language injection queries for regex literals
- Added comprehensive test cases covering regex literals in various contexts

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| grammar.js | Defines regex grammar rules (regex, regex_pattern, regex_flags) and integrates them into expression and literal pattern rules |
| src/grammar.json | Generated grammar definition reflecting the new regex rules |
| src/node-types.json | Generated node type definitions including regex, regex_pattern, and regex_flags |
| queries/highlights.scm | Adds syntax highlighting for regex literals as string.special |
| queries/injections.scm | Adds regex language injection for regex patterns and removes trailing whitespace |
| test/corpus/literals.txt | Adds comprehensive test cases for regex literals including basic patterns, flags, let bindings, and function arguments |

Copilot · 2026-02-18T06:55:56Z

test/corpus/literals.txt

+            (regex_pattern)
+            (regex_flags))
+          (string
+            (escape_sequence)))))))


Consider adding a test case that combines regex literals with division operators to explicitly verify disambiguation. For example:

let result = (10 / 2) + /test/g

This would help ensure the parser correctly distinguishes between division operators and regex delimiters in complex expressions, especially since both use the forward slash character. While the implementation should handle this correctly due to the use of token.immediate and precedence, an explicit test would provide confidence that this edge case is covered.

Suggested change

-            (escape_sequence)))))))
+            (escape_sequence)))))))
+================================================================================
+Regex literal combined with division operator
+================================================================================
+let result = (10 / 2) + /test/g
+--------------------------------------------------------------------------------
+(source_file
+  (let_declaration
+    (let_binding
+      (value_identifier)
+      (binary_expression
+        (binary_expression
+          (integer)
+          (operator)
+          (integer))
+        (operator)
+        (regex
+          (regex_pattern)
+          (regex_flags))))))
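
For readers unfamiliar with the ambiguity this test targets, the sketch below shows the classic hand-written-lexer heuristic: a `/` that follows a value is division, otherwise it opens a regex. This is an illustration only and not how the generated parser works; tree-sitter lexes on demand and only tries tokens that are valid in the current parse state. `classifySlashes` is a hypothetical helper, and it ignores escapes, character classes, comments, and keywords.

```javascript
// Hypothetical helper: label every "/" in src as "division" or "regex".
function classifySlashes(src) {
  const results = [];
  let prev = null; // kind of the previous significant token
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (/\s/.test(ch)) {
      i += 1;
      continue;
    }
    if (/[0-9A-Za-z_]/.test(ch)) {
      // Numbers and identifiers (keywords too, a simplification) are values.
      while (i < src.length && /[0-9A-Za-z_]/.test(src[i])) i += 1;
      prev = "value";
      continue;
    }
    if (ch === ")" || ch === "]") {
      prev = "value";
      i += 1;
      continue;
    }
    if (ch === "/") {
      if (prev === "value") {
        results.push({ index: i, kind: "division" });
        prev = "operator";
        i += 1;
      } else {
        results.push({ index: i, kind: "regex" });
        // Simplified: jump to the next "/" without handling escapes.
        const close = src.indexOf("/", i + 1);
        if (close === -1) break;
        i = close + 1;
        while (i < src.length && /[a-z]/.test(src[i])) i += 1; // flags
        prev = "value";
      }
      continue;
    }
    prev = "operator"; // any other punctuation
    i += 1;
  }
  return results;
}

console.log(classifySlashes("let result = (10 / 2) + /test/g"));
```

On the suggested test input, the slash at column 17 is classified as division and the one at column 24 as the start of a regex, which is the split the expected tree encodes.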

Copilot · 2026-02-18T06:55:57Z

grammar.js

+        ),
+      ),
+
+    regex_flags: (_) => token.immediate(/[a-z]+/),


The regex_flags pattern allows any sequence of lowercase letters (pattern: /[a-z]+/). While JavaScript/ReScript officially supports only specific flags (g, i, m, s, u, y, d), this lenient validation allows invalid flags like "xyz" to parse without error. Consider whether stricter validation would be beneficial, such as:

regex_flags: (_) => token.immediate(/[gimsuyd]+/)

However, the current implementation may be intentionally permissive for forward compatibility with potential new flags. If this is the intended behavior, it's acceptable, but it's worth documenting this design decision.

Suggested change

-    regex_flags: (_) => token.immediate(/[a-z]+/),
+    // Only allow officially supported JS/ReScript regex flags: g, i, m, s, u, y, d
+    regex_flags: (_) => token.immediate(/[gimsuyd]+/),
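
To weigh the trade-off, the sketch below compares the two character classes on a few flag strings and also asks the JavaScript engine what it accepts. It is plain JavaScript, not grammar code, and `checkFlags` and `engineAccepts` are hypothetical helpers. Two caveats: JavaScript engines also accept a `v` flag since ES2024, which `[a-z]` admits and `[gimsuyd]` rejects, and whether ReScript accepts exactly the same set is an assumption not confirmed in this thread.

```javascript
// Anchored versions of the two candidate token patterns.
const PERMISSIVE_FLAGS = /^[a-z]+$/;
const STRICT_FLAGS = /^[gimsuyd]+$/;

// Hypothetical helper: which token pattern would match this flag string?
function checkFlags(flags) {
  return {
    permissive: PERMISSIVE_FLAGS.test(flags),
    strict: STRICT_FLAGS.test(flags),
  };
}

// Hypothetical helper: does the running JS engine accept these flags?
function engineAccepts(flags) {
  try {
    new RegExp("", flags);
    return true;
  } catch (err) {
    return false; // unknown or repeated flags throw a SyntaxError
  }
}

for (const flags of ["gi", "xyz", "gg"]) {
  console.log(flags, checkFlags(flags), engineAccepts(flags));
}
```

Even the stricter class lets a repeated flag such as `gg` through, so neither token pattern amounts to full validation; that arguably belongs in a later compiler stage rather than in the grammar.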

Add regex parsing

2c94206

lungarella-raffaele changed the title ~~Add regex parsing~~ Add regex support Jan 29, 2026

shulhi requested a review from aspeddro February 12, 2026 00:55

nojaf requested a review from Copilot February 18, 2026 06:50

Copilot started reviewing on behalf of nojaf February 18, 2026 06:50 View session

Copilot AI reviewed Feb 18, 2026

lungarella-raffaele wants to merge 1 commit into rescript-lang:main from lungarella-raffaele:feat-regex-grammar
