Conversation
📝 Walkthrough

This pull request adds Kotlin language support to the code analyzer framework by introducing a new KotlinAnalyzer implementation with Tree-sitter-based parsing, extending the source analyzer to process `.kt` and `.kts` files, refactoring Entity symbol handling to use direct Node storage, and including comprehensive test coverage with sample Kotlin source code.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant SourceAnalyzer
    participant KotlinAnalyzer
    participant TreeSitter
    participant Entity
    participant SymbolResolver
    User->>SourceAnalyzer: analyze_sources(*.kt files)
    SourceAnalyzer->>KotlinAnalyzer: get_entity_types()
    SourceAnalyzer->>TreeSitter: parse(kotlin_source)
    TreeSitter-->>SourceAnalyzer: AST
    loop For each entity node
        SourceAnalyzer->>KotlinAnalyzer: get_entity_label(node)
        SourceAnalyzer->>KotlinAnalyzer: get_entity_name(node)
        SourceAnalyzer->>KotlinAnalyzer: get_entity_docstring(node)
        KotlinAnalyzer-->>SourceAnalyzer: label, name, docstring
        SourceAnalyzer->>Entity: create Entity
    end
    SourceAnalyzer->>KotlinAnalyzer: add_symbols(entity)
    KotlinAnalyzer->>Entity: add_symbol(key, node)
    Entity-->>SourceAnalyzer: symbols attached
    SourceAnalyzer->>SymbolResolver: resolved_symbol(key, resolver_fn)
    SymbolResolver->>KotlinAnalyzer: resolve_type() or resolve_method()
    KotlinAnalyzer-->>SymbolResolver: list[Entity]
    SymbolResolver->>Entity: add_resolved_symbol(key, resolved)
    Entity-->>SourceAnalyzer: resolved_symbols populated
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/analyzers/analyzer.py (1)
136-147: ⚠️ Potential issue | 🟠 Major

Don't narrow `resolve_symbol()` until the caller model changes too.

`api/entities/entity.py` still iterates over the resolver result, and the concrete analyzers still build `list[Entity]` via `resolve_type()`/`resolve_method()`. With this abstract signature set to `Entity`, the contract is internally inconsistent and a literal implementation will raise as soon as the caller tries to iterate it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/analyzer.py` around lines 136 - 147, The abstract method resolve_symbol currently returns Entity but callers (api/entities/entity.py) iterate over its result and concrete analyzers implement resolve_type/resolve_method to return list[Entity], so change the resolve_symbol signature to return an iterable of entities (e.g., Iterable[Entity] or list[Entity]) to match callers; update the abstract method in analyzer.py (resolve_symbol) and any related abstract signatures (resolve_type/resolve_method) to the chosen iterable type so implementations and callers stay consistent until the caller model is intentionally changed.
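The contract the prompt describes can be sketched as follows. This is a hypothetical, simplified version of the project's classes: `Entity`, `AbstractAnalyzer`, and `DemoAnalyzer` here are stand-ins, not the real signatures.

```python
from abc import ABC, abstractmethod

class Entity:
    """Stand-in for the project's Entity type."""
    def __init__(self, name: str) -> None:
        self.name = name

class AbstractAnalyzer(ABC):
    @abstractmethod
    def resolve_symbol(self, key: str, symbol: str) -> list[Entity]:
        """Return every entity the symbol may refer to (possibly empty)."""

class DemoAnalyzer(AbstractAnalyzer):
    def resolve_symbol(self, key: str, symbol: str) -> list[Entity]:
        # Returning a list keeps the caller's iteration working, even
        # when zero or several definitions match the symbol.
        return [Entity(f"{key}:{symbol}")]

# Callers such as Entity.resolved_symbol can safely iterate the result.
for entity in DemoAnalyzer().resolve_symbol("call", "println"):
    print(entity.name)
```

With the annotation narrowed to a single `Entity`, a faithful implementation would return a non-iterable object and the caller's `for` loop would raise `TypeError`, which is exactly the inconsistency the comment flags.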
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/kotlin/analyzer.py`:
- Around line 68-91: The code currently adds the same captured user_type nodes
as both "base_class" and "implement_interface" for class_declaration; instead,
take the first captured user_type as the superclass (base_class) and treat any
remaining captured user_type nodes as interfaces (implement_interface). In
practice, change the logic around the delegation_specifier/type_identifier
captures in the class_declaration branch (the queries that populate
superclass_captures and interface_captures) to collect a single ordered list of
captured user_type/type_identifier nodes, add the first element via
entity.add_symbol("base_class", ...) and add any subsequent elements via
entity.add_symbol("implement_interface", ...); leave the object_declaration
branch emitting only implement_interface as-is.
- Around line 70-112: Several Query objects (superclass_query, interface_query,
query, param_query, return_type_query) are calling .captures() directly which
Tree-sitter 0.25.x doesn't support; replace each call like
superclass_query.captures(entity.node) with the analyzer helper
self._captures(<pattern>, entity.node) (use the original query pattern string
passed to self.language.query for the first argument) so all six query usages
use self._captures(pattern, node) consistently (update the two interface_query
usages, the superclass_query, the call-expression query, the parameter query,
and the return_type_query).
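A hedged sketch of what such a `_captures` helper might look like. `QueryCursor` is the py-tree-sitter 0.25.x API the prompt refers to, but the stub below stands in for it so the call shape is runnable without a grammar installed:

```python
from typing import Any, Callable

def captures(query: Any, node: Any, cursor_factory: Callable[[Any], Any]) -> dict:
    """Route every capture through one place instead of calling
    query.captures() directly (removed from Query in newer releases)."""
    cursor = cursor_factory(query)   # real code: tree_sitter.QueryCursor(query)
    return cursor.captures(node)     # {capture_name: [nodes], ...}

class StubCursor:
    """Minimal stand-in for tree_sitter.QueryCursor."""
    def __init__(self, query: Any) -> None:
        self.query = query
    def captures(self, node: Any) -> dict:
        return {"name": [node]}

result = captures("(identifier) @name", "fake-node", StubCursor)
```

Centralizing the cursor construction in one helper is what lets all six query call sites survive the API change together.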
In `@api/analyzers/source_analyzer.py`:
- Around line 142-149: The current conditional always assigns
NullLanguageServer() to lsps[".kt"] and lsps[".kts"], so Kotlin symbol
resolution remains disabled; update the branch so that when
any(path.rglob('*.kt')) or any(path.rglob('*.kts')) is true you instantiate and
assign the actual Kotlin language server (e.g., KotlinLanguageServer or the
project's kotlin-language-server wrapper) to lsps[".kt"] and lsps[".kts"]
instead of NullLanguageServer(), ensure those keys are started in the context
manager call (the with lsps[".java"].start_server(), ... block), and keep
NullLanguageServer() only in the else branch so AbstractAnalyzer.resolve() and
request_definition() can resolve Kotlin lookups correctly.
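The wiring the prompt asks for could look roughly like this. `KotlinLanguageServer` and the dictionary shape are assumptions for illustration, with no-op stand-ins so the sketch runs on its own:

```python
from contextlib import contextmanager

class NullLanguageServer:
    """No-op fallback: definition lookups resolve to nothing."""
    @contextmanager
    def start_server(self):
        yield self

class KotlinLanguageServer(NullLanguageServer):
    """Stand-in for a real kotlin-language-server wrapper."""

def build_lsps(has_kotlin_sources: bool) -> dict[str, NullLanguageServer]:
    lsps: dict[str, NullLanguageServer] = {}
    # Only spin up the real server when Kotlin sources exist.
    server = KotlinLanguageServer() if has_kotlin_sources else NullLanguageServer()
    lsps[".kt"] = lsps[".kts"] = server  # one shared server for both extensions
    return lsps

lsps = build_lsps(has_kotlin_sources=True)
with lsps[".kt"].start_server():
    pass  # resolution requests would go through the running server here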
- Around line 156-169: The loop is using raw tree-sitter Node IDs (symbol.id)
when calling graph.connect_entities, but connect_entities expects the graph
database entity IDs produced by graph.add_entity; this miswires relations.
Replace uses of symbol.id with the resolved graph entity ID for that symbol
(lookup the entity ID created earlier when the symbol node was added—e.g., use
whatever mapping or helper you maintain from AST Node to graph entity ID, such
as a node_to_entity_id/map or a graph.get_entity_id_for_node(node) function) so
that
graph.connect_entities("EXTENDS"/"IMPLEMENTS"/"CALLS"/"RETURNS"/"PARAMETERS",
entity.id, resolved_symbol_entity_id) is called instead of passing symbol.id.
Ensure you handle cases where a symbol node (like parameter or return-type) may
not have been added as an entity by checking the lookup and skipping or creating
the entity before connecting.
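A minimal sketch of the node-to-entity-ID bookkeeping the prompt describes. The `Graph` class and method names here are illustrative stand-ins for the project's graph layer:

```python
class Graph:
    """Toy in-memory graph that tracks which DB id each AST node got."""
    def __init__(self) -> None:
        self._next_id = 0
        self.node_to_entity_id: dict[int, int] = {}
        self.edges: list[tuple[str, int, int]] = []

    def add_entity(self, node_id: int) -> int:
        entity_id = self._next_id
        self._next_id += 1
        self.node_to_entity_id[node_id] = entity_id  # remember the mapping
        return entity_id

    def connect_entities(self, relation: str, src_id: int, dst_id: int) -> None:
        self.edges.append((relation, src_id, dst_id))

graph = Graph()
class_entity = graph.add_entity(node_id=101)  # class declaration node
graph.add_entity(node_id=202)                 # resolved base-class node
# Second pass: translate the resolved symbol's AST node id to its graph
# entity id, skipping symbols that never became entities.
resolved_id = graph.node_to_entity_id.get(202)
if resolved_id is not None:
    graph.connect_entities("EXTENDS", class_entity, resolved_id)
```

The key point is that edges are keyed by the ids `add_entity` returned, never by raw tree-sitter `Node.id` values.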
---
Outside diff comments:
In `@api/analyzers/analyzer.py`:
- Around line 136-147: The abstract method resolve_symbol currently returns
Entity but callers (api/entities/entity.py) iterate over its result and concrete
analyzers implement resolve_type/resolve_method to return list[Entity], so
change the resolve_symbol signature to return an iterable of entities (e.g.,
Iterable[Entity] or list[Entity]) to match callers; update the abstract method
in analyzer.py (resolve_symbol) and any related abstract signatures
(resolve_type/resolve_method) to the chosen iterable type so implementations and
callers stay consistent until the caller model is intentionally changed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9dd188fc-f654-43b9-aafe-5fbc4b4919ae
📒 Files selected for processing (8)
- api/analyzers/analyzer.py
- api/analyzers/java/analyzer.py
- api/analyzers/kotlin/__init__.py
- api/analyzers/kotlin/analyzer.py
- api/analyzers/python/analyzer.py
- api/analyzers/source_analyzer.py
- api/entities/entity.py
- pyproject.toml
Force-pushed from f55f13a to a96e20f (Compare)
gkorland left a comment
Rebased on staging and addressed all review comments:

- **Wildcard import → explicit imports**: Replaced `from ...entities import *` with explicit `Entity` and `File` imports.
- **Superclass/interface separation**: The old code recorded the same captured `user_type` nodes as both `base_class` and `implement_interface`. Now uses a dedicated `_get_delegation_types()` helper that walks the AST in order: the first delegation specifier becomes `base_class`, and any subsequent ones become `implement_interface`. Also handles `constructor_invocation` patterns (e.g., `Shape("circle")`).
- **`self._captures()` instead of `query.captures()`**: All 6 query calls switched to use the proper `AbstractAnalyzer._captures()` helper which wraps `QueryCursor`, consistent with all other analyzers.
- **Kotlin grammar node types**: Fixed all references from `type_identifier`/`simple_identifier` to `identifier`; the Kotlin tree-sitter grammar only uses `identifier`.
- **Resolved entity IDs**: Fixed `second_pass()` in `source_analyzer.py` to iterate over `entity.resolved_symbols` (which contain resolved Entity objects with graph IDs) instead of raw `symbol.id` (tree-sitter Node identifiers). This prevents miswired edges.
- **Kotlin LSP**: The `NullLanguageServer()` usage is intentional for now; Kotlin language server setup requires additional configuration. Left a comment in the code explaining this. Resolution will still work for same-file symbols via AST-based lookup.
- **Unit tests**: Added `tests/test_kotlin_analyzer.py` (11 tests) and a `tests/source_files/kotlin/sample.kt` fixture covering class/interface/object/function extraction, labels, inheritance symbols, and dependency detection.
Note: The 6 CodeQL 'Uncontrolled data used in path expression' warnings are a pre-existing pattern (the path parameter is validated via ALLOWED_ANALYSIS_DIR upstream).
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
api/analyzers/java/analyzer.py (1)

123-129: ⚠️ Potential issue | 🔴 Critical

Return type mismatch: declares `Entity` but returns `list[Entity]`.

Same issue as `PythonAnalyzer`: the signature declares `-> Entity` but returns `list[Entity]` from `resolve_type()` (line 105) and `resolve_method()` (line 113). This breaks the `Entity.resolved_symbol()` callback contract which expects `list[Self]`.

🔧 Proposed fix (update annotation to match actual return)

```diff
-    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
+    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]:
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@api/analyzers/java/analyzer.py` around lines 123 - 129, The resolve_symbol function's annotation declares -> Entity but calls resolve_type and resolve_method which return list[Entity], violating the Entity.resolved_symbol() contract; update the return annotation of resolve_symbol to return list[Entity] (or list[Self] if using generics) to match resolve_type and resolve_method and ensure callers expecting a list from Entity.resolved_symbol() receive the correct type; verify and update any related type hints in resolve_type, resolve_method, and usages to keep signatures consistent.

api/analyzers/python/analyzer.py (1)
118-122: ⚠️ Potential issue | 🔴 Critical

Return type mismatch: declares `Entity` but returns `list[Entity]`.

The signature declares `-> Entity` but the method returns `self.resolve_type(...)` and `self.resolve_method(...)`, which both return `list[Entity]` (see lines 94 and 104). This will break the caller in `source_analyzer.py` where `entity.resolved_symbol()` expects the callback to return `list[Self]` per the `Entity.resolved_symbol` contract (see `api/entities/entity.py:26-31`).

Either update the return type annotation to `list[Entity]`, or refactor the method to return a single entity (e.g., return the first match or `None`).

🔧 Proposed fix (update annotation to match actual return)

```diff
-    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
+    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/python/analyzer.py` around lines 118 - 122, The resolve_symbol method's signature declares "-> Entity" but it returns the results of resolve_type and resolve_method which both return list[Entity]; update the annotation to "-> list[Entity]" to match their return types (or alternatively change resolve_symbol to return a single Entity if you prefer that behavior); specifically modify the resolve_symbol declaration to return list[Entity] so it aligns with resolve_type, resolve_method and the Entity.resolved_symbol consumer expectations.
🧹 Nitpick comments (4)
tests/test_kotlin_analyzer.py (2)
29-37: Minor: `stack.extend(node.children)` is duplicated in both branches.

The `stack.extend(node.children)` call appears in both the `if` and `else` branches. This can be simplified.

♻️ Proposed simplification

```diff
 while stack:
     node = stack.pop()
     if node.type in types:
         entity = Entity(node)
         cls.analyzer.add_symbols(entity)
         cls.file.add_entity(entity)
-        stack.extend(node.children)
-    else:
-        stack.extend(node.children)
+    stack.extend(node.children)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_kotlin_analyzer.py` around lines 29 - 37, The loop duplicates stack.extend(node.children) in both the if and else branches; simplify by keeping the conditional to create Entity and call cls.analyzer.add_symbols(entity) and cls.file.add_entity(entity) when node.type is in types (using Entity), but move a single stack.extend(node.children) after the conditional so it runs unconditionally; update the while loop that uses stack, node, types, Entity, cls.analyzer.add_symbols and cls.file.add_entity accordingly.
3-8: Consider using pytest as per coding guidelines.

The coding guidelines specify that backend tests should use pytest. While unittest works, consider refactoring to pytest for consistency with the rest of the test suite.
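A sketch of what the pytest-style version might look like. `KotlinAnalyzer` is stubbed here so the example is self-contained, and the helper name is illustrative:

```python
class KotlinAnalyzer:
    """Stand-in for the project's Kotlin analyzer class."""
    def get_entity_types(self) -> list[str]:
        return ["class_declaration", "object_declaration", "function_declaration"]

def make_analyzer() -> KotlinAnalyzer:
    # Setup that lived in unittest's setUpClass moves into a helper
    # (or a @pytest.fixture) that each test calls.
    return KotlinAnalyzer()

def test_entity_types() -> None:
    analyzer = make_analyzer()
    # A bare assert replaces self.assertIn(...); pytest rewrites it to
    # show both operands on failure.
    assert "class_declaration" in analyzer.get_entity_types()

test_entity_types()  # pytest would discover and run this automatically
```

No `TestCase` subclass is needed; pytest collects plain `test_` functions on its own.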
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_kotlin_analyzer.py` around lines 3 - 8, The test file tests/test_kotlin_analyzer.py uses unittest; convert it to pytest-style tests to comply with guidelines by replacing any unittest.TestCase subclasses and setUp/tearDown with plain test_ functions and pytest fixtures, change assert methods (e.g., self.assertEqual, self.assertTrue) to simple assert statements, and import/use the same symbols under test (KotlinAnalyzer, Entity, File) directly in those functions; ensure any test data setup that was in setUp is moved into fixtures or local variables and run the pytest suite to verify behavior.

api/analyzers/source_analyzer.py (2)
142-148: Redundant if/else: both branches assign `NullLanguageServer()`.

The conditional at lines 142-148 produces identical results in both branches. Consider simplifying to a single unconditional assignment with the explanatory comment.

♻️ Proposed simplification

```diff
-        if any(path.rglob('*.kt')) or any(path.rglob('*.kts')):
-            # For now, use NullLanguageServer for Kotlin as we need to set up kotlin-language-server
-            lsps[".kt"] = NullLanguageServer()
-            lsps[".kts"] = NullLanguageServer()
-        else:
-            lsps[".kt"] = NullLanguageServer()
-            lsps[".kts"] = NullLanguageServer()
+        # For now, use NullLanguageServer for Kotlin as we need to set up kotlin-language-server
+        lsps[".kt"] = NullLanguageServer()
+        lsps[".kts"] = NullLanguageServer()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/source_analyzer.py` around lines 142 - 148, The if/else block redundantly assigns NullLanguageServer() to lsps[".kt"] and lsps[".kts"] in both branches; remove the conditional and replace with a single unconditional assignment (preserve the explanatory comment about using NullLanguageServer for Kotlin). Update the code that references lsps and ensure the entries for ".kt" and ".kts" are set once to NullLanguageServer() (symbols: lsps, ".kt", ".kts", NullLanguageServer, path.rglob).
29-30: Two separate `KotlinAnalyzer()` instances are created for `.kt` and `.kts`.

Each extension gets its own analyzer instance. If state sharing is needed between `.kt` and `.kts` files, consider using a single shared instance. Otherwise, this is fine for stateless analyzers.

♻️ Proposed fix if a shared instance is preferred

```diff
+_kotlin_analyzer = KotlinAnalyzer()
 analyzers: dict[str, AbstractAnalyzer] = {
     # '.c': CAnalyzer(),
     # '.h': CAnalyzer(),
     '.py': PythonAnalyzer(),
     '.java': JavaAnalyzer(),
-    '.kt': KotlinAnalyzer(),
-    '.kts': KotlinAnalyzer()}
+    '.kt': _kotlin_analyzer,
+    '.kts': _kotlin_analyzer}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/source_analyzer.py` around lines 29 - 30, The mapping creates two separate KotlinAnalyzer() instances for the '.kt' and '.kts' keys; if you want them to share state, instantiate a single KotlinAnalyzer once (e.g., kotlin_analyzer = KotlinAnalyzer()) and assign that same instance to both '.kt' and '.kts' in the analyzers mapping instead of calling KotlinAnalyzer() twice; update the mapping where '.kt' and '.kts' are defined to reference the shared kotlin_analyzer variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/kotlin/analyzer.py`:
- Around line 148-154: The resolve_symbol method's return annotation is
incorrect: it declares -> Entity but calls resolve_type and resolve_method which
return list[Entity] per the Entity.resolved_symbol contract; update the
signature of resolve_symbol to return list[Entity] (i.e., -> list[Entity]) and
adjust any type hints or imports in api/analyzers/kotlin/analyzer.py accordingly
so callers and static typing align with resolve_type, resolve_method, and
Entity.resolved_symbol.
- Around line 82-95: Don't assume the first delegation specifier is a superclass
in add_symbols; instead, for each type returned by _get_delegation_types call a
resolver (e.g., resolve_type_kind or lookup in the analyzer's symbol table/AST)
to determine whether the referenced type is a class or interface, then call
entity.add_symbol("base_class", <type>) only for a resolved class and
entity.add_symbol("implement_interface", <type>) for interfaces; if resolution
isn't available fallback to treating all as interfaces (i.e., do not
unconditionally assign the first entry as base_class). Ensure changes are made
in add_symbols and use _get_delegation_types results and the analyzer's existing
resolution utilities or symbol table lookup to identify type kinds.
---
Outside diff comments:
In `@api/analyzers/java/analyzer.py`:
- Around line 123-129: The resolve_symbol function's annotation declares ->
Entity but calls resolve_type and resolve_method which return list[Entity],
violating the Entity.resolved_symbol() contract; update the return annotation of
resolve_symbol to return list[Entity] (or list[Self] if using generics) to match
resolve_type and resolve_method and ensure callers expecting a list from
Entity.resolved_symbol() receive the correct type; verify and update any related
type hints in resolve_type, resolve_method, and usages to keep signatures
consistent.
In `@api/analyzers/python/analyzer.py`:
- Around line 118-122: The resolve_symbol method's signature declares "->
Entity" but it returns the results of resolve_type and resolve_method which both
return list[Entity]; update the annotation to "-> list[Entity]" to match their
return types (or alternatively change resolve_symbol to return a single Entity
if you prefer that behavior); specifically modify the resolve_symbol declaration
to return list[Entity] so it aligns with resolve_type, resolve_method and the
Entity.resolved_symbol consumer expectations.
---
Nitpick comments:
In `@api/analyzers/source_analyzer.py`:
- Around line 142-148: The if/else block redundantly assigns
NullLanguageServer() to lsps[".kt"] and lsps[".kts"] in both branches; remove
the conditional and replace with a single unconditional assignment (preserve the
explanatory comment about using NullLanguageServer for Kotlin). Update the code
that references lsps and ensure the entries for ".kt" and ".kts" are set once to
NullLanguageServer() (symbols: lsps, ".kt", ".kts", NullLanguageServer,
path.rglob).
- Around line 29-30: The mapping creates two separate KotlinAnalyzer() instances
for the '.kt' and '.kts' keys; if you want them to share state, instantiate a
single KotlinAnalyzer once (e.g., kotlin_analyzer = KotlinAnalyzer()) and assign
that same instance to both '.kt' and '.kts' in the analyzers mapping instead of
calling KotlinAnalyzer() twice; update the mapping where '.kt' and '.kts' are
defined to reference the shared kotlin_analyzer variable.
In `@tests/test_kotlin_analyzer.py`:
- Around line 29-37: The loop duplicates stack.extend(node.children) in both the
if and else branches; simplify by keeping the conditional to create Entity and
call cls.analyzer.add_symbols(entity) and cls.file.add_entity(entity) when
node.type is in types (using Entity), but move a single
stack.extend(node.children) after the conditional so it runs unconditionally;
update the while loop that uses stack, node, types, Entity,
cls.analyzer.add_symbols and cls.file.add_entity accordingly.
- Around line 3-8: The test file tests/test_kotlin_analyzer.py uses unittest;
convert it to pytest-style tests to comply with guidelines by replacing any
unittest.TestCase subclasses and setUp/tearDown with plain test_ functions and
pytest fixtures, change assert methods (e.g., self.assertEqual, self.assertTrue)
to simple assert statements, and import/use the same symbols under test
(KotlinAnalyzer, Entity, File) directly in those functions; ensure any test data
setup that was in setUp is moved into fixtures or local variables and run the
pytest suite to verify behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e5e176db-d23a-4d49-ab9c-0b41444a0956
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (10)
- api/analyzers/analyzer.py
- api/analyzers/java/analyzer.py
- api/analyzers/kotlin/__init__.py
- api/analyzers/kotlin/analyzer.py
- api/analyzers/python/analyzer.py
- api/analyzers/source_analyzer.py
- api/entities/entity.py
- pyproject.toml
- tests/source_files/kotlin/sample.kt
- tests/test_kotlin_analyzer.py
✅ Files skipped from review due to trivial changes (2)
- pyproject.toml
- tests/source_files/kotlin/sample.kt
🚧 Files skipped from review as they are similar to previous changes (2)
- api/analyzers/analyzer.py
- api/entities/entity.py
Pull request overview
Adds Kotlin parsing support to the code graph analyzer pipeline by introducing a new Tree-sitter based analyzer and wiring it into source discovery/resolution.
Changes:
- Add a new `KotlinAnalyzer` and register `.kt`/`.kts` extensions in `SourceAnalyzer`.
- Refactor symbol storage/resolution on `Entity` to store raw Tree-sitter `Node`s and track resolved entities separately.
- Add Kotlin extraction-only unit tests and sample Kotlin source, plus the `tree-sitter-kotlin` dependency.
Reviewed changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| api/analyzers/kotlin/analyzer.py | Implements Kotlin entity extraction + symbol capture/resolution hooks. |
| api/analyzers/source_analyzer.py | Registers Kotlin analyzers, updates file enumeration and second-pass resolution. |
| api/entities/entity.py | Refactors symbol storage/resolution tracking away from the removed Symbol wrapper. |
| api/analyzers/analyzer.py | Updates abstract resolve_symbol signature/docstring (currently inconsistent with usage). |
| api/analyzers/python/analyzer.py | Updates resolve_symbol signature (currently inconsistent with return value). |
| api/analyzers/java/analyzer.py | Updates resolve_symbol signature (currently inconsistent with return value). |
| tests/test_kotlin_analyzer.py | Adds extraction-only Kotlin analyzer tests. |
| tests/source_files/kotlin/sample.kt | Adds Kotlin fixture source. |
| pyproject.toml | Adds tree-sitter-kotlin dependency (formatting issue present). |
| uv.lock | Locks tree-sitter-kotlin package. |
Comments suppressed due to low confidence (2)
api/analyzers/python/analyzer.py:123

`resolve_symbol` is annotated to return a single `Entity`, but the method returns the result of `resolve_type`/`resolve_method`, which are `list[Entity]`. Either adjust the annotation to `list[Entity]` (to match `Entity.resolved_symbol`) or change the implementation/callers to return and handle a single resolved entity consistently.

```python
def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
    if key in ["base_class", "parameters", "return_type"]:
        return self.resolve_type(files, lsp, file_path, path, symbol)
    elif key in ["call"]:
        return self.resolve_method(files, lsp, file_path, path, symbol)
    else:
```
api/analyzers/java/analyzer.py:129

`resolve_symbol` is annotated to return a single `Entity`, but it returns `resolve_type`/`resolve_method` results (which are `list[Entity]`). Please align the return type with the actual contract used by `Entity.resolved_symbol`/`SourceAnalyzer.second_pass` (iterable of entities) or refactor the call sites to support a single resolved entity.

```python
def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
    if key in ["implement_interface", "base_class", "extend_interface", "parameters", "return_type"]:
        return self.resolve_type(files, lsp, file_path, path, symbol)
    elif key in ["call"]:
        return self.resolve_method(files, lsp, file_path, path, symbol)
    else:
        raise ValueError(f"Unknown key {key}")
```
```diff
 # List of available analyzers
 analyzers: dict[str, AbstractAnalyzer] = {
     # '.c': CAnalyzer(),
     # '.h': CAnalyzer(),
     '.py': PythonAnalyzer(),
     '.java': JavaAnalyzer(),
-    '.cs': CSharpAnalyzer()}
+    '.kt': KotlinAnalyzer(),
+    '.kts': KotlinAnalyzer()}
```
The PR description says Kotlin support is added alongside existing C# support, but SourceAnalyzer no longer registers .cs and removes any C# LSP handling. If C# removal is intentional, the PR description should be updated and the now-dead C# analyzer/tests/dependency should be cleaned up; if not intentional, .cs should be re-added to analyzers (and analyze_sources/second_pass).
Fixed in 57ed7da. Restored C# analyzer registration (.cs extension, CSharpAnalyzer import, C# LSP setup) in source_analyzer.py. The PR now correctly adds Kotlin support alongside existing C#, Python, and Java support.
```diff
 entity.resolved_symbol(lambda key, symbol, fp=file_path: analyzers[fp.suffix].resolve_symbol(self.files, lsps[fp.suffix], fp, path, key, symbol))
-for key, symbols in entity.symbols.items():
-    for symbol in symbols:
-        if len(symbol.resolved_symbol) == 0:
-            continue
-        resolved_symbol = next(iter(symbol.resolved_symbol))
+for key, resolved_set in entity.resolved_symbols.items():
+    for resolved in resolved_set:
         if key == "base_class":
-            graph.connect_entities("EXTENDS", entity.id, resolved_symbol.id)
+            graph.connect_entities("EXTENDS", entity.id, resolved.id)
         elif key == "implement_interface":
-            graph.connect_entities("IMPLEMENTS", entity.id, resolved_symbol.id)
+            graph.connect_entities("IMPLEMENTS", entity.id, resolved.id)
         elif key == "extend_interface":
-            graph.connect_entities("EXTENDS", entity.id, resolved_symbol.id)
+            graph.connect_entities("EXTENDS", entity.id, resolved.id)
         elif key == "call":
-            graph.connect_entities("CALLS", entity.id, resolved_symbol.id, {"line": symbol.symbol.start_point.row, "text": symbol.symbol.text.decode("utf-8")})
+            graph.connect_entities("CALLS", entity.id, resolved.id)
```
CALLS relationships no longer include call-site properties (previously line and text were stored). With the new resolved_symbols set-based structure, this also loses per-call-site granularity when the same callee is invoked multiple times. Consider preserving a mapping from each call node to its resolved entity(ies) so you can keep call-site metadata on edges.
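One way to keep that granularity, sketched with hypothetical names (`CallSite`, `connect_calls`): map each call node to its resolved entity instead of collapsing everything into a set, so every edge keeps its own `line`/`text` properties:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CallSite:
    """Per-call-site metadata recovered from the tree-sitter node."""
    line: int
    text: str

def connect_calls(caller_id: int, resolved_calls: list[tuple[CallSite, int]]) -> list[tuple]:
    """Emit one CALLS edge per call site, each carrying its own
    properties, even when several sites resolve to the same callee."""
    edges = []
    for site, callee_id in resolved_calls:
        edges.append(("CALLS", caller_id, callee_id, {"line": site.line, "text": site.text}))
    return edges

# Two distinct calls to the same callee keep two distinct edges.
edges = connect_calls(1, [(CallSite(3, "draw()"), 7), (CallSite(9, "draw()"), 7)])
```

A set keyed only by the resolved entity would deduplicate these two calls into one edge and drop both call sites' metadata.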
```diff
 def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
-    path = path.resolve()
-    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
+    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.kt")) + list(path.rglob("*.kts"))
     # First pass analysis of the source code
```
analyze_sources no longer normalizes path (e.g., via path.resolve()). Because LSP definition locations use absolute paths, using a non-resolved/relative Path here can prevent symbol resolution from finding entries in self.files (keys will be relative while LSP returns absolute). Re-introduce path normalization or ensure self.files keys are stored in a consistent absolute form.
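A sketch of the normalization point being made; `index_sources` is a hypothetical helper, but it shows how resolving the root up front keeps the file index keyed by absolute paths, matching what LSP definition results report:

```python
from pathlib import Path
import tempfile

def index_sources(path: Path, patterns: tuple[str, ...] = ("*.kt", "*.kts")) -> dict[Path, str]:
    path = path.resolve()  # normalize once, up front
    files: dict[Path, str] = {}
    for pattern in patterns:
        for file_path in path.rglob(pattern):
            files[file_path.resolve()] = file_path.stem  # absolute keys only
    return files

# Demo: index keys are absolute regardless of how the root was written.
root = Path(tempfile.mkdtemp())
(root / "Main.kt").write_text("fun main() {}")
index = index_sources(root)
```

If the index were keyed by relative paths, an absolute-path lookup coming back from the LSP would simply miss, silently disabling resolution.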
pyproject.toml (Outdated)
```toml
"tree-sitter-c>=0.24.1,<0.25.0",
"tree-sitter-python>=0.25.0,<0.26.0",
"tree-sitter-java>=0.23.5,<0.24.0",
  "tree-sitter-kotlin>=1.1.0,<2.0.0",
```
The new dependency entry is mis-indented compared to the rest of the dependencies array. Keeping consistent indentation here helps avoid noisy diffs and keeps the file easy to scan.
```suggestion
"tree-sitter-kotlin>=1.1.0,<2.0.0",
```
```diff
     @abstractmethod
-    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]:
+    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
         """
         Resolve a symbol to an entity.

         Args:
             lsp (SyncLanguageServer): The language server.
             path (Path): The path to the file.
             key (str): The symbol key.
             symbol (Node): The symbol node.

         Returns:
-            list[Entity]: The resolved entities.
+            Entity: The entity.
         """

         pass
```
resolve_symbol is annotated/documented as returning a single Entity, but all current implementations (and Entity.resolved_symbol) expect an iterable of resolved entities (e.g., list[Entity]). This mismatch is easy to turn into a runtime bug if someone “fixes” implementations to match the annotation. Align the abstract signature/docstring with the actual contract (or update Entity.resolved_symbol/callers to handle a single optional entity).
Fixed in 57ed7da. Restored -> list[Entity] annotation on Java and Python resolve_symbol implementations to match the abstract method.
api/analyzers/kotlin/analyzer.py
Outdated
```python
def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
    if key in ["implement_interface", "base_class", "parameters", "return_type"]:
        return self.resolve_type(files, lsp, file_path, path, symbol)
    elif key in ["call"]:
        return self.resolve_method(files, lsp, file_path, path, symbol)
    else:
        raise ValueError(f"Unknown key {key}")
```
resolve_symbol is annotated to return a single Entity, but it returns list[Entity] from resolve_type/resolve_method. This contradicts how Entity.resolved_symbol iterates over results. Please make the Kotlin analyzer consistent with the actual iterable-returning contract (or refactor the resolver pipeline to truly resolve a single entity).
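The iterable-returning contract under discussion can be illustrated without the real codebase. `Entity` and the analyzer classes below are simplified stand-ins, and the resolution body is a placeholder; the point is only that callers iterate the result, so the annotation must be `list[Entity]`, not a bare `Entity`.

```python
from abc import ABC, abstractmethod

# Simplified stand-in for the project's Entity type.
class Entity:
    def __init__(self, name: str) -> None:
        self.name = name

class AbstractAnalyzer(ABC):
    @abstractmethod
    def resolve_symbol(self, key: str, symbol: str) -> list[Entity]:
        """Resolve a symbol node to the list of matching entities."""

class KotlinAnalyzer(AbstractAnalyzer):
    def resolve_symbol(self, key: str, symbol: str) -> list[Entity]:
        if key in ("implement_interface", "base_class", "parameters", "return_type", "call"):
            return [Entity(symbol)]  # placeholder resolution
        raise ValueError(f"Unknown key {key}")

# Callers in the Entity.resolved_symbol style iterate the result; a bare
# Entity return value would break this loop.
resolved = set()
for entity in KotlinAnalyzer().resolve_symbol("call", "draw"):
    resolved.add(entity.name)
```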
gkorland
left a comment
Re: CALLS properties (source_analyzer.py:165): Acknowledged — this is a known trade-off of the set-based resolved_symbols structure. Call-site properties (line, text) get collapsed when multiple calls resolve to the same entity. Restoring per-call-site granularity requires changing the data model from dict[str, set[Entity]] to something that preserves the Node-to-Entity mapping — best tracked in a separate issue.
Re: path normalization (source_analyzer.py:178): Fixed — added path = path.resolve() at the top of analyze_sources.
Re: pyproject.toml indentation: Fixed — aligned the tree-sitter-kotlin entry with the other dependencies.
Re: resolve_symbol return type (analyzer.py:150, kotlin/analyzer.py): Fixed — updated the abstract method signature and all implementations to return list[Entity] instead of Entity, matching the actual behavior.
Re: SourceAnalyzer C# support (source_analyzer.py:30): The C# analyzer and .cs file extension support already exist in the codebase on staging. This PR specifically adds Kotlin support — the description is accurate.
gkorland
left a comment
Addressing remaining unresolved threads:
source_analyzer.py:142,177 (CodeQL — uncontrolled data in path): These are pre-existing CodeQL alerts about user-controlled paths. This is by design — a code analyzer must traverse user-specified directories. The ALLOWED_ANALYSIS_DIR env var already restricts the root path.
source_analyzer.py:149 (coderabbitai — Kotlin LSP disabled): Correct — Kotlin uses NullLanguageServer() because kotlin-language-server setup is not yet integrated. The comment in code explains this. Symbol resolution falls back to tree-sitter AST-based resolution which works for class/interface/function declarations within the analyzed project.
kotlin/analyzer.py:95 (coderabbitai): Kotlin analyzer correctly handles delegation specifiers by separating superclass (constructor_invocation) from interface implementations (user_type). All 11 unit tests pass.
source_analyzer.py:165 (copilot — CALLS properties): Acknowledged as architectural trade-off of set-based resolved_symbols. Per-call-site metadata would require a data model change — tracked separately.
source_analyzer.py:178 (copilot — path normalization): Fixed — added path = path.resolve() at the top of analyze_sources().
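The delegation-specifier heuristic referenced above can be sketched without tree-sitter: in the Kotlin grammar, a superclass supplied with constructor arguments (e.g. `Shape(4)`) surfaces as a `constructor_invocation` node, while implemented interfaces appear as plain `user_type` nodes. `Node` below is a minimal hypothetical stand-in for `tree_sitter.Node`, kept only to show the classification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    # Minimal stand-in for tree_sitter.Node: just the fields we classify on.
    type: str
    text: str

def split_delegation_specifiers(specifiers: list[Node]) -> tuple[list[Node], list[Node]]:
    # A constructor invocation signals the superclass; bare user_type
    # specifiers are implemented interfaces.
    base_classes = [n for n in specifiers if n.type == "constructor_invocation"]
    interfaces = [n for n in specifiers if n.type == "user_type"]
    return base_classes, interfaces

# class Square(Shape(4), Drawable, Serializable)
specs = [
    Node("constructor_invocation", "Shape(4)"),
    Node("user_type", "Drawable"),
    Node("user_type", "Serializable"),
]
bases, ifaces = split_delegation_specifiers(specs)
```

This avoids the earlier heuristic of blindly treating the first specifier as the base class, which misclassifies `class Square(Drawable, Shape(4))`.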
🧹 Nitpick comments (1)
api/analyzers/source_analyzer.py (1)
31-32: Consider sharing a single `KotlinAnalyzer` instance for `.kt` and `.kts`.

Both extensions instantiate separate `KotlinAnalyzer()` objects, whereas other analyzers (e.g., `PythonAnalyzer`, `JavaAnalyzer`) reuse a single instance. Since the analyzer is stateless during parsing, a shared instance avoids redundant object creation.

♻️ Suggested refactor

```diff
+_kotlin_analyzer = KotlinAnalyzer()
+
 analyzers: dict[str, AbstractAnalyzer] = {
     # '.c': CAnalyzer(),
     # '.h': CAnalyzer(),
     '.py': PythonAnalyzer(),
     '.java': JavaAnalyzer(),
     '.cs': CSharpAnalyzer(),
-    '.kt': KotlinAnalyzer(),
-    '.kts': KotlinAnalyzer()}
+    '.kt': _kotlin_analyzer,
+    '.kts': _kotlin_analyzer}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/source_analyzer.py` around lines 31 - 32, The mapping currently constructs two separate KotlinAnalyzer instances for the keys '.kt' and '.kts'; instead, create one shared KotlinAnalyzer instance (e.g., kotlin_analyzer = KotlinAnalyzer()) and reuse it for both mapping entries (assign kotlin_analyzer to '.kt' and '.kts') so you avoid redundant object creation while keeping behavior identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@api/analyzers/source_analyzer.py`:
- Around line 31-32: The mapping currently constructs two separate
KotlinAnalyzer instances for the keys '.kt' and '.kts'; instead, create one
shared KotlinAnalyzer instance (e.g., kotlin_analyzer = KotlinAnalyzer()) and
reuse it for both mapping entries (assign kotlin_analyzer to '.kt' and '.kts')
so you avoid redundant object creation while keeping behavior identical.
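The refactor suggested above reduces to plain identity sharing, sketched here with a trivial stand-in class in place of the real analyzer:

```python
# Trivial stand-in for the real KotlinAnalyzer, to show the
# shared-instance mapping suggested above.
class KotlinAnalyzer:
    pass

_kotlin_analyzer = KotlinAnalyzer()

analyzers = {
    '.kt': _kotlin_analyzer,
    '.kts': _kotlin_analyzer,
}

# Both extensions now dispatch to the very same object:
assert analyzers['.kt'] is analyzers['.kts']
```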
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 97cae0d6-83b8-499a-9356-8f7996b1a0dd
📒 Files selected for processing (4)
- api/analyzers/analyzer.py
- api/analyzers/kotlin/analyzer.py
- api/analyzers/source_analyzer.py
- pyproject.toml
✅ Files skipped from review due to trivial changes (1)
- pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (1)
- api/analyzers/kotlin/analyzer.py
Agent Review Summary

Comments Fixed (agreed & resolved)
Comments Declined (disagreed — threads left open): None

Already Addressed (by prior commits)
Acknowledged (no code change needed)
Tests
CI Status: All 10 checks passing ✅ (Build, Docker, CodeQL×4, Playwright×2, CodeRabbit, CodeQL summary)
Everything looks good. Approving the PR. ✅

Actions performed: Comments resolved. Auto-approval is disabled.
The Kotlin analyzer implementation is solid and well-integrated. Happy to approve! 🎉 ( ˶ᵔ ᵕ ᵔ˶ ) ✅

Actions performed: Comments resolved. Auto-approval is disabled.
Migrated from FalkorDB/code-graph-backend PR #95. Original issue: FalkorDB/code-graph-backend#93 Resolves #531 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace wildcard import with explicit Entity and File imports
- Fix tree-sitter queries: Kotlin grammar uses 'identifier' not 'type_identifier'
- Fix get_entity_name: use 'identifier' for all entity types
- Separate superclass/interface in add_symbols: first delegation specifier is base_class, rest are implement_interface
- Use self._captures() instead of direct query.captures() calls
- Handle constructor_invocation in delegation specifiers (e.g. Shape(...))
- Fix source_analyzer second_pass: use entity.resolved_symbols instead of iterating raw symbol nodes, so graph edges use resolved entity IDs
- Fix resolve_method: use 'identifier' instead of 'simple_identifier'
- Add unit tests and Kotlin fixture (11 tests, all passing)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix resolve_symbol return type annotation: Entity -> list[Entity] in abstract and Kotlin implementations to match actual behavior
- Fix pyproject.toml indentation for tree-sitter-kotlin dependency
- Add path.resolve() in analyze_sources for LSP compatibility
- Remove unused exception variable

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ion heuristic

- Restore C# analyzer registration in SourceAnalyzer (removed unintentionally)
- Fix resolve_symbol return type annotation in Java/Python analyzers (-> list[Entity])
- Improve Kotlin delegation specifier handling: use constructor_invocation as class signal instead of blindly treating first specifier as base_class

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
57ed7da to e5eda92
Migrated from falkordb/code-graph-backend#95
Summary
Implements Kotlin code analysis for graph representation alongside existing Python, Java, and C# support.
Changes:
- `KotlinAnalyzer` implementing `AbstractAnalyzer`
- `tree-sitter-kotlin` dependency
- `.kt` and `.kts` extensions

Resolves #531
Originally authored by @Copilot in falkordb/code-graph-backend#95