Conversation
…head - Move Tree-sitter parser initialization from instance constructor to module-level singleton in all four language analyzers (C#, Go, JavaScript, TypeScript). Parser.setLanguage() is now called once per language per process instead of once per file analysis. - Add a 20-entry LRU-style result cache (Map) in MetricsAnalyzerFactory keyed by language + content hash. Identical source text no longer triggers a full parse cycle; cache hits return in <1ms vs ~14ms cold. All 35 unit tests pass. Compile and lint clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #222 +/- ##
==========================================
+ Coverage 68.30% 68.50% +0.19%
==========================================
Files 8 8
Lines 2963 3000 +37
Branches 276 279 +3
==========================================
+ Hits 2024 2055 +31
- Misses 937 943 +6
Partials 2 2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR reduces VS Code CodeLens update overhead by reusing Tree-sitter parsers across analyses and caching analysis results for unchanged source text, improving responsiveness when CodeLens is requested repeatedly without content changes.
Changes:
- Added a small (max 20 entries) in-memory cache in
MetricsAnalyzerFactory.analyzeFile()keyed by language + source hash. - Switched C#, Go, JavaScript, and TypeScript analyzers to reuse a module-level
tree-sitterParsersingleton per language instead of constructing one per call.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/metricsAnalyzer/metricsAnalyzerFactory.ts |
Adds content-hash keyed analysis result cache with FIFO eviction. |
src/metricsAnalyzer/languages/csharpAnalyzer.ts |
Reuses a module-level Tree-sitter parser instance. |
src/metricsAnalyzer/languages/goAnalyzer.ts |
Reuses a module-level Tree-sitter parser instance. |
src/metricsAnalyzer/languages/javascriptAnalyzer.ts |
Reuses a module-level Tree-sitter parser instance. |
src/metricsAnalyzer/languages/typescriptAnalyzer.ts |
Reuses a module-level Tree-sitter parser instance. |
| /** Fast non-cryptographic hash for cache key generation (djb2 variant). */ | ||
| function hashString(str: string): number { | ||
| let hash = 5381; | ||
| for (let i = 0; i < str.length; i++) { | ||
| hash = (hash * 33) ^ str.charCodeAt(i); | ||
| } | ||
| return hash >>> 0; // Convert to unsigned 32-bit integer | ||
| } |
There was a problem hiding this comment.
The cache key relies on a 32-bit non-cryptographic hash (djb2 variant). Collisions are possible, which can return metrics for a different source text with the same length+hash. Consider using a collision-resistant key (e.g., crypto hash) or storing/verifying the original sourceText for cache hits to guarantee correctness.
| // Use cache to avoid re-analyzing identical source text | ||
| const cacheKey = `${languageId}:${sourceText.length}:${hashString(sourceText)}`; | ||
| const cached = analysisCache.get(cacheKey); | ||
| if (cached) { | ||
| return cached; | ||
| } | ||
| const results = analyzer(sourceText); | ||
| if (analysisCache.size >= CACHE_MAX_SIZE) { | ||
| analysisCache.delete(analysisCache.keys().next().value!); | ||
| } | ||
| analysisCache.set(cacheKey, results); | ||
| return results; |
There was a problem hiding this comment.
The newly added analysis result cache (hit path + eviction when full) is not covered by unit tests. Please add tests that exercise: (1) repeated analyzeFile calls with identical input returning from cache, and (2) eviction behavior once CACHE_MAX_SIZE is exceeded, to prevent regressions in correctness/performance.
🤖 This is an automated pull request from Repo Assist.
Summary
Every time VS Code requests CodeLens updates (i.e., on every keystroke),
MetricsAnalyzerFactory.analyzeFile()was called, which internally:ParserobjectsetLanguage()to load the grammarThis meant two expensive operations repeated on every edit even when the file hadn't changed between calls.
Changes
1. Module-level parser singletons (all 4 language analyzers)
new Parser()+setLanguage()now runs once per language per process instead of once peranalyzeFile()call. Each*MetricsAnalyzerconstructor now assigns the shared singleton rather than instantiating a new parser.Files changed:
csharpAnalyzer.ts,goAnalyzer.ts,javascriptAnalyzer.ts,typescriptAnalyzer.ts2. Content-hash result cache in
MetricsAnalyzerFactoryAdded a 20-entry insertion-order eviction cache keyed by
languageId:contentLength:djb2Hash(sourceText). When the CodeLens provider callsanalyzeFile()with source text that hasn't changed (e.g., scrolling through a file, switching tabs, re-focusing a window), results are returned immediately from cache without any parse work.Measured improvement: 14ms → <1ms for cache hits on a simple Go file.
Trade-offs
parser.parse()is fully synchronous and completes before the next call.nesting,complexity,details) remains instance-level and is unaffected.Test Status
npm run compile— cleannpm run lint— cleanWarning
The following domain was blocked by the firewall during workflow execution:
releaseassets.githubusercontent.comTo allow these domains, add them to the
network.allowedlist in your workflow frontmatter:See Network Configuration for more information.