feat(postprocess): fuzzy class-name disambiguation by directory prefix #331

Open
gzenz wants to merge 7 commits into tirth8205:main from gzenz:feat/parser-cross-file-class-method-fuzzy-v2

gzenz (Contributor) commented Apr 19, 2026

Summary

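Relaxes the "drop ambiguous classes" rule in the cross-file ClassName.method resolver: when a class name is defined in more than one file, prefer the candidate whose file path shares the deepest common directory prefix with the calling file. If the best match is not unique, the target stays bare.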

Stacked PR approach

This is PR 6/16 in a stack that ports the closed #158 as small, reviewable slices, per your feedback. Each branch contains its own commit plus the commits of every earlier branch in the stack, so it applies cleanly on its own. The intended merge order is bottom-to-top, that is, in the numbered order below.

Stack (merge order):

  1. feat/store-connection-cache-v2
  2. feat/hardened-tests-v2
  3. feat/parser-python-star-imports-v2
  4. feat/parser-resolve-class-method-v2
  5. feat/parser-cross-file-class-method-v2
  6. feat/parser-cross-file-class-method-fuzzy-v2
  7. feat/parser-node-decorators-v2
  8. feat/parser-import-map-bare-call-resolver-v2
  9. feat/query-dedupe-and-transitive-tests-v2
  10. feat/incremental-ignore-build-artifacts-v2
  11. perf/build-timing-telemetry-v2
  12. feat/parser-call-noise-filter-v2
  13. feat/parser-typed-var-enrichment-v2
  14. feat/parser-func-ref-enrichment-v2
  15. feat/cli-quiet-json-and-enrich-v2
  16. feat/cli-refactor-dead-code-v2

After you merge PR N, I rebase PRs N+1..16 on current main and force-push. Each PR ends up as a single commit on top of the previous one. Review only the newest commit in this PR; the rest will already be on main (or in earlier PRs in the stack).

All 16 branches are rebased on 072ab80. None re-attempts the closed lang-handler refactor.

Roadmap: restoring v7 loop-test parity

Measured impact on loop-test fixtures (Gadgetbridge Kotlin/Java, HealthAgent Python/TS):

  • HealthAgent CALLS resolution: from 15.2% to 36.1% (v7 baseline: 39.3%)
  • HealthAgent dead-code: from 275 to 122 (backend/api from 106 to 0, backend/cli from 57 to 2)
  • Gadgetbridge CALLS resolution: 33.4%; Q2/Q3/Q5 scorecard queries now return qualified targets
  • Gadgetbridge loop-test scorecard: 8 PASS / 2 PARTIAL

Still deferred after this stack

  • Lang handler refactor (closed #246, "feat: lang handler refactor + parser improvements"): I will redo it per language, one PR per handler, each with tests, once v7 parity holds. This addresses the no-handler-tests concern directly. Svelte, TESTED_BY, Zig, PowerShell, and Julia support will be preserved.
  • test_pain_points.py fixtures: half are handler-dependent, so the full suite waits for the per-handler PRs.

gzenz force-pushed the feat/parser-cross-file-class-method-fuzzy-v2 branch 2 times, most recently from 4570ef3 to 18106c7, on April 19, 2026 07:55
gzenz added 7 commits April 21, 2026 21:01
- Remove duplicate macOS-copy module files (analysis 2.py, enrich 2.py,
  enrich 3.py, exports 2.py, exports 3.py, graph_diff 2.py,
  jedi_resolver 2.py, memory 2.py, memory 3.py, token_benchmark 2.py)
  that tripped ruff N999.
- Sort imports in main.py (ruff I001).
- Drop unused parse_git_diff_ranges import in tools/review.py (ruff F401).
- Add type-ignore for FastMCP._tool_manager private attribute access
  in main.py.

CI on main has been red since PR tirth8205#94. This fixes it.

MCP tool calls were opening a fresh GraphStore (and SQLite connection)
on every invocation. Cache one GraphStore per db_path and reuse it,
falling back to a fresh connection if the cached one is dead.

Thread-safe via a module-level lock.
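
The caching pattern described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: the `GraphStore` class here is a hypothetical stand-in that wraps a plain `sqlite3` connection, and the names `get_store`, `is_alive`, `_STORES`, and `_STORES_LOCK` are invented for the example rather than taken from the PR.

```python
import sqlite3
import threading


class GraphStore:
    """Hypothetical stand-in for the project's GraphStore."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        # check_same_thread=False lets one cached connection serve several threads.
        self.conn = sqlite3.connect(db_path, check_same_thread=False)

    def is_alive(self) -> bool:
        """Probe the connection; a closed connection raises sqlite3.Error."""
        try:
            self.conn.execute("SELECT 1")
            return True
        except sqlite3.Error:
            return False


# One cached store per db_path, guarded by a module-level lock.
_STORES = {}
_STORES_LOCK = threading.Lock()


def get_store(db_path: str) -> GraphStore:
    """Return the cached store for db_path, reopening it if the cached one is dead."""
    with _STORES_LOCK:
        store = _STORES.get(db_path)
        if store is None or not store.is_alive():
            store = GraphStore(db_path)
            _STORES[db_path] = store
        return store
```

Holding the lock for the whole lookup keeps two threads from opening duplicate connections for the same path; a real implementation may prefer a finer-grained scheme.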
Exact known-answer assertions for compute_risk_score and trace_flows, plus
error-path coverage for parser on malformed input and module-cache eviction.

- TestRiskScoreExact: 6 tests pinning risk score math (untested, tested,
  security keyword, caller fractions, 20-caller cap)
- TestFlowExact: 4 tests for linear chain depth, cycles, single-file
  criticality, and non-test neighbors
- TestParserErrorPaths: 8 tests for binary files, unknown extensions,
  syntax errors, empty files, CRLF, deeply nested AST, unicode
- TestCacheEviction: oldest-half module cache eviction

Add three helpers:

- `_resolve_star_imports`: walks top-level `import_from_statement`
  nodes, detects `wildcard_import`, resolves the source module to a
  file, and merges the exported names into the caller's import_map.
- `_get_exported_names`: returns the public names a module exports,
  respecting `__all__` when present. Caches results per resolved path
  with a threading lock so concurrent parses share work safely.
- `_extract_dunder_all`: parses `__all__ = [...]` at module scope.

Wires star-import expansion into `parse_bytes` and the notebook
concat path. Enables resolution of calls that come in via wildcard
imports (e.g. `from constants import *` then `use_constant()`).
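
A rough sketch of the star-import expansion described above. Note the swap: the commit walks tree-sitter nodes (`import_from_statement`, `wildcard_import`), whereas this example uses the standard-library `ast` module to stay self-contained. The function names, the in-memory `module_sources` mapping (standing in for module-to-file resolution), and the overwrite-on-conflict policy are all assumptions made for illustration; caching and locking are omitted.

```python
import ast
from typing import Dict, List, Optional


def extract_dunder_all(tree: ast.Module) -> Optional[List[str]]:
    """Return the string entries of a module-level `__all__ = [...]`, or None."""
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if "__all__" in targets and isinstance(node.value, (ast.List, ast.Tuple)):
            return [
                elt.value
                for elt in node.value.elts
                if isinstance(elt, ast.Constant) and isinstance(elt.value, str)
            ]
    return None


def exported_names(source: str) -> List[str]:
    """Public names of a module: `__all__` if present, else non-underscore names."""
    tree = ast.parse(source)
    declared = extract_dunder_all(tree)
    if declared is not None:
        return declared
    names: List[str] = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.append(node.name)
        elif isinstance(node, ast.Assign):
            names.extend(t.id for t in node.targets if isinstance(t, ast.Name))
    return [n for n in names if not n.startswith("_")]


def expand_star_imports(
    caller_source: str,
    module_sources: Dict[str, str],
    import_map: Dict[str, str],
) -> Dict[str, str]:
    """Merge names pulled in by top-level `from X import *` into import_map."""
    for node in ast.parse(caller_source).body:
        if not isinstance(node, ast.ImportFrom) or not node.module:
            continue
        if not any(alias.name == "*" for alias in node.names):
            continue
        source = module_sources.get(node.module)
        if source is None:
            continue  # module could not be resolved; leave the map untouched
        for name in exported_names(source):
            import_map[name] = node.module  # assumption: later imports win
    return import_map
```

With a `constants` module that defines `use_constant`, a caller containing `from constants import *` ends up with `use_constant` mapped to `constants`, which is what lets a bare `use_constant()` call resolve.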
When receiver filtering emits "Foo.bar" (uppercase receiver), resolve it
to "/path/file::Foo.bar" by looking up the class entry in the file
symbol table. Closes a resolution gap where class-qualified bare calls
stayed unresolved after the receiver filter rewrite.

Add _resolve_cross_file_class_methods as a postprocess step that scans
Class nodes, builds a name -> file::ClassName map (dropping colliding
names), and rewrites bare ClassName.method edge targets to the qualified
file::ClassName.method form. Complements the per-file resolver in
parser.py which only sees symbols in the calling file.
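
The strict version of this postprocess step might look something like the sketch below. It assumes qualified names have the form `file::ClassName` as in the message above; the helper names `build_class_map` and `qualify_target` are hypothetical and are not the PR's actual functions.

```python
from typing import Dict, Iterable


def build_class_map(class_qns: Iterable[str]) -> Dict[str, str]:
    """Map bare class name -> qualified name, dropping names seen in several files."""
    seen: Dict[str, str] = {}
    collided = set()
    for qn in class_qns:
        name = qn.rpartition("::")[2]
        if name in seen and seen[name] != qn:
            collided.add(name)  # same short name, different definition
        seen.setdefault(name, qn)
    return {name: qn for name, qn in seen.items() if name not in collided}


def qualify_target(target: str, class_map: Dict[str, str]) -> str:
    """Rewrite bare `ClassName.method` to `file::ClassName.method` when known."""
    if "::" in target or "." not in target:
        return target  # already qualified, or not a Class.method target
    class_name, method = target.split(".", 1)
    qn = class_map.get(class_name)
    return f"{qn}.{method}" if qn else target
```

Here any class name defined in two files is dropped from the map, so edges that point at it stay bare.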
The cross-file ClassName.method resolver dropped any class whose name
appeared in more than one file. That was too strict: codebases commonly
reuse short class names across feature packages (util/ClassA.kt and
test/ClassA.kt, multiple platform-specific Impl files, etc.). The call
site almost always means the version in the closest package.

Rebuild class_candidates as a name -> [qn, ...] map. For each edge with
a bare ClassName.method target, pick the class qn whose file path shares
the deepest common directory prefix with the edge's calling file. If the
top match is not unique, fall back to leaving the target bare (rather
than mis-resolving).
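
The selection rule described above can be sketched as follows. This is an illustration under assumptions, not the PR's code: it treats paths as POSIX-style, counts shared leading directory components as the "depth" of the common prefix, and uses hypothetical helper names (`shared_dir_depth`, `pick_class`, `resolve_target`).

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional


def shared_dir_depth(path_a: str, path_b: str) -> int:
    """Count the leading directory components two file paths have in common."""
    dirs_a = PurePosixPath(path_a).parent.parts
    dirs_b = PurePosixPath(path_b).parent.parts
    depth = 0
    for a, b in zip(dirs_a, dirs_b):
        if a != b:
            break
        depth += 1
    return depth


def pick_class(caller_file: str, candidates: List[str]) -> Optional[str]:
    """Return the candidate qn closest to caller_file, or None if the best is tied."""
    if not candidates:
        return None
    depths = [shared_dir_depth(caller_file, qn.partition("::")[0]) for qn in candidates]
    best = max(depths)
    winners = [qn for qn, depth in zip(candidates, depths) if depth == best]
    return winners[0] if len(winners) == 1 else None


def resolve_target(
    caller_file: str, target: str, class_candidates: Dict[str, List[str]]
) -> str:
    """Qualify a bare `ClassName.method` target, leaving it bare when ambiguous."""
    if "::" in target or "." not in target:
        return target
    class_name, method = target.split(".", 1)
    qn = pick_class(caller_file, class_candidates.get(class_name, []))
    return f"{qn}.{method}" if qn else target
```

For example, with `ClassA` defined in both `/app/util/ClassA.kt` and `/app/test/ClassA.kt`, a call from `/app/util/Helper.kt` resolves to the `util` version, while a call from `/app/main/Main.kt` ties at the `/app` level and is left bare.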
gzenz force-pushed the feat/parser-cross-file-class-method-fuzzy-v2 branch from 18106c7 to 13c07bd on April 21, 2026 19:07