
Local integrated #567

Open
win4r wants to merge 10 commits into DeusData:main from win4r:local-integrated

Conversation

win4r commented Jun 23, 2026

Contributor

What does this PR do?

Checklist

  • Every commit is signed off (git commit -s) — required, CI rejects
    unsigned commits (DCO, see CONTRIBUTING.md)
  • Tests pass locally (make -f Makefile.cbm test)
  • Lint passes (make -f Makefile.cbm lint-ci)
  • New behavior is covered by a test (reproduce-first for bug fixes)

KerseyFabrications and others added 10 commits June 20, 2026 23:09
A node group variable carried through a WITH aggregation
(e.g. `WITH g, count(*) AS c RETURN g.file_path`) returned blank for every
property except its name: the carried virtual binding held only the group
key (the node's name) and lacked a store handle, so node_prop() could
neither read other fields nor compute degrees.
Fix: capture the node id of a bare node group-var in with_agg_find_or_create
and tag the carried virtual binding with it; in node_prop(), when such a stub
(id set, string fields unpopulated) is asked for a missing property, re-fetch
the full node via cbm_store_find_node_by_id and project it. Also propagate
the store onto virtual bindings so node_prop can re-fetch and compute
degrees. The stub gate is heuristic but never yields a wrong value — worst
case is one redundant indexed lookup. Adds regression test
cypher_exec_with_node_groupvar_prop.

Signed-off-by: Kris Kersey <kris@kerseyfabrications.com>
(cherry picked from commit 8b03974)
…esults

MATCH (c:Class)-[:DEFINES_METHOD]->(m:Method) returned at most 10 results
for any class, regardless of how many methods it actually has.

Root cause: bind_cap was set to scan_count (the number of nodes matched in
the initial pattern — typically 1 when querying a single class by name).
max_new = bind_cap * 10 = 10, so the edge expansion loop exited after
collecting 10 results. No error, no warning, no truncation indicator.

This is language-agnostic: any class with more than 10 methods in any
language was silently truncated. The fix is a one-line change:
  bind_cap = scan_count > max_rows ? scan_count : max_rows

Regression test: a Python class with 15 methods must return all 15 via
MATCH (c:Class)-[:DEFINES_METHOD]->(m:Method) with label filtering.
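
The arithmetic behind the truncation can be sketched in isolation. This is a minimal illustration, not the project's code: the helper names are hypothetical, and only `bind_cap`, `scan_count`, `max_rows` and the factor of 10 come from the commit message above.

```c
#include <stddef.h>

/* Hypothetical helper: the expansion limit (max_new) as described for the
 * old behaviour, where bind_cap was simply scan_count. */
static size_t expansion_limit_before(size_t scan_count) {
    size_t bind_cap = scan_count;
    return bind_cap * 10;
}

/* Hypothetical helper: the same limit with the fix applied, where bind_cap
 * is the larger of scan_count and max_rows. */
static size_t expansion_limit_after(size_t scan_count, size_t max_rows) {
    size_t bind_cap = scan_count > max_rows ? scan_count : max_rows;
    return bind_cap * 10;
}
```

With a single matched class (`scan_count == 1`) the old limit is 10 regardless of `max_rows`; with the fix and an assumed `max_rows` of 200, the limit becomes 2000, so a 15-method class is no longer cut off.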

Signed-off-by: Thomas Dyar <tdyar@intersystems.com>
(cherry picked from commit c43fc8d)
A call carrying enough long arguments drove append_args_json()'s running
position past the fixed CBM_SZ_2K `props` stack buffer in
emit_normal_calls_edge(): format_call_arg() returns snprintf's *untruncated*
length, so `pos += (size_t)n` could exceed `bufsize`, after which the
trailing `buf[pos] = '\0'` (and `buf[pos++] = ']'`) wrote out of bounds. The
stack canary caught it as SIGABRT, so full-repo indexing of large TypeScript
codebases crashed the server in the parallel resolve pass
(emit_service_edge -> emit_normal_calls_edge -> finalize_and_emit ->
append_args_json). Confirmed with AddressSanitizer:
stack-buffer-overflow WRITE at pass_parallel.c:1124, 'props' (2048 B).

Fix: when an argument does not fully fit, roll back to before its separator
and stop appending (atomic field, matching append_json_string's behaviour),
so `pos` can never advance past the buffer.
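
The roll-back idea can be sketched as follows. This is a simplified stand-in under stated assumptions, not the actual append_args_json(): the function names are hypothetical, arguments are quoted without JSON escaping, and two bytes are always reserved for the closing `]` and the NUL terminator.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: append one quoted argument (with a leading comma
 * unless it is the first). Returns 1 on success. Returns 0 if the argument
 * does not fully fit, leaving *pos where it was before the separator. */
static int append_arg_atomic(char *buf, size_t bufsize, size_t *pos,
                             const char *arg, int is_first) {
    const size_t reserve = 2;              /* room for ']' and '\0' */
    size_t start = *pos;
    if (start + reserve >= bufsize) return 0;
    size_t avail = bufsize - reserve - start;
    /* snprintf returns the untruncated length, so compare it to avail
     * instead of adding it to pos blindly. */
    int n = snprintf(buf + start, avail, "%s\"%s\"", is_first ? "" : ",", arg);
    if (n < 0 || (size_t)n >= avail) {
        buf[start] = '\0';                 /* roll back the partial write */
        return 0;
    }
    *pos = start + (size_t)n;
    return 1;
}

/* Hypothetical driver: build a JSON-like array, stopping at the first
 * argument that does not fit. Returns the length of the result. */
static size_t build_args_json(char *buf, size_t bufsize,
                              const char *const *args, size_t nargs) {
    if (bufsize < 3) {
        if (bufsize > 0) buf[0] = '\0';
        return 0;
    }
    size_t pos = 0;
    buf[pos++] = '[';
    buf[pos] = '\0';
    for (size_t i = 0; i < nargs; i++) {
        if (!append_arg_atomic(buf, bufsize, &pos, args[i], i == 0)) break;
    }
    buf[pos++] = ']';                      /* always in bounds: space reserved */
    buf[pos] = '\0';
    return pos;
}
```

Because the closing bracket and terminator are reserved up front, the final writes stay inside the buffer even when an oversized argument is dropped.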

Add regression test parallel_args_json_no_overflow: indexes a fixture whose
single call carries 60 long string args (args JSON well past 2 KB); under the
ASan test build it aborts without this fix and passes with it.

Signed-off-by: Andrius Skerla <1492322+rainder@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 74d15a6)
Signed-off-by: Saurav Kumar <sauravsk2507@gmail.com>
(cherry picked from commit c3a1a79)
git_allocator moved out of the top-level git2.h into git2/sys/alloc.h
in libgit2 1.8.0. Add an explicit include so the mimalloc binding
compiles against libgit2 >= 1.8 (e.g. MacPorts libgit2 1.9.4).

(cherry picked from commit 586fc8a)
manage_adr stores ADRs in project_summaries, but a full re-index
(triggered by file changes or new files) deletes the DB in
try_incremental_or_delete_db and rebuilds it from the graph buffer,
which writes an empty project_summaries table. file_hashes were
re-persisted after the rebuild but project_summaries were not, so the
ADR was silently lost.

Fix: capture the ADR before the DB is unlinked, stash it on the
pipeline struct, and restore it after the rebuilt DB is reopened in
dump_and_persist_hashes. The incremental path is unaffected (it never
rewrites the DB). Verified: ADR now survives a full re-index.

Signed-off-by: RithvikReddy0-0 <rithvikreddymukkara@gmail.com>
(cherry picked from commit 7b6c063)
detect_changes advertised a `since` parameter in its inputSchema but the
handler never read it — it always diffed against base_branch (default
"main"), so detect_changes(since="HEAD~10") silently returned the wrong or
empty result when HEAD was on the default branch.

Fix: read `since` and, when present, route it through base_branch so the
existing shell-arg validation (cbm_validate_shell_arg) and the
`<base>...HEAD` diff apply unchanged; `since` takes precedence over
base_branch. Also narrows the schema description — the prior "date" form
(e.g. 2026-01-01) is not a revision and never worked through this path — and
documents the inherited three-dot semantics. Adds regression tests
tool_detect_changes_since and tool_detect_changes_since_precedence.
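
The precedence rule can be sketched like this. It is an illustration only: the helper names are hypothetical, the "main" default is taken from the description above, and the validation step (cbm_validate_shell_arg in the real handler) is deliberately left out.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: `since` wins when present, then base_branch,
 * then the default "main". */
static const char *resolve_diff_base(const char *since,
                                     const char *base_branch) {
    if (since && since[0] != '\0') return since;
    if (base_branch && base_branch[0] != '\0') return base_branch;
    return "main";
}

/* Hypothetical helper: format the three-dot range "<base>...HEAD".
 * Returns 0 on success, -1 if the range does not fit in `out`. */
static int format_diff_range(char *out, size_t outsize,
                             const char *since, const char *base_branch) {
    int n = snprintf(out, outsize, "%s...HEAD",
                     resolve_diff_base(since, base_branch));
    return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

Note that `git diff A...HEAD` compares HEAD against the merge base of A and HEAD, which is the inherited three-dot behaviour the schema description now documents.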

Refs DeusData#371

Signed-off-by: Kris Kersey <kris@kerseyfabrications.com>
(cherry picked from commit 53501b0)
trace_path resolved a function_name from the first row of an unordered name
query with no ambiguity check, so a same-named entity (e.g. a shell script's
main()) could silently shadow the intended C main(). get_code_snippet
reported "ambiguous" for a short name even when one match was the obvious
definition (the .c body vs a .h declaration).

Fix: add a deterministic resolution ranking — a callable label outranks a
module, then the larger definition by line span wins, preferring a real
definition without hardcoding file extensions — and a picker that flags a
genuine tie. trace_path now traces the preferred node and returns the
existing ambiguous-suggestions response on a true tie instead of silently
taking nodes[0]; get_code_snippet resolves directly to the preferred match,
reporting ambiguity only for real ties. Adds regression tests
tool_trace_call_path_ambiguous and tool_trace_call_path_prefers_definition.
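
One way to read the ranking is as a two-key comparison followed by a tie check. The sketch below assumes a hypothetical candidate_t struct and function names; the real node type and picker in the codebase may differ.

```c
#include <stddef.h>

/* Hypothetical candidate record: only the fields the ranking needs. */
typedef struct {
    int is_callable;   /* 1 for a function/method label, 0 for a module */
    int start_line;
    int end_line;
} candidate_t;

/* Returns >0 if a outranks b, <0 if b outranks a, 0 on a genuine tie.
 * A callable label comes first, then the larger line span. */
static int candidate_cmp(const candidate_t *a, const candidate_t *b) {
    if (a->is_callable != b->is_callable) return a->is_callable ? 1 : -1;
    int span_a = a->end_line - a->start_line;
    int span_b = b->end_line - b->start_line;
    if (span_a != span_b) return span_a > span_b ? 1 : -1;
    return 0;
}

/* Returns the index of the unique best candidate, or -1 when the list is
 * empty or the top rank is shared, so the caller can report ambiguity. */
static int pick_preferred(const candidate_t *c, size_t n) {
    if (n == 0) return -1;
    size_t best = 0;
    int tie = 0;
    for (size_t i = 1; i < n; i++) {
        int r = candidate_cmp(&c[i], &c[best]);
        if (r > 0) {
            best = i;
            tie = 0;       /* a strictly better candidate clears old ties */
        } else if (r == 0) {
            tie = 1;
        }
    }
    return tie ? -1 : (int)best;
}
```

A one-line declaration and a 40-line body with the same name resolve to the body, while two candidates with the same label kind and span return -1 and fall through to the ambiguous-suggestions response.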

Signed-off-by: Kris Kersey <kris@kerseyfabrications.com>
(cherry picked from commit 382dc24)
Signed-off-by: King Star <mcxin.y@gmail.com>
(cherry picked from commit 935027a)
Mark this as a community fork of DeusData/codebase-memory-mcp (MIT, © 2025
DeusData) and list the integrated incremental-reindex fix (DeusData#528) plus the
9 cherry-picked upstream PRs (DeusData#465 DeusData#412 DeusData#475 DeusData#527 DeusData#512 DeusData#539 DeusData#464 DeusData#466 DeusData#526).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: win4r <win4r@outlook.com>

8 participants