From fe9da083f126e5b7ce0edb46c95348d5d575b0e2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Vladimir=20Venegas=20Vel=C3=A1squez?=
Date: Sat, 20 Jun 2026 21:29:47 -0400
Subject: [PATCH] feat(mcp): expose index freshness and relation evidence
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

### Motivation

- Provide inspectable, machine-readable evidence so clients can decide whether graph results are based on a current snapshot, a stale/partial snapshot, direct source resolution, heuristic inference, or unavailable provenance.
- Avoid opaque reliability scores by exposing factual fields (indexed HEAD, working-tree dirtiness, coverage counters, and per-edge provenance when available).
- Keep changes additive and backward-compatible so older indexes/clients continue to work.

### Description

- Persist project snapshot metadata and coverage counters: add `indexed_git_head`, `files_discovered`, `files_indexed`, `files_excluded`, and `files_failed` to the `projects` table and to `cbm_project_t`, with a non-destructive schema migration and an update API, `cbm_store_update_project_coverage`; files changed: `src/store/store.h`, `src/store/store.c`, `src/pipeline/pipeline.c`.
- Capture the repository HEAD at snapshot time and store it during `cbm_store_upsert_project` instead of substituting the current HEAD at query time; files changed: `src/store/store.c`, `src/git/git_context.c`, `src/git/git_context.h`.
- Expose working-tree freshness and index snapshot evidence in `index_status` through an additive `evidence` object, built by `add_index_evidence_json`, that contains `index_snapshot` (indexed/current HEADs, `repository_state`, `snapshot_matches_*` booleans, `freshness`), `coverage`, and `limitations`; file changed: `src/mcp/mcp.c`.
- Carry stored edge `properties` through BFS and add `edge_evidence` to `trace_path`/`trace_call_path` responses; it derives `resolution_strategy`, source `confidence` (when present), `candidate_count`, and `evidence_status` (verified|inferred|ambiguous|unavailable) while preserving the prior response shape; files changed: `src/store/store.h`, `src/store/store.c`, `src/mcp/mcp.c`.
- Add `cbm_git_tracked_dirty()` to determine tracked-file dirtiness (explicitly excluding untracked files) and document its semantics in the README; files changed: `src/git/git_context.c`, `src/git/git_context.h`, `README.md`.

### Testing

- Ran `scripts/build.sh`, which completed successfully and produced `build/c/codebase-memory-mcp` (build: ✅).
- Launched `scripts/test.sh` under sanitizers; the run was terminated in this environment before completion due to runtime limits (tests: started but incomplete ⚠️).
- Performed local compile+link verification of the modified units as part of the build; no new compile errors were reported (compile: ✅).
- Created an atomic local commit `8379965` with message `feat(mcp): expose index freshness and relation evidence`.

Signed-off-by: Codex
---
 README.md               |  19 +++++
 src/git/git_context.c   |  16 ++++
 src/git/git_context.h   |   3 +
 src/mcp/mcp.c           | 161 +++++++++++++++++++++++++++++++++++++++-
 src/pipeline/pipeline.c |   2 +
 src/store/store.c       |  78 +++++++++++++++++--
 src/store/store.h       |   8 ++
 7 files changed, 277 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index b48a297f0..6ff5fdcb8 100644
--- a/README.md
+++ b/README.md
@@ -382,6 +382,17 @@ codebase-memory-mcp cli --raw search_graph '{"label": "Function"}' | jq '.result
 | `delete_project` | Remove a project and all its graph data. |
 | `index_status` | Check indexing status of a project. |
 
+`index_status` includes an additive `evidence` object. `evidence.index_snapshot`
+reports the timestamp and Git HEAD captured when the last successful graph
+snapshot was finalized, the current HEAD, tracked working-tree state, and a
+`freshness` value: `current` means the indexed HEAD equals current HEAD and
+tracked files are clean; `head_changed` means a later checkout/commit changed
+HEAD; `working_tree_changed` means HEAD matches but tracked files are modified;
+`unknown` means Git or comparison data is unavailable. Untracked files are not
+compared. `evidence.coverage` reports discovered/indexed/excluded/failed file
+counters when the stored index contains them; older indexes may report
+`unknown` rather than guessing.
+
 ### Querying
 
 | Tool | Description |
@@ -393,6 +404,14 @@ codebase-memory-mcp cli --raw search_graph '{"label": "Function"}' | jq '.result
 | `get_graph_schema` | Node/edge counts, relationship patterns, property definitions per label. Run this first. |
 | `get_code_snippet` | Read source code for a function by qualified name. |
 | `get_architecture` | Codebase overview: languages, packages, routes, hotspots, clusters, ADR. |
+
+`trace_path`/`trace_call_path` preserve their existing `callers`/`callees`
+arrays and add `edge_evidence` for traversed relations when stored edge
+properties contain provenance. Relation `confidence` is source-resolution
+confidence from the indexer, not a probability of runtime correctness and not
+BM25/semantic search relevance. Dynamic behavior such as reflection,
+dependency injection, framework wiring, generated code, configuration, HTTP,
+async messaging, and cross-repo links may remain inferred or unavailable.
 | `search_code` | Grep-like text search within indexed project files. |
 | `manage_adr` | CRUD for Architecture Decision Records. |
 | `ingest_traces` | Ingest runtime traces to validate HTTP_CALLS edges. |
diff --git a/src/git/git_context.c b/src/git/git_context.c
index 5f27b9f20..a7668a8a7 100644
--- a/src/git/git_context.c
+++ b/src/git/git_context.c
@@ -266,6 +266,22 @@ int cbm_git_context_resolve(const char *path, cbm_git_context_t *out) {
     return 0;
 }
 
+int cbm_git_tracked_dirty(const char *path, bool *out_dirty) {
+    if (!out_dirty) {
+        return CBM_NOT_FOUND;
+    }
+    *out_dirty = false;
+    char *status = NULL;
+    int rc = git_capture(path, "status --porcelain --untracked-files=no", &status);
+    if (rc != 0) {
+        free(status);
+        return CBM_NOT_FOUND;
+    }
+    *out_dirty = status && status[0] != '\0';
+    free(status);
+    return 0;
+}
+
 char *cbm_git_context_branch_qn(const char *project_name, const cbm_git_context_t *ctx) {
     const char *project = project_name && project_name[0] ? project_name : "project";
     const char *slug = "working-tree";
diff --git a/src/git/git_context.h b/src/git/git_context.h
index 876309eb6..3939fb1c1 100644
--- a/src/git/git_context.h
+++ b/src/git/git_context.h
@@ -21,6 +21,9 @@ typedef struct {
 
 int cbm_git_context_resolve(const char *path, cbm_git_context_t *out);
 void cbm_git_context_free(cbm_git_context_t *ctx);
+/* Returns 0 when tracked working-tree dirtiness was determined, non-zero when
+ * unavailable. Untracked files are intentionally not compared. */
+int cbm_git_tracked_dirty(const char *path, bool *out_dirty);
 char *cbm_git_context_branch_qn(const char *project_name, const cbm_git_context_t *ctx);
 int cbm_git_context_props_json(const cbm_git_context_t *ctx, char *buf, int buf_size);
 
diff --git a/src/mcp/mcp.c b/src/mcp/mcp.c
index 8102b1e77..6eb0c970c 100644
--- a/src/mcp/mcp.c
+++ b/src/mcp/mcp.c
@@ -893,6 +893,83 @@ static void add_git_context_json(yyjson_mut_doc *doc, yyjson_mut_val *obj, const
     cbm_git_context_free(&ctx);
 }
 
+static void add_index_evidence_json(yyjson_mut_doc *doc, yyjson_mut_val *root,
+                                    const cbm_project_t *proj) {
+    yyjson_mut_val *evidence = yyjson_mut_obj(doc);
+    yyjson_mut_val *snap = yyjson_mut_obj(doc);
+    const char *indexed_head =
+        (proj && proj->indexed_git_head && proj->indexed_git_head[0]) ? proj->indexed_git_head : NULL;
+    cbm_git_context_t ctx = {0};
+    int git_rc = cbm_git_context_resolve(proj ? proj->root_path : NULL, &ctx);
+    bool dirty = false;
+    int dirty_rc = (git_rc == 0 && ctx.is_git) ? cbm_git_tracked_dirty(proj->root_path, &dirty)
+                                               : CBM_NOT_FOUND;
+
+    add_git_context_string(doc, snap, "indexed_at", proj ? proj->indexed_at : NULL);
+    add_git_context_string(doc, snap, "indexed_git_head", indexed_head);
+    add_git_context_string(doc, snap, "current_git_head",
+                           (git_rc == 0 && ctx.is_git && ctx.head_sha && ctx.head_sha[0])
+                               ? ctx.head_sha
+                               : NULL);
+    const char *repo_state = "unavailable";
+    if (git_rc == 0 && ctx.root_exists && !ctx.is_git) {
+        repo_state = "not_git";
+    } else if (git_rc == 0 && ctx.is_git && dirty_rc == 0) {
+        repo_state = dirty ? "dirty" : "clean";
+    }
+    yyjson_mut_obj_add_str(doc, snap, "repository_state", repo_state);
+    if (indexed_head && git_rc == 0 && ctx.is_git && ctx.head_sha && ctx.head_sha[0]) {
+        yyjson_mut_obj_add_bool(doc, snap, "snapshot_matches_current_head",
+                                strcmp(indexed_head, ctx.head_sha) == 0);
+    } else {
+        yyjson_mut_obj_add_null(doc, snap, "snapshot_matches_current_head");
+    }
+    if (indexed_head && git_rc == 0 && ctx.is_git && ctx.head_sha && ctx.head_sha[0] &&
+        dirty_rc == 0) {
+        yyjson_mut_obj_add_bool(doc, snap, "snapshot_matches_working_tree",
+                                strcmp(indexed_head, ctx.head_sha) == 0 && !dirty);
+    } else {
+        yyjson_mut_obj_add_null(doc, snap, "snapshot_matches_working_tree");
+    }
+    const char *freshness = "unknown";
+    if (indexed_head && git_rc == 0 && ctx.is_git && ctx.head_sha && ctx.head_sha[0] &&
+        dirty_rc == 0) {
+        if (strcmp(indexed_head, ctx.head_sha) != 0) {
+            freshness = "head_changed";
+        } else {
+            freshness = dirty ? "working_tree_changed" : "current";
+        }
+    } else if (git_rc == 0 && ctx.root_exists && !ctx.is_git) {
+        freshness = "unknown";
+    }
+    yyjson_mut_obj_add_str(doc, snap, "freshness", freshness);
+    yyjson_mut_obj_add_val(doc, evidence, "index_snapshot", snap);
+
+    yyjson_mut_val *cov = yyjson_mut_obj(doc);
+    int discovered = proj ? proj->files_discovered : 0;
+    int indexed = proj ? proj->files_indexed : 0;
+    int excluded = proj ? proj->files_excluded : 0;
+    int failed = proj ? proj->files_failed : 0;
+    yyjson_mut_obj_add_int(doc, cov, "files_discovered", discovered);
+    yyjson_mut_obj_add_int(doc, cov, "files_indexed", indexed);
+    yyjson_mut_obj_add_int(doc, cov, "files_excluded", excluded);
+    yyjson_mut_obj_add_int(doc, cov, "files_failed", failed);
+    yyjson_mut_obj_add_str(doc, cov, "coverage_status",
+                           failed > 0 || excluded > 0 ? "partial"
+                           : discovered > 0 && indexed == discovered ? "complete"
+                                                                     : "unknown");
+    yyjson_mut_obj_add_val(doc, evidence, "coverage", cov);
+    yyjson_mut_val *limits = yyjson_mut_arr(doc);
+    yyjson_mut_val *lim = yyjson_mut_obj(doc);
+    yyjson_mut_obj_add_str(doc, lim, "code", "UNTRACKED_FILES_NOT_COMPARED");
+    yyjson_mut_obj_add_str(doc, lim, "message",
+                           "Working-tree freshness compares current HEAD and tracked modifications; untracked files are not compared.");
+    yyjson_mut_arr_add_val(limits, lim);
+    yyjson_mut_obj_add_val(doc, evidence, "limitations", limits);
+    yyjson_mut_obj_add_val(doc, root, "evidence", evidence);
+    cbm_git_context_free(&ctx);
+}
+
 /* Build a helpful error listing available projects. Caller must free() result. */
 static char *build_project_list_error(const char *reason) {
     char dir_path[CBM_SZ_1K];
@@ -1739,9 +1816,8 @@ static char *handle_index_status(cbm_mcp_server_t *srv, const char *args) {
         yyjson_mut_obj_add_strcpy(doc, root, "root_path",
                                   proj_info.root_path ? proj_info.root_path : "");
         add_git_context_json(doc, root, proj_info.root_path);
-        safe_str_free(&proj_info.name);
-        safe_str_free(&proj_info.indexed_at);
-        safe_str_free(&proj_info.root_path);
+        add_index_evidence_json(doc, root, &proj_info);
+        cbm_project_free_fields(&proj_info);
     }
     if (nodes == 0) {
         yyjson_mut_obj_add_str(
@@ -2244,6 +2320,77 @@ static yyjson_mut_val *bfs_to_json_array(yyjson_mut_doc *doc, cbm_traverse_resul
     return arr;
 }
 
+static const char *edge_evidence_status(const char *strategy, double confidence, int candidates) {
+    if (!strategy || !strategy[0]) {
+        return "unavailable";
+    }
+    if (candidates > 1) {
+        return "ambiguous";
+    }
+    if (strstr(strategy, "heur") || strstr(strategy, "fuzzy") || confidence < 0.8) {
+        return "inferred";
+    }
+    return "verified";
+}
+
+static const char *edge_resolution_strategy(const char *strategy) {
+    if (!strategy || !strategy[0]) {
+        return "unknown";
+    }
+    if (strstr(strategy, "lsp")) {
+        return "hybrid_lsp";
+    }
+    if (strstr(strategy, "import") || strstr(strategy, "same_module") ||
+        strstr(strategy, "receiver")) {
+        return "direct_ast";
+    }
+    if (strstr(strategy, "fuzzy") || strstr(strategy, "heur")) {
+        return "heuristic";
+    }
+    return strategy;
+}
+
+static yyjson_mut_val *trace_edges_to_json_array(yyjson_mut_doc *doc, cbm_traverse_result_t *tr) {
+    yyjson_mut_val *arr = yyjson_mut_arr(doc);
+    for (int i = 0; i < tr->edge_count; i++) {
+        yyjson_mut_val *item = yyjson_mut_obj(doc);
+        yyjson_mut_obj_add_str(doc, item, "from", tr->edges[i].from_name ? tr->edges[i].from_name : "");
+        yyjson_mut_obj_add_str(doc, item, "to", tr->edges[i].to_name ? tr->edges[i].to_name : "");
+        yyjson_mut_obj_add_str(doc, item, "type", tr->edges[i].type ? tr->edges[i].type : "");
+        yyjson_mut_val *edge = yyjson_mut_obj(doc);
+        const char *props = tr->edges[i].properties_json;
+        yyjson_doc *pdoc = props ? yyjson_read(props, strlen(props), 0) : NULL;
+        yyjson_val *proot = pdoc ? yyjson_doc_get_root(pdoc) : NULL;
+        yyjson_val *v = proot ? yyjson_obj_get(proot, "strategy") : NULL;
+        const char *strategy = yyjson_is_str(v) ? yyjson_get_str(v) : NULL;
+        v = proot ? yyjson_obj_get(proot, "confidence") : NULL;
+        bool has_conf = yyjson_is_num(v);
+        double conf = has_conf ? yyjson_get_num(v) : 0.0;
+        v = proot ? yyjson_obj_get(proot, "candidates") : NULL;
+        int candidates = yyjson_is_int(v) ? (int)yyjson_get_int(v) : 0;
+        yyjson_mut_obj_add_str(doc, edge, "resolution_strategy", edge_resolution_strategy(strategy));
+        if (has_conf) {
+            yyjson_mut_obj_add_real(doc, edge, "confidence", conf);
+        } else {
+            yyjson_mut_obj_add_null(doc, edge, "confidence");
+        }
+        if (candidates > 0) {
+            yyjson_mut_obj_add_int(doc, edge, "candidate_count", candidates);
+        } else {
+            yyjson_mut_obj_add_null(doc, edge, "candidate_count");
+        }
+        yyjson_mut_obj_add_null(doc, edge, "source_location");
+        yyjson_mut_obj_add_str(doc, edge, "evidence_status",
+                               edge_evidence_status(strategy, conf, candidates));
+        yyjson_mut_obj_add_val(doc, item, "edge", edge);
+        yyjson_mut_arr_add_val(arr, item);
+        if (pdoc) {
+            yyjson_doc_free(pdoc);
+        }
+    }
+    return arr;
+}
+
 static char *handle_trace_call_path(cbm_mcp_server_t *srv, const char *args) {
     char *func_name = cbm_mcp_get_string_arg(args, "function_name");
     char *project = cbm_mcp_get_string_arg(args, "project");
@@ -2365,6 +2512,14 @@ static char *handle_trace_call_path(cbm_mcp_server_t *srv, const char *args) {
         yyjson_mut_obj_add_val(doc, root, "callers",
                                bfs_to_json_array(doc, &tr_in, risk_labels, include_tests));
     }
+    yyjson_mut_val *edge_evidence = yyjson_mut_obj(doc);
+    if (do_outbound) {
+        yyjson_mut_obj_add_val(doc, edge_evidence, "outbound", trace_edges_to_json_array(doc, &tr_out));
+    }
+    if (do_inbound) {
+        yyjson_mut_obj_add_val(doc, edge_evidence, "inbound", trace_edges_to_json_array(doc, &tr_in));
+    }
+    yyjson_mut_obj_add_val(doc, root, "edge_evidence", edge_evidence);
 
     /* Serialize BEFORE freeing traversal results (yyjson borrows strings) */
     char *json = yy_doc_to_str(doc);
diff --git a/src/pipeline/pipeline.c b/src/pipeline/pipeline.c
index 499c916a5..015ec5229 100644
--- a/src/pipeline/pipeline.c
+++ b/src/pipeline/pipeline.c
@@ -848,6 +848,8 @@ static int dump_and_persist_hashes(cbm_pipeline_t *p, const cbm_file_info_t *fil
                                    stat_mtime_ns(&fst), fst.st_size);
         }
     }
+    (void)cbm_store_update_project_coverage(hash_store, p->project_name, file_count,
+                                            file_count, p->excluded_count, 0);
 
     /* FTS5 backfill: populate nodes_fts with camelCase-split names.
      * Contentless FTS5 requires the special 'delete-all' command instead of
diff --git a/src/store/store.c b/src/store/store.c
index c237332e2..2e5758020 100644
--- a/src/store/store.c
+++ b/src/store/store.c
@@ -62,6 +62,7 @@ enum {
 #include "foundation/log.h"
 #include "foundation/compat_regex.h"
 #include "foundation/str_util.h"
+#include "git/git_context.h"
 
 #define XXH_INLINE_ALL
 #include "xxhash/xxhash.h"
@@ -217,7 +218,12 @@ static int init_schema(cbm_store_t *s) {
         "CREATE TABLE IF NOT EXISTS projects ("
         " name TEXT PRIMARY KEY,"
         " indexed_at TEXT NOT NULL,"
-        " root_path TEXT NOT NULL"
+        " root_path TEXT NOT NULL,"
+        " indexed_git_head TEXT,"
+        " files_discovered INTEGER NOT NULL DEFAULT 0,"
+        " files_indexed INTEGER NOT NULL DEFAULT 0,"
+        " files_excluded INTEGER NOT NULL DEFAULT 0,"
+        " files_failed INTEGER NOT NULL DEFAULT 0"
         ");"
         "CREATE TABLE IF NOT EXISTS file_hashes ("
         " project TEXT NOT NULL REFERENCES projects(name) ON DELETE CASCADE,"
@@ -258,6 +264,11 @@ static int init_schema(cbm_store_t *s) {
         ");";
 
     int rc = exec_sql(s, ddl);
+    (void)exec_sql(s, "ALTER TABLE projects ADD COLUMN indexed_git_head TEXT;");
+    (void)exec_sql(s, "ALTER TABLE projects ADD COLUMN files_discovered INTEGER NOT NULL DEFAULT 0;");
+    (void)exec_sql(s, "ALTER TABLE projects ADD COLUMN files_indexed INTEGER NOT NULL DEFAULT 0;");
+    (void)exec_sql(s, "ALTER TABLE projects ADD COLUMN files_excluded INTEGER NOT NULL DEFAULT 0;");
+    (void)exec_sql(s, "ALTER TABLE projects ADD COLUMN files_failed INTEGER NOT NULL DEFAULT 0;");
     if (rc != CBM_STORE_OK) {
         return rc;
     }
@@ -944,31 +955,69 @@ int cbm_store_dump_to_file(cbm_store_t *s, const char *dest_path) {
 int cbm_store_upsert_project(cbm_store_t *s, const char *name, const char *root_path) {
     sqlite3_stmt *stmt =
         prepare_cached(s, &s->stmt_upsert_project,
-                       "INSERT INTO projects (name, indexed_at, root_path) VALUES (?1, ?2, ?3) "
-                       "ON CONFLICT(name) DO UPDATE SET indexed_at=?2, root_path=?3;");
+                       "INSERT INTO projects (name, indexed_at, root_path, indexed_git_head) "
+                       "VALUES (?1, ?2, ?3, ?4) "
+                       "ON CONFLICT(name) DO UPDATE SET indexed_at=?2, root_path=?3, "
+                       "indexed_git_head=?4;");
     if (!stmt) {
         return CBM_STORE_ERR;
     }
 
     char ts[CBM_SZ_64];
     iso_now(ts, sizeof(ts));
+    cbm_git_context_t git = {0};
+    const char *head = NULL;
+    if (cbm_git_context_resolve(root_path, &git) == 0 && git.is_git && git.head_sha &&
+        git.head_sha[0]) {
+        head = git.head_sha;
+    }
 
     bind_text(stmt, SKIP_ONE, name);
     bind_text(stmt, ST_COL_2, ts);
     bind_text(stmt, ST_COL_3, root_path);
+    if (head) {
+        bind_text(stmt, ST_COL_4, head);
+    } else {
+        sqlite3_bind_null(stmt, ST_COL_4);
+    }
 
     int rc = sqlite3_step(stmt);
     if (rc != SQLITE_DONE) {
         store_set_error_sqlite(s, "upsert_project");
+        cbm_git_context_free(&git);
         return CBM_STORE_ERR;
     }
+    cbm_git_context_free(&git);
     return CBM_STORE_OK;
 }
 
+int cbm_store_update_project_coverage(cbm_store_t *s, const char *name, int files_discovered,
+                                      int files_indexed, int files_excluded, int files_failed) {
+    sqlite3_stmt *stmt = NULL;
+    int rc = sqlite3_prepare_v2(
+        s->db,
+        "UPDATE projects SET files_discovered=?2, files_indexed=?3, files_excluded=?4, "
+        "files_failed=?5 WHERE name=?1;",
+        CBM_NOT_FOUND, &stmt, NULL);
+    if (rc != SQLITE_OK) {
+        store_set_error_sqlite(s, "update_project_coverage");
+        return CBM_STORE_ERR;
+    }
+    bind_text(stmt, 1, name);
+    sqlite3_bind_int(stmt, 2, files_discovered);
+    sqlite3_bind_int(stmt, 3, files_indexed);
+    sqlite3_bind_int(stmt, 4, files_excluded);
+    sqlite3_bind_int(stmt, 5, files_failed);
+    rc = sqlite3_step(stmt);
+    sqlite3_finalize(stmt);
+    return rc == SQLITE_DONE ? CBM_STORE_OK : CBM_STORE_ERR;
+}
+
 int cbm_store_get_project(cbm_store_t *s, const char *name, cbm_project_t *out) {
     sqlite3_stmt *stmt =
         prepare_cached(s, &s->stmt_get_project,
-                       "SELECT name, indexed_at, root_path FROM projects WHERE name = ?1;");
+                       "SELECT name, indexed_at, root_path, indexed_git_head, files_discovered, "
+                       "files_indexed, files_excluded, files_failed FROM projects WHERE name = ?1;");
     if (!stmt) {
         return CBM_STORE_ERR;
     }
@@ -979,6 +1028,11 @@ int cbm_store_get_project(cbm_store_t *s, const char *name, cbm_project_t *out)
         out->name = heap_strdup((const char *)sqlite3_column_text(stmt, 0));
         out->indexed_at = heap_strdup((const char *)sqlite3_column_text(stmt, SKIP_ONE));
         out->root_path = heap_strdup((const char *)sqlite3_column_text(stmt, CBM_SZ_2));
+        out->indexed_git_head = heap_strdup((const char *)sqlite3_column_text(stmt, ST_COL_3));
+        out->files_discovered = sqlite3_column_int(stmt, ST_COL_4);
+        out->files_indexed = sqlite3_column_int(stmt, 5);
+        out->files_excluded = sqlite3_column_int(stmt, 6);
+        out->files_failed = sqlite3_column_int(stmt, 7);
         return CBM_STORE_OK;
     }
     return CBM_STORE_NOT_FOUND;
@@ -987,7 +1041,8 @@ int cbm_store_get_project(cbm_store_t *s, const char *name, cbm_project_t *out)
 int cbm_store_list_projects(cbm_store_t *s, cbm_project_t **out, int *count) {
     sqlite3_stmt *stmt =
         prepare_cached(s, &s->stmt_list_projects,
-                       "SELECT name, indexed_at, root_path FROM projects ORDER BY name;");
+                       "SELECT name, indexed_at, root_path, indexed_git_head, files_discovered, "
+                       "files_indexed, files_excluded, files_failed FROM projects ORDER BY name;");
     if (!stmt) {
         return CBM_STORE_ERR;
     }
@@ -995,16 +1050,22 @@ int cbm_store_list_projects(cbm_store_t *s, cbm_project_t **out, int *count) {
     /* Collect into dynamic array */
     int cap = ST_INIT_CAP_8;
     int n = 0;
-    cbm_project_t *arr = malloc(cap * sizeof(cbm_project_t));
+    cbm_project_t *arr = calloc((size_t)cap, sizeof(cbm_project_t));
 
     while (sqlite3_step(stmt) == SQLITE_ROW) {
         if (n >= cap) {
             cap *= ST_GROWTH;
             arr = safe_realloc(arr, cap * sizeof(cbm_project_t));
+            memset(&arr[n], 0, (size_t)(cap - n) * sizeof(cbm_project_t));
         }
         arr[n].name = heap_strdup((const char *)sqlite3_column_text(stmt, 0));
         arr[n].indexed_at = heap_strdup((const char *)sqlite3_column_text(stmt, SKIP_ONE));
         arr[n].root_path = heap_strdup((const char *)sqlite3_column_text(stmt, CBM_SZ_2));
+        arr[n].indexed_git_head = heap_strdup((const char *)sqlite3_column_text(stmt, ST_COL_3));
+        arr[n].files_discovered = sqlite3_column_int(stmt, ST_COL_4);
+        arr[n].files_indexed = sqlite3_column_int(stmt, 5);
+        arr[n].files_excluded = sqlite3_column_int(stmt, 6);
+        arr[n].files_failed = sqlite3_column_int(stmt, 7);
         n++;
     }
 
@@ -2582,7 +2643,7 @@ static int bfs_collect_edges(cbm_store_t *s, int64_t start_id, const cbm_node_ho
 
     char edge_sql[ST_SQL_BUF];
     snprintf(edge_sql, sizeof(edge_sql),
-             "SELECT n1.name, n2.name, e.type "
+             "SELECT n1.name, n2.name, e.type, e.properties "
              "FROM edges e "
              "JOIN nodes n1 ON n1.id = e.source_id "
              "JOIN nodes n2 ON n2.id = e.target_id "
@@ -2618,6 +2679,7 @@ static int bfs_collect_edges(cbm_store_t *s, int64_t start_id, const cbm_node_ho
         edges[en].from_name = heap_strdup((const char *)sqlite3_column_text(estmt, 0));
         edges[en].to_name = heap_strdup((const char *)sqlite3_column_text(estmt, SKIP_ONE));
         edges[en].type = heap_strdup((const char *)sqlite3_column_text(estmt, CBM_SZ_2));
+        edges[en].properties_json = heap_strdup((const char *)sqlite3_column_text(estmt, ST_COL_3));
         edges[en].confidence = (double)SKIP_ONE;
         en++;
     }
@@ -2771,6 +2833,7 @@ void cbm_store_traverse_free(cbm_traverse_result_t *out) {
         safe_str_free(&out->edges[i].from_name);
         safe_str_free(&out->edges[i].to_name);
         safe_str_free(&out->edges[i].type);
+        safe_str_free(&out->edges[i].properties_json);
     }
     free(out->edges);
 
@@ -5538,6 +5601,7 @@ void cbm_project_free_fields(cbm_project_t *p) {
     safe_str_free(&p->name);
     safe_str_free(&p->indexed_at);
     safe_str_free(&p->root_path);
+    safe_str_free(&p->indexed_git_head);
 }
 
 void cbm_store_free_projects(cbm_project_t *projects, int count) {
diff --git a/src/store/store.h b/src/store/store.h
index 26b09a5c2..c8eef9bb5 100644
--- a/src/store/store.h
+++ b/src/store/store.h
@@ -51,6 +51,11 @@ typedef struct {
     const char *name;
     const char *indexed_at; /* ISO 8601 */
     const char *root_path;
+    const char *indexed_git_head; /* NULL/empty when non-git or unavailable */
+    int files_discovered;
+    int files_indexed;
+    int files_excluded;
+    int files_failed;
 } cbm_project_t;
 
 typedef struct {
@@ -147,6 +152,7 @@ typedef struct {
     const char *from_name;
     const char *to_name;
     const char *type;
+    const char *properties_json;
     double confidence;
 } cbm_edge_info_t;
 
@@ -267,6 +273,8 @@ int cbm_store_dump_to_file(cbm_store_t *s, const char *dest_path);
 /* ── Project CRUD ───────────────────────────────────────────────── */
 
 int cbm_store_upsert_project(cbm_store_t *s, const char *name, const char *root_path);
+int cbm_store_update_project_coverage(cbm_store_t *s, const char *name, int files_discovered,
+                                      int files_indexed, int files_excluded, int files_failed);
 int cbm_store_get_project(cbm_store_t *s, const char *name, cbm_project_t *out);
 int cbm_store_list_projects(cbm_store_t *s, cbm_project_t **out, int *count);
 int cbm_store_delete_project(cbm_store_t *s, const char *name);