feat(config): index extra source paths from config.json (+ pending core/TTS work) #37

Merged: jb-thery merged 7 commits into develop from feature/config-sources on Jul 2, 2026

@jb-thery (Member) commented Jul 2, 2026

Integrates the pending develop work and adds the new inline sources config field. Targets develop; a follow-up develop -> main PR cuts the release.

New — inline sources in .mimir/config.json

  • sources: string[] accepts file, directory, and glob paths plus ! exclusions, classified like sources.txt lines and merged with it.
  • mimir init no longer writes a sources.txt; new projects get a self-contained config.json. The legacy file is still read when present and mimir sources add/list keep working.
  • Docs: README now documents every config field (table) and the sources syntax; doctor hint, AGENTS.md, and the CLI reference updated.
  • Tests cover config.sources ingestion (path/glob/exclusion), the legacy-file merge, and config parsing.
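As a rough illustration of how entries in `sources` might be classified, here is a minimal sketch. The `classifySource` function, the `SourceEntry` type, and the set of characters treated as glob markers are assumptions made for this example, not mimir's actual implementation.

```typescript
// Hypothetical classification of one `sources` entry; the real mimir rules may differ.
type SourceEntry =
  | { kind: "exclude"; pattern: string }
  | { kind: "glob"; pattern: string }
  | { kind: "path"; path: string };

// Characters that commonly mark a glob pattern (an assumption for this sketch).
const GLOB_CHARS = /[*?[\]{}]/;

function classifySource(raw: string): SourceEntry | null {
  const entry = raw.trim();
  if (entry === "") return null; // ignore empty entries
  if (entry.startsWith("!")) {
    // A leading `!` marks an exclusion; the rest is the excluded pattern.
    return { kind: "exclude", pattern: entry.slice(1) };
  }
  if (GLOB_CHARS.test(entry)) {
    return { kind: "glob", pattern: entry };
  }
  // Files and directories cannot be told apart without touching the
  // filesystem, so this sketch leaves that decision to the caller.
  return { kind: "path", path: entry };
}

// Example values as they might appear under "sources" in .mimir/config.json.
const sources: string[] = ["docs/", "notes/todo.md", "src/**/*.ts", "!src/**/*.test.ts"];
const classified = sources.map(classifySource);
console.log(classified);
```

Keeping the classification pure (no filesystem access) makes it easy to unit test; a real ingester would still need to resolve each plain path as a file or a directory.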

Also included (already on develop locally)

  • fix(security): prevent secret exposure in redaction and the research scan.
  • refactor(core): dedupe shared utilities, centralize constants, drop dead code.
  • test(core): cover redaction, MCP guards, config, and file skipping.
  • feat(tts): English, Spanish, and French narration (+ docs).

Validation

  • Core: typecheck, Biome lint, build, and 89 tests green. TTS: build in sync, 9 tests green. Committed dist regenerated for both packages.
  • Full release gate (pnpm validate) runs in CI before any publish.

Release impact

feat commits bump the minor version → 1.3.0 when this reaches main.

jb-thery added 7 commits July 2, 2026 14:56
…scan

- Broaden built-in redaction: OpenAI (sk-), AWS (AKIA/ASIA), Google (AIza),
  Slack (xox), SendGrid, and URL-embedded credentials; case-fold IBANs.
- Apply the secret-like file skip list to the research code scan so the MCP
  research tool cannot surface lines from .env-family, key, or credential files.
- Widen the secret-file skip list (.env.*, id_rsa family, credentials, extra
  crypto extensions) behind a shared isSensitiveFilePath helper.
- Add a shared text normalizer (text.ts) used by the scan.
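The kind of prefix-based redaction described above can be sketched as follows. The prefixes (sk-, AKIA/ASIA, AIza, xox) and the URL-credential case come from the commit message; the exact lengths, character classes, the `redact` function name, and the `[REDACTED]` placeholder are assumptions for illustration only.

```typescript
// Illustrative patterns only: the prefixes come from the commit message,
// but the lengths and character classes are assumptions.
const REDACTION_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{20,}/g,       // OpenAI-style key
  /\b(?:AKIA|ASIA)[A-Z0-9]{16}\b/g, // AWS access key id
  /\bAIza[A-Za-z0-9_-]{35}/g,       // Google API key
  /\bxox[a-z]-[A-Za-z0-9-]{10,}/g,  // Slack token
];

// Credentials embedded in a URL: scheme://user:password@host
const URL_CREDENTIALS = /(\b[a-z][a-z0-9+.-]*:\/\/)[^\s/:@]+:[^\s/@]+@/gi;

function redact(text: string): string {
  // Keep the scheme, replace the user:password part.
  let out = text.replace(URL_CREDENTIALS, "$1[REDACTED]@");
  for (const pattern of REDACTION_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]");
  }
  return out;
}

// Build the secret-shaped sample at runtime so no literal key appears in the file.
const fakeAwsKeyId = "AKIA" + "A".repeat(16);
const sample = `key=${fakeAwsKeyId} url=https://user:${"p".repeat(8)}@example.com/x`;
console.log(redact(sample)); // key=[REDACTED] url=https://[REDACTED]@example.com/x
```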
…ead code

- Reuse the shared normalizeForMatch/tokenize (text.ts) and add isRecord
  (guards.ts), replacing duplicated copies in query, embeddings, ingest, config,
  semantic-config, and store.
- Centralize SOURCES_FILE_HEADER, the MIMIR_PROJECT_ROOT env name, the fast-glob
  ignore list, OCR image extensions, and the agent-kit manifest so doctor, skill,
  files, parsing, and init share one source of truth.
- Export mcp searchOptions/projectRelativeGoldenPath for testing; remove the
  unused KbCommand type alias.

Add behaviour tests for the built-in redaction patterns and disabled passthrough,
the MCP topK clamp and goldenPath traversal guard, config validation and the
mcpMaxTopK env override, the security-audit redaction/gitignore warnings, the
broadened secret-file skips, the research code-scan secret exclusion, and the
agent-kit install/doctor contract.
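For readers unfamiliar with the two MCP guards these tests target, here is a minimal sketch of a topK clamp and a path traversal guard. `clampTopK`, `safeRelativePath`, and the fallback value of 5 are hypothetical names and defaults chosen for this example; they are not the exported `searchOptions` or `projectRelativeGoldenPath` functions.

```typescript
// Clamp a requested topK into [1, maxTopK]; use a fallback for non-numeric input.
// The fallback of 5 is an assumption for this sketch.
function clampTopK(requested: unknown, maxTopK: number, fallback = 5): number {
  if (typeof requested !== "number" || !Number.isFinite(requested)) {
    return Math.min(fallback, maxTopK);
  }
  return Math.min(Math.max(Math.trunc(requested), 1), maxTopK);
}

// Return a normalized project-relative path, or null when the input is
// absolute or would escape the project root through `..` segments.
function safeRelativePath(input: string): string | null {
  const isAbsolute =
    input.startsWith("/") || input.startsWith("\\") || /^[A-Za-z]:/.test(input);
  if (input === "" || isAbsolute) return null;

  const parts: string[] = [];
  for (const segment of input.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null; // would climb above the root
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return parts.length > 0 ? parts.join("/") : null;
}

console.log(clampTopK(100, 20));                 // 20
console.log(safeRelativePath("a/./b/../c"));     // a/c
console.log(safeRelativePath("../etc/passwd"));  // null
```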
Add a --lang (en|es|fr, default fr) option to `mimir audio` and `mimir-tts render`
that selects a self-contained per-language MMS model (Xenova/mms-tts-{eng,spa,fra})
for the offline Transformers.js path and a native Microsoft neural voice for the
Edge path. Behaviour is unchanged when --lang is omitted. The supported languages
live in a single TTS_LANGUAGES source with an isTtsLanguage guard reused by both
CLIs, and the language is reported in render results, doctor output, and the
audio-summary skill.
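A single language list with a type guard, as described above, could look roughly like this. The language codes, the `fr` default, and the `Xenova/mms-tts-*` model ids come from the commit message; the shape of `TTS_LANGUAGES`, the `MMS_MODELS` map, and the `resolveTtsModel` helper are assumptions for this sketch.

```typescript
// Single source of truth for supported languages (codes from the commit message).
const TTS_LANGUAGES = ["en", "es", "fr"] as const;
type TtsLanguage = (typeof TTS_LANGUAGES)[number];

// Default used when --lang is omitted.
const DEFAULT_TTS_LANGUAGE: TtsLanguage = "fr";

// Model ids follow the Xenova/mms-tts-{eng,spa,fra} naming from the commit message.
const MMS_MODELS: Record<TtsLanguage, string> = {
  en: "Xenova/mms-tts-eng",
  es: "Xenova/mms-tts-spa",
  fr: "Xenova/mms-tts-fra",
};

// Type guard that both CLIs could reuse to validate user input.
function isTtsLanguage(value: unknown): value is TtsLanguage {
  return typeof value === "string" && (TTS_LANGUAGES as readonly string[]).includes(value);
}

// Hypothetical helper: resolve a --lang value to a model id.
function resolveTtsModel(lang?: string): string {
  if (lang === undefined) return MMS_MODELS[DEFAULT_TTS_LANGUAGE];
  if (!isTtsLanguage(lang)) {
    throw new Error(`Unsupported --lang "${lang}"; expected one of ${TTS_LANGUAGES.join(", ")}`);
  }
  return MMS_MODELS[lang];
}

console.log(resolveTtsModel());     // Xenova/mms-tts-fra
console.log(resolveTtsModel("es")); // Xenova/mms-tts-spa
```

Deriving the `TtsLanguage` type from the array keeps the runtime list and the compile-time type from drifting apart when a language is added.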
Warn contributors that scripts/public-surface-smoke.mjs scans every tracked file
(tests included) and to build secret-shaped fixtures at runtime.
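One way to build secret-shaped fixtures at runtime is to assemble them from harmless pieces, so the full literal never appears in a tracked file. The helper names below are hypothetical, and what scripts/public-surface-smoke.mjs actually matches is not shown here, so treat this as a sketch of the idea only.

```typescript
// Assemble secret-shaped strings from harmless pieces so the literal never
// appears in a tracked file that a repository-wide scanner might flag.
function fakeAwsKeyId(): string {
  return ["AK", "IA"].join("") + "X".repeat(16);
}

function fakeOpenAiStyleKey(): string {
  return ["s", "k", "-"].join("") + "a".repeat(32);
}

// A test can then feed these values to the code under test.
const fixtures = [fakeAwsKeyId(), fakeOpenAiStyleKey()];
console.log(fixtures.map((value) => value.length)); // [ 20, 35 ]
```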
Add a `sources` array to .mimir/config.json so projects can declare extra
file, directory, and glob paths (with `!` exclusions) without a separate
sources.txt. Entries are classified like sources.txt lines and merged with
it, so existing projects keep working; `mimir init` no longer writes a
sources.txt for new projects.

Document every config field and the `sources` syntax in the README, update
the doctor hint, AGENTS.md, and the CLI reference. Cover config.sources
ingestion, the legacy-file merge, and config parsing in tests. Rebuild the
committed core dist.
@jb-thery merged commit 1bc7a8b into develop on Jul 2, 2026 (7 checks passed)
@jb-thery deleted the feature/config-sources branch on July 2, 2026 10:46
@github-actions (Bot) commented Jul 2, 2026

🎉 This PR is included in version 1.3.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
