feat: send previousDomains on full refresh to stabilize domain names#111
Conversation
When Generate runs with --force and a prior shards.json cache exists, extract domain names + subdomain counts and pass them as the previousDomains query param. The API (v0.12.0) seeds the LLM prompt to reuse those names where the same groupings still apply. Tested on Directus (17,877 nodes) against production: - Run 1 (fresh): ArtificialIntelligence, DataModel, CoreServices, ServerInfrastucture, CliCommands, DirectusApp - Run 2 (seeded): 83% name reuse (5/6 identical, 1 typo corrected by LLM: ServerInfrastucture → ServerInfrastructure) Closes supermodeltools#110
WalkthroughThis PR adds a Changes
Sequence DiagramsequenceDiagram
participant Handler as Shards Handler
participant Cache as Cache Layer
participant Client as API Client
participant API as Analysis API
participant LLM as LLM Analyzer
Handler->>Cache: Read cached `ShardIR` (cacheFile)
Cache-->>Handler: Return cached analysis result
Handler->>Handler: Extract domains → `[]PreviousDomain`
Handler->>Client: AnalyzeShards(ctx, zipPath, idemKey, prevDomains)
Client->>Client: Marshal `previousDomains` to JSON
Client->>Client: Append `previousDomains` query param to endpoint
Client->>API: POST /analyze?previousDomains=[...]
API->>LLM: Seed LLM with domain names & counts
LLM-->>API: Return generated analysis
API-->>Client: Return analysis result
Client-->>Handler: Return `ShardIR`
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/client.go`:
- Around line 119-123: The struct PreviousDomain has a goimports formatting
issue (likely missing/extra blank line or import grouping) causing CI to fail;
fix by running goimports -w on the file or manually adjust spacing/newlines
around the PreviousDomain declaration so imports and code are properly
formatted, then re-run goimports/gofmt to ensure the file compiles cleanly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d38d8090-4f83-40c3-bcb1-4d201d1c79c2
📒 Files selected for processing (4)
internal/analyze/handler.gointernal/api/client.gointernal/shards/daemon.gointernal/shards/handler.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/client.go`:
- Line 128: Update the comment that currently reads "If previousDomains is
non-nil, passes them as a query param to seed domain names." to accurately
reflect the actual trigger used in the code: only send previousDomains when the
slice has elements (len(previousDomains) > 0). Locate the comment near the code
that checks previousDomains and sends it as a query parameter (look for the
previousDomains variable and the conditional that builds the query) and change
the wording to mention "non-empty slice" or "len(previousDomains) > 0" so it
matches the implementation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 953e7abb-803e-4613-80d0-b90f90c3dccc
📒 Files selected for processing (1)
internal/api/client.go
| // required for sidecar rendering (IDs, labels, properties preserved). | ||
| func (c *Client) AnalyzeShards(ctx context.Context, zipPath, idempotencyKey string) (*ShardIR, error) { | ||
| job, err := c.pollUntilComplete(ctx, zipPath, idempotencyKey) | ||
| // If previousDomains is non-nil, passes them as a query param to seed domain names. |
There was a problem hiding this comment.
Fix comment mismatch on the trigger condition.
Line 128 says "non-nil", but the code sends previousDomains only when the slice is non-empty (len(previousDomains) > 0). That wording can mislead future readers.
✏️ Suggested doc fix
-// If previousDomains is non-nil, passes them as a query param to seed domain names.
+// If previousDomains is non-empty, passes them as a query param to seed domain names.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // If previousDomains is non-nil, passes them as a query param to seed domain names. | |
| // If previousDomains is non-empty, passes them as a query param to seed domain names. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/api/client.go` at line 128, Update the comment that currently reads
"If previousDomains is non-nil, passes them as a query param to seed domain
names." to accurately reflect the actual trigger used in the code: only send
previousDomains when the slice has elements (len(previousDomains) > 0). Locate
the comment near the code that checks previousDomains and sends it as a query
parameter (look for the previousDomains variable and the conditional that builds
the query) and change the wording to mention "non-empty slice" or
"len(previousDomains) > 0" so it matches the implementation.
Problem
Every
supermodel analyze --forceproduces different domain names because the LLM classifies from scratch.SharedKernelbecomesCoreUtilities,DataServicesbecomesDataModel, etc.Fix
When
Generateruns and a priorshards.jsoncache exists, extract domain names + subdomain counts and pass them as thepreviousDomainsquery param to the API (v0.12.0). The LLM is seeded to reuse those names.4 files changed, 30 lines added.
Production test on Directus (17,877 nodes)
83% exact match. The one "change" was the LLM correcting a typo in the domain name it generated on run 1.
Closes #110
Summary by CodeRabbit