Distinguish CCA vs CLI baseline-build expectations #127586
Open
steveisok wants to merge 1 commit into dotnet:main
## Why
The Baseline Build section in copilot-instructions.md was written with a
single, uniformly strict rule: 'You MUST complete a baseline build
BEFORE making any code changes.' That language is correct for CCA, where
the environment is a fresh sandbox with no pre-existing artifacts and
skipping the baseline causes confusing 'missing testhost' / 'shared
framework' failures 20+ minutes into the task.
But the same language is actively harmful in CLI (interactive) use:
- A developer's workspace almost always already has a recent baseline
for the component they're working on. Re-running a 40-minute build
for every task is pure waste.
- The strict 'MUST baseline first' wording pushes a CLI-driving agent
to do exactly that — kicking off a 40-minute build before touching
any code, even when the existing artifacts would have worked fine.
- The original wording also gave no recovery path: if a CLI session
did skip the baseline and later hit a missing-testhost error, the
rule offered no guidance other than 'should have run it first.'
The two surfaces have genuinely different needs:
- CCA: fresh sandbox, no artifacts, no human nearby, skipping is
catastrophic — needs forceful, no-exceptions language.
- CLI: existing workspace state, human or driving agent in the loop,
probe-and-fail-cheap is feasible — needs flexibility and a
deterministic fallback.
A single uniformly-strict or uniformly-soft rule mis-serves one of them.
## What changes
Split the Baseline Build section into two mode-specific subsections
with strictness reflected in the headings:
- 'When running under CCA — MANDATORY' keeps the original forceful
language (MUST, BEFORE, no exceptions, STOP on failure, 'IS a task
failure') so a CCA-mode model cannot rationalize skipping.
- 'When running under CLI (interactive) — flexible' introduces a
probe-and-fall-back rule that works for both human users and local
agents driving the CLI:
  1. Check the component's sentinel artifact under artifacts/. If it
     is missing, run the baseline build.
  2. Otherwise attempt the incremental work; on a documented
     baseline-missing error, run the baseline once and retry. Do not
     loop.
  3. Honor explicit user signals ('just built' / 'fresh checkout').
A default-to-strict tiebreaker ('If you're uncertain which mode you're
in, follow the CCA rule') prevents a misclassified mode from skipping
the baseline.
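
The mode split, the CLI probe-and-fall-back steps, and the tiebreaker can be sketched roughly as below. This is an illustration under stated assumptions, not the wording in copilot-instructions.md: `Mode`, `BaselineMissingError`, `run_task`, and the callback parameters are hypothetical names, and the sketch assumes that 'just built' means "skip the probe" and 'fresh checkout' means "baseline first", which the PR text does not spell out.

```python
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    CCA = "cca"
    CLI = "cli"


class BaselineMissingError(Exception):
    """Hypothetical stand-in for a documented baseline-missing failure."""


def run_task(
    mode: Optional[Mode],
    sentinel_exists: Callable[[], bool],
    run_baseline: Callable[[], None],
    run_incremental: Callable[[], None],
    user_says_fresh_checkout: bool = False,
    user_says_just_built: bool = False,
) -> int:
    """Run the task and return how many baseline builds were performed."""
    # Tiebreaker: an unknown mode (None) follows the strict CCA rule.
    if mode is None or mode is Mode.CCA:
        run_baseline()  # mandatory; an exception here stops the task
        run_incremental()
        return 1

    # CLI: explicit user signals take precedence over the sentinel probe
    # (assumed interpretation of 'fresh checkout' / 'just built').
    if user_says_fresh_checkout:
        need_baseline = True
    elif user_says_just_built:
        need_baseline = False
    else:
        need_baseline = not sentinel_exists()

    baselines = 0
    if need_baseline:
        run_baseline()
        baselines += 1

    try:
        run_incremental()
    except BaselineMissingError:
        if baselines:
            raise  # already baselined once; do not loop
        run_baseline()
        baselines += 1
        run_incremental()  # single retry; a second failure propagates
    return baselines
```

Passing the probe and build steps as callables keeps the decision logic separate from how a given session actually inspects artifacts/ or invokes the build.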
To make the CLI rule operational, every Component-Specific Workflow
(Libraries, CoreCLR, Mono, WASM Libraries, Host, Tools, Build Tasks,
Runtime Tests) now lists a concrete Baseline sentinel path under
artifacts/ that the model can check with a single ls command.
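
A per-component sentinel check might look like the sketch below. The paths in `HYPOTHETICAL_SENTINELS` are placeholders invented for illustration; the real sentinel for each component is whatever the updated copilot-instructions.md lists, and `has_baseline` is a hypothetical helper name.

```python
from pathlib import Path

# Placeholder paths for illustration only; they are NOT the sentinel
# paths from the PR. Substitute the paths listed per component.
HYPOTHETICAL_SENTINELS = {
    "libraries": "artifacts/example-libraries-sentinel",
    "coreclr": "artifacts/example-coreclr-sentinel",
}


def has_baseline(repo_root: Path, component: str) -> bool:
    """Return True if the component's sentinel artifact exists."""
    relative = HYPOTHETICAL_SENTINELS.get(component)
    if relative is None:
        # Unknown component: report no baseline so the caller builds one.
        return False
    return (repo_root / relative).exists()
```

Treating an unknown component as "no baseline" mirrors the default-to-strict spirit of the tiebreaker: when in doubt, build.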
Step 2's 'clean working tree' guidance is also softened to acknowledge
both the baseline-up-front case (clean HEAD required) and the
baseline-after-probe case (stash work-in-progress or accept that the
baseline incorporates it).
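
The two working-tree cases could be handled along these lines. This is a sketch only: `is_clean` and `prepare_for_baseline` are hypothetical helpers, and the choice to raise an error in the baseline-up-front case is an assumption about how "clean HEAD required" would be enforced. The git invocations used (`git status --porcelain`, `git stash push --include-untracked`) are standard git commands.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints nothing when the tree is clean."""
    return porcelain_output.strip() == ""


def prepare_for_baseline(repo_root: str, after_probe: bool, stash: bool = True) -> str:
    """Check the working tree before a baseline build and report what was done."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    if is_clean(status):
        return "clean"
    if not after_probe:
        # Baseline-up-front case: a clean HEAD is required.
        raise RuntimeError("working tree is dirty; baseline needs a clean HEAD")
    if stash:
        # Baseline-after-probe case: set work-in-progress aside first.
        subprocess.run(
            ["git", "stash", "push", "--include-untracked"],
            cwd=repo_root, check=True,
        )
        return "stashed"
    # Otherwise accept that the baseline incorporates the local changes.
    return "dirty-accepted"
```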
## Net effect
- CCA behavior is unchanged: same up-front mandatory baseline, same
forceful language, same stop-on-failure.
- CLI behavior gains permission to skip a 40-minute baseline when the
workspace already has one, with a deterministic fallback if the
skip turns out to be wrong.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tagging subscribers to this area: @dotnet/runtime-infrastructure