mgmt eng, avoid double TypeSpec generation by resolving sdk_folder from tspconfig.yaml#48150

Merged
XiaofeiCao merged 8 commits into main from
mgmt_eng_avoid_generating_twice
Feb 28, 2026
Contributor

@XiaofeiCao XiaofeiCao commented Feb 27, 2026

What this PR does

Fixes #47775

When generating an SDK from TypeSpec via eng/automation/generate.py, the code previously ran tsp-client init twice for mgmt-plane libraries (remove_before_regen=True):

  1. First generation: only to discover sdk_folder via git status on tsp-location.yaml
  2. Second generation: the actual generation with correct version/api-version options

For large TypeSpec projects where TCGC is slow, this roughly doubled generation time (e.g., 35 min → 1 h 15 min).

Solution

Parse tspconfig.yaml before generation to determine the SDK output folder, eliminating the need for the first generation pass entirely. The code now always generates only once.

Changes to eng/automation/generate_utils.py

  1. _resolve_tspconfig_variables() (new): General-purpose template variable resolver that follows the same resolution order as the spec validation rules:

    • Emitter options → Parameters → Global config
    • Iterative resolution (up to 10 iterations) for nested variables (e.g., service-dir containing {output-dir})
    • Handles runtime variables ({output-dir}, {project-root}) that are provided by tsp-client, not defined in tspconfig.yaml
  2. resolve_sdk_folder_from_tspconfig() (new): Reads tspconfig.yaml (remote URL or local path), resolves emitter-output-dir from @azure-tools/typespec-java options using the variable resolver, and returns the SDK folder path relative to sdk_root. Returns None with a clear error if resolution fails.

  3. _read_tspconfig() (new): Reads tspconfig.yaml content from either a GitHub blob URL (converted to raw URL) or a local filesystem path.

  4. TSPCONFIG_URL_PATTERN (new): Module-level compiled regex for matching GitHub tspconfig.yaml blob URLs, extracted from the duplicate patterns that existed in the codebase.

  5. generate_typespec_project() (modified): Simplified to a single linear flow — resolve sdk_folder from tspconfig → optionally prep (remove old code, set version) → generate once. Fails fast with a RuntimeError if tspconfig resolution fails.
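
The resolution order described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from generate_utils.py: the function name `resolve_variables`, its signature, the `default` key under each parameter, and the choice to consult runtime variables last are all assumptions made for the example.

```python
import re
from typing import Any, Dict, Optional

# Matches a single {variable} placeholder (hypothetical pattern for this sketch).
_VAR = re.compile(r"\{([^{}]+)\}")
MAX_ITERATIONS = 10  # iteration cap mentioned in the PR description


def resolve_variables(
    value: str,
    emitter_options: Dict[str, Any],
    parameters: Dict[str, Any],
    global_config: Dict[str, Any],
    runtime_vars: Dict[str, str],
) -> str:
    """Substitute {name} placeholders until the string stops changing."""

    def lookup(name: str) -> Optional[str]:
        # Order from the PR description: emitter options -> parameters -> global config.
        option = emitter_options.get(name)
        if isinstance(option, str):
            return option
        param = parameters.get(name)
        # Assumption: each parameter is a mapping with a "default" entry.
        if isinstance(param, dict) and isinstance(param.get("default"), str):
            return param["default"]
        top_level = global_config.get(name)
        if isinstance(top_level, str):
            return top_level
        # Assumption: runtime variables such as {output-dir} are consulted last.
        return runtime_vars.get(name)

    def substitute(match: "re.Match[str]") -> str:
        found = lookup(match.group(1))
        # Unknown variables are left in place so the caller can detect them.
        return match.group(0) if found is None else found

    for _ in range(MAX_ITERATIONS):
        replaced = _VAR.sub(substitute, value)
        if replaced == value:
            break
        value = replaced
    return value
```

Leaving unknown placeholders untouched, instead of raising, lets the caller decide whether an unresolved variable is an error.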

Additional fix

The resolve_sdk_folder_from_tspconfig() approach also fixes the edge case where git status does not detect tsp-location.yaml changes when commit-id is unchanged (e.g., regen due to emitter bug fix with no spec changes, as noted in this comment).

How to verify

The change fails fast if tspconfig.yaml cannot be parsed — resolve_sdk_folder_from_tspconfig() returns None and a RuntimeError is raised with a clear error message. This is by design: all valid TypeSpec projects should have a parseable tspconfig.yaml with Java emitter options including emitter-output-dir and service-dir.
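
As a rough sketch of the fail-fast, single-pass flow described above: the helper below takes the resolver, the prep step, and the generation step as callables so that it stays self-contained. The name `generate_once` and its parameters are hypothetical and do not mirror the real generate_typespec_project() signature.

```python
from typing import Callable, Optional


def generate_once(
    tsp_project: str,
    resolve_sdk_folder: Callable[[str], Optional[str]],
    prepare: Callable[[str], None],
    run_generation: Callable[[str, str], None],
    remove_before_regen: bool,
) -> str:
    """Resolve the SDK folder first, then generate exactly once."""
    sdk_folder = resolve_sdk_folder(tsp_project)
    if sdk_folder is None:
        # Fail fast: there is no fallback to a discovery-only generation pass.
        raise RuntimeError(
            f"Failed to resolve sdk_folder from tspconfig.yaml for {tsp_project}"
        )
    if remove_before_regen:
        # Stand-in for the optional prep step (remove old code, set version).
        prepare(sdk_folder)
    run_generation(tsp_project, sdk_folder)
    return sdk_folder
```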

…ig.yaml

Parse tspconfig.yaml to determine the SDK output folder before running
generation, eliminating the need for a first generation pass that was only
used to discover the sdk_folder via git status on tsp-location.yaml.

When remove_before_regen=True (mgmt-plane), the optimized path:
- Reads tspconfig.yaml (remote URL or local) to extract service-dir and
  emitter-output-dir for Java
- Resolves template variables to determine the sdk_folder
- Runs generation only once with all options (version, api-version)

Falls back to original double-generation behavior if tspconfig parsing fails.

Also uses resolved sdk_folder as fallback in find_sdk_folder, fixing the
edge case where git status doesn't detect tsp-location.yaml changes when
commit-id is unchanged (e.g., regen due to emitter bug fix).

Fixes #47775

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@XiaofeiCao XiaofeiCao force-pushed the mgmt_eng_avoid_generating_twice branch from 2dda23a to 8dbbc7c Compare February 27, 2026 04:18
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@XiaofeiCao XiaofeiCao marked this pull request as ready for review February 27, 2026 04:51
Copilot AI review requested due to automatic review settings February 27, 2026 04:51
@XiaofeiCao XiaofeiCao changed the title Avoid double TypeSpec generation by resolving sdk_folder from tspconfig.yaml mgmt eng, avoid double TypeSpec generation by resolving sdk_folder from tspconfig.yaml Feb 27, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes TypeSpec SDK generation by eliminating double generation for management-plane libraries. Previously, generate.py ran tsp-client init twice when remove_before_regen=True: once to discover the SDK folder via git status, and again to perform the actual generation. For large TypeSpec projects, this doubled generation time (e.g., 35min → 1:15h).

Changes:

  • Added resolve_sdk_folder_from_tspconfig() to parse tspconfig.yaml and determine the SDK output folder without running generation
  • Added _read_tspconfig() to fetch tspconfig.yaml from either remote GitHub URLs or local paths
  • Modified generate_typespec_project() to use the optimized path when resolution succeeds, falling back to the original double-generation approach when it fails
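
For illustration, converting a GitHub blob URL for tspconfig.yaml into its raw-content URL could look roughly like this. The regex below is a hypothetical stand-in, not the actual TSPCONFIG_URL_PATTERN from the PR, and it assumes the ref (commit SHA or branch name) contains no slashes.

```python
import re
from typing import Optional

# Hypothetical pattern for this sketch; the real TSPCONFIG_URL_PATTERN may differ.
BLOB_URL_PATTERN = re.compile(
    r"^https://github\.com/(?P<repo>[^/]+/[^/]+)/blob/(?P<ref>[^/]+)/"
    r"(?P<path>.+/tspconfig\.yaml)$"
)


def to_raw_url(url: str) -> Optional[str]:
    """Map a github.com blob URL to raw.githubusercontent.com, or return None."""
    match = BLOB_URL_PATTERN.match(url)
    if match is None:
        # Not a blob URL; a caller could treat the input as a local path instead.
        return None
    return "https://raw.githubusercontent.com/{}/{}/{}".format(
        match.group("repo"), match.group("ref"), match.group("path")
    )
```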

Member

@weidongxu-microsoft weidongxu-microsoft left a comment

please check copilot review message

I wonder how risky it would be to depend only on the new check (without the fallback). It would make the code cleaner, but it could cause problems in corner cases (which cases could occur?).

@XiaofeiCao
Contributor Author

XiaofeiCao commented Feb 27, 2026

I wonder how risky it would be to depend only on the new check (without the fallback). It would make the code cleaner, but it could cause problems in corner cases (which cases could occur?).

The corner case I have in mind is a change to the tspconfig.yaml format (e.g., something like package-dir being replaced with emitter-output-dir).
This may not be a big issue, since tsp-client would likely also need to be bumped in that case.

"@azure-tools/typespec-client-generator-cli": "0.31.0"

@weidongxu-microsoft
Member

I wonder how risky it would be to depend only on the new check (without the fallback). It would make the code cleaner, but it could cause problems in corner cases (which cases could occur?).

The corner case I have in mind is a change to the tspconfig.yaml format (e.g., something like package-dir being replaced with emitter-output-dir). This may not be a big issue, since tsp-client would likely also need to be bumped in that case.

"@azure-tools/typespec-client-generator-cli": "0.31.0"

We can also look at typespec-validation; handling the cases that would pass it should be enough:
https://github.com/Azure/azure-rest-api-specs/blob/main/eng/tools/typespec-validation/src/rules/sdk-tspconfig-validation.ts#L212

XiaofeiCao and others added 5 commits February 27, 2026 16:02
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of generating twice (once to discover sdk_folder, once with
correct options), always parse tspconfig.yaml to determine sdk_folder
before generation. Fails fast with a clear error if tspconfig cannot
be resolved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align with azure-rest-api-specs sdk-tspconfig-validation.ts:
- General variable resolution following the same order: emitter
  options -> parameters -> global config
- Iterative resolution (up to 10 iterations) for nested variables
  (e.g., service-dir containing {output-dir})
- Handle runtime variables ({output-dir}, {project-root}) that are
  provided by tsp-client at runtime, not in tspconfig.yaml
- Support arbitrary variables like {package-name}, {namespace} etc.
- Non-string emitter-output-dir returns None
- Sanity check: resolved path must start with sdk/
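
The last two bullets (non-string emitter-output-dir returns None, resolved path must start with sdk/) might be sketched as below. The function name `to_sdk_folder`, the treatment of the runtime {output-dir} value as sdk_root, and the rejection of leftover `{` placeholders are assumptions for the example, not behavior confirmed by the PR.

```python
import posixpath
from typing import Any, Optional


def to_sdk_folder(resolved_output_dir: Any, sdk_root: str) -> Optional[str]:
    """Return the SDK folder relative to sdk_root, or None if it looks invalid."""
    if not isinstance(resolved_output_dir, str):
        # Non-string emitter-output-dir (e.g. a list or mapping) cannot be used.
        return None
    if "{" in resolved_output_dir:
        # Assumption: a leftover placeholder means variable resolution failed.
        return None
    path = posixpath.normpath(resolved_output_dir.replace("\\", "/"))
    root = posixpath.normpath(sdk_root.replace("\\", "/"))
    if path.startswith(root + "/"):
        path = path[len(root) + 1:]
    # Sanity check from the commit message: the path must live under sdk/.
    if not path.startswith("sdk/"):
        return None
    return path
```

Normalizing before the check means a value such as `/repo/sdk/../etc` is rejected instead of slipping past the `sdk/` prefix test.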

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@XiaofeiCao
Contributor Author

We can also look at typespec-validation; handling the cases that would pass it should be enough: https://github.com/Azure/azure-rest-api-specs/blob/main/eng/tools/typespec-validation/src/rules/sdk-tspconfig-validation.ts#L212

Thanks, done. Removed fallback logic as well.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 9 comments.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@XiaofeiCao XiaofeiCao merged commit 78455c9 into main Feb 28, 2026
12 checks passed
@XiaofeiCao XiaofeiCao deleted the mgmt_eng_avoid_generating_twice branch February 28, 2026 02:35
Development

Successfully merging this pull request may close these issues.

mgmt, improve performance on automation from specs "SDK Validation" CI

3 participants