mgmt eng, avoid double TypeSpec generation by resolving sdk_folder from tspconfig.yaml #48150
XiaofeiCao merged 8 commits into main
…ig.yaml

Parse tspconfig.yaml to determine the SDK output folder before running generation, eliminating the need for a first generation pass that was only used to discover the sdk_folder via git status on tsp-location.yaml.

When remove_before_regen=True (mgmt-plane), the optimized path:
- Reads tspconfig.yaml (remote URL or local) to extract service-dir and emitter-output-dir for Java
- Resolves template variables to determine the sdk_folder
- Runs generation only once with all options (version, api-version)

Falls back to original double-generation behavior if tspconfig parsing fails.

Also uses resolved sdk_folder as fallback in find_sdk_folder, fixing the edge case where git status doesn't detect tsp-location.yaml changes when commit-id is unchanged (e.g., regen due to emitter bug fix).

Fixes #47775

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2dda23a to 8dbbc7c
Pull request overview
This PR optimizes TypeSpec SDK generation by eliminating double generation for management-plane libraries. Previously, generate.py ran tsp-client init twice when remove_before_regen=True: once to discover the SDK folder via git status, and again to perform the actual generation. For large TypeSpec projects, this doubled generation time (e.g., 35min → 1:15h).
Changes:
- Added `resolve_sdk_folder_from_tspconfig()` to parse tspconfig.yaml and determine the SDK output folder without running generation
- Added `_read_tspconfig()` to fetch tspconfig.yaml from either remote GitHub URLs or local paths
- Modified `generate_typespec_project()` to use the optimized path when resolution succeeds, falling back to the original double-generation approach when it fails
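As a rough illustration of the remote-or-local read described in the list above, here is a minimal sketch. The names `BLOB_URL`, `to_raw_url` and `read_tspconfig_text` are hypothetical and are not taken from the PR; the regex is an assumption and only handles refs without slashes (e.g. a commit SHA). The blob-to-raw mapping relies on GitHub serving file contents from `raw.githubusercontent.com`.

```python
import re
from pathlib import Path
from typing import Optional
from urllib.request import urlopen

# Hypothetical pattern; the PR's TSPCONFIG_URL_PATTERN may look different.
BLOB_URL = re.compile(
    r"^https://github\.com/(?P<repo>[^/]+/[^/]+)/blob/(?P<ref>[^/]+)/"
    r"(?P<path>.+/tspconfig\.yaml)$"
)


def to_raw_url(location: str) -> Optional[str]:
    """Map a GitHub blob URL to its raw-content URL, or return None if it is not one."""
    match = BLOB_URL.match(location)
    if match is None:
        return None
    return (
        f"https://raw.githubusercontent.com/"
        f"{match['repo']}/{match['ref']}/{match['path']}"
    )


def read_tspconfig_text(location: str) -> str:
    """Read tspconfig.yaml text from a GitHub blob URL or a local path."""
    raw_url = to_raw_url(location)
    if raw_url is not None:
        with urlopen(raw_url, timeout=30) as response:
            return response.read().decode("utf-8")
    path = Path(location)
    # Assumption for this sketch: a directory means "the tspconfig.yaml inside it".
    if path.is_dir():
        path = path / "tspconfig.yaml"
    return path.read_text(encoding="utf-8")
```

Keeping the URL conversion in its own pure function makes it testable without network access.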
The corner cases I have in mind are when
we may also see typespec-validation, and handling the cases that would pass it should be enough.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Instead of generating twice (once to discover sdk_folder, once with correct options), always parse tspconfig.yaml to determine sdk_folder before generation. Fails fast with a clear error if tspconfig cannot be resolved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align with azure-rest-api-specs sdk-tspconfig-validation.ts:
- General variable resolution following the same order: emitter
options -> parameters -> global config
- Iterative resolution (up to 10 iterations) for nested variables
(e.g., service-dir containing {output-dir})
- Handle runtime variables ({output-dir}, {project-root}) that are
provided by tsp-client at runtime, not in tspconfig.yaml
- Support arbitrary variables like {package-name}, {namespace} etc.
- Non-string emitter-output-dir returns None
- Sanity check: resolved path must start with sdk/
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
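The resolution rules in this commit message can be sketched roughly as follows. This is not the PR's code: `resolve_variables` and `resolve_sdk_folder` are hypothetical names, the `parameters.<name>.default` layout is assumed from typical tspconfig.yaml files, and runtime variables are modelled as an empty prefix standing in for the SDK repo root. The input is an already-parsed tspconfig mapping, so no YAML library is needed here.

```python
import re
from typing import Any, Dict, Optional

VARIABLE = re.compile(r"\{([^{}]+)\}")
# Assumption: runtime variables point at the SDK repo root, modelled here as "".
RUNTIME_VARIABLES = {"output-dir": "", "project-root": ""}


def resolve_variables(
    value: str,
    emitter_options: Dict[str, Any],
    parameters: Dict[str, Any],
    global_config: Dict[str, Any],
    max_iterations: int = 10,
) -> str:
    """Substitute {name} placeholders: emitter options -> parameters -> global config."""

    def lookup(name: str) -> Optional[str]:
        option = emitter_options.get(name)
        if isinstance(option, str):
            return option
        parameter = parameters.get(name)
        if isinstance(parameter, dict) and isinstance(parameter.get("default"), str):
            return parameter["default"]
        top_level = global_config.get(name)
        if isinstance(top_level, str):
            return top_level
        return RUNTIME_VARIABLES.get(name)

    def substitute(match):
        replacement = lookup(match.group(1))
        # Leave unknown placeholders untouched so the caller can detect them.
        return match.group(0) if replacement is None else replacement

    # Repeat so that values which themselves contain placeholders get resolved.
    for _ in range(max_iterations):
        resolved = VARIABLE.sub(substitute, value)
        if resolved == value:
            break
        value = resolved
    return value


def resolve_sdk_folder(tspconfig: Dict[str, Any]) -> Optional[str]:
    """Return the Java output folder relative to the repo root, or None."""
    java = (tspconfig.get("options") or {}).get("@azure-tools/typespec-java") or {}
    output_dir = java.get("emitter-output-dir")
    if not isinstance(output_dir, str):
        return None
    resolved = resolve_variables(
        output_dir, java, tspconfig.get("parameters") or {}, tspconfig
    )
    resolved = resolved.replace("\\", "/").lstrip("/")
    # Sanity checks: nothing left unresolved, and the path lives under sdk/.
    if VARIABLE.search(resolved) or not resolved.startswith("sdk/"):
        return None
    return resolved
```

For example, with `service-dir` defaulting to `sdk/foo` and `emitter-output-dir` set to `{output-dir}/{service-dir}/azure-resourcemanager-foo`, this sketch returns `sdk/foo/azure-resourcemanager-foo`.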
Thanks, done. Removed fallback logic as well.
What this PR does
Fixes #47775
When generating SDK from TypeSpec via `eng/automation/generate.py`, the code previously ran `tsp-client init` twice for mgmt-plane libraries (`remove_before_regen=True`):

1. A first pass to discover `sdk_folder` via `git status` on `tsp-location.yaml`
2. A second pass to perform the actual generation with the correct options

For large TypeSpec projects where TCGC is slow, this doubled generation time (e.g., 35min → 1:15h).
Solution
Parse `tspconfig.yaml` before generation to determine the SDK output folder, eliminating the need for the first generation pass entirely. The code now always generates only once.

Changes to `eng/automation/generate_utils.py`

- `_resolve_tspconfig_variables()` (new): General-purpose template variable resolver that follows the same resolution order as the spec validation rules:
  - Resolution order: emitter options → parameters → global config
  - Iterative resolution (up to 10 iterations) for nested variables (e.g., `service-dir` containing `{output-dir}`)
  - Handles runtime variables (`{output-dir}`, `{project-root}`) that are provided by tsp-client, not defined in tspconfig.yaml
- `resolve_sdk_folder_from_tspconfig()` (new): Reads `tspconfig.yaml` (remote URL or local path), resolves `emitter-output-dir` from `@azure-tools/typespec-java` options using the variable resolver, and returns the SDK folder path relative to sdk_root. Returns `None` with a clear error if resolution fails.
- `_read_tspconfig()` (new): Reads `tspconfig.yaml` content from either a GitHub blob URL (converted to raw URL) or a local filesystem path.
- `TSPCONFIG_URL_PATTERN` (new): Module-level compiled regex for matching GitHub tspconfig.yaml blob URLs, extracted from the duplicate patterns that existed in the codebase.
- `generate_typespec_project()` (modified): Simplified to a single linear flow: resolve sdk_folder from tspconfig → optionally prep (remove old code, set version) → generate once. Fails fast with a `RuntimeError` if tspconfig resolution fails.

Additional fix
The `resolve_sdk_folder_from_tspconfig()` approach also fixes the edge case where `git status` does not detect `tsp-location.yaml` changes when commit-id is unchanged (e.g., regen due to emitter bug fix with no spec changes, as noted in this comment).

How to verify
The change fails fast if `tspconfig.yaml` cannot be parsed: `resolve_sdk_folder_from_tspconfig()` returns `None` and a `RuntimeError` is raised with a clear error message. This is by design, since all valid TypeSpec projects should have a parseable `tspconfig.yaml` with Java emitter options including `emitter-output-dir` and `service-dir`.
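The single-pass, fail-fast flow could look roughly like the sketch below. It is not the actual `generate_typespec_project()` implementation: `generate_single_pass` and its callable parameters are hypothetical stand-ins, injected here so the control flow can be exercised without tsp-client or the repository on disk.

```python
from typing import Callable, Optional


def generate_single_pass(
    resolve_sdk_folder: Callable[[], Optional[str]],
    prepare: Callable[[str], None],
    generate: Callable[[str], None],
    remove_before_regen: bool,
) -> str:
    """Resolve the folder first, optionally prepare it, then generate exactly once."""
    sdk_folder = resolve_sdk_folder()
    if sdk_folder is None:
        # Fail fast: no generation pass is attempted without a known folder.
        raise RuntimeError(
            "Could not resolve sdk_folder from tspconfig.yaml; "
            "check the Java emitter options (emitter-output-dir, service-dir)."
        )
    if remove_before_regen:
        # Mgmt-plane path: e.g. remove old generated code and set the version.
        prepare(sdk_folder)
    generate(sdk_folder)
    return sdk_folder
```

One way to check the behavior is to pass a resolver that returns `None` and confirm that a `RuntimeError` is raised before either `prepare` or `generate` is called.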