fix: std.parseYaml wraps single-doc YAML with explicit --- in array #971

Closed
He-Pin wants to merge 2 commits into databricks:master from He-Pin:fix/parseyaml-doc-marker-v2


@He-Pin He-Pin commented Jun 18, 2026

Contributor

Summary

Fix std.parseYaml so that single-document YAML starting with an explicit document start marker (---) is returned wrapped in an array, matching go-jsonnet behavior.

Depends on: #968 (YAML 1.2 octal fix)

Motivation

go-jsonnet treats YAML input containing an explicit document start marker (---) as a multi-document stream, always returning an array even when there is only one document. sjsonnet returned the single document directly, diverging from go-jsonnet for inputs like "---", "---\n", and "---\na: 1".

Note: jrsonnet 0.5.0-pre99 does not wrap single-doc YAML with --- in an array (returns null for "---"). This PR aligns sjsonnet with go-jsonnet's stricter behavior.

Modification

Added YamlDocStartPattern regex that detects --- followed by whitespace or end-of-string (per YAML spec). When detected and composeAll returns a single document, the result is wrapped in an array. Updated existing ParseYaml tests and go_test_suite golden file to match go-jsonnet.
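For illustration, the detect-and-wrap step can be sketched in Python as follows. This is not the Scala code from the PR: `DOC_START` is a hypothetical stand-in for `YamlDocStartPattern`, `wrap_if_marked` is an illustrative name, and the sketch assumes the marker is only checked at the very start of the input.

```python
import re

# Hypothetical stand-in for YamlDocStartPattern (assumption, not the PR's regex):
# "---" at the very start of the input, followed by whitespace or end of input.
DOC_START = re.compile(r"\A---(\s|\Z)")


def wrap_if_marked(text, docs):
    """Decide the final result shape.

    `docs` is the list of already-parsed documents, playing the role of
    what composeAll would yield after conversion to JSON values.
    """
    if len(docs) == 1:
        # A single document is wrapped only when the input starts with "---".
        return [docs[0]] if DOC_START.search(text) else docs[0]
    # Multi-document streams are returned as an array.
    return docs


print(wrap_if_marked("---\na: 1", [{"a": 1}]))   # wrapped in a list
print(wrap_if_marked("---a: 1", [{"---a": 1}]))  # "---a" is a plain key, not a marker
```

Requiring whitespace or end of input after `---` is what keeps `"---a: 1"` from being treated as a marker, since there the dashes are part of a mapping key.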

Result

std.parseYaml now wraps single-document YAML that starts with an explicit --- marker in an array, matching go-jsonnet for the inputs listed in the comparison below.

Behavior Comparison

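| YAML input | go-jsonnet v0.22.0 | jrsonnet 0.5.0-pre99 | sjsonnet (before) | sjsonnet (after) |
|-----------|-------------------|---------------------|-------------------|-----------------|
| "---"     | [null]            | null                | null (bug)        | [null]          |
| "---\n"   | [null]            | null                | null (bug)        | [null]          |
| "---\na: 1" | [{a:1}]         | {a:1}               | {a:1} (bug)       | [{a:1}]         |
| "--- 3\n" | [3]               | 3                   | 3 (bug)           | [3]             |
| "---a: 1" | {---a:1}          | {---a:1}            | {---a:1}          | {---a:1}        |
| "a: 1"    | {a:1}             | {a:1}               | {a:1}             | {a:1}           |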

jsonnet-cpp was not available locally for comparison.

Test plan

  • All ParseYaml tests pass (14 tests)
  • All FileTests pass (including new parseyaml_doc_marker.jsonnet)
  • All go_test_suite golden tests pass
  • Code passes scalafmt
  • Verified against go-jsonnet v0.22.0 and jrsonnet 0.5.0-pre99

He-Pin added 2 commits June 18, 2026 18:22
Motivation:
SnakeYAML's SafeConstructor uses YAML 1.1 implicit type resolution which
does not recognize the 0o prefix for octal integers introduced in YAML 1.2.
This caused std.parseYaml to treat unquoted 0o777 as the string "0o777"
instead of the integer 511, diverging from go-jsonnet and jrsonnet.

Modification:
Replaced SafeConstructor-based parsing with composeAll() which gives access
to raw YAML nodes with scalar style information. Added yamlNodeToJson()
that handles YAML 1.2 octal (0o prefix) for plain (unquoted) scalars while
correctly preserving quoted values as strings. Also handles all other YAML
scalar types (int, float, bool, null) with full YAML 1.1 compatibility.

Result:
std.parseYaml now correctly parses both legacy (0777) and modern (0o777)
octal syntax for unquoted values, while quoted "0o777" remains a string,
matching go-jsonnet and jrsonnet behavior exactly.

| YAML input | go-jsonnet v0.22.0 | jrsonnet 0.5.0-pre99 | sjsonnet (before) | sjsonnet (after) |
|-----------|-------------------|---------------------|-------------------|-----------------|
| 0777      | 511               | 511                 | 511               | 511             |
| 0o777     | 511               | 511                 | "0o777" (bug)     | 511             |
| 0o10      | 8                 | 8                   | "0o10" (bug)      | 8               |
| -0o777    | -511              | -511                | "-0o777" (bug)    | -511            |
| "0o777"   | "0o777"           | "0o777"             | "0o777"           | "0o777"         |
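
The octal handling described in this commit message can be sketched in Python as below. This is only an illustration of the rule, not the PR's `yamlNodeToJson`: `resolve_octal` and its `plain` flag are hypothetical names, and the sketch covers octal integers only, leaving every other scalar unchanged.

```python
import re

# Assumed patterns for illustration; the PR's actual matching logic is not shown in the source.
OCTAL_12 = re.compile(r"[-+]?0o[0-7]+")  # YAML 1.2 style, e.g. 0o777
OCTAL_11 = re.compile(r"[-+]?0[0-7]+")   # legacy YAML 1.1 style, e.g. 0777


def resolve_octal(value, plain):
    """Return an int for plain (unquoted) octal scalars, else the value unchanged.

    `plain` stands in for the scalar style information available on raw YAML nodes.
    """
    if not plain:
        return value  # quoted scalars always stay strings
    sign = -1 if value.startswith("-") else 1
    digits = value.lstrip("+-")
    if OCTAL_12.fullmatch(value):
        return sign * int(digits[2:], 8)  # drop the "0o" prefix
    if OCTAL_11.fullmatch(value):
        return sign * int(digits, 8)
    return value  # not octal: left for other resolvers


print(resolve_octal("0o777", plain=True))   # unquoted: integer
print(resolve_octal("0o777", plain=False))  # quoted: string
```

Passing the scalar style alongside the text is the point of the sketch: the same characters resolve to an integer when unquoted and remain a string when quoted.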
Motivation:
go-jsonnet treats YAML input containing an explicit document start marker
(---) as a multi-document stream, always returning an array even when
there is only one document. sjsonnet returned the single document
directly, diverging from go-jsonnet for inputs like "---", "---\n",
and "---\na: 1".

Note: jrsonnet 0.5.0-pre99 does NOT wrap single-doc YAML with --- in
an array (returns null for "---"), so this aligns sjsonnet with
go-jsonnet's stricter behavior.

Modification:
Added YamlDocStartPattern regex that detects --- followed by whitespace
or end-of-string (per YAML spec). When detected and composeAll returns
a single document, the result is wrapped in an array. Updated existing
ParseYaml tests and go_test_suite golden file to match go-jsonnet.

Result:
std.parseYaml now correctly handles YAML document start markers,
matching go-jsonnet behavior for all edge cases.

| YAML input | go-jsonnet v0.22.0 | jrsonnet 0.5.0-pre99 | sjsonnet (before) | sjsonnet (after) |
|-----------|-------------------|---------------------|-------------------|-----------------|
| "---"     | [null]            | null                | null (bug)        | [null]          |
| "---\n"   | [null]            | null                | null (bug)        | [null]          |
| "---\na:1"| [{a:1}]           | {a:1}               | {a:1} (bug)       | [{a:1}]         |
| "--- 3\n" | [3]               | 3                   | 3 (bug)           | [3]             |
| "a: 1"    | {a:1}             | {a:1}               | {a:1}             | {a:1}           |
@He-Pin He-Pin closed this Jun 18, 2026
@He-Pin He-Pin deleted the fix/parseyaml-doc-marker-v2 branch June 18, 2026 10:31