Implement AWK strnum semantics for input-derived numeric strings#486
Open
bertysentry wants to merge 1 commit into
Pull request overview
This PR introduces AWK “strnum” semantics by adding an internal scalar type to preserve input-derived numeric-string behavior, then updates runtime comparison/truthiness logic plus tests/benchmarks to validate the new rules (including locale-aware numeric recognition for comparisons).
Changes:
- Add internal `StrNum` scalar and use it for input-derived `$0`/fields and `getline` targets to drive comparison/truthiness semantics.
- Update `JRT.compare2(...)` to follow number vs strnum vs plain-string comparison rules; update array-key normalization to avoid leaking internal `StrNum` keys.
- Add regression tests and update benchmarks to cover the new behavior and hot paths.
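The comparison and truthiness rules listed above can be illustrated with a minimal sketch. It assumes the usual POSIX AWK behavior: a comparison is numeric only when both operands are numbers or numeric-looking input values, and a plain string operand forces a string comparison. `StrNumCompareSketch`, `InputValue`, `compare`, and `truthy` are hypothetical names for illustration, not the PR's `StrNum` or `JRT.compare2(...)`, and the locale handling is left out.

```java
import java.util.regex.Pattern;

public class StrNumCompareSketch {
    // Decimal literal with optional sign, fraction, and exponent (assumed grammar for this sketch).
    private static final Pattern NUMERIC =
        Pattern.compile("[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?");

    /** Hypothetical stand-in for an input-derived value; not the PR's StrNum class. */
    static final class InputValue {
        final String text;
        InputValue(String text) { this.text = text; }
        boolean looksNumeric() { return NUMERIC.matcher(text.trim()).matches(); }
        double toDouble() { return Double.parseDouble(text.trim()); }
    }

    /** Numbers and numeric-looking input values take part in numeric comparison. */
    static boolean isNumeric(Object v) {
        return v instanceof Number
            || (v instanceof InputValue && ((InputValue) v).looksNumeric());
    }

    /** Only call this after isNumeric(v) returned true. */
    static double toDouble(Object v) {
        return v instanceof Number ? ((Number) v).doubleValue() : ((InputValue) v).toDouble();
    }

    static String toText(Object v) {
        if (v instanceof InputValue) return ((InputValue) v).text;
        if (v instanceof Number) {
            double d = ((Number) v).doubleValue();
            // Simplification: integral values print without a fraction; real AWK applies CONVFMT.
            if (d == Math.rint(d) && !Double.isInfinite(d)) return Long.toString((long) d);
            return Double.toString(d);
        }
        return String.valueOf(v);
    }

    /** Returns -1, 0, or 1. Numeric only if both sides are numeric; otherwise compares as strings. */
    static int compare(Object a, Object b) {
        if (isNumeric(a) && isNumeric(b)) {
            double x = toDouble(a), y = toDouble(b);
            return x < y ? -1 : (x > y ? 1 : 0);
        }
        return Integer.signum(toText(a).compareTo(toText(b)));
    }

    /** Numeric values are true when non-zero; everything else is true when non-empty. */
    static boolean truthy(Object v) {
        if (isNumeric(v)) return toDouble(v) != 0.0;
        return !toText(v).isEmpty();
    }
}
```

Under these assumptions, an input field holding `10.0` compares equal to the number `10`, while the string literal `"10.0"` does not, and an input field holding `0` is false while the string literal `"0"` is true.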
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/test/java/io/jawk/StrNumSemanticsTest.java | New end-to-end tests covering input-derived comparisons, assignment propagation, $0/field materialization, truthiness, and locale. |
| src/test/java/io/jawk/PosixConformanceTest.java | Adjusts expected results for numeric-looking string literals under the new comparison rules. |
| src/test/java/io/jawk/JRTTest.java | Updates unit expectations for compare2 to treat plain String operands as string comparisons. |
| src/test/java/io/jawk/jrt/JRTComparisonNumberTest.java | Renames/extends tests for locale-aware parseability and StrNum vs number/string comparison behavior. |
| src/main/java/io/jawk/jrt/StrNum.java | Adds the internal StrNum scalar with strict numeric recognition and cached numeric conversion for comparisons/truthiness. |
| src/main/java/io/jawk/jrt/JRT.java | Implements strnum-aware comparisons/truthiness, locale decimal separator handling, and input/field propagation logic. |
| src/main/java/io/jawk/jrt/AssocArray.java | Normalizes StrNum keys to their string value to preserve AWK array index semantics. |
| src/main/java/io/jawk/backend/AVM.java | Ensures Java eval results don’t leak internal StrNum scalars; preserves field assignment value objects. |
| src/jmh/java/io/jawk/jrt/JRTHotPathBenchmark.java | Adds StrNum cases to toDouble/toBoolean hot-path benchmarks. |
| src/jmh/java/io/jawk/jrt/JRTCompare2Benchmark.java | Adds StrNum and “plain numeric-looking String” scenarios to compare2 benchmarks. |
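The `AssocArray` row above describes normalizing internal keys to their string value. A minimal sketch of that idea follows, assuming a wrapper type for input-derived text and a map-backed array. `KeyNormalizationSketch`, `InputText`, and `normalizeKey` are hypothetical names, not the actual `AssocArray` API.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyNormalizationSketch {
    /** Hypothetical wrapper for input-derived text; stands in for the PR's StrNum. */
    static final class InputText {
        final String text;
        InputText(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    private final Map<Object, Object> entries = new HashMap<>();

    /** Unwrap input-derived keys to plain strings so the wrapper never becomes a map key. */
    static Object normalizeKey(Object key) {
        return key instanceof InputText ? ((InputText) key).text : key;
    }

    void put(Object key, Object value) { entries.put(normalizeKey(key), value); }

    Object get(Object key) { return entries.get(normalizeKey(key)); }

    int size() { return entries.size(); }
}
```

With this normalization, a value stored under an input-derived `01` can be read back with the string `"01"`, and it stays distinct from the key `"1"`, since the subscript keeps its original text.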
Comment on lines +1306 to 1310

    public void setInputLine(Object inputLine) {
      String inputText = inputLine == null ? "" : inputLine.toString();
      this.inputLine = inputText;
      recordState = new RecordState(inputText, null, false);
    }
Comment on lines 1660 to 1663

      private Object getField(int fieldIndex) {
        if (fieldIndex == 0) {
    -     return getRecordText();
    +     return inputDerived ? new StrNum(getRecordText(), decimalSeparator) : getRecordText();
        }
Comment on lines +58 to +61

    try {
      return new BigDecimal(JRT.normalizeNumberForComparison(value, decimalSeparator)).doubleValue();
    } catch (NumberFormatException nfe) {
      return JRT.parseDoubleForComparison(value, decimalSeparator);
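The excerpt above parses with `BigDecimal` first and falls back to a second parser on failure. A minimal sketch of locale-aware parsing in that style follows, assuming normalization simply rewrites the locale's decimal separator to `.`. `LocaleNumberSketch`, `normalize`, and `parse` are hypothetical names, and the actual behavior of `JRT.normalizeNumberForComparison` and `JRT.parseDoubleForComparison` is not shown in the excerpt, so the fallback here just returns `0.0`.

```java
import java.math.BigDecimal;

public class LocaleNumberSketch {
    /** Hypothetical helper: trim blanks and rewrite the locale decimal separator to '.'. */
    static String normalize(String text, char decimalSeparator) {
        String trimmed = text.trim();
        return decimalSeparator == '.' ? trimmed : trimmed.replace(decimalSeparator, '.');
    }

    /** Parse with BigDecimal; this sketch falls back to 0.0 when the text is not a number. */
    static double parse(String text, char decimalSeparator) {
        try {
            return new BigDecimal(normalize(text, decimalSeparator)).doubleValue();
        } catch (NumberFormatException nfe) {
            return 0.0; // assumption: the real fallback parser may behave differently
        }
    }
}
```

Under a comma locale this reads `3,14` as 3.14. Note that the sketch also accepts `3.14` under that locale; whether the PR rejects that form is not visible in the excerpt.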
Summary
- `StrNum` scalar to preserve input-derived numeric-string semantics.
- `$0`/field materialization.

Testing
- `mvn clean formatter:format test`
- `mvn -Pbenchmark package -DskipTests`
- `mvn verify site`