Implement AWK strnum semantics for input-derived numeric strings #486

Open
bertysentry wants to merge 1 commit into main from 476-implement-awk-strnum-semantics-for-input-derived-numeric-strings

@bertysentry
Contributor

Summary

  • Add an internal StrNum scalar to preserve input-derived numeric-string semantics.
  • Make comparisons follow AWK's string/number/strnum rules while keeping permissive numeric-prefix conversion for arithmetic.
  • Preserve attribute propagation across assignment, field mutation, getline, and $0/field materialization.
  • Update benchmarks and regression tests for comparison, truthiness, locale handling, and field behavior.
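
For readers unfamiliar with the string/number/strnum rule, here is a minimal sketch of the comparison logic described above. It is not the PR's code: `StrNumCompareSketch`, `InputText`, and every method name are hypothetical stand-ins, and number-to-string conversion is simplified (real AWK formats non-integers with CONVFMT). The idea is that two operands compare numerically only when each is a number or an input-derived string that looks numeric; a plain string literal always forces string comparison.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration only; not Jawk's StrNum or JRT.compare2. */
final class StrNumCompareSketch {

    /** Marks text that came from input (a field, getline, ...) rather than a literal. */
    static final class InputText {
        final String text;

        InputText(String text) {
            this.text = text;
        }
    }

    // Strict decimal syntax: the whole trimmed string must be a number.
    private static final Pattern NUMERIC =
            Pattern.compile("[+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?");

    static boolean looksNumeric(String s) {
        return NUMERIC.matcher(s.trim()).matches();
    }

    /** True for numbers and for input text that looks numeric (a strnum). */
    static boolean comparesAsNumber(Object o) {
        return o instanceof Number
                || (o instanceof InputText && looksNumeric(((InputText) o).text));
    }

    static double toNumber(Object o) {
        return o instanceof Number
                ? ((Number) o).doubleValue()
                : Double.parseDouble(((InputText) o).text.trim());
    }

    /** Simplified string conversion (assumption: no CONVFMT handling). */
    static String toText(Object o) {
        if (o instanceof InputText) {
            return ((InputText) o).text;
        }
        if (o instanceof Number) {
            double d = ((Number) o).doubleValue();
            if (d == Math.rint(d) && !Double.isInfinite(d)) {
                return Long.toString((long) d);
            }
            return Double.toString(d);
        }
        return String.valueOf(o);
    }

    /** Numeric comparison only when both sides qualify; otherwise string comparison. */
    static int compare(Object a, Object b) {
        if (comparesAsNumber(a) && comparesAsNumber(b)) {
            double x = toNumber(a);
            double y = toNumber(b);
            return x < y ? -1 : (x > y ? 1 : 0);
        }
        return toText(a).compareTo(toText(b));
    }

    public static void main(String[] args) {
        // Two input-derived values: compared as numbers.
        System.out.println(compare(new InputText("10"), new InputText("9")));
        // Two string literals: compared as strings.
        System.out.println(compare("10", "9"));
    }
}
```

In this sketch, input-derived "10" and "9" compare as numbers (10 > 9), while the same two values written as literals compare as strings ("10" sorts before "9").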

Testing

  • mvn clean formatter:format test
  • mvn -Pbenchmark package -DskipTests
  • mvn verify site

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces AWK “strnum” semantics by adding an internal scalar type to preserve input-derived numeric-string behavior, then updates runtime comparison/truthiness logic plus tests/benchmarks to validate the new rules (including locale-aware numeric recognition for comparisons).
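
To make the truthiness point concrete, here is a small sketch of how a strnum can differ from a plain string in boolean context. `TruthinessSketch`, `InputText`, and `toBoolean` are hypothetical names chosen for illustration and are not taken from the PR. It assumes the usual AWK rule: numbers and numeric-looking input strings are true when nonzero, and other strings are true when non-empty.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration only; not Jawk's JRT.toBoolean. */
final class TruthinessSketch {

    /** Marks text that came from input rather than from a string literal. */
    static final class InputText {
        final String text;

        InputText(String text) {
            this.text = text;
        }
    }

    // Strict decimal syntax: the whole trimmed string must be a number.
    private static final Pattern NUMERIC =
            Pattern.compile("[+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?");

    static boolean toBoolean(Object o) {
        if (o == null) {
            return false; // uninitialized value
        }
        if (o instanceof Number) {
            return ((Number) o).doubleValue() != 0.0;
        }
        if (o instanceof InputText) {
            String t = ((InputText) o).text.trim();
            if (NUMERIC.matcher(t).matches()) {
                // strnum: truth follows the numeric value
                return Double.parseDouble(t) != 0.0;
            }
            return !((InputText) o).text.isEmpty();
        }
        // Plain string: true when non-empty, even if it reads "0".
        return !o.toString().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(toBoolean(new InputText("0"))); // input-derived "0"
        System.out.println(toBoolean("0"));                // literal "0"
    }
}
```

Under these assumptions, a field containing 0 is false while the literal "0" is true, which is the distinction the StrNum wrapper has to preserve.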

Changes:

  • Add internal StrNum scalar and use it for input-derived $0/fields and getline targets to drive comparison/truthiness semantics.
  • Update JRT.compare2(...) to follow number vs strnum vs plain-string comparison rules; update array-key normalization to avoid leaking internal StrNum keys.
  • Add regression tests and update benchmarks to cover the new behavior and hot paths.
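
The array-key normalization mentioned above can be pictured roughly as follows. This is a sketch under stated assumptions, not the AssocArray implementation: `ArrayKeySketch`, `InputText`, and `normalizeKey` are hypothetical, and numeric keys are converted with a simplified integer rule. AWK subscripts are always strings, so an input-derived key is stored under its original text and the wrapper object never becomes a map key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not Jawk's AssocArray. */
final class ArrayKeySketch {

    /** Marks text that came from input rather than from a string literal. */
    static final class InputText {
        final String text;

        InputText(String text) {
            this.text = text;
        }
    }

    private final Map<String, Object> map = new HashMap<>();

    /** Reduces every key to the string AWK would use as the subscript. */
    static String normalizeKey(Object key) {
        if (key instanceof InputText) {
            // Use the original text, so "010" stays "010" and does not become "10".
            return ((InputText) key).text;
        }
        if (key instanceof Number) {
            double d = ((Number) key).doubleValue();
            if (d == Math.rint(d) && !Double.isInfinite(d)) {
                return Long.toString((long) d);
            }
            return Double.toString(d); // simplification: no CONVFMT handling
        }
        return String.valueOf(key);
    }

    void put(Object key, Object value) {
        map.put(normalizeKey(key), value);
    }

    Object get(Object key) {
        return map.get(normalizeKey(key));
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        ArrayKeySketch a = new ArrayKeySketch();
        a.put(new InputText("010"), "from input");
        System.out.println(a.get("010")); // same element as the input-derived key
        System.out.println(a.get(10));    // different subscript ("10")
    }
}
```

Normalizing at the map boundary keeps lookups consistent whether the subscript arrives as a field, a literal, or a number.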

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/test/java/io/jawk/StrNumSemanticsTest.java | New end-to-end tests covering input-derived comparisons, assignment propagation, $0/field materialization, truthiness, and locale. |
| src/test/java/io/jawk/PosixConformanceTest.java | Adjusts expected results for numeric-looking string literals under the new comparison rules. |
| src/test/java/io/jawk/JRTTest.java | Updates unit expectations for compare2 to treat plain String operands as string comparisons. |
| src/test/java/io/jawk/jrt/JRTComparisonNumberTest.java | Renames/extends tests for locale-aware parseability and StrNum vs number/string comparison behavior. |
| src/main/java/io/jawk/jrt/StrNum.java | Adds the internal StrNum scalar with strict numeric recognition and cached numeric conversion for comparisons/truthiness. |
| src/main/java/io/jawk/jrt/JRT.java | Implements strnum-aware comparisons/truthiness, locale decimal separator handling, and input/field propagation logic. |
| src/main/java/io/jawk/jrt/AssocArray.java | Normalizes StrNum keys to their string value to preserve AWK array index semantics. |
| src/main/java/io/jawk/backend/AVM.java | Ensures Java eval results don’t leak internal StrNum scalars; preserves field assignment value objects. |
| src/jmh/java/io/jawk/jrt/JRTHotPathBenchmark.java | Adds StrNum cases to toDouble/toBoolean hot-path benchmarks. |
| src/jmh/java/io/jawk/jrt/JRTCompare2Benchmark.java | Adds StrNum and “plain numeric-looking String” scenarios to compare2 benchmarks. |
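
The difference between strict numeric recognition (used to decide whether an input string is a strnum) and permissive numeric-prefix conversion (used for arithmetic) may be easier to see in code. The sketch below is illustrative only; `NumericRecognitionSketch` and its methods are hypothetical and do not mirror StrNum.java or JRT.java. Strict recognition requires the whole trimmed string to be a number, while the permissive conversion takes the longest leading numeric prefix and falls back to 0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not Jawk's StrNum or JRT code. */
final class NumericRecognitionSketch {

    private static final String NUMBER =
            "[+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?";

    // Strict: the entire trimmed string must be a number.
    private static final Pattern STRICT = Pattern.compile(NUMBER);

    // Permissive: optional leading whitespace, then the longest numeric prefix.
    private static final Pattern PREFIX = Pattern.compile("\\s*(" + NUMBER + ")");

    /** Decides whether a string qualifies for numeric comparison. */
    static boolean isStrictlyNumeric(String s) {
        return STRICT.matcher(s.trim()).matches();
    }

    /** Arithmetic-style conversion: leading numeric prefix, or 0 when there is none. */
    static double toNumberPermissive(String s) {
        Matcher m = PREFIX.matcher(s);
        return m.lookingAt() ? Double.parseDouble(m.group(1)) : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(isStrictlyNumeric("3abc"));      // not a strnum
        System.out.println(toNumberPermissive("3abc") + 1); // arithmetic still uses the prefix 3
    }
}
```

Keeping the two paths separate means "3abc" + 1 can still evaluate from the prefix 3, while "3abc" read from input does not trigger numeric comparison.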

Comment on lines +1306 to 1310
public void setInputLine(Object inputLine) {
    String inputText = inputLine == null ? "" : inputLine.toString();
    this.inputLine = inputText;
    recordState = new RecordState(inputText, null, false);
}
Comment on lines 1660 to 1663
  private Object getField(int fieldIndex) {
      if (fieldIndex == 0) {
-         return getRecordText();
+         return inputDerived ? new StrNum(getRecordText(), decimalSeparator) : getRecordText();
      }
Comment on lines +58 to +61
try {
    return new BigDecimal(JRT.normalizeNumberForComparison(value, decimalSeparator)).doubleValue();
} catch (NumberFormatException nfe) {
    return JRT.parseDoubleForComparison(value, decimalSeparator);
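
For context on the locale-aware parsing in the snippet above, here is a rough, self-contained sketch of the general technique: map the locale's decimal separator to '.' and hand the result to BigDecimal. It does not reproduce `JRT.normalizeNumberForComparison` or `JRT.parseDoubleForComparison`, whose behavior is not shown here; `DecimalSeparatorSketch`, `normalize`, and `parse` are hypothetical, and returning an empty OptionalDouble on failure is a choice made for this example only.

```java
import java.math.BigDecimal;
import java.util.OptionalDouble;

/** Hypothetical illustration only; not Jawk's JRT helpers. */
final class DecimalSeparatorSketch {

    /**
     * Trims the text and maps the configured separator to '.'.
     * Simplification: a '.' already present in the text is left untouched.
     */
    static String normalize(String value, char decimalSeparator) {
        String trimmed = value.trim();
        return decimalSeparator == '.' ? trimmed : trimmed.replace(decimalSeparator, '.');
    }

    /** Parses the text as a decimal number, or returns empty when it is not one. */
    static OptionalDouble parse(String value, char decimalSeparator) {
        try {
            double d = new BigDecimal(normalize(value, decimalSeparator)).doubleValue();
            return OptionalDouble.of(d);
        } catch (NumberFormatException nfe) {
            // Assumption for this sketch: report "not numeric" instead of falling back.
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("1,5", ','));  // comma is the separator here
        System.out.println(parse("1,5", '.'));  // comma is not valid here
    }
}
```

BigDecimal rejects inputs such as "NaN", "Infinity", or a trailing type suffix that Double.parseDouble would accept, which is one reason to prefer it for strict recognition.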
Development

Successfully merging this pull request may close these issues.

Implement AWK strnum semantics for input-derived numeric strings

2 participants