fix: correct \c escape handling in replacements (#403) by MukundaKatta · Pull Request #449 · uutils/sed

MukundaKatta · 2026-05-29T08:09:09Z

Summary

Fixes the two \c replacement-escape bugs in #403, verified against GNU sed 4.10.

\cX takes the character after \c raw. GNU allows exactly one exception: a backslash escaped as \\ denotes a literal backslash, so \c\\ yields Ctrl-\ (0x1c). Any other backslash escape after \c is rejected.
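The mapping from the \c argument to a control character can be sketched as below. This is an illustration only: it assumes the conventional GNU rule of upper-casing the argument and flipping bit 0x40, and `control_char` is a hypothetical name (the project's own `create_control_char` is not shown in this PR and may differ).

```rust
/// Hypothetical helper: map the raw argument of `\c` to a control character.
/// Assumes the "uppercase, then XOR 0x40" convention; returns None for
/// non-ASCII input, which the caller must handle separately.
fn control_char(arg: char) -> Option<char> {
    if !arg.is_ascii() {
        return None;
    }
    let upper = arg.to_ascii_uppercase() as u8;
    Some((upper ^ 0x40) as char)
}
```

Under that assumption, `A` maps to 0x01, `x` to 0x18, and a literal backslash (0x5c) to 0x1c, which agrees with the `od` output shown in this description.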

Before:

$ echo a | ./target/release/sed 's/./\c\\/'      # expected \x1c
sed: :0:10: error: unterminated substitute replacement

$ echo a | ./target/release/sed '1s/./\c\d/'     # expected error
d                                                # silently produced 'd'

After (matches GNU sed 4.10):

$ echo a | ./target/release/sed 's/./\c\\/' | od -An -tx1
 1c 0a
$ echo a | ./target/release/sed '1s/./\c\d/'
sed: ...: recursive escaping after \c not allowed

Implementation

The \c arm of parse_char_escape now consumes its argument correctly: a leading \ must be followed by another \ (literal backslash → control char); otherwise it raises the recursive-escaping error.
parse_char_escape now returns UResult<Option<char>> so the error propagates. All five call sites (regex, replacement, transliteration, character class, GNU text commands) thread lines through and apply ?.
Non-ASCII-after-\c keeps its prior literal-c fallback (byte-level GNU behavior is out of scope for a char-based parser).
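The effect of the signature change can be illustrated with a self-contained sketch. Everything here is a stand-in: `EscapeError` replaces uucore's error type, the iterator replaces the real line cursor, and this `parse_char_escape` handles only `\n` and `\c`. How the real code treats a backslash that ends the input after \c is not stated here; the sketch assumes it is rejected like any other recursive escape.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical error type standing in for the project's UResult error.
#[derive(Debug, PartialEq)]
struct EscapeError(&'static str);

/// Stand-in for parse_char_escape; the iterator sits just past a backslash.
/// Returns Ok(None) when the escape is not handled here.
fn parse_char_escape(it: &mut Peekable<Chars<'_>>) -> Result<Option<char>, EscapeError> {
    match it.peek().copied() {
        Some('n') => {
            it.next();
            Ok(Some('\n'))
        }
        Some('c') => {
            it.next(); // Consume the 'c'.
            match it.peek().copied() {
                Some('\\') => {
                    it.next(); // Consume the first backslash.
                    if it.next() == Some('\\') {
                        Ok(Some('\u{1c}'))
                    } else {
                        // Assumption: a trailing backslash is rejected too.
                        Err(EscapeError("recursive escaping after \\c not allowed"))
                    }
                }
                Some(ch) if ch.is_ascii() => {
                    it.next();
                    Ok(Some(((ch.to_ascii_uppercase() as u8) ^ 0x40) as char))
                }
                // End of input or non-ASCII: literal 'c', argument left in place.
                _ => Ok(Some('c')),
            }
        }
        _ => Ok(None),
    }
}

/// A call site: the `?` forwards the recursive-escaping error unchanged.
fn parse_replacement(src: &str) -> Result<String, EscapeError> {
    let mut out = String::new();
    let mut it = src.chars().peekable();
    while let Some(ch) = it.next() {
        if ch != '\\' {
            out.push(ch);
            continue;
        }
        match parse_char_escape(&mut it)? {
            Some(decoded) => out.push(decoded),
            // Simplification: keep the character of an unhandled escape as-is.
            None => {
                if let Some(next) = it.next() {
                    out.push(next);
                }
            }
        }
    }
    Ok(out)
}
```

With the `Result` return type, each caller needs only a `?` to surface the error, instead of silently accepting `\c\d` as `d`.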

Testing

cargo test: full suite green (285 unit + 181 integration).
cargo clippy --all-targets -- -D warnings and cargo fmt --check: clean.
New unit tests for \cx, \c\\, \c\d, and trailing-backslash.
Verified byte-for-byte against GNU sed 4.10 for \cA, \c\\, \cx, \c\d, and mixed cases.

Out of scope

\c immediately followed by the closing delimiter (e.g. s/./\c/) is a separate, delimiter-dependent edge case (GNU emits a literal \); it already errored on main and is not part of #403.

Closes #403

GNU sed takes the character after \c raw, with one exception: a backslash may be escaped as \\ to denote a literal backslash, so \c\\ yields Ctrl-\ (0x1c). Any other backslash escape after \c is rejected with 'recursive escaping after \c not allowed'. Previously \c\\ errored as an unterminated replacement and \c\d was silently accepted as 'd'. parse_char_escape now consumes the escaped backslash correctly and returns a compilation error for recursive escaping; its signature becomes UResult<Option<char>> so the error can propagate through all call sites (regex, replacement, transliteration, text commands, character class). Verified against GNU sed 4.10. Adds unit tests for \cx, \c\\, \c\d, and trailing-backslash cases. Closes uutils#403

codecov · 2026-05-29T08:12:52Z

Codecov Report

❌ Patch coverage is 98.11321% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.31%. Comparing base (7c6cbf9) to head (4ef2d5f).

Files with missing lines	Patch %	Lines
src/sed/delimited_parser.rs	98.03%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #449      +/-   ##
==========================================
+ Coverage   82.20%   82.31%   +0.10%     
==========================================
  Files          13       13              
  Lines        5542     5582      +40     
  Branches      310      313       +3     
==========================================
+ Hits         4556     4595      +39     
- Misses        983      984       +1     
  Partials        3        3

Flag	Coverage Δ
macos_latest	`83.00% <98.11%> (+0.10%)`	⬆️
ubuntu_latest	`83.10% <98.11%> (+0.10%)`	⬆️
windows_latest	`0.00% <0.00%> (ø)`


dspinellis

Thanks. Please see the comments. When done please force-push the squashed commits.

dspinellis · 2026-06-02T15:46:43Z


        'c' => {
-            // Control character escape: \cC
+            // Control character escape: \cX. The character after \c is taken


There is considerable logic here. Consider abstracting it into a new function parse_control_char() in delimited_parser.rs, as we do with parse_numeric_escape().
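One possible shape for such an extraction is sketched below. It is not the project's code: `Cursor` is a hypothetical stand-in that models only `eol()`, `current()`, and `advance()`, the error is a plain `String` rather than a `UResult`, and rejecting a backslash at end of line is an assumption.

```rust
/// Hypothetical stand-in for the parser's line cursor.
struct Cursor {
    chars: Vec<char>,
    pos: usize,
}

impl Cursor {
    fn new(s: &str) -> Self {
        Cursor { chars: s.chars().collect(), pos: 0 }
    }
    fn eol(&self) -> bool {
        self.pos >= self.chars.len()
    }
    fn current(&self) -> char {
        self.chars[self.pos]
    }
    fn advance(&mut self) {
        self.pos += 1;
    }
}

/// Sketch of an extracted helper; the cursor sits just past the 'c'.
fn parse_control_char(line: &mut Cursor) -> Result<Option<char>, String> {
    if line.eol() || !line.current().is_ascii() {
        // Nothing to control-escape; treat the lone 'c' literally.
        return Ok(Some('c'));
    }
    let arg = if line.current() == '\\' {
        line.advance(); // Consume the first backslash.
        if line.eol() || line.current() != '\\' {
            // Assumption: a backslash at end of line is rejected as well.
            return Err("recursive escaping after \\c not allowed".to_string());
        }
        '\\'
    } else {
        line.current()
    };
    line.advance(); // Consume the control argument.
    Ok(Some(((arg.to_ascii_uppercase() as u8) ^ 0x40) as char))
}
```

The `'c'` arm of the caller would then reduce to a single call, and the end-of-line and escaped-backslash cases can be unit-tested against the helper directly.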

dspinellis · 2026-06-02T15:47:59Z

+                // Nothing to control-escape; treat the lone 'c' literally.
+                Some('c')
+            } else if line.current() == '\\' {
+                line.advance(); // consume the first backslash


Write as complete sentence: Consume the first backslash.

dspinellis · 2026-06-02T15:49:26Z

+                }
+                // \\ -> literal backslash as the control argument.
+                let decoded = create_control_char('\\');
+                line.advance(); // consume the second backslash


See above regarding comment. Follow this in all comments.

dspinellis · 2026-06-02T15:49:52Z

        'c' => {
-            // Control character escape: \cC
+            // Control character escape: \cX. The character after \c is taken
+            // raw (no further escape processing). GNU sed allows exactly one


Please specify GNU sed version.

dspinellis · 2026-06-02T15:50:58Z

+    }
+
+    #[test]
+    fn test_control_escape_escaped_backslash() {


You also need to test at EOL, right?

dspinellis requested changes Jun 2, 2026

MukundaKatta wants to merge 1 commit into uutils:main from MukundaKatta:codex/fix-c-escape
