fix(csv-parse): preserve multi-byte record delimiter in raw output #491

Open
Jian-Zhang08 wants to merge 1 commit into adaltas:master from Jian-Zhang08:fix/csv-parse-raw-crlf-delimiter

With { raw: true }, the per-char loop appended only the first byte of the record delimiter to the raw buffer. The parser then advanced pos past the remaining delimiter bytes before emitting the record, so those bytes never reached raw. For multi-byte delimiters such as Windows "\r\n" this dropped the trailing byte: raw was 'a,b\r' instead of 'a,b\r\n'.

Append the remaining record-delimiter bytes to the raw buffer when a delimiter is detected, so multi-byte delimiters are preserved in full. Single-byte delimiters are unaffected (the loop body does not run).
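
To make the behaviour concrete, here is a minimal, self-contained sketch of the idea. It is not the csv-parse source: `splitRecords` and its `appendRest` flag are hypothetical names invented for illustration, and the real parser handles quoting, buffers, and delimiter auto-detection that this toy loop ignores.

```javascript
// Hypothetical toy splitter (NOT csv-parse internals): collects the raw
// text of each record, including its record delimiter.
function splitRecords(input, recordDelimiter, { appendRest = true } = {}) {
  const records = [];
  let raw = "";
  let pos = 0;
  while (pos < input.length) {
    // Per-char loop: exactly one character is appended per iteration.
    raw += input[pos];
    if (input.startsWith(recordDelimiter, pos)) {
      if (appendRest) {
        // The fix described above: append the remaining delimiter
        // characters. For a one-character delimiter this loop body
        // never runs, so single-byte delimiters are unaffected.
        for (let i = 1; i < recordDelimiter.length; i++) {
          raw += input[pos + i];
        }
      }
      // Skip past the whole delimiter before emitting the record.
      pos += recordDelimiter.length;
      records.push(raw);
      raw = "";
      continue;
    }
    pos += 1;
  }
  if (raw !== "") records.push(raw); // trailing record without delimiter
  return records;
}

const input = "a,b\r\nc,d\r\n";
// Old behaviour: the "\n" of each "\r\n" is lost from raw.
console.log(JSON.stringify(splitRecords(input, "\r\n", { appendRest: false })));
// prints ["a,b\r","c,d\r"]
// New behaviour: the full delimiter is preserved.
console.log(JSON.stringify(splitRecords(input, "\r\n")));
// prints ["a,b\r\n","c,d\r\n"]
```

With a single-character delimiter such as "\n", both settings of the hypothetical `appendRest` flag produce the same result, which mirrors the claim that single-byte delimiters are unaffected.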

Adds a regression test and rebuilds dist.

Fixes #332

Development

Successfully merging this pull request may close these issues.

parse with {raw: true} and windows line endings loses the newline
