Throw on iconv() failure and default fputcsv escape to empty string#149
Merged
Conversation
ADmad
reviewed
May 11, 2026
Member
Author
|
Addressed in 54ab09b. Added a
Tests for both ignore and transliterate paths added. Moved the conversion into a new |
ADmad
approved these changes
May 11, 2026
54ab09b to
c1c75fb
Compare
Three related changes to the row writer in `_generateRow()` /
`_transcode()`:
1. iconv() returning false (e.g. unsupported target encoding,
malformed source bytes) previously got concatenated into the
accumulator where `(string)false === ''` silently produced an
empty row. Now throws a CakeException in `strict` mode.
2. PHP 8.4 deprecates passing a non-empty `$escape` to fputcsv().
The shipped default was the legacy backslash, which triggers
E_DEPRECATED on every render under 8.4+. Default is now `''`
(RFC 4180-compliant: quotes inside a field are escaped by
doubling them rather than by a preceding backslash). Users on
legacy PHP-style escaping can still set `escape => '\\'`.
3. Per review: add a configurable `transcodingMode` option so
apps that cannot fix bad input can opt out of the throw. Three
class-constant values:
- `TRANSCODING_MODE_STRICT` (default): throws on iconv failure.
- `TRANSCODING_MODE_IGNORE`: passes `//IGNORE` to iconv so
unconvertible characters are dropped and the rest of the row
is preserved. For mbstring, `mb_substitute_character('none')`
is set for the duration of the call.
- `TRANSCODING_MODE_TRANSLITERATE`: passes `//TRANSLIT//IGNORE`
so accented Latin etc. transliterate to ASCII; mbstring
falls back to ignore (no transliteration support).
The transcoding logic moves into a new protected `_transcode()`
hook so apps can override the policy per-view.
The iconv warning emitted right before a strict-mode failure is
consumed by a scoped `set_error_handler()` so the user sees the
clear CakeException rather than two near-duplicate signals; this
also keeps PHPUnit's failOnWarning bar happy.
Tests: iconv strict throw, ignore drops unconvertible chars,
transliterate converts accented Latin, and fputcsv default emits
RFC 4180 escaping with no E_DEPRECATED.
4414975 to
c28e4aa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two small correctness fixes around the row writer. Two commits, one per fix — revert independently if needed.
1.
iconv()failure throws instead of silently dropping a rowWhen
dataEncoding !== csvEncodingand the iconv path is selected,_generateRow()previously assignediconv(...)straight into$csv.iconv()can returnfalse(unsupported target encoding, malformed source bytes); thatfalsewas concatenated into the accumulator where(string)false === '', producing a row of nothing with no exception, no log, no signal to the user. Their export looked successful but quietly contained empty rows.Now we check the return value and throw a
CakeExceptionnaming the source and target encodings so the failure is impossible to miss.Regression test:
testIconvFailureThrowsconfigures a bogus target encoding to forceiconv()to returnfalseand asserts the exception fires (wrapped bySerializedView'sSerializationFailureException).2. Default
escapeis now''(RFC 4180, PHP 8.4 compliant)PHP 8.4 deprecates passing a non-empty
$escapetofputcsv(). The shipped default was the legacy'\\'value introduced by #145 specifically to make the parameter explicit — but on PHP 8.4 this triggersE_DEPRECATEDfor every rendered CSV. PHP plans to make it a hard error in a future release.Switching the default to
'':"") instead of by a preceding backslash.escape => '\\'explicitly (and will see the matching PHP deprecation; documented in the docblock).Behavior change: when a field contains a literal backslash immediately before the enclosure character, the output now uses doubled-quote escaping instead of backslash-quote. In practice every mainstream CSV consumer (Excel, LibreOffice, Google Sheets, PHP's own
fgetcsv) handles both correctly. Worth a CHANGELOG note at the next release.Regression test:
testDefaultEscapeIsRfc4180renders a field with embedded double-quotes and asserts (a) the output matches the doubled-quote form and (b) PHP did not emitE_DEPRECATEDduring render.=> new minor release
CI
The
cs-stanjob is expected to fail on this PR for the same pre-existing reason #147 is fixing (squizlabs/php_codesniffer v4vsslevomat 8.29incompatibility). Nothing in this PR affects code style; once #147 lands, rebasing makes CI green.Relationship to other open PRs
Independent of:
excelshorthand for Excel-friendly UTF-8 exports #148 (excelshorthand for UTF-8 Excel exports)Any of the three can land first; all three are clean low-risk additions on top of pristine
5.x.