Fix problem with encoding of css entities when post with block level custom css is edited by user without unfiltered_html#11104
Conversation
…custom css is edited by user without unfiltered_html
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
|
FYI - I will add tests for this once there is confirmation that this is the correct approach for fixing this bug. There may be a better solution. If there is feel free to close this PR and open an alternative. |
|
|
||
| return str_replace( | ||
| array( '&', '>', '"', ''' ), | ||
| array( '&', '>', '"', "'" ), |
There was a problem hiding this comment.
We will not doubt need to account for other values here, but we can work out exactly what needs to be covered once there is some agreement on the best way to solve this problem - I imagine there will be a smarter solution - so didn't spend too much time finessing this one.
| @@ -2077,6 +2077,17 @@ function _filter_block_content_callback( $matches ) { | |||
| function filter_block_kses( $block, $allowed_html, $allowed_protocols = array() ) { | |||
| $block['attrs'] = filter_block_kses_value( $block['attrs'], $allowed_html, $allowed_protocols, $block ); | |||
There was a problem hiding this comment.
Thanks for getting up a fix!
I was wondering: what is the important sanitization step for block CSS attributes?
wp_kses() treats the CSS string as HTML. But should it?
I'm wondering if an alternative solution would be to target the css attribute and run it through wp_strip_all_tags rather than wp_kses.
Or running through something similar (or reuse this same validation in a helper) that @sirreal and @dmsnell worked on in WP_REST_Global_Styles_Controller::validate_custom_css() for https://core.trac.wordpress.org/ticket/64418
There, for users without unfiltered_html, & and > in block custom CSS were being double-encoded by KSES + JSON, so the CSS broke.
(Sorry Jon and Dennis - you've become my default go-to brains trust for this stuff 😄 )
|
Related: https://core.trac.wordpress.org/changeset/61486 / #10641 fixed the same class of KSES-mangling issue for Global Styles custom CSS by pre-escaping the JSON with JSON_HEX_TAG | JSON_HEX_AMP. This PR addresses the same problem for per-block custom CSS (attrs.style.css), which goes through the separate filter_block_kses() pipeline. I don't think the same pre-escaping approach can work in this case, as parse_blocks() calls json_decode() on the entire attributes object, which converts \u0026 back to & before KSES runs - but I do not know a lot about these flows, and don't have time to look closer as currenlty travelling - so could be completely wrong about this. |
|
I had some trouble reproducing the issue. In order to test this, I had to enable a recent version of the Gutenberg plugin. The individual block CSS feature is not yet available in Core yet, is it? When I used an author role, I don't see the additional CSS panel for blocks. When I used an editor role, I was unable to reproduce the issue because it seems to have Am I doing something wrong in the reproduction steps? Some thoughts based on the issue:
I was very happy with the solution in r61486. Post content exclusively contained JSON which has some flexibility in escaping. JSON can be made to be plain HTML text by escaping HTML syntax characters ( This PR tries to recover after KSES has mangled the data. That seems inherently more risky. It should be possible to decode HTML character references (like done here) but what if KSES starts to remove things that look like tags? What if I'd like to use Another option is to protect the data before KSES can mangle it by encoding it ourselves in an HTML-text safe way. A few quick options come to mind:
Of course, before using the value it will need to be decoded appropriately. Either of these seem likely to prevent the issue by ensuring HTML syntax |
@sirreal when logged in as the author user you do not need to see the custom CSS input box, you just need to edit the post content with the existing custom CSS in place that was added when you created the post as the admin user. This bug only occurs if a user without |
Thanks a lot @sirreal, this is great. Could "don’t run KSES on block attribute attrs.style.css" be another option? Above, I was thinking of an allowlist of “non-HTML” attribute paths that are not HTML, e.g. If there are hidden gotchas there...
Maybe @glendaviesnz can answer this: let's say we encode in |
Got it, that worked. I did have to change the post author so that the author role user could edit the post.
That's a way to prevent this issue. The problem is that exceptions like that often create vulnerabilities. If a bad actor knows |
this is going to be a dead-end, because it’s largely not possible to do that. we can build in signals into the storage to communicate it though. for instance, prefix a base64-encoded string. or in that same vein but better, store the attribute as a data URI which says explicitly what the content is. {
"style": {
"css": "data:text/css;base64,eyBjb2xvcjogcmVkOyB9"
}
}for the sake of transparency we can always escape the CSS from any characters that would be “dangerous,” but I think we’ve seen a number of cases where this has gone wrong because downstream code likes to unescape and re-escape, which ends up eliminating the escaping we intentionally applied.
the |
Trac ticket: https://core.trac.wordpress.org/ticket/64771
Summary
WordPress/gutenberg#73959 introduced block-level custom CSS. Everything works as expected unless a user without unfiltered_html edits a page/post with block-level custom CSS that includes nested selectors, eg.
This PR fixes double-encoding of HTML entities in per-block custom CSS (
attrs.style.css) when a user without theunfiltered_htmlcapability saves a post that includes block-level custom CSS with nested selectors.The problem
When a user without
unfiltered_html(e.g. an Author) saves a block with custom CSS containing&(CSS nesting selector) or>(child combinator), thefilter_block_content()pipeline corrupts these characters through double-encoding:parse_blocks()—json_decode()decodes\u0026→&filter_block_kses_value()—wp_kses()treats the CSS string as HTML and encodes&→&,>→>serialize_block_attributes()—json_encode()encodes the&in&→\u0026amp;The result is
\u0026amp;in post_content instead of the original\u0026. On the next editor load,json_decode()produces the literal string&instead of&, so the CSS textarea displays corrupted values like&and>. Each subsequent save compounds the corruption further.The fix
After KSES has run on block attributes (and stripped any dangerous HTML tags), decode the specific named entities it introduced in the
style.cssattribute. HTML entities are invalid in CSS, so KSES should not have introduced them.This PR adds:
undo_block_custom_css_kses_entities()— a new function that reverses only the 4 specific named entities thatwp_kses()may introduce (&,>,",'). This is intentionally narrower thanwp_specialchars_decode()to avoid decoding numeric/hex references that KSES may have intentionally preserved.A call in
filter_block_kses()— afterfilter_block_kses_value()has processed all attributes, ifattrs.style.cssexists, it is passed through the decode function before the block is returned for serialization.Why this is safe
<) are affected<is intentionally excluded — KSES strips bare<entirely rather than encoding it, so<in the output would indicate it was already present in the inputattrs.style.cssonly — other block attributes remain entity-encoded as expectedTest steps
Setup
unfiltered_htmlcapability)Test 1: CSS nesting selector (
&)color: blue; & p { color: red; }unfiltered_html, eg.authorand edit the paragraph, eg. make part of string italic. Note you will not see the custom CSS input box when logged in as this user. Just edit the existing paragraph in the post content and save.color: blue; & .child { color: red; }(unchanged)color: blue; & .child { color: red; }Test 2: Child combinator (
>)& > p { margin: 0; }& > p { margin: 0; }(unchanged)& > p { margin: 0; }Test 3: Idempotent saves
Test 4: Frontend rendering
Test 5: Non-CSS attributes are unaffected
&(e.g.foo&bar)attrs.style.cssis decodedThis Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.