Html api fuzzer #40

Draft
sirreal wants to merge 182 commits into trunk from html-api-fuzz

Conversation

sirreal (Owner) commented Jun 9, 2026

Trac ticket:

Use of AI Tools

Yeah 🙂


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

sirreal and others added 30 commits July 22, 2025 08:52
Remove SELECT case and update comment numbers.

The SELECT case was removed from the algorithm in the standard.
This was removed from the HTML standard
These insertion modes are removed from the standard.
When SELECT > BUTTON > SELECTEDCONTENT is encountered, the selected
option may need to be cloned into the SELECTEDCONTENT. The HTML
processor does not support this action as it may require out of order
processing.
Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
…t tags separate from wp_localize_script

Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
…p and simplify iteration

Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
- Removed separate print_script_data() method and wp_print_script_data() wrapper
- Removed action hooks from wp_footer and admin_print_footer_scripts
- Filter now runs during script processing in do_item()
- Data script tag is output immediately before each script tag
- Updated all tests to use wp_print_scripts instead of wp_print_script_data
- Added test to verify data tag appears before script tag

Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
- Rename $data_tag to $script_data_tag for clarity
- Remove unnecessary (string) coercion from wp_json_encode
- Update @since tag to 7.0.0
- Remove empty line before closing brace

Co-authored-by: sirreal <841763+sirreal@users.noreply.github.com>
- Add JSON_INVALID_UTF8_SUBSTITUTE to json_encode flags
- This prevents failures on invalid UTF-8 by substituting with U+FFFD
- Provides defense-in-depth beyond wp_json_encode's sanity checking
- Document the flag's purpose in the inline comment

Co-authored-by: dmsnell <5431237+dmsnell@users.noreply.github.com>
This reverts commit d89d333. The flag was added based on incorrect analysis.
wp_json_encode() already handles invalid UTF-8 through its fallback
mechanism (_wp_json_sanity_check), converting invalid bytes to "?".

The implementation now matches script modules exactly, using only:
- JSON_HEX_TAG
- JSON_UNESCAPED_SLASHES
- JSON_UNESCAPED_UNICODE (UTF-8 pages only)
- JSON_UNESCAPED_LINE_TERMINATORS (UTF-8 pages only)

Co-authored-by: dmsnell <5431237+dmsnell@users.noreply.github.com>
Without this flag, json_encode() returns false on invalid UTF-8, triggering
wp_json_encode()'s expensive fallback mechanism (_wp_json_sanity_check)
which recursively walks all data and re-encodes.

With JSON_INVALID_UTF8_SUBSTITUTE:
- Invalid UTF-8 bytes are substituted with U+FFFD (�) in a single pass
- No fallback overhead
- Standard, consistent behavior

This is the proper solution rather than relying on the fallback.
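
The behavior described in this commit message can be reproduced with plain `json_encode()`. The following is a minimal sketch, not the code from this pull request: it does not call `wp_json_encode()`, the `$data` array is hypothetical, and it assumes PHP 7.2 or later, where `JSON_INVALID_UTF8_SUBSTITUTE` is available.

```php
<?php
// Illustrative only: plain json_encode(), not wp_json_encode().
// $data is a hypothetical payload; "\xFF" is never a valid UTF-8 byte.
$data = array(
	'label' => "bad byte: \xFF",
	'html'  => '</script>',
);

// Base flags named in the commit messages above.
$flags = JSON_HEX_TAG | JSON_UNESCAPED_SLASHES;

// Without the substitute flag, invalid UTF-8 makes json_encode() return false.
$strict       = json_encode( $data, $flags );
$strict_error = json_last_error(); // Captured now; the next call resets it.

// With the flag, the invalid byte is replaced by U+FFFD and encoding succeeds.
$lenient = json_encode( $data, $flags | JSON_INVALID_UTF8_SUBSTITUTE );

var_dump( $strict );                           // bool(false)
var_dump( JSON_ERROR_UTF8 === $strict_error ); // bool(true)
var_dump( is_string( $lenient ) );             // bool(true)
```

Because `JSON_HEX_TAG` is set, the `<` and `>` in the `html` value are emitted as `\u003C` and `\u003E`, so the encoded string cannot close a surrounding script tag early.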

Co-authored-by: dmsnell <5431237+dmsnell@users.noreply.github.com>
sirreal added 30 commits June 14, 2026 21:28
…into html-api-fuzz

# Conflicts:
#	src/wp-includes/html-api/class-wp-html-processor.php