gh-151857: Fix IndexError in the email header parser on empty input #151858
Open
tonghuaroot wants to merge 2 commits into
…nput Guard two empty-input index escapes in the modern email header parser that raised a bare IndexError instead of a parse defect: a MIME parameter name ending with '*', and an address display name that is only a comment.
Fixes two instances of a single bug class in the email header parser: an empty token-list/string is indexed and escapes as a bare `IndexError` rather than being handled as a parse defect.

- `get_parameter`: a parameter name ending in the extended-parameter marker `*` with no value leaves `value` empty before `value[0]`. Guard: `if not value or value[0] != '='`.
- `DisplayName.display_name`: a display name consisting only of a comment empties `res` after `res.pop(0)`, then `res[-1]` raised. Guard: early-return `res.value` (`''`) when `res` is empty after the pop, mirroring the pre-existing `len(res) == 0` guard.

Both now degrade gracefully (a parse defect / an empty display name).
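The two guards can be illustrated outside the parser with a minimal sketch. The helpers `expect_equals` and `display_text` below are hypothetical stand-ins, not the actual `get_parameter` or `DisplayName.display_name` code, and tokens are modeled as plain `(token_type, text)` tuples purely for illustration.

```python
from email.errors import HeaderParseError


def expect_equals(value: str) -> str:
    """Hypothetical helper: consume a leading '=' or report a parse error.

    Checking `not value` first means an empty remainder is reported as a
    HeaderParseError instead of escaping as IndexError from value[0].
    """
    if not value or value[0] != "=":
        raise HeaderParseError("Parameter not followed by '='")
    return value[1:]


def display_text(tokens):
    """Hypothetical helper: strip leading/trailing comment (cfws) tokens.

    `tokens` is a list of (token_type, text) tuples, an assumption made
    for this sketch only.
    """
    res = list(tokens)
    if len(res) == 0:
        return ""
    if res[0][0] == "cfws":
        res.pop(0)
    # Guard: the pop above may have emptied the list, so res[-1] would raise.
    if not res:
        return ""
    if res[-1][0] == "cfws":
        res.pop()
    return "".join(text for _kind, text in res)
```

With these definitions, a comment-only token list yields an empty string, and an empty parameter remainder raises `HeaderParseError` rather than `IndexError`.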
Surveyed both instances and closed the class: a focused fuzz of malformed address and parameter headers (23.5k structured combinations plus 100k randomized `message_from_string` parses under `email.policy.default`) shows no other non-`HeaderParseError` exception escaping the parser. The empty-string `IndexError` still produced by directly calling the low-level `get_*` token parsers is their documented non-empty precondition and is unreachable from public header parsing (every caller checks non-empty first).

Tests: regression tests for both instances at the parser level (`test__header_value_parser.py`) and via the public header API (`test_headerregistry.py`). The previous `name*; charset=utf-8` case did not actually exercise the regression (it raises `HeaderParseError` on both patched and unpatched code) and has been replaced with cases that raise `IndexError` on unpatched code (`name*0*`, `x=1; name*`).
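A survey of the kind described above could be approximated with a small harness like the following. This is a sketch, not the fuzzer used for this PR: the function name `find_escapes` is hypothetical, and attaching the `name*0*` and `x=1; name*` parameters to a `Content-Type` header is an assumption made for the example.

```python
import email
import email.policy
from email.errors import HeaderParseError


def find_escapes(raw_headers):
    """Parse each raw header line and collect unexpected exceptions.

    Returns a list of (raw_header, exception_type_name) pairs for every
    exception other than HeaderParseError that escapes parsing.
    """
    escapes = []
    for raw in raw_headers:
        try:
            msg = email.message_from_string(
                raw + "\n\nbody\n", policy=email.policy.default
            )
            # Header values are parsed when fetched, so touch each one.
            for _name, value in msg.items():
                str(value)
        except HeaderParseError:
            continue
        except Exception as exc:
            escapes.append((raw, type(exc).__name__))
    return escapes


if __name__ == "__main__":
    # Parameter strings taken from the PR description; the header name
    # and media type around them are assumptions.
    candidates = [
        "Content-Type: text/plain; name*0*",
        "Content-Type: text/plain; x=1; name*",
    ]
    for raw, exc_name in find_escapes(candidates):
        print(f"{exc_name} escaped for {raw!r}")
```

Whether the candidate headers print anything depends on whether the interpreter running the harness includes this fix.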