Commit
Conform encoding-label matching to Encoding spec
This change makes the parser’s encoding-name matching conform to the current Encoding spec at https://encoding.spec.whatwg.org/#concept-encoding-get, which requires that only leading and trailing whitespace be removed from a string before checking whether it matches any valid encoding name.

Without this change, the parser instead implements https://www.unicode.org/reports/tr22/tr22-8.html#Charset_Alias_Matching, which requires deleting “all characters except a-z, A-Z, and 0-9” from a string before checking whether it matches any valid encoding name. That difference makes us fail two html5-tests cases.

Relates to #47
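The difference between the two matching rules can be illustrated with a minimal Python sketch. This is not the parser's actual code: the `LABELS` table is a hypothetical, heavily abbreviated stand-in for the spec's label list, and the function names are invented for illustration. The second function applies only the deletion step quoted above plus ASCII lowercasing; any further rules in UTS #22 are omitted.

```python
import re
import string

# Hypothetical, abbreviated label table (assumption: the real list is far longer).
LABELS = {
    "utf-8": "UTF-8",
    "utf8": "UTF-8",
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "windows-1252": "windows-1252",
}

# ASCII whitespace: tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = "\t\n\x0c\r "
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text):
    """Lowercase only A-Z, leaving every non-ASCII character untouched."""
    return text.translate(_ASCII_LOWER)


def match_encoding_spec(label):
    """Strict matching: trim leading/trailing whitespace, then compare
    case-insensitively against the known labels."""
    key = ascii_lower(label.strip(ASCII_WHITESPACE))
    return LABELS.get(key)


def match_uts22_style(label):
    """Loose matching: delete everything except a-z, A-Z, 0-9 from both
    sides before comparing (simplified; other UTS #22 steps omitted)."""
    key = re.sub(r"[^a-z0-9]", "", ascii_lower(label))
    for name, encoding in LABELS.items():
        if re.sub(r"[^a-z0-9]", "", name) == key:
            return encoding
    return None


if __name__ == "__main__":
    for candidate in ["  UTF-8\n", "u.t.f-8", "utf 8"]:
        print(repr(candidate),
              match_encoding_spec(candidate),
              match_uts22_style(candidate))
```

Under these assumptions, a label such as `u.t.f-8` is accepted by the loose rule but rejected by the strict one, while `  UTF-8` with surrounding whitespace is accepted by both.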