html: fix SOLIDUS '/' handling in attribute parsing
Calling the Tokenizer on HTML elements that contain a SOLIDUS (/) character
in an attribute name results in incorrect tokenization.

This is caused by two deviations from the state transitions in the WHATWG spec:
- https://html.spec.whatwg.org/multipage/parsing.html#attribute-name-state,
  where the character is not reconsumed when '/' is encountered
- https://html.spec.whatwg.org/multipage/parsing.html#after-attribute-name-state,
  where the tokenizer does not switch to the self-closing start tag state
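
The two transitions listed above can be illustrated with a small standalone state machine. This is only a sketch, not the code in html/token.go: the `attrNames` function, the `state` type, and its constants are hypothetical names, the input is assumed to be the text that follows the tag name up to and including '>', and case folding, NUL handling, and parse-error reporting are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// state loosely mirrors the WHATWG tokenizer states that deal with attributes.
type state int

const (
	beforeAttrName state = iota
	attrName
	afterAttrName
	selfClosingStartTag
	beforeAttrValue
	afterAttrValueQuoted
)

func isSpace(c byte) bool {
	return c == ' ' || c == '\n' || c == '\r' || c == '\t' || c == '\f'
}

// attrNames returns the attribute names found in s, the text after the tag name.
// A branch that changes state without advancing i "reconsumes" the character.
func attrNames(s string) []string {
	var names []string
	st, i := beforeAttrName, 0
	for i < len(s) {
		c := s[i]
		switch st {
		case beforeAttrName:
			switch {
			case isSpace(c):
				i++
			case c == '/' || c == '>':
				st = afterAttrName // reconsume
			default: // includes '=', which starts a new attribute name
				names = append(names, s[i:i+1])
				st = attrName
				i++
			}
		case attrName:
			switch {
			case isSpace(c) || c == '/' || c == '>':
				st = afterAttrName // reconsume: the first transition above
			case c == '=':
				st = beforeAttrValue
				i++
			default:
				names[len(names)-1] += s[i : i+1]
				i++
			}
		case afterAttrName:
			switch {
			case isSpace(c):
				i++
			case c == '/':
				st = selfClosingStartTag // the second transition above
				i++
			case c == '=':
				st = beforeAttrValue
				i++
			case c == '>':
				return names
			default:
				names = append(names, s[i:i+1])
				st = attrName
				i++
			}
		case selfClosingStartTag:
			if c == '>' {
				return names
			}
			st = beforeAttrName // reconsume
		case beforeAttrValue:
			switch {
			case isSpace(c):
				i++
			case c == '"' || c == '\'':
				j := strings.IndexByte(s[i+1:], c) // skip to the closing quote
				if j < 0 {
					return names
				}
				i += j + 2
				st = afterAttrValueQuoted
			case c == '>':
				return names
			default: // unquoted value: skip to whitespace or '>'
				for i < len(s) && !isSpace(s[i]) && s[i] != '>' {
					i++
				}
				st = beforeAttrName
			}
		case afterAttrValueQuoted:
			switch {
			case c == '/':
				st = selfClosingStartTag
				i++
			case c == '>':
				return names
			default:
				st = beforeAttrName // reconsume; whitespace is skipped there
			}
		}
	}
	return names
}

func main() {
	for _, in := range []string{`/=">`, ` / =">`, ` a/ ="">`} {
		fmt.Printf("%q -> %q\n", in, attrNames(in))
	}
}
```

Under these assumptions, the three inputs in `main` correspond to the test cases added in html/token_test.go: the first two yield the single attribute name `="`, and the third yields `a` followed by `=""`.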

Fixes golang/go#63402

Change-Id: I90d998dd8decde877bd63aa664f3657aa6161024
GitHub-Last-Rev: 3546db8
GitHub-Pull-Request: #195
Reviewed-on: https://go-review.googlesource.com/c/net/+/533518
LUCI-TryBot-Result: Go LUCI <[email protected]>
Auto-Submit: Michael Pratt <[email protected]>
Reviewed-by: Roland Shoemaker <[email protected]>
Reviewed-by: David Chase <[email protected]>
maciekmm authored and gopherbot committed Feb 7, 2024
1 parent 73e4b50 commit 643fd16
Showing 2 changed files with 23 additions and 4 deletions.
12 changes: 8 additions & 4 deletions html/token.go
@@ -910,17 +910,16 @@ func (z *Tokenizer) readTagAttrKey() {
 			return
 		}
 		switch c {
-		case ' ', '\n', '\r', '\t', '\f', '/':
-			z.pendingAttr[0].end = z.raw.end - 1
-			return
 		case '=':
 			if z.pendingAttr[0].start+1 == z.raw.end {
 				// WHATWG 13.2.5.32, if we see an equals sign before the attribute name
 				// begins, we treat it as a character in the attribute name and continue.
 				continue
 			}
 			fallthrough
-		case '>':
+		case ' ', '\n', '\r', '\t', '\f', '/', '>':
+			// WHATWG 13.2.5.33 Attribute name state
+			// We need to reconsume the char in the after attribute name state to support the / character
 			z.raw.end--
 			z.pendingAttr[0].end = z.raw.end
 			return
@@ -939,6 +938,11 @@ func (z *Tokenizer) readTagAttrVal() {
 	if z.err != nil {
 		return
 	}
+	if c == '/' {
+		// WHATWG 13.2.5.34 After attribute name state
+		// U+002F SOLIDUS (/) - Switch to the self-closing start tag state.
+		return
+	}
 	if c != '=' {
 		z.raw.end--
 		return
15 changes: 15 additions & 0 deletions html/token_test.go
@@ -601,6 +601,21 @@ var tokenTests = []tokenTest{
 		`<p =asd>`,
 		`<p =asd="">`,
 	},
+	{
+		"forward slash before attribute name",
+		`<p/=">`,
+		`<p ="="">`,
+	},
+	{
+		"forward slash before attribute name with spaces around",
+		`<p / =">`,
+		`<p ="="">`,
+	},
+	{
+		"forward slash after attribute name followed by a character",
+		`<p a/ ="">`,
+		`<p a="" =""="">`,
+	},
 }
 
 func TestTokenizer(t *testing.T) {