Block API: Parse entity only when valid character reference #13512

Merged
aduth merged 2 commits into master from fix/12448-normalize-encoded
Jan 29, 2019

Conversation

aduth
Member

@aduth aduth commented Jan 25, 2019

Fixes #12448
Supersedes #13406

This pull request seeks to resolve an issue where certain HTML strings may result in a block being incorrectly marked as invalid.

Implementation notes:

The root issue is that in the HTML tokenization which occurs during block validation, the entity decoder wrongly evaluates invalid character references.

For example, given the markup:

<h2>Test & Test</h2><h2>Test &amp; Test</h2>

Previously, the entity decoder would wrongly produce a value for every segment of text between & and ;. With this string, producing a value for the segment Test</h2><h2>Test &amp would confuse the tokenizer into considering the entire string as a continuous set of character data, thus missing the EndTag (</h2>) and StartTag (<h2>) within.
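The check described above can be sketched roughly as follows. This is an illustration only, not necessarily the code merged in this pull request: the names `isValidCharacterReference`, `parseEntity`, and `NAMED_ENTITIES` are assumptions made for the example, and the named-entity table is deliberately tiny so that the snippet runs on its own in Node without a DOM.

```javascript
// Hypothetical patterns for the text found between "&" and ";".
// A character reference is named (amp), decimal (#38), or hexadecimal (#x26).
const NAMED = /^[\da-z]+$/i;
const DECIMAL = /^#\d+$/;
const HEXADECIMAL = /^#x[\da-f]+$/i;

// Returns true only if the segment is syntactically a character reference.
function isValidCharacterReference( text ) {
	return NAMED.test( text ) || DECIMAL.test( text ) || HEXADECIMAL.test( text );
}

// Illustrative subset of named entities; a real decoder would cover the full set.
const NAMED_ENTITIES = new Map( [
	[ 'amp', '&' ],
	[ 'lt', '<' ],
	[ 'gt', '>' ],
	[ 'quot', '"' ],
] );

// Decodes the segment, or returns undefined so that the caller can treat
// the "&" as ordinary character data and keep tokenizing after it.
function parseEntity( entity ) {
	if ( ! isValidCharacterReference( entity ) ) {
		return undefined;
	}

	if ( HEXADECIMAL.test( entity ) || DECIMAL.test( entity ) ) {
		const isHex = HEXADECIMAL.test( entity );
		const codePoint = parseInt( entity.slice( isHex ? 2 : 1 ), isHex ? 16 : 10 );

		// Guard against values outside the Unicode range, which would throw.
		return codePoint <= 0x10ffff ? String.fromCodePoint( codePoint ) : undefined;
	}

	return NAMED_ENTITIES.get( entity );
}

// The segment from the example markup is rejected, so the tags inside it
// remain visible to the tokenizer.
console.log( parseEntity( ' Test</h2><h2>Test &amp' ) ); // undefined
console.log( parseEntity( 'amp' ) ); // "&"
```

Returning `undefined` for an invalid segment is the important part: the decoder declines to produce a value, so the unescaped `&` no longer swallows the `</h2>` and `<h2>` that follow it.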

Testing instructions:

Repeat steps to reproduce from #12448 (comment), verifying that no block invalidation occurs.

Ensure unit tests pass:

npm run test-unit packages/blocks/src/api/test/validation.js

cc @fastlinemedia

@aduth aduth added the [Type] Bug and [Feature] Block API labels Jan 25, 2019
@aduth aduth requested a review from talldan January 25, 2019 21:23
Contributor

@talldan talldan left a comment


Thanks for finding the root cause of this issue.

Tested and I no longer saw the block validation warning. Code looks great, very helpful comments 😄 .

@gziolo gziolo added this to the 5.0 (Gutenberg) milestone Jan 29, 2019
@aduth aduth force-pushed the fix/12448-normalize-encoded branch from 76c0e42 to 895fb27 Compare January 29, 2019 16:45
@aduth aduth force-pushed the fix/12448-normalize-encoded branch from 895fb27 to df99432 Compare January 29, 2019 16:47
@aduth aduth merged commit a6f7f9d into master Jan 29, 2019
@aduth aduth deleted the fix/12448-normalize-encoded branch January 29, 2019 17:05
daniloercoli added a commit that referenced this pull request Jan 30, 2019
…rnmobile/372-use-RichText-on-Title-block

* 'master' of https://github.com/WordPress/gutenberg: (36 commits)
  Fixes plural messages POT generation. (#13577)
  Typo fix (#13595)
  REST API: Remove oEmbed proxy HTML filtering (#13575)
  Removed unnecessary className attribute. Fixes #11664 (#11831)
  Add changelog for RSS block (#13588)
  Components: Set type=button for TabPanel button elements. (#11944)
  Update util.js (#13582)
  Docs: Add accessbility specific page (#13169)
  Rnmobile/media methods refactor (#13554)
  chore(release): publish
  chore(release): publish
  Plugin: Deprecate gutenberg_get_script_polyfill (#13536)
  Block API: Parse entity only when valid character reference (#13512)
  RichText: List: fix indentation (#13563)
  Plugin: Deprecate window._wpLoadGutenbergEditor (#13547)
  Plugin: Avoid setting generic "Edit Post" title on load (#13552)
  Plugin: Populate demo content by default content filters (#13553)
  RichText: List: Fix getParentIndex (#13562)
  RichText: List: Fix outdent with children (#13559)
  Scripts: Remove npm run build from test-e2e default run (#13420)
  ...
@designsimply
Member

I tested with master @ f2c5db6c8 and the problem is still there for me. I very likely did something wrong in my testing though! f2c5db6c8 should include the fix right? (23s)

@aduth
Member Author

aduth commented Jan 30, 2019

@designsimply Did you rebuild the files from that commit with npm run build, to make sure what you're running reflects the code at that point? I'm not able to reproduce the issue.

@designsimply
Member

I thought I had but perhaps I was mistaken! 🤦‍♀️ I just tested again and it's working, ack sorry for the noise!

(Aside: this makes me want a badge in the dev env that shows what level or branch I'm on…)

youknowriad pushed a commit that referenced this pull request Mar 6, 2019
* Block API: Rename IdentityEntityParser as DecodeEntityParser

* Block API: Parse entity only when valid character reference
@aduth aduth added the [Feature] Block Validation/Deprecation label Jan 6, 2020
Labels
[Feature] Block API, [Feature] Block Validation/Deprecation, [Type] Bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected block validation error with unescaped ampersands
4 participants