fix(tiptap): remove indentation from pretty printed HTML #1791

Merged: 1 commit merged into develop from fix/remove-pretty-html-indentation on Jan 26, 2024

@dcshzj dcshzj commented Jan 25, 2024

Problem

The original library that was used to pretty print the HTML does not handle <img> tags correctly, resulting in inaccurate indentation and leading to the page not being rendered correctly.

This happens because the <img> tag is a void tag, but the pretty printer interprets it as an opening of a new tag, causing the rest of the content to become incorrectly indented.
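To make the failure mode concrete, here is a minimal sketch of a naive indenter. It is a hypothetical simplification written for illustration, not the code of the library this PR replaces; `indentHtml`, `VOID_TAGS`, and the `knowsVoidTags` flag are assumed names. When void tags are not recognised, `<img>` bumps the depth and nothing ever closes it, so everything after it is pushed one level deeper.

```typescript
// Hypothetical, simplified indenter for illustration only.
// A partial list of HTML void elements (tags that never have a closing tag).
const VOID_TAGS = new Set(["img", "br", "hr", "input", "meta", "link"]);

function indentHtml(
  html: string,
  indentSize: number,
  knowsVoidTags: boolean
): string {
  // Split the input into tags and runs of text.
  const tokens = html.match(/<[^>]+>|[^<]+/g) ?? [];
  const lines: string[] = [];
  let depth = 0;

  for (const token of tokens) {
    const text = token.trim();
    if (text === "") continue;

    const closing = /^<\//.test(text);
    // Tag name if the token is a tag, otherwise undefined (plain text).
    const name = /^<\/?([a-zA-Z0-9]+)/.exec(text)?.[1]?.toLowerCase();

    // A closing tag steps back out before it is printed.
    if (closing) depth = Math.max(0, depth - 1);
    lines.push(" ".repeat(depth * indentSize) + text);

    const isVoid = name !== undefined && knowsVoidTags && VOID_TAGS.has(name);
    // Every opening tag that is not treated as void increases the depth.
    if (name !== undefined && !closing && !isVoid) depth += 1;
  }
  return lines.join("\n");
}

const sample = '<p>a</p><img src="x.png"><p>b</p>';
console.log(indentHtml(sample, 4, false)); // <img> treated as an opening tag
console.log(indentHtml(sample, 4, true)); // <img> treated as a void tag
```

With `knowsVoidTags` set to `false`, the second `<p>` is printed with four leading spaces even though it is a sibling of the first.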

Solution

Breaking Changes

  • Yes - this PR contains breaking changes
  • No - this PR is backwards compatible with ALL of the following feature flags in this doc

Bug Fixes:

  • Switched the library to the underlying formatting library, which opens up the ability to configure the indentation level.
  • Configured the indentation level to be -1 (which represents 0, due to a bug in the upstream code). HTML does not require indentation, so this does not affect semantics/rendering.
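The PR does not spell out what the upstream bug is. One plausible mechanism, stated here purely as an assumption, is a falsy-zero default: if the option is read with `||`, a value of 0 is treated as "not set" and falls back to the default. The sketch below uses a hypothetical `resolveIndentString` helper to show why -1 would then behave like 0; it is not taken from the upstream source.

```typescript
// Hypothetical option handling, an assumption for illustration only.
function resolveIndentString(indentSize?: number): string {
  // `||` treats 0 as falsy, so an explicit 0 silently becomes the default of 4.
  const size = indentSize || 4;

  // For a negative size the loop body never runs, leaving an empty indent.
  let indent = "";
  for (let i = 0; i < size; i++) {
    indent += " ";
  }
  return indent;
}

console.log(JSON.stringify(resolveIndentString(0)));
console.log(JSON.stringify(resolveIndentString(-1)));
```

Under this assumption, passing -1 is the only way to get an empty indent string without patching the option handling itself.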

Before & After Screenshots

BEFORE:

Screenshot 2024-01-25 at 22 49 22

AFTER:

Screenshot 2024-01-25 at 22 52 31

Tests

  • Unit tests (using npm run tests)
  • e2e tests (comment on this PR with the text !run e2e)
  • Smoke tests
    • Navigate to any site and edit a page using the Tiptap editor
    • Insert an image at the end of the page and save, then verify that no stray HTML is rendered on the built staging site.
    • Insert arbitrary text and content after that image on the same page and save, then verify that the page renders as expected without any stray HTML.

Deploy Notes

New dependencies:

  • html: library for pretty-printing HTML specifically

New dev dependencies:

  • @types/html: typings for the html package

REMOVED dependencies:

  • beautify
  • @types/beautify

@kishore03109 kishore03109 left a comment

Couple of concerns/clarifications

  1. How does this deal with existing code blocks (e.g. <blah/>)?
  2. How exactly does this work? Why does prettifying the HTML, which to me seems to produce code semantically similar to its input, lead to different site content altogether?
  3. I don't understand the -1 bug :( It uses a default indent size of 4, so why would that lead to indentation issues?
  4. Performance. This seems to add another parser in the frontend, and I'm concerned it could lead to pages crashing, especially on slow computers, since this is done on the client side. Is there another solution that does not involve adding a library we need to be aware of, plus the parsing?

dcshzj commented Jan 26, 2024

  1. How does this deal with existing code blocks (e.g. <blah/>)?

Hmm, I think it is not possible to create a code block in Tiptap, so this should not be affected. Previously there was no indentation either, since everything was on a single line.

  2. How exactly does this work? Why does prettifying the HTML, which to me seems to produce code semantically similar to its input, lead to different site content altogether?

This is due to how Jekyll renders the HTML, which for some reason treats all content after an <img> tag (on the same line) as markdown and refuses to render it as HTML. Prettifying the HTML by adding line breaks between tags allows Jekyll to render each line in isolation, which results in the HTML being rendered correctly.
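As a rough illustration of "line breaks between tags", the sketch below puts each tag on its own line with no indentation. It is a deliberately crude stand-in, not what the html package does internally, and `breakBetweenTags` is an assumed name. It only splits where one tag directly follows another, so text inside an element stays on the same line as its tags.

```typescript
// Crude illustration: insert a newline wherever one tag directly follows another.
function breakBetweenTags(html: string): string {
  // `>` followed by optional whitespace and `<` marks a tag-to-tag boundary.
  return html.replace(/>\s*</g, ">\n<");
}

const singleLine = '<p>a</p><img src="x.png"><p>b</p>';
console.log(breakBetweenTags(singleLine));
```

After this transformation the `<img>` tag and the paragraph that follows it no longer share a line, which is the property the explanation above relies on.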

  3. I don't understand the -1 bug :( It uses a default indent size of 4, so why would that lead to indentation issues?

The upstream library has an issue with the indentation when it comes to the <img> tag and causes all other nodes after that tag to gain an additional indentation level (can see the diff here).

This additional indentation causes Jekyll to render it incorrectly, as having 4 spaces in front of text is considered to be a code block (part of the Markdown spec), which causes the page to render all the HTML content as a code block and not the actual HTML.
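The four-space rule can be checked mechanically. The sketch below is a simplified test for lines that Markdown could treat as an indented code block; `isIndentedCodeCandidate` is an assumed name, and real Markdown processors also take surrounding context (such as preceding blank lines) into account, so treat it as an approximation.

```typescript
// Simplified check: a non-blank line starting with 4 spaces or a tab
// is a candidate for an indented code block in Markdown.
function isIndentedCodeCandidate(line: string): boolean {
  return /^( {4}|\t)/.test(line) && line.trim() !== "";
}

// Output with one stray indentation level of 4 spaces after the <img> tag.
const overIndented = ['<img src="x.png">', "    <p>b</p>"];
// The same content with indentation removed entirely.
const flat = ['<img src="x.png">', "<p>b</p>"];

console.log(overIndented.map(isIndentedCodeCandidate));
console.log(flat.map(isIndentedCodeCandidate));
```

With indentation removed, no line can start with four spaces, so the check never fires regardless of how the formatter counts nesting depth.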

  4. Performance. This seems to add another parser in the frontend, and I'm concerned it could lead to pages crashing, especially on slow computers, since this is done on the client side. Is there another solution that does not involve adding a library we need to be aware of, plus the parsing?

There are quite few libraries out there for prettifying HTML, and I opted to use beautify earlier. This PR streamlines things further by using only the html package. Making a wild guess here, but I don't think there would be a big performance impact because it is only prettifying a single page.

Another option I considered was building our own prettifier, but I felt that it might be quite error-prone due to the number of possible permutations. Open to revisiting this decision if you feel strongly about it!

@dcshzj dcshzj requested a review from a team January 26, 2024 01:57
@kishore03109 kishore03109 self-requested a review January 26, 2024 02:00
@kishore03109 kishore03109 merged commit ca2e669 into develop Jan 26, 2024
12 checks passed
@mergify mergify bot deleted the fix/remove-pretty-html-indentation branch January 26, 2024 03:52
@kishore03109 kishore03109 mentioned this pull request Jan 26, 2024