Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for inline image Markdown #658

Merged
merged 7 commits into from
Mar 19, 2024

Conversation

kidroca
Copy link
Contributor

@kidroca kidroca commented Feb 29, 2024

This Pull Request introduces enhancements to the ExpensiMark library by implementing the conversion of image markdown syntax to HTML <img> tags and vice versa. It covers various cases including handling single images and text containing images.
Additionally, it addresses the conversion of <img> tags back to markdown.
An issue with image alt text containing markdown not converting to plain text is acknowledged and currently marked as a limitation.

Fixed Issues

$ Expensify/App#37246

Tests

  1. Unit/Integration Tests Covering the Change:

    • The changes are covered by unit tests added to ExpensiMark-HTML-test.js, ExpensiMark-HTMLToText-test.js, and ExpensiMark-Markdown-test.js. These tests validate the conversion of image markdown to HTML tags, the handling of images with different attributes, and the conversion of HTML <img> tags back to markdown.
  2. Tests Performed to Validate Changes:

    • Ran all newly added tests and ensured they pass, verifying the correct conversion of markdown to HTML <img> tags and vice versa.
    • Manually tested the parsing of markdown containing images in various formats and contexts to ensure the output matches expectations.
    • Checked the handling of invalid image URLs to ensure they remain unchanged, as expected.

QA

  1. QA Validation Steps:

    • Utilize NewDot to post comments incorporating Markdown (MD) image syntax along with public images sourced from platforms like Unsplash. Verify that these images are correctly displayed inline within the comments.
    • Ensure that you can seamlessly mix text and images within the same comment, maintaining the integrity of the intended message and layout.
    • Confirm the functionality to edit previously submitted comments, checking that upon editing, the comments are accurately presented with the original Markdown syntax intact.
    • Test the handling of invalid Markdown image syntax, such as using a broken or incorrectly formatted URL (e.g., ![image](invalid-link)). These should not be parsed and ought to remain displayed as the raw Markdown text, without conversion or rendering as an image.
  2. Areas to Test for Regressions:

    • General markdown to HTML conversion functionality, especially focusing on links, autolinks, and other media types to ensure no regressions have been introduced.
    • The handling of alt text in both markdown and HTML conversions, given the new logic around image processing.
    • The interaction of image markdown with other markdown features (e.g., bold, italic) to verify that the integration is seamless and does not introduce unexpected behavior.

@kidroca kidroca requested a review from a team as a code owner February 29, 2024 19:21
@melvin-bot melvin-bot bot requested review from puneetlath and removed request for a team February 29, 2024 19:22
Comment on lines +1903 to +1910
// Currently any markdown used inside the square brackets is converted to html string in the alt attribute
// The attributes should only contain plain text, but it doesn't seem possible to convert markdown to plain text
// or let the parser know not to convert markdown to html for html attributes
xtest('Image with alt text containing markdown', () => {
const testString = '![*bold* _italic_ ~strike~](https://example.com/image.png)';
const resultString = '<img src="https://example.com/image.png" alt="*bold* _italic_ ~strike~" />';
expect(parser.replace(testString)).toBe(resultString);
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interaction of image Markdown with other Markdown features (e.g., bold, italic) currently exhibits partial functionality issues. Specifically, content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML. For instance, given an input like ![*bold*](https://example/img.jpg), the output produced is <img src="https://example/img.jpg" alt="<strong>bold</strong>" />.
Although this parsing behavior functions as designed and doesn't seem to cause issues in NewDot, the expected outcome for the alt attribute should be either alt="*bold*" or a plain text alt="bold", without HTML tags. This highlights a known limitation in the current implementation, where Markdown syntax within image alt text does not retain its original or simplified plain text form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a fix for this in a separate PR: #661

@kidroca kidroca changed the title Kidroca/md image syntax Add support for inline image Markdown Feb 29, 2024
@situchan
Copy link
Contributor

situchan commented Mar 5, 2024

reviewing this as C+

@puneetlath
Copy link
Contributor

How's the review going @situchan?

kidroca added a commit to kidroca/expensify-common that referenced this pull request Mar 12, 2024
Related to: Expensify#658 (comment)

Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters
kidroca added a commit to kidroca/expensify-common that referenced this pull request Mar 12, 2024
Related to: Expensify#658 (comment)

Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters
Copy link
Contributor

@situchan situchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. Still testing various edge cases. Will update soon

@puneetlath
Copy link
Contributor

How's the review going @situchan?

@situchan
Copy link
Contributor

will complete today. I am also testing with this app PR

@situchan
Copy link
Contributor

situchan commented Mar 15, 2024

I provided feedback on app PR

// Currently any markdown used inside the square brackets is converted to html string in the alt attribute
// The attributes should only contain plain text, but it doesn't seem possible to convert markdown to plain text
// or let the parser know not to convert markdown to html for html attributes
xtest('Image with alt text containing markdown', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xtest ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The xtest function in Jest is a way to skip a particular test. When you prefix a test with xtest instead of test, Jest will ignore this test during the test run. This is useful for temporarily disabling a test that might be failing due to unfinished features or bugs that are not yet resolved, without having to comment out the test code. It allows you to easily re-enable the test in the future by changing xtest back to test.

marcaaron
marcaaron previously approved these changes Mar 18, 2024
Copy link
Contributor

@marcaaron marcaaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

test('Image with invalid url should remain unchanged', () => {
const testString = '![test](invalid)';
expect(parser.replace(testString)).toBe(testString);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a test for something like:

![banana](https://example.com/banana.png" onerror="alert('xss')")

Tbh not sure if xss like that is even remotely possible based on how the markdown parser works.

Copy link
Contributor Author

@kidroca kidroca Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That particular example is converted into an anchor, presumably due to the use of brackets. However, I will add a simpler test to ensure that additional attributes are not captured.

Expected: "![banana](https://example.com/banana.png onerror=\"alert('xss')\")"
Received: "![banana](<a href=\"https://example.com/banana.png\" target=\"_blank\" rel=\"noreferrer noopener\">https://example.com/banana.png</a> onerror=&quot;alert(&#x27;xss&#x27;)&quot;)"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome thanks!

kidroca added a commit to kidroca/expensify-common that referenced this pull request Mar 19, 2024
Related to: Expensify#658 (comment)

Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters
@kidroca kidroca force-pushed the kidroca/md-image-syntax branch from 5f4040b to f843a1f Compare March 19, 2024 11:20
kidroca added a commit to kidroca/expensify-common that referenced this pull request Mar 19, 2024
Related to: Expensify#658 (comment)

Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters
Comment on lines +1922 to +1936
test('Trying to pass additional attributes should not create an <img>', () => {
const testString = '![test](https://example.com/image.png "title" class="image")';

// It seems the autolink rule is applied. We might need to update this test if the autolink rule is changed
// Ideally this test should return the same string as the input, or an <img> tag with the alt attribute set to
// "test" and no other attributes
const resultString = '![test](<a href=\"https://example.com/image.png\" target=\"_blank\" rel=\"noreferrer noopener\">https://example.com/image.png</a> &quot;title&quot; class=&quot;image&quot;)';
expect(parser.replace(testString)).toBe(resultString);
});

test('Trying to inject additional attributes should not work', () => {
const testString = '![test" onerror="alert(\'xss\')](https://example.com/image.png)';
const resultString = '<img src=\"https://example.com/image.png\" alt=\"test&quot; onerror=&quot;alert(&#x27;xss&#x27;)\" />';
expect(parser.replace(testString)).toBe(resultString);
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR to include 2 more tests here, other updates are minor code style/formatting changes applied automatically

kidroca added a commit to kidroca/expensify-common that referenced this pull request Mar 19, 2024
Related to: Expensify#658 (comment)

Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters
kidroca added a commit to kidroca/expensify-common that referenced this pull request Mar 19, 2024
Related to: Expensify#658 (comment)

Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters
Copy link
Contributor

@situchan situchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Checklist will be done in app PR after version bump

@marcaaron marcaaron merged commit fe10326 into Expensify:main Mar 19, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants