Add support for inline image Markdown #658

kidroca · 2024-02-29T19:21:32Z

This pull request adds inline image support to the ExpensiMark library: image Markdown syntax (![alt](url)) is converted to HTML <img> tags, and <img> tags are converted back to Markdown.
It covers both standalone images and images embedded in surrounding text.
Known limitation: Markdown used inside image alt text is not converted to plain text.
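To make the reverse direction concrete, here is a minimal sketch of turning <img> tags back into image Markdown. It is not the ExpensiMark rule itself: the names `IMG_TAG`, `getAttribute`, and `imgTagsToMarkdown` are hypothetical, the regular expressions are simplified, and HTML entities in attribute values are not decoded.

```javascript
// Hypothetical sketch; not the actual ExpensiMark HTML-to-Markdown rule.
// Matches a single <img ...> or <img ... /> tag (assumes no raw ">" inside attributes).
const IMG_TAG = /<img\b[^>]*?\/?>/gi;

// Reads a double-quoted attribute value from a tag, or returns '' when it is absent.
function getAttribute(tag, name) {
    const match = tag.match(new RegExp(`\\s${name}="([^"]*)"`, 'i'));
    return match ? match[1] : '';
}

// Replaces every <img> that has a src with ![alt](src); tags without src are left as-is.
function imgTagsToMarkdown(html) {
    return html.replace(IMG_TAG, (tag) => {
        const src = getAttribute(tag, 'src');
        if (!src) {
            return tag;
        }
        const alt = getAttribute(tag, 'alt');
        return `![${alt}](${src})`;
    });
}

console.log(imgTagsToMarkdown('Hello <img src="https://example.com/image.png" alt="test" /> world'));
// prints: Hello ![test](https://example.com/image.png) world
```

A regex-based approach like this is only suitable for HTML the parser produced itself; arbitrary HTML would need a real parser.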

Fixed Issues

$ Expensify/App#37246

Tests

Unit/Integration Tests Covering the Change:
- The changes are covered by unit tests added to ExpensiMark-HTML-test.js, ExpensiMark-HTMLToText-test.js, and ExpensiMark-Markdown-test.js. These tests validate the conversion of image markdown to HTML tags, the handling of images with different attributes, and the conversion of HTML <img> tags back to markdown.
Tests Performed to Validate Changes:
- Ran all newly added tests and ensured they pass, verifying the correct conversion of markdown to HTML <img> tags and vice versa.
- Manually tested the parsing of markdown containing images in various formats and contexts to ensure the output matches expectations.
- Checked the handling of invalid image URLs to ensure they remain unchanged, as expected.
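The "invalid URLs remain unchanged" behaviour can be illustrated with a small sketch. It assumes a simplified image pattern and uses the standard `URL` constructor for validation; ExpensiMark's real rule and URL checks are different, and `IMAGE_MD`, `isValidImageUrl`, and `replaceImages` are names made up for this example.

```javascript
// Hypothetical sketch; ExpensiMark's real rule and URL validation differ.
// Matches ![alt](url) where url contains no whitespace or parentheses.
const IMAGE_MD = /!\[([^\]\n]*)\]\(([^\s()]+)\)/g;

// Accepts only absolute http(s) URLs (an assumption made for this sketch).
function isValidImageUrl(candidate) {
    try {
        const {protocol} = new URL(candidate);
        return protocol === 'http:' || protocol === 'https:';
    } catch (error) {
        return false; // new URL() throws on strings it cannot parse as absolute URLs
    }
}

function replaceImages(text) {
    return text.replace(IMAGE_MD, (match, alt, url) => {
        if (!isValidImageUrl(url)) {
            return match; // leave the raw Markdown untouched
        }
        // Attribute escaping is omitted for brevity; real code must escape alt and src.
        return `<img src="${url}" alt="${alt}" />`;
    });
}

console.log(replaceImages('![test](invalid)'));
// prints: ![test](invalid)
```

Returning the original match for anything that fails validation keeps the fallback simple: the user sees exactly what they typed.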

QA

QA Validation Steps:
- Use NewDot to post comments containing Markdown image syntax with public images (for example, from Unsplash). Verify that the images are displayed inline within the comments.
- Mix text and images within the same comment and verify that the message content and layout are preserved.
- Edit a previously submitted comment and verify that the editor shows the original Markdown syntax intact.
- Test invalid Markdown image syntax, such as a broken or incorrectly formatted URL (e.g., ![image](invalid-link)). These should not be parsed and should remain displayed as raw Markdown text rather than rendered as an image.
Areas to Test for Regressions:
- General markdown to HTML conversion functionality, especially focusing on links, autolinks, and other media types to ensure no regressions have been introduced.
- The handling of alt text in both markdown and HTML conversions, given the new logic around image processing.
- The interaction of image markdown with other markdown features (e.g., bold, italic) to verify that the integration is seamless and does not introduce unexpected behavior.

kidroca · 2024-02-29T19:22:20Z

__tests__/ExpensiMark-HTML-test.js

+    // Currently any markdown used inside the square brackets is converted to html string in the alt attribute
+    // The attributes should only contain plain text, but it doesn't seem possible to convert markdown to plain text
+    // or let the parser know not to convert markdown to html for html attributes
+    xtest('Image with alt text containing markdown', () => {
+        const testString = '![*bold* _italic_ ~strike~](https://example.com/image.png)';
+        const resultString = '<img src="https://example.com/image.png" alt="*bold* _italic_ ~strike~" />';
+        expect(parser.replace(testString)).toBe(resultString);
+    });


Image Markdown does not yet interact correctly with other Markdown features (e.g., bold, italic): Markdown inside the square brackets is converted to HTML before it is written to the alt attribute. For instance, the input ![*bold*](https://example/img.jpg) produces <img src="https://example/img.jpg" alt="<strong>bold</strong>" />.
Although this is how the parser currently behaves and it doesn't seem to cause issues in NewDot, the alt attribute should contain either the original text (alt="*bold*") or plain text (alt="bold"), never HTML tags. This is a known limitation of the current implementation.
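As an illustration of the plain-text option, the sketch below strips paired inline markers from alt text before it would be placed in the attribute. This is not the fix from the separate PR; `INLINE_MARKERS` and `stripInlineMarkdown` are hypothetical names, and the pattern is deliberately naive.

```javascript
// Hypothetical helper; a naive illustration, not the parser's actual behaviour.
// Removes paired *, _ and ~ markers that wrap non-empty text.
// Limitation: it also strips underscores inside words such as snake_case_name.
const INLINE_MARKERS = /([*_~])(\S(?:.*?\S)?)\1/g;

function stripInlineMarkdown(text) {
    return text.replace(INLINE_MARKERS, '$2');
}

console.log(stripInlineMarkdown('*bold* _italic_ ~strike~'));
// prints: bold italic strike
```

Keeping the original text (alt="*bold*") avoids this kind of lossy stripping altogether, at the cost of showing the marker characters to screen readers.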

Created a fix for this in a separate PR: #661

situchan · 2024-03-05T22:31:00Z

reviewing this as C+

puneetlath · 2024-03-11T17:33:58Z

How's the review going @situchan?

Related to: Expensify#658 (comment) Content intended for the alt attribute in images is being incorrectly parsed from Markdown to HTML if it contains MD special characters

situchan

Looks good so far. Still testing various edge cases. Will update soon

puneetlath · 2024-03-14T15:37:03Z

How's the review going @situchan?

situchan · 2024-03-15T14:25:20Z

will complete today. I am also testing with this app PR

situchan · 2024-03-15T15:18:23Z

I provided feedback on app PR

situchan · 2024-03-15T15:28:55Z

__tests__/ExpensiMark-HTML-test.js

+    // Currently any markdown used inside the square brackets is converted to html string in the alt attribute
+    // The attributes should only contain plain text, but it doesn't seem possible to convert markdown to plain text
+    // or let the parser know not to convert markdown to html for html attributes
+    xtest('Image with alt text containing markdown', () => {


The xtest function in Jest is a way to skip a particular test. When you prefix a test with xtest instead of test, Jest will ignore this test during the test run. This is useful for temporarily disabling a test that might be failing due to unfinished features or bugs that are not yet resolved, without having to comment out the test code. It allows you to easily re-enable the test in the future by changing xtest back to test.

marcaaron

LGTM

marcaaron · 2024-03-18T20:58:41Z

__tests__/ExpensiMark-HTML-test.js

+    test('Image with invalid url should remain unchanged', () => {
+        const testString = '![test](invalid)';
+        expect(parser.replace(testString)).toBe(testString);
+    });


Should we add a test for something like:

![banana](https://example.com/banana.png" onerror="alert('xss')")

Tbh not sure if xss like that is even remotely possible based on how the markdown parser works.

That particular example is converted into an anchor, presumably due to the use of brackets. However, I will add a simpler test to ensure that additional attributes are not captured.

Expected: "![banana](https://example.com/banana.png onerror=\"alert('xss')\")"
Received: "![banana](<a href=\"https://example.com/banana.png\" target=\"_blank\" rel=\"noreferrer noopener\">https://example.com/banana.png</a> onerror="alert('xss')")"

Awesome thanks!

kidroca · 2024-03-19T11:30:58Z

__tests__/ExpensiMark-HTML-test.js

+    test('Trying to pass additional attributes should not create an <img>', () => {
+        const testString = '![test](https://example.com/image.png "title" class="image")';
+
+        // It seems the autolink rule is applied. We might need to update this test if the  autolink rule is changed
+        // Ideally this test should return the same string as the input, or an <img> tag with the alt attribute set to
+        // "test" and no other attributes
+        const resultString = '![test](<a href=\"https://example.com/image.png\" target=\"_blank\" rel=\"noreferrer noopener\">https://example.com/image.png</a> &quot;title&quot; class=&quot;image&quot;)';
+        expect(parser.replace(testString)).toBe(resultString);
+    });
+
+    test('Trying to inject additional attributes should not work', () => {
+        const testString = '![test" onerror="alert(\'xss\')](https://example.com/image.png)';
+        const resultString = '<img src=\"https://example.com/image.png\" alt=\"test&quot; onerror=&quot;alert(&#x27;xss&#x27;)\" />';
+        expect(parser.replace(testString)).toBe(resultString);
+    });


Updated the PR to include 2 more tests here, other updates are minor code style/formatting changes applied automatically
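The injection test yields a harmless tag because quotes in the alt text are escaped before they reach the attribute. A minimal sketch of that idea follows, assuming a generic escaping helper; `escapeAttribute` and `buildImageTag` are hypothetical names, not ExpensiMark functions.

```javascript
// Hypothetical helper names; a sketch of attribute escaping, not ExpensiMark's code.
function escapeAttribute(value) {
    return value
        .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#x27;');
}

function buildImageTag(src, alt) {
    return `<img src="${escapeAttribute(src)}" alt="${escapeAttribute(alt)}" />`;
}

const tag = buildImageTag('https://example.com/image.png', 'test" onerror="alert(\'xss\')');
console.log(tag);
// prints: <img src="https://example.com/image.png" alt="test&quot; onerror=&quot;alert(&#x27;xss&#x27;)" />
```

Because the double quote never survives unescaped, the attacker-supplied text cannot close the alt attribute and start a new one.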

situchan

Looks good! Checklist will be done in app PR after version bump

kidroca added 2 commits February 29, 2024 18:34

feat(ExpensiMark): conversion of image markdown to HTML tags and vice…

c6698c6

… versa

ExpensiMark: handle additional cases

4b7ed95

kidroca requested a review from a team as a code owner February 29, 2024 19:21

melvin-bot bot requested review from puneetlath and removed request for a team February 29, 2024 19:22

kidroca commented Feb 29, 2024

ExpensiMark: fix lint warnings

4111539

kidroca mentioned this pull request Feb 29, 2024

Add support for inline image Markdown Expensify/App#37566

Merged

48 tasks

kidroca changed the title ~~Kidroca/md image syntax~~ Add support for inline image Markdown Feb 29, 2024

MelvinBot mentioned this pull request Feb 27, 2024

[Snyk] Upgrade semver from 7.5.4 to 7.6.0 #656

Merged

kidroca mentioned this pull request Mar 12, 2024

Add support for inline image Markdown - attribute fix #661

Merged

situchan reviewed Mar 12, 2024

situchan reviewed Mar 15, 2024

marcaaron previously approved these changes Mar 18, 2024

kidroca added 2 commits March 19, 2024 12:31

ExpensiMark: add test for additional attribute handling

0764cf0

Merge branch 'main' into kidroca/md-image-syntax

3e89f3f

kidroca dismissed marcaaron’s stale review via 5f4040b March 19, 2024 11:13

ExpensiMark: add test for attribute injection

f843a1f

kidroca force-pushed the kidroca/md-image-syntax branch from 5f4040b to f843a1f Compare March 19, 2024 11:20

ExpensiMark: cleanup

c904af2

kidroca commented Mar 19, 2024

kidroca mentioned this pull request Mar 19, 2024

HIGH: [Polish] Add support for inline image Markdown Expensify/App#37246

Closed

situchan approved these changes Mar 19, 2024

marcaaron approved these changes Mar 19, 2024

marcaaron merged commit fe10326 into Expensify:main Mar 19, 2024
5 checks passed
