Fix footnotes plus fix footnote reference labels and backrefs #230

phillmv · 2021-08-23T14:40:40Z

This PR is a companion to/blocked by #229, and should only be merged after that PR. It also supports https://github.com/github/coding/issues/2251 .

This PR incorporates all of the changes in #229 plus it fixes an additional issue:

Before, footnote anchor ids were referenced by an incrementing integer id, and only included a single backreference link when rendered into html.

This prevents multiple blocks of independently rendered text (think: an Issue body and Issue comments) from all containing footnotes, because the first footnote in the issue body and the first footnote in any issue comment will both reference fnref1.

After this PR, footnote anchor ids use their footnote reference label text as the anchor id text. Also, we insert an additional backreference link per individual footnote reference, so that you can navigate back to each individual footnote reference if a given footnote is cited more than once.
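The collision described above can be sketched in a few lines. This is a Python illustration only, not the C code in this PR; the function names and sample labels are hypothetical, and the `-` separator reflects the later commit that swapped `:` for `-`.

```python
def integer_anchor_ids(labels):
    """Old scheme: ids depend only on the position of the reference."""
    return [f"fnref{i}" for i, _ in enumerate(labels, start=1)]


def label_anchor_ids(labels):
    """New scheme: ids are derived from the footnote label text.

    The PR first emitted ':' as the separator and later switched to '-'.
    """
    return [f"fnref-{label}" for label in labels]


# Two independently rendered blocks on the same page (hypothetical labels).
issue_body = ["setup", "caveat"]
issue_comment = ["benchmark"]

old_clash = set(integer_anchor_ids(issue_body)) & set(integer_anchor_ids(issue_comment))
new_clash = set(label_anchor_ids(issue_body)) & set(label_anchor_ids(issue_comment))
```

With integer ids, both blocks emit `fnref1`; with label-based ids the two blocks stay distinct only as long as their labels differ.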

Notes to reviewers,

Given the set of changes in #229, and because this PR modifies every footnote test case, I decided to package this as a separate PR to prevent painful merge conflicts while keeping PR size down.

All of the significant work is in this commit, but this PR is best reviewed after #229 is merged.

When two footnote references are adjacent, the handle_close_bracket function will first try to match the closing bracket to a link reference. Now we reset the subject's state, so that the parser correctly picks up both footnote references.

…ark_nodes. Sometimes, the autolinker will greedily split input into multiple text nodes in the hope of matching a hyperlink. This broke footnotes, which expected a single node. Instead of relying on the tokenizing to have worked perfectly, when handling footnote references we now simply insert the reference based on the closing bracket and delete any superfluous existing nodes.

When a footnote is referenced multiple times, we now insert multiple backrefs linking back to each reference. To do this, we had to move footnote ref link labels away from an incrementing index and instead use the footnote reference label text *plus* an index.
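A rough Python sketch of the "label text plus an index" idea follows. It is not the C implementation; the helper names are hypothetical, and the exact suffix format (no suffix for the first reference, `-2`, `-3`, ... afterwards) is an assumption.

```python
from collections import defaultdict

SEP = "-"  # the PR first emitted ':' and later switched to '-'


def number_references(labels_in_order):
    """Map each label to the ids of its references, in document order.

    Assumed format: the first reference gets `fnref-<label>`, later ones
    get `fnref-<label>-2`, `fnref-<label>-3`, and so on.
    """
    refs = defaultdict(list)
    for label in labels_in_order:
        n = len(refs[label]) + 1
        suffix = "" if n == 1 else f"{SEP}{n}"
        refs[label].append(f"fnref{SEP}{label}{suffix}")
    return dict(refs)


def backref_links(ref_ids):
    """Emit one backreference link per reference to a footnote."""
    return [f'<a href="#{ref_id}">&#8617;</a>' for ref_id in ref_ids]


refs = number_references(["note", "other", "note"])
note_backrefs = backref_links(refs["note"])
```

Here the footnote `note` is cited twice, so its definition would carry two backref links, one per citation.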

brasic

Thanks for this PR @phillmv! I left a few comments about ensuring global uniqueness and a question about sanitization. Happy to pair on this if you have time!

src/html.c

brasic · 2021-09-14T18:34:32Z

test/extensions.txt

-<p>This is some text!<sup class="footnote-ref"><a href="#fn1" id="fnref1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn2" id="fnref2">2</a></sup>.</p>
-<p>Here's a thing<sup class="footnote-ref"><a href="#fn3" id="fnref3">3</a></sup>.</p>
-<p>And another thing<sup class="footnote-ref"><a href="#fn4" id="fnref4">4</a></sup>.</p>
+<p>This is some text!<sup class="footnote-ref"><a href="#fn:1" id="fnref:1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn:footnote" id="fnref:footnote">2</a></sup>.</p>


I think this addition still won't quite get us to where we need to be to use this on GitHub. What we need is a guarantee that multiple comments on the same page do not unintentionally link to each other's anchors. With this approach, that will still happen if comments reuse the same footnote labels (e.g. so far I've just been using labels like [^1]).

What do you think about one of the following?

adding some random text as part of the anchor name

adding some pseudorandom text obtained by e.g. digesting the footnote target body
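To make the second option concrete, here is a minimal Python sketch of deriving a suffix by digesting the footnote body. The function names, the choice of SHA-256, and the 8-character truncation are all assumptions made for illustration; nothing here is taken from the PR's code.

```python
import hashlib


def digest_suffix(footnote_body, length=8):
    """Return a short, deterministic hex suffix for a footnote body."""
    digest = hashlib.sha256(footnote_body.encode("utf-8")).hexdigest()
    return digest[:length]


def anchor_id(label, footnote_body):
    """Combine the footnote label with the digest-derived suffix."""
    return f"fn-{label}-{digest_suffix(footnote_body)}"


first = anchor_id("1", "See the benchmark results.")
second = anchor_id("1", "See the appendix.")
```

The two ids differ even though both footnotes use the label `1`. Because the suffix is deterministic, two footnotes with the same label and the same body still produce the same id.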

Hrmmm, the random text is probably the most robust solution but on first thought I don't love inserting random strings into labels.

The pseudorandom option is more pleasant, but then you're still stuck with folks, say, copying the same input and generating the same output.

Maybe the best way to tackle that issue is to handle it in the pipeline phase, like how we handle header ids for tables of contents – that way we can scope it to the id of the rendered comment/issue/whatever and avoid randomness AND be guaranteed everything is properly scoped.
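As a sketch of what scoping in a later pipeline phase might look like, the Python below prefixes footnote anchors in already-rendered HTML with an identifier for the enclosing comment. Everything in it is hypothetical: the function name, the `issuecomment-42` scope value, and the use of a regular expression instead of a real HTML parser are simplifications for illustration, not a description of GitHub's pipeline.

```python
import re

# Matches id="fn-..." / id="fnref-..." and the corresponding href="#fn-..." forms.
_FOOTNOTE_ATTR = re.compile(r'(?<![\w-])(id|href)="(#?)(fn(?:ref)?-[^"]+)"')


def scope_footnote_anchors(html, scope):
    """Prefix every footnote id and fragment href with `scope`."""
    def add_scope(match):
        attr, hash_mark, anchor = match.groups()
        return f'{attr}="{hash_mark}{scope}-{anchor}"'

    return _FOOTNOTE_ATTR.sub(add_scope, html)


rendered = '<sup><a href="#fn-1" id="fnref-1">1</a></sup>'
scoped = scope_footnote_anchors(rendered, "issuecomment-42")
```

Because the prefix comes from the rendered item itself, two comments that both use `[^1]` end up with different anchors without any randomness.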

I like this solution although I bet that @talum will not! 😂

If this is handled in the pipeline phase can there be a public description of how it is done so tools that try to mimic GitHub's output can do so without guessing?

Header IDs are still not publicly described and apparently not open to GitHub employees as well.

I'm testing the pipeline out with these colon-separated hrefs. Something with this syntax is wonky...

Example:

<a href="#fn:other-note">testing a link</a>

testing a link

^^ does not show up as a link

Experimenting with this, it looks like our sanitizer strips links whose hrefs contain : since

<a href="#fn-other-note">this works</a> this works

so, meh, okay, let's use a dash instead!
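One way a sanitizer could end up rejecting such links is by treating everything before the first colon as a URL scheme. The Python below is a purely hypothetical illustration of that failure mode; it is not GitHub's sanitizer, and the allow-list and function name are invented for the example.

```python
ALLOWED_SCHEMES = {"http", "https", "mailto"}  # hypothetical allow-list


def naive_href_allowed(href):
    """Naive check: anything before the first ':' is assumed to be a scheme."""
    if ":" not in href:
        return True  # relative URL or plain fragment
    scheme = href.split(":", 1)[0].lower()
    return scheme in ALLOWED_SCHEMES


colon_ok = naive_href_allowed("#fn:other-note")
dash_ok = naive_href_allowed("#fn-other-note")
```

Under this check `#fn:other-note` is read as having the scheme `#fn` and is rejected, while `#fn-other-note` passes as a plain fragment.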

@UziTech thanks for bringing this up. In general we intentionally don't document things like how we generate slugs and ids not because it's secret or anything but because these are implementation details; we don't want to unintentionally define an API contract that we need to adhere to.

However, I think we'd be happy to informally describe the algorithm we eventually choose for generating footnote anchor ids from the output of what is publicly specified here. Please create an issue and mention me after the feature is released and I'll do my best to provide an answer.

phillmv added 14 commits August 19, 2021 09:15

Footnotes now support being nested, i.e. a footnote may reference another footnote.

71e27f2

Fixed footnote extension test to handle new footnote reference link labels.

272c999

Added test example that exercises a single footnote being referenced in multiple places and multiple backrefs.

2cb2f7c

Converted regression test to expect new footnote ref link labels.

8ccdaa7

Added regression test that exercises nested footnotes.

a0de7d8

Added test that properly exercises footnotes whose reference labels contain 'w' or '_'.

7fa2372

Added test that exercises whether footnotes are confused for link references.

740b987

Merge branch 'footnotes-fix-confused-for-link-reference' into all-footnote-fixes

582eb8a

Merge branch 'footnotes-fix-when-across-multiple-nodes' into all-footnote-fixes

c464de3

Merge branch 'footnotes-fix-fnref-label-and-backrefs' into all-footnote-fixes

1aabfa3

Adapted existing regression tests to conform to new footnote ref label.

7b5d45d

phillmv mentioned this pull request Aug 23, 2021

Fix footnotes: nested, confused for link refs & mangled by the autolinker #229

Merged

Merge branch 'master' into fix-footnotes-nested-linkrefs-autolinker

993e869

nickrolfe reviewed Sep 1, 2021

src/blocks.c (outdated, resolved)

phillmv added 4 commits September 1, 2021 10:48

Renamed cmark_node->footnote.{ix,count} to {ref_ix,def_count} to make intent more obvious.

32ffc77

Added cmark_node.parent_footnote_def, removed usage of 'user_data', made sure to free allocated string in commonmark.c

1717040

Merge branch 'master' into fix-footnotes-plus-fix-fnref-label-and-backrefs

984b5ea

replaced strbuf_put with strbuf_puts

32002ec

phillmv force-pushed the fix-footnotes-plus-fix-fnref-label-and-backrefs branch from 3dccaf9 to 32002ec, September 1, 2021 16:21

phillmv added 4 commits September 1, 2021 20:02

WIP: what if we only free the nodes after calling process_emphasis?

6e186b3

Fix & regression test for use-after-free introduced in bb117ff

d3a819c

tldr, avoid freeing memory before passing it along to another function.

Fix for use-after-free bug introduced in 71e27f2

a1d171a

Sometimes, when cleaning up unused footnotes, two footnote->nodes may end up referencing each other. As they get free()'d up, this can lead to problems. Instead, first we unlink every node and _then_ free them up.

Merge branch 'fix-footnotes-nested-linkrefs-autolinker' into fix-footnotes-plus-fix-fnref-label-and-backrefs

98a2544

brasic reviewed Sep 14, 2021

phillmv added 2 commits September 14, 2021 17:17

By default, always escape footnote hrefs when emitting html.

d43ae4b

added extension test that verifies that footnote labels get href escaped.

a86bbc5

bk2204 reviewed Sep 15, 2021

src/commonmark.c (outdated, resolved)

src/commonmark.c (outdated, resolved)

src/html.c (outdated, resolved)

phillmv added 5 commits September 15, 2021 11:08

literal->data is probably NULL terminated, but just in case let's check literal->len first.

586a22d

Added check for underflows when duping footnote ref literal.

de6feae

Merge branch 'fix-footnotes-nested-linkrefs-autolinker' into fix-footnotes-plus-fix-fnref-label-and-backrefs

5790bf2

Swapped calloc argument order, so that we use the function appropriately.

4bf57ea

Swapped : for - when emitting html footnote ref labels.

8474289

phillmv requested review from brasic and bk2204 September 15, 2021 19:26

brasic approved these changes Sep 16, 2021

phillmv merged commit d7e50f0 into master Sep 16, 2021

phillmv deleted the fix-footnotes-plus-fix-fnref-label-and-backrefs branch September 16, 2021 20:22

brasic Sep 14, 2021

phillmv Sep 14, 2021

brasic Sep 14, 2021

UziTech Sep 15, 2021

talum Sep 15, 2021

phillmv Sep 15, 2021

brasic Sep 16, 2021
