Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix footnotes plus fix footnote reference labels and backrefs #230

Merged
merged 30 commits into from
Sep 16, 2021
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
71e27f2
Footnotes now support being nested, i.e. a footnote may reference ano…
phillmv Aug 19, 2021
1f026ef
Fix for footnotes being confused for link references.
phillmv Aug 19, 2021
bb117ff
Fix for when footnote reference labels get broken up into multiple cm…
phillmv Aug 19, 2021
bf76871
Fix footnote reference label text, and add multiple backrefs.
phillmv Aug 19, 2021
272c999
Fixed footnote extension test to handle new footnote reference link l…
phillmv Aug 19, 2021
2cb2f7c
Added test example that exercises a single footnote being referenced …
phillmv Aug 19, 2021
8ccdaa7
Converted regression test to expect new footnote ref link labels.
phillmv Aug 19, 2021
a0de7d8
Added regression test that exercises nested footnotes.
phillmv Aug 20, 2021
7fa2372
Added test that properly exercises footnotes whose reference labels c…
phillmv Aug 20, 2021
740b987
Added test that exercises whether footnotes are confused for link ref…
phillmv Aug 20, 2021
582eb8a
Merge branch 'footnotes-fix-confused-for-link-reference' into all-foo…
phillmv Aug 20, 2021
c464de3
Merge branch 'footnotes-fix-when-across-multiple-nodes' into all-foot…
phillmv Aug 20, 2021
1aabfa3
Merge branch 'footnotes-fix-fnref-label-and-backrefs' into all-footno…
phillmv Aug 20, 2021
7b5d45d
Adapted existing regression tests to conform to new footnote ref label.
phillmv Aug 20, 2021
993e869
Merge branch 'master' into fix-footnotes-nested-linkrefs-autolinker
phillmv Aug 23, 2021
32ffc77
Renamed cmark_node->footnote.{ix,count} to {ref_ix,def_count} to make…
phillmv Sep 1, 2021
1717040
Added cmark_node.parent_footnote_def, removed usage of 'user_data', m…
phillmv Sep 1, 2021
984b5ea
Merge branch 'master' into fix-footnotes-plus-fix-fnref-label-and-bac…
phillmv Sep 1, 2021
32002ec
replaced strbuf_put with strbuf_puts
phillmv Sep 1, 2021
6e186b3
WIP: what if we only free the nodes after calling process_emphasis?
phillmv Sep 2, 2021
d3a819c
Fix & regression test for use-after-free introduced in bb117ffa7f0dcc…
phillmv Sep 2, 2021
a1d171a
Fix for use-after-free bug introduced in 71e27f25f11c9a34f0532dba4599…
phillmv Sep 2, 2021
98a2544
Merge branch 'fix-footnotes-nested-linkrefs-autolinker' into fix-foot…
phillmv Sep 14, 2021
d43ae4b
By default, always escape footnote hrefs when emitting html.
phillmv Sep 14, 2021
a86bbc5
added extension test that verifies that footnote labels get href esca…
phillmv Sep 15, 2021
586a22d
literal->data is probably NULL terminated, but just in case let's che…
phillmv Sep 15, 2021
de6feae
Added check for underflows when duping footnote ref literal.
phillmv Sep 15, 2021
5790bf2
Merge branch 'fix-footnotes-nested-linkrefs-autolinker' into fix-foot…
phillmv Sep 15, 2021
4bf57ea
Swapped calloc argument order, so that we use the function appropriat…
phillmv Sep 15, 2021
8474289
Swapped : for - when emitting html footnote ref labels.
phillmv Sep 15, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 14 additions & 2 deletions src/blocks.c
Original file line number Diff line number Diff line change
Expand Up @@ -468,7 +468,6 @@ static void process_footnotes(cmark_parser *parser) {
while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) {
cur = cmark_iter_get_node(iter);
if (ev_type == CMARK_EVENT_EXIT && cur->type == CMARK_NODE_FOOTNOTE_DEFINITION) {
cmark_node_unlink(cur);
cmark_footnote_create(map, cur);
}
}
Expand All @@ -485,6 +484,17 @@ static void process_footnotes(cmark_parser *parser) {
if (!footnote->ix)
footnote->ix = ++ix;

// keep track of a) how many times this footnote def has been
// referenced, and b) which reference count this footnote ref is at
// this is used by renderers when generating links and backreferences.
cur->footnote.ix = ++footnote->node->footnote.count;

// store the footnote reference text label in the footnote ref's node's
// `user_data`, so that renderers can use the label when generating
// links and backreferences.
cur->user_data = parser->mem->calloc(1, (sizeof(char) * cur->as.literal.len) + 1);
phillmv marked this conversation as resolved.
Show resolved Hide resolved
memmove(cur->user_data, cur->as.literal.data, cur->as.literal.len);

char n[32];
snprintf(n, sizeof(n), "%d", footnote->ix);
cmark_chunk_free(parser->mem, &cur->as.literal);
Expand Down Expand Up @@ -515,8 +525,10 @@ static void process_footnotes(cmark_parser *parser) {
qsort(map->sorted, map->size, sizeof(cmark_map_entry *), sort_footnote_by_ix);
for (unsigned int i = 0; i < map->size; ++i) {
cmark_footnote *footnote = (cmark_footnote *)map->sorted[i];
if (!footnote->ix)
if (!footnote->ix) {
cmark_node_unlink(footnote->node);
continue;
}
cmark_node_append_child(parser->root, footnote->node);
footnote->node = NULL;
}
Expand Down
10 changes: 6 additions & 4 deletions src/commonmark.c
Original file line number Diff line number Diff line change
Expand Up @@ -477,7 +477,7 @@ static int S_render_node(cmark_renderer *renderer, cmark_node *node,
case CMARK_NODE_FOOTNOTE_REFERENCE:
if (entering) {
LIT("[^");
OUT(cmark_chunk_to_cstr(renderer->mem, &node->as.literal), false, LITERAL);
OUT(node->user_data, false, LITERAL);
LIT("]");
}
break;
Expand All @@ -486,9 +486,11 @@ static int S_render_node(cmark_renderer *renderer, cmark_node *node,
if (entering) {
renderer->footnote_ix += 1;
LIT("[^");
char n[32];
snprintf(n, sizeof(n), "%d", renderer->footnote_ix);
OUT(n, false, LITERAL);

char *str = renderer->mem->calloc(1, (sizeof(char) * node->as.literal.len) + 1);
memmove(str, node->as.literal.data, node->as.literal.len);

OUT(str, false, LITERAL);
LIT("]:\n");

cmark_strbuf_puts(renderer->prefix, " ");
Expand Down
51 changes: 36 additions & 15 deletions src/html.c
Original file line number Diff line number Diff line change
Expand Up @@ -59,17 +59,31 @@ static void filter_html_block(cmark_html_renderer *renderer, uint8_t *data, size
cmark_strbuf_put(html, data, (bufsize_t)len);
}

static bool S_put_footnote_backref(cmark_html_renderer *renderer, cmark_strbuf *html) {
static bool S_put_footnote_backref(cmark_html_renderer *renderer, cmark_strbuf *html, cmark_node *node) {
if (renderer->written_footnote_ix >= renderer->footnote_ix)
return false;
renderer->written_footnote_ix = renderer->footnote_ix;

cmark_strbuf_puts(html, "<a href=\"#fnref");
char n[32];
snprintf(n, sizeof(n), "%d", renderer->footnote_ix);
cmark_strbuf_puts(html, n);
cmark_strbuf_puts(html, "<a href=\"#fnref:");
phillmv marked this conversation as resolved.
Show resolved Hide resolved
cmark_strbuf_put(html, node->as.literal.data, node->as.literal.len);
phillmv marked this conversation as resolved.
Show resolved Hide resolved
cmark_strbuf_puts(html, "\" class=\"footnote-backref\">↩</a>");

if (node->footnote.count > 1)
{
for(int i = 2; i <= node->footnote.count; i++) {
char n[32];
snprintf(n, sizeof(n), "%d", i);

cmark_strbuf_puts(html, " <a href=\"#fnref:");
cmark_strbuf_put(html, node->as.literal.data, node->as.literal.len);
cmark_strbuf_puts(html, ":");
cmark_strbuf_put(html, (const unsigned char *)n, strlen(n));
cmark_strbuf_puts(html, "\" class=\"footnote-backref\">↩<sup class=\"footnote-ref\">");
cmark_strbuf_put(html, (const unsigned char *)n, strlen(n));
cmark_strbuf_puts(html, "</sup></a>");
}
}

return true;
}

Expand Down Expand Up @@ -273,7 +287,7 @@ static int S_render_node(cmark_html_renderer *renderer, cmark_node *node,
} else {
if (parent->type == CMARK_NODE_FOOTNOTE_DEFINITION && node->next == NULL) {
cmark_strbuf_putc(html, ' ');
S_put_footnote_backref(renderer, html);
S_put_footnote_backref(renderer, html, parent);
}
cmark_strbuf_puts(html, "</p>\n");
}
Expand Down Expand Up @@ -395,13 +409,12 @@ static int S_render_node(cmark_html_renderer *renderer, cmark_node *node,
cmark_strbuf_puts(html, "<section class=\"footnotes\">\n<ol>\n");
}
++renderer->footnote_ix;
cmark_strbuf_puts(html, "<li id=\"fn");
char n[32];
snprintf(n, sizeof(n), "%d", renderer->footnote_ix);
cmark_strbuf_puts(html, n);

cmark_strbuf_puts(html, "<li id=\"fn:");
cmark_strbuf_put(html, node->as.literal.data, node->as.literal.len);
cmark_strbuf_puts(html, "\">\n");
} else {
if (S_put_footnote_backref(renderer, html)) {
if (S_put_footnote_backref(renderer, html, node)) {
cmark_strbuf_putc(html, '\n');
}
cmark_strbuf_puts(html, "</li>\n");
Expand All @@ -410,10 +423,18 @@ static int S_render_node(cmark_html_renderer *renderer, cmark_node *node,

case CMARK_NODE_FOOTNOTE_REFERENCE:
if (entering) {
cmark_strbuf_puts(html, "<sup class=\"footnote-ref\"><a href=\"#fn");
cmark_strbuf_put(html, node->as.literal.data, node->as.literal.len);
cmark_strbuf_puts(html, "\" id=\"fnref");
cmark_strbuf_put(html, node->as.literal.data, node->as.literal.len);
cmark_strbuf_puts(html, "<sup class=\"footnote-ref\"><a href=\"#fn:");
cmark_strbuf_puts(html, node->user_data);
cmark_strbuf_puts(html, "\" id=\"fnref:");
cmark_strbuf_puts(html, node->user_data);

if (node->footnote.ix > 1) {
char n[32];
snprintf(n, sizeof(n), "%d", node->footnote.ix);
cmark_strbuf_puts(html, ":");
cmark_strbuf_put(html, (const unsigned char *)n, strlen(n));
}

cmark_strbuf_puts(html, "\">");
cmark_strbuf_put(html, node->as.literal.data, node->as.literal.len);
cmark_strbuf_puts(html, "</a></sup>");
Expand Down
71 changes: 61 additions & 10 deletions src/inlines.c
Original file line number Diff line number Diff line change
Expand Up @@ -1137,17 +1137,68 @@ static cmark_node *handle_close_bracket(cmark_parser *parser, subject *subj) {
// What if we're a footnote link?
if (parser->options & CMARK_OPT_FOOTNOTES &&
opener->inl_text->next &&
opener->inl_text->next->type == CMARK_NODE_TEXT &&
!opener->inl_text->next->next) {
opener->inl_text->next->type == CMARK_NODE_TEXT) {

cmark_chunk *literal = &opener->inl_text->next->as.literal;
if (literal->len > 1 && literal->data[0] == '^') {
inl = make_simple(subj->mem, CMARK_NODE_FOOTNOTE_REFERENCE);
inl->as.literal = cmark_chunk_dup(literal, 1, literal->len - 1);
inl->start_line = inl->end_line = subj->line;
inl->start_column = opener->inl_text->start_column;
inl->end_column = subj->pos + subj->column_offset + subj->block_offset;
cmark_node_insert_before(opener->inl_text, inl);
cmark_node_free(opener->inl_text->next);

// look back to the opening '[', and skip ahead to the next character
// if we're looking at a '[^' sequence, and there is other text or nodes
// after the ^, let's call it a footnote reference.
if (literal->data[0] == '^' && (literal->len > 1 || opener->inl_text->next->next)) {

// Before we got this far, the `handle_close_bracket` function may have
// advanced the current state beyond our footnote's actual closing
// bracket, ie if it went looking for a `link_label`.
// Let's just rewind the subject's position:
subj->pos = initial_pos;

cmark_node *fnref = make_simple(subj->mem, CMARK_NODE_FOOTNOTE_REFERENCE);

// the start and end of the footnote ref is the opening and closing brace
// i.e. the subject's current position, and the opener's start_column
int fnref_end_column = subj->pos + subj->column_offset + subj->block_offset;
int fnref_start_column = opener->inl_text->start_column;

// any given node delineates a substring of the line being processed,
// with the remainder of the line being pointed to thru its 'literal'
// struct member.
// here, we copy the literal's pointer, moving it past the '^' character
// for a length equal to the size of footnote reference text.
// i.e. end_col minus start_col, minus the [ and the ^ characters
//
// this copies the footnote reference string, even if between the
// `opener` and the subject's current position there are other nodes
fnref->as.literal = cmark_chunk_dup(literal, 1, (fnref_end_column - fnref_start_column) - 2);

fnref->start_line = fnref->end_line = subj->line;
fnref->start_column = fnref_start_column;
fnref->end_column = fnref_end_column;

// we then replace the opener with this new fnref node, the net effect
// being replacing the opening '[' text node with a `^footnote-ref]` node.
cmark_node_insert_before(opener->inl_text, fnref);

// sometimes, the footnote reference text gets parsed into multiple nodes
// i.e. '[^example]' parsed into '[', '^exam', 'ple]'.
// this happens for ex with the autolink extension. when the autolinker
// finds the 'w' character, it will split the text into multiple nodes
// in hopes of being able to match a 'www.' substring.
//
// because this function is called one character at a time via the
// `parse_inlines` function, and the current subj->pos is pointing at the
// closing ] brace, and because we copy all the text between the [ ]
// braces, we should be able to safely ignore and delete any nodes after
// the opener->inl_text->next.
//
// therefore, here we walk thru the list and free them all up
cmark_node *next_node;
cmark_node *current_node = opener->inl_text->next;
while(current_node) {
next_node = current_node->next;
cmark_node_free(current_node);
current_node = next_node;
}

cmark_node_free(opener->inl_text);
process_emphasis(parser, subj, opener->previous_delimiter);
pop_bracket(subj);
Expand Down
5 changes: 5 additions & 0 deletions src/node.h
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,11 @@ struct cmark_node {

cmark_syntax_extension *extension;

union {
int ix;
int count;
} footnote;

union {
cmark_chunk literal;
cmark_list list;
Expand Down
42 changes: 31 additions & 11 deletions test/extensions.txt
Original file line number Diff line number Diff line change
Expand Up @@ -672,31 +672,51 @@ Hi!

[^unused]: This is unused.
.
<p>This is some text!<sup class="footnote-ref"><a href="#fn1" id="fnref1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn2" id="fnref2">2</a></sup>.</p>
<p>Here's a thing<sup class="footnote-ref"><a href="#fn3" id="fnref3">3</a></sup>.</p>
<p>And another thing<sup class="footnote-ref"><a href="#fn4" id="fnref4">4</a></sup>.</p>
<p>This is some text!<sup class="footnote-ref"><a href="#fn:1" id="fnref:1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn:footnote" id="fnref:footnote">2</a></sup>.</p>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this addition will still not get us quite to where we need to use this in github. What we need is a guarantee that multiple comments on the same page do not unintentionally link to eachother's anchors. It seems like with this approach that will still happen if comments reuse the same footnote labels (e.g. so far I've just been using labels like [^1])

What do you think about one of the following?

  • adding some some random text as part of the anchor name
  • adding some pseudorandom text obtained by e.g. digesting the footnote target body

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmmm, the random text is probably the most robust solution but on first thought I don't love inserting random strings into labels.

The pseudrandom is more pleasant, but then you're still stuck with folks say copying the same input generating the same output.

Maybe the best way to tackle that issue is to handle it in the pipeline phase, like how we handle header ids for tables of contents – that way we can scope it to the id of the rendered comment/issue/whatever and avoid randomness AND be guaranteed everything is properly scoped.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this solution although I bet that @talum will not! 😂

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is handled in the pipeline phase can there be a public description of how it is done so tools that try to mimic GitHub's output can do so without guessing?

Header IDs are still not publicly described and apparently not open to GitHub employees as well.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing the pipeline out with these colon-separated hrefs. Something with this syntax is wonky...

Example:

<a href="#fn:other-note">testing a link</a>

testing a link

^^ does not show up as a link

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Experimenting with this, it looks like our sanitizer strips links whose hrefs contain : since

<a href="#fn-other-note">this works</a> this works

so, meh, okay, let's use a dash instead!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@UziTech thanks for bringing this up. In general we intentionally don't document things like how we generate slugs and ids not because it's secret or anything but because these are implementation details; we don't want to unintentionally define an API contract that we need to adhere to.

However, I think we'd be happy to informally describe the algorithm we eventually choose for generating footnote anchor ids from the output of what is publicly specified here. Please create an issue and mention me after the feature is released and I'll do my best to provide an answer.

<p>Here's a thing<sup class="footnote-ref"><a href="#fn:other-note" id="fnref:other-note">3</a></sup>.</p>
<p>And another thing<sup class="footnote-ref"><a href="#fn:codeblock-note" id="fnref:codeblock-note">4</a></sup>.</p>
<p>This doesn't have a referent[^nope].</p>
<p>Hi!</p>
<section class="footnotes">
<ol>
<li id="fn1">
<p>Some <em>bolded</em> footnote definition. <a href="#fnref1" class="footnote-backref">↩</a></p>
<li id="fn:1">
<p>Some <em>bolded</em> footnote definition. <a href="#fnref:1" class="footnote-backref">↩</a></p>
</li>
<li id="fn2">
<li id="fn:footnote">
<blockquote>
<p>Blockquotes can be in a footnote.</p>
</blockquote>
<pre><code>as well as code blocks
</code></pre>
<p>or, naturally, simple paragraphs. <a href="#fnref2" class="footnote-backref">↩</a></p>
<p>or, naturally, simple paragraphs. <a href="#fnref:footnote" class="footnote-backref">↩</a></p>
</li>
<li id="fn3">
<p>no code block here (spaces are stripped away) <a href="#fnref3" class="footnote-backref">↩</a></p>
<li id="fn:other-note">
<p>no code block here (spaces are stripped away) <a href="#fnref:other-note" class="footnote-backref">↩</a></p>
</li>
<li id="fn4">
<li id="fn:codeblock-note">
<pre><code>this is now a code block (8 spaces indentation)
</code></pre>
<a href="#fnref4" class="footnote-backref">↩</a>
<a href="#fnref:codeblock-note" class="footnote-backref">↩</a>
</li>
</ol>
</section>
````````````````````````````````

## When a footnote is used multiple times, we insert multiple backrefs.

```````````````````````````````` example
This is some text. It has a footnote[^a-footnote].

This footnote is referenced[^a-footnote] multiple times, in lots of different places.[^a-footnote]

[^a-footnote]: This footnote definition should have three backrefs.
.
<p>This is some text. It has a footnote<sup class="footnote-ref"><a href="#fn:a-footnote" id="fnref:a-footnote">1</a></sup>.</p>
<p>This footnote is referenced<sup class="footnote-ref"><a href="#fn:a-footnote" id="fnref:a-footnote:2">1</a></sup> multiple times, in lots of different places.<sup class="footnote-ref"><a href="#fn:a-footnote" id="fnref:a-footnote:3">1</a></sup></p>
<section class="footnotes">
<ol>
<li id="fn:a-footnote">
<p>This footnote definition should have three backrefs. <a href="#fnref:a-footnote" class="footnote-backref">↩</a> <a href="#fnref:a-footnote:2" class="footnote-backref">↩<sup class="footnote-ref">2</sup></a> <a href="#fnref:a-footnote:3" class="footnote-backref">↩<sup class="footnote-ref">3</sup></a></p>
</li>
</ol>
</section>
Expand Down
Loading