-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix footnotes plus fix footnote reference labels and backrefs #230
Fix footnotes plus fix footnote reference labels and backrefs #230
Conversation
When two footnote references are adjacent, the handle_close_bracket function will first try to match the closing bracket to a link reference. Now we reset the subject's state, so that the parser correctly picks up both footnote references.
…ark_nodes. Sometimes, the autolinker will go ahead and greedily split input into multiple text nodes in the hopes of matching a hyperlink. This broke footnotes, which expected a singular node. Instead of relying on the tokenizing to have worked perfectly, when handling footnote references we now simply insert the reference based on the closing bracket and ignore and delete any existing and superfluous nodes.
When a footnote is referenced multiple times, we now insert multiple backrefs linking back to each reference. In order to do this, we had to change how footnote ref link labels work away from an incrementing index, and instead use footnote reference label text *plus* an index.
…in multiple places and multiple backrefs.
…ontain 'w' or '_'.
… intent more obvious.
…ade sure to free allocated string in commonmark.c
3dccaf9
to
32002ec
Compare
tldr, avoid freeing memory before passing it along to another function.
Sometimes, when cleaning up unusued footnotes, two footnote->nodes may end up referencing each other. As they get free()'d up, this can lead to problems. Instead, first we unlink every node and _then_ free them up.
…notes-plus-fix-fnref-label-and-backrefs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR @phillmv! I left a few comments about ensuring global uniqueness and a question about sanitization. Happy to pair on this if you have time!
test/extensions.txt
Outdated
<p>This is some text!<sup class="footnote-ref"><a href="#fn1" id="fnref1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn2" id="fnref2">2</a></sup>.</p> | ||
<p>Here's a thing<sup class="footnote-ref"><a href="#fn3" id="fnref3">3</a></sup>.</p> | ||
<p>And another thing<sup class="footnote-ref"><a href="#fn4" id="fnref4">4</a></sup>.</p> | ||
<p>This is some text!<sup class="footnote-ref"><a href="#fn:1" id="fnref:1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn:footnote" id="fnref:footnote">2</a></sup>.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this addition will still not get us quite to where we need to use this in github. What we need is a guarantee that multiple comments on the same page do not unintentionally link to eachother's anchors. It seems like with this approach that will still happen if comments reuse the same footnote labels (e.g. so far I've just been using labels like [^1]
)
What do you think about one of the following?
- adding some some random text as part of the anchor name
- adding some pseudorandom text obtained by e.g. digesting the footnote target body
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmmm, the random text is probably the most robust solution but on first thought I don't love inserting random strings into labels.
The pseudrandom is more pleasant, but then you're still stuck with folks say copying the same input generating the same output.
Maybe the best way to tackle that issue is to handle it in the pipeline phase, like how we handle header ids for tables of contents – that way we can scope it to the id of the rendered comment/issue/whatever and avoid randomness AND be guaranteed everything is properly scoped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this solution although I bet that @talum will not! 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is handled in the pipeline phase can there be a public description of how it is done so tools that try to mimic GitHub's output can do so without guessing?
Header IDs are still not publicly described and apparently not open to GitHub employees as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm testing the pipeline out with these colon-separated hrefs. Something with this syntax is wonky...
Example:
<a href="#fn:other-note">testing a link</a>
testing a link
^^ does not show up as a link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experimenting with this, it looks like our sanitizer strips links whose hrefs contain :
since
<a href="#fn-other-note">this works</a>
this works
so, meh, okay, let's use a dash instead!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@UziTech thanks for bringing this up. In general we intentionally don't document things like how we generate slugs and ids not because it's secret or anything but because these are implementation details; we don't want to unintentionally define an API contract that we need to adhere to.
However, I think we'd be happy to informally describe the algorithm we eventually choose for generating footnote anchor ids from the output of what is publicly specified here. Please create an issue and mention me after the feature is released and I'll do my best to provide an answer.
…ck literal->len first.
…notes-plus-fix-fnref-label-and-backrefs
This PR is a companion to/blocked by #229, and should only be merged after that PR. It also supports https://github.com/github/coding/issues/2251 .
This PR incorporates all of the changes in #229 plus it fixes an additional issue:
Before, footnote anchor ids were referenced by an incrementing integer id, and only included a single backreference link when rendered into html.
This prevents multiple blocks of independently rendered text (think: an Issue body and Issue comments) from all containing footnotes, because the first footnote in the issue body and the first footnote in any issue comment will both reference
fnref1
.After this PR, footnote anchor ids use their footnote reference label text as the anchor id text. Also, we insert an additional backreference link per individual footnote reference, so that you can navigate back to each individual footnote reference if a given footnote is cited more than once.
Notes to reviewers,
Due to the changes set in #229, and given that this PR modifies every footnote test case, in order to prevent painful merge conflicts while keeping PR size down, I decided to package this as a separate PR.
All of the significant work is in this commit, but this PR is best reviewed after #229 is merged.