HTML reader: parse footnotes using dpub-aria roles #5294

Closed
jgm opened this issue Feb 11, 2019 · 2 comments

jgm (Owner) commented Feb 11, 2019

Using the dpub-aria roles doc-endnotes, doc-endnote, doc-noteref, and doc-backlink, we should be able to parse footnotes from an HTML document into pandoc Notes.

See #4213.

jgm added this to the next release milestone Feb 28, 2020

Seirdy commented Apr 19, 2022

Agreed that this would be a great feature to have. Note that doc-endnote is deprecated as of DPUB-ARIA 1.1.

jgm removed this from the next release milestone Dec 5, 2024

jgm (Owner, Author) commented Dec 5, 2024

A note reference looks like this:

<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>

The notes:

<section id="footnotes" class="footnotes footnotes-end-of-document"
role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p>there<a href="#fnref1" class="footnote-back"
role="doc-backlink">↩</a></p></li>
</ol>
</section>

When pandoc parses the note reference, it already includes the role:

    [ Link
        ( "fnref1"
        , [ "footnote-ref" ]
        , [ ( "role" , "doc-noteref" ) ]
        )
        [ Superscript [ Str "1" ] ]
        ( "#fn1" , "" )
    ]

The information is also present in the parsed footnotes section:

, Div
    ( "footnotes"
    , [ "section" , "footnotes" , "footnotes-end-of-document" ]
    , [ ( "role" , "doc-endnotes" ) ]
    )
    [ HorizontalRule
    , OrderedList
        ( 1 , DefaultStyle , DefaultDelim )
        [ [ Div
              ( "fn1" , [] , [] )
              [ Para
                  [ Str "there"
                  , Link
                      ( ""
                      , [ "footnote-back" ]
                      , [ ( "role" , "doc-backlink" ) ]
                      )
                      [ Str "\8617\65038" ]
                      ( "#fnref1" , "" )
                  ]
              ]
          ]
        ]
    ]

So it shouldn't be too hard to define an AST transformation that runs after the main HTML parsing and converts the note-links to proper Note elements. There is no need to change the actual parsing.
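
As a rough illustration of such a post-parse pass, here is a minimal sketch in Python. It works on the JSON form of the AST (the `{"t": ..., "c": ...}` shape that `pandoc -t json` emits) rather than inside the HTML reader, and it is not the code from the commit that closed this issue. All function names (`role_of`, `strip_backlinks`, `collect_notes`, `replace_refs`, `convert`) are hypothetical, and the sketch assumes the footnotes section is a top-level block whose list items each hold a Div carrying the note id, as in the output above.

```python
# Sketch only: operates on pandoc's JSON AST shape ({"t": tag, "c": contents}).
# Every function name below is hypothetical; none of them is a pandoc API.

def role_of(attr):
    """Return the "role" value from a pandoc Attr triple [id, classes, keyvals]."""
    _ident, _classes, keyvals = attr
    return dict(keyvals).get("role")

def is_link_with_role(node, role):
    return (isinstance(node, dict) and node.get("t") == "Link"
            and role_of(node["c"][0]) == role)

def strip_backlinks(node):
    """Recursively drop Links whose role is doc-backlink."""
    if isinstance(node, list):
        return [strip_backlinks(n) for n in node
                if not is_link_with_role(n, "doc-backlink")]
    if isinstance(node, dict):
        return {k: strip_backlinks(v) for k, v in node.items()}
    return node

def collect_notes(blocks):
    """Split top-level blocks into (notes keyed by id, remaining blocks)."""
    notes, rest = {}, []
    for blk in blocks:
        if blk.get("t") == "Div" and role_of(blk["c"][0]) == "doc-endnotes":
            for child in blk["c"][1]:
                if child.get("t") != "OrderedList":
                    continue
                for item in child["c"][1]:          # each item is a list of blocks
                    for b in item:
                        if b.get("t") == "Div" and b["c"][0][0]:
                            notes[b["c"][0][0]] = strip_backlinks(b["c"][1])
        else:
            rest.append(blk)
    return notes, rest

def replace_refs(node, notes):
    """Replace doc-noteref Links that point at a known note with Note elements."""
    if isinstance(node, list):
        return [replace_refs(n, notes) for n in node]
    if isinstance(node, dict):
        if is_link_with_role(node, "doc-noteref"):
            url = node["c"][2][0]
            if url.startswith("#") and url[1:] in notes:
                return {"t": "Note", "c": notes[url[1:]]}
        return {k: replace_refs(v, notes) for k, v in node.items()}
    return node

def convert(blocks):
    notes, rest = collect_notes(blocks)
    return replace_refs(rest, notes)

# The example from this thread, preceded by a paragraph that holds the reference.
ref = {"t": "Link", "c": [["fnref1", ["footnote-ref"], [["role", "doc-noteref"]]],
                          [{"t": "Superscript", "c": [{"t": "Str", "c": "1"}]}],
                          ["#fn1", ""]]}
backlink = {"t": "Link", "c": [["", ["footnote-back"], [["role", "doc-backlink"]]],
                               [{"t": "Str", "c": "\u21a9\ufe0e"}],
                               ["#fnref1", ""]]}
note_div = {"t": "Div", "c": [["fn1", [], []],
                              [{"t": "Para", "c": [{"t": "Str", "c": "there"}, backlink]}]]}
footnotes = {"t": "Div", "c": [
    ["footnotes", ["section", "footnotes", "footnotes-end-of-document"],
     [["role", "doc-endnotes"]]],
    [{"t": "HorizontalRule"},
     {"t": "OrderedList", "c": [[1, {"t": "DefaultStyle"}, {"t": "DefaultDelim"}],
                                [[note_div]]]}]]}
doc = [{"t": "Para", "c": [{"t": "Str", "c": "hi"}, ref]}, footnotes]

result = convert(doc)
```

In this sketch a reference whose target has no matching note is left as an ordinary link, and the whole `doc-endnotes` Div is dropped once its notes have been collected. Whether unmatched references or leftover section content should be preserved is a design choice the thread does not settle.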

jgm closed this as completed in e9389ab Dec 7, 2024