
Quarto does not allow to add filters after citeproc processing, leading to uncorrect results #9726

Closed
TomBener opened this issue May 21, 2024 · 18 comments
Labels: duplicate (This issue or pull request already exists), lua (Issues related to the lua codebase, filter chain, etc), support (a request for support)

Comments

@TomBener

TomBener commented May 21, 2024

Bug description

Hello Quarto team,

I am experiencing an issue where a specific Lua filter is not being applied when rendering a document in Quarto.

When I run the command quarto render --to html, the Lua filter specified in my _quarto.yml file is not applied. In the HTML output, it generated the reference as:

<div id="ref-han2020" class="csl-entry" role="listitem">
韩旭东, 李德阳, 王若男, et al., 2020. 盈余分配制度对合作社经营绩效影响的实证分析:基于新制度经济学视角[J]. 中国农村经济(4): 56–77.
</div>

However, when I manually specify the Lua filter on the command line with quarto render --to html -L _extensions/filters/localize-cnbib/localize-cnbib.lua, it works as expected (et al. was replaced with 等):

<div id="ref-han2020" class="csl-entry" role="listitem">
韩旭东, 李德阳, 王若男, 等, 2020. 盈余分配制度对合作社经营绩效影响的实证分析:基于新制度经济学视角[J]. 中国农村经济(4): 56–77.
</div>

Reproduction Steps

To demonstrate the problem, I have created a GitHub repo to reproduce the issue.

Environment

Here is the output of quarto check:

$ quarto check

Quarto 1.5.37
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.2.0: OK
      Dart Sass version 1.70.0: OK
      Deno version 1.41.0: OK
      Typst version 0.11.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.5.37
      Path: /Applications/quarto/bin

[✓] Checking tools....................OK
      TinyTeX: v2024.05
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Using: TinyTex
      Path: /Users/username/Library/TinyTeX/bin/universal-darwin
      Version: 2024

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.10.10
      Path: /Users/username/.pyenv/versions/3.10.10/bin/python3
      Jupyter: 5.7.2
      Kernels: python3

[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........OK
      Version: 4.4.0
      Path: /opt/homebrew/Cellar/r/4.4.0_1/lib/R
      LibPaths:
        - /Users/username/.R/packages
        - /opt/homebrew/lib/R/4.4/site-library
        - /opt/homebrew/Cellar/r/4.4.0_1/lib/R/library
      knitr: 1.46
      rmarkdown: 2.26

[✓] Checking Knitr engine render......OK
@TomBener TomBener added the bug label May 21, 2024
@TomBener TomBener changed the title from "https://github.com/TomBener/quarto-lua-filter-test" to "Lua Filter Not Applied When Rendering in Quarto" May 21, 2024
@cscheid
Collaborator

cscheid commented May 21, 2024

I don't believe this is a bug - filters have to be specified per document, as in the example here: https://quarto.org/docs/extensions/filters.html#filter-extensions

@TomBener
Author

TomBener commented May 21, 2024

> filters have to be specified per document

Do you mean that the filter should be specified in index.qmd rather than _quarto.yml? But other Lua filters specified in _quarto.yml work as expected, except for the Lua filter above.

@cscheid
Collaborator

cscheid commented May 21, 2024

Can you provide an example of one that works as you expected?

@TomBener
Author

I have added the Lua filter abstract-section in the repo, and it works as expected.

@cscheid
Collaborator

cscheid commented May 21, 2024

Ok. First, a correction on my side. The filter is getting called in the pipeline, and I didn't realize this actually worked. Specifying filters in _quarto.yml is, generally speaking, a bad idea: multiple filters usually require coordination, and specifying them in different places (say, _quarto.yml instead of the documents) means you won't be able to control the filter execution order if you ever need to add a filter to a specific document.

But, in any case, try this replacement on your own code:

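-- Debugging aid: dump the AST exactly as this filter sees it.
function Pandoc(doc)
    print(pandoc.write(doc, "native"))
end

-- process_cite and Div are the functions already defined in the filter.
return {
    {
        Pandoc = Pandoc,
        Cite = process_cite,
        Link = process_cite,
        Div = Div
    }
}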

and you'll see the printout:

quarto render
pandoc
  to: html
  output-file: index.html
  standalone: true
  section-divs: true
  html-math-method: mathjax
  wrap: none
  default-image-extension: png

metadata
  document-css: false
  link-citations: true
  date-format: long
  lang: en
  title: Lua Filter Test
  bibliography:
    - bib.bib
  csl: gb-author-date.csl

[ Header 1 ( "abstract" , [] , [] ) [ Str "Abstract" ]
, Para
    [ Str "Place"
    , Space
    , Str "abstract"
    , Space
    , Str "here."
    ]
, Para
    [ Str "Multiple"
    , Space
    , Str "paragraphs"
    , Space
    , Str "are"
    , Space
    , Str "possible."
    ]
, Header 1 ( "test" , [] , [] ) [ Str "Test" ]
, Para
    [ Str "Quarto"
    , Space
    , Str "enables"
    , Space
    , Str "you"
    , Space
    , Str "to"
    , Space
    , Str "weave"
    , Space
    , Str "together"
    , Space
    , Str "content"
    , Space
    , Str "and"
    , Space
    , Str "executable"
    , Space
    , Str "code"
    , Space
    , Str "into"
    , Space
    , Str "a"
    , Space
    , Str "finished"
    , Space
    , Str "document."
    , Space
    , Str "To"
    , Space
    , Str "learn"
    , Space
    , Str "more"
    , Space
    , Str "about"
    , Space
    , Str "Quarto"
    , Space
    , Str "see"
    , Space
    , Link
        ( "" , [ "uri" ] , [] )
        [ Str "https://quarto.org" ]
        ( "https://quarto.org" , "" )
    , Str "."
    ]
, Para
    [ Str "Testing"
    , Space
    , Str "citations"
    , Space
    , Cite
        [ Citation
            { citationId = "knuth84"
            , citationPrefix = []
            , citationSuffix = []
            , citationMode = AuthorInText
            , citationNoteNum = 1
            , citationHash = 0
            }
        ]
        [ Str "@knuth84" ]
    , Space
    , Str "and"
    , Space
    , Cite
        [ Citation
            { citationId = "han2020"
            , citationPrefix = []
            , citationSuffix = []
            , citationMode = AuthorInText
            , citationNoteNum = 2
            , citationHash = 0
            }
        ]
        [ Str "@han2020" ]
    , Space
    , Str "for"
    , Space
    , Str "using"
    , Space
    , Str "the"
    , Space
    , Str "Lua"
    , Space
    , Str "filter."
    ]
, Div
    ( "3ade8a4a-fb1d-4a6c-8409-ac45482d5fc9"
    , [ "hidden" ]
    , []
    )
    []
]
pandoc
  to: docx
  output-file: index.docx
  default-image-extension: png

metadata
  title: Lua Filter Test
  bibliography:
    - bib.bib
  csl: gb-author-date.csl

[ Header 1 ( "abstract" , [] , [] ) [ Str "Abstract" ]
, Para
    [ Str "Place"
    , Space
    , Str "abstract"
    , Space
    , Str "here."
    ]
, Para
    [ Str "Multiple"
    , Space
    , Str "paragraphs"
    , Space
    , Str "are"
    , Space
    , Str "possible."
    ]
, Header 1 ( "test" , [] , [] ) [ Str "Test" ]
, Para
    [ Str "Quarto"
    , Space
    , Str "enables"
    , Space
    , Str "you"
    , Space
    , Str "to"
    , Space
    , Str "weave"
    , Space
    , Str "together"
    , Space
    , Str "content"
    , Space
    , Str "and"
    , Space
    , Str "executable"
    , Space
    , Str "code"
    , Space
    , Str "into"
    , Space
    , Str "a"
    , Space
    , Str "finished"
    , Space
    , Str "document."
    , Space
    , Str "To"
    , Space
    , Str "learn"
    , Space
    , Str "more"
    , Space
    , Str "about"
    , Space
    , Str "Quarto"
    , Space
    , Str "see"
    , Space
    , Link
        ( "" , [ "uri" ] , [] )
        [ Str "https://quarto.org" ]
        ( "https://quarto.org" , "" )
    , Str "."
    ]
, Para
    [ Str "Testing"
    , Space
    , Str "citations"
    , Space
    , Cite
        [ Citation
            { citationId = "knuth84"
            , citationPrefix = []
            , citationSuffix = []
            , citationMode = AuthorInText
            , citationNoteNum = 1
            , citationHash = 0
            }
        ]
        [ Str "@knuth84" ]
    , Space
    , Str "and"
    , Space
    , Cite
        [ Citation
            { citationId = "han2020"
            , citationPrefix = []
            , citationSuffix = []
            , citationMode = AuthorInText
            , citationNoteNum = 2
            , citationHash = 0
            }
        ]
        [ Str "@han2020" ]
    , Space
    , Str "for"
    , Space
    , Str "using"
    , Space
    , Str "the"
    , Space
    , Str "Lua"
    , Space
    , Str "filter."
    ]
, Div
    ( "3ade8a4a-fb1d-4a6c-8409-ac45482d5fc9"
    , [ "hidden" ]
    , []
    )
    []
]
Output created: index.html

That means your filter is executing. It's just something in the document that is not working as your code expects.

@cscheid cscheid added the support label and removed the bug label May 21, 2024
@mcanouil mcanouil added the lua label May 21, 2024
@TomBener
Author

TomBener commented May 21, 2024

Thanks for your diagnosis. But I cannot tell from the native output that the Lua filter is executed. Further, the HTML output was not modified by the Lua filter as expected. Could you please help diagnose why the Lua filter works when passed on the command line but not when added in _quarto.yml?

@TomBener
Author

This is the diff between the output of quarto render index.qmd -t native -o test.txt and that of quarto render index.qmd -t native -o test.txt -L _extensions/filters/localize-cnbib/localize-cnbib.lua:

Pandoc
  Meta
    { unMeta =
        fromList
          [ ( "bibliography"
            , MetaList [ MetaInlines [ Str "bib.bib" ] ]
            )
          , ( "csl" , MetaInlines [ Str "gb-author-date.csl" ] )
          , ( "title"
            , MetaInlines
                [ Str "Lua"
                , Space
                , Str "Filter"
                , Space
                , Str "Test"
                ]
            )
          ]
    }
  [ Para
      [ Str "Testing"
      , Space
      , Str "citations"
      , Space
      , Cite
          [ Citation
              { citationId = "knuth84"
              , citationPrefix = []
              , citationSuffix = []
              , citationMode = AuthorInText
              , citationNoteNum = 1
              , citationHash = 0
              }
          ]
          [ Str "Knuth" , Space , Str "(1984)" ]
      , Space
      , Str "and"
      , Space
      , Cite
          [ Citation
              { citationId = "han2020"
              , citationPrefix = []
              , citationSuffix = []
              , citationMode = AuthorInText
              , citationNoteNum = 2
              , citationHash = 0
              }
          ]
-         [ Str "\38889\26093\19996\160et"
-         , Space
-         , Str "al."
+         [ Str "\38889\26093\19996\160\31561"
          , Space
          , Str "(2020)"
          ]
      , Space
      , Str "for"
      , Space
      , Str "using"
      , Space
      , Str "the"
      , Space
      , Str "Lua"
      , Space
      , Str "filter."
      ]
  , Div
      ( "refs"
      , [ "references" , "csl-bib-body" , "hanging-indent" ]
      , [ ( "entry-spacing" , "0" ) ]
      )
      [ Div
          ( "ref-knuth84" , [ "csl-entry" ] , [] )
          [ Para
              [ Str "Knuth"
              , Space
              , Str "D"
              , Space
              , Str "E,"
              , Space
              , Str "1984."
              , Space
              , Str "Literate"
              , Space
              , Str "Programming[J/OL]."
              , Space
              , Str "Comput."
              , Space
              , Str "J.,"
              , Space
              , Str "27(2):"
              , Space
              , Str "97\8211\&111."
              , Space
              , Link
                  ( "" , [] , [] )
                  [ Str "https://doi.org/10.1093/comjnl/27.2.97" ]
                  ( "https://doi.org/10.1093/comjnl/27.2.97" , "" )
              , Str "."
              , Space
              , Str "DOI:"
              , Space
              , Link
                  ( "" , [] , [] )
                  [ Str "10.1093/comjnl/27.2.97" ]
                  ( "https://doi.org/10.1093/comjnl/27.2.97" , "" )
              , Str "."
              ]
          ]
      , Div
          ( "ref-han2020" , [ "csl-entry" ] , [] )
          [ Para
              [ Str "\38889\26093\19996,"
              , Space
              , Str "\26446\24503\38451,"
              , Space
              , Str "\29579\33509\30007,"
              , Space
-             , Str "et"
-             , Space
-             , Str "al.,"
+             , Str "\31561,"
              , Space
              , Str "2020."
              , Space
              , Str
                  "\30408\20313\20998\37197\21046\24230\23545\21512\20316\31038\32463\33829\32489\25928\24433\21709\30340\23454\35777\20998\26512\65306\22522\20110\26032\21046\24230\32463\27982\23398\35270\35282[J]."
              , Space
              , Str "\20013\22269\20892\26449\32463\27982(4):"
              , Space
              , Str "56\8211\&77."
              ]
          ]
      ]
  ]

From the native output, I don't think the Lua filter was applied, or applied correctly at least.

@cscheid
Collaborator

cscheid commented May 22, 2024

> From the native output, I don't think the Lua filter was applied, or applied correctly at least.

That isn't consistent with the testing I've done. If you add Pandoc = function(doc) print("here") end and you see the printout (as I did), then the filter is getting called, and the problem is that the structure of the document is not what you're expecting it to be. In that case, you need to fix your filter.
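
A slightly more informative variant of this check, as a minimal sketch: instead of printing "here", the filter can report whether citeproc output is already present when it runs. The file name check-citeproc.lua and the helper citeproc_report are hypothetical names, not part of the repository. The heuristic relies on what the printouts in this thread show: before citeproc there are Cite elements but no Div with the class csl-entry.

```lua
-- check-citeproc.lua: hypothetical diagnostic filter, not part of the original repository.

-- Pure helper: turn the two counters into a log message.
function citeproc_report(cites, csl_entries)
  if csl_entries > 0 then
    return "citeproc has run: " .. csl_entries .. " csl-entry Div(s) present"
  elseif cites > 0 then
    return "citeproc has not run yet: " .. cites .. " Cite element(s), no csl-entry Div"
  end
  return "no citations found"
end

-- Whole-document handler; pandoc picks up global functions named after AST elements.
function Pandoc(doc)
  local cites, csl_entries = 0, 0
  -- Walk the document only to count elements; the handlers return nil, so nothing changes.
  doc:walk({
    Cite = function(_) cites = cites + 1 end,
    Div = function(div)
      if div.classes:includes("csl-entry") then
        csl_entries = csl_entries + 1
      end
    end,
  })
  io.stderr:write("[check-citeproc] " .. citeproc_report(cites, csl_entries) .. "\n")
  return doc
end
```

If the message says citeproc has not run yet, a filter that expects formatted citations or bibliography entries has nothing to match at that point in the pipeline.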

@TomBener
Author

> the problem is that the structure of the document is not what you're expecting it to be. In that case, you need to fix your filter.

Could you please help me see what the problem with the Lua filter is, and how I can fix it? Thanks very much!

@cderv
Collaborator

cderv commented May 22, 2024

@TomBener here are some resources to help you debug this on your end.

Hope this helps

@TomBener
Author

@cderv Many thanks for your guidance. But I'm confused by the "More precise targeting of AST processing phases" section in the document. I cannot fully understand the implications of the three parts: ast, quarto, and render. Could you please list specific examples in the document to make it easier to understand and use? For example, I have a use case: editing LaTeX from Markdown before compiling to PDF. Is it possible to use a Lua filter in one of the three stages for this?

@cderv
Collaborator

cderv commented May 22, 2024

> ast, quarto, and render. Could you please list specific examples in the document to make it easier to understand and use?

We have no more documentation yet on this. Those are only possible steps where you can apply your filter. By default, the filter will apply at the end IIRC.

> Editing LaTeX from Markdown before compiling to PDF, is this possible to use Lua filter in one of the three stages?

Read the documentation about Extensions and how Lua filters work. A filter can do anything between Pandoc parsing the Markdown and Pandoc writing the output format. So you can write a Lua filter that catches some element and outputs raw LaTeX. But Lua filters cannot be used to post-process the LaTeX file generated by the Pandoc conversion; Quarto calls LaTeX on it directly.
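
As a minimal sketch of the "catch some element and output raw LaTeX" approach: the filter below wraps the content of a Div in a LaTeX center environment when the output format is LaTeX. The class name centered, the file name, and the helper latex_env are assumptions made up for this illustration.

```lua
-- raw-latex-env.lua: hypothetical example; the "centered" class is an assumed name.

-- Pure helper: opening and closing lines for a LaTeX environment.
function latex_env(name)
  return "\\begin{" .. name .. "}", "\\end{" .. name .. "}"
end

function Div(div)
  -- Only act when pandoc is writing LaTeX; returning nil leaves the Div unchanged.
  if not FORMAT:match("latex") then return nil end
  if not div.classes:includes("centered") then return nil end

  local open, close = latex_env("center")
  -- Replace the Div with: raw \begin{center}, its original blocks, raw \end{center}.
  local blocks = pandoc.List({ pandoc.RawBlock("latex", open) })
  blocks:extend(div.content)
  blocks:insert(pandoc.RawBlock("latex", close))
  return blocks
end
```

In a document this would apply to a fenced Div such as `::: {.centered}`. The raw blocks end up in the generated .tex file, but the filter never sees that file itself.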

I hope this helps you understand. I did not know how advanced you are, so I mentioned the three parts. You should not worry about them for now; just try to debug your filter using logging at different places in your processing.

@TomBener
Author

@cderv Thanks!

@cscheid
Collaborator

cscheid commented Jun 5, 2024

I'm going to go ahead and close this one, since I don't think there's anything outstanding.

@cscheid cscheid closed this as not planned Jun 5, 2024
@TomBener
Author

TomBener commented Jun 6, 2024

Sorry, I don't think so. Some Lua filters not being applied is still a problem I have not resolved. I have updated my example to test the Citation Backlinks Filter by @tarleb, but it didn't work in my Quarto example.

@cderv
Collaborator

cderv commented Jun 6, 2024

@TomBener as discussed, this comes down to the specific order in which the Lua filters are applied. I gave some hints to look into this and to try debugging, so that you could come back with more details on what is not working.

> I have updated to test Citation Backlinks Filter by @tarleb, but it didn't work in my Quarto example.

Let's just deal with that first: the README of this filter clearly states:

> The filter doesn't work yet as a Quarto extension.

So you can't expect it to work! There is even an issue about this in that repository: tarleb/citation-backlinks#2

The reason this filter is not working is probably the same as for yours, if localize-cnbib.lua needs to run after citeproc has happened.

So more generally, Lua filters that need to run after citeproc do not work yet as Quarto extensions.

This is discussed at

with another filter not working

and we are tracking the improvement at

Currently, we call citeproc as part of the defaults file, after all the other filters, and there is no way to apply a filter after it.
Quarto has specific handling of filters: it creates a filter chain that mixes internal and user filters together in the right order and the right context. citeproc is applied after this filter chain.

Using quarto render -L as you did works because in this case the filter is added independently to the list of filters to run. That can work for some filters, but not for others, for example those more tied to the Lua API. Those need to be in the filter chain.

I hope this helps you understand. I'll rename this issue to make clear what it is about. Please follow #7888 for the resolution of this limitation.
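
To illustrate why the ordering matters, here is a minimal sketch of a filter of this kind. It is a hypothetical example, not the actual localize-cnbib.lua, and unlike a real localization filter it does not check the language of the entry. It merges the sequence Str "…et", Space, Str "al.…" into a single Str using 等, which is the change visible in the native diff earlier in this thread. That sequence only exists once citeproc has formatted the citations; inside the filter chain the citation is still Str "@han2020", so nothing matches.

```lua
-- localize-et-al.lua: hypothetical sketch, NOT the actual localize-cnbib.lua.

-- Merge "<...>et", Space, "al.<...>" into one Str that uses 等 instead.
-- Works on any list whose items expose `t` and `text`, such as pandoc Inlines.
function localize_et_al(inlines)
  local i = 1
  while i <= #inlines - 2 do
    local a, sp, b = inlines[i], inlines[i + 1], inlines[i + 2]
    -- "et" must be a whole word, or follow a non-ASCII-letter byte (e.g. a no-break space).
    local is_et = a.t == "Str" and (a.text == "et" or a.text:match("[^%a]et$") ~= nil)
    local is_al = b.t == "Str" and b.text:sub(1, 3) == "al."
    if is_et and sp.t == "Space" and is_al then
      -- Drop the trailing "et", append 等, keep whatever followed "al." (e.g. a comma).
      a.text = a.text:sub(1, -3) .. "等" .. b.text:sub(4)
      table.remove(inlines, i + 2)
      table.remove(inlines, i + 1)
    end
    i = i + 1
  end
  return inlines
end

-- pandoc calls this for every list of inline elements in the document.
function Inlines(inlines)
  return localize_et_al(inlines)
end
```

Passed with -L, such a filter sees the formatted citations and bibliography; listed as a Quarto extension filter, it currently runs before citeproc and leaves the document unchanged.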

@cderv cderv changed the title from "Lua Filter Not Applied When Rendering in Quarto" to "Quarto does not allow to add filters after citeproc processing, leading to uncorrect results" Jun 6, 2024
@cderv cderv added the duplicate label Jun 6, 2024
@cderv
Collaborator

cderv commented Jun 6, 2024

Duplicate of #7888

@TomBener
Author

TomBener commented Jun 6, 2024

@cderv Thanks very much! I think your comment is correct and it helps me understand the cause for the issue. All my Lua filters that cannot be used as Quarto extensions are indeed related to citeproc.
