Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix smartquote logic #479

Open
chrisjsewell opened this issue Dec 29, 2021 · 2 comments
Open

Fix smartquote logic #479

chrisjsewell opened this issue Dec 29, 2021 · 2 comments
Labels
bug Something isn't working question Further information is requested topic/extension issues regarding myst extension

Comments

@chrisjsewell
Copy link
Member

chrisjsewell commented Dec 29, 2021

I'm surprised no one has noticed this before, but I've just realised there are some "conflicts" in how smartquotes are handled; between myst (via markdown-it), docutils and sphinx.


Within myst-parser, enabling the "smartquotes" extension (e.g. via myst_enable_extensions = ["smartquotes"], enables the markdown-it smartquotes rule:

if "smartquotes" in config.enable_extensions:
md.enable("smartquotes")
typographer = True

You can see its tested behaviour here: https://raw.githubusercontent.com/executablebooks/markdown-it-py/master/tests/test_port/fixtures/smartquotes.md,
i.e. this is applied before converting to docutils AST


With docutils, the MyST parser is inheriting the docutils RST parser:

class Parser(RstParser):

Because of this, it also currently inherits its transforms, for which RST adds the SmartQuote transform: https://github.com/chrisjsewell/docutils/blob/8adab0660b2097b4f3c32cef7e5ff4cb3c72b084/docutils/docutils/parsers/rst/__init__.py#L177-L179

This does approximately the same thing, and is turned off by default, so for example:

$ echo "\"a\"" | rst2pseudoxml.py
<document source="<stdin>">
    <paragraph>
        "a"
$ echo "\"a\"" | rst2pseudoxml.py --smart-quotes=yes
<document source="<stdin>">
    <paragraph>
        “a”
$ echo "\"a\"" | myst-docutils-pseudoxml 
<document source="<stdin>">
    <paragraph>
        "a"
$ echo "\"a\"" | myst-docutils-pseudoxml --smart-quotes=yes 
<document source="<stdin>">
    <paragraph>
        “a”
$ echo "\"a\"" | myst-docutils-pseudoxml --myst-enable-extensions=smartquotes
<document source="<stdin>">
    <paragraph>
        “a”

we could perhaps remove this transform, i.e.:

class Parser(RstParser):

    def get_transforms(self):
        transforms = super().get_transforms()
        transforms.remove(SmartQuotes)
        return transforms

For sphinx though, removing the SmartQuote transform would not matter, because it already does this: https://github.com/sphinx-doc/sphinx/blob/eed0730b4ba3bd2fbd34f2d6ab555ba876c77717/sphinx/parsers.py#L72-L80,
in order to replace it with SphinxSmartQuotes: https://github.com/sphinx-doc/sphinx/blob/eed0730b4ba3bd2fbd34f2d6ab555ba876c77717/sphinx/transforms/__init__.py#L323

By contrast with docutils, sphinx smartquotes are on by default 😬 : https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-smartquotes,
so even if you don't enable the myst-parser smartquotes extension, quotes will still be converted, unless you set smartquotes = False

There are ways of disabling, for instance by setting document.settings.smart_quotes = False in the parser.


Not decided on what the best behaviour here should be yet, I imagine though there should be only one "source of truth" for enabling/disabling smart quotes:

  1. Disable the docutils/sphinx smartquote transforms, and rely only on the markdown-it rule
  2. Remove the use of the markdown-it rule, and have myst_enable_extensions = ["smartquotes"] dynamically change the docutils/sphinx config in the parser
  3. Remove the myst-parser smartquotes extension entirely and instruct users to use the docutils/sphinx config options

Perhaps the SmartQuote transform is more "optimised" for docutils/sphinx use, but the pro of the markdown-it rule, is that it is not tightly coupled to docutils, so you can use it in other "myst implementations", e.g.: https://github.com/executablebooks/myst-vs-code/blob/c8c96542171ebe6ccdd896cf8337a5ffffeb73d4/src/extension.ts#L1

also related #424

cc @choldgraf, @mmcky, etc and anyone else's thoughts welcome

@chrisjsewell chrisjsewell added bug Something isn't working question Further information is requested topic/extension issues regarding myst extension labels Dec 29, 2021
chrisjsewell added a commit that referenced this issue Dec 29, 2021
After PR #477, I noted that the `SmartQuote` transform was being applied to  the front-matter values,
before they were extracted to the document metadata (see also issue #479).
Therefore, we now add them as a `literal` to stop this happening.
@mmcky
Copy link
Member

mmcky commented Jan 3, 2022

thanks @chrisjsewell -- If I understand correctly smartquotes provides typographically correct quotations marks right? Given markdown-it is a tight dependency and does the bulk of the parsing from myst:md to sphinx.ast I don't see a big downside to using it for smartquotes. Perhaps the key issues is if a user provides sphinx configuration in _config.py then this may cause confusing results as it is rendered ineffective?

I see your issue re: source of truth here.

@72757373656c6c
Copy link
Contributor

72757373656c6c commented May 28, 2024

According to docutils, smartquotes also contains:

> Straight quotes (" and ') into "curly" quote characters
> dashes (-- and ---) into en- and em-dash entities
> three consecutive dots (... or . . .) into an ellipsis entity.

This is also referenced in the Sphinx documentation for smartquotes.

The myst-parser documentation includes -- and --- as replacements.

I am using Sphinx+MyST and did not want replacements or smartquotes. I created a simple project to find a solution. The conf.py contains:

extensions = ["myst_parser"]
exclude_patterns = [ ".venv", ]
source_suffix = { ".md": "markdown", }
myst_enable_extensions = [ "colon_fence", ]
myst_disable_syntax = [ "smartquotes", ]
#myst_disable_syntax = [ "replacements", "smartquotes", ]

myst-parser FAQ - Disable Markdown syntax for the parser

But ', ", --, --- and ... were still converted. Adding replacements to myst_disable_syntax changed nothing. Also notice myst_enable_extensions does not contain smartquotes. It is enabled by default.

My work around is to double escape -- like \\-\\- according to stackoverflow - Display two dashes in rst file.

It would be helpful if myst_disable_syntax = [ "smartquotes", ] disabled the conversion of ', ", --, --- and ..., or smartquotes is disabled by default and are only enabled if smartquotes is added to myst_enable_extensions.

sphinx 7.3.7
myst-parser 3.0.1
docutils 0.21.2
markdown-it-py 3.0.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working question Further information is requested topic/extension issues regarding myst extension
Projects
None yet
Development

No branches or pull requests

3 participants