Support MkDocs fenced codeblock attributes and dot prefixed language #153
Comments
This RegEx seems to do the job of getting rid of the prefixed white space on attributes, and it removes the use of named groups.
Running the test with this regex yields the following changed files:
reenberg
added a commit
to reenberg/vscode-markdown-tm-grammar
that referenced
this issue
Dec 21, 2023
In microsoft#57 support for Codebraid syntax was added, which is essentially just Pandoc attribute syntax with a specific class attribute added. The support was added as an extra `identifier` in the list of languages that Codebraid supports, such as for python: `\\{\\.python.+?\\}`.

The example below gives the scope "text.html.markdown markup.fenced_code.block.markdown fenced_code.block.language.markdown" to the entire line:

    ```{.python .cb.nb jupyter_kernel=python3}
    ```

However, the "language scope" should only be given to the "python" part, the current support doesn't allow spaces between the curly braces, and it lacks support for all languages.

MkDocs allows a few ways to annotate fenced code blocks, but if additional classes, an id, or key/value pairs are used, then the curly braces must be used and the language must be prefixed with a dot. In simple cases where only the language is specified, the curly braces and the dot may be omitted. The following are quick examples:

    ``` { .python #id .class title="My Title"}
    ```

or

    ``` python
    ```

This change removes the Codebraid support from the specific languages as an `identifier` attribute and moves it into the RegEx by defining it as two alternative cases: attributes surrounded by curly braces, or attributes allowed after the language:

1. The case where the entire line after the code fence is wrapped in curly braces. In this case the curly braces are not part of the language and attribute scope.
2. The case where the attributes follow the language specification in all sorts of ways (I'm specifically thinking of you Gatsby microsoft#62). In this case the curly braces are included in the attribute scope, as it is not trivial to handle all the various ways they may be used, and since this is the current behavior.

@microsoft-github-policy-service agree

Closes microsoft#153

Refs: https://github.com/Python-Markdown/markdown/blob/master/docs/extensions/fenced_code_blocks.md
reenberg
added a commit
to reenberg/vscode-markdown-tm-grammar
that referenced
this issue
Dec 22, 2023
The following three examples of fenced code blocks are valid in MkDocs, according to the docs: https://github.com/Python-Markdown/markdown/blob/master/docs/extensions/fenced_code_blocks.md
However currently only the first one is highlighted in VS Code, as python code:
The rest of them are not highlighted currently:
If the space between the opening curly brace and the dot-prefixed language attribute is removed in the last example, then it is matched, due to #57, which added support for Codebraid-style Pandoc attributes.
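To make the space sensitivity concrete, here is a minimal sketch that tests the Codebraid identifier from #57 in isolation. It assumes the pattern `\{\.python.+?\}` (the JSON-escaped form quoted in the commit message, unescaped) and uses Python's `re` module rather than the Oniguruma engine that VS Code uses, so it only approximates how the grammar behaves. The helper name `is_codebraid_python` is hypothetical.

```python
import re

# Codebraid identifier for python as quoted in the commit message for #57,
# with the JSON escaping removed. Assumption: tested on its own, outside the
# surrounding fenced-code-block rule of the real grammar.
CODEBRAID_PYTHON = re.compile(r"\{\.python.+?\}")


def is_codebraid_python(info: str) -> bool:
    """Return True if the whole info string matches the Codebraid identifier."""
    return CODEBRAID_PYTHON.fullmatch(info) is not None


if __name__ == "__main__":
    # No space after the opening brace: the identifier matches.
    print(is_codebraid_python("{.python .cb.nb jupyter_kernel=python3}"))
    # A space after the opening brace: "\{" must be followed directly by "\.",
    # so the identifier does not match.
    print(is_codebraid_python("{ .python .cb.nb jupyter_kernel=python3 }"))
```

The pattern requires the dot to follow the opening brace immediately, which is why the variant with a space falls through and is left without highlighting.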
I have been playing around with an updated RegEx that will properly match the above by 1) allowing languages to be dot prefixed, and 2) generalising the Codebraid contribution by removing it as an identifier of the few supported languages and including it in the RegEx so all languages can be surrounded by curly braces:
I decided to use named groups in the regex so that I could back-reference them in the second scenario. I don't know if this makes the RegEx slower compared to explicitly inserting the language and attribute specification twice.
Currently this updated RegEx only changes `test/colorize-results/pr-57_md.json`, as it no longer assigns the language scope to the entire string "{ .python .cb.nb jupyter_kernel=python3 }". Instead it discards the braces and the dot, assigns the language scope to the string "python" (as one would expect), and assigns the attribute scope to the rest: " .cb.nb jupyter_kernel=python3".
The downside currently seems to be that it includes the leading space in the attribute part. However, I'm not sure this is worth spending more energy on, as the attributes are not really used for anything as far as I can see; at least it now actually assigns the attribute scope in that example.
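The split described above can be illustrated with a small sketch. This is not the regex proposed in this issue; it is a simplified stand-in written for Python's `re` module (the grammar itself runs on Oniguruma), and the names `INFO` and `split_info` are hypothetical. It assumes two alternatives: the whole info string wrapped in curly braces, or a language followed by arbitrary attributes.

```python
import re

# Simplified stand-in for the two cases, using numbered groups because
# Python's re does not allow the same group name in both alternatives.
INFO = re.compile(
    r"""
    ^\s*
    (?:
        \{\s*\.?([^\s{}]+)([^}]*?)\s*\}   # case 1: whole line wrapped in braces
      |
        \.?([^\s{}]+)(.*?)                # case 2: language, then anything else
    )
    \s*$
    """,
    re.VERBOSE,
)


def split_info(info):
    """Split a fence info string into (language, attributes), or return None."""
    m = INFO.match(info)
    if m is None:
        return None
    if m.group(1) is not None:
        # Braces and the optional dot are dropped; the attributes keep their
        # leading space, mirroring the downside mentioned in the issue text.
        return m.group(1), m.group(2)
    return m.group(3), m.group(4)


if __name__ == "__main__":
    for example in (
        "{ .python .cb.nb jupyter_kernel=python3 }",
        '{ .python #id .class title="My Title"}',
        ".python",
        "python",
    ):
        print(repr(example), "->", split_info(example))
```

In this sketch the attribute group in the brace-wrapped case starts right after the language, so the leading space is captured with it. Trimming it would mean consuming the whitespace before the attribute group begins.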