
Add syntax highlight variant based on CommonMark using external parser #328

Draft
wants to merge 34 commits into
base: master
from


alerque
Member

@alerque alerque commented May 12, 2020

See primary discussion at #327.

See CommonMark Spec.

This will be a long running effort. Please feel free to pitch in with ideas or pull requests against this branch.

@alerque
Member Author

alerque commented May 12, 2020

Nesting needs to work like this:

1. Container blocks
2. Leaf blocks
3. Inlines

In that order.

@alerque
Member Author

alerque commented May 12, 2020

I'm not sure where CommonMark leaves Pandoc's div syntax. An extension with an extra kind of Container block?

@fmoralesc
Member

Fenced divs? Those should be a container block.

@alerque
Member Author

alerque commented May 12, 2020

> Fenced divs? Those should be a container block.

My point was that the CM spec currently has three types of container blocks, and fenced divs are not one of them. I believe there is an allowance for them somewhere (I remember discussing this when Pandoc was considering adding the syntax), but I can't remember how it handles this. Through extensions, I think.

@alerque
Member Author

alerque commented May 14, 2020

I mistakenly branched this work off of something 52 commits behind master. I just rebased and tweaked how it is launched to make it easier to experiment with different approaches.

Set `let g:pandoc#syntax#flavor#commonmark = 1` to enable my (empty) alternate. The default is to keep using the `legacy` flavor, which should be pretty much a passthrough to what master does now.

@alerque alerque changed the title Gut current syntax, replace with new syntax targeting CommonMark Add syntax highlight variant based on CommonMark using external parser May 16, 2020
@alerque
Member Author

alerque commented May 16, 2020

@fmoralesc I know for the sake of interfacing with vim-pandoc you want to get this working in Pandoc ... and that's understandable. I have my own reasons for wanting to get this working in Lua. First, I need the general highlighting method to work from Lua for another project that this is just a proof of concept for, and I also want to be able to wire SILE internals directly into my vim buffer. Hence I'm pushing forward with being able to do both.

I rewired the Rust code so that it can generate both Python and Lua native modules from the same basic code. I actually got it working without the extra wrappers, but the trait handling was messy because pyo3 mucks around with the internals of functions and doesn't like aliased types being passed in.

I got back to basically where I was with Lua, but this time with Rust code using native types and only converting to Lua structures at the last minute. I have not gotten the Python as far along as your version yet, but I'll keep working on it some. I wasn't sure where to put the Python half of the equation relative to the rest of the plugin.

@fmoralesc
Member

fmoralesc commented May 16, 2020 via email

@fmoralesc
Member

I left some comments in gitter earlier today, I'm copying them here for reference:

Last night I implemented an algorithm to get around the multiline limitations of nvim's add_highlight mechanism.
I think we need to ditch the legacy groups: because of the differences in the "parsing" strategies, there won't be a 1:1 mapping anyway.
I also found we don't only need to capture the Start events, we also need to capture the offsets of Code, Html, FootnoteReference, Rule, and maybe HardBreak.
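
For readers following along, here is a minimal Python sketch of one way to work around the single-line limit. It is not the algorithm referred to above, only an illustration. It assumes the parser reports each element as a half-open byte range `[start, end)` into the whole buffer; the helper names `line_starts` and `split_span` are made up for this example. Each range is cut into per-line `(line, col_start, col_end)` triples, with `col_end = -1` meaning "to the end of the line", which is the convention `nvim_buf_add_highlight` uses.

```python
from bisect import bisect_right


def line_starts(text):
    """Return the byte offset at which each line of `text` (bytes) begins."""
    starts = [0]
    for i, byte in enumerate(text):
        if byte == 0x0A:  # "\n" ends a line, so the next line starts after it
            starts.append(i + 1)
    return starts


def split_span(starts, start, end):
    """Split the half-open byte range [start, end) into per-line segments.

    Returns (line, col_start, col_end) triples with 0-based lines and byte
    columns. col_end is -1 for every line except the last one in the span.
    """
    first = bisect_right(starts, start) - 1
    last = bisect_right(starts, end - 1) - 1 if end > start else first
    segments = []
    for line in range(first, last + 1):
        col_start = start - starts[line] if line == first else 0
        col_end = end - starts[line] if line == last else -1
        segments.append((line, col_start, col_end))
    return segments


# Hypothetical buffer: a two-line block quote followed by a plain paragraph.
text = b"> quoted\n> still quoted\n\nplain\n"
starts = line_starts(text)
block_quote = split_span(starts, 0, 23)  # covers both quoted lines
```

Each resulting triple could then be passed to one highlight call per line, so a multiline element ends up as a run of single-line highlights sharing the same group.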

A weakness of the mechanisms we are using now is that they don't seem to be introspectable: once we push highlighting to the buffer, we cannot retrieve much information about it back (this is something we need for context-aware autoformatting in vim-pandoc, for example). I just discovered that we can retrieve the position of the boundaries of the highlight elements with nvim_buf_get_extmarks(0, ns_id, 0, -1, {}) (this emits output like: [[85, 0, 0], [86, 1, 0], [87, 2, 0], [88, 3, 5], [89, 3, 15], [90, 5, 6], [91, 6, 0], [94, 6, 53], [92, 7, 0], [93, 8, 0], [95, 10, 0], [96, 11, 0]]). But I can't find a way to retrieve the hlgroups.
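
One possible workaround, which nobody in this thread has proposed and which is offered only as a sketch: keep the parser's event list on the plugin side and query it directly instead of asking the buffer. The example below assumes each event is a dict with `group`, `start` and `end` keys (the same keys the review snippet further down sets), that the offsets form a half-open range, and that the group names shown are placeholders. The function name `groups_at` is hypothetical.

```python
def groups_at(events, offset):
    """Return the groups of all events covering `offset`, outermost first.

    Assumes each event is a dict with "group", "start" and "end" keys and
    that [start, end) is a half-open byte range into the buffer.
    """
    hits = [e for e in events if e["start"] <= offset < e["end"]]
    # Wider spans enclose narrower ones, so sort by width, largest first.
    hits.sort(key=lambda e: e["end"] - e["start"], reverse=True)
    return [e["group"] for e in hits]


# Hypothetical events for a block quote containing an emphasised phrase.
events = [
    {"group": "BlockQuote", "start": 0, "end": 40},
    {"group": "Paragraph", "start": 2, "end": 40},
    {"group": "Emphasis", "start": 10, "end": 18},
]
context = groups_at(events, 12)
```

A context-aware formatter could call something like this with the cursor's byte offset to learn which elements enclose it, at the cost of keeping the event list in sync with the buffer.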

@alerque
Member Author

alerque commented May 16, 2020

I'm actually learning a lot from seeing your Python work, I don't think it's a waste to mess around with what vim/nvim allow different language interfaces to accomplish.

By the way, I stuck a Makefile in my branch to make it easy to build both modules while hacking on this. Eventually we'll have to refactor that into viml with a bunch of detection code to make sure we handle different vim versions correctly, but it's enough for hacking in Arch with Neovim for now. Just run `make all` to get set up.

@fmoralesc
Member

fmoralesc commented May 16, 2020

> I'm actually learning a lot from seeing your Python work, I don't think it's a waste to mess around with what vim/nvim allow different language interfaces to accomplish.

Same here for me with your rust and lua work. Earlier you said you didn't know what to do with my python code. Well, just steal the ideas and incorporate them in the lua code! 😉

I think that as far as the highlighting system goes, it makes sense to have only one language in the codebase interacting with vim (at some stage vim-pandoc used viml, python AND RUBY... those were the days... and we had python in the syntax file... 🤦 I guess we have come full circle EDIT: and let's not forget that if commonmark-hs finally pans out, we might want to have the syntax interact with a haskell library).

fn get_offsets(_py: Python, buffer: String) -> PyResult<()> {
    let events = super::get_offsets(buffer).unwrap();
    for event in events.iter() {
        eprintln!("DEBUG={:#?}", event);
Member


        //eprintln!("DEBUG={:#?}", event);
        let event_dict = PyDict::new(_py);
        event_dict.set_item("group", event.group.as_str()).unwrap();
        event_dict.set_item("start", event.first).unwrap();
        event_dict.set_item("end", event.last).unwrap();
        //event_dict.set_item("lang", format!("{:#?}", event.lang)).unwrap();
        pyevents.set_item(i, event_dict)?;
        i += 1;

@@ -0,0 +1,23 @@
use pyo3::prelude::*;
use pyo3::wrap_pyfunction;
Member


use pyo3::types::PyDict;

@@ -3,3 +3,4 @@
*.swp
tags
doc/tags
/target
Member


*.so
/target-lua
/target-python
