Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CoreNodeFormatter does not descend into children on Link nodes #295

Closed
spand opened this issue Jan 8, 2019 · 11 comments
Closed

CoreNodeFormatter does not descend into children on Link nodes #295

spand opened this issue Jan 8, 2019 · 11 comments

Comments

@spand
Copy link
Contributor

spand commented Jan 8, 2019

When the CoreNodeFormatter renders Link nodes it does not descend into its children like other inline containers ie.

private void render(Emphasis node, NodeFormatterContext context, MarkdownWriter markdown) {
	markdown.append(node.getOpeningMarker());
	context.renderChildren(node);
	markdown.append(node.getOpeningMarker());
}

Instead it simply appends the node chars. In my mind this is inconsistent and concretely it means that it will not format my ast manipulation.

I assume that it would be more correct to do much of the same as in the translating context.

@vsch
Copy link
Owner

vsch commented Jan 8, 2019

@spand, fixing it now. Did not foresee your use case. 😊

@vsch
Copy link
Owner

vsch commented Jan 8, 2019

Fix for this is available. Repo updated, maven updated but may take a while to show up in maven central.

@spand
Copy link
Contributor Author

spand commented Jan 16, 2019

Thanks for the quick reply, and sorry for not being quicker here.

I have submitted a pull request that demonstrates it is not fixed entirely (if at all) yet.

I expect the if (node.getFirstChildAny(HtmlInline.class) != null) { condition needs to be widened but I am not sure what the correct would be.

@vsch vsch reopened this Jan 16, 2019
@vsch
Copy link
Owner

vsch commented Jan 16, 2019

@spand, I pushed changes to the test case fixing your modification code. You cannot just change the AST child nodes without propagating changes to affected parent values.

In this case link's text node has to reflect the text contained in the children. That is why your AST collecting visitor was still showing the link text having link and not Link.

The code assumes consistency between AST and character ranges. It is generally not a good idea to just replace the AST outside the API because it can create inconsistent state. Also, all AST text values are BasedSequence implemented and expect the base sequence to be the same for all nodes with offsets to reflect position of text in the original input. Adding nodes based on strings violates this and can result in assertion failures.

So if you want to replace nodes you should use the NodePostProcessor extension API and not a visitor, and should follow the caveats of BasedSequence.

In most cases the node text should use PrefixedSubSequence with the original text passed as .subSequence(0,0) as the second argument so that the new text will "replace" original text but will have the correct offset location and base sequence.

@vsch vsch closed this as completed Jan 21, 2019
@spand
Copy link
Contributor Author

spand commented Jan 21, 2019

I am just looking at this now. What is it you propose doing here in #160 then ?

@vsch
Copy link
Owner

vsch commented Jan 21, 2019

@spand, if you modify the AST then you will need to ensure that all interdependent values in the AST are update, which is not a trivial task if you are not using the API for "standard" mods.

The difficulty comes from interdependent values and the fact that all text in the nodes is stored as BasedSequence instances which are subSequences of the file source. This is used to provide offsets into the original file for all text in the AST. There is a limitation that all text should come from the same base sequence.

Text from outside this base sequence can be inserted but it must still be linked to a location in the original base sequence. Effectively you can insert text outside the original text but must provide an offset where it is inserted.

The code does combine multiple disjoint sequences into a single sequence, with the requirement that offsets into original base sequence must be in increasing order. This is used to generate multi-line sequences for child nodes when these lines have the parent's prefix text in the original source and must have these prefixes removed for the child content to be processed correctly.

All these constraints make modifying the existing AST text error prone.

In #160, I recommended that if you need to make extensive AST text modifications it is a better option to use the AST for decisions and to output modified Markdown. Effectively you create custom Formatter node renderers which render modified Markdown instead of modifying the AST.

If you need the AST of the modified markdown then just parse the modified markdown into a new AST.

@spand
Copy link
Contributor Author

spand commented Jan 21, 2019

Ok. In that case my PR may not have shown what I intended. I am not modifying standard nodes.

You write:

I recommended that if you need to make extensive AST text modifications it is a better option to use the AST for decisions and to output modified Markdown. Effectively you create custom Formatter node renderers which render modified Markdown instead of modifying the AST.

That is what I am doing. I replace certain nodes with my own node types, format the AST with a Formatter instance (that delegates to CoreNodeFormatter for standard nodes).

The problem is that CoreNodeFormatter only delegate control back to the formatter for its children if node.getFirstChildAny(HtmlInline.class) != null is true. It ought to do it unconditionally like for Emphasis nodes:

private void render(Emphasis node, NodeFormatterContext context, MarkdownWriter markdown) {
    markdown.append(node.getOpeningMarker());
    context.renderChildren(node);
    markdown.append(node.getOpeningMarker());
}

So my custom formatter will never get invoked if the custom nodes are children of a Link

@vsch
Copy link
Owner

vsch commented Jan 21, 2019

@spand, for your use case you can get the results you want if you update the parent link's text with your updated text.

I will need to change the formatter implementation for handling translating and non-translating text sequences before I can make link nodes unconditionally descend into link text child elements.

Original implementation did not consider having the AST change without updating the equivalent parent text content.

@vsch
Copy link
Owner

vsch commented Jan 21, 2019

@spand, I have separated the rendering for link text during translation and de-optimized formatting so that the latter always uses renderChildren. The translation formatting logic is too hairy to fix at this time.

That way straight forward formatting works, but translation will not work without properly updating the parent link text.

Will push a new release shortly.

@vsch
Copy link
Owner

vsch commented Jan 21, 2019

Fix for this is available. Repo updated, maven updated but may take a while to show up in maven central.

@spand
Copy link
Contributor Author

spand commented Jan 21, 2019

Awesome. Thanks once again !

@vsch vsch closed this as completed Feb 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants