inlinePatterns should have access to their parent/ancestors. #596

waylan · 2017-11-15T16:46:32Z

It is reasonable to expect that some inlinePatterns should not apply if an ancestor is one of a specific set of elements. For example, an @mention should not be converted to a "mention link" if the text is in a link label (like this: [@waylan](https://example.com/waylan)). Also consider that more distant ancestors, not just the immediate parent, are potentially relevant ([**@waylan**](https://example.com/waylan)). In fact, see Python-Markdown/github-links#5 for that very example.

Perhaps markdown.inlinepatterns.Pattern.handleMatch should have an ancestors keyword passed to it which would include a list of ancestor elements. I doubt we need to actually pass the elements themselves. Presumably, strings of their tag names would be sufficient. Do we only pass the inline elements, or include the block-level elements all the way up to the root? Should we pass a set (with no repetitions) or a list (with all elements in order)?

I'm thinking perhaps it could be used something like this:

from markdown.inlinepatterns import Pattern

class MentionPattern(Pattern):

    def handleMatch(self, m, ancestors):
        if 'a' in ancestors:
            # A false match. Return None to indicate no change.
            return None
        # build and return element here...

As a reminder, it would not be a good idea to pass the parent element, as the element could include text before and after the match that would need to wrap the created element. While the user (extension dev) could access and rebuild the content correctly, that just opens up more ways for them to break things. Therefore, I'm inclined to limit this to a list of tags (as string names).
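To make the set-versus-list question concrete, here is a minimal sketch of how ancestor tag names could be collected from a tree. It uses only the standard library's ElementTree; the helper name ancestor_tags is hypothetical and is not part of Markdown's API.

```python
import xml.etree.ElementTree as etree


def ancestor_tags(root, target):
    """Return the tag names from root down to target's parent.

    Returns None if target is not found under root.
    """
    def walk(node, path):
        if node is target:
            return path
        for child in node:
            found = walk(child, path + [node.tag])
            if found is not None:
                return found
        return None

    return walk(root, [])


# Build <div><p><a><strong>@waylan</strong></a></p></div>
root = etree.Element("div")
p = etree.SubElement(root, "p")
a = etree.SubElement(p, "a")
strong = etree.SubElement(a, "strong")
strong.text = "@waylan"

as_list = ancestor_tags(root, strong)  # ordered, block-level tags included
as_set = set(as_list)                  # membership only, order is lost
```

Either form answers "is there an a somewhere above me?". Only the list could answer questions about nesting order, at the cost of carrying duplicates.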

waylan · 2017-11-15T17:06:33Z

Another option would be to hardcode the exception into Markdown itself. For example, a Pattern class could define a list of ancestors to "exclude" itself from:

from markdown.inlinepatterns import Pattern

class MentionPattern(Pattern):
    exclude = ['a']
    # Implement Pattern here...

Then the pattern would never even be run if an ancestor element was in the exclude list.
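A rough sketch of how such a check might look on the calling side. The Pattern class here is a toy stand-in rather than the real markdown.inlinepatterns.Pattern, and the applicable helper is a hypothetical name used only for illustration.

```python
import re


class Pattern:
    """Toy stand-in for an inline pattern (not the real Markdown class)."""

    exclude = []  # tag names under which this pattern must not run

    def __init__(self, regex):
        self.regex = re.compile(regex)


class MentionPattern(Pattern):
    exclude = ['a']


def applicable(patterns, ancestors):
    """Return only the patterns allowed to run under the given ancestor tags."""
    return [p for p in patterns if not set(p.exclude) & set(ancestors)]


mention = MentionPattern(r'@(\w+)')
emphasis = Pattern(r'\*(.+?)\*')

in_paragraph = applicable([mention, emphasis], ['div', 'p'])
in_link = applicable([mention, emphasis], ['div', 'p', 'a', 'strong'])
```

With this shape the pattern author only declares the tags; the decision to skip the pattern stays inside Markdown.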

facelessuser · 2017-11-15T17:25:24Z

I agree that having access to ancestors would be very cool. I've run into a similar issue targeting raw, bare links (GitHub style auto-linking). In your case, it seems escaping the mention would solve your problem, but it would be more intuitive if you didn't have to.

facelessuser · 2017-11-15T17:39:40Z

Are we just passing tag names? Or do we get attributes of tags as well?

waylan · 2017-11-15T18:11:30Z

I'm thinking just tag names. I'm not keen on passing the element instances for the reasons mentioned and I don't really see any sense in passing attributes without the instances of the objects. I actually like my second suggestion better (exclude), but I also understand that that is less flexible for extension devs.

Another reason not to pass the element instances is that a simple test such as if 'a' in ancestors would not work. Instead you would need if 'a' in [t.tag for t in ancestors]. Obviously still doable, but less than ideal. Of course, if we were using some custom element object (see #420), we could include methods/properties to make this easier. But that is both a bigger refactor and a backward-incompatible change.
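A small standard-library illustration of that difference, assuming the ancestors are plain ElementTree elements (which expose their name as the tag attribute):

```python
import xml.etree.ElementTree as etree

p = etree.Element("p")
a = etree.SubElement(p, "a")
ancestors = [p, a]  # element instances rather than tag names

# A string never compares equal to an Element, so this is always False.
by_instance = 'a' in ancestors

# Extracting the tag names first gives the expected answer.
by_tag = 'a' in [el.tag for el in ancestors]
```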

facelessuser · 2017-11-15T18:28:37Z

Yeah, I wasn't necessarily implying that we pass the element. I was thinking more of just a list of tags, or a list of tuples that contain the tag name and attributes, but even just the exclude name is fine.

To be honest, links are the only case where I have ever had this issue and wished I had access to parents. But maybe that is simply because I'm so used to the current constraints that I haven't thought about how we could leverage ancestry to improve existing syntax behavior.

waylan · 2017-11-15T19:11:13Z

> To be honest, links are the only case where I have ever had this issue and wished I had access to parents. But maybe that is simply because I'm so used to the current constraints that I haven't thought about how we could leverage ancestry to improve existing syntax behavior.

Same here. That's why I like the "exclude" solution. It meets the needs I've actually come across. But a more flexible solution might open up possibilities I haven't even thought of.

waylan · 2017-11-15T21:37:54Z

Groan. I just looked at the inline pattern calling code for the first time in a long time. There are so many levels of recursion there (all for good reason) that the location where an element's tag is known and the location where the pattern is run are so far removed from each other that it is non-trivial to implement either of my proposed solutions. The existing code doesn't even check for code tags, as the BacktickPattern simply returns the text as an AtomicString to avoid further processing. IIRC, AtomicString was first introduced as an easier way to implement that behavior. I recall being resistant to it at first, but did see its value. However, it doesn't help us here, as in this case we're not trying to stop all processing, only a select few patterns. As an aside, this all has to do with the underlying reasons why I had considered the refactor mentioned in this comment. But that refactor would be a backward-incompatible change which would break all existing inline patterns in all extensions everywhere. Sigh.
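To show why an AtomicString-style marker is all-or-nothing, here is a toy sketch. The AtomicString class and apply_patterns function below are simplified stand-ins written for this illustration, not Markdown's own code.

```python
import re


class AtomicString(str):
    """Toy marker type: text that must not be processed any further."""


def apply_patterns(text, patterns):
    """Run every (regex, replacement) pair over text unless it is atomic."""
    if isinstance(text, AtomicString):
        return text  # every pattern is skipped, not just selected ones
    for regex, repl in patterns:
        text = re.sub(regex, repl, text)
    return text


patterns = [
    (r'@(\w+)', r'<mention:\1>'),
    (r'\*(.+?)\*', r'<em>\1</em>'),
]

plain = apply_patterns("*hi* @waylan", patterns)
atomic = apply_patterns(AtomicString("*hi* @waylan"), patterns)
```

The marker cannot say "skip mentions but still apply emphasis", which is exactly what a link label needs.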

facelessuser · 2017-11-15T22:19:41Z

That's the inline pattern code I remember; full of recursion and confusion.

facelessuser · 2017-11-16T19:51:49Z

Looking over this code I think this may be possible. I'll play with it maybe tonight if I have time.

facelessuser · 2017-11-17T01:49:45Z

I've posted a simple backwards compatible experiment that uses ancestry to avoid processing inline code blocks. Obviously this is not a practical case and code blocks should use AtomicString as that is better suited for handling that case, but it is used to illustrate how this could work for others. This is just a prototype: https://github.com/Python-Markdown/markdown/tree/experiment-ancestory.

facelessuser · 2017-11-17T01:50:18Z

And yes I spelled ancestry wrong in the branch name...but it's too late for that 🙂 .

waylan · 2017-11-17T01:58:35Z

Cool! For easy reference, compare the changes here.

facelessuser · 2017-11-17T02:05:20Z

It was much easier than I thought. Theoretically you could later add inclusion in addition to the exclusion, but I can't think of a reason to right now.

If we want to go in this direction, I can work towards a final solution with unit tests and such.

waylan · 2017-11-17T02:50:23Z

Yep, I think this direction makes sense.

facelessuser · 2017-11-17T03:44:58Z

I may have something by tomorrow. As I go through tests, I see more places we need to account for ancestors in the inline treeprocessor, but it is coming together.

facelessuser · 2017-11-17T03:47:29Z

By the way, do we want to add Python 3.5 and 3.6 to Travis?

facelessuser · 2017-11-17T03:57:16Z

I've got some other things to do tonight, so here is the WIP (compare). I'll pick this up tomorrow.

facelessuser · 2017-11-17T14:42:02Z

I think this is working now. The only thing it can't handle is exclusion in a raw HTML link <a href="#">@mention</a>. I don't think raw inline HTML ever makes it into the tree, as the HTML syntax is just captured as pieces of HTML instead of as a whole tag. Block HTML definitely doesn't, but I am less concerned about that. Not handling inline raw HTML isn't a deal breaker for me. There might be ways to handle this (none that are trivial), but I'm not sure we care at this point.

waylan · 2017-11-17T16:57:03Z

You make a good point. Markdown is processed inside raw inline HTML. To accomplish that, only the tags themselves are stashed, not their contents. And we do nothing to keep track of the tags used in the raw HTML.

However, as this is raw HTML, I don't know that I care so much. Users already need to be careful with raw HTML, as Markdown is not very intelligent about it, and if you're not careful you can easily end up with invalid HTML. Given that an extra level of complexity already exists for document authors when working with raw HTML, I'm okay requiring them to also escape Markdown content inside raw HTML when necessary (for example: <a href="#">\@mention</a>).
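A toy model of that behavior, under the assumption that only the tags are swapped for placeholders while the text between them is still run through the patterns. The regexes, the placeholder format, and the process function are all invented for this sketch and do not reproduce Markdown's real implementation.

```python
import re

TAG = re.compile(r'</?\w+[^>]*>')          # a raw inline HTML tag
MENTION = re.compile(r'(?<!\\)@(\w+)')     # an @mention not preceded by a backslash


def process(text):
    """Stash raw tags, apply the mention pattern, then restore the tags."""
    stash = []

    def stash_tag(m):
        stash.append(m.group(0))
        return '\x02%d\x03' % (len(stash) - 1)

    text = TAG.sub(stash_tag, text)                 # only the tags are hidden
    text = MENTION.sub(r'[mention:\1]', text)       # the content is still processed
    text = text.replace('\\@', '@')                 # drop the escaping backslash
    return re.sub(r'\x02(\d+)\x03', lambda m: stash[int(m.group(1))], text)


unescaped = process('<a href="#">@mention</a>')
escaped = process('<a href="#">\\@mention</a>')
```

Because the stashed tags never become tree elements, nothing records that the text sits inside an a tag, so escaping is the only way for the author to suppress the match.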

facelessuser · 2017-11-17T17:26:30Z

I agree. Raw not working isn't a big deal for me. For a long time we've dealt with raw HTML not being in the tree as well.

In order to handle raw tags as actual tags there would have to be a big refactor. I'd prefer to take this as it is right now.

facelessuser · 2017-11-18T01:34:22Z

I think I'm done. Feel free to review and make suggestions in #598.

facelessuser mentioned this issue Nov 15, 2017

MagicLink: Autolink should not occur in anchor tags facelessuser/pymdown-extensions#151

Closed

waylan closed this as completed in de5c696 Nov 23, 2017
