Annotate tokens with their positions from the source text #98

Open
arseniiv opened this issue Jul 6, 2019 · 9 comments

@arseniiv

arseniiv commented Jul 6, 2019

In Sprache, you provided IPositionAware and Positioned to make a parse result, well, aware of the position where it was parsed. I find this feature useful for reporting the precise position of a syntax construct in post-parse checks (like “this variable right here wasn’t declared”, as opposed to the same message without a position, which leaves the user to search for the place themselves).

There is Result<T>.Location, but I don’t see how I could apply it to the resulting value via combinators. Could I achieve that here, and how would you advise doing it? (Or maybe I’m looking for the wrong thing, and this should be done another way.)

@nblumhardt
Member

Hi!

There's no built-in combinator; I think it would be reasonably easy to write one, using similar tactics to Sprache's implementation - keen to explore how it might look.

If you want to drop this into your own project I think it's roughly:

interface ILocated
{
    TextSpan Location { get; set; }
}

static TextParser<T> WithLocation<T>(this TextParser<T> parser)
    where T: ILocated
{
    return i => {
        var inner = parser(i);
        if (!inner.HasValue) return inner;
        inner.Value.Location = inner.Location;
        return inner;
    };
}

(Sketched in browser, no idea whether or not this will compile as-is ;-))

HTH,
Nick

@arseniiv
Author

arseniiv commented Jul 7, 2019

Ah, thank you! I’ll look at it and write back if I run into problems that are hard to fix. (Or if it goes smoothly, too.)

@arseniiv
Author

arseniiv commented Jul 8, 2019

Hi again, I’ve tested this code, and it works like a charm!

…Almost. The Length of every TextSpan returned always seems to be the same (the full length of the string parsed). Is that expected? I used Superpower 2.3.0 from NuGet, and here is my source and some examples.

I’m okay with having only start positions, though. Thanks once more!

@nblumhardt
Copy link
Member

Thanks for the follow-up! That's great.

I think the proper span length could be reported using something like:

    return i => {
        var inner = parser(i);
        if (!inner.HasValue) return inner;
        inner.Value.Location = inner.Location.Until(inner.Remainder);
        return inner;
    };

Let's leave this open as a nod towards implementing it within Superpower sometime in the future :-)
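
For anyone dropping this into a project, here is a minimal sketch that puts the two snippets above together with a small usage example. It is not part of Superpower. The `NameNode` class and the `NameParsers` holder are hypothetical names invented for illustration, and the `class` constraint is an added assumption so that the location is always set on the same object the parser returns.

```csharp
using Superpower;
using Superpower.Model;
using Superpower.Parsers;

public interface ILocated
{
    TextSpan Location { get; set; }
}

// Hypothetical AST node, used only for this example.
public sealed class NameNode : ILocated
{
    public string Text { get; set; } = "";
    public TextSpan Location { get; set; }
}

public static class LocationExtensions
{
    // Wraps a parser so that a successful result records the span it matched.
    // Assumption: T is a reference type, so the setter mutates the returned object.
    public static TextParser<T> WithLocation<T>(this TextParser<T> parser)
        where T : class, ILocated
    {
        return input =>
        {
            var inner = parser(input);
            if (!inner.HasValue) return inner;

            // inner.Location starts where parsing began and runs to the end of
            // the source; Until() trims it to stop where the remainder starts.
            inner.Value.Location = inner.Location.Until(inner.Remainder);
            return inner;
        };
    }
}

public static class NameParsers
{
    // Parses a C-style identifier into a NameNode and annotates it.
    public static readonly TextParser<NameNode> Name =
        Identifier.CStyle
            .Select(span => new NameNode { Text = span.ToStringValue() })
            .WithLocation();
}
```

Invoking `NameParsers.Name(new TextSpan("answer = 42"))` should then give a node whose `Location` covers only the identifier, so a later check can report `Location.Position.Line` and `Location.Position.Column` in its error message.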

@arseniiv
Author

arseniiv commented Jul 9, 2019

This modification works nicely. 🙂

@JoeMGomes

I have been trying to get this to work on a TokenListParser instead of a TextParser, but I can't figure out how to retrieve the token's source position from the parsed TokenListParserResult. I want behaviour similar to the Positioned() method from Sprache, but my grammar is fully tokenized at this stage. Below is my adaptation of the WithLocation method for TokenListParser. Is this possible with the current interface of Superpower? Am I missing something?

public static TokenListParser<TKind, T> WithLocation<TKind, T>(this TokenListParser<TKind, T> parser)
    where T : ILocated
{
    return i => {
        var inner = parser(i);
        if (!inner.HasValue) return inner;
        inner.Value.Location =   // Can't figure out how to retrieve position information from inner
        return inner;
    };
}

@nblumhardt
Member

Hi @JoeMGomes - unfortunately no time to dig in properly but hopefully this helps:

The start index of the match within the input will be inner.Location.First().Position.Absolute.

The exclusive end index will be inner.Remainder.First().Position.Absolute.

In the second case it's also possible that the remainder token list will be empty, which would mean "matched until end of stream".
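
A rough sketch of how those two indices might be wired into a token-level combinator follows. It is not part of Superpower. The `IHasSourceRange` interface is a hypothetical name made up for this example, and a `null` end index is an assumed convention for "matched until end of stream". The `class` constraint is also an assumption, added so the setters act on the object that is returned.

```csharp
using System.Linq;
using Superpower;
using Superpower.Model;

// Hypothetical interface for this sketch; not part of Superpower.
public interface IHasSourceRange
{
    int Start { get; set; }   // absolute index of the first matched token
    int? End { get; set; }    // exclusive end index; null = matched to end of stream
}

public static class TokenLocationExtensions
{
    public static TokenListParser<TKind, T> WithLocation<TKind, T>(
        this TokenListParser<TKind, T> parser)
        where T : class, IHasSourceRange
    {
        return input =>
        {
            var inner = parser(input);
            if (!inner.HasValue) return inner;

            // A parser that succeeds at the very end of the token list has
            // no first token to read a position from, so leave it unannotated.
            if (inner.Location.IsAtEnd) return inner;

            inner.Value.Start = inner.Location.First().Position.Absolute;

            // The end index is where the next unconsumed token starts, so it
            // includes any text skipped between the last matched token and
            // that token. An empty remainder means there is no next token.
            inner.Value.End = inner.Remainder.IsAtEnd
                ? (int?)null
                : inner.Remainder.First().Position.Absolute;

            return inner;
        };
    }
}
```

If a full TextSpan is needed rather than two indices, the `Span` of the first matched token could be extended instead, but that depends on every token coming from the same source string.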

@JoeMGomes

> The start index of the match within the input will be inner.Location.First().Position.Absolute.

This worked nicely for me! Thank you very much! Any reason for this not to be part of Superpower?

@nblumhardt
Member

That's good to know @JoeMGomes 👍

Just design and implementation time constraints, currently, but it seems like a worthwhile inclusion for the future 👍

nblumhardt changed the title from "Question: an analog of Positioned from Sprache?" to "Annotate tokens with their positions from the source text" on Jun 17, 2024