[PPC0019] - Should it be a sublex? #47

Open
leonerd opened this issue Jan 11, 2024 · 6 comments

@leonerd
Contributor

leonerd commented Jan 11, 2024

First interesting question: Should qt() strings be sub-lexed, or not..?

I.e. what do people feel -should- be the behaviour of a construction like

sub f { ... }

say qt(Is this { f(")") } valid syntax?);

Should it:

  1. Yield a parse error similar to the ones given in the example above?
  2. Parse as valid perl code yielding a similar result to:
     say 'Is this ', f(")"), ' valid syntax?';
  3. Something else?

I feel that interpretation 2 might be most useful and powerful, but would be inconsistent with existing behaviour of existing operators. Interpretation 1 is certainly easier to achieve as it reüses existing parser structures, but given the whole point is to interpolate code inside the {braces} it might lead to weird annoying cases that don't work so well.

Does anyone have any good examples one way or other from other languages that have a similar construction?

(Cross-posted to https://www.nntp.perl.org/group/perl.perl5.porters/2024/01/msg267671.html)

@mauke

mauke commented Jan 11, 2024

FWIW, my implementation at https://metacpan.org/pod/Quote::Code goes with interpretation 2. It recursively parses embedded code blocks as it scans through the string. This is different from how perl does things, but I've never found a case where perl's "scan for the end and reparse" approach is actually useful. All it does is cause bizarre problems (e.g. with source filters that want to rewrite code). I'd welcome a change.

JavaScript's template literals (`...`) also work this way:

console.log(`a ${ 'b' } c`); // "a b c"
console.log(`a ${ '}' } c`); // "a } c"
console.log(`a ${ `}` } c`); // "a } c"
console.log(`a ${ `} ${ '}' }` } c`); // "a } } c"
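As a rough sketch of the recursive approach (not the actual Quote::Code implementation, which I have not reproduced here), the scanner can hand each `{...}` block to a code-aware helper that skips over quoted strings before it resumes looking for the closing delimiter. Both function names are hypothetical, and the sketch assumes the string delimiter is not itself a brace and that only `'...'` and `"..."` literals appear inside the code.

```javascript
// Interpretation 2, sketched: code blocks are scanned with awareness of
// string literals, so delimiters inside them do not end the string.
function skipCodeBlock(input, i) {
  // input[i] is the "{" opening the block; returns the index of its matching "}"
  let depth = 0;
  for (; i < input.length; i++) {
    const ch = input[i];
    if (ch === '"' || ch === "'") {
      // skip a quoted string, honouring backslash escapes
      const quote = ch;
      for (i++; i < input.length && input[i] !== quote; i++) {
        if (input[i] === "\\") i++;
      }
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}" && --depth === 0) {
      return i;
    }
  }
  return -1; // unterminated block
}

function scanSublexed(input, start, open, close) {
  let depth = 1; // already inside the opening delimiter
  for (let i = start; i < input.length; i++) {
    const ch = input[i];
    if (ch === "\\") { i++; continue; }
    if (ch === "{") {
      i = skipCodeBlock(input, i); // jump to the block's closing brace
      if (i < 0) return -1;
      continue;
    }
    if (ch === open) depth++;
    else if (ch === close && --depth === 0) return i;
  }
  return -1; // unterminated string
}

const input = 'qt(Is this { f(")") } valid syntax?)';
const close = scanSublexed(input, 3, "(", ")");
console.log(input.slice(3, close)); // Is this { f(")") } valid syntax?
```

Here the `)` inside the string literal and the `)` closing the call are both consumed by the code-block scanner, so the whole text up to the final `)` is taken as the string body.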

@leonerd
Contributor Author

leonerd commented Jan 11, 2024

@mauke

I've never found a case where perl's "scan for the end and reparse" approach is actually useful.

Oh I don't think anyone ever suggested it was useful. It's relatively simple and cheap to implement which may be why it's done that way currently. I only suggest it as an option for consistency with existing quoting.

@book
Contributor

book commented Nov 22, 2024

Possibly the main issue is familiarity? All the existing quote-like operators work by scanning for the end and then reparsing, and there is ample (and ancient) documentation about the gory details of parsing quoted constructs.

If qt is championing a change to sublexing (which will make it much easier to copy/paste code blocks inside the {} construct), are there other quote-like operators that would benefit from such a change?

In my understanding this is only an issue for paired delimiters and interpolation.
Extending sublexing to qq would mean that the following would work:

qq{ foo ${ \ "}" } bar } # would yield " foo } bar "

In my understanding, qq, qx, qr, m and s would benefit from it (but not q, qw, tr and y, since they do not interpolate).

Should there be a sublexing feature that would enable the new parsing? Then I suppose we'd have our first pair of dependent features.

@rabbiveesh
Contributor

Even though I think that sublexing sucks, and as a parser author I would have to go out of my way to implement the sublexing logic if I cared to parse exactly like perl, there's no real reason to change existing code.
The case where you would run into the issue is mainly in more advanced interpolation, which is what qt is being proposed for. So old code already works, and new code doesn't need a feature to change lexing behavior at a distance; it can just use the new operator.
It's easy enough to justify that this works without any footguns, because it's specifically for the case where this could be a problem.

@leonerd
Contributor Author

leonerd commented Nov 26, 2024

There's been very little movement on this, so unless anyone screams very loudly, I'm going with option 2.

@book
Contributor

book commented Nov 26, 2024

Option 2 sounds like the more useful one, especially because it will minimize the need for escaping.
