Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser chokes on paths containing "operator" #374

Closed
Exxion opened this issue Oct 22, 2023 · 4 comments
Closed

Parser chokes on paths containing "operator" #374

Exxion opened this issue Oct 22, 2023 · 4 comments

Comments

@Exxion
Copy link

Exxion commented Oct 22, 2023

It is perhaps unwise to include operator in a path, but BYOND accepts it.

/datum/operator

/datum/operator/proc/test()

This emits a warning that the proc is relatively-pathed, and confuses the language server into thinking the proc is global. If you leave out the first line, it also emits a "bad parent type" error.

/datum/test/proc/test(datum/operator/thing)

This causes a fatal error in the parser.

It would not be unreasonable to just treat operator as a reserved word and disallow its use in paths. OpenDream does this by default. It shouldn't break the parser, though.

@SpaceManiac
Copy link
Owner

Unfortunately it's difficult to adapt the parser to allow both of the following to work:

/datum/operator/()
    // this is the overload for `/` division operator for `/datum`
/datum/operator/unrelated_proc()
    // this is an unrelated proc on `/datum/operator`

So I may leave this as a won't-fix.

@Zandario
Copy link

Zandario commented Jul 2, 2024

Unfortunately it's difficult to adapt the parser to allow both of the following to work:

/datum/operator/()
    // this is the overload for `/` division operator for `/datum`
/datum/operator/unrelated_proc()
    // this is an unrelated proc on `/datum/operator`

So I may leave this as a won't-fix.

Typepaths containing operator within the path work fine* without any issues? It's only when it's present within a named argument's typepath.

image

Recreation Code

/operator
	binary
		proc/add(input)
		proc/sub(operator/input)

Other issues with operator paths

What's interesting is that defining absolute paths with it complains about a bad parent, which also breaks the type tree
02-Tuesday-Code_7855 02-Tuesday-Code_7850

Having an absolute path with operator in it anywhere in your project will create a dud type in the tree. Though if you noticed with my first example it was fine when I did relative pathing
02-Tuesday-Code_7857 02-Tuesday-Code_7856

@LummoxJR
Copy link

LummoxJR commented Jul 3, 2024

Adding context for how BYOND's parser works: There are different modes the parser can be in, one of which is reading path tokens. A path token is either of the / or : operators or a name token (anything that would work as a var name).

When the operator keyword is read as a path token, the lexer immediately looks at the next character following it, and depending on what else comes next, the characters after that. In some cases if the first character after operator might be part of an overloadable operator but turns out not to be, the parser will rewind the input stream back to just after the keyword and the token is just operator. Otherwise it keeps going to get a complete operator and adds that to the end of the token, to create a token such as operator*=.

As an example of a normal case without rewinding: If + follows operator then that's either + or ++ or +=, so the next character is checked to see if it's = or the same as the first. This also means - and * (until I ever get around to adding **=) follow the same logic. The most complicated cases start with < since it covers a lot of operators, including <=> coming in 516.

Rewinding after operator is only done for a few specific situations:

  • When the [ character is encountered without a matching ], it rewinds. (If there is a matching ] it checks if there's a following = to tell operator[] and operator[]= apart.)
  • When : is encountered without a = after it, it rewinds, since := is overloadable but : is a path operator and should be parsed separately.
  • For the / character, if the following character is another / or a * it rewinds because both of those indicate a comment. An underscore or alphabetical character following the slash, which would be the start of a new name, also rewinds because that indicates path parsing is still ongoing. Otherwise it's considered a valid operator match and just checks if = comes next to differentiate between operator/ and operator/=.

@SpaceManiac
Copy link
Owner

I appreciate the in-depth explanation. SpacemanDMM's parser is fairly different to the official DM parser - it's not built to be able to rewind, but it's split into a lexing and parsing phase, so it can make decisions by looking one whole token ahead, rather than just one character. This gap does mean I've had to add special cases for certain things (like the difference between a ? b : c:d and a ? b:c : d). This will have to be another one.

SpacemanDMM's tree path parsing basically alternates between reading identifiers and reading / (or : or .). But it's looking for tokens, not characters, so /= and comments aren't a concern at this stage.

Previously, if it saw operator, it would bail immediately with the expectation that the word should then be followed by one of the operator tokens. But / was one of these, so operator / proc became operator/ followed by a seemingly out-of-place proc word. This was done in the fix to #362 - observing that at the time, operator/ was actually just being parsed as operator followed by a dangling / that gets ignored.

The fix is to not bail immediately. Instead, if the parser is in the state where it's just read a / but there's no identifier following (normally ignored as dangling), then it checks if the final word of the path was operator, and only then turns it into operator/.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants