Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

when in generic should fail earlier #8603

Closed
ghost opened this issue Aug 10, 2018 · 24 comments
Closed

when in generic should fail earlier #8603

ghost opened this issue Aug 10, 2018 · 24 comments

Comments

@ghost
Copy link

ghost commented Aug 10, 2018

The following code compiles on devel (and older) if the proc containing the incorrect code is not called.

proc enumToString*(enums: openArray[enum]): string =
  when e.low.ord >= 0 and e.high.ord < 256:
    result = newStringOfCap(enums.len)
  else:
    result = newStringOfCap(enums.len * 2)
  for e in enums:
    result.add(e.enumToString)

type Test = enum
  A, B, C

#echo enumToString([A, B])
echo 1

Only if the line with the comment is uncommented it shows the error message

file.nim(12, 18) template/generic instantiation from here
file.nim(2, 8) Error: undeclared identifier: 'e'

If I understood correctly, wrong code should always be detected.

@andreaferretti
Copy link
Collaborator

Actually there is nothing wrong. The enumToString procedure is generic, and until it is specialized to a concrete type there is no way to tell whether the code will compile or not.

For another example, the following is valid:

proc sum[A](xs: seq[A]): A =
  result = 0
  for x in xs:
    result += x

echo sum(@[1, 2, 3])
echo sum(@[1.0, 2.0, 3.0])

but if you try to call sum(@["a", "b", "c"]) you get a compile time error

@ghost
Copy link
Author

ghost commented Aug 10, 2018

The code is never valid. The identifier e is not defined.
(when e.low.ord >= 0 and e.high.ord < 256:)
That's also, what is found out by the compiler: file.nim(2, 8) Error: undeclared identifier: 'e'
but only if I call the proc.

@andreaferretti
Copy link
Collaborator

I am sure it is never valid. But, as I mentioned, the compiler does not try do compile generic procs until they are specialized, it just parses them, because it does not have enough information on the types and symbols in general

@ghost
Copy link
Author

ghost commented Aug 10, 2018

That's correct for the first error message, but not for the second
file.nim(2, 8) Error: undeclared identifier: 'e'.
The identifier is never there and can never be there. The code is always wrong independent on generics.
So I expect that the compiler can detect the error here.

@andreaferretti
Copy link
Collaborator

In fact I just checked and the compiler does detect errors on missing identifiers - if these identifiers are in non-static code. For instance, the following will not compile:

proc sum[A](xs: seq[A]): A =
  result = 0
  for x in xs:
    result += y

even if sum is never called. Seems something specific to when...

@ghost
Copy link
Author

ghost commented Aug 10, 2018

Yes, that's what I thought too. In fact when the signature is changed to proc enumToString*(enums: openArray[int]): string = the error is detected.

@mratsim
Copy link
Collaborator

mratsim commented Aug 10, 2018

Seems like another symptom of #5053 or #7632:

Generics symbol resolution is not done the same way as concrete type.

@dom96
Copy link
Contributor

dom96 commented Aug 10, 2018

Yeah... this sucks but I'm pretty sure this isn't a bug.

@dom96 dom96 closed this as completed Aug 10, 2018
@zah
Copy link
Member

zah commented Aug 10, 2018

Er, it's a bug and a rather simple one. The generic pre-pass in semgnrc.nim have to examine the when condition expressions in the same way it examines all other expressions.

@drslump
Copy link
Contributor

drslump commented Aug 12, 2018

@zah I tried to come up with a fix for this issue but I haven't gone too far, our definition of "easy" must not be the same 😆

Basically I've come up with the following: devel...drslump:fixes-8603-undeclared-symbols-on-when-statements

It now correctly reports as an error the following case:

when undef_symbol:
  return

But that's about it, anything more complex than a simple symbol is not detected. I've tried to follow your advice and traverse the condition expression normally and then the body adding the withinMixing flag, that builds the compiler without issues but then when compiling system.nim it chokes on some expressions, for instance when T is string or when T is (uint|uint64), where it tries to resolve is or even | as an identifier and fails to find them.

I would be happy to have another go at it, but I guess it's above my current knowledge level so I will need some guidance to get there.

@ghost
Copy link
Author

ghost commented Aug 13, 2018

Here are new examples related to the problem, I think when just checks too less and evaluates too early compared to if:

proc test1(a: string): string = a

when -134: # why does int work here but not `if -134:`?
  echo 1
  template test2(a: string): int = test1(a) # this triggers no error if not invoced although different return type
  # proc test3(a: string): int = test1(a) # but this one is checked correctly!
elif true:
  echo 2
else:
  this_compiles---------X&Y(A)*A&&&&&&&&A

when array: echo "this compiles"
when "abc": echo "this compiles"
when openArray[int]: echo "this compiles"
when typedesc: echo "this compiles"
# note the different error messages for the following examples:
# if proc: discard
# when proc: discard
#
# both should yield the same error!
# it seems `when` evaluates before checking

I think if checking when in the same depth like if everything will work, but also the VM has to eval much more so runtime and memory usage will grow much.

@ghost ghost changed the title Invalid code inside proc creates no error if proc is not called when should check the code at the same depth like if Aug 13, 2018
@drslump
Copy link
Contributor

drslump commented Aug 13, 2018

@Tim-St it's not possible make when have the same checks as if. The when statement is like a preprocessor in C, it's evaluated early on so basically the contents of its branches can only need to be parseable (and Nim has a very flexible syntax so it accepts tons of weird things).

The bug here is that the guard expressions (the conditions that tell if it should use a block of expression or another) do have to be valid code always. That's explained in the manual actually:

  • Each condition (expr) has to be a constant expression (of type bool).
  • The statements do not open a new scope.
  • The statements that belong to the expression that evaluated to true are translated by the compiler, the other statements are not checked for semantics! However, each condition is checked for semantics.

So for example:

when this_compiles---------X&Y(A)*A&&&&&&&&A:  # should produce an error
   ...

when false:
  this_compiles---------X&Y(A)*A&&&&&&&&A  # no error here, since the condition never succeeds

Edit: I know it's too late for syntax changes but I also found it surprising the semantics of when at first. I guess something like const if (cond): (body) would be more explicit and less surprising for newcomers.

@ghost
Copy link
Author

ghost commented Aug 13, 2018

Ok, I understand that the part that cannot be reached is just ignored by the compiler, so you can write there what you want, but something like this should be checked also when it's not invoced?

proc test1(a: string): string = a
template test2(a: string): int = test1(a) # this triggers no error if not invoced although different return type
let i: int = test2("a")

@drslump
Copy link
Contributor

drslump commented Aug 13, 2018

a template is meta programming, think of it as being an abstract form that is ignored in later stages of the compiler unless it's applied.

When you call it then the compiler will have the information it needs to generate a concrete type from that abstract form via a specialization or a monophormization process. Basically it replaces the AST nodes of the template call with the expansion of its body, not a real function call at runtime.

Now you could argue that the evaluation in Nim is closed over a module, it should detect that test2 once expanded will find no candidate of test1 that satisfies a return type of int, and hence the compiler should warn about this. However in practice this is not required, because if you use test2 then a concrete type is created and it will raise the error and if not called you should get a hint that test2 is never used nor exported.

The only caveat is now what happens if you export test2, in that case the compiler will not complain and you'll be able to export test2 even if when trying to use it from another module the compiler will yield an error because there are no test1 candidates to satisfy it. That could be considered a minor bug in my opinion.

@zah
Copy link
Member

zah commented Aug 13, 2018

The fix I imagine is that you should just call semGenericStmt on the when condition expression (n.sons[0])

@drslump
Copy link
Contributor

drslump commented Aug 13, 2018

Yeah, I tried that, it's probably the good solution but it breaks when processing more complex expressions (on system.nim). My current understanding is that at this stage an expression like when x is Type for instance hasn't yet been fully processed so it handles is as yet another identifier, which tries to resolve but fails.

I guess a potential solution would be to modify the nkIdent case so it doesn't try to resolve known keywords or operators. Do you think that's a viable alternative?

@zah
Copy link
Member

zah commented Aug 13, 2018

I'll have to look at some specific examples. Inside the compiler, we keep track of which generic symbols are in "unresolved" state and we need to add guards in the code detecting such symbols in many places. I suspect something similar is going on here as well.

@dom96
Copy link
Contributor

dom96 commented Aug 18, 2018

A particularly egregious example:

const x = "qwe"
when x:
  echo x
else:
  echo "wtf?"

Output:

wtf?

This shouldn't compile.

@ghost
Copy link
Author

ghost commented Oct 29, 2018

Only the behaviour from the original post is still the same. If that's not a problem this can get closed.

@krux02 krux02 changed the title when should check the code at the same depth like if when in generic should fail earlier Oct 29, 2018
@ghost
Copy link
Author

ghost commented Oct 29, 2018

Here is an other example code that fails too late (only when used), I think it's related to the same problem:

type A* = object
  b: var A # wrong here
  c: string

# var a: A # only detected when this is uncommented / when `A` is used.

@Araq
Copy link
Member

Araq commented Oct 29, 2018

@Tim-St: That's an unrelated bug that has also been reported. The original code should not compile but does so the bug needs to remain open.

@ghost
Copy link
Author

ghost commented Oct 29, 2018

@Araq Ok, I reopened the linked bug.

Araq added a commit that referenced this issue Oct 29, 2018
@Araq Araq closed this as completed in f6def42 Oct 30, 2018
@ghost
Copy link
Author

ghost commented Oct 31, 2018

The original code is now correctly detected by the compiler, but this one is not:

proc enumToString*(enums: openArray[enum]): string =
  mixin enums
  when enums.low.ord >= 0 and enums.high.ord < 256:
    result = newStringOfCap(enums.len)
  else:
    result = newStringOfCap(enums.len * 2)

type Test = enum A, B, C

# discard enumToString([A, B, C]) # only detected when uncommented

Maybe a bit related to #9567

Update: When you compare the codes (also the original one) in vscode you see that there are two different problems marked red (incorrect). But only one problem is detected:
When commented only the e is marked as incorrect.
When uncommented e ("undeclared identifier e") and . from low.ord and high.ord ("ordinal type expected") is marked as wrong. So I assume the check is still not deep enough and maybe this is the reason that it is not detected for the new code.

@Araq
Copy link
Member

Araq commented Oct 31, 2018

There is no deep checking for generics, only a symbol lookup pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants