Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix assignments in array literals #10009

Conversation

straight-shoota
Copy link
Member

@straight-shoota straight-shoota commented Dec 2, 2020

When assigning inside an array literal, the compiler issues a weird error that doesn't make sense from the code presented to the user:

[c = 3]
c # Semantic Error: read before definition of 'c'

The reason for this is that the above code actually expands to this:

::Array(typeof(c = 3)).build(1) do |__temp|
  __temp[0] = c = 3
  1
end
c

So c is declared inside the build block and not visible in the original scope.

This patch changes array literal expansion to unwrap the behaviour of Array.build into the current scope.
Our code example now expands to this:

__temp = ::Array(typeof(c = 3)).unsafe_build(1)
__temp.to_unsafe[0] = c = 3
c

A necessary change to stdlib's Array class is the new undocumented constructor unsafe_build which initializes the buffer and sets size to capacity.

To enable smooth progression between Crystal releases, Array.unsafe_build should be merged alone and the compiler change can follow after the next release.

Also adds codegen specs for assigning in array-like, hash-like and hash literals. They are already working correctly because there's no additional scope introduced.

Resolves #3195


EDIT: Original proposal was this, has been changed to the above during the course of the discussion.

__temp = ::Array(typeof(c = 3)).new(1)
__temp.to_unsafe[0] = c = 3
__temp.size = 1
c

We only need a tiny change in stdlib because Array#size= is protected but we need to call it from outside (Array.build calls it internally). The method is undocumented but to avoid accidental calls, I named the public method unsafe_size=.
size= can probably be removed, but that's left for later to ensure backwards compatibility.

This enables the compiler to expand array literals which currently
uses Array.build and introduces a new scope.
Unwraps the behaviour of Array.build to avoid introducing an additional
scope which breaks assignments to local variables.

Also includes codegen specs for array-like, hash-like and hash literals
(which are not broken).
@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:semantic labels Dec 2, 2020
@asterite
Copy link
Member

asterite commented Dec 2, 2020

I kind of dislike exposing setting the size of an array, even if it's undocumented.

That said, anyone could reopen array and do the same so maybe this is fine.

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 2, 2020

Alternative to that setter could be an initializer that directly sets size = initial_capacity. That might be better. You can still call that and do unsafe stuff, but you can't change an existing array that easily. It also reduces the interface between compiler and stdlib.

This avoids having a publicly accessible setter for Array#size
@bcardiff
Copy link
Member

bcardiff commented Dec 2, 2020

Maybe a different expansion would another alternative for this

a = [1, b = 2, c = 3, 4]
c

Introducing temp variables for each element.

%e0 = 1
%e1 = b = 2
%e2 = c = 3
%e3 = 4
a = ::Array(typeof(c = 3)).build(1) do |__temp|
  __temp[0] = %e0
  __temp[1] = %e1
  __temp[2] = %e2
  __temp[2] = %e3
  1
end
c

@asterite
Copy link
Member

asterite commented Dec 2, 2020

won't work if the assignment is nested even deeper

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 2, 2020

I thought we could keep it simple and avoid all those additional temp variables and block. This works:

__temp_2 = ::Array(typeof(c = 3)).new(1, __temp_2 = uninitialized typeof(c = 3))
__temp_2.to_unsafe[0] = c = 3
__temp_2

But it seems I've hit some unrelated compiler bug with this: #10018

@straight-shoota
Copy link
Member Author

We could skip that uninitialized default value and either use an actual value or add an undocumented Array constructor that sets @size = @capacity. The latter is probably best becaus it avoids unnecessary initialization of the buffer.

This is more efficient than initializing a default value but requires
stdlib support.
@straight-shoota
Copy link
Member Author

This should be good now 👍

The literal now expands to:

__temp = ::Array(typeof(c = 3)).unsafe_build(1)
__temp.to_unsafe[0] = c = 3
c

I've updated the original description.

straight-shoota added a commit to straight-shoota/crystal that referenced this pull request Dec 17, 2020
This prepares the API for refactoring array literal instantiation
without a block scope.

crystal-lang#10009
straight-shoota added a commit to straight-shoota/crystal that referenced this pull request Dec 17, 2020
This prepares the API for refactoring array literal instantiation
without a block scope.

crystal-lang#10009
Comment on lines 10 to 13
__temp_1 = ::Array(Int8).unsafe_build(2)
__temp_1.to_unsafe[0] = 1
__temp_1.to_unsafe[1] = 2
__temp_1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change this to be:

__temp_1 = ::Array(Int8).unsafe_build(2)
__temp_2 = __temp_1.to_unafe
__temp_2[0] = 1
__temp_2[1] = 2
__temp_1

That way the to_unsafe call would be needed to be computed just once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected that call to be inlined. But sure, I'll change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually it is inlined by the codegen because it's just returning an instance var. But the semantic phase will have to look it up once per each element, so maybe it's too much.

@bcardiff bcardiff merged commit a80ed6d into crystal-lang:master Feb 18, 2021
@straight-shoota straight-shoota deleted the fix/compiler-assignment-array-literal branch February 18, 2021 14:02
@straight-shoota
Copy link
Member Author

I just realized yesterday that instead of introducing Array.unsafe_build we could have just used Array.build(size) { size } 🤦
We could probably refactor that and remove unsafe_build. It's not public API, so it can be done quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:semantic
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Assignment to variables inside array literals causes "read before definition of"
3 participants