Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse empty array/hash like literals #10192

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

straight-shoota
Copy link
Member

Fixes #9551

As a byproduct this also allows empty array literals like String{} because that's just a different syntax for String.new.

If we don't want to allow that syntax for everything, it could be further restricted. But this patch focuses on fixing a bug.

@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:parser topic:compiler:semantic labels Jan 3, 2021
@asterite
Copy link
Member

asterite commented Jan 3, 2021

I think we should only allow this for generic types or for aliases.

@straight-shoota
Copy link
Member Author

Why? Using non-generic types is completely valid. For example HTTP::Headers in 82dcb66 (#10186).
HTTP::Headers{} should work as well with this change.

@asterite
Copy link
Member

asterite commented Jan 3, 2021

But in that case you can use new. I don't think the original issue is an actual issue.

@straight-shoota
Copy link
Member Author

Sure, you can use .new. And that's perfectly acceptable when you're writing code directly.
But when the literal is created in a macro (the OP's use case) you need to add logic to handle zero elements.
The literal syntax allows for empty collections. It already works for array and hash literals, it should also work for array-like and hash-like. That's the least surprising way.

@asterite
Copy link
Member

asterite commented Jan 4, 2021

Theres no need to add special logic in a macro. Just use new and append the elements. The end result is the same.

I just find it strange to allow String{} and such.

@Sija
Copy link
Contributor

Sija commented Jan 4, 2021

Wait, wasn't {} notation supposed to be used only for container-like objects - like HTTP::Headers, and not "simple" ones like String?

@HertzDevil
Copy link
Contributor

I don't understand the intent behind the original issue; if you omit Set's type argument, the only reasonable type for an empty set literal would be Set(NoReturn), and if the type argument is given, there is no need for the literal syntax anyway. I'm not fully convinced that the original use case necessitates an array-like literal inside a macro.

@straight-shoota
Copy link
Member Author

@Sija Yes, that's the intention. Foo{a, b} expands to %t = Foo.new; %t << a; %t << b; %t. If the literal is empty Foo{}, that leaves essentially Foo.new which just works with any type that has an argless constructor. This is an unintentional byproduct product of the implementation.

@HertzDevil A generic type without type argument won't work of course. This patch makes sure to raise an appropriate error in that case. But the describes use case applies for non-generic types and generic instance types.

I've added a check that makes sure the instance type responds to << or []= if the literal is empty. This now prevents String{}. The check is unnecessary when there are elements in the literal because then a call will be generated anyways.

@asterite
Copy link
Member

asterite commented Jan 4, 2021

I've been thinking about this. Maybe the best way to handle this is to expand these literals into a T.literal(...) macro. It accepts a single argument, a list, for array-like types, or two arguments, two lists, for Hash-like types. The nice thing about that is that the source code for the expansión is in Crystal, and it can be redefined.

Thoughts?

@asterite
Copy link
Member

asterite commented Jan 4, 2021

And a third overload without arguments for the empty list case.

@straight-shoota
Copy link
Member Author

Yes, a standardized API similar to String.interpolation would be a great solution. This is discussed in #5703.

@straight-shoota
Copy link
Member Author

We should probably still merge this. At least the parts that fix bugs and add specs for existing behaviour.

@bcardiff
Copy link
Member

I'm fine with merging this as is. If we want to implement later T.literal it can work. I am afraid if the overloads might not explode or reach a limit in the number of arguments.

Yet I find it weird to write Set {} of Int32 🙃

@asterite are you against merging this to allow the empty case?

@asterite
Copy link
Member

I guess it's fine. With this you can do:

class Person
end

person = Person {}

Maybe it doesn't matter too much 🙃

@Blacksmoke16
Copy link
Member

I personally wasn't aware you could have spaces between the {} and the type, e.g. Set {1}. Is there any reason to not require there be no spaces? E.g. Set{1}. IMO that makes it more clear what's going on.

@oprypin
Copy link
Member

oprypin commented Feb 19, 2021

There's little or maybe no precedent for significant whitespace at a word boundary. Might even be very difficult to ensure in the parser.

@asterite
Copy link
Member

The only place this happens is named argument vs type declaration. But I don't mind the space here.

@straight-shoota
Copy link
Member Author

straight-shoota commented Mar 7, 2022

I've stumbled across this again and continue to wonder why nobody has approved this yet...

This change just makes sure that array- and hash-like literals work consistently for any number of elements (including zero).

@oprypin oprypin self-requested a review March 7, 2022 20:17
node.name = generic
when GenericClassInstanceType
if node.elements.empty? && !(type.has_def?("<<") || type.has_def?("[]="))
node.raise "Type #{type} does not support array-like or hash-like literal"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Optional! The error could be a bit more explicit, like saying:

does not support array-like literal (it has no << method) or hash-like literal (it has no []= method)

@HertzDevil
Copy link
Contributor

HertzDevil commented Mar 14, 2022

It is worth mentioning that this is now possible:

class Foo(*T); end

Foo{1} # => #<Foo(Int32):0x7f913e9c1e90>
Foo{}  # => #<Foo():0x7f913e9c1ea0>

even though Foo().new is not (#10204). I think it is unexpected that the arity of the generic type arguments changes depending on the number of elements.

@HertzDevil
Copy link
Contributor

Looking at the specs, this PR indeed changes the meaning of Array{} to attempt to instantiate Array() with 0 generic type arguments. I don't think that should change. If one writes:

class Foo(*T)
  def <<(x); end
end

macro f(x)
  # *x* is an `ArrayLiteral`, which has `#splat`
  Foo{ {{ x.splat }} }
end

then f might silently return a Foo() instead of Foo(typeof(x[...])). If the generic type arguments really need to be empty then we should support Foo(){...} on top of #10204.

@straight-shoota
Copy link
Member Author

straight-shoota commented Mar 17, 2022

@HertzDevil That means Array{} would be valid and equivalent to Array(NoReturn).new, right?

@HertzDevil
Copy link
Contributor

If Array's type variables have a fixed arity then that could be the case. But I would still advise against supporting T{} whenever T is a generic type because that is inherently ambiguous between array-like and hash-like literals.

@straight-shoota
Copy link
Member Author

It's only really ambiguous if a type has a splat type parameter and implements both #>> and #[]=. That's very rare and not worth worrying about. And we can simply define that in this case array-like literal takes precedence.

@HertzDevil
Copy link
Contributor

HertzDevil commented Mar 18, 2022

Then the following will silently become a Foo(NoReturn) instead of Foo(NoReturn, NoReturn) if x is empty:

macro f(x)
  # *x* is a `HashLiteral` or `NamedTupleLiteral`
  Foo{ {{ x.double_splat }} }
end

It is very much worth worrying about because it is still enabling new behaviour, and reverting that always constitutes a breaking change, however unlikely to happen.

By the way, what would happen in this case?

class Foo(T)
  macro method_missing(call)
  end
end

Foo(Int32){}

@straight-shoota
Copy link
Member Author

straight-shoota commented Mar 18, 2022

We can easily tell apart empty array-like and empty hash-like literals for generic types based on the arity of its type parameters. 1 type param = array-like, 2 type params = hash-like. It's only ambiguous with a splat parameter. And I don't think that's a likely use case for a collection type that can be used with such a literal.
To be on the safe side, we can disallow splat type parameters with these literals altogether.

method_missing might be an argument to not do any def check (it's incomplete anyways). However, I think it would be perfectly acceptable to not support method_missing for this. Array-like and hash-like literals are about collection types and they just expect a method to add a value to the collection. It shouldn't be too hard to ask for this method to be explicitly defined and not generated from method_missing.

I've started refactoring the implementation and I'll post an update later.

@HertzDevil
Copy link
Contributor

Since the original issue only asks for a Set(String){} at the moment, we could leave the GenericClassType branch intact and that would still fix the issue nonetheless. But if we do allow uninstantiated generic types here or in a later PR, the following is what I think should be allowed at most, just to be clear:

class Foo1(T); end
class Foo2(T, U); end
class Foo3(T, U, V); end
class Foo4(*T); end
class Foo5(T, *U); end
class Foo6(T, *U, V); end

# generic, unambiguously array-like
typeof(Foo1{}) # => Foo1(NoReturn)

# generic, unambiguously hash-like
typeof(Foo2{}) # => Foo2(NoReturn, NoReturn)

# generic, cannot be instantiated with 1 or 2 type arguments
typeof(Foo3{}) # Error: Foo3(T, U, V) supports neither array-like nor hash-like literals (wrong number of type vars)

# generic, can be instantiated with either 1 or 2 type arguments
typeof(Foo4{}) # Error: cannot infer Foo4(*T)'s type variables (ambiguous array-like or hash-like literal)
typeof(Foo5{}) # Error: cannot infer Foo5(T, *U)'s type variables (ambiguous array-like or hash-like literal)

# generic, unambiguously hash-like
typeof(Foo6{}) # => Foo6(NoReturn, NoReturn)

And then #<< and #[]= may additionally be used to disambiguate the literals Foo4{} and Foo5{}; that is, they are checked only if the type variable arity is not enough.

@asterite
Copy link
Member

I don't think it's true that one type argument is array. For example you could have a Hash like type with always string keys, but the values are generic.

Given all the complexity this little change involves, and how it doesn't add anything new to the table, I think we should simply drop this. You can use new for this.

@straight-shoota
Copy link
Member Author

straight-shoota commented Mar 18, 2022

I don't think it's true that one type argument is array. For example you could have a Hash like type with always string keys, but the values are generic.

Yes, that's entirely possible. But such a type can't be used with a hash-like literal unless it's instantiated. A hash-like literal with an uninstantiated generic type infers the key and values types and instantiates the type with exactly those two type parameters.
If a hash-like literal was used with a generic type that has only one type parameter, it ould be impossible to defer whether that's supposed to be the key or value type.

@straight-shoota straight-shoota self-assigned this Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:parser topic:compiler:semantic
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: empty Set literals don't compile
7 participants