Add Dict comprehensions and typed Dicts #1467

Closed
wants to merge 3 commits into from

carlobaldassi
Member

This would close #287.
Examples (updated with changes following the discussion):

Dict comprehension:

julia> { i/2 => 2i for i = 1:4 }
{2.0=>8,0.5=>2,1.5=>6,1.0=>4}

julia> typeof(ans)
Dict{Float64,Int64}

Typed Dict comprehension:

julia> (Real=>Real){ i/2 => 2i for i = 1:4 }
{2.0=>8,0.5=>2,1.5=>6,1.0=>4}

julia> typeof(ans)
Dict{Real,Real}

Typed Dict:

julia> (Real=>Real){ 1=>2, 0x5=>2.0 }
{0x05=>2.0,1=>2}

julia> typeof(ans)
Dict{Real,Real}

In contrast to array comprehensions, the length of the iteration is not required:

julia> require("iterators.jl")

julia> import Iterators.*

julia> { i/2 => 2i for i in take(count(), 4) }
{0.0=>0,0.5=>2,1.5=>6,1.0=>4}

There are no restrictions on the number of iterators:

julia> { i+j => i-j for i=1:2,j=[4,8] }
{5=>-3,6=>-2,10=>-6,9=>-7}
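
For readers comparing against later Julia releases, the four cases above can be sketched with the `Dict` constructor applied to a generator. This is an assumption-laden illustration: it targets Julia 1.x syntax, not the `{...}` syntax this PR proposes, and `Iterators.countfrom` and `Iterators.take` stand in for the `count` and `take` functions from `iterators.jl`.

```julia
# Sketch only: written for Julia 1.x (an assumption), not for the parser in this PR.

# Untyped comprehension: key and value types are inferred from the pairs.
d1 = Dict(i/2 => 2i for i in 1:4)

# Typed comprehension: key and value types are fixed up front.
d2 = Dict{Real,Real}(i/2 => 2i for i in 1:4)

# The length of the iteration is not needed in advance:
# take the first 4 values of a counter starting at 0.
d3 = Dict(i/2 => 2i for i in Iterators.take(Iterators.countfrom(0), 4))

# Several iterators in a single comprehension.
d4 = Dict(i+j => i-j for i in 1:2, j in [4, 8])

println(typeof(d1))
println(typeof(d2))
```

The typed form forces the declared parameters regardless of what the pairs would infer, which is the same distinction the examples above draw between the plain and `(Real=>Real)` prefixed forms.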

Technical note: curly brackets preceded by an identifier parse a little differently after this commit, since they use parse-ref rather than parse-arglist. I don't think this breaks any existing code (and of course make testall runs fine), but there are at least 2 notable differences:

  1. assignments are allowed: while this is not really a problem here, I think it would be reasonable to add a wrapper macro with-assignment-allowed, similar to with-end-symbol etc.; as a side effect, this could also be used to parse macro arguments and allow things like @options(x=1), unless there are plans to support named macro arguments.

  2. the obscure semicolon-separated parameter syntax is unsupported, but I couldn't really find any place where it is actually supported:

 julia> exp(1;2)
 unsupported or misplaced expression parameters

(What is this about?)

For example:

  julia> { i/2 => 2i for i=1:5 }
  {2.0=>8,0.5=>2,1.5=>6,2.5=>10,1.0=>4}
@StefanKarpinski
Member

> the obscure semicolon-separated parameter syntax

This may be about the fact that a long time ago we planned to allow both optional and named arguments, and the separation between the two was to be indicated with a semicolon:

foo(a, b=2; c, d=4) = a+b+c+d

In this example:

  • a is positional and required
  • b is positional but optional, defaulting to 2
  • c is named and required (yes, there was a plan to support this combination)
  • d is named and optional, defaulting to 4.

I'll confess that I was the one behind this overly complex scheme and managed to convince Jeff and Viral for a little while that we should do this. We've subsequently come to our collective senses and decided this is way too much.
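
The four argument roles listed above can be exercised with a short sketch. It assumes a later Julia release (1.x), where a definition of this shape is accepted as written; the call sites and the names `r1` and `r2` are illustrative and not part of this thread.

```julia
# Sketch only: assumes Julia 1.x keyword-argument semantics.
# a: positional, required      b: positional, optional (default 2)
# c: named, required           d: named, optional (default 4)
foo(a, b=2; c, d=4) = a + b + c + d

# Only the required arguments are given; b and d fall back to their defaults.
r1 = foo(1; c=3)

# Every argument is given explicitly.
r2 = foo(1, 10; c=3, d=0)

println(r1)
println(r2)
```

Omitting `c` in a call is an error under these semantics, since it is named but has no default.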

@StefanKarpinski
Member

Also, very nice patch. Obviously, we should wait for Jeff's feedback on this.

@JeffBezanson
Member

Very well done as usual! This could even be extended to allow (T,S){} to make an empty Dict, since that syntax currently has no other meaning.

I wonder if it is confusing to put what are essentially type parameters in parens instead of the usual curly braces. Maybe it should be {Int,Int}{1=>2} instead.

It's a good point that it would be better to allow assignments in macro arguments; that should be fixed.

@nolta
Member

nolta commented Oct 29, 2012

How about (T => S){...}? Seems like we might want to save the (T,S){} notation in case we adopt python-like set literals:

{1, 2, 3}               # Set{Int}(1,2,3)
Real{1, 2, 3}           # Set{Real}(1,2,3)
(Real,Real){}           # Set{(Real,Real)}()
{ f(x) for x in y }

Or perhaps i'm just being a syntax hypochondriac ;)

@carlobaldassi
Member Author

Thanks!

Stefan, I don't actually find that syntax that bad at all...

About using {Int,Int}{1=>2} in place of (Int,Int){1=>2}: even if those are in fact type parameters, I still kind of prefer the tuple, since to me it feels like a more natural extension of the single-type notation used for arrays. Otherwise, should we also allow {Int}[...] for consistency? That would only be possible for array comprehensions; otherwise it gets mixed up with ref notation ({Int}[1]==Int or {Int}[1]==[1]::Vector{Int}?).
I also find that the tuple syntax makes it easier to distinguish the two components' different meanings. I don't have strong opinions on this though. We could also allow both.

About allowing (T,S){} (or {T,S}{}) for an empty Dict: I also don't have too strong opinions on this, but I find it probably too confusing, unless we also remove the curly notation for cell arrays, since at least now you have a super-easy rule and can immediately tell the two apart, by just checking if there are any => symbols or not, while having {T,S}{}::Dict could create even more confusion at getting {}::Array.

While I'm at it, I'll confess that I have an experimental branch where I completed the (massively breaking) transition to using curly notation for Dicts exclusively, but looking at the result (older version here) actually made me change my mind on this subject: on one hand, the one and only gain would be in building empty Dicts; on the other hand, cell arrays are so ubiquitous and handy that having a special notation for them seems very relevant, plus there seems to be no real substitute for cell matrices (of which I found an instance in the sources, BTW).

@carlobaldassi
Member Author

Ah, I like (T => S){...}.

@StefanKarpinski
Member

Yes, I also rather like (T=>S){...}.

@carlobaldassi
Member Author

I just had a horrible/ridiculous idea: {=>}Dict()

@JeffBezanson
Member

Ok, I am fine with (T,S){...} or (T=>S){...}. The second one is a bit better since it is quite hard to imagine it meaning anything else, while the first appears to involve tuples and cell arrays. We could even go as far as making T=>S by itself syntax for an empty dict.

And it really is necessary for literal notations to support the empty case, so {=>} is actually a good idea.

@StefanKarpinski
Member

I'm a bit uncomfortable with indexing into anything with curly braces, since that kind of syntax is already used for type parameters, and I feel like we're getting into fiddly syntax subtleties. How about instead indexing with square brackets and arrows: i.e. adding support for (T,S)[1=>2,3=>4] or (T=>S)[1=>2,3=>4] and having {1,2,3} remain as syntax for a cell array, having [1=>2,3=>4] for a Dict{Int,Int} and having {1=>2,3=>4} be syntax for a Dict{Any,Any}.

Examples:

  (Int=>Int){}
  (Integer=>Real){1=>2, 3=>3.0}
  (Real=>Integer){ i/2 => 2i for i = 1:3 }
for consistency, also allows (T=>S){=>}
@carlobaldassi
Member Author

Ok, for now I've changed the syntax to (T=>S){...}. It also includes the empty case:

julia> (Int=>Int){}
Dict{Int64,Int64}()

In another commit, I added the syntax {=>} (and for consistency, also (T=>S){=>}, even though it's not strictly needed). It still feels ridiculous, but it seems handy.
As a test, I substituted Dict() with {=>} throughout base and it works (didn't commit that).

I actually support using square brackets for explicitly typed containers as per Stefan's suggestion, but having {1=>2,3=>4} be a Dict{Any,Any} is a breaking change. Other opinions?

Successfully merging this pull request may close these issues.

Associative Array Comprehension
4 participants