Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested comprehensions question #2030

Closed
gkz opened this issue Jan 12, 2012 · 24 comments
Closed

Nested comprehensions question #2030

gkz opened this issue Jan 12, 2012 · 24 comments
Labels

Comments

@gkz
Copy link

gkz commented Jan 12, 2012

(This is not about the result being flat or not.)

First of all, thank you to all CoffeeScript contributors. I've been using CS a lot, and doing some pretty weird things with it.

Now here is an issue I've encountered while working with CS recently that I hope you guys could clarify:

In python:
["{0}|{1}".format(x,y) for x in [1,2,3] for y in [7, 8, 9]
produces
['1|7', '1|8', '1|9', '2|7', '2|8', '2|9', '3|7', '3|8', '3|9']

In CS
("#{x}|#{y}" for x in [1,2,3] for y in [7, 8, 9])
produces (when flattened)
["1|7", "2|7", "3|7", "1|8", "2|8", "3|8", "1|9", "2|9", "3|9"]

I am confused about this difference. The python results makes sense to me, the CS result does not.

Furthermore, in python (notice the if condition)
["{0}|{1}".format(x,y) for x in [1,2,3] for y in [7, 8, 9] if x + y != 10]
produces (note that the combination of 3|7, 2|8, and 1|9 are gone due to the attached condition)
['1|7', '1|8', '2|7', '2|9', '3|8', '3|9']

however, in CS
("#{x}|#{y}" for x in [1,2,3] for y in [7, 8, 9] when x + y != 10)
produces no change
["1|7", "2|7", "3|7", "1|8", "2|8", "3|8", "1|9", "2|9", "3|9"]

but if the when condition is moved to the middle (?)
("#{x}|#{y}" for x in [1,2,3] when x + y != 10 for y in [7, 8, 9])
produces the proper
["1|7", "2|7", "1|8", "3|8", "2|9", "3|9"]

I find the behavior in python more intuitive - or am I missing something here.

@michaelficarra
Copy link
Collaborator

@gkz: First of all, thanks for the very clearly worded question.

In CS, a() if b is shorthand for

if b
  a()

The equivalence holds for all other postfix expressions: for, until, etc.

When we perform this conversion on your example, hopefully it will become clear why CS behaves the way it does.

  • ("#{x}|#{y}" for x in [1,2,3] for y in [7, 8, 9])
for y in [7, 8, 9]
  for x in [1,2,3]
    "#{x}|#{y}"

Notice that the outermost inline loop remains the outermost loop here. So pairs of integers (x, y) would be generated in the order (7, 1), (7, 2), (7,3), (8,1), etc.

  • ("#{x}|#{y}" for x in [1,2,3] for y in [7, 8, 9] when x + y != 10)
for y in [7, 8, 9] when x + y != 10
  for x in [1,2,3]
    "#{x}|#{y}"

Now you should see why moving the condition to the inner loop produced your expected behaviour. In this version, x hasn't been defined yet. In JS, undefined + 7 produces NaN, which will never === anything, including 10. So the condition passes unconditionally.

I think our behaviour is more intuitive here. Regarding producing flat arrays, I'm currently undecided. Our semantics are definitely simpler. I hope this clears things up for you.

@gkz
Copy link
Author

gkz commented Jan 12, 2012

Thank you for your answer, I now understand how it all works. I think however, due to this issue and the results not being flat issue, that the documentation should be amended to clarify that CoffeeScript does not have actual comprehensions, just for loops that in the case of a single variable act like comprehensions.

I give one more example, this time from haskell, and it mirrors the python result, not the CS result:

[(x,y) | x <- [1..3], y <- [7..9]]
produces
[(1,7),(1,8),(1,9),(2,7),(2,8),(2,9),(3,7),(3,8),(3,9)]

and
[(x,y) | x <- [1..3], y <- [7..9], x + y /= 10]
produces
[(1,7),(1,8),(2,7),(2,9),(3,8),(3,9)]

maybe in the future CS could get actual list comprehensions, perhaps denoted by the current syntax surrounded by square brackets as in haskell and python.

@twoolie
Copy link

twoolie commented Jan 14, 2012

I second that this syntax is very confusing, but still technically correct.

Both Python and Haskell have tailored their syntax to be easy to read and understand, whereas CS' syntax strives for correctness. Both approaches are valid, although the solution that meshes with more established languages would be appreciated.

Perhaps if the CS syntax is to be kept, this could be mentioned in the docs?

@showell
Copy link

showell commented Jan 14, 2012

@gkz I think this is mostly a repeat of issue 1191. I recommend closing this issue, and adding support to 1191. I prefer Python's approach, even though I understand CoffeeScript's rationale. I don't know Haskell, but the fact that it's more like Python just adds more precedence.

As I mention in the other thread, this is a good litmus test for comprehensions:

http://rosettacode.org/wiki/List_comprehensions

@twoolie I agree that the current behavior should at least be documented.

I mostly avoid nested comprehensions, even in Python. I use simple comprehensions all the time, but I find the nested comprehensions tricky to read and debug.

@michaelficarra
Copy link
Collaborator

@showell: It's not that hard in CS:

n = 20

flatten = (arr) -> arr.reduce ((memo, b) -> memo.concat b), []

r = flatten (for x in [1..n]
  flatten (for y in [x..n]
    for z in [y..n] when x*x + y*y is z*z
      [x, y, z]
  ))

console.dir r

Though I really wish we had Haskell's $ so I could avoid those parentheses.

edit: fancy, compact, single line version:

n = 20
flatten = (arr) -> arr.reduce ((memo, b) -> memo.concat b), []
r = flatten (flatten ([x, y, z] for z in [y..n] when x*x + y*y is z*z for y in [x..n]) for x in [1..n])
console.dir r

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra Yep, but that's technically not a comprehension, correct? My (possibly flawed) definition of a comprehension is that it puts the expression before the loop. Your code is an elegant example of CS loops being expressions by default.

@michaelficarra
Copy link
Collaborator

@showell: nope, it's a comprehension because it produces a list. It's a loop if it's making use of side effects and we aren't using the loop as a value. I tried explaining it this morning in #2038.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra Here's the Python example from Rosetta:

((x,y,z) for x in xrange(1,n+1) for y in xrange(x,n+1) for z in xrange(y,n+1) if x**2 + y**2 == z**2)

The key aspect of the Python code is that you can lead with (x,y,z), if you consider that to be the most prominent feature of the expression. In your example you don't get to the triplet until line five. I don't consider this to be a big deal, by the way; I'm obviously being a little pedantic. I actually prefer your style. Even though it buries the lead in one sense, I think the more interesting aspect of the code is the three loops.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra We are obviously just disagreeing on definitions. I think there are three types of code that we are discussing:

  1. for x in bla do something (loop for side effects)
  2. for x in bla x * x (for-loop producing a list)
  3. x * x for x in bla (for-loop producing a list, with inverted syntax to emphasize x*x)

The first is clearly a loop, and the last is clearly a comprehension. The middle one depends on how strict your definition of a comprehension is.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra Wikipedia seems to agree with my interpretation:

http://en.wikipedia.org/wiki/List_comprehension

It alludes to set builder notation, which puts the output function first. It goes like this: output function, variable, input set, predicate.

Now, of course, it's perfectly fine to have a broader definition than Wikipedia, but it's important to agree on the definitions. I prefer a strict definition of comprehensions, to distinguish them from list expressions based on a for loop.

@michaelficarra
Copy link
Collaborator

@showell: did my analogy to FDs and NFEs not clear up my definition? It's not about the syntax or the actual production that defines it, it's about how it's being used.

  • function a(){} is a FD
  • function(){} is a FE
  • (function a(){}) is a NFE
  • +function a(){} is a NFE
  • a = function a(){} is a NFE

So, similarly

  • a for b in c is a loop
  • (a for b in c) is a comprehension
  • a = b for c in d is a loop
  • a = (b for c in d) is a comprehension
  • +(a for b in c) is a comprehension
  • for a in b then c is a loop
  • a = for b in c then d is a comprehension
  • +for b in c then d is a comprehension
  • [].concat (for b in c then d) is a comprehension
  • -> for b in c then d is a comprehension (this may be tricky; it's being used as a value because of implicit return)

Does that make it more clear?

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra And, of course, Wikipedia contradicts its own definition in its many examples.

Lots of languages pretty strictly adhere to the set builder notation, including Erlang, Haskell, Javascript 1.8, OCaml, Pure, and Python.

But, then, plenty of examples fit your broader definition, including Clojure and Common Lisp.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra Your definition is clear enough, but the fact that you dismiss the actual syntax as being intrinsic to the definition makes your definition overly broad IMHO. Instead of dividing all the examples into two categories--loops and comprehensions--I prefer to have a third category:

  1. loops
  2. list comprehensions as defined by Wikipedia, i.e. syntax mirrors set notation
  3. list expressions

@michaelficarra
Copy link
Collaborator

@showell: But we're discussing a feature, represented as an actual parser node. My definition is syntax agnostic, as all feature definitions are. Your list comprehensions and list expressions produce the same AST, but are merely specified differently. Those are not (and cannot be) different things. My analogy here is to string literals, specified with ' or " delimiters. They are the same thing. You may think of them differently because they look a little different when you're writing your programs, but they mean the exact same thing.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra I guess we're go around in circles on this. My whole point here is the definition of whether something is a list comprehension happens at the syntax layer, not at the AST layer. To use your string analogy, there's a difference between strings (which exist in the abstract, no syntax required) and string literals (which are specifically about syntax).

Obviously, if you go low enough in the conceptual stack, lots of concepts mean the exact same thing, but it makes it hard to debate syntax if we're not allowed to talk about syntax.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra To use a mathematical analogy, a set expression is any mathematical expression that produces a set, and a mapping is more precisely a set expression that produces a set from another set. Finally, set builder notation is a specific notation for specifying a mapping.

In CS, a list is a data structure, and a list expression is any piece of CS code that produces a list. Finally, a comprehension is a specific syntactical notation for specifying a list expression, with the prominent feature being that it places the output function first at the syntax layer.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra And just to beat a dead horse, there's precedence in Python to define list comprehensions as a syntactical feature:

http://www.python.org/dev/peps/pep-0202/

"This PEP describes a proposed syntactical extension to Python, list comprehensions."

Edit: And also note the first sentence from Wikipedia's article: "A list comprehension is a syntactic construct..."

@gkz
Copy link
Author

gkz commented May 21, 2012

@showell I've implemented proper list comprehensions in LiveScript, passing the test that you mentioned:

 [[x,y,z] for x in [1 to 20] for y in [x to 20] for z in [y to 20] when x^2 + y^2 == z^2]
 #=> [[3,4,5],[5,12,13],[6,8,10],[8,15,17],[9,12,15],[12,16,20]]

I would encourage CoffeeScript to do the same! I think this syntax is much better than the current one.

@joyrexus
Copy link

In reviewing the discussions in 1191, 2640, and above, I'm led to believe that there's general agreement about the desirability of having some syntactic construct for mapcat-style comprehensions. Is that fair to say?

I'd be curious to hear what people think about @gkz's proposal above (viz., to use surrounding square brackets for flattening/mapcat-behavior) and related implementation in LiveScript.

Example:

    P = []
    P.push [x, y, z] for z in [1..2] when (x + y + z) % 2 > 0 \
                     for y in [1..2]                          \
                     for x in [1..2]

... could be expressed (more elegantly, IMO) as ...

    Q = [[x, y, z] for x in [1..2]                         \
                   for y in [1..2]                         \
                   for z in [1..2] when (x + y + z) % 2 > 0]

Questions:

Are there objections to using the square bracket notation to indicate a python/haskell/harmony-style comprehension, independent of the cost of implementation?

@gkz, Is the LS implementation dependent on non-CS language features introduced in coco or LS?

Sidenote:

Peter Norvig uses Python's nested comprehensions to very powerful effect throughout his CS212 course on Udacity. He provides a nice summary of them here.


@jashkenas wrote: "I think it's worth talking about, especially if we can have our cake and eat it too -- have a way of allowing the inner comprehension to either be treated as an expression, or flattened out."

@michaelficarra wrote: "FWIW, I'd prefer to switch to the proposed semantics."

@gkz proposed above: "maybe in the future CS could get actual list comprehensions, perhaps denoted by the current syntax surrounded by square brackets as in haskell and python."

@vendethiel
Copy link
Collaborator

... could be expressed (more elegantly, IMO) as ...

You could even remove those backslashes.

@gkz, Is the LS implementation dependent on non-CS language features introduced in coco or LS?

It's written in LS, so, ye ... But it has some little quirks.

@gkz
Copy link
Author

gkz commented Apr 19, 2013

One issue is that

[x for x in x]

is already valid CoffeeScript, and it produces post-fix loop expression x for x in x as the first element in a list. That is why in LiveScript post-fix loops were removed.

@joyrexus
Copy link

@gkz: "One issue is that [x for x in X] is already valid CoffeeScript, and it produces post-fix loop expression x for x in x as the first element in a list."

Of course this is what annoys some people, myself included. I'd prefer to write [ [x for x in X] ] or [ (x for x in X) ] to produce a list within a list.

"That is why in LiveScript post-fix loops were removed."

Oh, wow. As a relative newcomer and casual user of CS, it's not clear to me why you'd have to remove post-fix loops wholesale. I can't imagine @michaelficarra removing post-fix loops in the coffee of his dreams, yet he plans on changing "comprehension syntax to {python,harmony}-style".

Curious ... is the issue one of implementation (viz., that the parsing would just be too messy in having to recognize what look like post-fix loops within square brackets as part of a mapcat comprehension) or that much existing code would be broken as a result of the change in syntax/semantics ... or both?

Genuinely appreciate the feedback.

@satyr
Copy link
Collaborator

satyr commented Apr 20, 2013

he plans on changing "comprehension syntax to {python,harmony}-style"

Note that the latest Harmony disagrees with Python, e.g.: [for (x of xs) x]

@GeoffreyBooth
Copy link
Collaborator

Reimplementing comprehensions, or adding new types of comprehensions, isn’t a priority at the moment; but pull requests are welcome. Closing as there doesn’t appear to be anything to do to resolve the issue discussed by the original post.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants