-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unexpected behavior in recursive data structure declarations #999
Comments
This is Elm being a non-lazy language. Such recursive value (rather than function) definitions are idiomatic Haskell, but not idiomatic Elm. Enabling such recursive definitions in general would require a lot of overhead compared to the current implementation: introducing thunks for all bindings, essentially switching to an implementation of lazy evaluation. As for a compiler check: This could always only be approximative, right? I don't think there is a way to do this reliably (catching all cases of recursive dependencies in values) without being too conservative (i.e., catching all said cases but still permitting all definitions that in reality can be evaluated without ending up in a nonterminating situation). |
The (first) point being: If you need this (lazy) behavior, then in Elm you will have to introduce the laziness explicitly, not expect the language to behave that way automatically. Introducing the desired laziness can be done in a domain specific API (which is why I added the |
I don't think that this form of recursive value requires laziness though. This doesn't really require recursion so much as cycles. This kind of thing is commonly done in two pass form:
The existing compiler could be made to do this by having an optional last parameter generated for constructors which passes in the object to construct on top of, and turning the one phase output into a two stage output. I'm sure this is a vast oversimplification of the work involved; but in theory this should work without any form of laziness. I think the compiler check should always be exact and computable without sacrificing much. Cycle detection in the AST should work I imagine. You'd have to mark nodes as initialization-time (constants, direct references) and post-init time ( thunks, functions, etc) and only need to check the init-time ones. You do lose some cases though -- in the snippet I posted, if you flip the order of declaration it works, and if you did this cycle check the compiler should always throw you an error on it. To me though, this is desirable -- it is unintuitive behavior when order of declarations matters sometimes and not others. I think, despite work involved (of which I'll be happy to help out on if this kind of thing is deemed appropriate to the language), this type of thing should be allowed. My concern here is that I don't entirely see a simple way to work with graphs without it, and that would somewhat limiting. Manually using the lazy functionality would be a work around to force a sort of two phase initialization, but this is also forcing the user to be aware of implementation details in the compiler. (Which isn't to say there aren't other valid use cases for the lazy package -- I just don't think it is the best approach to this usage). |
Is this a duplicate of #873? If so, check out the full description of how things work in OCaml that is linked there and see if it actually addresses your case. The restriction that you must use constructors explicitly seems to make it hard to use in a lot of cases. |
This is the same issue; apologies for not finding it via search. The functionality that is indicated in OCaml is pretty much what I want/expected here. Namely --
Is this a feasible thing implementation wise / is this something you think elm should have capabilities for? |
@jdavidberger, I don't understand your proposed implementation strategy in general, but maybe it is not too far off to say that it would indeed be some kind of lazy evaluation, specialized to a very specific situation? But independently of that question (whether it is/requires lazy evaluation), I question whether it would be a good idea to allow this in Elm. Elm is, in all other respects, a strictly evaluated language. So if there are definitions a = f b
b = g a in a program, one should expect evaluating either If I understand correctly, you are proposing that for certain choices of Also, considering just the case that a = 1 :: b
b = 2 :: a If this is compiled to something terminating, putting a' () = 1 :: b' ()
b' () = 2 :: a' () In Haskell, these two code snippets are semantically equivalent, in the sense that you can replace So actually I think that after your proposal is implemented, the programmer would have to be aware of compiler implementation details in order to understand what the program means (namely that for the first code snippet some magic happens, but not for the second). |
The linked discussion -- https://groups.google.com/forum/m/#!msg/comp.lang.ml/4eK7ucslWRw/CDc9S5AGYSYJ -- does a much better and more formal job of explaining what I was after. Some recursive expressions have a fixed point solution, and some work goes into finding out which ones do and then representing them. In terms of the example you gave, I'm not sure I understand what wouldn't be semantically equivalent about those two expressions. As it currently exists, the later versions would hang, but doesn't this have more to do with no built-in memoization? This goes back to the
example as well. This actually could work in either form with generated code something like
where you could perform some list operations without hitting a stack limit. I'm not sure infinite lists like this should be considered fixed point for elms purposes though; so I would agree that for these examples; in either case, an error (hopefully compile-time) should be generated. |
Yeah, I've looked at the email describing how (some) ML does it as well now. I don't like it. :-) And the reason is exactly that it depends on the syntax of the expression. If expression About the discussion of Likewise, I think that even after the "ML trick" is implemented, ones () = 1 :: ones () would mean that |
Okay, I'm going to close as a duplicate then :) I personally think the syntactic thing that OCaml is restrictive enough to be kind of useless. In any case, part of Joey's work on dead code elimination will make it quite easy to detect safe recursive values and allow them when the recursion is guarded by a lambda. I don't know if we will go farther than that. Let's revisit this once the initial restrictive implementation is released. |
I see what you are saying about seemingly equivalent expressions giving differing results. And I'd agree that this isn't ideal. Even with very good error message support it could hard to determine exactly what is blocking compilation so that it could be resolved. In terms of making (a' ()) equivalent to a: With memoization support I think it does what you expect, without it it would always hang. I think that is a large argument for memoizing these forms of expressions if this kind of data structure is included in the language. Without memoization head (ones ()) hangs; with it, it should return 1. In practical terms, I'm not that bothered by equivalent expressions having differing termination behaviors. Ideally everything would terminate with the 'right' answer, but sometimes there is no fixed-point right answer (sum ones)[Actually I guess Infinity would be correct, but there are other series which addition diverges on] and in other cases there is, but most compilers won't/shouldn't try to find it (sum zeroes), even though one does exist. I think we can both agree though that if both forms of the expression do terminate, they must have the same value, or chaos ensues. :) @evancz Sounds good, thanks for looking into this! |
Yes, what I was writing about referential transparency was mainly a matter of principle. Pragmatically, one might make some compromises. (Also, Evan, absolutely okay to close this here, of course. Am just continuing the exchange with David (?) here for the intellectual fun of it.) So, memoization, yes that would work for ones n = 1 :: ones (n+1) Since |
@JoeyEremondi, this issue specifically is the one that I think may be made worse by #1024. Currently it only seems to apply to functions defined in the same module, whereas with the proposed output code in #1024 will have to topologically sort function definitions across modules. |
@zeckalpha, we have it under control. The new format cannot make things worse because modules must form a directed acyclic graph (DAG) meaning that there can be no mutually recursive values across modules. The issue about giving proper warnings on dangerously recursive values is a separate topic. We have noticed that it seems closely related, but the actual code is pretty orthogonal either way. Please trust us to sort it out, we think about these things and would very much like to resolve the core issue here. |
See: http://share-elm.com/sprout/55afc18ae4b06aacf0e8b605.
It seems like cyclic declarations are not always doing expected things. The output mechanism initializes b first, so it resolves a as undefined and this leads to crashes. Swap the order of declarations or which node you are inspecting and it works.
I'm unsure on how tractable this issue will turn out to be given the current output mechanisms, but it'd be very nice to have supported. But even without that support, there should be a consistent cycle detector and a compile error for it.
Note this related issue in an elm library: Dandandan/parser#15 (comment)
The text was updated successfully, but these errors were encountered: