Avoid constructing intermediate results when chaining operations #1383
Comments
That is an interesting suggestion; I'd be curious to see some details of how that could work. That could probably be an advanced optimization for immutable structures: given that we already have some inline information, we can theoretically do more analysis (and store more data). The problem is that in order to do it (let's say with lists), both the type and the functions working with this type would need to become compiler intrinsics (which is subjectively not a great thing), or we would need to provide a generic mechanism for supplying more data to the compiler. I guess a custom computation expression could be used in this case, describing all the needed custom operations for map, reduce, folds, filters, etc.
I think that once the compiler has validated the purity of a (chained?) expression, applying simple rewriting rules would suffice. Obviously, the previous stage of this process remains unclear, and as I don't really know how the F# compiler works internally, I couldn't suggest a way of doing this. I indeed suppose that the compiler would have to be given more information, which in theory can be done as you say, but I personally don't know how it could be done as I don't work on the compiler code base. In any case, in Haskell, this can be done in two ways: either it is hardcoded in the Haskell compiler, GHC, or the user can provide rewriting rules themselves (although the basic rules, like the one I originally presented, are also coded in the GHC built-in libs as far as I know). You can read more about it here. For example:

```haskell
{-# RULES
"map/map" forall f g xs. map f (map g xs) = map (f . g) xs
  #-}
```

It's a powerful feature, but I don't know how it would fit into F# at the moment. Of course, one danger this raises is the inconsistency of the rules that the user could provide (I also think that checking semantic equality is an undecidable problem). Haskell, being a "particular" language used by and for research, can probably afford this risk, but I have my doubts about F#.
A big downside I see is the potential applicability of the optimization. I don't see how a custom collection type would make this different - is it just to rule out backwards-compatibility concerns? If the code is detected to be pure, e.g. numeric code applied on collections, this could be done automatically even on existing collections. On the other hand, high-perf numeric code is unlikely to be using lists in the first place and would probably lean more towards tensors or vectorized APIs where applicable.
You're right, that's why I invoked the concept of purity in the discussion. Now, it's true that this is becoming difficult to integrate into the .NET environment beyond F#; I hadn't really thought about it. I think my proposal fits in more broadly with the proposals to make the purity of methods explicit in .NET and C#, see here, for example.
The proposal to add a new collection type was precisely an argument / an idea for trying to make the optimisation presented possible (even within .NET and without modifying the entire typechecker), but as you pointed out this doesn't specifically solve the problems of BCL methods, for example, which can have side effects and which F# has no control over checking.
Proposing a new collection type would effectively add a new thing for everyone to learn. Ideally we wouldn't have to do it (referring to previous discussion points), but as mentioned just before, it adds constraints that aren't necessarily within the remit of F#, which my initial proposal hadn't alluded to, sadly.
I'd love it if this could be done automatically, without the need to add a new collection type! However, from what I've been able to understand, but I may be wrong, there doesn't seem to have been any desire to infer purity in F# (see the issue mentioned in the original message).
You're right, and I use such libraries myself. However, the original idea was to make F# faster without the user having to do much more than write the usual code.
I think the existence of side effects is not a showstopper as long as the different execution order is documented and cannot break existing code without people opting into it. A project-wide setting turning on the specific optimization on demand would work here, just not being on by default. Once this is accepted, the optimization step could follow the design that has recently been done for collection builders by @brianrourkeboll, where coded sequences of expressions are rewritten. If a similar style is followed, it would be the "hardcoded in compiler" option from this suggestion.
It could indeed be interesting; I wasn't aware of this new design. I suppose it could be part of that.
I propose that we add to F#, and to the standard libraries, a new type of sequence which would be pure (not accepting side effects) but which would have an interface similar to what `Seq` already offers. A collection that is free of side effects lends itself to a well-known form of optimization in Haskell (see here for example and details): stream/list fusion (a stream in Haskell is a kind of list, but without the overhead normally associated with linked lists). Of course, this is not the only possible optimization; for example, some results can be calculated in parallel, strictly, or lazily depending on the optimizer or the user's choice. F# already offers the `Array.Parallel` module, for instance.

Fusion (or deforestation) is an optimization technique that eliminates most, if not all, intermediate collections, leaving just one loop. It's a bit similar to loop fusion in imperative languages.

This style of optimization, known as short-cut fusion in Haskell, cannot be applied directly in F# or any other strict language (at least not with the classic methods, see below!). I think it's fairly common in F#, and in functional programming in general, to reason with intermediate collections in a chained way; except that these intermediate collections have an impact on performance. In addition, most of the time, these collections are pure! But there is currently no mechanism in F# to deduce purity. If such a mechanism existed, then my suggestion would no longer need to consider adding a new type of "pure sequence", but would be limited to the proposed fusion optimization.
Consider the following code examples:
Example 1
Example 2
Example 3
Example 4
Example 5
Example 6
Example 7
All these examples of F# code seem to me to be fairly idiomatic and common. They all present optimization opportunities that could be applied with fusion and/or pure sequences. Note that I have deliberately chosen to use lists and not sequences or arrays to remain consistent, although real code would probably use arrays or lazy sequences in some situations rather than linked-lists.
The current way of doing things in F# is to apply the rewriting rules or the optimizations ourselves (but note that my examples are sometimes exaggerated for illustrative purposes). Examples 2, 4, 5 and 6 show the best cases of potential list fusion as found in Haskell. I refer you to the links mentioned earlier and below to find out how Haskell would optimize this, but here's the idea in the case of the first of them (example 2). You can try to apply the rewrite rules yourself to the other examples; normally, they would all optimize in a similar way.
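As a stand-in for the example 2 pipeline, assume chained code of roughly this shape; the concrete functions here are illustrative placeholders and the original snippet may differ:

```fsharp
// Hypothetical stand-in for example2: every stage below allocates an
// intermediate list before the final fold consumes the last one.
let example2 (xs: int list) =
    xs
    |> List.map (fun x -> x * 2)           // first map
    |> List.map (fun x -> x + 1)           // second map
    |> List.filter (fun x -> x > 0)        // first filter
    |> List.filter (fun x -> x % 3 = 0)    // second filter
    |> List.fold (+) 0                     // final consumption
```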
We know that for any functions `f : b -> c` and `g : a -> b`, and for any list `xs : [a]`, mapping with `g` and then with `f` produces the same result as mapping once with the composition of the two functions. In other words, in the F# code, the two chained `List.map` calls can be replaced by a single `List.map` over the composed function.
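A sketch of this rule in F#, with illustrative functions:

```fsharp
let f = fun x -> x + 1            // illustrative f
let g = fun x -> x * 2            // illustrative g
let xs = [1; 2; 3]

// Two traversals and one intermediate list:
let twoMaps = xs |> List.map g |> List.map f
// One traversal and no intermediate list:
let oneMap = xs |> List.map (g >> f)
// twoMaps = oneMap whenever f and g are pure.
```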
Next, we know that for any predicates `f : a -> Bool` and `g : a -> Bool`, and for any list `xs : [a]`, filtering with `g` and then with `f` is the same as filtering once with the conjunction of the two predicates. In other words, in the F# code, the two chained `List.filter` calls can be replaced by a single `List.filter`.
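A sketch of this rule in F#, again with illustrative predicates:

```fsharp
let f = fun x -> x % 3 = 0        // illustrative predicate f
let g = fun x -> x > 0            // illustrative predicate g
let xs = [-3; 1; 3; 6; 8; 9]

// Two traversals and one intermediate list:
let twoFilters = xs |> List.filter g |> List.filter f
// One traversal and no intermediate list (g is still tested first):
let oneFilter = xs |> List.filter (fun x -> g x && f x)
// twoFilters = oneFilter whenever f and g are pure.
```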
Combining the two previous results, we obtain a pipeline with a single `List.map` followed by a single `List.filter` (and the final fold). So, the code so far becomes:
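Applied to the hypothetical pipeline above, the two rules collapse the four transformation stages into two:

```fsharp
// map/map and filter/filter fused; two intermediate lists remain before the fold.
let example2_fused (xs: int list) =
    xs
    |> List.map ((fun x -> x * 2) >> (fun x -> x + 1))
    |> List.filter (fun x -> x > 0 && x % 3 = 0)
    |> List.fold (+) 0
```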
Then, we know that for all `f : a -> bool` and `g : b -> a`, and for any list `xs : [b]`, filtering with `f` after mapping with `g` can be expressed as a single fold over the original list, with no intermediate list in between. Applying this relation, the remaining `List.map`/`List.filter` pair in the pipeline collapses into one fold.
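One possible way to state that relation in F# (the exact formulation a compiler would use may differ):

```fsharp
let f = fun y -> y % 3 = 0        // illustrative predicate f
let g = fun x -> x + 1            // illustrative function g
let xs = [1; 2; 3; 4; 5]

// Builds an intermediate list between the map and the filter:
let withIntermediate = xs |> List.map g |> List.filter f

// The same result as a single pass over the original list:
let asFold =
    List.foldBack (fun x acc -> let y = g x in if f y then y :: acc else acc) xs []
// withIntermediate = asFold whenever f and g are pure.
```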
I know that this code is becoming increasingly unreadable, but please note that I'm only illustrating how the compiler might work step by step when optimizing the initial code. Now, the last rewriting step is "simple": just unwind the fold and you get the following optimized code:
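Continuing with the hypothetical pipeline, the fully fused result would be a single pass along these lines:

```fsharp
// Hypothetical example2_optimized: one traversal of xs and no intermediate lists;
// the mapping, the filtering and the summation all happen inside a single fold.
let example2_optimized (xs: int list) =
    xs
    |> List.fold
        (fun acc x ->
            let y = (x * 2) + 1
            if y > 0 && y % 3 = 0 then acc + y else acc)
        0
```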
You can check that `example2_optimized` and `example2` are identical. Only, one of the two uses 5 intermediate lists, while the other uses no intermediate list at all. According to my small benchmarks, for two random lists of 10,000 items between -10 and 10, the reduced code is around 36% faster on my machine, which of course comes as no surprise. I've carried out the same procedure for the other examples cited, and the code is much faster every time. I think the potential for optimisation is not negligible for real programs either.

I was able to do the reduction in a fairly automated way because I simply applied the rewriting rules shown above at each step, but this isn't always easy to do automatically in a language with side effects, hence my proposal to add a collection type that is pure and lends itself to this style of optimization. For example, "`PureSeq`" or something like that.

Ideally, we wouldn't need to introduce this new type, but as mentioned above, F# doesn't allow purity to be checked, and a previous suggestion from 2016 on the subject was rejected.
Now, examples 1 and 3 are not specifically about fusion optimization, but about generating a temporary list/collection that will be fully consumed directly during the calculation process. Example 3 is easily optimized using lazy evaluation. The problem with this function is that the list will potentially generate many elements, whereas the comparison will only require two. The version of this code where lists are replaced by `seq` produces more optimized code when `n` is not a prime number. But I'm still including it in my examples to suggest that `PureSeq` might also be lazy.
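As a rough, hypothetical illustration of the laziness point (the actual Example 3 may look different):

```fsharp
// Hypothetical illustration: the list version materializes every divisor of n,
// even though a single hit is enough to decide that n is not prime.
let isPrimeList n =
    [2 .. n - 1]
    |> List.filter (fun d -> n % d = 0)
    |> List.isEmpty

// The lazy version stops at the first divisor found, so it is much faster
// whenever n is not a prime number.
let isPrimeSeq n =
    seq { 2 .. n - 1 }
    |> Seq.filter (fun d -> n % d = 0)
    |> Seq.isEmpty
```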
Example 1 is more interesting because it uses a list and a fold/reduce as a substitute for a loop. This code is elegant but produces a temporary list and is therefore slower. It's conceivable that an F# compiler implementing pure sequences might recognize such a pattern and eliminate the need to create the list, as sketched below.
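A hypothetical sketch of what such a rewrite could look like (the actual Example 1, and the exact pattern a compiler would detect, may differ):

```fsharp
// Hypothetical example1: the list [1 .. n] exists only to be folded away.
let example1 n =
    [1 .. n] |> List.fold (fun acc x -> acc + x * x) 0

// What a pure-sequence-aware compiler could conceivably emit instead:
// the same computation as a plain loop, with no list allocated at all.
let example1_optimized n =
    let mutable acc = 0
    for x in 1 .. n do
        acc <- acc + x * x
    acc
```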
The optimization of removing the creation of the list can also be applied to Example 5, where the list is only generated to be mapped and filtered; it is not provided by the user. How Haskell does this is illustrated here.
I've included Example 7 for discussion, as it's very specific to linked lists. As a reminder, this function traverses the list twice to remove the last element. I know this is typical of linked lists, but I don't know whether or not `PureSeq` should also be linked-list-like. The advantage would be that pattern matching is possible (streams in Haskell aren't really linked lists, but they allow similar pattern matching). So I don't know how `PureSeq` should behave.

I think that initially, only the fusion optimizations presented in more detail above would be relevant, rather than the last few examples, but I think the discussion could be taken further.
Pros and Cons
The advantage of this adaptation to F# is that it would encourage people to write idiomatic/pretty/normal code with no impact on performance, and could, on the contrary, even improve it.
The disadvantages of this adaptation to F# are that it would be a specific optimization for what I call `PureSeq` and would not be compatible with non-pure code. I don't know what interop with C# would look like either, but it could be an F#-specific feature or a compiler feature. Also, this might make the compiler slower, but it's the style of compilation pass that would be done in release mode only and not in debug mode (debugging would be horrible otherwise).

Extra information
Estimated cost (XS, S, M, L, XL, XXL): it depends on how far the concept would be developed, but it would probably be XL or XXL because this would require changes to the behavior of the type system, code generation, optimization and maybe the introduction of a new native type with its associated module and functions.
Related suggestions:

- The previously mentioned suggestion about checking purity: with it, `PureSeq` might not even be necessary! That said, it's an unproven area of research for now.
- An older suggestion about a `pure` keyword, but the links are dead and the discussion isn't very enriching.
For Readers
If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.