Implements flatmap #44792

Merged
merged 14 commits into from
Apr 7, 2022

nlw0
Contributor

@nlw0 nlw0 commented Mar 29, 2022

Related to #44294

flatmap is the composition of map and flatten. It is important for functional programming patterns.

Some tasks that can be easily attained with list-comprehensions, including the filter-mapping ([f(x) for x in xx if p(x)]), or flattening a list of computed lists, can only be attained with do-syntax style if a flatmap functor is available. (Or appending a |> flatten, etc.)

Filtering can be implemented by outputting empty lists or singleton lists for the values to be removed or kept. A more proper approach would be the optional monad, though, usually implemented in Julia as a union of Some and Nothing.

This patch therefore also implements iteration methods for Some and Nothing, to enable the filter-map pattern with flatmap.
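As a rough illustration of the filter-map pattern described above, here is a minimal sketch. The helper name `flatmap_sketch` is hypothetical (it is not the function added by this patch); it simply composes `Iterators.flatten` with `Iterators.map`, and it uses tuples as the empty or singleton containers.

```julia
# Hypothetical helper: the plain composition of flatten and map.
flatmap_sketch(f, xs) = Iterators.flatten(Iterators.map(f, xs))

xs = [-2, 3, -1, 4]

# Filter-map with do-syntax: return a 1-tuple to keep a value, an empty tuple to drop it.
kept = collect(flatmap_sketch(xs) do x
    x > 0 ? (x^2,) : ()
end)

# The comprehension that the do-block version is meant to match.
expected = [x^2 for x in xs if x > 0]
```

Both `kept` and `expected` should hold the squares of the positive entries, in the original order.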

Highlights:

  • A flatmap over a collection of optionals is not equivalent to a composition of map and filter.
  • do-syntax today is limited compared to comprehensions or for-loops with continues and appends. Offering flatmap and flattening optionals brings do-syntax closer to these other constructs.
  • Flattening over optionals, and further applying map etc, is necessary to make Some-Nothing equivalent to optional monads in other languages, from Haskell and Scala to C++17. Right now a map(sqrt, Some(4)) wouldn't work in Julia, though. Is this something we would like to pursue?

@jakobnissen
Contributor

jakobnissen commented Mar 30, 2022

This PR defines iterate(_::Nothing) = nothing, which is a bad idea. nothing should not be iterable. In general, the more attributes you define for nothing, the worse it is as a sentinel value, because you can rely less on it erroring in a given function call.

Less importantly, I would also argue the same with Some. Having non-containers be iterable is confusing and leads to bugs.

@nlw0
Contributor Author

nlw0 commented Mar 30, 2022

That really goes to the heart of the matter, @jakobnissen , thanks.

In functional programming languages such as Haskell and Scala, the optional type behaves very much like a container that can contain at most one element. It can be iterated, mapped, for-eached, etc.

In Julia, the official alternative for the optional type is the union of Some and Nothing. If this is supposed to behave like the optional monad in other languages, then we should implement things like iterate(::Nothing). In my personal opinion, it is a good idea.

If the designers of Julia come to understand, however, that Some and Nothing must not behave like a proper option monad, perhaps because this was the case for a long time and introducing this now feels like an afterthought or whatever, then maybe we should consider the question of how we could implement a proper optional type.

About having non-containers being iterable, I would point out that right now integers, for example, are actually iterable in Julia:

julia> for x in 3
          println(x)
       end
3

julia> collect(Iterators.flatten(1:5))
5-element Vector{Int64}:
 1
 2
 3
 4
 5

I would actually recommend against this behavior, given that integers are not containers in principle. I believe Some, on the other hand, could safely be viewed as a container that contains exactly one element. Do you disagree with that?

Of course Nothing is widely used in the language and I can see how this might be considered a big deal. But it would be really nice to have an official optional type that can behave as a container, in my humble FP hipster opinion. I'd be perfectly fine with a separate None() that can go along with Some and be iterable. Or an iterable alternative to Some as well.

I would actually also argue that the use of nothing is and should always be done in a way that is actually consistent with the complete container behavior. Is code that relies on nothing being non-iterable right now actually relying on expected, well-defined behavior? Or is it more of an undefined behavior that programmers should not have been counting on?

Regarding the complete patch, I would also like to point out flatmap has an important role in other monads as well, one nice example being the future monad. To compose two futures today, we need to "fetch" the future and then apply a function to the fetched value. If fetch is replaced with iterate, composing the futures naturally becomes a flatmap. This happens with other classes as well, that's why flatmap is a big deal, as well as having more clarity about when a type should be iterable.

@jakobnissen
Contributor

jakobnissen commented Mar 30, 2022

It's correct that integers and chars are iterable in Julia. I find that objectionable, both from a conceptual point of view (integers are not containers) and from a practical one (I often bump into bugs that would not be possible if not for this behaviour).

I suppose one could define Some as a 1-object container, but I absolutely oppose nothing as a zero-element container. The reason is that although Union{Some{T}, Nothing} is used in Julia as something akin to Optional, the meaning of nothing is not "the opposite of Some". That is merely one concrete use of nothing. It implies nothing about nothing, no more than the fact that Union{Char, Nothing} implies that nothing is the opposite of Char, and therefore ought to have Char methods.

Less abstractly, adding iterate to Nothing will absolutely, 100% cause future bugs, when people accidentally pass nothing into the wrong function. The issue here is that different people assign different semantic meaning to the struct nothing, which implies different behaviours:

, and these different ideas about nothing are mutually exclusive.

Implementing a new type that behaves like Optional is just fine - but it is not nothing. I fail to see why we would necessarily want to copy Haskell here, but I don't really object to it existing.

@nlw0
Contributor Author

nlw0 commented Mar 30, 2022

@jakobnissen I very much understand the desire to be conservative about introducing iterate(::Nothing). I'd be happy to simply have a separate object intended to work as a rigorous FP-style optional type.

I would still like to invite you to step back for a bit, and try to look at this proposal from a perspective of following the nice programming-language theory and wisdom coming from functional programming, Haskell and Scala being merely the two examples I'm most familiar with, but we can go all the way back to John Backus' Turing Award lecture, or any good Category Theory book. I'm inviting you all to ask the question "what is Nothing anyways?". Or maybe, following Julia tradition, we might call this PR "Taking Nothing seriously"!

Thanks for enumerating those two related issues. You seem to have a genuine interest in the issue and I hope I can make the case for the importance and benefits of pursuing the FP-paradigm optional, even on these stdlib types.

You listed three positions. Regarding the first position, I would actually argue that this is completely in line with my proposal. In fact, nothing should very much be "something empty", like an empty (immutable) vector, or the empty tuple Tuple{}. Being "empty" implies it's a collection, though, so it should be iterable, as any collection. Except it's a quirky collection, because it's always empty.

Regarding the second point, of nothing being used as a sentinel value, that is perfectly fine. These are not conflicting goals. You can use nothing as a special value regardless of whether it is iterable or not. You just test it with x==nothing or x===nothing or preferably isnothing(x), as you should. You could use [] or () as special values as well, regardless of them being iterable or not.

Pardon if I disagree, but the conclusion must be that these positions are not conflicting at all. Special-value use is consistent with nothing being an iterable, empty collection, which is my proposal, and which is what the first position implies.

You are very right to point out that the use of Nothing in unions with other types is somehow different from having it strictly as one value from an ADT at the same level as Some, both being iterables, and for instance the nothing values disappearing at the same time where Some(x) becomes x when you flatten a list. So, for instance, in Union{Float64, Nothing} this is kind of weird because Float64 is not a collection (although it happens to be iterable in Julia for some reason), and Nothing would be a collection. The "rigorous" use of Nothing to create an option monad would imply always sticking to something like Union{Some{Float64},Nothing}.

Union{X, Nothing} is just a handy thing, similar to using

I would argue that every use of Union{X, Nothing} is a special-value use where users rely on the isnothing() predicate when necessary. In fact, the only guarantee for nothing is that isnothing will return true for nothing and false for the X values. Whether Nothing is iterable or not is very much unspecified behavior in Julia right now. My proposal is that it should be specified, and that it should behave as an empty collection.

Of course that's not up to me to decide. But the fact that length(nothing) and iterate(nothing) cause an error must not be understood as "how it's supposed to be" just because this is how it is. This is unspecified behavior, as far as I know, and making Nothing and Some iterable is very much on the table. The current requirement for nothing is that isnothing returns true for it, and that's it, as far as I know. There is no reason anyone should be expecting it not to behave as an empty collection. In other words, nobody should be expecting length(nothing) to throw an error instead of returning 0. Is there a test for that in the codebase?

To conclude, I would like to make the point that this topic is even larger than that. There are many such types that "contain" a value, and then we have special operations to construct the value, and to fetch the value... That's what the monad stuff is all about. And flatmap is at the heart of this idea. Right now in Julia we have the something method for Some, and fetch for Future. The great insight from FP is that these methods could all just be first; these are just wonky collections with a single element. And for the optional type, it's either a unit set or the empty set. And treating these types like collections enables all sorts of interesting high-level procedures to be created on top of them. That's why this is important.

Apologies for the long screed, I'm not sure this is the proper forum to discuss a proposal like that. I would also like to stress that this is coming from me scratching my own itch. I frequently code using "do-syntax" with map, because it's higher-level than for, and beats comprehensions because it's multi-line, but then you hit this limitation, and using explicit flatten, filter or custom iterable-nothings takes a lot of the buzz out of it.

I would love to hear more thoughts about this, especially from Mr. Bezanson! I'm not very familiar with LISP and I'm not sure how a LISP programmer sees this. Is nothing not supposed to be Julia's () or NIL?

@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Mar 30, 2022
@bramtayl
Contributor

Ref #16961?

@jakobnissen
Contributor

jakobnissen commented Mar 31, 2022

Let me try to explain why I think nothing as sentinel value directly conflicts with adding additional methods to nothing, and why I disagree with the statement that the only promised behaviour of nothing is its behaviour with isnothing and ===.

A programming language is just as much defined by what bugs it doesn't allow you to write, as it is by what it does allow you to do. This sounds like an empty truism, but it really is true - I spend about as much time debugging and verifying correct behaviour of code as I do writing it in the first place. Hence, considerations about preventing footguns should have at least as much weight as considerations about new functionality, and arguably more than considerations about convenience.

So, why am I so certain that adding new methods such as iterate to nothing will cause bugs? Couldn't it also fix bugs?

Consider a widely regarded bad PL design: The billion dollar mistake, the null pointer. What is so bad about the null pointer? Isn't it functionally equivalent to Union{T, Nothing}? After all, the programmer can simply check for the presence of a null pointer, right?

I think the aspect that made the null pointer one of the worst design decisions in PL history is the fact that it presents an edge case that is difficult to reliably guard against. Any given instance specified to be of type T could covertly be a null pointer instead, and the programmer is not able to express its exclusion in the type system. That is, the programmer cannot represent "No, this really, actually is a T".

If nothing loses its important feature that it throws an error in nearly every function, then it becomes harder to express the concept of "T or not T" as Union{T, Nothing}. For an example, consider the code:

struct Gene
   DNA::Union{String, Nothing}
end

count_ts(x::Gene) = count(isequal('T'), x.DNA)

genes = [Gene("TATCGAGTA"), Gene(""), Gene(nothing)]

map(count_ts, genes)

If nothing is iterable, then the gene containing nothing silently passes through count_ts, returning 0. In other words, a type error which could easily have been found by the linter or static analysis is turned into a logic error, which probably means several minutes (or hours) of debugging.
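To make the failure mode concrete without actually defining iterate on Nothing, the sketch below uses the empty tuple `()` as a stand-in for a hypothetical iterable nothing. The standalone `count_ts` here takes the sequence directly, which is a simplification of the Gene example and an assumption of this sketch.

```julia
# Simplified version of count_ts that takes the sequence itself.
count_ts(dna) = count(isequal('T'), dna)

# A real sequence is counted as expected.
n_real = count_ts("TATCGAGTA")

# An empty iterable (stand-in for an iterable `nothing`) silently yields a count.
n_empty = count_ts(())

# Today, `nothing` is not iterable, so the mistake surfaces as a MethodError.
threw = try
    count_ts(nothing)
    false
catch err
    err isa MethodError
end
```

The point of the comparison is that the second call returns a plausible-looking number, while the third fails loudly at the call site.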

Hence, in this regard, if methods are implemented for nothing, then Union{String, Nothing} acts like the billion dollar mistake: The programmer can never be certain that the edge case nothing is dealt with, but must remember to deal with it ahead of time, without any help from the linter or static analysis if they forget.

And that's why I think adding methods to nothing erodes its usefulness as a sentinel value: If nothing implements methods, I can no longer rely on the fact that it cannot accidentally get used in computation: It is either dealt with explicitly, or throws errors.

@nlw0
Contributor Author

nlw0 commented Mar 31, 2022

@jakobnissen Your example demonstrates why using Union{X, Nothing} to create an optional type is not as good as the stricter Union{Some{X}, Nothing}. This is how optional types are implemented in more fussy FP languages, which is very attractive, I should say: I was blown away when I started working with optional types in type-checked languages and saw how useful they can be to programmers. I'm very much inspired by this experience as I create this PR.

In your example I would suggest the issue is how a low-level type, String, is mixed together with a high-level Nothing, while kind of skipping the role the Gene class should have. Method count_ts takes a Gene, and that's perfect. But Gene should contain valid, guaranteed data, so just a String. This optional thing should be handled at a higher level, where your method taking Gene would fail if given a Nothing object. I would recommend either something like Union{Gene, Nothing} or Union{Some{Gene}, Nothing}, or maybe creating some other struct to represent what this Gene holding a nothing represents. Your data type is actually optional; it's better if you have all the business logic implemented without optionals as much as possible, and move these concerns to a higher level.

In Julia Union{X, Nothing} is kind of the recommended way to implement optional, and I understand it can be more handy. I think Julia caters a bit to this kind of application, more similar to Python, JavaScript, or even LISP. I can't actively prevent those people from shooting themselves in the foot, but I can warn them, and try to offer an alternative.

It is precisely to enable this kind of safe and strict programming to handle optionals, that it would be good to be able to flatmap over optionals, implemented as Union{Some{X}, Nothing}. Or maybe Union{Tuple{X}, Tuple{}}. That works, but lacks the named classes stating the intention of the types as an optional type. It just lacks the blessing of the Julia patriarchs or whatever. And a stdlib flatmap.

I cannot say what was intended in Julia with nothing. From my point of view it could be turned into a strict FP-style optional, intended to be used along with Some, or along with a String if you don't want the blissful strictness. So here's my vote and arguments. I kind of understand following a precautionary principle, etc., but we should also try to follow design principles. My suggestion is informed by the design principles of a whole school of thought in programming language design, one that is actually quite concerned with the issues you seem to be concerned with.

Maybe this could be something for Julia 2.0? Or should we implement a separate iterable None type? What about Some being iterable? I'm kind of fine with everything. I would like to push this ahead with Nothing because I think it works, it's how it should have been and all, and I think it will help other fellow programmers. I wouldn't like to just relegate flatmap plus an alternative iterable Some and Nothing to some external module (that I imagine probably exists out there already). This should be in stdlib. And to reiterate, something like this is necessary to make do-syntax pipelines as powerful as (and more powerful than) comprehensions and for-loops right now. I would really like to see standard Julia offering a way to do that, and I'm totally open to alternatives if deemed necessary.

@nlw0
Contributor Author

nlw0 commented Mar 31, 2022

This package illustrates what I'm looking for, using Union{Tuple{},Tuple{X}} as an optional class. Ideally we would have something like Union{Some{X},None} with more meaningful names. I'm pretty sure there must be other proposals like that around, I'm sorry I can't find anything to reference.

https://github.com/nlw0/Flatland

@tkf
Member

tkf commented Apr 1, 2022

I'm strongly against the iterate method portion of this PR, likely for any Julia version (but flatmap/mapcat for iterables is a nice thing to have). Adding to what @jakobnissen has mentioned, I believe it is important to notice that there is no canonical functor interface in Julia that every "functor-like thing" can use (flatmap has more structure but this explanation can be done with map). Iterators.map is not the "fmap" of all functors in Julia.

To explain this, let us revisit the functor law:

fmap id = id
fmap (g . h) = (fmap g) . (fmap h)

--- Functor (functional programming) - Wikipedia

Does this hold in Julia?

julia> isequal(Iterators.map(identity, 1:2), 1:2)
false

Unfortunately not. However, I argue that Iterators.map is still a fmap but with an equivalence class different from the one defined by isequal. For example, we can consider an "equivalence class" defined by

==′(a, b) = isequal(collect(a), collect(b))

(let me allow ignoring details like how to handle more difficult cases like Channel). This lets us verify the functor law.
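A small sketch of that check, under the assumption that comparing collected elements is the intended equivalence. The name `same_elements` and the test functions `g` and `h` are made up for illustration.

```julia
# Hypothetical name for the coarse equivalence: compare the collected elements.
same_elements(a, b) = isequal(collect(a), collect(b))

g(x) = x + 1
h(x) = 2x
xs = 1:3

# Identity law: mapping identity gives back "the same" collection.
identity_law = same_elements(Iterators.map(identity, xs), xs)

# Composition law: mapping g ∘ h equals mapping h and then g.
composition_law = same_elements(
    Iterators.map(g ∘ h, xs),
    Iterators.map(g, Iterators.map(h, xs)),
)

# Under plain isequal, the lazy map object is not equal to the range itself.
strict_identity = isequal(Iterators.map(identity, xs), xs)
```

Both laws should hold under `same_elements`, while `strict_identity` stays false because a lazy map object and a range are different values.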

This is what I mean by "Iterators.map is not the "fmap" of all functors in Julia." It is a functor where objects are the equivalence class more coarse-grained than the isequal-induced one.

When designing functors, it is important to understand what equivalence class we are working on. This is in the end the same as what @jakobnissen said initially. People like to think that nothing and [] are not "the same thing." We can avoid this in a principled way by noticing that Iterators.map is a functor operating on a very coarse-grained equivalence relation.

I also argue this is not bad and also not unique to Julia (cf "setoid hell"). I think it's good to be inspired by other languages to make Julia better. But, since Julia is a very unique language, I think we need to carefully distill the idea.

(FWIW, I'm also curious about making "nice" Maybe/Optional and Either/Result APIs in Julia. Here's my shot at giving a simple unified interface https://github.com/tkf/Try.jl#focus-on-actions-not-the-types)

@nlw0
Contributor Author

nlw0 commented Apr 1, 2022

Thank you, @tkf . Would you mind making a concrete proposal for how we might have a flattable optional type in the standard library? All I want is to be able to write something like flatmap(randn(111)) do x x>0 ? Some(x^2) : None() end in my code. And I believe this should be available in the language out of the box. How do we get there?

@tkf
Member

tkf commented Apr 1, 2022

flatmap(randn(111)) do x x>0 ? Some(x^2) : None() end

If you want to write the function body, using tuples as you suggested sounds good to me:

Iterators.flatmap(randn(111)) do x
    x>0 ? (x^2,) : ()
end

If you already have an Option-valued function and want to feed it to Iterators.flatmap, I suggest to manually transform an Option value to an iterable:

f(x) = x>0 ? Some(x^2) : nothing  # this is given

Iterators.flatmap(Try.astuple ∘ f, randn(111))
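For readers without Try.jl at hand, the conversion step can be sketched as below. `option_to_tuple` is a hypothetical stand-in written for this example; it is not claimed to match the actual implementation of Try.astuple. The flatten and map composition is spelled out so the sketch does not depend on Iterators.flatmap being available.

```julia
# Hypothetical converter: Some(x) becomes a 1-tuple, nothing becomes the empty tuple.
option_to_tuple(x::Some) = (something(x),)
option_to_tuple(::Nothing) = ()

# An Option-valued function, as in the comment above.
f(x) = x > 0 ? Some(x^2) : nothing

xs = [-1.0, 2.0, -3.0, 4.0]

# Convert each optional result to a tuple, then flatten away the empty ones.
result = collect(Iterators.flatten(Iterators.map(option_to_tuple ∘ f, xs)))
```

The explicit conversion keeps `nothing` itself non-iterable while still letting optional results flow through a flattening pipeline.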

@JeffBezanson
Member

Thank you @tkf, I agree. Long story short, triage yesterday was ok with flatmap but not with making nothing iterable.

In fact this can be considered a deficiency of the Union{Some{X}, Nothing} design: whether nothing refers to an absent value in the sense of this PR is context-dependent, but whether nothing is iterable cannot be context-dependent. For example nothing is used in other places like the return value of print, where the result really just shouldn't be used in value context.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Apr 1, 2022
@nlw0
Contributor Author

nlw0 commented Apr 1, 2022

Thanks, @JeffBezanson . It sounds like Nothing is sometimes the negative side of an optional type, but sometimes the "Unit" or "Void" type. And indeed in Julia nothing===Void()? That really puts a lid on the subject. I would still point out that if integers can be iterable, this might not be such a big deal. But I'm ready to give up on that.

It would still be nice to have an iterable Some and None to build a flattenable optional type. Or any other names instead of just relying on tuples.
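One way such a pair of types might look is sketched here. The names `Just` and `None` are hypothetical and exist only in this example; nothing like them is added by this PR, and Base's Some and nothing are left untouched.

```julia
# Hypothetical optional type: Just holds exactly one value, None holds zero.
struct Just{T}
    value::T
end
struct None end

# Just iterates once over its value.
Base.iterate(j::Just) = (j.value, nothing)
Base.iterate(::Just, ::Nothing) = nothing
Base.length(::Just) = 1
Base.eltype(::Type{Just{T}}) where {T} = T

# None iterates zero times.
Base.iterate(::None, state...) = nothing
Base.length(::None) = 0

# Flattening a collection of optionals drops None and unwraps Just.
optionals = map(x -> x > 0 ? Just(x^2) : None(), [-1, 2, 3])
flattened = collect(Iterators.flatten(optionals))
```

Because the types are separate from Nothing, code that relies on `nothing` throwing in iteration contexts keeps its current behaviour.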

With or without these neat types, having flatmap and also an astuple method seems fine by me. I'll write a new patch for that. Any specific details I should follow?

@oscardssmith
Member

oscardssmith commented Apr 1, 2022

Integers being iterable really is just an accident. We attempted to remove it for Julia 1.0, but by the time we tried to remove it, doing so would have broken an annoyingly large amount of code, so we didn't. It's a major source of bugs (I write some variation of for i = 100 at least every other week), and we don't want more of those.

@nlw0
Contributor Author

nlw0 commented Apr 2, 2022

I've put an astuple method in Experimental, but maybe it could go in base/tuple.jl or somewhere else?

@nlw0
Contributor Author

nlw0 commented Apr 3, 2022

I have changed the function name from astuple to monuple because astuple seems to be a slightly popular function name. In fact, it's even defined as a utility function in test/ranges.jl.

monuple is short and unusual. Hopefully it's not too highbrow or inaccurate, as it actually returns either a monuple or an empty tuple. I thought of tuptional as well, but I'm not sure what's our stance on funny names. Anyways, we can change the name again if anyone has a better idea.

@tkf
Member

tkf commented Apr 5, 2022

Just to be clear, I didn't mention astuple to suggest adding this to Base. I'd guess there is a higher chance of merging this PR quickly if it focuses on flatmap/mapcat since triage has approved this part already #44792 (comment)

@nlw0
Contributor Author

nlw0 commented Apr 5, 2022

Understood, I'll split it in two PRs

nlw0 added 9 commits April 5, 2022 17:54
flatmap is the composition of map and flatten. It is important for functional programming patterns.

Some tasks that can be easily attained with list-comprehensions, including the composition of filter and mapping, or flattening a list of computed lists, can only be attained with do-syntax style if a flatmap functor is available. (Or appending a `|> flatten`, etc.)

Filtering can be implemented by outputting empty lists or singleton lists for the values to be removed or kept. A more proper approach would be the optional monad, though, usually implemented in Julia as a union of Some and Nothing.

This patch therefore also implements iteration methods for Some and Nothing, to enable the filtermap pattern with flatmap.
@nlw0
Contributor Author

nlw0 commented Apr 6, 2022

@tkf Would you like to write, or for me to write a bit more in the docstring, or just leave it like that?

Member

@tkf tkf left a comment

LGTM! I think it's good to go.

@tkf tkf added the merge me PR is reviewed. Merge when all tests are passing label Apr 6, 2022
@giordano giordano changed the title Implements flatmap ~~and iteration methods for Some and Nothing~~ Implements flatmap Apr 6, 2022
@nlw0
Contributor Author

nlw0 commented Apr 6, 2022

Thanks a lot @tkf

@tkf tkf merged commit badad9d into JuliaLang:master Apr 7, 2022
@tkf tkf removed the merge me PR is reviewed. Merge when all tests are passing label Apr 7, 2022
@tkf
Member

tkf commented Apr 7, 2022

CI/testing issues ignored: #44892 #44891 #44859

@fredrikekre fredrikekre added needs news A NEWS entry is required for this change needs compat annotation Add !!! compat "Julia x.y" to the docstring labels Apr 7, 2022
@tkf
Member

tkf commented Apr 7, 2022

Ah, thanks for catching these points!

@nlw0 Can you address @fredrikekre's points? (Remove the comment, add news and compat)

@nlw0
Contributor Author

nlw0 commented Apr 7, 2022

Sure thing, hopefully later today

@thchr
Contributor

thchr commented Apr 8, 2022

Little bit late to the game but: how is flatmap(f, c...) an improvement syntactically or expression-wise over the very simple and already quite short flatten(map(f, c...)) (the implementation)?

I.e. this seems to just add another short verb for a nearly trivial composition of two more basic verbs?

Bit grouchy, I realize, but from reading the discussion it was not clear to me why it is preferable to introduce a new function for this essentially trivial composition (unless there's e.g. a performance argument)?

@nlw0
Contributor Author

nlw0 commented Apr 8, 2022

Hi @thchr, the need for flattening maps arises often in more complex tasks. With flatmap defined, you can write do-blocks in a more natural way.

[(j,k) for j in 1:3 for k in 1:3 if j!=k]

flatmap(1:3) do j
    flatmap(1:3) do k
        j!=k ? ((j,k),) : ()
    end
end

flatten(map(1:3) do j
            flatten(map(1:3) do k
                        j!=k ? ((j,k),) : ()
                    end)
        end)

map(1:3) do j
    map(1:3) do k
        j!=k ? ((j,k),) : ()
    end |> flatten
end |> flatten

One particular example where this is very useful is when you have a map and you decide to filter the values. In a comprehension, it's just a matter of appending a predicate. With a for-loop, you can just add a continue statement. With flatmap, you just need to use the predicate to return something like (x,) or (), and change the map to flatmap. It saves a few parentheses and preserves the nice indentation structure of the do-blocks. I personally write do-blocks very often and noticed that you often can get a lot done, but when the time arrives for something more complex, you end up needing a lot more refactoring. And in fact, I would often forget that flatmap even exists, so it's a bit of a PSA thing as well. I was about to add flatmap to my .julia/config/startup.jl when I realized "wait a minute, so many more people can benefit from this." So that's the story.

@nlw0
Contributor Author

nlw0 commented Apr 8, 2022

@thchr It's a great question if there may be performance considerations. I think there may be cases where a custom flatmap could do something smarter than flatten(map()), but I'll need to verify that. On the other hand, I suspect you're generally safe just defining flatten(x) = flatmap(identity, x).
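The relation between the two functions can be checked on a small input. This is only a sketch: `flatmap_sketch` is a hypothetical name for the plain composition of flatten and map, used so the example does not assume a particular Julia version.

```julia
# Hypothetical helper: the plain composition of flatten and map.
flatmap_sketch(f, xs) = Iterators.flatten(Iterators.map(f, xs))

nested = [[1, 2], Int[], [3]]

# Flattening directly...
via_flatten = collect(Iterators.flatten(nested))

# ...and flat-mapping with identity visit the same elements in the same order.
via_flatmap = collect(flatmap_sketch(identity, nested))
```

This says nothing about performance, only that the two spellings agree on what they produce.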

@thchr
Contributor

thchr commented Apr 8, 2022

Having flatmap defined you can write do-blocks in a more natural way.

But composition already allows you to write this equivalently as:

(flatten ∘ map)(1:3) do j
    (flatten ∘ map)(1:3) do k
        j!=k ? ((j,k),) : ()
    end
end

On the face of it, you save 8 characters for every (flatten ∘ map) you replace by flatmap; but in the above example, that's a meager 17% reduction of the total do-block's length (in any nontrivial example, it seems likely the savings would be even smaller).

And in fact, I would often forget that flatmap even exists, so it's a bit of a PSA thing as well.

I think this is my main counterpoint to adding functions like this: it increases the vocabulary of the language and reduces the chance that everyone can read a bit of code without needing to look up functions in the "dictionary"/docs.
I think it is worthwhile to add new "words" to Base if (a) things cannot be done simply and briefly in terms of existing words or (b) if the new word enables performance benefits; but neither seems to apply here?

@nlw0
Contributor Author

nlw0 commented Apr 8, 2022

I do believe there might be performance benefits, but I was motivated at first only by high-level reasons. I'll gladly look into that if I find the time.

I am used to the function being available in other languages. In my opinion, it's worth having it. I have pointed out the importance to different applications. It's not just about flatmap, it's also about having types such as Optional and Task being iterable, and being able to compose functions that return optional, for instance.

It's not about coding golf. Some people are probably more familiar with the importance of flatmap in specific situations, and can see the appeal more quickly.

In fact, I suspect there may be classes in the language today where the API implements a key function with a custom name, but it might actually have been just another flatmap. That might promote an economy in new terms, as instead of finding a new name for this peculiar function in each different class, it's actually just another flatmap.

I'm not really sure what else I can write, and it's a pretty long PR discussion already. I wish I had more compelling theoretical and computational arguments I could just lay down. In the end I'm mostly inspired by experience, and made the patch to see what the community thinks. I am very glad I have found some acceptance. I hope others can help address these concerns. The best I can do is to try looking for examples of efficiency gains; I bet they do exist.

@mcabbott
Contributor

To address @thchr's complaint: Why can't this just be a method flatten(f, x), not a new verb?

We have sum(f, x) as a more efficient sum(map(f, x)), but no summap symbol. Likewise all(f, x) instead of allmap, and so on...

(This isn't in 1.8, BTW, so no great hurry.)

@nlw0
Contributor Author

nlw0 commented Jul 10, 2022

@mcabbott I could live with flatmap functionality being attained through an optional function argument provided to flatten. Hopefully this change might enable something I would really love to have: a version of flatten in Base, along with map and reduce, producing concrete vectors instead of iterators. This flatten plus flatmap locked away in Iterators is actually not that handy. What I really want is flatmap functionality in Base producing a concrete vector, not just an iterator, and not having to write (flatten ∘ map) or (collect ∘ Iterators.flatten ∘ Iterators.map). The function name can be whatever name you all fancy the most; it would just be great if it were a single symbol.
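A minimal sketch of what such an eager variant could look like in user code follows. The name `eager_flatmap` is hypothetical, and this is simply the lazy composition wrapped in collect, not a tuned implementation.

```julia
# Hypothetical eager variant: collect the lazy flatten-of-map into a Vector.
eager_flatmap(f, xs) = collect(Iterators.flatten(Iterators.map(f, xs)))

# Flattening computed ranges into one concrete vector.
ranges = eager_flatmap(i -> 1:i, 1:3)

# Filter-map with do-syntax, returning a concrete vector.
squares = eager_flatmap(-2:2) do x
    x > 0 ? (x^2,) : ()
end
```

Being a single symbol, it keeps the do-block shape discussed earlier in the thread while returning an array rather than an iterator.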

@aplavin
Contributor

aplavin commented Jul 10, 2022

Array flatten and flatmap are available in the FlexiMaps.jl package. They have the same interface, consistent with these Iterators.* functions.

I find the array-based flatmap/flatten useful more often than their Iterators.* counterparts. They are more efficient and generic in the sense that the container type stays the same: e.g., flattening a collection of StructArrays returns a StructArray.

Regarding eager vs lazy flatmap performance:

julia> @btime flatmap(i -> 1:i, 1:1000);
  333.458 μs (13 allocations: 5.23 MiB)

julia> @btime Iterators.flatmap(i -> 1:i, 1:1000) |> collect;
  2.028 ms (14 allocations: 5.23 MiB)
