Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a filtermap (or flatmap) function to make do-syntax as powerful as list comprehensions #44294

Open
nlw0 opened this issue Feb 21, 2022 · 6 comments

Comments

@nlw0
Copy link
Contributor

nlw0 commented Feb 21, 2022

Suggestion: Julia offers map and filter, which is great, and using the list comprehension we can also perform a filtermap, eg

julia> sample = randn(5)
5-element Vector{Float64}:
 -0.06735837502279898
  0.09030518381226459
 -0.26615854113225973
 -1.4282255796719185
  0.5329984098262069

julia> [x^2+10 for x in sample if x < 0]
3-element Vector{Float64}:
 10.004537150685712
 10.070840369017652
 12.039828306429188

A filtermap function would allow us to perform the same operation, using do-syntax for writing the function. The problem is how to define an interface, since you basically would need to define two functions, one for the filter and another for the map. The suggestion is that this could be accomplished by using Some and Nothing, and then we flatten the output to produce the final filtered result.

julia> function filtermap(f, x)
         mm = map(f,x)
         [something(x) for x in mm if !isnothing(x)]
end
filtermap (generic function with 1 method)

julia> filtermap(sample) do x
         if x<0 x^2+10 else nothing end
       end
3-element Vector{Float64}:
 10.004537150685712
 10.070840369017652
 12.039828306429188

I couldn't find the suggestion anywhere, so here it goes. Hopefully it's not too futile. I only think it's a good idea because right now this is a unique feature of list comprehension, and I feel it would be nice if do-syntax was more on par with it straight out of the box.

@jakobnissen
Copy link
Contributor

Internally, a list comprehension [f(xs) for x in xs if cond(x)] is implemented by collect(Iterators.map(f, Iterators.filter(cond, xs))).

I think it would be nicer - that is, more generic - to have functions that work like imap(f) = xs -> Iterators.map(f, xs) and similar for ifilter. Then you could do

imap(ifilter(sample) do x
    x < 0
end) do x
   x^2+10
end

Or equivalently (and nicer):

sample |> ifilter() do x
    x < 0
end |> imap() do x
    x^2+10
end

@nlw0 nlw0 changed the title Implement a filtermap function to make do-syntax as powerful as list comprehensions Implement a filtermap (or flatmap) function to make do-syntax as powerful as list comprehensions Feb 21, 2022
@nlw0
Copy link
Contributor Author

nlw0 commented Feb 21, 2022

Actually I'm thinking now what I really miss is flatmap. This is a very handy function from many functional programming languages, along with fold-left. Julia does offer fold-left, but I think there's only a flatten at iterators that I'm not even sure would work the way I'm thinking.

If flatten deals with Some and Nothing the way I'm thinking, then flatten(map(f, x)) would work with the function I propose, and then flatmap would allow for do-syntax, and this also allows for things such as


julia> [n+m for n in 1:3 for m in 1:4]
12-element Vector{Int64}:
 2
 3
 4
 5
 3
 4
 5
 6
 4
 5
 6
 7

julia> collect(Iterators.flatten(map(1:3) do n
         collect(Iterators.flatten(map(1:4) do m
           n + m
           end))
           end))
12-element Vector{Int64}:
 2
 3
 4
 5
 3
 4
 5
 6
 4
 5
 6
 7

julia> flatmap(f,x) = collect(Iterators.flatten(map(f,x)))
flatmap (generic function with 2 methods)

julia> flatmap(1:3) do n
         flatmap(1:4) do m
           n + m
           end
           end
12-element Vector{Int64}:
 2
 3
 4
 5
 3
 4
 5
 6
 4
 5
 6
 7

Something would be still be required for flattening these vector of Some/Nothing, though:

julia> flatmap(randn(11)) do x if x < 0 Some(10+x^2) else nothing end end
ERROR: MethodError: no method matching iterate(::Nothing)

But it does work with vectors containing 1 or 0 elements , what is kind of hacky

julia> flatmap(randn(11)) do x if x < 0 ([10+x^2]) else ([]) end end
5-element Vector{Any}:
 10.269636340340607
 10.467156587296035
 10.62399307970644
 15.466278126877425
 10.170517061109889

@nlw0
Copy link
Contributor Author

nlw0 commented Feb 21, 2022

@jakobnissen that looks kind of neat, a form of Currying I guess. My point, though, is being able to have a single functor, and solve the problem with a single "do" body, equivalent to a single comprehension call (variable name only defined once). The flatmap approach is also a little bit more powerful than chaining filter and map, actually even more powerful than the comprehension itself.

My frustration is that I often start coding with for-loops that later become maps, but then I may need something a little more complex and I see myself requiring a larger refactoring than simply changing the do block body, and maybe the functor from map to flatmap. Then the alternative is either a comprehension, or write some lower-level code, or even chaining two functors, which is a pretty big refactoring as well. I never do that, I end up going back to a for-loop with append!... And anyways, adding a filter to an existing map is also not as comfortable as adding a predicate to an existing comprehension, what is super handy.

Scala has this amazing for-loop construction where you can very comfortably extend pipelines into more complex things. The do-syntax reminds me of it a bit, but it's not the same. One thing that happens in Scala is precisely that the whole thing is actually "de-sugared" into flatmaps during compilation. I'm wondering now if having a do-able flatmap in Julia would enable an even more powerful and pleasant functional-coding style!

@aplavin
Copy link
Contributor

aplavin commented Apr 30, 2022

The filtermap function is often more convenient to use compare to flatmap.
For example, you start with map:

map(A) do a
    b = computation(a)
    sqrt(b)
end

then you remember that b can be negative, and want to remove these occurrences:

filtermap(A) do a
    b = computation(a)
    b >= 0 || return nothing
    sqrt(b)
end

Done! No changes to the return value of the do-block. With flatmap you'd also need to change sqrt(b) to (sqrt(b),). That is error-prone, especially when there are multiple return points, and when the original return value is iterable by itself.

I define filtermap in FlexiMaps.jl as

function filtermap(f, A...)
    map(something, filter!(!isnothing, map(f, A...)))
end

and have found it very useful in data processing/analysis workflows.
Would also be useful to have it in Base, if flatmap will be there. Both filtermap and flatmap exist in Rust, so there's at least this recent precedent.

@yatra9
Copy link

yatra9 commented Nov 3, 2022

JeffBezanson suggests more efficient implementation based on comprehension syntax.
#32905

filtermap(f, A) = collect(y for x in A for y in (f(x),) if !isnothing(y))

@aplavin
Copy link
Contributor

aplavin commented Nov 3, 2022

How is that more efficient?

julia> using FlexiMaps: filtermap as filtermap1  # the map-filter!-map implementation

julia> filtermap2(f, A) = collect(y for x in A for y in (f(x),) if !isnothing(y))  # generator implementation

julia> @btime filtermap1(identity, 1:10000);
  15.446 μs (8 allocations: 156.45 KiB)

julia> @btime filtermap2(identity, 1:10000);
  101.462 μs (10 allocations: 326.61 KiB)

So, my map-filter!-map version is much faster.
Also, it keeps the array type like map does: eg, passing a StructArray would return a StructArray.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants