A function to convert an optional Union{X,Nothing} to an iterable Union{Tuple{X},Tuple{}} #44864

Open
wants to merge 11 commits into
base: master

Contributor

@nlw0 nlw0 commented Apr 5, 2022

Split from #44792

Union{X, Nothing} is a popular way to represent missing values in Julia. A list of optionals can be filtered with filter(!isnothing, data), or [x for x in data if !isnothing(x)].

Comprehensions are handy because they allow filtering and mapping at once in a terse notation:
[f(x) for x in 1:11 if p(x)==true]
is equivalent to the for-loop

a=[]
for x in 1:11
    if p(x)
        push!(a, f(x))
    end
end

Slightly more complex tasks quickly strain comprehensions. For instance, if your test depends on the computed value, you can easily adapt the for-loop to write:

a=[]
for x in 1:11
    y=f(x)
    if p(y)
        push!(a, y)
    end
end

A comprehension has no simple way to store an intermediate value, and you'd be forced to write
[f(x) for x in 1:11 if p(f(x))==true]
or a hacky
[y for x in 1:11 for y in (f(x),) if p(y)==true]
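To make the comparison concrete, here is a small runnable sketch. The functions f and p are placeholders chosen only for illustration (squaring and an evenness test); they are assumptions, not part of the PR.

```julia
# Placeholder functions, assumed only for this illustration.
f(x) = x^2
p(y) = iseven(y)

# Explicit loop: compute y once, test it, keep it.
a = Int[]
for x in 1:11
    y = f(x)
    if p(y)
        push!(a, y)
    end
end

# Comprehension that has to call f twice per element.
b = [f(x) for x in 1:11 if p(f(x))]

# Comprehension that binds y by iterating over a 1-tuple.
c = [y for x in 1:11 for y in (f(x),) if p(y)]
```

All three forms should produce the same vector; the difference is only in how the intermediate value is named and how often f is called.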

Using do-syntax, this could be written with filter(p, map(f, data)), which is probably OK if you're using iterators.

There's another style that allows us to filter-map in a single go, relying on flatten

Iterators.flatten(map(data) do x
    y = f(x)
    p(y) ? (y,) : ()   # 1-tuple keeps y, empty tuple drops it
end)

This is interesting because do-syntax gives us more freedom to write a body, similar to the for-loop, while we still avoid writing an explicit vector declaration and a push!.
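As a runnable sketch of the flatten-based style, assuming placeholder definitions for f, p and data (squaring, an evenness test and the range 1:11, none of which come from the PR):

```julia
# Placeholder definitions, assumed only for this illustration.
f(x) = x^2
p(y) = iseven(y)
data = 1:11

# Each element maps to a 1-tuple (keep) or an empty tuple (drop);
# flattening the tuples leaves only the kept values.
kept = collect(Iterators.flatten(map(data) do x
    y = f(x)
    p(y) ? (y,) : ()
end))
```

The body computes f(x) only once, as in the for-loop version, and no output vector or push! appears in user code.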

Please note that I'm not discussing anything related to the |> operator, even though the topic easily goes that way. That is not the point here. In the previous code we might have written |> flatten, but simply having flatmap available solves this.

Some languages offer an optional type that is iterable. The equivalent to Nothing would represent an empty list, and the equivalent to Some would be a container with exactly one element. If something like that were available in Julia, we could write

flatmap(data) do x
    y = f(x)
    if p(y) Some(y) else None() end
end

In my opinion, it would be nice to have such an iterable optional type in the Julia standard library. That is not the point of this PR, though. The idea here is that this "tuptional" approach is already pretty handy, and good enough to bring flatmap closer to the power of comprehensions and for-loops. I wouldn't like to see libraries using this as an API; Union{X,Nothing} or Union{Some{X},Nothing} is just fine there. But given how ubiquitous the non-iterable nothing is as an optional, and the natural desire to map it to Union{Tuple{X},Tuple{}} if you're coding in that style, it would be nice to offer a function that does the conversion. That's what monuple does. I'd be glad to call the function whatever, but here's an example of how you could use it in practice with a standard library function that returns nothing.

julia> data = match.(r"(x.?)", ["x", "aoeu", "xoxo", ">>=", ";qjkx"])
5-element Vector{Union{Nothing, RegexMatch}}:
 RegexMatch("x", 1="x")
 nothing
 RegexMatch("xo", 1="xo")
 nothing
 RegexMatch("x", 1="x")

julia> [optx for optx in data if !isnothing(optx) && optx[1] != "x"]
1-element Vector{RegexMatch}:
 RegexMatch("xo", 1="xo")

julia> Iterators.flatmap(monuple.(data)) do optx
           Iterators.flatmap(optx) do x
               x[1] == "x" ? () : (x,)
           end
       end |> collect
1-element Vector{RegexMatch}:
 RegexMatch("xo", 1="xo")

Notice how, by mapping over the tuples, we can compose functions that return optional values, and the flattening hides the empty cases away: there is no need for explicit !isnothing filtering. In my opinion, it's a big deal. It's also pretty close to standard Julia.
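The session above relies on monuple, which is the function this PR proposes and is not available in Base. A minimal sketch of what it could look like, assuming the two-method definition below (an illustration, not necessarily the PR's actual implementation), with flatten composed with map standing in for Iterators.flatmap so that it also runs on Julia versions that lack flatmap:

```julia
# Hypothetical definition of the proposed function; not part of Base.
monuple(x) = (x,)          # a present value becomes a 1-tuple
monuple(::Nothing) = ()    # nothing becomes the empty tuple

data = match.(r"(x.?)", ["x", "aoeu", "xoxo", ">>=", ";qjkx"])

# flatten(map(...)) plays the role of Iterators.flatmap here.
kept = collect(Iterators.flatten(map(monuple.(data)) do optx
    Iterators.flatten(map(x -> x[1] == "x" ? () : (x,), optx))
end))

matched_text = [m.match for m in kept]
```

The nothing entries never reach the inner function, because mapping over an empty tuple calls it zero times.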

Anyways, that's the point of the PR, which was split away from #44792. With Iterators.flatmap available, it would be nice to offer something that enables this "flatmap + tuples" approach for mapping and filtering tasks.

nlw0 added 10 commits April 5, 2022 17:54
flatmap is the composition of map and flatten. It is important for functional programming patterns.

Some tasks that can be easily attained with list-comprehensions, including the composition of filter and mapping, or flattening a list of computed lists, can only be attained with do-syntax style if a flatmap functor is available (or by appending a `|> flatten`, etc.).

Filtering can be implemented by outputting empty lists for the values to be removed and singleton lists for the values to be kept. A more proper approach would be the optional monad, though, usually implemented in Julia as a union of Some and Nothing.

This patch therefore also implements iteration methods for Some and Nothing, to enable the filtermap pattern with flatmap.
@nlw0 nlw0 changed the title Nic/monuple A function to convert an optional Union{X,Nothing} to an iterable Union{Tuple{X},Tuple{}} Apr 5, 2022
@aplavin
Contributor

aplavin commented Jul 2, 2022

This concept of filter & map together is definitely useful; I often find myself reaching for it in data processing tasks.
But the proposed solution doesn't really read intuitively to me. It's hard to tell at a glance what this code does:

julia> Iterators.flatmap(monuple.(data)) do optx
           Iterators.flatmap(optx) do x
               x[1] == "x" ? () : (x,)
           end
       end |> collect

Compare to a basic filter:

julia> filter(data) do x
           !isnothing(x) && x[1] != "x"
       end

As for more general cases that include both map and filter steps: I commented already in #44294 (comment) that filtermap and not flatmap is a natural fit here.

Taking another example from your post,

flatmap(data) do x
    y = f(x)
    if p(y) Some(y) else None() end
end

that actually involves a map step. It can be directly translated to filtermap:

julia> filtermap(data) do x
           y = f(x)
           if p(y) y else nothing end
       end

This doesn't involve any extra types such as None().
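Since filtermap is not a Base function, the call above needs a definition to run. The following is only a sketch under that assumption: a hypothetical helper that keeps every result that is not nothing, with placeholder f and p (squaring and an evenness test).

```julia
# Hypothetical helper, assumed for illustration; not part of Base.
function filtermap(f, xs)
    out = []
    for x in xs
        y = f(x)
        y === nothing || push!(out, y)   # drop nothing, keep everything else
    end
    return out
end

# Placeholder functions, assumed only for this illustration.
f(x) = x^2
p(y) = iseven(y)

kept = filtermap(1:11) do x
    y = f(x)
    p(y) ? y : nothing
end
```

One consequence of this particular sketch is that nothing can never appear as a kept value, because it doubles as the "drop" signal.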

@nlw0
Contributor Author

nlw0 commented Jul 2, 2022

@aplavin thanks for the comment. You are completely right, the ():(x,) example is not great, as it would be better expressed as a straight filter. Unfortunately this is the nature of this kind of discussion. It's a slightly more complicated control structure, and once we start looking for a very simple example, we may end up losing essential details, but if we keep looking just at the more complex and real examples, it's more difficult to focus on the general idea. I believe I have offered better examples somewhere else.

It is true Union{Nothing,X} can be used as an optional type. This has been the way to go in Julia, and I find it understandable. It is often handy and effective. In my personal experience, though, the fact you cannot express Some(None()) is not something to be dismissed (i.e. nesting of optional types). And it's also nice when you are able to consider the optional type to be just another container class. This is just how I see things today after studying functional programming and some rudiments of category theory. For instance, if optional is just another container, it's clear what "map(f, optional)" should do. I bet there's some alternative for Union{Nothing,X}. But if they were simply containers (or applicatives, iterables, or whatever the proper term is), it's just the same as other types. That alternative should be just map itself...

There's value in being able to see optional types, and other types, as objects similar to vectors. That's just what I'm trying to bring up. I'm not trying to force anyone to change their ways. And I can use tuples if I want to. The language already supports what I'm trying to do. I cannot convert my code to use Union{Nothing,X} because of the limitations I see. I am not asking every Julian on the planet to convert their code away from Union{Nothing,X}. It would be great if Julia offered an "official" alternative to construct an iterable optional type, though. Even if we don't offer a special class like Some and None or whatever, then at least have what's being suggested here.

The whole point of the patch is this actually: I'm claiming it's fine if Union{Nothing,X} is Julia's default optional type representation, although it would be nice to have access to an iterable optional type, so I can resort to that when I find it necessary. And it would be nice if it's something recognized in the core language, and not just some crazy alien pattern that is nonetheless completely valid.

I hope this clarifies that I am completely aware that Union{Nothing,X} can be, and widely is, used in Julia to represent optional types. I believe it is important to offer an alternative optional type that is iterable. This shouldn't be controversial.

By the way, Julia does offer the Some class, a fact I suppose most people who are more comfortable with Union{Nothing,X} might be unaware of. It is not iterable, though, same as Nothing (although integers are, but let's not get into that).
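The iterability claims can be checked directly. A short sketch using applicable, which reports whether a matching method exists (the variable names are illustrative only):

```julia
# Neither Some nor nothing has an iterate method, so generic
# looping code cannot traverse them.
some_iterable    = applicable(iterate, Some(1))
nothing_iterable = applicable(iterate, nothing)

# Numbers do iterate, yielding themselves exactly once.
int_iterable = applicable(iterate, 1)
seen = Int[]
for x in 7
    push!(seen, x)
end
```

Under current Base definitions the first two checks are expected to be false and the third true, with the loop visiting the single value 7.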

I'm perfectly fine with anyone who prefers Union{Nothing,X} proposing specialized functions for that. But keep in mind that these specialized functions might not actually be necessary if you had an iterable optional type and relied on more generic functors that also work with e.g. vectors and other collections.

@aplavin
Contributor

aplavin commented Jul 2, 2022

the fact you cannot express Some(None()) is not something to be dismissed (i.e. nesting of optional types)

But... You can? It's just Some(nothing).
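A quick sketch of how the nested case behaves with Base's existing Some wrapper and something (the variable names are illustrative only):

```julia
# A present value whose payload happens to be nothing.
nested = Some(nothing)

is_wrapped    = nested isa Some{Nothing}   # the wrapper records presence
is_absent     = nested === nothing         # distinct from a bare nothing
unwrapped     = something(nested)          # unwrapping returns the payload
inner_nothing = unwrapped === nothing
```

So Union{Some{T},Nothing} can distinguish "no value" from "a value that is nothing", which is the nesting being asked about.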

The whole point of the patch is this actually: I'm claiming it's fine if Union{Nothing,X} is Julia's default optional type representation, although it would be nice to have access to an iterable optional type, so I can resort to that when I find it necessary. And it would be nice if it's something recognized in the core language, and not just some crazy alien pattern that is nonetheless completely valid.

There are "iterable optional types" in Julia, as you point out as well. A 0/1 tuple, or a 0/1 vector, depending on your type stability requirements, is this optional type already.
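As an illustration of a 0/1 tuple acting as an iterable optional, here is a small sketch; the names present, absent and double are made up for the example.

```julia
present = (3,)   # "some value"
absent  = ()     # "no value"

double(x) = 2x

# map works uniformly on both cases and preserves the shape.
mapped_present = map(double, present)
mapped_absent  = map(double, absent)

# Flattening a collection of such optionals drops the empty ones.
flattened = collect(Iterators.flatten([present, absent, (5,)]))
```

A 0/1-element vector such as Int[] or [3] behaves the same way under map and flatten, at the cost of an allocation but with a single concrete type for both cases.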

For instance, if optional is just another container, it's clear what "map(f, optional)" should do. I bet there's some alternative for Union{Nothing,X}. But if they were simply containers (or applicatives, iterables, or whatever the proper term is), it's just the same as other types. That alternative should be just map itself...

Please, not map(nothing), or anything else that works with nothing... Its major purpose is to be an object that throws errors as early as possible if passed to unsuspecting methods, so that these errors are easier to understand.

You are completely right, the ():(x,) example is not great, as it would be better expressed as a straight filter. Unfortunately this is the nature of this kind of discussion. It's a slightly more complicated control structure, and once we start looking for a very simple example, we may end up losing essential details, but if we keep looking just at the more complex and real examples, it's more difficult to focus on the general idea.

It would be nice to see a concrete comparison of examples with and without your proposal, where the benefit is clear. The one from the first post,

julia> Iterators.flatmap(monuple.(data)) do optx
           Iterators.flatmap(optx) do x
               x[1] == "x" ? () : (x,)
           end
       end |> collect

looks far from intuitive to me. It's hard to see that it's just filtering of the data array!

2 participants