Parallel edge-cases #1685
Comments
The contract for Parallel has 3 supported cases:
Anything else should probably be guarded against in the call method. We could use dispatch for this if we enforce
The pairwise part is inherited from
This is just constructor confusion between the default and https://github.com/FluxML/Flux.jl/blob/master/src/layers/basic.jl#L414. There are a number of ways to resolve this. The shortest I found was to (as mentioned above) constrain |
Oh now I see, thanks. As you say it seems simplest to always store a tuple, even for one element, perhaps just by moving that line to be an inner constructor. But
I like the sound of this restriction. While dispatch could perhaps be made to give a MethodError on others, it also seems fine to check at runtime and throw a descriptive error. |
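A runtime check of the kind suggested here might look like the sketch below. `CheckedParallel` is a hypothetical stand-in written for this comment, not Flux's `Parallel`, and the rule it enforces (either one input shared by all sub-layers, or exactly one input per sub-layer) is an assumption about what the restriction would be.

```julia
# Hypothetical stand-in for illustration only; not Flux's Parallel.
struct CheckedParallel{F, T<:Tuple}
    connection::F
    layers::T
end

function (m::CheckedParallel)(xs...)
    nl, nx = length(m.layers), length(xs)
    # Assumed rule: one shared input, or exactly one input per sub-layer.
    if nx != 1 && nx != nl
        throw(ArgumentError(
            "CheckedParallel with $nl sub-layers takes 1 or $nl inputs, got $nx"))
    end
    ys = nx == 1 ? map(f -> f(xs[1]), m.layers) :        # share the single input
                   map((f, x) -> f(x), m.layers, xs)     # pair layers with inputs
    return foldl(m.connection, ys)                       # pairwise reduction
end

m = CheckedParallel(+, (x -> x .+ 1, x -> 2 .* x))
m([1, 2])        # one input shared by both sub-layers
m([1], [10])     # one input per sub-layer
```

Calling `m([1], [2], [3])` throws the `ArgumentError` instead of silently dropping inputs. Both lengths come from tuple types, so the branch may well be resolved at compile time, but how AD or GPU tooling treats it is not verified here. The sketch does not handle the zero-sub-layer case.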
Just referencing a parallel conversation that has some relevance to this discussion: #1673 (comment) I do feel that naming branches will be more useful than preventing the mismatched branches vs. args foot gun. For example, keeping track of categorical vs continuous pre-processing branches, or the plethora of branches in an inception block. We could always address the foot gun with a runtime check. |
Note also that this case currently isn't supported. Its
julia> Parallel(hcat, (x->x.+1,))([0], [0], [0])
1-element Vector{Int64}:
 1

julia> Parallel(myplus, (identity,))([1], [10], [100])
1-element Vector{Int64}:
 1 |
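The single-element results above are what one would expect if the forward pass zips sub-layers with inputs, as the first post suggests. The sketch below uses only Base functions and assumes, rather than reproduces, that behaviour.

```julia
layers = (x -> x .+ 1,)        # one sub-layer
inputs = ([0], [0], [0])       # three inputs

# zip stops at the shorter collection, so only one (layer, input) pair survives.
matched = collect(zip(layers, inputs))
println(length(matched))

# Multi-iterator mapreduce pairs up its arguments the same way, so the two
# extra inputs are dropped without any error.
out = mapreduce((f, x) -> f(x), hcat, layers, inputs)
println(out)
```

With a single surviving pair the connection (`hcat` here) has nothing to combine, so the result is just the output of the one sub-layer.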
My main worry with a dynamic runtime check is that one of AD, GPUCompiler or whatever new tracing functionality is coming down the pipe with Symbolics/tracing will really dislike it. Perhaps that's a non-issue though. As for the not actually present case 3, take it as a feature request ;) |
I think the vararg version can be modified to do the pairwise thing by managing
Inner constructors are usually bad for AD, and we definitely don't want to restrict what the types of the branches are. Storing layers as a tuple should suffice here. Good to avoid runtime checks if we can get most of the way there without them. |
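One way to always store a tuple without an inner constructor is to constrain the container type parameter and keep a vararg outer constructor. `TupleParallel` is a hypothetical name used only for this sketch, and the `T<:Tuple` constraint is an assumption about how this could be done, not a quote from the thread.

```julia
# Hypothetical stand-in for illustration only; not Flux's Parallel.
struct TupleParallel{F, T<:Tuple}
    connection::F
    layers::T          # always a tuple, even for zero or one sub-layer
end

# Vararg outer constructor: collects the sub-layers into a tuple and hands
# them to the default constructor, which now only accepts tuples.
TupleParallel(connection, layers...) = TupleParallel(connection, layers)

a = TupleParallel(+, sin)            # single sub-layer is wrapped: (sin,)
b = TupleParallel(+, sin, cos)       # varargs are collected: (sin, cos)
c = TupleParallel(+, (sin, cos))     # an explicit tuple is stored as given
```

The constraint applies only to the container, so the sub-layers themselves can still be of any type. Supporting named branches would need the constraint widened, for example to also admit a `NamedTuple`.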
First, about vararg vs single-argument. I think this is the documented behaviour, using zip, but how surprising is this? Is its early stopping a feature or a footgun?

Second, from the docs it's not very clear whether "reducing the output with connection" means pairwise, or vararg. In fact it means pairwise, which is the same (if slightly less efficient) for +, vcat, etc.

Third, it currently allows the construction of a layer with no sub-layers, but this cannot be called. Is this desirable, so that some automatic generation does not produce errors in a trivial case, even if the result cannot be called? Would it be better to error on construction? Or should calling this be given some meaning: if connection is vararg, then possibly Parallel(myplus)([1]) == +([1])?

In fact, even for one sub-layer there are surprises. Probably this should run, but should it call the connection with one argument, or not? (At present, not).

(Split off from discussion in #1681, which is really orthogonal.)