added GlobalMaxPool, GlobalMeanPool, and flatten layers #950
Conversation
src/layers/conv.jl
Outdated
Transforms (w,h,c,b)-shaped input into (w*h*c,b)-shaped output,
by linearizing all values for each element in the batch.
"""
struct Flatten end
Does this actually have to be a struct, or could it just as well be a function?
I was about to say the same: we don't really need a struct here, a plain function should be sufficient.
The reason I did this is that I am treating flattening as a layer (like Conv and Pool) and not as a function (like softmax). Technically, it could also have an activation function as a property.
struct Flatten{F}
    σ::F
    function Flatten(σ::F = identity) where {F}
        return new{F}(σ)
    end
end

function (f::Flatten)(x::AbstractArray)
    σ = f.σ
    σ(reshape(x, :, size(x)[end]))
end
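For context, a sketch of how this struct-based version would sit in a model, assuming the Flatten definition above (input shape and layer sizes are illustrative, not from the PR):

using Flux

model = Chain(Conv((3, 3), 1 => 16, relu),  # (28,28,1,b) -> (26,26,16,b)
              MaxPool((2, 2)),              # -> (13,13,16,b)
              Flatten(),                    # -> (2704,b), σ defaults to identity
              Dense(13 * 13 * 16, 10),
              softmax)

x = rand(Float32, 28, 28, 1, 32)  # a batch of 32 single-channel 28x28 images
size(model(x))                    # (10, 32)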
Keras also has flattening implemented as a layer btw.
I can also change the whole part on flattening to:
flatten(x::AbstractArray) = reshape(x, :, size(x)[end])
Would that be the way to go?
A layer is a function. Since this layer does not need to store any parameters, a regular function is enough. Layers with internal parameters are callable structs, i.e., they are functions as well.
flatten(x::AbstractArray) = reshape(x, :, size(x)[end])
Would that be the way to go?
I would say so :)
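To make that concrete, a minimal sketch of a plain function dropped straight into a Chain (layer sizes are assumptions for illustration):

using Flux

flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

# a plain function composes in a Chain exactly like a callable struct:
model = Chain(Conv((3, 3), 1 => 8, relu),  # (28,28,1,b) -> (26,26,8,b)
              flatten,                     # -> (5408,b)
              Dense(26 * 26 * 8, 10))

x = rand(Float32, 28, 28, 1, 4)
size(model(x))  # (10, 4)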
I still tend more towards the solution with the struct.
The Flatten layer could have properties like an activation function, the number of parameters, a name, which dimensions it flattens, the data format, ...
It could, but your implementation does not. An activation function is just the next function in the chain, or alternatively the activation function of the preceding layer, as flatten is a linear operation. No other layers have names in Flux.
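A sketch of the equivalence being argued here: since flatten is linear, applying the activation as the next step in the chain gives the same result as fusing it into the layer (flatten as proposed above):

using Flux

flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

m1 = Chain(flatten, x -> relu.(x))  # activation as the next function
m2 = x -> relu.(flatten(x))         # activation fused into one step

x = randn(Float32, 4, 4, 2, 3)
m1(x) == m2(x)  # true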
On the other hand, the Conv layers do have their activation function as a property. You can thus specify Conv((3,3), 32=>64, softmax), which is a bit cleaner than splitting them up into a layer and a function.
Maybe we need both? A flatten function that can be called whenever necessary, and a Flatten layer that calls that function but also contains other properties?
It could, but your implementation does not.
I would commit the changes where the activation function is used as a property for the Flatten layer first, before this gets merged.
Under Flux, there is nothing special about a layer and the way it's defined; a function or any transform is treated exactly the same. For the case of the layers here, since we don't really have any parameters, we should be good with simple layers (see stateless layers as an example). The function can be an input, I feel.
EDIT: If we were to add the parameters, making it a closure makes sense, and therefore, the struct.
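For what it's worth, a hypothetical sketch of that closure idea; flatten_along is an invented name, not part of this PR:

# if flattening ever needed configuration (say, which dimension is the
# batch), a closure could carry that state without defining a struct:
flatten_along(dim) = x -> reshape(x, :, size(x, dim))

f = flatten_along(4)
size(f(rand(Float32, 2, 3, 4, 5)))  # (24, 5)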
I have added a flatten function that will be called in the Flatten layer and added support for activation functions in that layer. (see 2nd commit)
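Presumably the second commit looks something like the following; this is a sketch reconstructed from the discussion above, not the exact committed code:

# standalone function, usable on its own anywhere in a Chain:
flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

# layer that defers to the function, then applies its activation:
struct Flatten{F}
    σ::F
end
Flatten() = Flatten(identity)

(f::Flatten)(x::AbstractArray) = f.σ(flatten(x))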
Is this ready to get merged now?
I think a
Hi @gartangh, this PR looks good, could you rebase? Also, maybe you could remove the Flatten struct, as it seems to be the majority's opinion, and document and export the flatten function.
Hi @CarloLucibello, I rebased, removed the Flatten struct, and documented and exported the flatten function.
Could you perhaps rebase on master? I just updated the environment, since we had a fix go in in Zygote.
MeanPool
GlobalMeanPool
DepthwiseConv
ConvTranspose
CrossCor
add flatten here? Not sure if it is the right place; maybe we could use the NNlib section for the functional alternative of struct layers, but I think adding flatten here is ok for the time being.
@CarloLucibello, I rebased and added the flatten documentation tag.
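For reference, a quick sketch of the API this PR ends up exporting (layer names from the title and docs diff; output shapes assumed from the (w,h,c,b) convention used above):

using Flux

x = rand(Float32, 7, 7, 64, 8)  # (w, h, c, b)

size(GlobalMaxPool()(x))        # (1, 1, 64, 8): max over w and h
size(GlobalMeanPool()(x))       # (1, 1, 64, 8): mean over w and h
size(Flux.flatten(x))           # (3136, 8), i.e. (7*7*64, 8)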
Lgtm
nice, thanks! bors r+
Build succeeded