Allow only real input for activation functions #98

Merged · 10 commits · Apr 8, 2019

Conversation

devmotion (Contributor)

This PR implements what was discussed in #85. A simple macro ensures that activation functions are defined for real inputs only and calling them with an array results in a (hopefully informative) error message.
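A minimal sketch of what such a per-definition macro might look like (the macro name and details here are hypothetical, not necessarily the PR's exact code): it forwards the Real-only definition and additionally defines an AbstractArray method that throws an informative error.

# Hypothetical sketch: wrap a short-form definition like `σ(x::Real) = ...`
# and add an error-throwing method for array inputs.
macro real_only(def)
    fname = def.args[1].args[1]  # function name on the left-hand side of `=`
    quote
        $(esc(def))
        $(esc(fname))(x::AbstractArray, args...) =
            error("Use `", $(string(fname)), ".(x)` to apply activation functions to arrays.")
    end
end

@real_only σ(x::Real) = 1 / (1 + exp(-x))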

@MikeInnes (Member)

I like the idea but would prefer not to annotate each definition like this. How about σ(x::Real) = ... and then later in the file, @onlyreal σ, relu, ... to add the error message to all functions. A simple @eval loop might actually be easier than a macro here.
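A possible sketch of the batch form suggested here (a hypothetical implementation of @onlyreal; it assumes the Real methods are already defined earlier in the file):

macro onlyreal(fs...)
    defs = [:($(esc(f))(x::AbstractArray, args...) =
                error("Use `", $(string(f)), ".(x)` to apply activation functions to arrays."))
            for f in fs]
    Expr(:block, defs...)
end

# σ(x::Real) = 1 / (1 + exp(-x))
# relu(x::Real) = max(zero(x), x)
# @onlyreal σ relu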

@devmotion (Contributor, Author)

I agree; as soon as I had written this initial implementation, I thought that approach would have been easier 😄 I guess it might also simplify overloading existing methods. E.g., I was wondering whether it would make sense to have a common implementation of softplus in NNlib and StatsFuns (the implementation there seems to be numerically more stable, and there is a PR to improve it even further). What's the reason for having different implementations in different packages? I assume it's not intended to add a dependency on Rmath to NNlib, so maybe it would be better to move those implementations to a separate package on which both StatsFuns and NNlib could depend (or to remove the R dependency from StatsFuns 😛).
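For reference, a minimal sketch of the kind of numerical-stability difference mentioned above (the general technique, not necessarily StatsFuns' exact code): the naive definition overflows for large inputs, while the algebraically equivalent rewrite log(1 + exp(x)) = max(x, 0) + log1p(exp(-abs(x))) does not.

softplus_naive(x::Real) = log(1 + exp(x))                         # exp(x) overflows for large x
softplus_stable(x::Real) = max(x, zero(x)) + log1p(exp(-abs(x)))  # equivalent, overflow-free

softplus_naive(1000.0)   # Inf
softplus_stable(1000.0)  # 1000.0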

@MikeInnes (Member)

> What's the reason for having different implementations in different packages?

Only Conway's law. And I think there's some advantage in exposing softplus as a name, just because it's familiar. Of course that doesn't mean we should have multiple impls.

I think a StatsFuns dependency would be ok, especially since Rmath is going away eventually. We probably want all of it to work well with the GPU and AD ecosystem anyway.

@MikeInnes (Member)

Note that we want to keep the manifest (it's useful for development/testing, especially if we need to use unreleased functionality temporarily). If it needs an update feel free to do that.

@codecov-io commented Mar 26, 2019

Codecov Report

Merging #98 into master will increase coverage by 11.67%.
The diff coverage is 100%.


@@             Coverage Diff             @@
##           master      #98       +/-   ##
===========================================
+ Coverage   84.94%   96.61%   +11.67%     
===========================================
  Files          17       17               
  Lines         611      532       -79     
===========================================
- Hits          519      514        -5     
+ Misses         92       18       -74
Impacted Files                        Coverage Δ
src/NNlib.jl                          100% <ø> (ø) ⬆️
src/activation.jl                     93.75% <100%> (+15.97%) ⬆️
src/impl/conv_direct.jl               100% <0%> (+3.84%) ⬆️
src/dim_helpers/DepthwiseConvDims.jl  100% <0%> (+4.16%) ⬆️
src/dim_helpers/DenseConvDims.jl      100% <0%> (+4.34%) ⬆️
src/dim_helpers/ConvDims.jl           79.54% <0%> (+5.03%) ⬆️
src/softmax.jl                        93.1% <0%> (+6%) ⬆️
src/impl/depthwiseconv_direct.jl      100% <0%> (+7.84%) ⬆️
src/impl/depthwiseconv_im2col.jl      100% <0%> (+7.89%) ⬆️
src/impl/conv_im2col.jl               98.96% <0%> (+10.08%) ⬆️
... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ed4fe9a...84acf9d

@devmotion (Contributor, Author)

Ah, I see, sorry about that! Cleaning the requirements led to test errors on Travis, so I thought the easiest solution would be to remove the Manifest altogether. I updated it.

# Provide an informative error message if activation functions are called with an array
for f in (:σ, :σ_stable, :logσ, :relu, :leakyrelu, :elu, :gelu, :swish, :selu, :softsign, :softplus)
    @eval $(f)(x::AbstractArray, args...) =
        error("Use explicit invocations such as `", $(string(f)), ".(x)` to apply activation functions to tensors!")
end
@MikeInnes (Member)

Suggested change:
-        error("Use explicit invocations such as `", $(string(f)), ".(x)` to apply activation functions to tensors!")
+        error("Use broadcasting (`", $f, ".(x)`) to apply activation functions to arrays.")

I think $f will print the same but worth checking quickly.

@devmotion (Contributor, Author) commented Apr 7, 2019

Actually in my initial implementation I used $f, but I added string since I was not satisfied with the output. With just $f I get, e.g.,

ERROR: Use broadcasting (NNlib.σ.(x)) to apply activation functions to arrays.

whereas with $(string(f))

ERROR: Use broadcasting (σ.(x)) to apply activation functions to arrays.

is printed.
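For reference, the difference can be seen directly in the generated expressions (my reading of it, not from the PR itself): $f splices the symbol, which is later resolved to the function object and printed the way Julia prints functions (module-qualified when the name is not in scope), whereas $(string(f)) splices a plain string literal at expansion time.

julia> f = :σ;

julia> :(error("apply ", $f, " with broadcasting"))
:(error("apply ", σ, " with broadcasting"))

julia> :(error("apply ", $(string(f)), " with broadcasting"))
:(error("apply ", "σ", " with broadcasting"))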

@MikeInnes (Member)

I see, a bit surprising that they behave differently, but that makes sense.

@MikeInnes (Member)

Unfortunately the changes to the manifest etc. are conflicting now, but that should be easy to fix. I suggested a tweak to the error message, but otherwise this is good to go!

@MikeInnes (Member)

Perfect, thank you!
