The relu definition is a bit "scary" for array inputs. #85

Closed
KristofferC opened this issue Jan 17, 2019 · 8 comments

@KristofferC (Contributor)

The following seems like an easy mistake to make:

julia> x = randn(5)
5-element Array{Float64,1}:
 -1.0483606281411753
 -0.13524020825581384
 -2.184303973828124
  0.4308162306092796
 -0.673766167758703

julia> relu(x)
5-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0

julia> relu.(x)
5-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.4308162306092796
 0.0

@MikeInnes (Member)

I agree but I'm not sure what to do about it. I guess we can just disable relu for arrays, and perhaps re-enable if someone complains about it.

@KristofferC (Contributor, Author)

Or add a vectorized definition for it.

relu(x::AbstractArray) = clamp.(x, 0, Inf)

or something.
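
For illustration, a minimal sketch of how that would sit alongside a scalar method (the scalar definition here is an assumption for the example, not necessarily the one in NNlib):

relu(x::Real) = max(zero(x), x)             # assumed scalar definition, for the sketch
relu(x::AbstractArray) = clamp.(x, 0, Inf)  # the proposed vectorized method

x = [-1.0, 0.4, -2.0]
relu(x) == relu.(x)   # true: both give [0.0, 0.4, 0.0]

One wrinkle: clamp.(x, 0, Inf) promotes integer inputs to floating point (clamp(1, 0, Inf) is 1.0), whereas broadcasting a max-based scalar relu preserves the element type.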

@MikeInnes (Member)

Strictly speaking you could argue that this changes the meaning of relu, and it's not consistent with the way that Base has gradually moved away from special-cased vectorised operations. But it's hard to imagine that being a big problem in practice.

Happy to take a PR for that fix.

@KristofferC (Contributor, Author)

Base has gradually moved away from special-cased vectorised operations.

Base typically has type annotations, though:

julia> sin(rand(5))
ERROR: MethodError: no method matching sin(::Array{Float64,1})

@KristofferC (Contributor, Author)

But yes, there are cases where f(x::AbstractArray) is not equal to f.(x), like trigonometric functions of matrices.
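
For example (a small sketch): LinearAlgebra defines sin for square matrices as the matrix sine, which differs from the elementwise broadcast:

using LinearAlgebra

A = [0.0 1.0; -1.0 0.0]

sin(A)    # matrix sine: approximately [0.0 1.1752; -1.1752 0.0]
sin.(A)   # elementwise sine: [0.0 0.8415; -0.8415 0.0]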

@MikeInnes (Member)

Yeah true. Maybe the better fix is to just define relu(::Real) only; it might make sense to do that across our activation functions.
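
A minimal sketch of that approach (the definition is illustrative, assuming a max-based scalar relu): restricting the signature to Real makes the array call fail loudly with a MethodError instead of returning a silently wrong result.

# Scalar-only activation; arrays must go through broadcasting.
relu(x::Real) = max(zero(x), x)

relu(0.5)            # 0.5
relu.([-1.0, 0.5])   # [0.0, 0.5]
# relu([-1.0, 0.5])  # ERROR: MethodError: no method matching relu(::Vector{Float64})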

@staticfloat (Contributor)

I would support a help.jl file filled with things like:

relu(x::Array) = error("Use explicitly-broadcasted invocations such as `relu.(x)` to apply activation functions to tensors!")

These little gotchas can be very helpful to users coming from less-enlightened frameworks.

@staticfloat (Contributor)

Fixed by #98
