Create performance tips docs section #615
Conversation
Great idea.
CI failure is an unrelated network issue with downloading some test data.
Flux works great with all kinds of number types.
But often you do not need to be working with, say, `Float64` (let alone `BigFloat`).
Switching to `Float32` can give you a significant speed up,
not because the operations are faster, but because the memory usage is halved.
Seems relevant to mention `Float32` on GPU here. Also, operations do tend to be faster since you can fit more numbers in a SIMD lane at a given size.
Feel free to use the GitHub suggestion feature, or to PR after it is merged.
I know little of GPUs, so someone else is better placed to write it.
You could just remove the "not because the operations are faster, but because the memory usage is halved." part?
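To make the point concrete for readers of this thread, here is a minimal timing sketch of the comparison under discussion; the array size, the `@time` harness, and the variable names are illustrative assumptions, not code from the PR:

```
# Illustrative sketch only: the same matrix multiply in Float64 and Float32.
x64 = rand(Float64, 1000, 1000)
x32 = Float32.(x64)        # same values, half the memory

x64 * x64; x32 * x32       # warm up compilation before timing
@time x64 * x64            # Float64 multiply
@time x32 * x32            # half the memory traffic, and wider SIMD lanes
```

On most hardware the `Float32` multiply is also faster outright, which is the SIMD point made above.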
For example:

```
my_tanh(x) = Float64(tanh(x))
```
This seems a bit artificial, perhaps something like `tanh(x) + 2.0` or `5.0 * tanh(x)` etc.
I wanted something very obvious for the first example; the one below is less obvious.
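As a hedged illustration of the promotion this example is meant to show (the input array here is an assumption, not from the PR):

```
# Sketch: an activation that forces Float64 promotes a Float32 forward pass.
my_tanh(x) = Float64(tanh(x))

x = rand(Float32, 3)
eltype(my_tanh.(x))   # Float64 -- the Float32 inputs were silently promoted
```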
Merge?
Great stuff and sorely needed, thanks a lot @oxinabox and reviewers.
Not only should your activation functions be [type-stable](https://docs.julialang.org/en/v1/manual/performance-tips/#Write-%22type-stable%22-functions-1),
they should also preserve the type of their inputs.

A very artificial example using an activatioon function like
"activatioon"
And you use less memory.

## Make sure your custom activation functions preserve the type of their inputs
Add empty line
you will see a large slow-down

This can occur sneakily, because you can cause type-promotion by interacting with numeric literals.
E.g. the following will have run into the same problem as above:
Suggested change:
- E.g. the following will have run into the same problem as above:
+ E.g. the following will run into the same problem as above:
```
leaky_tanh(x) = 0.01x + tanh(x)
```

While one could change your activation function (e.g. to use `0.01f0x`) to avoid this whenever your inputs change,
Suggested change:
- While one could change your activation function (e.g. to use `0.01f0x`) to avoid this whenever your inputs change,
+ While one could change the activation function (e.g. to use `0.01f0x`) to avoid this whenever your inputs change,
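For completeness, one type-preserving rewrite of `leaky_tanh`, sketched under the assumption that inputs are floating point; this is an illustration, not the wording that was merged:

```
# Sketch: oftype converts the literal to the input's type, so Float32 stays
# Float32 and Float64 stays Float64, with no hard-coded 0.01f0 literal.
leaky_tanh(x) = oftype(x, 0.01) * x + tanh(x)

typeof(leaky_tanh(0.5f0))  # Float32 -- no promotion
typeof(leaky_tanh(0.5))    # Float64 -- still works at full precision
```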
@KristofferC ah noo. |
In part in response to #613, but more generally I think it is good to have.