cuDNN acceleration #1046

Merged
merged 9 commits into from
Sep 8, 2014

shelhamer
Member

Caffe + cuDNN is an NVIDIA-Caffe collaboration for deep learning. cuDNN is an acceleration library for deep network operations that integrates into Caffe as a drop-in. It is a free library, downloadable with CUDA developer registration, and requires CUDA >= 6.5. The combination is the fastest public framework for deep learning in vision when benchmarked on the AlexNet / CaffeNet architectures, with overall model speedups of 1.2-1.5x and layer-wise speedups of 1.2-3x over standard Caffe. Caffe + cuDNN lets you define your models just as before while taking advantage of these speedups.

In this first release cuDNN includes

  • convolution
  • pooling
  • nonlinearities (ReLU, Sigmoid, TanH)
  • softmax

These operations are drop-in accelerations of the Caffe layers. To switch on acceleration, set

USE_CUDNN := 1

in your Makefile.config during installation. Layers will be accelerated by default.

NVIDIA and Caffe will coordinate future releases to further accelerate computation and introduce new features. NVIDIA has committed to tuning cuDNN to current and future GPU architectures.

Caffe is free and open-source and cuDNN is a CUDA developer library like cuBLAS and cuRAND.

Check out the cuDNN site, Caffe's latest roast slides, and the NVIDIA Parallel Forall blog announcement!

Thanks to the cuDNN team for this collaboration and special thanks to Cliff Woolley for his attention to detail.


Note on convolution: the cuDNN convolution aims to match or exceed the speed of Caffe's own matrix-multiplication approach while reducing memory usage. In many input and model regimes it accelerates the computation 1.3-3x and never requires buffers. In certain cases of fully-convolutional models or large inputs the Caffe convolution is slightly faster at the cost of more memory usage -- this is a direction for further optimization.

To pick the computational engine per layer in your models, set the engine: CAFFE or engine: CUDNN field in the {convolution,pooling,relu,sigmoid,tanh,softmax}_param of your model definition:

layers {
  type: CONVOLUTION
  ...
  convolution_param {
    engine: CAFFE
    ...
  }
}
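
The same field can select cuDNN explicitly for any of the other supported layer types. A minimal sketch for a pooling layer follows; the layer and blob names (pool1, conv1) and the kernel settings are illustrative assumptions, not taken from any shipped model:

```
layers {
  name: "pool1"        # hypothetical layer name
  type: POOLING
  bottom: "conv1"      # hypothetical input blob
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3     # illustrative values
    stride: 2
    engine: CUDNN      # request the cuDNN implementation for this layer
  }
}
```

Setting the engine per layer makes it possible to mix engines within one model, for instance keeping the Caffe convolution for the large-input cases described in the note on convolution while accelerating everything else.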

@sguada
Contributor

sguada commented Sep 7, 2014

@shelhamer the slides are not public

@shelhamer
Member Author

Thank you for pointing that out. The slides are now public.


@niuzhiheng
Contributor

Awesome!
The NVIDIA blog post for this is here: http://devblogs.nvidia.com/parallelforall/accelerate-machine-learning-cudnn-deep-neural-network-library/

@OpenHero

OpenHero commented Sep 8, 2014

It was only published as a library, with no source code.

Best wishes,
Kaiyong Zhao

shelhamer added a commit that referenced this pull request Sep 8, 2014
@shelhamer shelhamer merged commit 3bafe2f into BVLC:dev Sep 8, 2014
@bhack bhack mentioned this pull request Sep 8, 2014
@Yangqing
Member

Just for the record: when Caffe is compiled with cuDNN and no changes are made to the pre-cuDNN protobuf model definitions, the default behavior is to use the cuDNN implementations.
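
To illustrate that default, here is a sketch of a layer definition that leaves the engine field out entirely. The names and hyperparameters are illustrative assumptions loosely modeled on a first AlexNet-style convolution, not copied from a specific model file:

```
layers {
  name: "conv1"        # hypothetical layer name
  type: CONVOLUTION
  bottom: "data"       # hypothetical input blob
  top: "conv1"
  convolution_param {
    num_output: 96     # illustrative values
    kernel_size: 11
    stride: 4
    # No engine field: with USE_CUDNN := 1 at build time this layer
    # runs on cuDNN; add "engine: CAFFE" to opt out for this layer.
  }
}
```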

This was referenced Sep 18, 2014
@shelhamer shelhamer deleted the cudnn branch September 19, 2014 04:36
@qianghuang84

cool component

@sguada
Contributor

sguada commented Sep 27, 2014

@shelhamer it is a bit annoying to get so many warnings when falling back to standard Caffe. I think it could warn once during setup and then not say it again.

W0926 23:53:11.389243 18332 cudnn_pooling_layer.cu:17] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.390046 18332 cudnn_pooling_layer.cu:17] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.546059 18332 cudnn_pooling_layer.cu:36] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.546967 18332 cudnn_pooling_layer.cu:36] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.547852 18332 cudnn_pooling_layer.cu:36] Falling back to standard Caffe for padded pooling.

See #1170
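
As a possible workaround until the logging changes, a padded pooling layer could request the Caffe engine explicitly so the cuDNN pooling layer is never instantiated for it. This is a sketch under the assumption that the warnings are emitted only by the cuDNN pooling layer's fallback path; the layer names and sizes are hypothetical:

```
layers {
  name: "pool_padded"  # hypothetical layer name
  type: POOLING
  bottom: "conv"       # hypothetical input blob
  top: "pool_padded"
  pooling_param {
    pool: MAX
    kernel_size: 3     # illustrative values
    stride: 2
    pad: 1             # padding is the case that triggers the fallback
    engine: CAFFE      # assumed to skip the cuDNN layer and its warning
  }
}
```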

mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014