Split CUDA code (*.cu) from CPU code (*.cpp). #152
Comments
What's the plan for this? Split each layer into two subclasses, e.g. ConvLayerCPU and ConvLayerGPU, or keep one class and put dummy Forward_GPU functions in the *.cpp files?
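To make the second option concrete, here is a minimal sketch of a single layer class whose GPU method is only a stub in the *.cpp file. The class `ScaleLayer`, the method names `Forward_CPU` / `Forward_GPU`, and the `WITH_CUDA` build flag are all hypothetical names chosen for illustration; this is not Caffe's actual layer code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical layer used only to illustrate the "dummy GPU function" option.
class ScaleLayer {
 public:
  explicit ScaleLayer(float scale) : scale_(scale) {}

  // CPU path: always available, would live in the *.cpp file.
  void Forward_CPU(const std::vector<float>& bottom,
                   std::vector<float>* top) const {
    top->resize(bottom.size());
    for (std::size_t i = 0; i < bottom.size(); ++i) {
      (*top)[i] = scale_ * bottom[i];
    }
  }

  // GPU path: declared here; the real definition would live in a *.cu file.
  void Forward_GPU(const std::vector<float>& bottom,
                   std::vector<float>* top) const;

 private:
  float scale_;
};

#ifndef WITH_CUDA  // hypothetical flag, assumed to be set only for CUDA builds
// Dummy definition so that a CPU-only build still links.
void ScaleLayer::Forward_GPU(const std::vector<float>&,
                             std::vector<float>*) const {
  throw std::runtime_error("Forward_GPU called in a CPU-only build");
}
#endif
```

With this layout the class interface stays the same in both builds, and only the stub has to be guarded; the subclass option would instead move each path into its own type.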
@erictzeng it would be good if you submitted a work-in-progress pull request for this, or shared your plan here, so that others could provide feedback before too much work is done.
Sorry, we talked off-list about this. He's waiting for the merge of #142 and #163, but he has already made the split; it builds and the tests pass. @erictzeng it'd be good to push and PR your current work to dev anyway, and we can look at it once the mentioned PRs are folded in.
Closing since #172 is in.
The discussion in #172 indicates that a CPU-only version is only possible when the CPU and CUDA code live in different classes or, better, in different namespaces, as was done in OpenCV. Do you agree?
@kloudkl what exactly did the OpenCV folks do? Something like having separate opencv_cpu and opencv_gpu namespaces?
The modules dir contains cuda and cudev namespaces that separate out the GPU-related code. The platforms dir includes cmake files and build scripts specific to the most common platforms.
Splitting common into cpp and cuda code with a common interface that satisfies the osx issue in #165 turns out to be intricate. An immediate frustration is that the Caffe singleton itself relies on curand and cublas. Although random number generation and blas can be abstracted into strategies for cpu/gpu operation, this doesn't handle (1) switching modes or (2) mixed cpu+gpu implementations. Instead of totally splitting, the simplest approach could be to implement two Caffe singletons that both obey the same interface. The cpu-only singleton would have no-ops for its gpu methods and would complain through warnings when gpu methods are called or gpu args are passed. In this way the rest of Caffe, such as the solver, the tools, and the like, can stay the same. Instead of ifdefs all around, the changes will be isolated to the Caffe singletons.
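A rough sketch of the two-singleton idea, showing only the cpu-only side. Everything named here (`CaffeInterface`, `CpuOnlyCaffe`, `GetCaffe`, `SetDevice`, the warning counter) is hypothetical and invented for illustration; it is not the actual Caffe singleton API.

```cpp
#include <iostream>
#include <string>

// Hypothetical shared interface that both singletons would obey.
class CaffeInterface {
 public:
  enum Mode { CPU, GPU };
  virtual ~CaffeInterface() {}
  virtual void set_mode(Mode mode) = 0;
  virtual Mode mode() const = 0;
  virtual void SetDevice(int device_id) = 0;
};

// CPU-only implementation: gpu requests are no-ops that emit a warning.
class CpuOnlyCaffe : public CaffeInterface {
 public:
  void set_mode(Mode mode) override {
    if (mode == GPU) Warn("set_mode(GPU)");
    // The mode stays CPU regardless of what was requested.
  }
  Mode mode() const override { return CPU; }
  void SetDevice(int /*device_id*/) override { Warn("SetDevice"); }

  // Number of ignored gpu requests so far (illustration only).
  int warnings() const { return warnings_; }

 private:
  void Warn(const std::string& what) {
    ++warnings_;
    std::cerr << "warning: " << what << " ignored in CPU-only build\n";
  }
  int warnings_ = 0;
};

// Single access point. A CUDA build would return a GPU-capable
// implementation of the same interface here instead (omitted).
inline CaffeInterface& GetCaffe() {
  static CpuOnlyCaffe instance;
  return instance;
}
```

Callers such as the solver would only ever see `CaffeInterface`, so which implementation `GetCaffe()` returns becomes a build-time decision made in one place.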
This will enable CPU-only Caffe compilation (#3), and go a long way toward resolving the OS X 10.9 CUDA compilation problems (#122).
Checklist [shelhamer]:
Follow-up: abstract CPU / GPU device computation #610