Device Abstraction #610
Conversation
@shelhamer, I'm a little confused about the most suitable workflow to contribute to a feature branch with an open PR. Is it the case that you will rebase this PR against device-abstraction from time to time while @robwhess's #587 and any other similar PRs should rebase against both dev and device-abstraction before being merged into the device-abstraction branch?
@kloudkl this PR and the device-abstraction branch are one and the same. Any change to BVLC:device-abstraction is automatically reflected in this PR, just like when you push further commits to a branch on your fork while a PR is open. So the workflow to contribute to this PR is to make a PR on BVLC:device-abstraction. The usual rules for PRs hold except that the base is device-abstraction instead of dev: you should have a clean merge, PRs should be short and to the point, etc.

The one complication for a feature branch, which you are right to point out, is that device-abstraction must itself track dev and be rebased from time to time. @robwhess has volunteered to do the first rebase of this kind. You and any other contributors can also help by rebasing when the feature branch has fallen behind. Once rebased, the contributor should push to their fork and comment for a BVLC member with push rights to update BVLC:device-abstraction to their rebased fork. This is what was done for #165, except that I did too much of the rebasing then.

Let me know if that's not clear, since BVLC feature branch PRs are a little different in this respect.
Very clear. Thanks a lot!
Interesting that the Travis build totally fails... I at least compiled @ e316202 on Linux with gcc successfully (but had a few warnings). Edit: oh right -- it's because of CPU_ONLY. If I do
I did another rebase onto the latest dev (just another few days' worth of commits) and force pushed again. This rebase was relatively easy -- maybe the trick is to just not let too much time pass between rebases, or maybe these changes just happened to be easier than most.
Did one more rebase and force pushed after seeing Travis pass on my fork. This one was a bit more painful due to a couple of interface-breaking PRs, but manageable; it took around an hour. I've tested
@jeffdonahue yeah, the Thanks for doing the rebases and getting things working with Travis.
Perhaps once device abstraction is complete (which I see might take a while), implementing the OpenCL portions of the math code could use parts of VexCL (https://github.com/ddemidov/vexcl). It's a template expression library that generates OpenCL C kernels from C++ expressions involving vex::vector<T> (a device analogue of std::vector<T>). Incidentally, it also has a CUDA backend. For instance, a ReLU kernel might be as simple as a one-line vector expression, and caffe_copy a plain assignment (see the sketch below). I believe it can be mixed with typical OpenCL-style libraries like clBLAS or clFFT. However, it uses C++11 features, and I'm not sure which versions of clang/g++ Caffe uses.
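The snippets referenced above did not survive in this thread; the following is only a minimal sketch of the idea, assuming VexCL's vex::vector type and its builtin function wrappers (variable names are illustrative, not Caffe's):

```cpp
#include <vexcl/vexcl.hpp>

int main() {
    // Pick any available compute device; with the CUDA backend the same
    // code would generate CUDA kernels instead of OpenCL ones.
    vex::Context ctx(vex::Filter::Any);

    const size_t n = 1 << 20;
    vex::vector<float> bottom(ctx, n);
    vex::vector<float> top(ctx, n);

    // ReLU: the whole expression is compiled into a single generated kernel.
    top = max(bottom, 0.0f);

    // Something like caffe_copy reduces to an assignment between device vectors.
    vex::vector<float> copy_of_top(ctx, n);
    copy_of_top = top;

    return 0;
}
```

Whether kernels generated this way are fast enough for the expensive layers is exactly the benchmarking question raised in the next comments.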
VexCL and ViennaCL were once considered as candidates. Out of concern for performance, clBLAS was preferred. Before making further decisions, please benchmark how they perform in Caffe in the most computationally expensive parts, such as the convolution layer.
Yes, VexCL can be used with clBLAS and cuBLAS, but you could try benchmarking it in the convolution layer.
Hi all,
First things first. Git rebase and then add your own code.
I don't know if this is definitely stalled, but consider also this news: ArrayFire is now under the BSD license.
Device abstraction is still a worthy goal but has stalled for the moment.
Well, it is still my intention to come back to this. We'll eventually need to use the device abstraction at Flickr, so there should be an opportunity for us to work on this again, but that time just hasn't come yet, unfortunately. I'll keep ArrayFire in mind for when I finally do get the chance to work on this again.
If anyone is interested, this is an example of a transparent multi-device gemv (see the sketch below).
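The example linked in the original comment is not preserved here. As a rough stand-in, and assuming ArrayFire's C++ API (since ArrayFire is the library under discussion), a backend-transparent gemv might look like this sketch:

```cpp
#include <arrayfire.h>

int main() {
    // The same code runs unchanged on whichever backend/device is active
    // (CUDA, OpenCL, or CPU builds of ArrayFire).
    af::info();  // print the selected device

    const int m = 512, n = 256;
    af::array A = af::randu(m, n);  // m x n matrix
    af::array x = af::randu(n);     // length-n vector

    // gemv: y = A * x, dispatched to the active backend's BLAS.
    af::array y = af::matmul(A, x);
    af::eval(y);  // force execution

    return 0;
}
```

The point is that the calling code never mentions the device; switching backends requires no changes to the math calls.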
I would also ask @pavanky if he could give us an opinion on the role ArrayFire could play in device abstraction, and on what the performance impact would be if we no longer call cuBLAS/OpenBLAS/ATLAS and cuFFT directly.
@bhack No. I can no longer open source any code.
@kloudkl I'm sorry that we have lost a very active member.
Caffe has been among the top 15 most forked C++ projects on GitHub. How could there not be enough contributions? At the same time, many other organizations, e.g. Google, Baidu, Microsoft, Tencent, Samsung, and GraphLab Inc, have all published or even open sourced various other (distributed) deep learning frameworks, some of which may pose serious disruptive threats in the coming year.
Referencing #2195
While device abstraction is still a good direction, this PR is closed since it is against the deprecated dev branch.
@shelhamer Probably @naibaf7 has a plan to supersede this.
CPU / GPU device separation and abstraction
Separating and abstracting the CPU / GPU device code paths opens the door to backends beyond CUDA and so improves Caffe all around. That is, provided there is little to no overhead in both performance and coding. Since this requires a non-trivial set of coordinated changes, it has been promoted to the BVLC:device-abstraction feature branch. To contribute to this development, PR to BVLC:device-abstraction. When you rebase the feature branch on dev, comment with your fork so the maintainers can update BVLC:device-abstraction.
See #415 for history.
This PR tracks the progress and integration of this branch for its final merge to dev.
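As a rough illustration of what the separation aims for: layer code would call into a device object instead of calling cuBLAS or a CPU BLAS directly, so that a new backend such as OpenCL could be added behind the same interface. The class and function names below are hypothetical, not the branch's actual API:

```cpp
// Hypothetical sketch only; illustrative names, not the branch's API.
#include <cblas.h>
#include <cublas_v2.h>

class Device {
 public:
  virtual ~Device() {}
  // y = alpha * x + y
  virtual void axpy(int n, float alpha, const float* x, float* y) = 0;
};

class CPUDevice : public Device {
 public:
  virtual void axpy(int n, float alpha, const float* x, float* y) {
    cblas_saxpy(n, alpha, x, 1, y, 1);  // host pointers
  }
};

class GPUDevice : public Device {
 public:
  explicit GPUDevice(cublasHandle_t handle) : handle_(handle) {}
  virtual void axpy(int n, float alpha, const float* x, float* y) {
    cublasSaxpy(handle_, n, &alpha, x, 1, y, 1);  // device pointers
  }
 private:
  cublasHandle_t handle_;
};

// A layer holding a Device* stays agnostic of where its math runs;
// an OpenCL-backed Device could be dropped in without touching layer code.
```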