mpiCudaGeneric is a program that transfers data from one machine to another with a GPU, using MPI and CUDA. #37309
Conversation
…then to the GPU using MPI and CUDA.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37309/28947
Code check has found code style and quality issues which could be resolved by applying the following patch(es)
@cmsbot, please test
The branch was updated from 7ee8239 to 12a0524.
@cmsbot, please test
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37309/29163
A new Pull Request was created by @AliinCern (Marafi) for master. It involves the following packages:
@cmsbuild, @makortel, @fwyzard can you please review it and eventually sign? Thanks. cms-bot commands are listed here
@cmsbuild, please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ef83a2/23658/summary.html Comparison Summary:
+heterogeneous
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
+1 |
Program description:
In this program, we transfer data from one machine to a GPU on a remote machine, using several different approaches based on MPI and CUDA.
Program Validation:
We have successfully built and tested it with scram b runtests.
Program Mechanism:
There are three approaches to transferring the data, which we call parts (a minimal sketch of them follows this list):
Part 1: Transfer data from Root pageable memory to Host pageable memory using MPI, then allocate memory on the GPU using cudaMalloc. Finally, transfer data from Host pageable memory to the GPU using cudaMemcpy.
Part 2: Transfer data from Root pageable memory to Host pinned memory using MPI and cudaMallocHost, then allocate memory on the GPU using cudaMalloc. Finally, transfer data from Host pinned memory to the GPU using cudaMemcpy.
Part 3: Allocate memory on the GPU using cudaMalloc, then transfer data from Root pageable memory directly to GPU memory using MPI.
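The following is only a rough sketch of these three data paths, not the PR's actual code; it assumes rank 0 is the Root, rank 1 owns the GPU, and the buffer names (N, h_pageable, h_pinned, d_data) are invented for illustration:
```cpp
#include <mpi.h>
#include <cuda_runtime.h>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  const int N = 1 << 20;  // vector length (assumed)
  const int tag = 0;

  if (rank == 0) {
    // Root: fill a pageable host buffer and send it to rank 1 over MPI.
    std::vector<float> data(N, 1.f);
    MPI_Send(data.data(), N, MPI_FLOAT, 1, tag, MPI_COMM_WORLD);
  } else if (rank == 1) {
    float* d_data;
    cudaMalloc(&d_data, N * sizeof(float));  // device buffer, used by all three parts

    // Part 1: receive into pageable host memory, then copy to the GPU.
    std::vector<float> h_pageable(N);
    MPI_Recv(h_pageable.data(), N, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    cudaMemcpy(d_data, h_pageable.data(), N * sizeof(float), cudaMemcpyHostToDevice);

    // Part 2 (same shape, but with pinned host memory):
    //   float* h_pinned;
    //   cudaMallocHost(&h_pinned, N * sizeof(float));
    //   MPI_Recv(h_pinned, N, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    //   cudaMemcpy(d_data, h_pinned, N * sizeof(float), cudaMemcpyHostToDevice);
    //   cudaFreeHost(h_pinned);

    // Part 3 (receive straight into device memory; requires a CUDA-aware MPI build):
    //   MPI_Recv(d_data, N, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    cudaFree(d_data);
  }

  MPI_Finalize();
  return 0;
}
```
Note that Part 3 only works when the MPI library is CUDA-aware, i.e. when MPI_Send/MPI_Recv can be handed device pointers directly.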
Program Measurements:
There are seven sections for which we measure the elapsed time.
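As a hedged sketch of how such a section might be timed (the PR's exact instrumentation is not shown here; the buffer names are reused from the sketch above), host-side steps can be bracketed with MPI_Wtime and device-side copies with CUDA events:
```cpp
// Host-side section, e.g. the MPI transfer, timed with MPI_Wtime (seconds).
double t0 = MPI_Wtime();
MPI_Recv(h_pageable.data(), N, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
double hostElapsed = MPI_Wtime() - t0;

// Device-side section, e.g. the host-to-device copy, timed with CUDA events (milliseconds).
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start);
cudaMemcpy(d_data, h_pageable.data(), N * sizeof(float), cudaMemcpyHostToDevice);
cudaEventRecord(stop);
cudaEventSynchronize(stop);
float deviceElapsedMs = 0.f;
cudaEventElapsedTime(&deviceElapsedMs, start, stop);
cudaEventDestroy(start);
cudaEventDestroy(stop);
```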
Program command line options are:
[-np] number of processes or processors that you would like to run.
[-s] size of the vectors that you would like to send; the type is float and there are two vectors.
[-t] number of times to repeat the task on the Device (GPU) side.
[-a] number of times to repeat the part.
[-p] choice of which part of the program to run.
[-q] print the Standard Deviation.
[-f] save the results into a file for each part.
[-h] for help.
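As an illustration only (the executable name and this exact invocation are assumptions, not taken from the PR), a run could look like: mpiCudaGeneric -np 2 -s 1048576 -t 10 -a 5 -p 2 -q -f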