Add GPU workflow to runTheMatrix #35263
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35263/25249
A new Pull Request was created by @srimanob (Phat Srimanobhas) for master. It involves the following packages:
@jordan-martins, @chayanit, @bbilin, @wajidalikhan, @cmsbuild, @AdrianoDee, @srimanob, @kskovpen can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
@cmsbuild please test |
Good question. One option is to read it from the version of the CUDA library bundled with CMSSW, e.g.:
ls $(scram tool tag cuda LIBDIR)/libcudart.so.*.*.* | sed -e 's/.*libcudart.so.//'
Another option is to add it as a variable to SCRAM.
Another good question is: what version should we use here?
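For reference, the shell one-liner above could be mirrored in Python roughly as follows. This is only a sketch: the helper names `cuda_runtime_from_filename` and `bundled_cuda_runtime` are hypothetical, and it assumes a CMSSW environment in which `scram tool tag cuda LIBDIR` prints the library directory, exactly as in the command quoted above.

```python
import glob
import os
import re
import subprocess


def cuda_runtime_from_filename(path):
    """Return 'X.Y.Z' from a name like libcudart.so.X.Y.Z, or None if it does not match."""
    match = re.search(r"libcudart\.so\.(\d+\.\d+\.\d+)$", os.path.basename(path))
    return match.group(1) if match else None


def bundled_cuda_runtime():
    """Look up the CUDA runtime version bundled with the release (hypothetical helper).

    Assumption: `scram` is on the PATH and the cuda tool is configured.
    """
    libdir = subprocess.check_output(
        ["scram", "tool", "tag", "cuda", "LIBDIR"], text=True
    ).strip()
    # Same glob as the shell version: only fully versioned library names.
    for path in sorted(glob.glob(os.path.join(libdir, "libcudart.so.*.*.*"))):
        version = cuda_runtime_from_filename(path)
        if version:
            return version
    return None
```

Keeping the file-name parsing separate from the `scram` call makes the parsing testable outside a CMSSW area.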
I'd make all options lowercase and use a dash to separate words, like --gpu-version.
@@ -305,6 +334,41 @@ def runSelected(opt):
default=None,
action='store')

parser.add_option('--RequiresGPU',
help='if GPU is reuired or not: forbidden (default, CPU-only), optional, required. For relvals, the GPU option will be turned off for optional.',
reuired -> required
Also, why the special treatment for relvals?
For relvals, I assume that we always control what they should run on, with or without a (specific) GPU, so optional may not fit relvals. But this can be changed; there is no solid reason here.
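To make the discussion concrete, here is a minimal argparse sketch of the three-valued option. It assumes the lowercase, dash-separated spelling suggested earlier in the review and the `--gpu` / `--requires-gpu` spellings used in the PR validation commands; the `dest` name, the `effective_gpu` helper and its `is_relval` flag are hypothetical and only illustrate the rule stated in the help text, not the actual runTheMatrix code.

```python
import argparse

GPU_CHOICES = ("forbidden", "optional", "required")


def build_parser():
    parser = argparse.ArgumentParser(description="GPU option sketch")
    parser.add_argument(
        "--gpu", "--requires-gpu",
        dest="gpu",            # hypothetical destination name
        choices=GPU_CHOICES,
        nargs="?",             # a bare --gpu is allowed ...
        const="required",      # ... and then means "required"
        default="forbidden",   # CPU-only unless requested
        help="whether a GPU is required: forbidden (default, CPU-only), "
             "optional, required",
    )
    return parser


def effective_gpu(requested, is_relval):
    """Apply the relval rule from the help text: 'optional' is turned off for relvals."""
    if is_relval and requested == "optional":
        return "forbidden"
    return requested
```

With this layout, `--gpu` on its own and `--requires-gpu required` parse to the same value, and the relval special case lives in one small function that is easy to change if the policy is revisited.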
dest='CUDADriverVersion',
default='')

parser.add_option('--CUDARuntimeVersion',
I'll double check, but I don't think this should ever be set on the client side?
Besides the |
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b0c6cc/18586/summary.html
Comparison Summary
The workflows 140.53 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons.
Summary:
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35263/25256
Pull request #35263 was updated. @jordan-martins, @makortel, @chayanit, @bbilin, @wajidalikhan, @cmsbuild, @AdrianoDee, @srimanob, @kskovpen, @fwyzard can you please check and sign again. |
@cmsbuild please test |
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b0c6cc/18634/summary.html
Comparison Summary
Summary:
sure |
+heterogeneous |
+Upgrade |
@cms-sw/pdmv-l2 Do you have any comment? |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
PR description:
The original PR/discussion is #33057 and dmwm/WMCore#10388 from the WMCore side.
Information on GPU support from WMCore side: https://github.com/dmwm/WMCore/wiki/GPU-Support
This PR is to add GPU workflow to runTheMatrix. In addition, this PR migrates from optparse to argparse, with code clean up.
What to be done with this PR:
- (Done) CUDARuntime for each release, to be the default value? @makortel @fwyzard @smuzaffar
- (Done) GPUParams as expected from WMCore? I just use jsons.dump(the dictionary) as suggested. @amaltaro (Confirmed)
- (Pending)
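As an illustration of the GPUParams point, the dictionary could be serialised to a JSON string along these lines. This is a sketch under assumptions: it uses the standard-library `json.dumps`, the helper `build_gpu_params` is hypothetical, the key names are taken from the identifiers that appear in this PR (CUDARuntime, CUDADriverVersion, CUDARuntimeVersion), and the exact set of keys WMCore expects should be checked against the GPU-Support wiki linked above.

```python
import json


def build_gpu_params(cuda_runtime, driver_version="", runtime_version=""):
    """Return a JSON string for GPUParams (hypothetical helper, illustrative keys)."""
    params = {"CUDARuntime": cuda_runtime}
    # Only include the optional entries when a value was actually given.
    if driver_version:
        params["CUDADriverVersion"] = driver_version
    if runtime_version:
        params["CUDARuntimeVersion"] = runtime_version
    # sort_keys keeps the string stable across runs, which helps when comparing requests.
    return json.dumps(params, sort_keys=True)
```

The result is a plain string, so it can be stored next to the RequiresGPU value in the request dictionary and decoded again with `json.loads` on the receiving side.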
Thanks.
FYI @dpiparo @davidlange6 @justinasr
PR validation:
Using
runTheMatrix.py --what upgrade -l 11650.502 -t 8 --requires-gpu required --wm init
or
runTheMatrix.py --what upgrade -l 11650.502 -t 8 --gpu --wm init
you will get
if this PR is a backport please specify the original PR and why you need to backport that PR:
This PR is not a backport.