Fix crashing PixelOnlyGPU workflows (add missing CUDAServices) #31333
The code-checks are being triggered in jenkins.
assign heterogeneous
please test workflows 136.885502,136.888502,10824.502,10842.502,11634.502,11650.502
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31333/18093
The tests are being triggered in jenkins.
A new Pull Request was created by @silviodonato (Silvio Donato) for master.
It involves the following packages:
RecoVertex/BeamSpotProducer
@perrotta, @makortel, @slava77, @christopheralanwest, @tocheng, @cmsbuild, @tlampen, @jpata, @fwyzard, @pohsun can you please review it and eventually sign? Thanks.
cms-bot commands are listed here
def _addCUDAServices(theProcess):
    theProcess.load("HeterogeneousCore.CUDAServices.CUDAService_cfi")

modifyRecoVertexBeamSpotProducerBeamSpotAddCUDAService_ = gpu.makeProcessModifier( _addCUDAServices )
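For readers unfamiliar with the pattern in the snippet above, here is a minimal sketch of what a process modifier does: the wrapped function runs on the process only when the modifier is active. `ToyProcess`, `ToyModifier`, and `build_process` are hypothetical stand-ins written for this illustration; they are not the CMSSW classes and only mimic the behaviour the snippet relies on.

```python
# Toy stand-ins for illustration only; these are NOT the CMSSW classes.
class ToyProcess:
    def __init__(self):
        self.loaded = []

    def load(self, module_name):
        # Record the configuration fragment that was loaded.
        self.loaded.append(module_name)


class ToyModifier:
    def __init__(self):
        self.active = False

    def makeProcessModifier(self, func):
        # Return a callable that applies func to the process only when active.
        def apply(process):
            if self.active:
                func(process)
        return apply


gpu = ToyModifier()


def _addCUDAServices(theProcess):
    theProcess.load("HeterogeneousCore.CUDAServices.CUDAService_cfi")


addCUDAService_ = gpu.makeProcessModifier(_addCUDAServices)


def build_process(gpu_enabled):
    # Hypothetical helper: build a process with or without the gpu modifier.
    gpu.active = gpu_enabled
    process = ToyProcess()
    addCUDAService_(process)
    return process
```

In this toy model, `build_process(False)` loads nothing, while `build_process(True)` loads the CUDAService fragment, which is the behaviour the PR wants for gpu workflows only.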
It was sort of agreed in #28575 to place this in Configuration/StandardSequences/Services_cff. Unless we want to place this in every future cfi/cff that introduces a CUDA module.
Indeed a central place would be better.
By the way, how do we deal with this for the GCC 10 builds?
The CUDAService does not build there (though maybe in principle it could), so there is no CUDAService_cfi.
> By the way, how do we deal with this for the GCC 10 builds?
> The CUDAService does not build there (though maybe in principle it could), so there is no CUDAService_cfi.
It will indeed continue to fail in GCC 10 on workflows enabling the gpu modifier (better than all workflows, but not fully satisfactory). Once we agree with the immediate-term solution in #31261, the same pattern could be applied to CUDAService as well.
On the other hand, I'd think the gpu workflows would not make much sense on GCC 10 anyway, so should we look into disabling them in GCC 10 IBs? (In which case we could leave CUDAService untouched for now.)
I think a better solution for the CUDAService could be to try and build it also on GCC 10.
With
diff --git a/HeterogeneousCore/CUDAServices/BuildFile.xml b/HeterogeneousCore/CUDAServices/BuildFile.xml
index e063a0dd4e7..945d272ba85 100644
--- a/HeterogeneousCore/CUDAServices/BuildFile.xml
+++ b/HeterogeneousCore/CUDAServices/BuildFile.xml
@@ -1,4 +1,4 @@
-<iftool name="cuda-gcc-support">
+<iftool name="cuda">
<use name="FWCore/Framework"/>
<use name="FWCore/ServiceRegistry"/>
<use name="FWCore/ParameterSet"/>
diff --git a/HeterogeneousCore/CUDAServices/plugins/BuildFile.xml b/HeterogeneousCore/CUDAServices/plugins/BuildFile.xml
index 73a760fa117..0928247e400 100644
--- a/HeterogeneousCore/CUDAServices/plugins/BuildFile.xml
+++ b/HeterogeneousCore/CUDAServices/plugins/BuildFile.xml
@@ -1,4 +1,4 @@
-<iftool name="cuda-gcc-support">
+<iftool name="cuda">
<use name="cuda"/>
<use name="DataFormats/Common"/>
<use name="DataFormats/Provenance"/>
@@ -11,6 +11,7 @@
<use name="FWCore/ServiceRegistry"/>
<use name="FWCore/Utilities"/>
<use name="HeterogeneousCore/CUDAServices"/>
+
<library file="*.cc" name="HeterogeneousCoreCUDAServicesPlugins">
<flags EDM_PLUGIN="1"/>
</library>
diff --git a/HeterogeneousCore/CUDAUtilities/BuildFile.xml b/HeterogeneousCore/CUDAUtilities/BuildFile.xml
index fc2752ff845..e7193e1ccfd 100644
--- a/HeterogeneousCore/CUDAUtilities/BuildFile.xml
+++ b/HeterogeneousCore/CUDAUtilities/BuildFile.xml
@@ -1,4 +1,4 @@
-<iftool name="cuda-gcc-support">
+<iftool name="cuda">
<use name="cuda"/>
<use name="eigen"/>
<use name="FWCore/Utilities"/>
the CUDAService can be built and loaded.
The idea is that <iftool name="cuda"> tells if we have the CUDA libraries, while <iftool name="cuda-gcc-support"> tells if we can compile the device code.
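The distinction between the two guards can be sketched with a toy model, assuming (as the thread suggests) that the GCC 10 builds provide the `cuda` tool but not `cuda-gcc-support`. The function `built_packages` and the dictionaries below are hypothetical and written only for this illustration; the real build system's logic is more involved.

```python
# Toy model for illustration only; not how the real build system is implemented.
def built_packages(guards, available_tools):
    """Return the packages whose guarding tool is available."""
    return sorted(pkg for pkg, tool in guards.items() if tool in available_tools)


# Assumption taken from the thread: on GCC 10 the CUDA libraries are present
# ("cuda"), but device code cannot be compiled ("cuda-gcc-support" is missing).
gcc10_tools = {"cuda"}

# Guard used before and after the proposed BuildFile.xml change.
before = {"HeterogeneousCore/CUDAServices": "cuda-gcc-support"}
after = {"HeterogeneousCore/CUDAServices": "cuda"}
```

Under these assumptions, the package is skipped with the `before` guard and built with the `after` guard, which matches the observation that the CUDAService can then be built and loaded.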
See #31334.
+1
Comparison job queued.
move modifier to Configuration/StandardSequences/python/Services_cff.py (c44a78a to b282750)
please test workflows 136.885502,136.888502,10824.502,10842.502,11634.502,11650.502
The tests are being triggered in jenkins.
Comparison is ready
@slava77 comparisons for the following workflows were not done due to missing matrix map:
Comparison Summary:
+1
Comparison job queued.
Comparison is ready
@slava77 comparisons for the following workflows were not done due to missing matrix map:
Comparison Summary:
+operations
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
This PR fixes the crash of workflows 136.885502, 136.888502, 10824.502, 10842.502, 11634.502, 11650.502 as reported in #31130 (comment).
This fix loads HeterogeneousCore.CUDAServices.CUDAService_cfi in RecoVertex/BeamSpotProducer/python/BeamSpot_cff.py through the gpu Modifier.
PR validation:
runTheMatrix.py -l 136.885502 works
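To repeat the validation on all six affected workflows, the command line can be assembled as in the sketch below. It assumes that `-l` accepts a comma-separated list of workflow numbers, as in the test requests in this thread; `matrix_command` is a hypothetical helper that only builds the string and does not run anything.

```python
# Workflow numbers listed in the PR description.
WORKFLOWS = [
    "136.885502",
    "136.888502",
    "10824.502",
    "10842.502",
    "11634.502",
    "11650.502",
]


def matrix_command(workflows):
    # Assumption: -l takes a comma-separated list of workflow numbers.
    return "runTheMatrix.py -l " + ",".join(workflows)


if __name__ == "__main__":
    print(matrix_command(WORKFLOWS))
```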