Add support for Alpaka modules in HLT configs of CMSSW_14_0_0_pre1 or higher (#85)
Conversation
Hi @missirol, thanks for this. I believe we should not allow C++ types. For ALPAKA, I thought it is of the form with a single '@', no? And why do you refer to
// pluginType itself can contain the substring "::" (e.g. plugin types with namespace specification,
// or Alpaka plugins with explicit backend selection).
// The hack below consists in replacing "::" in "name" with
// a very unlikely string without semicolons, before "name" is split by ":".
- // a very unlikely string without semicolons, before "name" is split by ":".
+ // a very unlikely string without colons, before "name" is split by ":".
Typo, to be fixed before possibly merging this PR.
Fixed in 1203e1d.
Hi Martin
I don't have a strong opinion on this, but as long as such plugin types are allowed, it's probably better to make the simplest fix needed for the GUI to handle them (at least, that was my thinking in this PR).
Yes, that's the 'portable' way (see here). There is also the non-portable way (see here). As far as I understood... (@fwyzard should correct me, obviously)
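For reference, a minimal Python sketch of the two forms as I understand them (the plugin and module names below are made up for illustration): the 'portable' form uses a single "@alpaka" suffix in the plugin type, while the non-portable form selects a backend explicitly via a namespace, which is where the "::" in plugin types comes from.

import FWCore.ParameterSet.Config as cms

process = cms.Process("TEST")

# 'Portable' form (single '@'): the plugin type carries the "@alpaka" suffix, and the
# backend can optionally be steered via the 'alpaka' PSet (left commented out here).
process.portableExample = cms.EDProducer("TestAlpakaExampleProducer@alpaka",
    # alpaka = cms.untracked.PSet(backend = cms.untracked.string("serial_sync"))
)

# Non-portable form: the backend is chosen explicitly through the namespace in the
# plugin type, hence the "::" in the plugin type.
process.explicitExample = cms.EDProducer("alpaka_serial_sync::TestAlpakaExampleProducer")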
My mistake, I meant colons: I fixed the description, and I have to fix a comment in the source code.
Yes, we can do
…nd higher
- added python/readme.md with recipe to update CMSSW dependencies of ConfDB python parser
- python libraries of relevant CMSSW packages updated to CMSSW_14_0_0_pre1
- added HeterogeneousCore/{AlpakaCore,ROCmCore} to parse configs making use of ProcessAccelerator{Alpaka,ROCm}
- added HeterogeneousCore/Common (now required by HeterogeneousCore/{CUDA,ROCm,Alpaka}Core)
- added copy of enum34 library (version of enum compatible with python2.7), now required by HeterogeneousCore/Common
- updated PythonConfigurationWriter to load/import Accelerators_cff when the configuration uses Alpaka modules and/or SwitchProducerCUDA
- fixed bug in ConfigurationTreeActions preventing to add modules to a Path/Sequence when the plugin type of the module contained "::" in its name
Force-pushed from 512f142 to 1203e1d.
Given this, could we remove the 'UnlikelyString' kludge in the PR, @missirol?
Sure, I can remove it. This means that, unrelated to Alpaka, the GUI won't allow adding instances (modules) of plugins whose type contains a "::" in its name.
With the help of @missirol I tested this PR in the following way.
cmsrel CMSSW_14_0_0_pre3 && cd CMSSW_14_0_0_pre3/src && cmsenv
git cms-merge-topic -u 43853 && git cms-checkdeps -a
scram b -j 8
hltGetConfiguration /dev/CMSSW_14_0_0/GRun \
--unprescale --output all \
--globaltag auto:phase1_2024_realistic --mc --max-events 10000 \
--input /store/mc/Run3Winter24Digi/TT_TuneCP5_13p6TeV_powheg-pythia8/GEN-SIM-RAW/133X_mcRun3_2024_realistic_v8-v2/80000/dc984f7f-2e54-48c4-8950-5daa848b6db9.root \
--customise HLTrigger/Configuration/customizeHLTforAlpaka.customizeHLTforAlpaka >& hlt_gpu.py &
edmConfigDump hlt_gpu.py >& hlt_gpu_dump.py&
git clone -b confdbv3-test https://github.com/cms-sw/hlt-confdb.git
cd hlt-confdb
ant clean
./start
hltGetConfiguration v3-test/run3:/users/musich/tests/dev/CMSSW_14_0_0/TestConfDBGUI_Alpaka/HLT/V1 \
--unprescale --output all \
--globaltag auto:phase1_2024_realistic --mc --max-events 10000 \
--input /store/mc/Run3Winter24Digi/TT_TuneCP5_13p6TeV_powheg-pythia8/GEN-SIM-RAW/133X_mcRun3_2024_realistic_v8-v2/80000/dc984f7f-2e54-48c4-8950-5daa848b6db9.root >& hlt_fromCondDB.py &
Given the above, I would be in favor of merging this PR, such that proponents could start using this version of the GUI for filling in the tickets for the
That is great! As a check, what is the diff between the file obtained in 1) and in 5)? Hmm, I do not think proponents should do the Alpaka migration in ConfDB for their area/paths individually, as this will not be synchronised and will get messy towards integration. Rather, I think STORM should do that after closing the V1.0 menu for ticket integration and before handing it to FOG. Also, we need to be prepared to revert to non-Alpaka if suddenly there are show-stoppers. Of course, proponents should still test their paths, but with the Alpaka customisation routine as advertised.
edmConfigDump hlt_fromCondDB.py > hlt_fromConfDB_dump.py
diff --color -u hlt_gpu_dump.py hlt_fromConfDB_dump.py > diff.txt

The diff is available here.
OK, I thought we would need tickets from proponents in any case. We can discuss more at the coordination meeting.
I modified the hltGetConfiguration calls used to generate the cff fragments so that they point to v3-test/run3, and I get the diff in [1], which is what I expected. With the resulting configurations, I ran the […]. I did not try to create subtables with this PR, but I do not see what would change. This PR is ready from my side.
[1]
diff --git a/HLTrigger/Configuration/python/HLT_FULL_cff.py b/HLTrigger/Configuration/python/HLT_FULL_cff.py
index c74d1549e69..988a6a43381 100644
--- a/HLTrigger/Configuration/python/HLT_FULL_cff.py
+++ b/HLTrigger/Configuration/python/HLT_FULL_cff.py
@@ -1,18 +1,17 @@
-# hltGetConfiguration /dev/CMSSW_14_0_0/HLT/V36 --cff --data --type FULL
+# hltGetConfiguration v3-test/run3:/dev/CMSSW_14_0_0/HLT/V36 --cff --data --type FULL
# /dev/CMSSW_14_0_0/HLT/V36 (CMSSW_14_0_0)
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.CUDACore.SwitchProducerCUDA import SwitchProducerCUDA
-from HeterogeneousCore.CUDACore.ProcessAcceleratorCUDA import ProcessAcceleratorCUDA
fragment = cms.ProcessFragment( "HLT" )
-fragment.ProcessAcceleratorCUDA = ProcessAcceleratorCUDA()
+fragment.load("Configuration.StandardSequences.Accelerators_cff")
fragment.HLTConfigVersion = cms.PSet(
- tableName = cms.string('/dev/CMSSW_14_0_0/HLT/V36')
+ tableName = cms.string("/dev/CMSSW_14_0_0/HLT/V36")
)
fragment.HLTIter0PSetTrajectoryBuilderIT = cms.PSet(
diff --git a/HLTrigger/Configuration/python/HLT_GRun_cff.py b/HLTrigger/Configuration/python/HLT_GRun_cff.py
index ca9e0966546..b293153a3eb 100644
--- a/HLTrigger/Configuration/python/HLT_GRun_cff.py
+++ b/HLTrigger/Configuration/python/HLT_GRun_cff.py
@@ -1,18 +1,17 @@
-# hltGetConfiguration /dev/CMSSW_14_0_0/GRun/V30 --cff --data --type GRun
+# hltGetConfiguration v3-test/run3:/dev/CMSSW_14_0_0/GRun/V30 --cff --data --type GRun
# /dev/CMSSW_14_0_0/GRun/V30 (CMSSW_14_0_0)
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.CUDACore.SwitchProducerCUDA import SwitchProducerCUDA
-from HeterogeneousCore.CUDACore.ProcessAcceleratorCUDA import ProcessAcceleratorCUDA
fragment = cms.ProcessFragment( "HLT" )
-fragment.ProcessAcceleratorCUDA = ProcessAcceleratorCUDA()
+fragment.load("Configuration.StandardSequences.Accelerators_cff")
fragment.HLTConfigVersion = cms.PSet(
- tableName = cms.string('/dev/CMSSW_14_0_0/GRun/V30')
+ tableName = cms.string("/dev/CMSSW_14_0_0/GRun/V30")
)
fragment.HLTIter0PSetTrajectoryBuilderIT = cms.PSet(
diff --git a/HLTrigger/Configuration/python/HLT_HIon_cff.py b/HLTrigger/Configuration/python/HLT_HIon_cff.py
index 4c24fbd814b..a457f9cc29a 100644
--- a/HLTrigger/Configuration/python/HLT_HIon_cff.py
+++ b/HLTrigger/Configuration/python/HLT_HIon_cff.py
@@ -1,18 +1,17 @@
-# hltGetConfiguration /dev/CMSSW_14_0_0/HIon/V30 --cff --data --type HIon
+# hltGetConfiguration v3-test/run3:/dev/CMSSW_14_0_0/HIon/V30 --cff --data --type HIon
# /dev/CMSSW_14_0_0/HIon/V30 (CMSSW_14_0_0)
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.CUDACore.SwitchProducerCUDA import SwitchProducerCUDA
-from HeterogeneousCore.CUDACore.ProcessAcceleratorCUDA import ProcessAcceleratorCUDA
fragment = cms.ProcessFragment( "HLT" )
-fragment.ProcessAcceleratorCUDA = ProcessAcceleratorCUDA()
+fragment.load("Configuration.StandardSequences.Accelerators_cff")
fragment.HLTConfigVersion = cms.PSet(
- tableName = cms.string('/dev/CMSSW_14_0_0/HIon/V30')
+ tableName = cms.string("/dev/CMSSW_14_0_0/HIon/V30")
)
fragment.HLTIter0PSetTrajectoryBuilderIT = cms.PSet(
diff --git a/HLTrigger/Configuration/python/HLT_PIon_cff.py b/HLTrigger/Configuration/python/HLT_PIon_cff.py
index 6e6d4a581e1..c171595d91c 100644
--- a/HLTrigger/Configuration/python/HLT_PIon_cff.py
+++ b/HLTrigger/Configuration/python/HLT_PIon_cff.py
@@ -1,18 +1,17 @@
-# hltGetConfiguration /dev/CMSSW_14_0_0/PIon/V30 --cff --data --type PIon
+# hltGetConfiguration v3-test/run3:/dev/CMSSW_14_0_0/PIon/V30 --cff --data --type PIon
# /dev/CMSSW_14_0_0/PIon/V30 (CMSSW_14_0_0)
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.CUDACore.SwitchProducerCUDA import SwitchProducerCUDA
-from HeterogeneousCore.CUDACore.ProcessAcceleratorCUDA import ProcessAcceleratorCUDA
fragment = cms.ProcessFragment( "HLT" )
-fragment.ProcessAcceleratorCUDA = ProcessAcceleratorCUDA()
+fragment.load("Configuration.StandardSequences.Accelerators_cff")
fragment.HLTConfigVersion = cms.PSet(
- tableName = cms.string('/dev/CMSSW_14_0_0/PIon/V30')
+ tableName = cms.string("/dev/CMSSW_14_0_0/PIon/V30")
)
fragment.HLTIter0PSetTrajectoryBuilderIT = cms.PSet(
diff --git a/HLTrigger/Configuration/python/HLT_PRef_cff.py b/HLTrigger/Configuration/python/HLT_PRef_cff.py
index 2a44fa92c0b..16184709216 100644
--- a/HLTrigger/Configuration/python/HLT_PRef_cff.py
+++ b/HLTrigger/Configuration/python/HLT_PRef_cff.py
@@ -1,18 +1,17 @@
-# hltGetConfiguration /dev/CMSSW_14_0_0/PRef/V30 --cff --data --type PRef
+# hltGetConfiguration v3-test/run3:/dev/CMSSW_14_0_0/PRef/V30 --cff --data --type PRef
# /dev/CMSSW_14_0_0/PRef/V30 (CMSSW_14_0_0)
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.CUDACore.SwitchProducerCUDA import SwitchProducerCUDA
-from HeterogeneousCore.CUDACore.ProcessAcceleratorCUDA import ProcessAcceleratorCUDA
fragment = cms.ProcessFragment( "HLT" )
-fragment.ProcessAcceleratorCUDA = ProcessAcceleratorCUDA()
+fragment.load("Configuration.StandardSequences.Accelerators_cff")
fragment.HLTConfigVersion = cms.PSet(
- tableName = cms.string('/dev/CMSSW_14_0_0/PRef/V30')
+ tableName = cms.string("/dev/CMSSW_14_0_0/PRef/V30")
)
fragment.HLTIter0PSetTrajectoryBuilderIT = cms.PSet(
diff --git a/HLTrigger/Configuration/python/HLT_Special_cff.py b/HLTrigger/Configuration/python/HLT_Special_cff.py
index 1ec07bb94a0..60741abeb74 100644
--- a/HLTrigger/Configuration/python/HLT_Special_cff.py
+++ b/HLTrigger/Configuration/python/HLT_Special_cff.py
@@ -1,18 +1,17 @@
-# hltGetConfiguration /dev/CMSSW_14_0_0/Special/V30 --cff --data --type Special
+# hltGetConfiguration v3-test/run3:/dev/CMSSW_14_0_0/Special/V30 --cff --data --type Special
Additional note.

Eventually also the SwitchProducerCUDA import would disappear once we no longer have SwitchProducers in the menu (which seem CUDA-specific)?
Also, we give a fragment.load to online? Can they handle that?
Yes.

I think they can handle that. My understanding is that the framework guarantees that loading Accelerators_cff sets up what is needed for GPU offloading with minimal overhead.
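For reference, a minimal sketch of that single statement (it needs a CMSSW environment to actually run), in the cff-fragment form shown in the diff above:

import FWCore.ParameterSet.Config as cms

# cff-fragment form, as emitted by the converter (cf. the diff above)
fragment = cms.ProcessFragment("HLT")
fragment.load("Configuration.StandardSequences.Accelerators_cff")

# the full-process form would be the analogous
#   process = cms.Process("HLT")
#   process.load("Configuration.StandardSequences.Accelerators_cff")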
I hope it's a
The menu is parsed by
@Martin-Grunewald @mmusich, please let me know if you would like to see other checks before integrating this PR. I think we should try to converge and release a new version of the ConfDB GUI this week.
OK, fine with me!
Fine with me too. Discussing with @missirol earlier today, we plan to merge and deploy this tomorrow morning.
So I had a look; it looks good. There seem to be minimal changes to ConfDB itself. I didn't test it, though. And yes, as you say, we need to have it as a new major release and deploy it to the DAQ. I can assist with that if needed.
Okay, thanks everyone. Tomorrow morning we deploy the new ConfDB version (I will have a look with Marco). If by next week we see no problems, we prepare the ticket to DAQ (never done it); @Sam-Harper, if I have doubts on this, I will ask you to help (or I will ask you to cross-check it at the end). Thanks!
Something has gone wrong in merging: this PR changes 34 files, while in the branch 35 files are changed. I think you merged an earlier incarnation where there was a change to ConfigurationTreeActions.java which was reverted (cf. the unlikely-string hack).
I merged the PR as is.
Please revert the change in ConfigurationTreeActions.java |
in the
The aim of this PR is to introduce all the changes needed for the ConfDB GUI to support HLT configurations making use of Alpaka modules, only for configurations based on the release template of CMSSW_14_0_0_pre1 or later releases.

Details on the content of the PR are given in [1]. The tests done so far using this PR are described in [2].

I have deployed a ConfDB version that includes this PR to the v3-test instance of the ConfDB converter, so that it can be tested by others, using for example hltGetConfiguration with a v3-test/run3 configuration path (as in the tests below). Since this PR changes the python output of ConfDB, current policy requires that the next version of ConfDB be propagated to DAQ as well (see CMSHLT-2347 for an example).

The first step is to review this PR, and let others test it. As to when to merge it and cut a new ConfDB version, I'm not sure (before/after the EOY break, before/after the HLT combined table moves to 14_0_X).

Attn: @Martin-Grunewald @mmusich @fwyzard @Sam-Harper
[1] Summary of changes in this PR.
- The python converter (PythonConfigurationWriter) is updated to detect the presence of Alpaka modules. If at least one Alpaka module is present, a call to import/load the content of Accelerators_cff into the process is added to the python configuration. I understood that the idea is that CMSSW guarantees that this call will set up what is needed for GPU offloading with minimal overhead. I find this easier and more future-proof than teaching ConfDB to explicitly define what is needed in the configuration (e.g. creating the necessary ProcessAccelerators explicitly).
- CMSSW_14_0_0_pre1 is chosen somewhat arbitrarily as the first (pre-)release for which ConfDB should support Alpaka modules. One reason to choose this pre-release is "Add Alpaka Process Modifier" (cmssw#43229), which makes it sufficient to load Accelerators_cff in the python output without also explicitly importing/loading ProcessAcceleratorAlpaka_cfi (which one would have to do in previous releases).
- The python/ directory is updated with the python API of relevant CMSSW packages. Alpaka-related packages are included in order to parse python configurations using Alpaka modules into ConfDB. This PR also adds a readme with a recipe for future updates of this kind.
- Updating the python/ directory based on a recent CMSSW release has the side effect of bringing in a change which is not compatible with python2 (afaiu, the python interpreter in java is compatible only with python2, and not python3). This line (from "Enforce cms python types in configurations", cmssw#41361) is not compatible with python2, so it needed to be changed (without doing so, I got an error when parsing python configurations via the GUI). I tried to implement the same using **kwargs as function argument in a way which is compatible with python2; a generic sketch of this kind of rewrite is shown right below.
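The sketch below is illustrative only: the function and its arguments are made up and it is not the actual FWCore code; it just shows the pattern of replacing a python3-only keyword-only argument with **kwargs.

# python3-only version, using a keyword-only argument:
#   def makePSet(label, *, tracked=True, **parameters):
#       ...

# python2-compatible rewrite: accept everything through **kwargs and pop the
# keyword-only option by hand.
def makePSet(label, **kwargs):
    tracked = kwargs.pop("tracked", True)  # emulates the keyword-only argument
    parameters = kwargs                    # remaining keyword arguments
    return {"label": label, "tracked": tracked, "parameters": parameters}

# usage
pset = makePSet("hltExampleModule", tracked=False, threshold=3)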
- One fix to the GUI is included in ConfigurationTreeActions. Right now (pre-PR), if one tries to use the "Open Module Inserter" to add to a Path/Sequence a module whose type contains a colon, the GUI fails to do it (it throws an exception, and the module is not added). The issue is that ConfDB internally uses strings of the form [pluginType]:[moduleLabel], then splits them by ":" to retrieve the values of [pluginType] and [moduleLabel], implicitly assuming that [pluginType] and [moduleLabel] do not include colons. This assumption is incorrect, as plugin types can often have "::" as part of their name (e.g. plugins inside a namespace, or Alpaka plugins with explicit backend selection). A hack is introduced behind the scenes to avoid this issue (see the source code), without changing what the GUI displays to users; the idea is sketched right below.
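The GUI code itself is Java; this is only a Python sketch of the string trick, with a made-up placeholder name:

PLACEHOLDER = "__DOUBLE_COLON__"  # the "very unlikely string without colons"

def splitTypeAndLabel(name):
    # name has the form "[pluginType]:[moduleLabel]", but pluginType itself may
    # contain "::" (namespaces, or Alpaka plugins with explicit backend selection)
    protected = name.replace("::", PLACEHOLDER)      # hide "::" before splitting
    pluginType, moduleLabel = protected.split(":")   # now only one ":" is left
    return pluginType.replace(PLACEHOLDER, "::"), moduleLabel

print(splitTypeAndLabel("alpaka_cuda_async::TestProducer:hltTestProducer"))
# -> ('alpaka_cuda_async::TestProducer', 'hltTestProducer')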
[2] Tests done so far.
- Re-generated existing HLT configurations with v3-test/run3, and found no problems (the configs are unchanged, modulo minor formatting changes); see missirol/cmssw@5c473df.
, and found no problems (the configs are unchanged, modulo minor formatting changes), see missirol/cmssw@5c473df.edmConfigDump testAlpakaModules_cfg.py > dump.py
(configuration fromHeterogeneousCore/AlpakaTest
), and parseddump.py
into ConfDB athltGetConfiguration v3-test/run3:/users/missirol/test/dev/CMSSW_14_0_0/hltconfdb_85/Test01/HLT/V1
. This worked. The goal was to check that parsing and python output are as expected, which they are. The output configuration itself fails at runtime withcmsRun
, because of several unrelated issues.. ConfDB does not have a template for the pluginIntProducer
, various parameters of theAlpakaTest
plugins are dropped as they do not have a default value fromfillDescriptions
, the output ofhltGetConfiguration
ignores the fact that this configuration should use aEmptySource
rather than aPoolSource
.. all these issues should not apply to realistic HLT menus with Alpaka modules.