-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minimal implementation of in-memory Geant4->VecGeom conversion with G4VG #289
base: master
Are you sure you want to change the base?
Conversation
Can one of the admins verify this patch? |
Following discussion this morning, I've submitted celeritas-project/g4vg#7 which fixes the install issue (G4VG cannot be re-found). Once that's merged I'll up this PR to use the new commit. |
Latest commits update the G4VG commit to that providing full install support, so find_package(AdePT) in client code will now work without errors. |
I updated one of the GPU machines to alma9 (in view of centos7 going EOL), which caused your jenkins build to fail. I will submit a PR for the CI to use the new OS tomorrow I hope. |
Hi Ben, I get the same |
I haven't been able to link to AdePT from an external project because of |
Latest commit should fix the
The latest commit which this MR includes for G4VG should install a |
I see, I can't find
|
|
It's installed parallel, so the structure is something like:
Check that you have those, if not, try a from-scratch build just in case it's retained the older G4VG. The only other thing is to check that you're configuring the external project with |
I see, I did try a build from scratch several times but the G4VG directory is not there, building from the latest commit I'm configuring the project with: |
Those options look fine, what gets printed when you do
|
In my case, I have everything except for the G4VG lines:
|
I am getting the right directory structure (like Ben), however, it fails to link the tests: [ 51%] Linking CXX executable ../BuildProducts/bin/test_g4vg_link This is very strange, because libG4analysis.so is in the same directory as libG4processes.so, so I don't understand why it can't link. |
@JuanGonzalezCaminero though I know that you did a fresh build, all I can think of is that the old G4VG version is still present. Could you confirm that you fully deleted the build directory (i.e. not just a "make clean") and then re-cmaked from a completely empty directory? Which CMake version are you using? @WitekPokorski that's very weird... Which Linux/gcc/CMake and Geant4 versions are you using? |
RHEL9, gcc (GCC) 12.3.0, cmake version 3.27.4, G4.11.1.2 |
Thanks! I'll set up this locally (Rocky 9, but close enough) and see if I can reproduce. |
@WitekPokorski, I think there is an issue in G4VG, but to help track this down, could you do a fresh build of AdePT (just run
and
please? Also, when you've done those, what's the output of
and
please? |
|
|
|
`[witoldp@omenrtx ~/test_g4gv/AdePT/build]$ readelf -d BuildProducts/lib64/libg4vg.so Dynamic section at offset 0x2d3088 contains 48 entries: |
Discovered in testing of integration of G4VG in apt-sim/AdePT#289, targets linking to installed G4VG may fail to link due to linker not having needed information on location of deps-of-deps of G4VG. This happens in build environments where 1. The installed dependencies (here, Geant4, VecGeom) do not have their rpath set correctly. 2. They are otherwise in a location the linker is not aware of, e.g. via ldconfig of LD_LIBRARY_PATH etc. The link error warns about this and suggests adding the path through, e.g. `-rpath-link`. CMake will always add this flag/path correctly, but only if it knows about the dependencies (even if private). The G4VG library only depends on the static Celeritas::geocel target, which does not forward its dependencies to Geant4/Vecgeom and so it does not set `-rpath-link` for these. Workaround this issue by explicitly making used Geant4 targets PRIVATE deps of the g4vg target. Refind Geant4/VecGeom dependencies when finding G4VG so these are resolved and available if the linker requires it. Add smoke test for linking that simply tries to link only the G4VG target. Confirmed to reproduce problem observed in AdePT, with the fixes above resolving it.
Thanks @WitekPokorski! This looks like it may be a GCC 12/binutils issue with increased strictness around link paths/resolving libraries through rpath/etc (I couldn't reproduce in a GCC 11 environment). I'll submit a workaround to upstream in G4VG, but in the meantime could you check if adding |
Discovered in testing of integration of G4VG in apt-sim/AdePT#289, targets linking to installed G4VG may fail to link due to linker not having needed information on location of deps-of-deps of G4VG. This happens in build environments where 1. The installed dependencies (here, Geant4, VecGeom) do not have their rpath set correctly. 2. They are otherwise in a location the linker is not aware of, e.g. via ldconfig of LD_LIBRARY_PATH etc. The link error warns about this and suggests adding the path through, e.g. `-rpath-link`. CMake will always add this flag/path correctly, but only if it knows about the dependencies (even if private). The G4VG library only depends on the static Celeritas::geocel target, which does not forward its dependencies to Geant4/Vecgeom and so it does not set `-rpath-link` for these. Workaround this issue by explicitly making used Geant4 targets PRIVATE deps of the g4vg target. Refind Geant4/VecGeom dependencies when finding G4VG so these are resolved and available if the linker requires it. Add smoke test for linking that simply tries to link only the G4VG target. Confirmed to reproduce problem observed in AdePT, with the fixes above resolving it.
Yes, on one side there is the crash in the HGCALTB application, which I didn't have time to check more in detail yet. Then, regarding the AdePT examples, I noticed that the results are not numerically identical to master (With relatively large differences in some cases I remember). This is not necessarily wrong, but we should understand where they come from since this PR shouldn't have an effect there. |
- Taken as of commit fb03216c0cd701c6aff0eb901370dad19e178503 - celeritas-project/celeritas@fb03216
Update README to explain and document use of "cuda_rdc_..." for linking to AdePT.
Provides full support for install and reuse by clients.
Some platforms, Ubuntu being the canonical case, have their linker configured to use the "as-needed" option by default. Due to the current structure of RDC split device/host code libraries, link time errors can result due to the linker ignoring one of these libraries. Explicitly add `--no-as-needed` to AdePT's host linking stage to workaround this issue. Document use in downstream projects.
- Update to latest upstream commit with fix for typo in G4VGConfig.cmake - Do not use EXCLUDE_FROM_ALL so that CMake 3.28 and newer will install G4VG and support files Expect second point to be temporary until true common G4VG as external dependency shared between Celeritas and AdePT can be established.
If explicit ALIAS targets with the final install namespace (e.g. "NS::Target" for "Target") are not created for an RDC library, the installed targets have the wrong names set for the CUDA_RDC properties. This results in both compile and (unseen) link errors because the bare target names are interpreted as raw libraries, not CMake targets. Dependencies are then not transmitted correctly, resulting in compile/link issues for missing include paths etc. Explicitly create ALIAS targets for CopCore and AdePT RDC libraries to ensure they are installed with the correct install-time target names for their `CUDA_RDC_...` properties.
@andresailer, @JuanGonzalezCaminero, @agheata, could someone take a look at why the CI is failing please? The output suggests something on the Jenkins side rather than this PR, but I'm not 100% sure. |
@drbenmorgan the CI was migrated from centos to alma, can you try to rebase this branch on the master? |
The latest build was triggered by me rebasing this onto the current master (as of 92851f7) and pushing, so the latest changes are here. |
Can you push another change? |
Thanks! Now the jenkins build runs. |
Phew, that's good to know! Things all seem to pass both on CI and locally, so could people try out in the current integration projects, or point me to the current state/repos for them please? It would be good to get this progressed/merged asap. |
@drbenmorgan the status is that linking a simple example using the RDC from this branch works indeed, but then we get a CUDA stack corruption, which we have not managed to debug yet. The corruption does not happen if linking against the master. This did not go away by increasing the stack size per thread in CUDA. @JuanGonzalezCaminero can provide a simple reproducer if you want to look. |
Hi Ben, Yes, for now the only reproducer we have is still the HGCALTB application. I described how to adapt it in my previous comment #289 (comment) |
Thanks @JuanGonzalezCaminero I'll revisit that and see if I can reproduce. |
This implements the minimal code/link to use G4VG for converting a Geant4
G4VPhysicalVolume
tovecgeom::VPlacedVolume
in-memory without an intermediary GDML file. Because of the link to CUDA-enabled VecGeom, the link structure originally trialled in #279 is also implemented, and now works properly thanks to the work done removing the old circular dependency described in that PR. Note that it's certain we'll need the--no-as-needed
trick for linkers that useas-needed
by default, but first things first. The major changes are thus:CudaRdcUtils
andFindVecGeom
modules from Celeritas to support coherent linking/device linking of VecGeom into Cuda clients.AdePTGeant4Integration::CreateVecGeomWorld(const G4VPhysicalVolume *physvol)
to take the input Geant4 volume and convert to VecGeom using G4VG's interface.Because of the linking issues seen before I haven't gone further yet in migrating the code to use
CreateVecGeomWorld
. I'd like to get a round of CI in first just to be sure there's no other issue. We then should confirm/decide how to solve theas-needed
issue. Whilst it's a bit of a blunt instrument, we probably just linkAdePT_G4_integration
with-Wl,--no-as-needed
, and perhaps enforce downstream clients to use that as well.Supersedes #279 and #285.