Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving the memory space detection outside the Engine classes for the GPU-aware logic #3929

Open
anagainaru opened this issue Nov 18, 2023 · 2 comments
Assignees

Comments

@anagainaru
Copy link
Contributor

The BP5, SST and DataMan engines are GPU aware which means they can receive GPU pointers as well as Kokkos::Views.

For Kokkos::View the memory space is updated based on the type of the View. For raw pointers, the memory space needs to be detected. There is currently no consistent way of detecting the memory space across the three engines and if not handled correctly the engine might propagate a Detect memory space until it's time to copy the memory to our internal buffers (which defaults to Host and a seg fault if the pointer was actually a device pointer). To overcome this problem, the safe way is for the application to call SetMemorySpace to Device or Host for all raw pointers. SST and DataMan are not exhaustively tested so there might be scenarios for them where the memory is not detected.

This issue is an action call to move the detection of the memory space before we reach each particular engine so all behaviors are consistent and the BP5 tests cover SST and DataMan as well.

@anagainaru anagainaru self-assigned this Nov 18, 2023
@eisenhauer
Copy link
Member

eisenhauer commented Nov 18, 2023 via email

@anagainaru
Copy link
Contributor Author

I like your solution to use MPI_Comm_split to move from BP to SST testing. I will try to update the device testing to include something similar.

I would still like to trigger the detection mechanism for memory space as early as possible. There are so many paths that engines take to move the data, I am starting to regret the decision to allow Puts/Gets on different memory space for the same variable. I am thinking of forcing just one (once you send device data to a variable, all data needs to be on the device) since I am having trouble transposing on the other side data if you don't read it on the same space.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants