Should SST's MPI dataplane be enabled when ADIOS is built in serial mode (and related questions)? #3829

Closed
eisenhauer opened this issue Sep 27, 2023 · 5 comments

@eisenhauer
Member

See comments on PR #3823.

@vicentebolea
Collaborator

@eisenhauer regarding the deadlock discussed in the linked issue, it appears to me that by default the MPI DataPlane should never call MPI_Init(). That is, either the user calls MPI_Init() or the user explicitly tells adios2 to do so. Originally, the MPI DataPlane did this because it was expected to replace every other DataPlane whenever MPI was available. However, this should not be the case, since support for MPI client/server capabilities is very spotty on cutting-edge systems (our target systems).

However, I think we should still leave the user some way to enable it if they really want ADIOS2 to call MPI_Init, maybe another CMake flag or an SST option?

What do you think?

@eisenhauer
Member Author

I'm inclined to say that SST should never call MPI_Init()... We wouldn't be able to call MPI_Finalize without added complexity. If the user had created multiple ADIOS instances (perfectly legal), then each might call MPI_Init() and bad things might happen. (I was just advising someone about the possibility of creating two ADIOS instances, one created with MPI and one without, so that they could have single-process-written files too.) If we were going to allow the possibility of SST calling MPI_Init(), we'd really have to think through all these circumstances and make sure we behaved reasonably in them. Rather than creating a flag that allows SST to call MPI_Init, I tend to think that if the user wants this to happen, they can do it themselves and use the established mechanisms. I could be convinced otherwise, but that's my initial take.

@vicentebolea
Collaborator

Note that currently the MPI DataPlane only calls MPI_Init if MPI is not already initialized.
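For readers unfamiliar with the pattern, here is a minimal sketch of an "initialize only if nobody else has" guard. It is not the SST code: `FakeMpi` and `EnsureInitialized` are hypothetical stand-ins so the example runs without an MPI installation. With real MPI, the check would be `MPI_Initialized()` and the call would be `MPI_Init()` (or `MPI_Init_thread()`).

```cpp
#include <cassert>

// Hypothetical stand-in for the MPI runtime state (assumption, not real MPI).
struct FakeMpi {
    bool initialized = false;
    int initCalls = 0;

    // Initializing twice is an error in MPI, so the stand-in asserts on it.
    void Init() {
        assert(!initialized);
        initialized = true;
        ++initCalls;
    }
};

// Guarded initialization: returns true only when this call performed the
// initialization, i.e. when the library (not the application) now "owns" it.
bool EnsureInitialized(FakeMpi &mpi) {
    if (mpi.initialized) {
        return false; // the application (or an earlier caller) already did it
    }
    mpi.Init();
    return true;
}
```

The return value is the interesting part: whoever gets `true` is implicitly responsible for the matching finalize, which is where the lifecycle questions in this thread come from.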

The argument for a flag is that some users might expect to use ADIOS2 with the MPI DataPlane without further changes to their code. That is, we could enable the MPI DataPlane in application code without asking users to make changes (such as adding an MPI_Init()). However, as you mentioned, this raises the question of what to do about MPI_Finalize. One idea is that if the user did not initialize MPI, we can assume that adios2 should take care of the MPI lifecycle; that is, we call MPI_Finalize in the destructor of the SST engine class. In principle this sounds good, but we would have to consider many use cases, which might further complicate this.

At the end of the day I am also more inclined towards your idea of keeping it simple, that is, SST never calls MPI_Init. If there is a user need for this, we can rethink how to do it.
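One possible shape of the "SST never calls MPI_Init" policy is sketched below. This is an illustration only and is not taken from any actual change in ADIOS2: the `DataPlane` enum, `ChooseDataPlane`, and both parameters are hypothetical names. The assumption is that the library merely observes whether the application initialized MPI (what `MPI_Initialized()` would report) and otherwise falls back to a non-MPI data plane.

```cpp
#include <cassert>

// Hypothetical names; not the actual SST data plane identifiers.
enum class DataPlane { Mpi, NonMpi };

// The library never initializes MPI itself. appInitializedMpi stands for the
// flag MPI_Initialized() would report when the engine is opened.
DataPlane ChooseDataPlane(bool mpiDataPlaneBuilt, bool appInitializedMpi) {
    if (mpiDataPlaneBuilt && appInitializedMpi) {
        return DataPlane::Mpi;
    }
    // Either the MPI data plane was not built, or the application never
    // called MPI_Init(): pick something that needs no MPI lifecycle at all.
    return DataPlane::NonMpi;
}

// Small usage check: a serial application never ends up on the MPI data plane.
void DemoPolicy() {
    assert(ChooseDataPlane(true, false) == DataPlane::NonMpi);
    assert(ChooseDataPlane(true, true) == DataPlane::Mpi);
}
```

With a rule like this there is no MPI_Finalize question left for the library, because it never owns the initialization.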

@eisenhauer
Member Author

Yeah, I don't think we can call MPI_Finalize in the engine destructor because the application might have created multiple engines. Subsequent ones would have noticed that MPI was initialized and not called Init again, but if we finalize when the first one gets destroyed we mess up the ones that remain. That these engines might be in multiple ADIOS instances means that we can't realistically keep a reference count or anything like that without introducing a singleton. It's just too messy.
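The failure mode described above can be reproduced in a small simulation. Everything here is hypothetical: `FakeMpi` stands in for the process-wide MPI state and `Engine` is a toy class that follows the rejected proposal (initialize if needed, finalize in the destructor if this engine did the initializing). It is not the SST engine.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for process-wide MPI state (assumption, not real MPI).
struct FakeMpi {
    bool initialized = false;
    bool finalized = false;
};

// Toy engine implementing the rejected idea: the engine that initialized MPI
// also finalizes it when it is destroyed.
class Engine {
public:
    explicit Engine(FakeMpi &mpi) : m_Mpi(mpi), m_OwnsMpi(!mpi.initialized) {
        if (m_OwnsMpi) {
            m_Mpi.initialized = true; // would be MPI_Init()
        }
    }
    ~Engine() {
        if (m_OwnsMpi) {
            m_Mpi.finalized = true; // would be MPI_Finalize()
        }
    }
    bool CanCommunicate() const { return m_Mpi.initialized && !m_Mpi.finalized; }

private:
    FakeMpi &m_Mpi;
    bool m_OwnsMpi;
};

// Returns whether a second engine is still usable after the first is gone.
bool SecondEngineSurvives() {
    FakeMpi mpi;
    auto first = std::make_unique<Engine>(mpi); // initializes "MPI"
    Engine second(mpi);                         // sees it initialized, skips Init
    assert(second.CanCommunicate());
    first.reset();                              // destructor finalizes "MPI"
    return second.CanCommunicate();
}
```

In this model `SecondEngineSurvives()` returns `false`: the second engine is left holding a finalized runtime. Fixing that would need shared ownership state across engines and ADIOS instances, which is the singleton the comment wants to avoid.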

@vicentebolea
Collaborator

fixed in #3847
