DAOS-16365 client: intercept MPI_Init() to avoid nested call #14992
Conversation
Features: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <[email protected]>
Ticket title is 'deadlock in MPI application on Aurora with libpil4dfs'
I will test that today.
Thank you very much!
The overall PR looks good to me; I just have a concurrency-management concern that is not clear to me.
 * libc functions. Avoid possible zeInit reentrancy/nested call.
 */

if (atomic_load_relaxed(&mpi_init_count) > 0) {
I am not sure I perfectly understand the _relaxed semantics, but for this test it would probably be better to use a stricter atomic memory order (from my understanding of the GCC documentation).
https://en.cppreference.com/w/cpp/atomic/memory_order
From my understanding, atomicity of the operation is already guaranteed with "memory_order_relaxed".
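For what it is worth, here is a standalone C11 illustration of that point (a hypothetical example, not DAOS code): operations tagged memory_order_relaxed are still atomic, so no increment is lost and no torn value can be observed; "relaxed" only means they impose no ordering on surrounding memory operations.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Standalone illustration, not DAOS code: relaxed atomics are still atomic
 * (no lost updates, no torn reads); "relaxed" only drops inter-thread
 * ordering guarantees on surrounding memory operations. */
static _Atomic int counter;

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Always prints 200000: no increment is lost even with relaxed order. */
    printf("%d\n", atomic_load_explicit(&counter, memory_order_relaxed));
    return 0;
}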
Indeed, you are right; I had misunderstood the following sentence from the documentation at https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync:
"__ATOMIC_RELAXED: Implies no inter-thread ordering constraints."
In the end, that page is much clearer about the different memory models used for inter-thread synchronization.
However, I still have a concern with this synchronization design pattern:
Thread A:
- executes line 1147 and the condition is false;
- is stopped by the scheduler.
Thread B:
- executes line 1037;
- starts executing line 1038 and is interrupted by the scheduler during the execution of next_mpi_init().
Thread A:
- executes line 1158 and the following lines.
From my understanding, we still have a race issue.
@knard-intel Thank you very much for your comments! I will think more about this.
The issue we encountered is in MPI applications on Aurora. The hang was caused by a deadlock from nested calls of zeInit() in the Intel Level Zero driver on the same thread.
Our goal was to avoid daos_init() being called inside MPI_Init(). All I/O requests are forwarded to dfuse.
We do not know whether we would have issues if thread A is calling daos_init() while thread B starts calling MPI_Init().
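To make the pattern under discussion concrete, here is a minimal sketch assuming the interposed MPI_Init() brackets the real call with an atomic counter. Only mpi_init_count and next_mpi_init() are names that appear in this PR's diff; the stubs, maybe_init_daos(), and everything else are hypothetical and not the actual libpil4dfs source.

#include <stdatomic.h>
#include <stddef.h>

/* Sketch only -- not the libpil4dfs source. Only mpi_init_count and
 * next_mpi_init() are names from the PR; the stubs stand in for the real
 * MPI and DAOS entry points (e.g. resolved via dlsym(RTLD_NEXT, ...)). */
static _Atomic int mpi_init_count;

static int next_mpi_init(int *argc, char ***argv)
{
    (void)argc; (void)argv;
    return 0;
}

static int daos_init_stub(void)
{
    return 0;
}

/* Interposed MPI_Init(): bump the counter so hooks that fire while the real
 * MPI_Init() runs know they are inside MPI initialization. */
int MPI_Init(int *argc, char ***argv)
{
    int rc;

    atomic_fetch_add_explicit(&mpi_init_count, 1, memory_order_relaxed);
    rc = next_mpi_init(argc, argv);
    atomic_fetch_sub_explicit(&mpi_init_count, 1, memory_order_relaxed);
    return rc;
}

/* Lazy DAOS initialization guard: skip daos_init() while inside MPI_Init()
 * and let dfuse serve the I/O instead. This closes the same-thread nested
 * zeInit() case; as discussed above, a second thread could still pass the
 * check just before another thread enters MPI_Init(). */
static void maybe_init_daos(void)
{
    if (atomic_load_explicit(&mpi_init_count, memory_order_relaxed) > 0)
        return; /* inside MPI_Init(): forward I/O to dfuse */
    daos_init_stub();
}

int main(void)
{
    maybe_init_daos();      /* not inside MPI_Init(): would initialize DAOS */
    return MPI_Init(NULL, NULL);
}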
Thanks for the explanation which makes sense to me :-)
Features: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <[email protected]>
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14992/5/execution/node/1417/log
LGTM
The only failure is a known issue: fio with libaio + fork(), due to a bug in mercury. @mchaarawi Could you please review this PR? Thank you very much!
@daos-stack/daos-gatekeeper Can we land this PR? Thank you very much!
@mchaarawi We should port this to 2.6, right?
Yes, please.
Thank you! I requested the backport to release/2.6. I will create a PR once it is approved.
We observed a deadlock in MPI applications on Aurora due to nested calls of zeInit() inside MPI_Init(). daos_init() is involved in such nested calls. This PR intercepts MPI_Init() and avoids running daos_init() inside MPI_Init(). Signed-off-by: Lei Huang <[email protected]>
…#15047) We observed a deadlock in MPI applications on Aurora due to nested calls of zeInit() inside MPI_Init(). daos_init() is involved in such nested calls. This PR intercepts MPI_Init() and avoids running daos_init() inside MPI_Init(). Signed-off-by: Lei Huang <[email protected]>
We observed a deadlock in MPI applications on Aurora due to nested calls of zeInit() inside MPI_Init(). daos_init() is involved in such nested calls. This PR intercepts MPI_Init() and avoids running daos_init() inside MPI_Init().
Features: pil4dfs
Required-githooks: true
Before requesting gatekeeper:
- Features: (or Test-tag*) commit pragma was used, or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: