Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/pingpong: close mr after ep close #10640

Merged
merged 1 commit into from
Dec 17, 2024
Merged

Conversation

shijin-aws
Copy link
Contributor

pingpong doesn't support FI_MR_ENDPOINT today,
so the mr is associated with domain instead of ep. It is unsafe to close mr before closing ep because it can cause an EBUSY error when there are outstanding recvs of the mr posted to the ep/qp. This patch fixes this issue by moving the mr close after the ep close.

pingpong doesn't support FI_MR_ENDPOINT today,
so the mr is associated with domain instead of ep.
It is unsafe to close mr before closing ep because
it can cause an EBUSY error when there are outstanding
recvs of the mr posted to the ep/qp. This patch fixes
this issue by moving the mr close after the ep close.

Signed-off-by: Shi Jin <[email protected]>
@shijin-aws
Copy link
Contributor Author

I can implement a more comprehensive change like #8119 for this, but to my knowledge we likely won't support FI_MR_ENDPOINT for fi_pingpong, so I just make this simple change.

@aingerson
Copy link
Contributor

I don't necessarily have an objection to this change but I'm trying to understand what outstanding operations this addresses - the finalize calls a send with FI_TRANSMIT_COMPLETE to consume the last rx and waits for the matching receive from the other side. Shouldn't all sends and receives be done with the MRs at the point of closing them?

@shijin-aws
Copy link
Contributor Author

shijin-aws commented Dec 17, 2024

@aingerson there is 1 extra fi_recv posted in this test: https://github.com/ofiwg/libfabric/blob/main/util/pingpong.c#L2216-L2222 for dgram ep type, so there is always 1 extra fi_recv than the sends it will actually get. There is already 1 fi_recv after enable https://github.com/ofiwg/libfabric/blob/main/util/pingpong.c#L1558-L1559 which matches the send in the finalize call.

I don't fully understand why dgram pingpong has to be implemented in this way, especially fabtests fi_dgram_pingpong has such extra recv as well. I tried to remove such extra fi_recv, but then I will frequently get a cq read timeout.

@shijin-aws
Copy link
Contributor Author

Is Intel CI failure related?

@aingerson
Copy link
Contributor

@shijin-aws No, they were weird cluster issues. We don't test fi_pingpong in our CI. You can merge whenever you want, thanks!

@shijin-aws
Copy link
Contributor Author

@shefty @j-xiong do you have any background for that 1 extra recv for dgram pingpong?

@shefty
Copy link
Member

shefty commented Dec 17, 2024

dgram_pingpong has this comment:

	/* Post an extra receive to avoid lacking a posted receive in the
	 * finalize.
	 */

The early receive likely handles a race where the sender sends a datagram prior to the receive being posted. You likely don't need an 'extra' receive, but one which is posted early. The very last receive which is posted could probably be removed, except it's hard to know which 'fi_recv' will actually be the last one.

@shijin-aws
Copy link
Contributor Author

@shefty Thanks! I will keep such extra recv as is for now, and adjust the sequence of mr close in pingpong to be consistent with dgram_pingpong for non-FI_MR_ENDPOINT case.

@shijin-aws shijin-aws merged commit 17d9cf2 into ofiwg:main Dec 17, 2024
12 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants