[BUG] UCX error "Message truncated" observed with UCX 1.11 RC in Q77 NDS #2892
We are seeing an error in Q77 NDS at 3TB:
I used these UCX settings:
This error only happens with the latest 21.08 nightly (rapids-4-spark_2.12-21.08.0-20210708.152651-40.jar) build and UCX 1.11 RC (https://github.com/openucx/ucx/releases/tag/v1.11.0-rc3).
Reverting UCX to 1.10.1 (https://github.com/openucx/ucx/releases/tag/v1.10.1) works. So this appears to be a regression with UCX 1.11, or at least a bad interplay between JUCX 1.11.0 (as opposed to JUCX 1.11.0-RC3) and the native bits.
UCX 1.11 allows logging, and the logs are just showing:
I am checking the other side of the connection, and I don't see a send of 8230 bytes, so I think 8230 is a fragment size (i.e. we may be falling back to fragment-based copies for such tiny buffers).
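For context on what the error means in general: UCX reports "Message truncated" (status `UCS_ERR_MESSAGE_TRUNCATED`) when an incoming message is larger than the buffer posted to receive it. The sketch below is a toy model of that condition in plain Python, not UCX or JUCX code. The names `PostedReceive`, `deliver`, and `MessageTruncated` are hypothetical, and the 4096-byte capacity is an invented value. Only the 8230-byte length comes from this report.

```python
from dataclasses import dataclass


class MessageTruncated(Exception):
    """Toy stand-in for UCX's UCS_ERR_MESSAGE_TRUNCATED status."""


@dataclass
class PostedReceive:
    """A receive posted by the application (hypothetical model)."""
    tag: int
    capacity: int  # bytes available in the receive buffer


def deliver(recv: PostedReceive, tag: int, payload: bytes) -> bytes:
    """Match an incoming message against a posted receive.

    Raises MessageTruncated when the payload does not fit, which is
    the general condition behind the error in this issue.
    """
    if tag != recv.tag:
        raise LookupError("no posted receive matches this tag")
    if len(payload) > recv.capacity:
        raise MessageTruncated(
            f"message of {len(payload)} bytes exceeds "
            f"receive buffer of {recv.capacity} bytes"
        )
    return payload


if __name__ == "__main__":
    # Assumed capacity; the real buffer size is not stated in the report.
    recv = PostedReceive(tag=7, capacity=4096)
    try:
        deliver(recv, tag=7, payload=b"\x00" * 8230)
    except MessageTruncated as err:
        print("truncated:", err)
```

If the fragment-size theory holds, the receiver would be matching a fragment-sized message against a receive sized for the small logical buffer, which is the mismatch modeled here. That remains a hypothesis until confirmed against the UCX logs.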