diff --git a/man/man7/fi_cxi.7 b/man/man7/fi_cxi.7
index 528787a6e1e..336b716e982 100644
--- a/man/man7/fi_cxi.7
+++ b/man/man7/fi_cxi.7
@@ -1,7 +1,7 @@
 .\"t
 .\" Automatically generated by Pandoc 2.9.2.1
 .\"
-.TH "fi_cxi" "7" "2024\-11\-20" "Libfabric Programmer\[cq]s Manual" "#VERSION#"
+.TH "fi_cxi" "7" "2024\-11\-25" "Libfabric Programmer\[cq]s Manual" "#VERSION#"
 .hy
 .SH NAME
 .PP
@@ -442,53 +442,25 @@ hybrid RX match modes increase Request buffer space using the variables
 \f[I]FI_CXI_REQ_*\f[R].
 .SS Message Ordering
 .PP
-The CXI provider supports the following ordering rules:
-.IP \[bu] 2
-All message Send operations are always ordered.
-.IP \[bu] 2
-RMA Writes may be ordered by specifying \f[I]FI_ORDER_RMA_WAW\f[R].
-.IP \[bu] 2
-AMOs may be ordered by specifying
-\f[I]FI_ORDER_AMO_{WAW|WAR|RAW|RAR}\f[R].
-.IP \[bu] 2
-RMA Writes may be ordered with respect to AMOs by specifying
-\f[I]FI_ORDER_WAW\f[R].
-Fetching AMOs may be used to perform short reads that are ordered with
-respect to RMA Writes.
+Supported message ordering: FI_ORDER_SAS, FI_ORDER_WAW,
+FI_ORDER_RMA_WAW, FI_ORDER_RMA_RAR, FI_ORDER_ATOMIC_WAW, and
+FI_ORDER_ATOMIC_RAR.
+.PP
+Note: Any FI_ORDER_*_{WAR,RAW} are not supported.
+.PP
+Note: Relaxing the message ordering may result in improved performance.
+.SS Target Ordering
 .PP
 Ordered RMA size limits are set as follows:
 .IP \[bu] 2
 \f[I]max_order_waw_size\f[R] is -1.
-RMA Writes and non-fetching AMOs of any size are ordered with respect to
-each other.
-.IP \[bu] 2
-\f[I]max_order_raw_size\f[R] is -1.
-Fetching AMOs of any size are ordered with respect to RMA Writes and
-non-fetching AMOs.
-.IP \[bu] 2
-\f[I]max_order_war_size\f[R] is -1.
-RMA Writes and non-fetching AMOs of any size are ordered with respect to
-fetching AMOs.
-.SS PCIe Ordering
-.PP
-Generally, PCIe writes are strictly ordered.
-As an optimization, PCIe TLPs may have the Relaxed Order (RO) bit set to
-allow writes to be reordered.
-Cassini sets the RO bit in PCIe TLPs when possible.
-Cassini sets PCIe RO as follows:
-.IP \[bu] 2
-Ordering of messaging operations is established using completion events.
-Therefore, all PCIe TLPs related to two-sided message payloads will have
-RO set.
-.IP \[bu] 2
-Every PCIe TLP associated with an unordered RMA or AMO operation will
-have RO cleared.
-.IP \[bu] 2
-PCIe TLPs associated with the last packet of an ordered RMA or AMO
-operation will have RO cleared.
-.IP \[bu] 2
-PCIe TLPs associated with the body packets (all except the last packet
-of an operation) of an ordered RMA operation will have RO set.
+RMA Writes and AMO writes of any size are ordered with respect to each
+other.
+.PP
+Note: Due to FI_ORDER_*_{WAR,RAW} not being supported,
+max_order_{raw,war}_size are forced to zero.
+.PP
+Note: Relaxing the target ordering may result in improved performance.
 .SS Translation
 .PP
 The CXI provider supports two translation mechanisms: Address
@@ -1172,6 +1144,11 @@ offloading are met.
 .PP
 The CXI provider checks for the following environment variables:
 .TP
+\f[I]FI_CXI_MR_TARGET_ORDERING\f[R]
+MR target ordering (i.e.\ PCI ordering).
+Options: default, strict, or relaxed.
+Recommendation is to leave at default behavior.
+.TP
 \f[I]FI_CXI_ODP\f[R]
 Enables on-demand paging.
 If disabled, all DMA buffers are pinned.
@@ -1836,12 +1813,6 @@ if (ret)
 \f[R]
 .fi
 .PP
-When an endpoint does not support FI_FENCE (e.g.\ optimized MR), a
-provider specific transmit flag, FI_CXI_WEAK_FENCE, may be specified on
-an alias EP to issue a FENCE operation to create a data ordering point
-for the alias.
-This is supported for one-sided operations only.
-.PP
 Alias EP must be closed prior to closing the original EP.
 .SS PCIe Atomics
 .PP