prov/tcp: introduce TCP_NO_CONNECT flag #10534

ooststep · 2024-11-13T22:09:08Z

In some use cases we do not want one side of a communication to initiate connections, namely when we know that one side of the configuration is heavily restricted by a firewall. To prevent indefinite hangs in certain operations, such as RMA reads and writes, introduce a provider-specific flag that triggers an error if there is no established connection. The application can then force the connection from the other direction.
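As a rough illustration of the behavior described above, here is a minimal, self-contained sketch of a fail-fast check. It does not use libfabric headers and is not the code from this PR: `SKETCH_NO_CONNECT`, `struct sketch_peer`, and `sketch_prepare_op()` are hypothetical names invented for the example, and the flag's bit value is arbitrary. Only the idea of returning a "not connected" error instead of starting a connection comes from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed op flag; the bit value is arbitrary. */
#define SKETCH_NO_CONNECT (1ULL << 60)

/* Hypothetical per-peer state; a real provider tracks far more than this. */
struct sketch_peer {
	bool connected;        /* a connection to the peer already exists */
	bool connect_started;  /* this side has begun an outbound connect */
};

/*
 * Decide what to do before posting an operation to a peer.
 * Returns 0 if the operation may be posted or queued, or -ENOTCONN if the
 * caller asked not to connect and no connection exists yet.
 */
static int sketch_prepare_op(struct sketch_peer *peer, uint64_t op_flags)
{
	assert(peer != NULL);

	if (peer->connected)
		return 0;

	if (op_flags & SKETCH_NO_CONNECT)
		return -ENOTCONN;  /* fail fast instead of waiting on a connect */

	peer->connect_started = true;  /* default behavior: connect on demand */
	return 0;
}
```

With a peer that is not yet connected, passing `SKETCH_NO_CONNECT` yields `-ENOTCONN` and leaves `connect_started` untouched, while passing no flag starts the connection as before.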

prov/tcp/src/xnet.h

aingerson

Add to the man page as well

prov/tcp/src/xnet_rdm_cm.c

prov/tcp/src/xnet_attr.c

include/rdma/fi_ext.h

man/fi_tcp.7.md

j-xiong · 2024-11-15T21:02:59Z

prov/tcp/src/xnet.h

@@ -98,6 +98,9 @@ typedef void  xnet_profile_t;
 #define XNET_MIN_MULTI_RECV	16384
 #define XNET_PORT_MAX_RANGE	(USHRT_MAX)

+/* provider specific op flags */
+#define TCP_NO_CONNECT FI_TCP_NO_CONNECT


Any reason not to use FI_TCP_NO_CONNECT directly?

The provider-specific flags are typically defined in the provider header files without any FI prefix and used internally. I wanted a definition here as a placeholder so that it's obvious there are existing provider op flags that need to be accounted for.

@ooststep This is true when the API flag differs from the provider flag, or when the scope of the flags is different. E.g., the API flags cover a 64-bit range, but the provider flag maps to some 16-bit range in the protocol.
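To make the distinction concrete, here is a small sketch of the case where a separate provider-level definition is justified: API flags live in a 64-bit space, while the wire protocol has only a 16-bit flags field, so the provider needs its own values and a translation step. All names and bit positions below are hypothetical and chosen only for illustration; they are not taken from libfabric or the tcp provider.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical API-level flags, defined in a 64-bit space. */
#define SKETCH_API_FLAG_A (1ULL << 40)
#define SKETCH_API_FLAG_B (1ULL << 59)

/* Hypothetical wire-level flags that must fit a 16-bit header field. */
#define SKETCH_WIRE_FLAG_A ((uint16_t)(1u << 0))
#define SKETCH_WIRE_FLAG_B ((uint16_t)(1u << 1))

/* Translate API flags into wire flags; unknown API bits are dropped. */
static uint16_t sketch_api_to_wire(uint64_t api_flags)
{
	uint16_t wire = 0;

	if (api_flags & SKETCH_API_FLAG_A)
		wire = (uint16_t)(wire | SKETCH_WIRE_FLAG_A);
	if (api_flags & SKETCH_API_FLAG_B)
		wire = (uint16_t)(wire | SKETCH_WIRE_FLAG_B);

	return wire;
}
```

When no such translation is needed, as with a flag that is only ever checked locally, the API-level name can be used directly.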

There are some specific use cases where we may not want one side of communication to initiate connections, namely when we know that one side of our configuration is being heavily restricted by a firewall. To prevent indefinite hangs with certain operations, such as RMA reads and writes, introduce a provider specific flag to trigger an error if there is not already an established connection. In this case, the application can force the connection from the other direction. Signed-off-by: Stephen Oost <[email protected]>

shefty · 2024-11-27T17:41:36Z

man/fi_tcp.7.md

+*FI_TCP_NO_CONNECT*
+: This flag indicates that operations should fail if there is no
+  existing connection to the remote peer.  In such case, an FI_ENOTCONN
+  error should be expected.


I would not make this a flag that's checked on every operation (i.e., is it okay to connect for a send, but not for a write?). It makes more sense applied either to the entire RDM endpoint or to a specific peer.

In the problematic target scenario, the RDM endpoint is used for some peers where this flag is needed and for others where it is not, so applying it to the entire RDM endpoint was undesirable.

This doesn't make sense. An RDM endpoint is unconnected. Exposing low-level connection implementation details is not desirable. There could be multiple connections to the same peer. The connection might be in another process (e.g. Pony Express or SNAP or whatever it's called). It could be in the kernel (RDS).

This still isn't a per-operation flag. At best it's per peer, but even that use case seems questionable. It's like putting half of an RDM endpoint behind a firewall while the other half ignores it. Apply firewall semantics to the entire RDM endpoint. If some peers are outside the firewall and some are inside, require 2 endpoints with some sort of per-EP configuration.
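One possible reading of the two-endpoint suggestion, sketched without libfabric: the connect policy is a property of the endpoint, and the application picks the endpoint based on where the peer sits. Everything here (`struct sketch_ep`, `sketch_pick_ep()`, `sketch_ep_check()`) is a hypothetical illustration, not an existing or proposed API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint with a single per-endpoint policy knob. */
struct sketch_ep {
	bool may_connect;  /* false: never initiate a connection from this EP */
};

/* Pick the endpoint for a peer based on whether it sits behind a firewall. */
static const struct sketch_ep *
sketch_pick_ep(const struct sketch_ep *open_ep,
	       const struct sketch_ep *restricted_ep,
	       bool peer_behind_firewall)
{
	return peer_behind_firewall ? restricted_ep : open_ep;
}

/* Apply the endpoint-wide policy: 0 to proceed, -ENOTCONN to fail fast. */
static int sketch_ep_check(const struct sketch_ep *ep, bool already_connected)
{
	assert(ep != NULL);

	if (already_connected || ep->may_connect)
		return 0;
	return -ENOTCONN;
}
```

The policy is then uniform for every operation on a given endpoint, so no per-operation flag is involved.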

I think I also agree with the first part (i.e., exposing low-level connection details is a bad idea), though providing 2 types of endpoints is also difficult for the user. To provide some context on the problem we were trying to solve: in a client-server configuration where the client is behind a firewall, we have cases where, for example, client A sends a message to server A, and server A then attempts an emulated TCP RMA to client B through an fi_write() call (without a prior connection established from client B to server A). In that case, with the tcp provider, server A seems to remain stuck attempting to establish a connection, and an fi_cancel on the RMA does not seem able to complete, as it is not currently supported by the tcp provider. So what we wanted was a way to get a completion with an error returned when server A is not able to reach client B. I'm open to other means, but having to manage 2 endpoints also seems cumbersome. I would also be happy if we didn't have to expose any connection logic.

@shefty Just to add a bit of context beyond what Jerome said, this use case came from Parallelstore (Google's DAOS service). We control only the server side of the equation and rely on the user to do client configuration (as we don't control their VMs, network configuration, or processes). Opening the firewall to server-to-client connections is not common for services, so users have to do it explicitly or their writes will simply hang. We want to remove this requirement, as it is becoming a major scaling issue for onboarding new customers because users don't read documentation.

For the alternative, I would only configure the EPs at the server, as that's where the actual problem occurs. The client behavior is unchanged.

I think the problem is really on the client. The client endpoint is the one behind the firewall and can't be reached (unless the server has already connected to it). Can we encode something into the client-side URI that would indicate it can't be reached? The advantage of that approach from our end is that the client would then have complete control over telling the server whether or not it can handle the error (e.g., it can support getting back an error indicating that the server couldn't connect, and handle it appropriately).

It's the server's behavior that should change.

From the viewpoint of the client, everything works. It wants to talk to server A and can do so. That server A wants to pass off the response to some other system that the client doesn't know about is related to the storage architecture, not the client SW.

I don't think pushing this detail into the apps is the best option. But you can work around this in the client by having the client send some sort of 'hello' message to every storage server during initialization, to poke holes through the firewall. That pushes the burden onto every client app that might want to use DAOS.
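For reference, the 'hello' workaround could look roughly like the sketch below. It is transport-agnostic on purpose: `sketch_send_fn` is a hypothetical callback standing in for whatever send call the client actually uses, and `sketch_counting_send()` is only a stand-in transport for trying the loop out. None of these names exist in libfabric or DAOS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical send callback: returns 0 on success, nonzero on failure. */
typedef int (*sketch_send_fn)(const char *server, const void *buf,
			      size_t len, void *ctx);

/*
 * Send a small 'hello' to every known server during initialization so that
 * each connection is opened from the client side of the firewall.
 * Returns the number of servers for which the send succeeded.
 */
static size_t sketch_hello_all(const char *const *servers, size_t nservers,
			       sketch_send_fn send_fn, void *ctx)
{
	static const char hello[] = "hello";
	size_t reached = 0;
	size_t i;

	assert(servers != NULL && send_fn != NULL);

	for (i = 0; i < nservers; i++) {
		if (send_fn(servers[i], hello, sizeof(hello), ctx) == 0)
			reached++;
	}
	return reached;
}

/* Stand-in transport for trying the sketch: counts calls, always succeeds. */
static int sketch_counting_send(const char *server, const void *buf,
				size_t len, void *ctx)
{
	(void)server;
	(void)buf;
	(void)len;
	(*(size_t *)ctx)++;
	return 0;
}
```

The cost is visible in the loop itself: every client pays one message per storage server at startup, whether or not it ever talks to that server again.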

Alternatively, you can configure the server SW to be firewall aware, so that it avoids forwarding requests to servers not already communicating with the client.

Or, change the protocol around handling firewalls. Have server A tell the client to retry its request with server B, rather than forwarding it internally.
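The two server-side alternatives above can be sketched as a single routing decision: forward only to a server the client has already contacted, and otherwise tell the client to retry with that server itself. This is a hypothetical illustration; the `client_has_contacted` table, the `sketch_reply` structure, and the function name are assumptions made for the example, not part of DAOS or libfabric.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the first server does with a request owned by another server. */
enum sketch_action {
	SKETCH_FORWARD,   /* hand the request off internally */
	SKETCH_REDIRECT   /* tell the client to retry with the owner */
};

struct sketch_reply {
	enum sketch_action action;
	size_t retry_with;  /* index of the owning server */
};

/*
 * client_has_contacted[s] is true if the client has already opened a
 * connection to server s, so traffic from s back to the client is expected
 * to pass the firewall.
 */
static struct sketch_reply
sketch_route(const bool *client_has_contacted, size_t nservers, size_t owner)
{
	struct sketch_reply reply;

	assert(client_has_contacted != NULL && owner < nservers);

	reply.retry_with = owner;
	reply.action = client_has_contacted[owner] ? SKETCH_FORWARD
						   : SKETCH_REDIRECT;
	return reply;
}
```

In the redirect case the client opens the connection to the owning server, so that server never needs to connect toward the firewall.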

There are likely other options for this. But I would avoid picking one that encodes these details in the SW API.

It would probably be better if this thread were copied into an issue for continued discussion, rather than attaching it to this PR.

shefty · 2024-11-27T17:44:52Z

prov/tcp/src/xnet.h

@@ -98,6 +98,9 @@ typedef void  xnet_profile_t;
 #define XNET_MIN_MULTI_RECV	16384
 #define XNET_PORT_MAX_RANGE	(USHRT_MAX)

+/* provider specific op flags */
+#define TCP_NO_CONNECT FI_TCP_NO_CONNECT


@ooststep This is true when the API flag differs from the provider flag, or when the scope of the flags is different. E.g., the API flags cover a 64-bit range, but the provider flag maps to some 16-bit range in the protocol.

ooststep added ⚠️ Do not merge evaluating labels Nov 13, 2024

soumagne reviewed Nov 13, 2024

prov/tcp/src/xnet.h (outdated)

aingerson reviewed Nov 13, 2024

prov/tcp/src/xnet_rdm_cm.c (outdated)

prov/tcp/src/xnet_attr.c (outdated)

soumagne reviewed Nov 13, 2024

prov/tcp/src/xnet_attr.c (outdated)

ooststep force-pushed the tcp-no-connect branch 2 times, most recently from 1ed132e to d447e69 Compare November 14, 2024 14:09

j-xiong reviewed Nov 14, 2024

include/rdma/fi_ext.h (outdated)

man/fi_tcp.7.md (outdated)

man/fi_tcp.7.md (outdated)

man/fi_tcp.7.md (outdated)

ooststep force-pushed the tcp-no-connect branch from d447e69 to f575a3b Compare November 15, 2024 20:28

j-xiong reviewed Nov 15, 2024


ooststep force-pushed the tcp-no-connect branch from f575a3b to 95ae4b7 Compare November 21, 2024 21:08

shefty reviewed Nov 27, 2024


j-xiong mentioned this pull request Dec 13, 2024

Handling TCP traffic restricted by a firewall #10637

Open


ooststep commented Nov 13, 2024 (edited)

aingerson left a comment

j-xiong Nov 15, 2024

ooststep Nov 15, 2024 (edited)

shefty Nov 27, 2024

shefty Nov 27, 2024

ooststep Dec 11, 2024 (edited)

shefty Dec 11, 2024

soumagne Dec 11, 2024

jolivier23 Dec 11, 2024 (edited)

shefty Dec 12, 2024

jolivier23 Dec 13, 2024

shefty Dec 13, 2024

shefty Dec 13, 2024

j-xiong Dec 13, 2024

shefty Nov 27, 2024
