Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unable to detect Socket.Send failures #101210

Open
wfurt opened this issue Apr 17, 2024 · 5 comments
Open

unable to detect Socket.Send failures #101210

wfurt opened this issue Apr 17, 2024 · 5 comments

Comments

@wfurt
Copy link
Member

wfurt commented Apr 17, 2024

This is related to #99181 and #99490.
I wanted to write some test that fail predictably and I failed to do so.
So I decided to open this to capture some conversation
Let consider this

   [Fact]
   public async Task SendAsync_WithIncompleteData_Throws()
   {
       using (var listen = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
       {
           listen.SendBufferSize = 100;
           listen.ReceiveBufferSize = 100;
           listen.Bind(new IPEndPoint(IPAddress.Loopback, 0));
           listen.Listen();

           int size = listen.SendBufferSize;

           var client = new Socket(SocketType.Stream, ProtocolType.Tcp);
       //    client.Blocking = false;
           client.SendBufferSize = 100;
           client.ReceiveBufferSize = 100;
  
          // take default ReceiveBufferSize and double it to make sure sending data do not fit into it
           var sendBuffer = new byte[100_100];
           var receiveBuffer = new byte[sendBuffer.Length];

           Task<Socket> serverTask = listen.AcceptAsync();

           Task t = client.ConnectAsync(listen.LocalEndPoint);
           Socket serverSocket = await serverTask;
           await t;
           serverSocket.ReceiveBufferSize = 100;
           serverSocket.SendBufferSize = 100;
           serverSocket.Dispose();

           int sentBytes = await client.SendAsync(sendBuffer);
           Assert.Equal(sendBuffer.Length, sentBytes);
           client.Shutdown(SocketShutdown.Both);
           client.Close();
       }
   }

when I run it on my Windows 11 box everything finishes without error.
But looking at packet capture, we clearly did not sent 100K worth of data.
image

Same code throws on Linux. I thought it may be just timing so I did somewhat more elaborate test

[Fact]
public async Task ConnectAsync_WithIncompleteData_Throws()
{
    using (var listen = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
    {
        listen.SendBufferSize = 3000;
        listen.ReceiveBufferSize = 3000;

        listen.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listen.Listen();

        int size = listen.SendBufferSize;

        var client = new Socket(SocketType.Stream, ProtocolType.Tcp);
        client.Blocking = false;
        client.SendBufferSize = 3000;
        client.ReceiveBufferSize = 3000;


        Task<Socket> serverTask = listen.AcceptAsync();
        Task t = client.ConnectAsync(listen.LocalEndPoint);
        Socket serverSocket = await serverTask;
        await t;

        var sendBuffer = new byte[client.SendBufferSize +  serverSocket.ReceiveBufferSize];
        var receiveBuffer = new byte[sendBuffer.Length];

        int sent = 0;
        new Span<Byte>(sendBuffer, 0, 100).Fill(65);
        while (true)
        {
            try
            {
                client.Send(new ReadOnlySpan<Byte>(sendBuffer, 0, 100));
                sent += 100;
            }
            catch (SocketException ex) when (ex.SocketErrorCode == SocketError.WouldBlock)
            {
                break;
            }
        }
        // read some data to free TCP Window 
        int receivedLength = serverSocket.Receive(new Span<Byte>(receiveBuffer, 0, 100));
    
        new Span<Byte>(sendBuffer, 0, 100).Fill(66);
        client.Send(new ReadOnlySpan<Byte>(sendBuffer, 0, 100 ));
        client.Shutdown(SocketShutdown.Both);
        client.Close();
    }
}

this test is constructed to fill up socket buffers and TCP Window. The last Send succeeds as data are written to Socket buffer but not transmitted on the wire and ACKed back. (e.g. the 'B' block never goes out

image

Linux man page is pretty specific about it

  Dealing with error returns from close()
       A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close()  that  releases
       the open file description.  Failing to check the return value when closing a file may lead to silent loss of data.  This can especially be observed with NFS and with disk quota.

       Note, however, that a failure return should be used only for diagnostic purposes (i.e., a warning to the application that there may still be I/O pending or there may have been failed I/O)
       or remedial purposes (e.g., writing the file once more or creating a backup).

       Retrying the close() after a failure return is the wrong thing to do, since this may cause a reused file descriptor from another thread to be closed.  This can  occur  because  the  Linux
       kernel always releases the file descriptor early in the close operation, freeing it for reuse; the steps that may return an error, such as flushing data to the filesystem or device, occur
       only later in the close operation.

       A careful programmer who wants to know about I/O errors may precede close() with a call to fsync(2).

it seems like our API does not allow us to check for the error and therefore I don't know how Socket users would know that data written to Socket or NetworkStream were actually lost.

From what I can see this looks like silent data loss.
cc @tmds @antonfirsov @stephentoub for comments.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Apr 17, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@tmds
Copy link
Member

tmds commented Apr 18, 2024

The last Send succeeds as data are written to Socket buffer but not transmitted on the wire

To have the bytes transmitted immediately you should set Socket.NoDelay = true.

it seems like our API does not allow us to check for the error and therefore I don't know how Socket users would know that data written to Socket or NetworkStream were actually lost.

Hypothetically speaking: if socket send would need to report the error, it would have to block for at least a round trip to receive an acknowledgement from the peer. Also, this acknowledgement would be sent by the peer's kernel, so this doesn't tell the sender anything about how the peer is handling the data.

What happens instead is that the acknowledgement is the responsibility of the application level protocol through replying back. For example: client sends HTTP POST, server sends back 200 OK.

@wfurt
Copy link
Member Author

wfurt commented Apr 22, 2024

The last Send succeeds as data are written to Socket buffer but not transmitted on the wire

To have the bytes transmitted immediately you should set Socket.NoDelay = true.

That will not help. The test is purposely structured in a way that TCP windows is filled (as server does not read data) so we can test behavior when data cannot be transmitted. I could not find any existing tests that would cover that behavior.

@tmds
Copy link
Member

tmds commented Apr 23, 2024

The test is purposely structured in a way that TCP windows is filled (as server does not read data) so we can test behavior when data cannot be transmitted.

Synchronous Send blocks until data can be transmitted and SendAsync will wait until poll says the socket is writable. In both cases, the application is blocked until it can write more data (or the connection is closed).

To stop the operation earlier users can set SendTimeout/ReceiveTimeout (on sync api) or pass a CancellationToken (on async API).

One way to know when the TCP window is full is calling Send with Blocking to false. That will SocketError.WouldBlock.

@liveans
Copy link
Member

liveans commented Jul 2, 2024

Triage: Moving to the future, we will back to this once time permits.

@liveans liveans added this to the Future milestone Jul 2, 2024
@liveans liveans removed the untriaged New issue has not been triaged by the area owner label Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants