
http: replace destroySoon with socket.end #36205

Open · wants to merge 9 commits into base: main
Conversation

@mmomtchev (Contributor)

Closing a TCP connection requires 2 FIN/ACK
exchanges, so we should wait for both events
before destroying a socket

Fixes: #36180

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • [?] tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot (Collaborator)

Review requested:

  • @nodejs/net
  • @nodejs/quic

@nodejs-github-bot nodejs-github-bot added the net Issues and PRs related to the net subsystem. label Nov 20, 2020
lib/net.js: two outdated review threads (resolved)
@mcollina (Member) left a comment

lgtm

@ronag (Member) left a comment

I think this can be resolved by replacing destroySoon() with end(). The suggested changes look to me like a roundabout way to achieve the same thing.

@lpinca (Member) commented Nov 21, 2020

I think this kind of defeats the point of socket.destroySoon().

  • Wouldn't it be better to make res.end() call socket.end() instead of changing socket.destroySoon()?
  • What happens if the other peer never sends EOF? The socket will be kept open indefinitely no? Even if res.end() was called.

@lpinca (Member) commented Nov 21, 2020

I totally agree with @ronag. I also think that this might introduce a DoS vulnerability as per my previous comment.

@mcollina (Member) left a comment

Adding my -1 to the list.

@addaleax (Member)

What happens if the other peer never sends EOF? The socket will be kept open indefinitely no? Even if res.end() was called.

Maybe I’m not understanding correctly, but this is how sockets work in general? You can receive data on them until you get EOF.

I think what’s missing here is a check for the allowHalfOpen, but conceptually this is correct otherwise.

@lpinca (Member) commented Nov 21, 2020

Maybe I’m not understanding correctly, but this is how sockets work in general? You can receive data on them until you get EOF.

Yes, but in the case of HTTP I no longer care about the readable side once response.end() has been called and, in fact, all incoming data is discarded, and the socket destroyed as soon as the last chunk has been written.

I think this opens the risk of waiting for unresponsive clients after response.end() is called.

@mmomtchev (Contributor, Author)

What happens if the other peer never sends EOF? The socket will be kept open indefinitely no? Even if res.end() was called.

Maybe I’m not understanding correctly, but this is how sockets work in general? You can receive data on them until you get EOF.

I think what’s missing here is a check for the allowHalfOpen, but conceptually this is correct otherwise.

If the peer never sends an EOF, yes it can continue sending data as long as it wishes. It is not a very good attack vector, since it is only marginally better than simply sending unsolicited data, but still, some firewalls will block a half-open connection after a fixed timeout.
@ronag, I think that maybe the best solution would be to have a safeDestroy or drainAndDestroy function that does the same as autoDestroy, but can be triggered on demand.

@mmomtchev (Contributor, Author)

I think this can be resolved by replacing destroySoon() with end(). The suggested changes look to me like a roundabout way to achieve the same thing.

What is net.Socket.end()? From what I found, it goes directly to Stream.end(), and it won't drain the output side or do a shutdown before close?

@ronag (Member) commented Nov 21, 2020

Stream.end should not autoDestroy until drained. It should do exactly what you want as far as I understand.

@mmomtchev (Contributor, Author)

Stream.end should not autoDestroy until drained. It should do exactly what you want as far as I understand.

I can't find it, where is it?

@ronag (Member) commented Nov 21, 2020

See the autoDestroy logic in Writable. It checks for endEmitted.
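Paraphrased, the condition being referred to looks roughly like this (a sketch, not verbatim Node internals; the function name here is invented):

```javascript
// Hedged paraphrase of Writable's autoDestroy decision: after the
// writable side finishes, only destroy the stream if there is no
// readable side at all, or the readable side has already emitted 'end'.
function maybeAutoDestroy(stream) {
  const w = stream._writableState;
  const r = stream._readableState;
  if (w.autoDestroy && w.finished && (!r || r.endEmitted)) {
    stream.destroy();
  }
}
```

So for a Duplex such as net.Socket, .end() alone does not destroy anything until the peer's EOF has also been seen.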

@mmomtchev (Contributor, Author)

See the autoDestroy logic in Writable. It checks for endEmitted.

There is no shutdown here? It just checks if the readable is finished?

@mmomtchev (Contributor, Author)

Yes, I confirm, I just traced it with .end - there is no shutdown - now I have a close before the shutdown

@ronag (Member) commented Nov 21, 2020

I don't know anything about shutdown. I'm just talking about streams and autoDestroy in the context of the changes in this PR.

@ronag (Member) commented Nov 21, 2020

I would suggest you change .destroySoon() to .end() and if that does not do what you want try to trace what streams are doing and when net.Socket._final and/or net.Socket._destroy is called. Also make sure that it's only autoDestroy that calls net.Socket.destroy().

@ronag (Member) commented Nov 21, 2020

What version of Node are you testing on? master?

@ronag (Member) commented Nov 21, 2020

there is no shutdown - now I have a close before the shutdown

From what I can see this should not be possible, _final which calls shutdown should always happen before _destroy and never in the other order. If that is not true then we have a bug.

@mmomtchev (Contributor, Author)

server.close() provokes it - this is the old situation

I suggest we do it like this for the time being, because this must have been the original system - and then maybe try to devise a better method

server.close() is delayed by server.emitCloseIfDrained() - and Stream.end() does not hold it back - one has to go through the whole chain to see what the reason is

@mmomtchev (Contributor, Author)

No, wait, server.close() closes the listen()ing file descriptor - this should have no effect on the already accept()ed connection
I will check this tomorrow

@mmomtchev (Contributor, Author)

@addaleax In fact, I think that this was already the case - there was no race condition, it is only now that I realize it - server.close does not wait and does not need to wait - it simply closes the binding on the port

@mmomtchev (Contributor, Author) commented Nov 21, 2020

@ronag, what is the point of Writable.end()? What does it mean? Is it called only on a Duplex?

@ronag (Member) commented Nov 21, 2020

@ronag, what is the point of Writable.end()? What does it mean? Is it called only on a Duplex?

It makes sure _final is called.

@mmomtchev (Contributor, Author)

@ronag, what is the point of Writable.end()? What does it mean? Is it called only on a Duplex?

It makes sure _final is called.

Yeah, but how is it called? Is it only for a Duplex? Normally, you don't have an end on a solo Writable?

@lpinca (Member) commented Dec 3, 2020

What I have in mind is checking whether the socket is a tls.Socket or net.Socket when using response.end(), and using socket.end() + a timeout, or socket.destroySoon(), accordingly. I don't know if it is viable or if it will make everything unnecessarily more complicated.

After discussions with @ronag and @lpinca, the
best solution to the TLS RST problem is the one
that limits the new behaviour to TLSSockets.

This is complementary to @vtjnash's solution, which
allows the client to ignore this (somewhat normal) RST.
Both are needed, since both ends of Node.js TLS code
must interact with other TLS implementations.

Refs: nodejs#36205
@mmomtchev (Contributor, Author)

After discussions with @ronag and @lpinca, the best solution to the TLS RST problem is the one that limits the new behaviour to TLSSockets.

I restored destroySoon to its previous behavior and added an override in TLSSocket.

This is complementary to @vtjnash's solution, which allows the client to ignore servers that send this (somewhat normal) RST.
Both are needed, since both ends of Node.js TLS code must interact with other TLS implementations.

lib/_tls_wrap.js (outdated)

TLSSocket.prototype.destroySoon = function() {
  if (this.writable)
    this.end();
@ronag (Member) commented Dec 9, 2020

I think this.end() is enough here. The code below is possibly redundant (at least in Node 14+).

Member

Also should we have a timeout waiting for 'close' here? @lpinca

Contributor Author

Normally, the kernel will expire sockets stuck in the FIN_WAIT_2 state; this has always been a problem, and there was an old DoS attack that starved the system of sockets by deliberately not closing them.

// If the server immediately destroys its socket, this data will trigger a
// RST packet in response
// https://github.com/nodejs/node/issues/36180
TLSSocket.prototype.destroySoon = Writable.prototype.end;
Member

Do we need a timeout for 'close' here? @lpinca

Member

I think it makes sense.

Contributor

I don't think you want a timeout here. The underlying socket will already have a timeout (from the time the last packet was received). And if it's slowly trickling in data, any hard timeout could break the connection prematurely.

@lpinca (Member) commented Dec 9, 2020

Wouldn't this allow a Slowloris-like attack? To clarify, I think it would make sense to have a timeout on tlsSocket.destroySoon() if it is changed to be an alias for tlsSocket.end(), but not on tlsSocket.end() itself, which should work like netSocket.end() for feature parity.
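Sketched, the guarded alias being discussed might look like this (illustrative only; the helper name and the 30-second default are invented here, not proposed API):

```javascript
// Hypothetical destroySoon replacement: start a graceful end(), but
// destroy outright if the peer never completes the close in time,
// limiting how long a slow or malicious peer can hold the socket open.
function endWithDeadline(socket, ms = 30000) {
  socket.end();
  const timer = setTimeout(() => socket.destroy(), ms);
  timer.unref();                       // don't keep the event loop alive
  socket.once('close', () => clearTimeout(timer));
}
```

A well-behaved peer that completes the close cancels the timer; only an unresponsive one gets hard-destroyed.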

Contributor

Yes, that seems quite possible. As I mentioned elsewhere, I'm opposed to the general implementation in this PR. At a minimum, it needs to respect the allowHalfOpen / autoDestroy flags. But it also needs to be implemented at the point in the TLS code that is generating this event, and not by disregarding the explicit call to destroy or close it.

Additionally, I expect that, in principle, as soon as this function returns, a sufficiently smart garbage collector could observe that there are no remaining useful references to this TLS socket (we were just requested to destroy the socket for reading and writing, so there's nothing that should generate a new callback event), and thus the underlying socket should be expected to be finalized and then immediately destroyed as soon as the call to end returns. That would still cause the RST packet this PR was supposedly going to prevent. It's probably not hard to fool the garbage collector into keeping the memory reference live (though who is ever going to clean up this socket in that case?), but I worry that this PR may thus just create new problems without actually addressing existing ones.

Contributor Author

Normally, all three major OSes have a built-in default timeout for sockets in FIN_WAIT_2 state (first FIN sent and then ACK from remote) - it is usually about 60-120 seconds.
For sockets in FIN_WAIT_1 state (first FIN sent, waiting for first ACK), the much longer default TCP timeout applies - that would be 300s by default on Linux.
As for the Slowloris attack, I don't see how this change could help an attacker? A Slowloris attack exploits the timeouts by always sending just enough data to keep them from triggering. The most advantageous timeout for this would still be the default TCP ACK timeout - be it by withholding the ACK on a data packet or a FIN packet. In fact, FIN would be a little bit less practical, because you can delay it only once - the connection will be closed afterwards.

As for the event that triggers (or not) the RST, it is not the destruction of the socket object by the garbage collector; it is the explicit call to the kernel close() syscall by libuv's uv_tcp_close(). If any remote data is received after this call, an RST is sent back.

@vtjnash (Contributor) commented Dec 9, 2020

I'm strongly opposed to this. The conversation is long, so it's hard to respond to specific comments, but I'll try to respond to many.

First a comment on bugs though, what the TLS wrap client code currently does is NOT a simultaneous close of the TCP level—that’s the bug I’m proposing fixing in #36111. The server side (this code) currently assumes a simultaneous close (allowed, per TLS spec), while the client (other) side assumes that the server will do a graceful shutdown of the TCP socket after the TLS socket is destroyed (allowed, per TLS spec, but only if the application layer handles it independently). All types of normal close involve first an active close (except for broken connection timeout failures). What you're proposing would be for the server to additionally wait for a TCP graceful shutdown to complete after the TLS connection is already satisfactorily destroyed. Not all of these state transitions are represented in the diagram you sent earlier (https://users.cs.northwestern.edu/~agupta/cs340/project2/TCPIP_State_Transition_Diagram.pdf), so I wanted to clarify some terminology. For reference again too, here's the relevant paragraphs in the TLS spec for a proper close: https://tools.ietf.org/html/rfc5246#section-7.2.1

If the peer never sends an EOF, yes it can continue sending data as long as it wishes

This is true of non-TLS sockets using TCP, but is explicitly prohibited by the TLS specification above. As soon as the peer receives or sends a close-notify, it must immediately stop sending data: the allowHalfOpen option is required to be false on the TLS wrapper (it can be set true on the underlying socket, but it's never necessary, since any state changes it causes should be ignored by any implementation of the TLS protocol). Graceful shutdown is not defined in the TLS protocol: the definition of it at the higher application level is thus required—for https that means the Content-Length header is needed (or chunked)—and the TLS spec recommends ignoring state changes at the lower socket level—for TCP, that means it should not alter the connection state based upon unencrypted packets (RST, FIN, etc).

In the old TLS specification, it was non-conforming to process a FIN or RST packet as an EOF (e.g. to process any state change notification from the TCP socket). In the current spec, it's permitted to treat a EOF on the underlying socket as a TLS close-notify. As such, this is a misreading of the spec:

TLS 1.0 specifies a mandatory simultaneous close after exchanging two close-notify messages.
As this was one of the most disrespected aspects of the specification (most implementations did force-close the socket), they made it optional in TLS 1.1, where one-sided closing is officially supported - the other side must ignore everything sent after the close-notify.
So in theory:
  • The old behavior was correct for TLS 1.0
  • The new one is correct for TLS 1.1
  • HTTP is undefined
In practice:
  • The new one does not get RSTs
  • The old one did

In TLS, both simultaneous close and active/passive close of the TCP layer are permitted (in neither case is it necessary to wait for the message exchanges). The underlying socket is permitted to be closed immediately after sending a close-notify message (the exact timing is explicitly undefined in the spec). Thus many implementations (including nodejs) may call the equivalent of forceClose immediately to optimize cleanup. The spec requires the other side to always ignore RST and FIN packets anyways (since they’re unencrypted), so it's not important to wait for the close-notify exchange (the connection is already destroyed).

So the current behavior of the active close is completely correct for TLS 1.1. Neither the existing behavior nor this PR would be entirely correct for TLS 1.0 (since it'll also react to a EOF from TCP, which the original spec prohibited).

The socket will be kept open indefinitely no?

Yes, but I would guess that Connection: Keep-Alive is probably already a worse offender of that? So I think this PR just increases the level of confusion around the TLS close-notify expectations, but shouldn't actually cause any functional differences, aside from exposing other bugs, and being slightly more kind to buggy clients (which includes nodejs, until my PR #36111 is finished and merged).

@lpinca (Member) commented Dec 9, 2020

@vtjnash thank you. I want to clarify only one minor thing. My comment

The socket will be kept open indefinitely no?

was referring to the non keep alive case. When keep alive is used response.end() does not call socket.destroySoon().

@mmomtchev (Contributor, Author)

This is the same old discussion on whether we should ignore this (somewhat normal) RST packet that is a direct result of the TLS protocol allowing (allowing, but not requiring) behavior not really compatible with TCP:
(RFC 4346)
The other party MUST respond with a close_notify
alert of its own and close down the connection immediately,
discarding any pending writes. It is not required for the initiator
of the close to wait for the responding close_notify alert before
closing the read side of the connection.

This is begging for a RST.

So @vtjnash, I agree with what you are saying. But the TLS protocol has been specified independently of TCP - it does not even require it - even if 99% of the time it is used over TCP. So I think that the best possible behavior is for Node's TLS client to ignore a RST that comes after its own close_notify - as many servers will respond with a RST to this final transmission.

And the best possible behavior for the TLS server is to make sure that it does not send a RST packet when the remote client sends this one final transmission - since many clients will report an error.

@vtjnash (Contributor) commented Dec 9, 2020

since many clients will report an error

A TLS client that reports an error is in explicit violation of this part of the TLS spec. They are going to encounter reliability issues regardless of this PR, especially since nodejs isn't the only server implementation they may encounter. However, this PR will make it harder to test for correct implementations of the TLS spec at the nodejs client level, since the test server would now usually be more lenient than required.

@mmomtchev (Contributor, Author)

since many clients will report an error

A TLS client that reports an error is in explicit violation of this part of the TLS spec. They are going to encounter reliability

No, there are almost no mentions of TCP, and no mention of a TCP RST at all, in the TLS spec. TLS is specified on top of some reliable transport protocol (e.g., TCP) - this being the only occurrence of TCP in the main document, excluding the appendices. And it allows you to implement it without being perfectly compliant with TCP (i.e., frequently triggering an RST when closing, and then ignoring it) - but does not stop you if you want to comply with both. In this case, when closing, you have to continue reading on the TCP level without interpreting the data.

@vtjnash (Contributor) commented Dec 9, 2020

you have to continue reading on the TCP level

This is explicitly mentioned as being not required of a conforming implementation:
"then the implementation MAY choose to close the transport without waiting for the responding close_notify"

Argue all you want about hypothetical alternative TLS designs, but that's not what the authors of the 1.2 spec wrote as the behaviors you should expect to encounter when interoperating with real-world TLS implementations.

(note that "have to" is not typically terminology used in a spec—it's typically written as MUST, for clarity of communication. But here we actually have the statement to the contrary given with MAY.)

One mistake I realized in my last text is that I was using the text from an older version of the spec. For the current version (1.3), the link is https://tools.ietf.org/html/rfc8446#section-6.1. There's no substantive difference for this purpose, though the wording changed slightly:

"Both parties need not wait to receive a close_notify alert before closing their read side of the connection"

The reason for the change is that the allowHalfClose state is actually now permitted in TLS 1.3 (receiving close_notify no longer should cause pending writes to be discarded). This will be important, as it will resolve some of the TODO questions I had left in #36111 in the opposite direction as I was expecting.

@ronag (Member) commented Dec 27, 2020

@vtjnash @mmomtchev What does this need to move forward or be closed? Can you give a quick summary of any blocking issues or disagreements?

@mmomtchev (Contributor, Author)

@ronag
Main discussions were/are

  • @lpinca Should this be limited to TLS, or should it also apply to HTTP - we converged on TLS-only
  • @vtjnash Should this be fixed at all, or is the RST to be ignored by the client - both me and @vtjnash have strong opinions on that one
  • @ronag And the exact nature of the fix - whether or not to keep destroySoon

@ronag (Member) commented Dec 30, 2020

both me and @vtjnash have strong opinions on that one

How can we help resolve this? Could one of you summarize the disagreement, and then maybe we can ask the TSC for an opinion?

@mmomtchev (Contributor, Author)

The last two exchanges summarize it very well.

TLS allows for a behavior that is almost guaranteed to provoke a RST unless it is mitigated.

RFC 4346, section concerning closing of connections:
The other party MUST respond with a close_notify
alert of its own and close down the connection immediately,
discarding any pending writes. It is not required for the initiator
of the close to wait for the responding close_notify alert before
closing the read side of the connection.

The not required part is the problem.
Unless the initiator (the server) does wait, the server will (very frequently) reset the final close_notify.

I favor mitigation, since we are to respect both protocols (TLS and TCP).
For @vtjnash, in this case, TCP is not to be respected, being overridden by the higher-level TLS.

@Trott Trott added the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Dec 30, 2020
@mcollina (Member)

Could any of you do a quick survey on what other languages/runtimes/load balancers do?

@mmomtchev (Contributor, Author)

I (with input from @lpinca) created a small gist for testing the behavior of different HTTP servers:
https://gist.github.com/mmomtchev/918b70c93a56bc83fc7ada2066c77123

Quick reminder: we are testing which servers will drain the data the client sent after its request in Connection: close mode - some will read after FIN and discard the data, and some will destroy the socket and send back a TCP RST.

Such behavior (not reading after FIN) is not compliant with the TCP specs. Incidentally, it is not compliant with the HTTP specs either, so in plain HTTP mode it doesn't bother anyone. However in TLS mode, the close_notify is always sent after the FIN so unless the server reads after FIN the RST is very likely.

So when testing, I came to the conclusion that with most UNIX socket-based implementations (i.e., Apache and nginx), the server will read after FIN. This would have been Node's behavior if this PR were applied to plain HTTP too.
ASIC-based implementations, such as those found on the front entry points of high-traffic sites, won't read after FIN. This is the current Node behavior.
But this is a mostly theoretical debate, as in plain HTTP it concerns the server's behavior in response to an invalid client behavior.

In the TLS case, this is about the server's behavior in response to a valid client behavior. Currently (after @vtjnash's PR) the Node client ignores a RST sent by the server after the close_notify message is emitted.

This PR is about the server not sending the RST in the first place.

I think that in order to assure interoperability, both are needed.

Do other TLS servers send a RST in Connection: close mode? None that I am aware of. Testing the TLS case doesn't require any special tooling, as in TLS this is a valid behavior - just use curl from the CLI, ask for a resource that is served in Connection: close mode, and then use a packet sniffer to check whether the server sometimes closes with a RST. Just be aware that, for some reason I haven't investigated, this is difficult to reproduce on Linux - OSX and Windows do it much more often, close to 100% occurrence - which is why this stayed in this state for so long.
There is a very good example in #23169.

@mcollina (Member)

Have you been able to survey some HTTP and HTTPS server implementations to check for this?

@vtjnash (Contributor) commented Dec 31, 2020

I think that test may be a bit confusing to some, since you appear to be testing what happens to your http client when your http client violates the http protocol, which isn't precisely the case we're discussing here (the desired socket shutdown sequence of an https server for a valid request) - though it does allow us to infer some properties relevant here. Just wanted to note that limitation on its applicability: the test was done on an http server (without TLS) while violating the spec (so not applicable to the use of http), but as a means of triggering the likely behavior of the https server while observing the spec, by widening the window for the race to be detected.

The behavior you describe isn't a violation of the TCP spec—sending an RST is exactly what the TCP spec said is permitted to happen for that sequence of events. You shouldn't be able to violate the spec while running the kernel state machine, since its job is to enforce compliance. However, the TLS spec defines that the RST packet should be ignored, so there should not be an issue bubbling up from there to the application, and that handling is currently not implemented correctly in nodejs. Since ASIC implementations behave this way also, it may be desirable for performance to close the socket early instead of late (I make no claim either way). Either way, I think the right place to make this change would be to alter which sequence of events the TLSWrap state machine executes, rather than changing the meaning of this function.

@Trott (Member) commented Jan 14, 2021

We talked about this at TSC today. The sense of the group is that a path to resolution might be to check what nginx, HAproxy, and Apache do. If they all do the same thing, we certainly won't want to do anything different. Is someone up for investigating that?

@Trott Trott removed the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Jan 14, 2021
@mmomtchev (Contributor, Author)

🚓

Labels
http Issues or PRs related to the http subsystem. semver-major PRs that contain breaking changes and should be released in the next major version.

Successfully merging this pull request may close these issues.

res.end() from a https.server() handler sends a RST packet provoking ECONNRESET
8 participants