Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[awc 3.0.0-beta.5] panic on exit if awc::Client is kept in a thread_local!{} #2207

Closed
abonander opened this issue May 10, 2021 · 9 comments
Closed

Comments

@abonander
Copy link

For convenience, we have a function at the root of our crate that returns an awc::Client which caches it in a thread-local and returns it:

fn actix_client() -> awc::Client {
    thread_local! {
        static ref CLIENT: awc::Client = awc::Client::new();
    }

    CLIENT.with(Clone::clone)
}

This worked fine with actix-web 3.* using awc 2.* but upon upgrading to awc 3.0.0-beta.5 we noticed panics on exit:

thread '<unnamed>' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime',
...
   8: tokio::time::driver::handle::Handle::current
...
  12: actix_http::client::pool::CloseConnection<Io>::new
...
  35: core::ptr::drop_in_place<awc::ClientConfig>
             at /nix/store/gk5rma02vnj5r71b277fwdhrnakd6p42-rust-default-1.52.0/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:187:1
  36: core::ptr::drop_in_place<awc::Client>
             at /nix/store/gk5rma02vnj5r71b277fwdhrnakd6p42-rust-default-1.52.0/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:187:1
  37: core::ptr::drop_in_place<core::option::Option<awc::Client>>
             at /nix/store/gk5rma02vnj5r71b277fwdhrnakd6p42-rust-default-1.52.0/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:187:1
  38: core::mem::drop
             at /nix/store/gk5rma02vnj5r71b277fwdhrnakd6p42-rust-default-1.52.0/lib/rustlib/src/rust/library/core/src/mem/mod.rs:890:24
  39: std::thread::local::fast::destroy_value
             at /nix/store/gk5rma02vnj5r71b277fwdhrnakd6p42-rust-default-1.52.0/lib/rustlib/src/rust/library/std/src/thread/local.rs:519:13
  40: __call_tls_dtors

What appears to be happening is that the thread_local! is trying to run the destructor on awc::Client on exit of the thread, which is trying to gracefully close its internally held TCP connections, but the Tokio runtime has already gone away because the thread is exiting and so tokio::runtime::Handle::current() is panicking.

Suggested Solution

The destructor of awc::Client or whatever internal structure that is trying to close TCP connections on-drop needs to use Handle::try_current() so as to not panic if the Tokio runtime is exiting, and if it can't get a handle to the runtime it should just exit without trying to gracefully close any connections.

Your Environment

  • Rust Version (I.e, output of rustc -V): 1.52.1
  • Actix Web Version: awc 3.0.0-beta.5
@fakeshadow
Copy link
Contributor

You didn't get panic in awc v2 because it was silently leaking tcp connections on drop.

I don't think your suggested solution is a good one. If you are holding a awc client object longer than it's runtime I believe a panic is a good way to tell you are doing it wrong.

@abonander
Copy link
Author

abonander commented May 12, 2021

If it's properly closing the file descriptors on-drop (which it should be doing anyway) then nothing is being "leaked" here, you're just losing the graceful shutdown message sent by TcpStream::close() telling the other end you're hanging up. The connection on the other end would then simply time out, or error the next time the other end tries to send data. That's a normal happenstance on the web.

For example, Tokio does not care one single bit if you drop a TcpStream after the runtime has exited:

//# anyhow = "1.0"
//# tokio = { version = "1.0", features = ["net", "rt-multi-thread"] }

use tokio::net::TcpStream;
use tokio::runtime::Runtime;

fn main() -> anyhow::Result<()> {
    // nslookup google.com
    let stream =
        Runtime::new()?.block_on(async { TcpStream::connect("172.217.164.110:443").await })?;

    drop(stream);

    Ok(())
}

Panics should be reserved for signaling actual bugs.

@fakeshadow
Copy link
Contributor

awc does not only hold TcpStream. It holds other objects that have detached spawn tasks related to it.

Holding an awc object that can go outside the context of it's runtime is a bug.

@abonander
Copy link
Author

abonander commented May 12, 2021

Holding an awc object that can go outside the context of it's runtime is a bug.

Attempting to use it outside the runtime is a bug, certainly, which is why I'm not complaining about panics in general.

However, we're talking about a Drop impl here which is only attempting to perform a graceful shutdown of connections, which is not a bug to skip. For the HTTP/2 connections, their background tasks (and the TCP connections they own) would have already been cleaned up by the Drop impl of the runtime. HTTP/1 connections do just wrap TcpStreams (through a few layers of abstractions) which will have their file descriptors closed on-drop. There's no downside to skipping this.

Here's the actual culprit after looking at the stack trace again (which I admittedly trimmed down too much; it's not even in awc actually): https://github.com/actix/actix-web/blob/master/actix-http/src/client/pool.rs#L76

Also, it looks like if disconnect_timeout is not set then no panics should be triggered anyway, at least not in the same place.

@fakeshadow
Copy link
Contributor

#1123

@abonander
Copy link
Author

That's an orthogonal issue? If the HTTP/2 connection isn't cleaned up when the runtime is dropped that's a bug in how HTTP/2 connections are handled.

@abonander
Copy link
Author

I'm not saying to never run the close() handler, I'm saying if the runtime is already gone then it shouldn't attempt it just to panic.

@fakeshadow
Copy link
Contributor

fakeshadow commented May 12, 2021

This is my point.
I would prefer simple logic to make clean up code always run. And if the code can not run I prefer it's panic no matter the reason.

This kind of issue always come down to the lack of async drop in Rust. There are people prefer to hack a way by skipping, there are people prefer to block the thread and do drop. I'm not saying they are wrong it's just not my preference. Doing spawn task in Drop is not right either but I find it's one of the few ways work in situation like this.

@abonander
Copy link
Author

abonander commented May 12, 2021

I would prefer simple logic to make clean up code always run. And if the code can not run I prefer it's panic no matter the reason.

Given that it needs tokio::task::spawn_local() (via actix_rt::spawn()) which doesn't seem to have a fallible version (no LocalSet::try_current() or similar) then this probably wouldn't have a simple solution anyway. If someone else complains of this panic at least you can point them to this conversation.

We already migrated our remaining uses of awc to reqwest since we needed an HTTP client that could work in Tokio's multi-threaded runtime as our application does a lot of work in the background (we run both actix_rt and Tokio's multithreaded runtime).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants