Using a large Stream as body occasionally sends no data #1377

Closed
jolhoeft opened this issue Nov 17, 2017 · 19 comments

Comments

@jolhoeft
Contributor

I am working on migrating nickel.rs to hyper 0.11.x (see nickel-org/nickel.rs#402 and nickel-org/nickel.rs#410), and have run into an issue sending files. I am using futures-cpupool to open and read the file and create a stream to use as the body in my response. I am seeing occasional failures in getting data: no errors are returned, but no data either. I've seen this both when using cpupool directly and with futures-fs, which streams the file instead of loading it all into a buffer. I go into more detail at seanmonstar/futures-fs#4.

Before I dig into this much more deeply, is there a straightforward way of sending a file on a response that I am missing?

@seanmonstar
Member

Can you share how you are configuring either hyper::server::Http or hyper::Server?

@jolhoeft
Contributor Author

The only configuration we are doing is keep-alives. For testing we are using our default, which is to enable them.

The configuration is all done in https://github.com/jolhoeft/nickel.rs/blob/0aa741ea2977821aee9079fe29a57ba1eaeb3c2c/src/server.rs, specifically in the serve method:

    pub fn serve(mut self,
                 addr: &SocketAddr,
                 keep_alive_timeout: Option<Duration>,
                 thread_count: Option<usize>,
                 verbose: bool)
                 -> HttpResult<()> {
        if let Some(threads) = thread_count {
            // override the default set in Server::new; this has to happen
            // before `self` is moved into the Arc below
            self.fspool = FsPool::new(threads);
        }
        let arc = ArcServer(Arc::new(self));
        let mut http = Http::new();

        // hyper 0.11's Http::keep_alive takes a bool; the Duration value
        // itself is not passed through to hyper
        http.keep_alive(keep_alive_timeout.is_some());
        let server = http.bind(addr, move || Ok(arc.clone()))?;

        if verbose {
            match server.local_addr() {
                Ok(a) => { println!("Listening on http://{}", a); },
                Err(e) => { println!("Error getting socket: {:?}", e); }
            };
            println!("Ctrl-C to shutdown server");
        }

        server.run()
    }

@seanmonstar
Member

Would you be able to add server.no_proto() after http.bind()? You'd also need to use hyper master, as a bug in no_proto that is present in 0.11.7 was fixed there.

@seanmonstar
Member

Rather, that fix was released in 0.11.8, so simply an upgrade would be needed (and adding server.no_proto()).
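For reference, this is roughly where the call would sit in the serve method above. A minimal sketch, assuming hyper 0.11.8 and that no_proto() can be called as a plain statement on the bound server (check the 0.11.8 docs for the exact signature); addr, arc, and http are the names from that method:

    let mut server = http.bind(addr, move || Ok(arc.clone()))?;
    // opt into the tokio-proto-free dispatcher mentioned above (hyper 0.11.8+)
    server.no_proto();
    server.run()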

@jolhoeft
Contributor Author

jolhoeft commented Dec 8, 2017

I will try 0.11.8 shortly. In the meantime, I tried a different approach: a scoped thread pool (the scoped-pool crate; I have a use case that needs scoped threads) combined with oneshot::channel. This has worked every time so far. The code:

        let (tx, rx) = oneshot::channel();
        self.pool.scoped(|scope| {
            scope.execute(move || {
                // Read the whole file into a buffer, then hand it to the
                // response through the oneshot.
                let mut file = match File::open(path_buf) {
                    Ok(f) => f,
                    Err(e) => { let _ = tx.send(Err(e)); return; },
                };
                let mut buf: Vec<u8> = Vec::new();
                match copy(&mut file, &mut buf) {
                    Ok(_) => { let _ = tx.send(Ok(buf)); },
                    Err(e) => { let _ = tx.send(Err(e)); },
                };
            })
        });
        let body: ResponseStream = Box::new(
            rx.into_stream()
                .map_err(|e| HyperError::from(io::Error::new(io::ErrorKind::Other, e)))
                .and_then(|r| match r {
                    Ok(r) => Ok(Chunk::from(r)),
                    Err(e) => Err(HyperError::from(e)),
                }),
        );
        self.origin.set_body(body);
        Ok(Halt(self))

I thought this was what futures-cpupool was doing, so I'm puzzled why this works and that sometimes fails.

@jolhoeft
Contributor Author

jolhoeft commented Dec 8, 2017

My tests with 0.11.8/server.no_proto() and the futures-cpupool approach continue to fail intermittently.

I'm seeing the same behavior on both Linux and Windows (gcc toolchain): the futures-cpupool approach fails intermittently, while my oneshot::channel approach works reliably.

@jolhoeft
Contributor Author

jolhoeft commented Dec 8, 2017

I suspect this is an issue in futures-cpupool, but I don't understand the workings there enough to be sure.

@seanmonstar
Member

The two things you are comparing are actually doing different things: the oneshot is buffering the full file contents into a single Vec and then sending that as the body to hyper, while the fs-pool approach reads a chunk of the file at a time and sends each chunk on the Stream. I believe people with large files have shown the fs-pool implementation to work, so it could be a bug in how hyper deals with a streaming body...

@jolhoeft
Contributor Author

jolhoeft commented Dec 8, 2017

Yes, the futures-fs approach is streaming, but the futures-cpupool approach I tested against is also buffering into a Vec:

        // using futures-cpupool
        let stream = self.cpupool.spawn_fn(|| {
            let mut file = match File::open(path_buf) {
                Ok(f) => f,
                Err(e) => { return future::err(e) },
            };
            let mut buf = Vec::new();
            match copy(&mut file, &mut buf) {
                Ok(_) => {
                    eprintln!("Got buf: {:?}", &buf[0..16]);
                    future::ok(buf)
                },
                Err(e) => future::err(e),
            }
        }).into_stream()
            .map(Chunk::from)
            .map_err(HyperError::from);

        let body: ResponseStream = Box::new(stream);
        self.origin.set_body(body);
        Ok(Halt(self))

My next thought is to migrate the oneshot::channel approach to mpsc::channel so that streaming large files is feasible.
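A rough sketch of that direction (not the final nickel code): read the file on a separate thread and push fixed-size chunks through a bounded futures::sync::mpsc channel, which the response body consumes as a Stream. This assumes futures 0.1 and reuses the ResponseStream, HyperError, Chunk, and path_buf names from the snippets above; the chunk size, channel capacity, and the plain std thread (standing in for whatever pool the real code would use) are arbitrary choices for illustration.

        use futures::sync::mpsc;
        use futures::{Future, Sink, Stream};
        use std::fs::File;
        use std::io::{self, Read};
        use std::thread;

        // Bounded channel of Result<Vec<u8>, io::Error>; the small capacity
        // limits how many chunks sit in memory at once.
        let (tx, rx) = mpsc::channel(4);
        thread::spawn(move || {
            let mut tx = tx;
            let mut file = match File::open(path_buf) {
                Ok(f) => f,
                Err(e) => { let _ = tx.send(Err(e)).wait(); return; },
            };
            let mut buf = [0u8; 64 * 1024];
            loop {
                match file.read(&mut buf) {
                    Ok(0) => break, // EOF
                    Ok(n) => {
                        // send() returns a future; wait() blocks this thread
                        // until the receiver has room again (backpressure).
                        match tx.send(Ok(buf[..n].to_vec())).wait() {
                            Ok(t) => tx = t,
                            Err(_) => break, // response body was dropped
                        }
                    },
                    Err(e) => { let _ = tx.send(Err(e)).wait(); break; },
                }
            }
        });
        let body: ResponseStream = Box::new(
            rx.map_err(|()| HyperError::from(io::Error::new(io::ErrorKind::Other, "sender gone")))
                .and_then(|r| match r {
                    Ok(bytes) => Ok(Chunk::from(bytes)),
                    Err(e) => Err(HyperError::from(e)),
                }),
        );
        self.origin.set_body(body);

The bounded channel is what makes large files feasible: the reading thread stalls in wait() whenever hyper has not yet consumed the earlier chunks, so memory use stays proportional to the chunk size times the channel capacity.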

@jolhoeft
Contributor Author

What would you like to do with this issue? We haven't resolved the issue with futures-fs or futures-cpupool, but using channels from futures works fine (as in the send_file.rs example). That is my plan going forward, so this is solved for me.

@seanmonstar added the C-bug label (Category: bug. Something is wrong. This is bad!) on Jan 2, 2018
@seanmonstar
Member

It sounds like this is a bug, but I haven't been able to identify what causes it, nor set up a reliable way to investigate it.

@seanmonstar changed the title from "Sending files" to "Using a large Stream as body occasionally sends no data" on Jan 2, 2018
@kamyuentse
Contributor

kamyuentse commented Jan 3, 2018 via email

@jolhoeft
Contributor Author

jolhoeft commented Jan 3, 2018

It is tricky to reproduce. It only occurs with the nickel.rs test harness. Loading pages through a browser always works. I suspect there is some timing issue that the test harness is triggering more regularly than a browser would.

I think the approach would be to modify the send_file.rs example to use futures-cpupool, then rig up a test harness similar to nickel's and see if that can trigger it. That is probably the best way to get trace-level logging too; I expect running trace on the nickel test harness would produce a vast amount of data.

I'll see if I can get to this in the next week or so.

@jolhoeft
Contributor Author

jolhoeft commented Jan 3, 2018

@kamyuentse, I go into more detail in the futures-fs issue seanmonstar/futures-fs#4. The code I am using is posted there.

@seanmonstar
Member

I modified a hyper example to use cpu-pool to send a small file, and ran curl against it a whole bunch of times, and could never get it to error.

So, I read through your branch of nickel to see how you used the cpu-pool, and also the test code. The test code might be the cause, or it could be something even more hidden. Either way, there is a bug in the test code that could explain the issue you're seeing:

  • The response_for util function creates a Core and runs a request on it, returning a Response. The Core and Client are dropped as the function returns.
  • The test function, after getting the Response (by which point the Core has been dropped), tries to stream the body. With the Core gone, any data that came in on the socket will never be read, since the Core never polls the connection again. A corrected pattern is sketched below.
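To illustrate the fix, here is a minimal sketch of a response_for-style helper, assuming hyper 0.11, tokio-core, and futures 0.1 (the real nickel helper's signature differs): the body is collected on the same Core that drove the request, so the connection keeps being polled until all the data has arrived.

    use futures::{Future, Stream};
    use hyper::Client;
    use tokio_core::reactor::Core;

    // Hypothetical helper, not the actual nickel test code.
    fn response_for(url: &str) -> (hyper::StatusCode, Vec<u8>) {
        let mut core = Core::new().unwrap();
        let client = Client::new(&core.handle());
        let uri = url.parse().unwrap();
        let work = client.get(uri).and_then(|res| {
            let status = res.status();
            // concat2() keeps polling the body on this same Core until EOF,
            // rather than returning the Response and dropping the Core.
            res.body().concat2().map(move |body| (status, body.to_vec()))
        });
        core.run(work).unwrap()
    }

The key point is that anything the test needs from the body has to be read before the Core and Client go out of scope.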

@seanmonstar
Member

@jolhoeft does the issue still happen with the test code adjusted?

@jolhoeft
Contributor Author

jolhoeft commented Jan 6, 2018

Just getting a chance to look at this. That does sound like a possible cause, and should be fixed in any case.

@jolhoeft
Contributor Author

jolhoeft commented Jan 6, 2018

I reworked the clients in the test harness as you suggested, and that seems to have resolved the problem with streaming files via futures-cpupool. I understand why it was failing now, although it is unclear why it sometimes succeeded. Is there some buffering going on in tokio, so that sometimes the read would complete before the core and client were dropped?

@seanmonstar
Member

Yes, tokio would eagerly try to buffer the first body chunk, so whether the data had already arrived on the socket before core.run() returned would affect your test. (I believe the new dispatcher in hyper doesn't read as eagerly...)

Glad to see that this was just a test error!

@seanmonstar removed the C-bug label (Category: bug. Something is wrong. This is bad!) on Jan 8, 2018