Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make aio, chdir, and wait tests thread safe #638

Merged
merged 2 commits into from
Jul 18, 2017
Merged

Conversation

asomers
Copy link
Member

@asomers asomers commented Jun 30, 2017

Fix thread safety issues in aio, chdir, and wait tests

They have four problems:

  • The chdir tests change the process's cwd, which is global. Protect them all with a mutex.

  • The wait tests will reap any subprocess, and several tests create subprocesses. Protect them all with a mutex so only one subprocess-creating test will run at a time.

  • When a multithreaded test forks, the child process can sometimes block in the stack unwinding code. It blocks on a mutex that was held by a different thread in the parent, but that thread doesn't exist in the child, so a deadlock results. Fix this by immediately calling std::process:;exit in the child processes.

  • My previous attempt at thread safety in the aio tests didn't work, because anonymous MutexGuards drop immediately. Fix this by naming the SIGUSR2_MTX MutexGuards.

Fixes #251

use std::os::unix::prelude::*;
use std::env::current_dir;
use tempfile::tempfile;
use tempdir::TempDir;
use libc::off_t;

fn test_in_subprocess<F: Fn()>(func: F) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use a doccomment.

let mut inner_tmp_dir = tmp_dir.path().to_path_buf();
for _ in 0..5 {
let newdir = iter::repeat("a").take(100).collect::<String>();
//inner_tmp_dir = inner_tmp_dir.join(newdir).path();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to remove this.

@Susurrus
Copy link
Contributor

Susurrus commented Jul 5, 2017

This needs a rebase to the updated CI configuration. Then it LGTM assuming it pasts all tests.

@asomers
Copy link
Member Author

asomers commented Jul 5, 2017

Actually, it's not ok. It seemed ok at first, but now I can occasionally produce deadlocks when I use 8 threads. This PR requires more investigation.

@asomers
Copy link
Member Author

asomers commented Jul 7, 2017

test_in_subprocesswill never work. It looks good, but it doesn't work if the process is multithreaded (which it usually is). The child process is single threaded; only the thread that called fork will exist in the child. So if some other threads in the master held locks that the test function needs, then it will never complete. And some of the stuff in Rust's standard library is using locks under the hood, which is why I'm seeing deadlocks. I'll try a different technique to fix these tests.

@asomers
Copy link
Member Author

asomers commented Jul 12, 2017

Ok, I switched the chdir tests to use a Mutex for protection instead of forking. And I also fixed some issues with the aio tests and with other tests that fork(). Now, if I disable some unrelated tests like the pty tests, I can run 10000 iterations with 8 threads on both FreeBSD and Linux.

@asomers asomers changed the title Make the chdir tests thread safe Make aio, chdir, and wait tests thread safe Jul 12, 2017
test/test_mq.rs Outdated
@@ -53,6 +54,7 @@ fn test_mq_send_and_receive() {
// panic, fork should never fail unless there is a serious problem with the OS
Err(_) => panic!("Error: Fork Failed")
}
drop(m); //appease the unused_variable checker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a space and capitalize "Appease"

let pid = fork();
match pid {
Ok(Child) => {} // ignore child here
Ok(Child) => { unsafe { _exit(0) }; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you not use std::process::exit(0) here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially used _exit(0) because I didn't know exactly where the deadlock was, and _exit(0) skips virtually all teardown. But I just tried it, and it looks like std::process::exit(0) works too. I guess the deadlock is probably in the test harness code somewhere. I'll switch.

@@ -29,23 +29,26 @@ fn test_fork_and_waitpid() {
Ok(WaitStatus::Exited(pid_t, _)) => assert!(pid_t == child),

// panic, must never happen
Ok(_) => panic!("Child still alive, should never happen"),
s @ Ok(_) => panic!("Child exited {:?}, should never happen", s),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if unreachable!() doesn't provide better semantics here than panic!().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. The examples for unreachable! are for stuff that's so obviously unreachable that a smarter compiler should be able to figure it out. But in this case, the assertion can actually be hit; for example if you SIGKILL the child. I actually did hit this assertion frequently when my child deadlocked and had to be killed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, well if it isn't unreachable, then we definitely shouldn't use that macro!

}

},
// panic, fork should never fail unless there is a serious problem with the OS
Err(_) => panic!("Error: Fork Failed")
}
drop(m); // appease the unused_variable checker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize "Appease"

}

#[test]
fn test_wait() {
// grab FORK_MTX so wait doesn't reap a different test's child process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize "Grab"

let pid = fork();
match pid {
Ok(Child) => {} // ignore child here
Ok(Child) => { unsafe { _exit(0) }; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See earlier comment

inner_tmp_dir.push(newdir);
assert!(mkdir(inner_tmp_dir.as_path(), stat::S_IRWXU).is_ok());
}
assert!(chdir(inner_tmp_dir.as_path()).is_ok());
assert_eq!(getcwd().unwrap(), current_dir().unwrap());
assert_eq!(getcwd().unwrap(), inner_tmp_dir.as_path());
drop(m); // appease the unused_variable checker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could also do #[allow(unused_variable)] on line 170 here, couldn't you? I don't know if that's any nicer, but it's got cleaner semantics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know you could apply that to a single variable. Cool! BTW, if you didn't figure out, it doesn't work to do "let _ = ..." because the anonymous variable gets dropped right away.

@@ -6,6 +6,8 @@ use libc::exit;

#[test]
fn test_wait_signal() {
let m = ::FORK_MTX.lock().expect("Mutex got poisoned by another test");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to change this to allow(unused_variables) as well. I would suggest you also add a comment for each of these sections stating why we want to have an unused variable, as someone not familiar with this might think it should be removed without one.

let pid = fork();
match pid {
Ok(Child) => {} // ignore child here
Ok(Child) => { exit(0); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be exit(0) I think, without the braces or semicolon, yes?

@@ -29,23 +29,26 @@ fn test_fork_and_waitpid() {
Ok(WaitStatus::Exited(pid_t, _)) => assert!(pid_t == child),

// panic, must never happen
Ok(_) => panic!("Child still alive, should never happen"),
s @ Ok(_) => panic!("Child exited {:?}, should never happen", s),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, well if it isn't unreachable, then we definitely shouldn't use that macro!

// make path 500 chars longer so that buffer doubling in getcwd kicks in.
// Note: One path cannot be longer than 255 bytes (NAME_MAX)
// whole path cannot be longer than PATH_MAX (usually 4096 on linux, 1024 on macos)
// make path 500 chars longer so that buffer doubling in getcwd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No double-spaces before "Note" and capitalize the start of sentences. Also, please line wrap to 100 characters

@Susurrus
Copy link
Contributor

Also tests are failing on mac. Possibly because trust on mac doesn't use docker?

They have four problems:

* The chdir tests change the process's cwd, which is global.  Protect them
  all with a mutex.

* The wait tests will reap any subprocess, and several tests create
  subprocesses.  Protect them all with a mutex so only one
  subprocess-creating test will run at a time.

* When a multithreaded test forks, the child process can sometimes block in
  the stack unwinding code.  It blocks on a mutex that was held by a
  different thread in the parent, but that thread doesn't exist in the
  child, so a deadlock results.  Fix this by immediately calling
  std::process:exit in the child processes.

* My previous attempt at thread safety in the aio tests didn't work, because
  anonymous MutexGuards drop immediately.  Fix this by naming the
  SIGUSR2_MTX MutexGuards.

Fixes nix-rust#251
It isn't necessary, and can cause deadlocks in Rust's test harness
@asomers asomers mentioned this pull request Jul 18, 2017
@Susurrus
Copy link
Contributor

I'm assuming this is finished on your end. LGTM. I think some more comments would be helpful, but I lean towards more comments rather than less, and I don't think it should hold up this PR, especially with #681 waiting to reuse the mutex functionality this sets up.

bors r+

bors bot added a commit that referenced this pull request Jul 18, 2017
638: Make aio, chdir, and wait tests thread safe r=Susurrus

Fix thread safety issues in aio, chdir, and wait tests
    
They have four problems:
    
* The chdir tests change the process's cwd, which is global.  Protect them all with a mutex.
    
* The wait tests will reap any subprocess, and several tests create subprocesses.  Protect them all with a mutex so only one subprocess-creating test will run at a time.
    
* When a multithreaded test forks, the child process can sometimes block in the stack unwinding code.  It blocks on a mutex that was held by a different thread in the parent, but that thread doesn't exist in the child, so a deadlock results.  Fix this by immediately calling `std::process:;exit` in the child processes.
    
* My previous attempt at thread safety in the aio tests didn't work, because anonymous MutexGuards drop immediately.  Fix this by naming the SIGUSR2_MTX MutexGuards.

Fixes #251
@bors
Copy link
Contributor

bors bot commented Jul 18, 2017

@bors bors bot merged commit 2cd4420 into nix-rust:master Jul 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants