Make aio, chdir, and wait tests thread safe #638

asomers · 2017-06-30T16:57:28Z

Fix thread safety issues in aio, chdir, and wait tests

They have four problems:

* The chdir tests change the process's cwd, which is global. Protect them all with a mutex.
* The wait tests will reap any subprocess, and several tests create subprocesses. Protect them all with a mutex so only one subprocess-creating test will run at a time.
* When a multithreaded test forks, the child process can sometimes block in the stack unwinding code. It blocks on a mutex that was held by a different thread in the parent, but that thread doesn't exist in the child, so a deadlock results. Fix this by immediately calling `std::process::exit` in the child processes.
* My previous attempt at thread safety in the aio tests didn't work, because anonymous MutexGuards drop immediately. Fix this by naming the SIGUSR2_MTX MutexGuards.

Fixes #251
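
To illustrate the first point, here is a minimal sketch of serializing cwd-changing code behind a process-wide mutex. The names `CWD_MTX` and `observe_cwd_in` are hypothetical and not taken from this PR, and the sketch uses only the standard library (a `const` `Mutex::new` in a `static`, which needs a reasonably recent toolchain) rather than nix's `chdir` and `getcwd`.

```rust
use std::env;
use std::path::{Path, PathBuf};
use std::sync::Mutex;

// Hypothetical mutex guarding the process-wide current working directory.
static CWD_MTX: Mutex<()> = Mutex::new(());

/// Changes into `dir`, reads the cwd back, restores the previous cwd,
/// and returns what was observed. The whole sequence runs under CWD_MTX.
fn observe_cwd_in(dir: &Path) -> PathBuf {
    // Named guard: the lock is held until the end of this function.
    let _guard = CWD_MTX.lock().expect("Mutex got poisoned by another test");
    let original = env::current_dir().expect("cwd should be readable");
    env::set_current_dir(dir).expect("chdir failed");
    let observed = env::current_dir().expect("cwd should be readable");
    env::set_current_dir(&original).expect("failed to restore cwd");
    observed
}
```

Without the guard, two threads could interleave their `set_current_dir` calls and one of them would read back the other's directory.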

Susurrus · 2017-06-30T17:12:05Z

test/test_unistd.rs

 use std::os::unix::prelude::*;
 use std::env::current_dir;
 use tempfile::tempfile;
 use tempdir::TempDir;
 use libc::off_t;

+fn test_in_subprocess<F: Fn()>(func: F) {


This could use a doccomment.

Susurrus · 2017-06-30T17:12:16Z

test/test_unistd.rs

+        let mut inner_tmp_dir = tmp_dir.path().to_path_buf();
+        for _ in 0..5 {
+            let newdir = iter::repeat("a").take(100).collect::<String>();
+            //inner_tmp_dir = inner_tmp_dir.join(newdir).path();


Probably want to remove this.

Susurrus · 2017-07-05T14:40:23Z

This needs a rebase to the updated CI configuration. Then it LGTM assuming it passes all tests.

asomers · 2017-07-05T14:49:39Z

Actually, it's not ok. It seemed ok at first, but now I can occasionally produce deadlocks when I use 8 threads. This PR requires more investigation.

asomers · 2017-07-07T23:28:25Z

`test_in_subprocess` will never work. It looks good, but it doesn't work if the process is multithreaded (which it usually is). The child process is single threaded; only the thread that called fork will exist in the child. So if some other thread in the parent held a lock that the test function needs, then the child will never complete. And some of the stuff in Rust's standard library uses locks under the hood, which is why I'm seeing deadlocks. I'll try a different technique to fix these tests.

asomers · 2017-07-12T14:05:38Z

Ok, I switched the chdir tests to use a Mutex for protection instead of forking. And I also fixed some issues with the aio tests and with other tests that fork(). Now, if I disable some unrelated tests like the pty tests, I can run 10000 iterations with 8 threads on both FreeBSD and Linux.

Susurrus · 2017-07-12T22:48:07Z

test/test_mq.rs

@@ -53,6 +54,7 @@ fn test_mq_send_and_receive() {
      // panic, fork should never fail unless there is a serious problem with the OS
      Err(_) => panic!("Error: Fork Failed")
    }
+    drop(m);    //appease the unused_variable checker


Needs a space and capitalize "Appease"

Susurrus · 2017-07-12T22:48:51Z

test/test_unistd.rs

    let pid = fork();
    match pid {
-        Ok(Child) => {} // ignore child here
+        Ok(Child) => { unsafe { _exit(0) }; }


Can you not use std::process::exit(0) here?

I initially used _exit(0) because I didn't know exactly where the deadlock was, and _exit(0) skips virtually all teardown. But I just tried it, and it looks like std::process::exit(0) works too. I guess the deadlock is probably in the test harness code somewhere. I'll switch.
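
The pattern discussed in this thread, where the child leaves immediately instead of returning into the test harness, can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `fork` and `waitpid` are declared by hand to avoid a dependency (the tests themselves use nix's wrappers), `pid_t` is assumed to be a 32-bit int, and `fork_and_reap` is a hypothetical helper name. The `unsafe extern` spelling needs Rust 1.82 or later; on older toolchains drop the leading `unsafe`.

```rust
use std::process;

// Hand-written declarations of the POSIX functions (assumption: Unix target,
// pid_t is a 32-bit int). The real tests call nix's wrappers instead.
unsafe extern "C" {
    fn fork() -> i32;
    fn waitpid(pid: i32, status: *mut i32, options: i32) -> i32;
}

/// Forks, has the child exit at once, and returns
/// (whether the expected pid was reaped, the raw wait status).
fn fork_and_reap() -> (bool, i32) {
    match unsafe { fork() } {
        -1 => panic!("Error: Fork Failed"),
        // Child: exit right away rather than returning into the caller,
        // where it could block on a lock owned by a thread that no longer exists.
        0 => process::exit(0),
        child => {
            let mut status: i32 = -1;
            let reaped = unsafe { waitpid(child, &mut status, 0) };
            (reaped == child, status)
        }
    }
}
```

POSIX specifies that the stored status is 0 exactly when the child terminated by returning 0 from `main` or passing 0 to `exit` or `_exit`, so the raw value can be compared against 0 without decoding it.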

Susurrus · 2017-07-12T22:49:48Z

test/test_unistd.rs

@@ -29,23 +29,26 @@ fn test_fork_and_waitpid() {
                Ok(WaitStatus::Exited(pid_t, _)) =>  assert!(pid_t == child),

                // panic, must never happen
-                Ok(_) => panic!("Child still alive, should never happen"),
+                s @ Ok(_) => panic!("Child exited {:?}, should never happen", s),


I'm wondering if unreachable!() doesn't provide better semantics here than panic!().

I don't think so. The examples for unreachable! are for stuff that's so obviously unreachable that a smarter compiler should be able to figure it out. But in this case, the assertion can actually be hit; for example if you SIGKILL the child. I actually did hit this assertion frequently when my child deadlocked and had to be killed.

Okay, well if it isn't unreachable, then we definitely shouldn't use that macro!
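
For readers unfamiliar with the `s @ Ok(_)` form in the diff above, here is a small sketch of how the binding captures the whole unexpected value for the error message. `WaitOutcome` and `check_child` are hypothetical stand-ins (nix's `WaitStatus` has more variants), and the function returns the message instead of panicking so that it can be inspected.

```rust
// Hypothetical, simplified stand-in for nix's WaitStatus.
#[derive(Debug, PartialEq)]
enum WaitOutcome {
    Exited(i32, i8),
    Signaled(i32),
}

/// Returns Ok(()) if `child` exited normally, otherwise a description
/// of the unexpected result.
fn check_child(result: Result<WaitOutcome, String>, child: i32) -> Result<(), String> {
    match result {
        Ok(WaitOutcome::Exited(pid, _)) if pid == child => Ok(()),
        // `s @ Ok(_)` matches any other Ok value and binds all of it to `s`,
        // so the message can say what actually happened (e.g. a SIGKILL).
        s @ Ok(_) => Err(format!("Child exited {:?}, should never happen", s)),
        Err(e) => Err(format!("wait failed: {}", e)),
    }
}
```

This arm is reachable in practice, for example when the child is killed by a signal, which is why a descriptive `panic!` fits better than `unreachable!` in the test.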

Susurrus · 2017-07-12T22:49:58Z

test/test_unistd.rs

            }

        },
        // panic, fork should never fail unless there is a serious problem with the OS
        Err(_) => panic!("Error: Fork Failed")
    }
+    drop(m);    // appease the unused_variable checker


Capitalize "Appease"

Susurrus · 2017-07-12T22:50:06Z

test/test_unistd.rs

 }

 #[test]
 fn test_wait() {
+    // grab FORK_MTX so wait doesn't reap a different test's child process


Capitalize "Grab"

Susurrus · 2017-07-12T22:50:17Z

test/test_unistd.rs

    let pid = fork();
    match pid {
-        Ok(Child) => {} // ignore child here
+        Ok(Child) => { unsafe { _exit(0) }; }


See earlier comment

Susurrus · 2017-07-12T22:52:32Z

test/test_unistd.rs

        inner_tmp_dir.push(newdir);
        assert!(mkdir(inner_tmp_dir.as_path(), stat::S_IRWXU).is_ok());
    }
    assert!(chdir(inner_tmp_dir.as_path()).is_ok());
-    assert_eq!(getcwd().unwrap(), current_dir().unwrap());
+    assert_eq!(getcwd().unwrap(), inner_tmp_dir.as_path());
+    drop(m);    // appease the unused_variable checker


I think you could also do #[allow(unused_variables)] on line 170 here, couldn't you? I don't know if that's any nicer, but it's got cleaner semantics.

I didn't know you could apply that to a single variable. Cool! BTW, in case you hadn't figured it out, "let _ = ..." doesn't work because the anonymous variable gets dropped right away.
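
The difference between the two forms discussed here can be checked with a short sketch. `DEMO_MTX` and both function names are hypothetical; `try_lock` is used only to probe whether the mutex is still held. Recent compilers flag the `let _ =` form on a lock guard with the `let_underscore_lock` lint, so the sketch allows it explicitly for demonstration purposes.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for a test-wide mutex such as FORK_MTX.
static DEMO_MTX: Mutex<()> = Mutex::new(());

/// Reports whether DEMO_MTX is still locked after `let _ = ...`.
#[allow(let_underscore_lock)] // allowed only to demonstrate the pitfall
fn held_after_underscore() -> bool {
    // `_` is a pattern, not a variable: the guard is dropped at the end
    // of this statement, so the lock is released immediately.
    let _ = DEMO_MTX.lock().expect("Mutex got poisoned");
    let held = DEMO_MTX.try_lock().is_err();
    held
}

/// Reports whether DEMO_MTX is still locked after binding the guard to `m`.
fn held_after_named_binding() -> bool {
    // The variable is intentionally unused: it exists only to keep the
    // guard alive until the end of the function.
    #[allow(unused_variables)]
    let m = DEMO_MTX.lock().expect("Mutex got poisoned");
    let held = DEMO_MTX.try_lock().is_err();
    held
}
```

A name with a leading underscore, such as `_guard`, also keeps the guard alive and silences the warning without an attribute; only the bare `_` pattern drops it early.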

Susurrus · 2017-07-13T03:17:43Z

test/sys/test_wait.rs

@@ -6,6 +6,8 @@ use libc::exit;

 #[test]
 fn test_wait_signal() {
+    let m = ::FORK_MTX.lock().expect("Mutex got poisoned by another test");


You forgot to change this to allow(unused_variables) as well. I would suggest you also add a comment for each of these sections stating why we want to have an unused variable, as someone not familiar with this might think it should be removed without one.

Susurrus · 2017-07-13T03:18:24Z

test/test_unistd.rs

    let pid = fork();
    match pid {
-        Ok(Child) => {} // ignore child here
+        Ok(Child) => { exit(0); }


This can just be exit(0) I think, without the braces or semicolon, yes?

Susurrus · 2017-07-13T03:19:24Z

test/test_unistd.rs

@@ -29,23 +29,26 @@ fn test_fork_and_waitpid() {
                Ok(WaitStatus::Exited(pid_t, _)) =>  assert!(pid_t == child),

                // panic, must never happen
-                Ok(_) => panic!("Child still alive, should never happen"),
+                s @ Ok(_) => panic!("Child exited {:?}, should never happen", s),


Okay, well if it isn't unreachable, then we definitely shouldn't use that macro!

Susurrus · 2017-07-13T03:20:30Z

test/test_unistd.rs

-    // make path 500 chars longer so that buffer doubling in getcwd kicks in.
-    // Note: One path cannot be longer than 255 bytes (NAME_MAX)
-    // whole path cannot be longer than PATH_MAX (usually 4096 on linux, 1024 on macos)
+    // make path 500 chars longer so that buffer doubling in getcwd


No double-spaces before "Note" and capitalize the start of sentences. Also, please line wrap to 100 characters
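
As a rough illustration of the constraint described in that comment, the following sketch lengthens a path by about 500 bytes while keeping each component under the usual 255-byte limit. `lengthen_path` is a hypothetical helper, the `NAME_MAX` value is the typical one rather than something queried from the system, and no directories are created.

```rust
use std::path::{Path, PathBuf};

// Typical per-component limit; an assumption, since it varies by filesystem.
const NAME_MAX: usize = 255;

/// Appends five 100-byte components to `base`, as the getcwd test does,
/// so the result is 505 bytes longer (each component plus its separator).
fn lengthen_path(base: &Path) -> PathBuf {
    let mut path = base.to_path_buf();
    for _ in 0..5 {
        let component = "a".repeat(100);
        // One long component would exceed NAME_MAX, hence several short ones.
        assert!(component.len() <= NAME_MAX);
        path.push(component);
    }
    path
}
```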

Susurrus · 2017-07-13T03:21:02Z

Also tests are failing on mac. Possibly because trust on mac doesn't use docker?

Susurrus · 2017-07-18T03:43:39Z

I'm assuming this is finished on your end. LGTM. I think some more comments would be helpful, but I lean towards more comments rather than less, and I don't think it should hold up this PR, especially with #681 waiting to reuse the mutex functionality this sets up.

bors r+

638: Make aio, chdir, and wait tests thread safe r=Susurrus

Fix thread safety issues in aio, chdir, and wait tests

They have four problems:

* The chdir tests change the process's cwd, which is global. Protect them all with a mutex.
* The wait tests will reap any subprocess, and several tests create subprocesses. Protect them all with a mutex so only one subprocess-creating test will run at a time.
* When a multithreaded test forks, the child process can sometimes block in the stack unwinding code. It blocks on a mutex that was held by a different thread in the parent, but that thread doesn't exist in the child, so a deadlock results. Fix this by immediately calling `std::process::exit` in the child processes.
* My previous attempt at thread safety in the aio tests didn't work, because anonymous MutexGuards drop immediately. Fix this by naming the SIGUSR2_MTX MutexGuards.

Fixes #251

bors · 2017-07-18T04:07:00Z

Build succeeded

Susurrus suggested changes Jun 30, 2017

Susurrus approved these changes Jul 5, 2017

asomers force-pushed the chdir branch from 8301f79 to 0847bbf Compare July 12, 2017 14:04

asomers changed the title ~~Make the chdir tests thread safe~~ Make aio, chdir, and wait tests thread safe Jul 12, 2017

Susurrus suggested changes Jul 12, 2017

Susurrus suggested changes Jul 13, 2017

asomers force-pushed the chdir branch from c8ab4cb to 64f7984 Compare July 16, 2017 19:30

asomers mentioned this pull request Jul 16, 2017

Intermittent test failures in CI #679

Closed

8 tasks

Don't fork in test_mq_send_receive

2cd4420

It isn't necessary, and can cause deadlocks in Rust's test harness

asomers mentioned this pull request Jul 18, 2017

Remove feature flags #681

Merged

Susurrus approved these changes Jul 18, 2017

bors bot merged commit 2cd4420 into nix-rust:master Jul 18, 2017
