Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with AtomicCell containing struct with padding bytes #388

Closed
mchesser opened this issue Jun 5, 2019 · 3 comments · Fixed by #391
Closed

Issue with AtomicCell containing struct with padding bytes #388

mchesser opened this issue Jun 5, 2019 · 3 comments · Fixed by #391

Comments

@mchesser
Copy link

mchesser commented Jun 5, 2019

Edit: root cause here: #388 (comment)


The following code appears to deadlock on Windows only:

fn main() {
    let (tx, rx) = crossbeam_channel::unbounded();
    std::thread::spawn(move || {
        loop {
            tx.send(()).unwrap();
            std::thread::sleep(std::time::Duration::from_millis(50));
        }
    });

    let update_tick = crossbeam_channel::tick(std::time::Duration::from_secs(10));
    let mut data_i = 0;
    loop {
        crossbeam_channel::select! {
            recv(update_tick) -> tick => {
                eprintln!("tick: {:?}", tick);
            },

            recv(rx) -> data => match data {
                Ok(data) => {
                    data_i += 1;
                    if data_i % 50== 0 {
                        dbg!(data_i);
                    }
                    std::thread::sleep(std::time::Duration::from_millis(1));
                }
                Err(_) => break,
            },
        }
    }
}

It should print something like:

[src/main.rs:22] data_i = 50
[src/main.rs:22] data_i = 100
[src/main.rs:22] data_i = 150
tick: Ok(Instant { tv_sec: 1023, tv_nsec: 433738600 })
[src/main.rs:22] data_i = 200
[src/main.rs:22] data_i = 250
[src/main.rs:22] data_i = 300

But I see no further output after [src/main.rs:22] data_i = 150 when running on Windows. It works fine on Linux.

Tested with: stable-x86_64-pc-windows-msvc (rustc 1.35.0 (3c235d560 2019-05-20)).

@ghost
Copy link

ghost commented Jun 5, 2019

Thanks for the bug report! Unfortunately, I don't have a Windows machine at hand to reproduce. Could you perhaps try replacing tick channel with an unbounded channel + a thread and see if the problem persists?

fn main() {
    let (tx, rx) = crossbeam_channel::unbounded();
    std::thread::spawn(move || {
        loop {
            tx.send(()).unwrap();
            std::thread::sleep(std::time::Duration::from_millis(50));
        }
    });

    let (tx, update_tick) = crossbeam_channel::unbounded();
    std::thread::spawn(move || {
        loop {
            std::thread::sleep(std::time::Duration::from_secs(10));
            tx.send(std::time::Instant::now()).unwrap();
        }
    });

    let mut data_i = 0;
    loop {
        crossbeam_channel::select! {
            recv(update_tick) -> tick => {
                eprintln!("tick: {:?}", tick);
            },

            recv(rx) -> data => match data {
                Ok(data) => {
                    data_i += 1;
                    if data_i % 50== 0 {
                        dbg!(data_i);
                    }
                    std::thread::sleep(std::time::Duration::from_millis(1));
                }
                Err(_) => break,
            },
        }
    }
}

Also, does the deadlock still occur if the tick channel is removed from select!?

@mchesser
Copy link
Author

mchesser commented Jun 6, 2019

The issue appears to be caused by the byte_eq in compare_exchange implementation of AtomicCell (https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-utils/src/atomic/atomic_cell.rs#L640-L646).

On Windows std::time::Instant is basically: { t: { secs: u64, nanos: u32 } } i.e. 12 bytes of data, however it is padded to 16 bytes: mem::size_of::<Instant>() == 16, byte_eq includes the padding bytes in the equality, however the ptr::read does not copy the padding bytes. Therefore if there is junk in the padding bytes, byte_eq will never return true causing an infinite loop.

On 64 bit linux, std::time::Instant is { t: { t: { tv_sec: libc::time_t, tv_nsec: libc::c_long } } } which has no padding bytes, so the issue does not occur.

Here is an example that should show the issue on both platforms (debug only, garbage gets optimized away on release):

use crossbeam_utils::atomic::AtomicCell;

#[derive(Copy, Clone, Eq, PartialEq)]
struct Object {
    a: i64,
    b: i32,
}

fn main() {
    dbg!();
    let cell = AtomicCell::new(Object { a: 0, b: 0 });
    let _garbage = [0xfe, 0xfe, 0xfe, 0xfe, 0xfe]; // Needed
    let next = Object { a: 0, b: 0 };

    loop {
        let prev = cell.load();
        if cell.compare_exchange(prev, next).is_ok() {
            break;
        }
    }
    dbg!();
}

@mchesser mchesser changed the title Possible deadlock in crossbeam-channel with tick (Windows only) Issue with AtomicCell containing struct with padding bytes Jun 6, 2019
@mchesser
Copy link
Author

mchesser commented Jun 6, 2019

Looks related to this issue: #315

@ghost ghost closed this as completed in #391 Jun 13, 2019
mchesser added a commit to mchesser/vector that referenced this issue Sep 24, 2019
Hotmic uses an outdated version of crossbeam-channel, which depends on a version of crossbeam-utils that has a bug in it's atomic cell implementation (see: crossbeam-rs/crossbeam#388)

Hotmic is deprecated in favour of the metrics crate so this commit does an initial migration to that crate.
vangroan added a commit to vangroan/rengine that referenced this issue Apr 25, 2020
Bug in windows only code which causes `crossbeam::channel::timer` to deadlock.

See: [Issue with AtomicCell containing struct with padding bytes](crossbeam-rs/crossbeam#388)
vangroan added a commit to vangroan/rengine that referenced this issue Jun 6, 2020
Bug in windows only code which causes `crossbeam::channel::timer` to deadlock.

See: [Issue with AtomicCell containing struct with padding bytes](crossbeam-rs/crossbeam#388)
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

1 participant