Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take non-Send resources out of World #9122

Draft
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

maniwani
Copy link
Contributor

@maniwani maniwani commented Jul 12, 2023

This PR builds on top of #9202 (I separated what could be separated from this PR into that one).

Objective

The aim of this PR is to be the "final" fix to our messy problems involving thread-local data (see #6552 for context).

Bevy only deals with a handful of thread-local resources (that mostly come from low-level OS APIs or libs that talk to those), but that handful has resulted in a ton of internal complexity (in bevy_ecs, bevy_tasks, and bevy_app). The way we currently store them makes it either difficult or impossible to implement some nice-to-have features (e.g. generalized multi-world apps, framerate-independent input handling, etc.).

Solution

tl;dr Separate non-Send data from the ECS entirely. It will remain indirectly accessible through an event loop running on the main thread. Then, move app updates into a separate thread (on permitting targets) so they don't block the event loop.

This setup should be familiar to any audio plugin developers (or async runtime developers). If a system on thread B needs to do something with data that can't leave thread A, it can just send a command to A's event loop and wait for A to run it (and send any result back).

Some unique benefits of separating Send and !Send data like this are:

  • Res and ResMut can be removed in favor of Ref and Mut since they're redundant without NonSend and NonSendMut.
  • World becomes unambiguously Send, so a world can be dropped on any thread.
  • The multi-threaded executor is able to run any system on any thread (including exclusive systems).
  • The multi-threaded executor is able to run any system without spawning a task for it (especially exclusive systems).
  • Any system from any world on any thread can remotely access the local resources in the main thread.
  • Input events could be timestamped (with Instant::now()) with virtually no delay/aliasing (the event loop isn't blocked).

Changelog

TODO: finish this section

  • NonSend and NonSendMut have been removed.
  • Moved storage of non-Send resources to an external ThreadLocals container. It's normally owned by the App.

Migration Guide

If you were doing this before:

fn foo(x: NonSend<X>) {
    /* ... */
}

you're doing this now:

fn foo(mut thread: ThreadLocal) {
    // This closure will run on the thread that owns the TLS.
    // `run` blocks until the closure is completed.
    thread.run(|tls| {
        let x = tls.resource::<X>();
        /* ... */
    });
}

For better or worse, the handling that used to be buried inside the scheduler is now the responsibility of whoever writes an App runner. It's not too much boilerplate though IMO. Here's an example of a basic runner that runs the app once.

fn run_once(mut app: App) {
    let (mut sub_apps, tls, send, recv, _runner) = app.into_parts();
          
    // Move sub-apps to another thread.
    let thread = std::thread::spawn(move || {
        let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
            sub_apps.update();
            send.send(AppEvent::Exit(sub_apps)).unwrap();
        }));
        
        if let Some(payload) = result.err() {
            send.send(AppEvent::Error(payload)).unwrap();
        }
    });
    
    // Run an event loop in this thread.
    loop {
        let event = recv.recv().unwrap();
        match event {
            AppEvent::RunTask(task) => {
                task();
            }
            AppEvent::Exit(_) => {
                thread.join().unwrap();
                break;
            }
            AppEvent::Error(payload) => {
                std::panic::resume_unwind(payload);
            }
        }
    }
}

Spawning a thread like this is basically mandatory unless you're targeting an environment (i.e. wasm32) or are enabling a bevy compiler flag that forces completely single-threaded system execution. Otherwise, not doing so will risk running into a deadlock.

Unfortunately, the channel between the main and app threads can't always be a std::sync::mpsc::channel. The winit runner spawns an event loop that can only be woken up by winit events, so we have to use their special EventLoopProxy sender. That's the reason for the ThreadLocalTaskSender trait. I still had to box the channel as a trait object because bevy_ecs and bevy_app are separated.

@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events C-Code-Quality A section of code that is hard to understand or change A-App Bevy apps and plugins D-Complex Quite challenging from either a design or technical perspective. Ask for help! X-Controversial There is active debate or serious implications around merging this PR labels Jul 12, 2023
@alice-i-cecile alice-i-cecile added this to the 0.12 milestone Jul 12, 2023
Copy link
Contributor

@hymm hymm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a quick skim with comments on trying to shrink this pr a bit. Will dig more into the meat of the pr later.

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs Outdated Show resolved Hide resolved
crates/bevy_tasks/src/task_pool.rs Outdated Show resolved Hide resolved
crates/bevy_app/src/headless.rs Outdated Show resolved Hide resolved
crates/bevy_app/src/sub_app.rs Outdated Show resolved Hide resolved
@hymm hymm self-requested a review July 12, 2023 05:30
@james7132 james7132 self-requested a review July 12, 2023 10:21
Copy link
Contributor

@SkiFire13 SkiFire13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    // Insert channel resource into every sub-app.
    let (send, recv) = std::sync::mpsc::channel();
    sub_apps.main.world.insert_resource(ThreadLocalAccessor::new(&mut tls, send.clone());
    for sub_app in sub_apps.sub_apps.values_mut() {
        sub_app.world.insert_resource(ThreadLocalAccessor::new(&mut tls, send.clone());
    }

FYI this example is unsound, each &mut tls technically invalidate any pointer obtained from the previous &mut tls.

crates/bevy_ecs/src/storage/resource_non_send.rs Outdated Show resolved Hide resolved
crates/bevy_ecs/src/storage/resource_non_send.rs Outdated Show resolved Hide resolved
@maniwani maniwani force-pushed the non-send-split branch 5 times, most recently from 2b08380 to ed449db Compare July 12, 2023 20:19
unsafe impl Sync for ThreadLocalChannel {}

/// Guards access to its thread-local data storage.
pub struct ThreadLocalStorage {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the advantage of the approach here vs building an abstraction around a global thread_local!()?

Copy link
Contributor Author

@maniwani maniwani Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main benefit is this approach is siloed within an App. You could prepare multiple App instances on the same thread without them conflicting on the same global variable.

If thread_local! (or #[thread_local]) has any other notable caveats, they're not our problem. Maybe it'll have issues on console platforms and we just don't know about them yet.

Keep in mind that this setup is about making the same store of thread-local data "available on all threads". I think thread_local! could at most replace the mutex1. The rest of the implementation wouldn't change much. We still need the channel, etc.

Footnotes

  1. But this mutex is only used to ensure soundness. It cannot be contended, so any performance impact is negligible.

Copy link
Contributor Author

@maniwani maniwani Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to use thread_local! after all.

This has an annoying caveat where we kind of need to drop the non-send resources before exiting the app (or the user/plugin needs to do it) because of how thread_local! handles destructors.

@alice-i-cecile
Copy link
Member

I think we should swap this to the 0.13 milestone now. We're drawing close and this is still in draft.

@maniwani
Copy link
Contributor Author

maniwani commented Sep 19, 2023

Let me see if I can have this ready for review this weekend. I've already rebased (just haven't pushed) and only the winit runner is unfinished (winit runner code is the bulk of this PR).

@maniwani maniwani force-pushed the non-send-split branch 6 times, most recently from 9a6cda7 to 34091b1 Compare September 21, 2023 20:47
@maniwani maniwani force-pushed the non-send-split branch 2 times, most recently from 8ea4c58 to a8575de Compare November 9, 2023 04:16
@alice-i-cecile alice-i-cecile modified the milestones: 0.13, 0.14 Jan 24, 2024
@alice-i-cecile alice-i-cecile added the M-Needs-Release-Note Work that should be called out in the blog due to impact label Feb 22, 2024
github-merge-queue bot pushed a commit that referenced this pull request Mar 31, 2024
# Objective

This is a necessary precursor to #9122 (this was split from that PR to
reduce the amount of code to review all at once).

Moving `!Send` resource ownership to `App` will make it unambiguously
`!Send`. `SubApp` must be `Send`, so it can't wrap `App`.

## Solution

Refactor `App` and `SubApp` to not have a recursive relationship. Since
`SubApp` no longer wraps `App`, once `!Send` resources are moved out of
`World` and into `App`, `SubApp` will become unambiguously `Send`.

There could be less code duplication between `App` and `SubApp`, but
that would break `App` method chaining.

## Changelog

- `SubApp` no longer wraps `App`.
- `App` fields are no longer publicly accessible.
- `App` can no longer be converted into a `SubApp`.
- Various methods now return references to a `SubApp` instead of an
`App`.
## Migration Guide

- To construct a sub-app, use `SubApp::new()`. `App` can no longer
convert into `SubApp`.
- If you implemented a trait for `App`, you may want to implement it for
`SubApp` as well.
- If you're accessing `app.world` directly, you now have to use
`app.world()` and `app.world_mut()`.
- `App::sub_app` now returns `&SubApp`.
- `App::sub_app_mut`  now returns `&mut SubApp`.
- `App::get_sub_app` now returns `Option<&SubApp>.`
- `App::get_sub_app_mut` now returns `Option<&mut SubApp>.`
@JMS55 JMS55 removed this from the 0.14 milestone Apr 25, 2024
@alice-i-cecile alice-i-cecile added X-Blessed Has a large architectural impact or tradeoffs, but the design has been endorsed by decision makers and removed X-Controversial There is active debate or serious implications around merging this PR labels May 2, 2024
github-merge-queue bot pushed a commit that referenced this pull request May 6, 2024
# Objective

I'm adopting #9122 and pulling some of the non controversial changes out
to make the final pr easier to review.

This pr removes the NonSend resource usage from `bevy_log`. 

## Solution

`tracing-chrome` uses a guard that is stored in the world, so that when
it is dropped the json log file is written out. The guard is Send +
!Sync, so we can store it in a SyncCell to hold it in a regular resource
instead of using a non send resource.

## Testing

Tested by running an example with `-F tracing chrome` and making sure
there weren't any errors and the json file was created.

---

## Changelog

- replaced `bevy_log`'s usage of a non send resource.
@maniwani maniwani added the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label Sep 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-App Bevy apps and plugins A-ECS Entities, components, systems, and events C-Code-Quality A section of code that is hard to understand or change D-Complex Quite challenging from either a design or technical perspective. Ask for help! M-Needs-Release-Note Work that should be called out in the blog due to impact S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! X-Blessed Has a large architectural impact or tradeoffs, but the design has been endorsed by decision makers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants