Stop & start the world (undocumented API) #14729

Merged

@ysbaddaden ysbaddaden commented Jun 18, 2024

Add GC.stop_world and GC.start_world methods to be able to stop and restart the world at will from within Crystal.

  • gc/boehm: delegates to GC_stop_world_external and GC_start_world_external;
  • gc/none: implements its own mechanism (tested on UNIX & Windows).

My use case is a perf-tools feature for RFC 2 that must stop the world to print out runtime information of each ExecutionContext with their schedulers and fibers. See crystal-lang/perf-tools#18

Notes:

  1. I tested the behavior using this simple program;
  2. Darwin has the thread_suspend, thread_resume and thread_get_state calls, which could be used instead of signals;
  3. I'm having a hard time articulating the relationship between GC and Thread for this feature. Thread#suspend feels pretty neat, but we need a couple of signals on UNIX. For now I expose GC.sig_suspend and GC.sig_resume, but they feel out of place 😞
    Update: the entrypoints are now Thread.start_world and Thread.stop_world, and sig suspend/resume are only defined on Crystal::System::Thread for UNIX.
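
For readers unfamiliar with the "couple of signals on UNIX" mentioned in note 3, here is a minimal C sketch of the general two-signal suspend/resume technique. It is not the code from this PR: SIGUSR1/SIGUSR2 and every function name below are assumptions chosen for illustration, and a real runtime would loop over all of its threads rather than a single worker.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

/* Assumption: SIGUSR1/SIGUSR2 stand in for the suspend/resume signals. */
#define SIG_SUSPEND SIGUSR1
#define SIG_RESUME  SIGUSR2

static sem_t suspend_ack;            /* posted by a thread once it is parked */
static atomic_long counter;          /* progress made by the demo worker */
static atomic_int keep_running = 1;

/* Empty on purpose: delivering SIG_RESUME only has to interrupt sigsuspend. */
static void on_resume(int sig) { (void)sig; }

static void on_suspend(int sig) {
    (void)sig;
    int saved_errno = errno;
    sigset_t mask;
    sigfillset(&mask);
    sigdelset(&mask, SIG_RESUME);    /* only SIG_RESUME may wake us up */
    sem_post(&suspend_ack);          /* async-signal-safe acknowledgement */
    sigsuspend(&mask);               /* park until SIG_RESUME is handled */
    errno = saved_errno;
}

static void install_handlers(void) {
    struct sigaction sa = {0};
    sigfillset(&sa.sa_mask);         /* keeps SIG_RESUME blocked inside on_suspend */
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = on_suspend;
    sigaction(SIG_SUSPEND, &sa, NULL);
    sa.sa_handler = on_resume;
    sigaction(SIG_RESUME, &sa, NULL);
    sem_init(&suspend_ack, 0, 0);
}

/* Ask one thread to park and wait until it has acknowledged. */
static void stop_thread(pthread_t t) {
    pthread_kill(t, SIG_SUSPEND);
    while (sem_wait(&suspend_ack) == -1 && errno == EINTR) { /* retry */ }
}

static void start_thread(pthread_t t) { pthread_kill(t, SIG_RESUME); }

static void *worker(void *arg) {
    (void)arg;
    while (atomic_load(&keep_running)) atomic_fetch_add(&counter, 1);
    return NULL;
}

/* Returns 1 if the worker made no progress while it was stopped. */
static int demo_counter_frozen(void) {
    struct timespec tick = {0, 1000000};    /* 1 ms */
    struct timespec pause = {0, 50000000};  /* 50 ms */
    pthread_t t;
    install_handlers();
    pthread_create(&t, NULL, worker, NULL);
    while (atomic_load(&counter) == 0) nanosleep(&tick, NULL);
    stop_thread(t);
    long before = atomic_load(&counter);
    nanosleep(&pause, NULL);
    long after = atomic_load(&counter);
    start_thread(t);
    atomic_store(&keep_running, 0);
    pthread_join(t, NULL);
    return before == after;
}
```

Blocking SIG_RESUME in the suspend handler's mask and unblocking it only inside sigsuspend means a resume sent right after the acknowledgement cannot be lost. Calling demo_counter_frozen() from a main function exercises one stop/start cycle.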

@ysbaddaden
Contributor Author

Maybe it should be Thread.stop_world and Thread.start_world, and they'd call into GC (like for creating a thread)? But that still doesn't say where sig_suspend and sig_resume should be defined.

Review thread on src/crystal/system/thread.cr (resolved)
Review thread on src/gc/none.cr (outdated, resolved)
@beta-ziliani
Member

> Maybe it should be Thread.stop_world and Thread.start_world, and they'd call into GC (like for creating a thread)?

I think this makes sense, yes.

> But that still doesn't say where sig_suspend and sig_resume should be defined.

These are the Boehm-specific functions, right? Why would they not be defined there?

SIGINFO = 29
SIGWINCH = 28
SIGRTMIN = 65
SIGRTMAX = 126
Contributor Author

@ysbaddaden ysbaddaden Jun 19, 2024

Let's note that SIGRTMIN and SIGRTMAX also exist on:

  • Linux: but they are functions: __libc_current_sigrtmin() and __libc_current_sigrtmax();
  • Solaris: but they call sysconf: _sysconf(_SC_SIGRT_MIN) and _sysconf(_SC_SIGRT_MAX);
  • NetBSD: but kernel only (not exposed to userland yet).
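
To illustrate why these values cannot always be bound as plain constants, here is a small C sketch that reads the real-time signal range at run time. The helper name rt_signal_range is hypothetical; the only things taken from the platform are the SIGRTMIN and SIGRTMAX macros from <signal.h>, which on glibc expand to the function calls listed above.

```c
#include <signal.h>

/* Stores the real-time signal range in *lo and *hi.
   Returns 0 on success, or -1 when <signal.h> on this platform
   does not define SIGRTMIN/SIGRTMAX. */
static int rt_signal_range(int *lo, int *hi) {
#if defined(SIGRTMIN) && defined(SIGRTMAX)
    *lo = SIGRTMIN;  /* evaluated at run time on glibc, not a literal */
    *hi = SIGRTMAX;
    return 0;
#else
    (void)lo;
    (void)hi;
    return -1;
#endif
}
```

Because the macros may expand to calls, the values are only known once the program runs, which is why a binding has to call the underlying function instead of hardcoding a number on such platforms.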

@ysbaddaden
Contributor Author

ysbaddaden commented Jun 19, 2024

There are weird CI failures on AArch64 with no error (the run was just cancelled), but I can't reproduce them 🤨

And I can't download the crystal binary artifact to try it out. The GNU test finally finished compilation and is now running, but the musl one keeps failing. Maybe the VMs still have some issue.

One VM on the AArch64 CI server was acting up.

@ysbaddaden ysbaddaden force-pushed the feature/stop-the-world branch from 9c934d5 to 88d2be3 Compare June 27, 2024 09:45
@ysbaddaden
Contributor Author

Rebased on master to pick up #14733, and fixed calls to Crystal::System.panic.

@ysbaddaden ysbaddaden marked this pull request as ready for review July 2, 2024 20:44
@straight-shoota straight-shoota added this to the 1.14.0 milestone Jul 3, 2024
@straight-shoota straight-shoota merged commit 8f26137 into crystal-lang:master Aug 7, 2024
61 checks passed
@ysbaddaden ysbaddaden deleted the feature/stop-the-world branch September 13, 2024 13:27