Switch back to Counter based SpinLockIRQ #534
This reverts commit e850a89.
It still deadlocks later on... :/
This was interesting to figure out. In short, when a scheduler-based process switch occurs, a lock is held exclusively for enabling and disabling interrupts. This lock is not dropped before the process switch. If this is not the first time the process is run, it evens out when the 'new' process drops its own copy of this lock. However, if it *is* the first time the process is run, there is no lock to drop, and the counter never gets decremented. This commit fixes the issue by decrementing the counter in the new process right before the iret that re-enables interrupts.
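A minimal user-space sketch of the imbalance described above. It is not the kernel code: `INTERRUPTS_ENABLED` stands in for the CPU interrupt flag, the counter type is an assumption, and `switch_to_process` is a hypothetical helper that only models the bookkeeping around a switch.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Simulated CPU interrupt flag; a stand-in for cli/sti (assumption).
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);
// Outstanding "keep interrupts off" requests (the type is an assumption).
static INTERRUPT_DISABLE_COUNTER: AtomicUsize = AtomicUsize::new(0);

fn disable_interrupts() {
    INTERRUPTS_ENABLED.store(false, Ordering::SeqCst);
    INTERRUPT_DISABLE_COUNTER.fetch_add(1, Ordering::SeqCst);
}

fn enable_interrupts() {
    // Only the last outstanding request turns interrupts back on.
    if INTERRUPT_DISABLE_COUNTER.fetch_sub(1, Ordering::SeqCst) == 1 {
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    }
}

fn decrement_lock_count() {
    // Adjusts the counter only; the interrupt flag is left alone.
    INTERRUPT_DISABLE_COUNTER.fetch_sub(1, Ordering::SeqCst);
}

/// Hypothetical model of one process switch.
/// Returns the counter value once the target process is running.
fn switch_to_process(first_run: bool, apply_fix: bool) -> usize {
    INTERRUPT_DISABLE_COUNTER.store(0, Ordering::SeqCst);
    INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);

    // Outgoing side: the lock is taken and still held across the switch.
    disable_interrupts();

    if first_run {
        // A brand-new process has no guard of its own to drop.
        if apply_fix {
            decrement_lock_count();
        }
        // The iret turns interrupts on in hardware, whatever the counter says.
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    } else {
        // A resumed process drops its own copy of the guard, which evens out.
        enable_interrupts();
    }
    INTERRUPT_DISABLE_COUNTER.load(Ordering::SeqCst)
}

fn main() {
    println!("resumed process:     {}", switch_to_process(false, false));
    println!("first run, no fix:   {}", switch_to_process(true, false));
    println!("first run, with fix: {}", switch_to_process(true, true));
}
```

In this model a leftover non-zero counter means a later unlock would never re-enable interrupts, which matches the deadlock the commit message describes.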
Forgot to copy paste after fixing issues in lock. Doesn't seem to be used early enough to matter, but in case it ever is...
Disable preemption. (SunriseOS/kernel/src/sync/spin_lock_irq.rs, lines 112 to 122 in 495cb50)
This comment was generated by todo based on a
Spin acquire (SunriseOS/kernel/src/sync/spin_lock_irq.rs, lines 113 to 123 in 495cb50)
Disable preemption. (SunriseOS/kernel/src/sync/spin_lock_irq.rs, lines 125 to 135 in 495cb50)
Spin acquire (SunriseOS/kernel/src/sync/spin_lock_irq.rs, lines 126 to 136 in 495cb50)
@@ -592,6 +594,7 @@ macro_rules! generate_trap_gate_handler {
if $interrupt_context {
    let _ = INSIDE_INTERRUPT_COUNT.fetch_sub(1, Ordering::SeqCst);
    unsafe { decrement_lock_count(); }
unsafe needs to have a // Safety comment, like this
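For illustration, a sketch of what such a comment could look like, pairing a `# Safety` section on the function with a `// Safety:` note at the call site. The wording of the contract and the `handler_epilogue` helper are hypothetical, and the counter type is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Assumed counter type, for the sake of a runnable example.
static INTERRUPT_DISABLE_COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Decrements the counter without touching the interrupt flag.
///
/// # Safety
///
/// The counter must be non-zero, that is, an earlier increment must not
/// have been balanced yet. (Hypothetical contract for this sketch.)
unsafe fn decrement_lock_count() {
    let previous = INTERRUPT_DISABLE_COUNTER.fetch_sub(1, Ordering::SeqCst);
    debug_assert!(previous > 0, "interrupt disable counter underflow");
}

/// Hypothetical caller: returns the counter value after the decrement.
fn handler_epilogue() -> usize {
    // Simulate the increment done when the interrupt was entered.
    INTERRUPT_DISABLE_COUNTER.fetch_add(1, Ordering::SeqCst);

    // Safety: the increment right above has not been balanced yet, so the
    // counter is non-zero and this decrement cannot underflow.
    unsafe { decrement_lock_count(); }

    INTERRUPT_DISABLE_COUNTER.load(Ordering::SeqCst)
}

fn main() {
    println!("counter after epilogue: {}", handler_epilogue());
}
```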
kernel/src/i386/process_switch.rs
Outdated
@@ -347,6 +347,8 @@ fn jump_to_entrypoint(ep: usize, userspace_stack_ptr: usize, arg1: usize, arg2:
const_assert_eq!((GdtIndex::UTlsRegion as u16) << 3 | 0b11, 0x3B);
const_assert_eq!((GdtIndex::UTlsElf as u16) << 3 | 0b11, 0x43);
const_assert_eq!((GdtIndex::UStack as u16) << 3 | 0b11, 0x4B);
// We're about to enable interrupts by doing an iret.
unsafe { crate::sync::spin_lock_irq::decrement_lock_count(); }
Idem, needs a // Safety comment explaining why this operation is safe.
Also, I'm a bit confused as to why this is necessary. Is this what was talked about on Discord, because of the interrupt_manager SpinLockIRQ? If so:
- It should definitely have a big fat comment explaining this in detail
- I think we can get rid of interrupt_manager, removing the need for this.
why is interrupt_manager necessary?
The guard for the queue is dropped before the context switch (if it wasn't, that'd probably cause deadlocks or require a force_unlock, etc.), but the context switch must still run with interrupts disabled. So it's not removable as such.
Alternatives to what I did include wrapping context_switch in decrements/increments of the counter and/or removing the guard object and using calls to increment/decrement. I think it's a bit better to associate it with the iret, but shrug.
Oh, here's another alternative: making context_switch and co less unsafe by having it manage interrupt enabling/disabling by itself instead of making everything else manage it.
kernel/src/sync/spin_lock_irq.rs
Outdated
if self.1 {
    unsafe { interrupts::sti(); }
}
unsafe { enable_interrupts(); }
I know the old code didn't have one, but a safety comment here would be nice ^^.
kernel/src/sync/spin_lock_irq.rs
Outdated
Some(internalguard) => Some(SpinLockIRQGuard(ManuallyDrop::new(internalguard))),
None => {
    // We couldn't lock. Restore irqs and return None
    unsafe { enable_interrupts(); }
Safety comment
kernel/src/sync/spin_lock_irq.rs
Outdated
}
pub fn try_lock(&self) -> Option<SpinLockIRQGuard<'_, T>> {
    // Disable irqs
    unsafe { disable_interrupts(); }
Safety comment
kernel/src/sync/spin_lock_irq.rs
Outdated
}
pub fn lock(&self) -> SpinLockIRQGuard<'_, T> {
    // Disable irqs
    unsafe { disable_interrupts(); }
Safety comment
Looks good, needs a few changes.
I think disable_interrupts/enable_interrupts don't actually need to be wrapped in unsafe?
kernel/src/sync/spin_lock_irq.rs
Outdated
/// Decrement the interrupt disable counter without possibly re-enabling interrupts.
///
/// Used to decrement counter while keeping interrupts disabled inside an interrupt.
/// Look at documentation for INTERRUPT_DISABLE_COUNTER to know more.
Safety comment explaining when it is safe to call this function.
They don't, per se. Even the one which I have marked However, I consider calling them unsafe, as manually incrementing/decrementing the counter at the wrong time is going to break the counter and cause deadlocks. Could possibly make it less fragile by disabling interrupts every time disable_interrupts is called, but that probably means that stuff will just break in a different, non-obvious way.
Is it valid for interrupts to be active here?
The interrupt wrappers call decrement_lock_count before calling this function, because it has the possibility to never return. Upon a spurious wake up, interrupts will end up being enabled. (SunriseOS/kernel/src/i386/interrupt_service_routines.rs, lines 67 to 73 in c62dc63)
The safety comments on the As for the
Looks like when I went looking for examples, I found the wrong example.
It's funny, I could've sworn there was already some tool for this, but couldn't find it when I went to look for it. *shrug*
Excellent! This feels good. Just have a couple nits on the documentation front. I'll merge it once those are included!
Thanks a lot for this 😄
Co-Authored-By: Robin Lambertz <[email protected]>
4bd1401 to 9ebe582
Fixes #190. Fixes #192.
Why this doesn't suck
Interrupt handler wrappers are now generated by a macro, so we have a single place to inject code into interrupt handlers. This allows us to tell SpinLockIRQ that interrupts are already disabled by incrementing the counter, and allows us to decrement the counter again (without re-enabling interrupts yet) right before entering userspace again.
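A much simplified sketch of the single-injection-point idea. `generate_handler_wrapper`, `timer_handler` and `timer_wrapper` are hypothetical names for illustration; the real macro in this PR is `generate_trap_gate_handler` and does considerably more. The counter type is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Assumed counter type, for the sake of a runnable example.
static INTERRUPT_DISABLE_COUNTER: AtomicUsize = AtomicUsize::new(0);

// Hypothetical macro: every wrapper it generates gets the same prologue and
// epilogue, so the counter bookkeeping lives in exactly one place.
macro_rules! generate_handler_wrapper {
    ($wrapper:ident, $handler:ident) => {
        fn $wrapper() -> usize {
            // Interrupts are already off on entry: record that in the counter.
            INTERRUPT_DISABLE_COUNTER.fetch_add(1, Ordering::SeqCst);
            let seen_by_handler = $handler();
            // Balance the counter without touching the interrupt flag; in the
            // kernel, the iret that follows is what turns interrupts back on.
            INTERRUPT_DISABLE_COUNTER.fetch_sub(1, Ordering::SeqCst);
            seen_by_handler
        }
    };
}

/// Hypothetical handler body: reports the counter value it observes.
fn timer_handler() -> usize {
    INTERRUPT_DISABLE_COUNTER.load(Ordering::SeqCst)
}

generate_handler_wrapper!(timer_wrapper, timer_handler);

fn main() {
    let inside = timer_wrapper();
    let after = INTERRUPT_DISABLE_COUNTER.load(Ordering::SeqCst);
    println!("inside handler: {inside}, after return: {after}");
}
```

Any lock taken inside the handler sees a non-zero counter, so releasing it does not turn interrupts back on in the middle of the handler.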
This also contains a fix for a mistake in #528, because I failed to copy-paste some checks into try_lock. >_> If this PR doesn't go through, then that should still be fixed anyway.
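The try_lock fragment in the diff restores interrupts when the lock cannot be taken. A user-space sketch of that shape follows, assuming a plain counter and `std::sync::Mutex` in place of the kernel's spin lock; `IrqLock`, `IrqGuard` and `disable_count` are hypothetical names, and the sketch does not attempt to reproduce the missing checks mentioned above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};

// Assumed counter type; real interrupt masking is not modelled here.
static INTERRUPT_DISABLE_COUNTER: AtomicUsize = AtomicUsize::new(0);

fn disable_interrupts() {
    INTERRUPT_DISABLE_COUNTER.fetch_add(1, Ordering::SeqCst);
}

fn enable_interrupts() {
    INTERRUPT_DISABLE_COUNTER.fetch_sub(1, Ordering::SeqCst);
}

fn disable_count() -> usize {
    INTERRUPT_DISABLE_COUNTER.load(Ordering::SeqCst)
}

/// Hypothetical stand-in for SpinLockIRQ.
struct IrqLock<T> {
    inner: Mutex<T>,
}

/// Hypothetical stand-in for SpinLockIRQGuard.
struct IrqGuard<'a, T> {
    inner: Option<MutexGuard<'a, T>>,
}

impl<T> Drop for IrqGuard<'_, T> {
    fn drop(&mut self) {
        // Release the inner lock first, then undo the interrupt disable.
        drop(self.inner.take());
        enable_interrupts();
    }
}

impl<T> IrqLock<T> {
    fn new(value: T) -> Self {
        Self { inner: Mutex::new(value) }
    }

    fn try_lock(&self) -> Option<IrqGuard<'_, T>> {
        disable_interrupts();
        match self.inner.try_lock() {
            Ok(guard) => Some(IrqGuard { inner: Some(guard) }),
            Err(_) => {
                // We couldn't lock: undo the disable before returning None.
                enable_interrupts();
                None
            }
        }
    }
}

fn main() {
    let lock = IrqLock::new(0u32);
    let held = lock.try_lock();
    println!("first attempt succeeded: {}", held.is_some());
    println!("second attempt succeeded: {}", lock.try_lock().is_some());
    drop(held);
    println!("counter after release: {}", disable_count());
}
```

Without the `enable_interrupts()` call on the failure path, every failed attempt would leave the counter one too high.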