This repository has been archived by the owner on Jul 16, 2024. It is now read-only.

BACKPORT: mm/memory-failure: Add memory_failure_queue_kick()
The GHES code calls memory_failure_queue() from IRQ context to schedule
work on the current CPU so that memory_failure() can sleep.

For synchronous memory errors the arch code needs to know any signals
that memory_failure() will trigger are pending before it returns to
user-space, possibly when exiting from the IRQ.

Add a helper to kick the memory failure queue, to ensure the scheduled
work has happened. The helper has to be called from process context, so
the caller may have been migrated from the original CPU. Pass the CPU
the work was queued on.

Change memory_failure_work_func() to permit being called on the 'wrong'
cpu.
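
To illustrate why the work function has to find its queue from the work item rather than from the running CPU, here is a minimal userspace sketch. It is not kernel code: every `sketch_*` name, the array-backed FIFO, and the `pending` flag are assumptions made for illustration, and the kernel's `cancel_work_sync()` is only approximated by clearing that flag.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS     4   /* hypothetical CPU count */
#define SKETCH_FIFO_SIZE   8   /* hypothetical per-CPU queue depth */
#define SKETCH_MAX_HANDLED 64  /* capacity of the result log below */

/* Userspace stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_work {
	int pending;               /* 1 while work is queued but not yet run */
};

struct sketch_mf_cpu {
	unsigned long fifo[SKETCH_FIFO_SIZE];
	unsigned int head, tail;
	struct sketch_work work;   /* embedded, so container_of() can find us */
};

static struct sketch_mf_cpu sketch_cpus[SKETCH_NR_CPUS];
static unsigned long sketch_handled[SKETCH_MAX_HANDLED];
static unsigned int sketch_nr_handled;

/* Queue a pfn on the given CPU's FIFO and mark that CPU's work pending. */
static int sketch_queue(int cpu, unsigned long pfn)
{
	struct sketch_mf_cpu *mf_cpu;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	mf_cpu = &sketch_cpus[cpu];
	if (mf_cpu->head - mf_cpu->tail == SKETCH_FIFO_SIZE)
		return -1;         /* FIFO full */
	mf_cpu->fifo[mf_cpu->head++ % SKETCH_FIFO_SIZE] = pfn;
	mf_cpu->work.pending = 1;
	return 0;
}

/*
 * Drain the FIFO that owns 'work'. The queue is derived from the work
 * item itself, so the result does not depend on which CPU runs this.
 */
static void sketch_work_func(struct sketch_work *work)
{
	struct sketch_mf_cpu *mf_cpu =
		sketch_container_of(work, struct sketch_mf_cpu, work);

	while (mf_cpu->tail != mf_cpu->head) {
		unsigned long pfn = mf_cpu->fifo[mf_cpu->tail++ % SKETCH_FIFO_SIZE];

		if (sketch_nr_handled < SKETCH_MAX_HANDLED)
			sketch_handled[sketch_nr_handled++] = pfn;
	}
	work->pending = 0;
}

/* Process, right now, whatever was queued on 'cpu'. */
static void sketch_queue_kick(int cpu)
{
	struct sketch_mf_cpu *mf_cpu = &sketch_cpus[cpu];

	mf_cpu->work.pending = 0;  /* crude stand-in for cancel_work_sync() */
	sketch_work_func(&mf_cpu->work);
}
```

In this sketch, calling `sketch_queue_kick(2)` drains only the entries queued for CPU 2 and leaves other CPUs' queues untouched, which mirrors the reason the commit passes the queuing CPU to the helper.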

This patch is needed because Quicksilver firmware-first error handling
uses the SDEI notification type for communication between trusted
firmware and the OS. This adds needed NMI and SDEI functionality so
that the SDEI path in the kernel through APEI acts as an NMI and is
properly wired up to the APEI interfaces.

Backported from: https://patchwork.kernel.org/patch/10786963/

Cherry pick from: AmpereComputing/ampere-centos-kernel---DEPRECATED@f845034

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Tyler Baicar <[email protected]>
James Morse authored and Yi Li committed Jan 28, 2021
1 parent 8109a6c commit af3b585
Showing 2 changed files with 15 additions and 1 deletion.
1 change: 1 addition & 0 deletions include/linux/mm.h
@@ -2811,6 +2811,7 @@ enum mf_flags {
 };
 extern int memory_failure(unsigned long pfn, int flags);
 extern void memory_failure_queue(unsigned long pfn, int flags);
+extern void memory_failure_queue_kick(int cpu);
 extern int unpoison_memory(unsigned long pfn);
 extern int get_hwpoison_page(struct page *page);
 #define put_hwpoison_page(page) put_page(page)
15 changes: 14 additions & 1 deletion mm/memory-failure.c
@@ -1482,7 +1482,7 @@ static void memory_failure_work_func(struct work_struct *work)
 	unsigned long proc_flags;
 	int gotten;
 
-	mf_cpu = this_cpu_ptr(&memory_failure_cpu);
+	mf_cpu = container_of(work, struct memory_failure_cpu, work);
 	for (;;) {
 		spin_lock_irqsave(&mf_cpu->lock, proc_flags);
 		gotten = kfifo_get(&mf_cpu->fifo, &entry);
@@ -1496,6 +1496,19 @@ static void memory_failure_work_func(struct work_struct *work)
 	}
 }
 
+/*
+ * Process memory_failure work queued on the specified CPU.
+ * Used to avoid return-to-userspace racing with the memory_failure workqueue.
+ */
+void memory_failure_queue_kick(int cpu)
+{
+	struct memory_failure_cpu *mf_cpu;
+
+	mf_cpu = &per_cpu(memory_failure_cpu, cpu);
+	cancel_work_sync(&mf_cpu->work);
+	memory_failure_work_func(&mf_cpu->work);
+}
+
 static int __init memory_failure_init(void)
 {
 	struct memory_failure_cpu *mf_cpu;
