ACK: [SRU][Q/R][PATCH 0/1] Fix graceful fault handling after FPU softirq changes causes hard freeze on EFI runtime calls
Yufeng Gao
yufeng.gao at canonical.com
Thu May 28 04:21:16 UTC 2026
On 22/5/26 17:07, Ivan Hu wrote:
> BugLink: https://bugs.launchpad.net/bugs/2153976
>
> [Impact]
> Running `sudo fwts uefirttime` on HP systems (CID: 202511-38089, 202511-38088,
> 02511-38091, 202511-38068, 202511-38069) with 6.17 causes a hard system freeze.
> The machine becomes unreachable by ping/SSH and requires a hard power cycle.
>
> Root cause: commit d02198550423 ("x86/fpu: Improve crypto performance by making
> kernel-mode FPU reliably usable in softirqs") changed kernel_fpu_begin() to use
> local_bh_disable() instead of preempt_disable(). This sets SOFTIRQ_OFFSET in
> preempt_count during EFI runtime calls, making in_interrupt() return true in
> normal task context. The EFI graceful page fault handler
> efi_crash_gracefully_on_page_fault() uses in_interrupt() to bail out for faults
> in real interrupt context. With SOFTIRQ_OFFSET now set, the handler always bails
> out, leaving firmware page faults unhandled. This escalates to die() which also
> sees in_interrupt() as true and calls panic("Fatal exception in interrupt"),
> freezing the system.
>
> [Fix]
> Replace in_interrupt() with !in_task() in efi_crash_gracefully_on_page_fault().
> This preserves the original intent of bailing for interrupts or NMI faults,
> while no longer falsely triggering from the FPU code path's local_bh_disable().
>
> [Test Plan]
> 1. Boot an affected HP machine with the patched 6.17-oem kernel.
> 2. Run:
> $ sudo fwts uefirttime
>
> Without patch: system hard-freezes, requires power cycle.
> With patch: fwts completes (pass or fail), system remains responsive.
>
> [Where problems could occur]
> The change is in the EFI page fault handler (arch/x86/platform/efi/quirks.c).
> If !in_task() incorrectly identifies a real interrupt-context fault as task
> context, the handler would try to process it as an EFI firmware fault instead
> of letting the normal oops path handle it. This could mask real kernel bugs
> during interrupt-context EFI faults, though such faults are extremely rare.
> Also, a softirq taken between efi_rts_work.efi_rts_id assignment and the
> fpregs_lock() call could cause a page fault that gets misidentified. The use
> of !in_task() (which incorporates in_serving_softirq()) handles this window
> correctly.
>
> Ivan Hu (1):
> x86/efi: Fix graceful fault handling after FPU softirq changes
>
> arch/x86/platform/efi/quirks.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Acked-by: Yufeng Gao <yufeng.gao at canonical.com>
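
For anyone following the root-cause reasoning quoted above, the difference
between in_interrupt() and !in_task() can be modelled in user space. The
sketch below is not kernel code and is not taken from the patch: the model_*
helpers are hypothetical names, and the bit layout is an assumption based on
my reading of include/linux/preempt.h on a non-PREEMPT_RT build. It also
assumes local_bh_disable() adds SOFTIRQ_DISABLE_OFFSET (twice SOFTIRQ_OFFSET),
which lands inside SOFTIRQ_MASK without setting the "serving softirq" bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed preempt_count layout, mirroring include/linux/preempt.h. */
#define PREEMPT_BITS  8
#define SOFTIRQ_BITS  8
#define HARDIRQ_BITS  4
#define NMI_BITS      4

#define SOFTIRQ_SHIFT (PREEMPT_BITS)
#define HARDIRQ_SHIFT (SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT     (HARDIRQ_SHIFT + HARDIRQ_BITS)

#define FIELD_MASK(bits, shift) (((1u << (bits)) - 1u) << (shift))
#define SOFTIRQ_MASK  FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK  FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)
#define NMI_MASK      FIELD_MASK(NMI_BITS, NMI_SHIFT)

#define SOFTIRQ_OFFSET         (1u << SOFTIRQ_SHIFT)  /* serving a softirq */
#define SOFTIRQ_DISABLE_OFFSET (2u * SOFTIRQ_OFFSET)  /* local_bh_disable() */
#define HARDIRQ_OFFSET         (1u << HARDIRQ_SHIFT)

/* Hypothetical model of in_interrupt(): any NMI, hardirq or softirq bit. */
static bool model_in_interrupt(unsigned int pc)
{
	return (pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* Hypothetical model of in_serving_softirq(): only the low softirq bit. */
static bool model_in_serving_softirq(unsigned int pc)
{
	return (pc & SOFTIRQ_OFFSET) != 0;
}

/* Hypothetical model of in_task(): not in NMI, hardirq or a running softirq. */
static bool model_in_task(unsigned int pc)
{
	return !((pc & (NMI_MASK | HARDIRQ_MASK)) != 0 ||
		 model_in_serving_softirq(pc));
}

/* Walks through the cases discussed in the cover letter. */
static void model_self_check(void)
{
	/* Plain task context: both checks agree. */
	assert(!model_in_interrupt(0) && model_in_task(0));

	/* Task context with BH disabled (the kernel_fpu_begin() case):
	 * in_interrupt() fires, but in_task() still holds. */
	assert(model_in_interrupt(SOFTIRQ_DISABLE_OFFSET));
	assert(model_in_task(SOFTIRQ_DISABLE_OFFSET));

	/* Really serving a softirq or hardirq: !in_task() still bails out. */
	assert(!model_in_task(SOFTIRQ_OFFSET));
	assert(!model_in_task(HARDIRQ_OFFSET));
}
```

Under these assumptions, the BH-disabled task case is the only one where the
two checks disagree, which is the case the patch needs to stop treating as
interrupt context.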