ACK: [SRU][J:linux-bluefield][PATCH v1 0/7] arm64/sme: Implement ZA context switching

Kuba Pawlak kuba.pawlak at canonical.com
Mon Aug 18 11:49:11 UTC 2025


On 7.08.2025 15:33, Stav Aviram wrote:
> BugLink: https://bugs.launchpad.net/bugs/2119457
>
> SRU Justification:
>
> [IMPACT]
> On Bluefield-2 and Bluefield-3 embedded ARM cores running Ubuntu 22.04
> Jammy (linux-bluefield-5.15), ptp4l randomly goes out of sync during
> long-running operations (~24 hours) with the error message:
> "ptp4l[3416283.946]: port 1: SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)"
>
> Debugging traces reveal that the failure occurs in the network stack's
> sendto() system call when ptp4l attempts to send DelayReq messages,
> returning error code -6 (ENXIO - "No such device or address").
> The root cause is corrupted FPSIMD register state during kernel mode
> context switches. ARM64 kernel code using NEON/FPSIMD instructions for
> network operations, cryptographic functions, or other
> performance-critical tasks can lose register state when preempted, as
> the current kernel does not preserve kernel mode FPSIMD state across
> context switches. This corruption manifests as unpredictable behavior in
> subsequent operations including network socket calls, leading to the
> observed sendto() failures that disrupt PTP synchronization. This issue
> affects PTP synchronization reliability on Bluefield hardware.
>
>
> [FIX]
> Cherry-pick and backport 7 upstream patches centered around the
> core fix for FPSIMD register corruption.
> The backport required some adaptation to remove SME dependencies not
> supported in linux-bluefield-5.15, while preserving all core kernel NEON
> functionality.
> Patches [PATCH v1 1/7] - [PATCH v1 2/7] provide the necessary
> infrastructure, [PATCH v1 3/7] - [PATCH v1 5/7] form the core functional
> series with [PATCH v1 4/7] as the primary fix, and [PATCH v1 6/7] -
> [PATCH v1 7/7] address issues discovered in the core implementation:
>
> **Core fix:**
> [PATCH v1 4/7]: This is the primary patch that fixes the FPSIMD register
> corruption issue by introducing the TIF_KERNEL_FPSTATE thread flag and
> kernel_fpsimd_state storage, enabling proper preservation and
> restoration of FPSIMD register state during context switches.
>
> **Core fix series:**
> [PATCH v1 3/7]: Removes complexity blocking the core transformation by
> eliminating the fpsimd_context_busy flag and associated infrastructure.
> [PATCH v1 5/7]: Adds a performance optimization to the core fix with lazy
> restore functionality and CPU tracking to minimize unnecessary FPSIMD
> state reloads.
>
> **Prerequisites for the core fix series:**
> Required for the core series to compile and function:
> [PATCH v1 1/7]: Provides SME infrastructure dependencies (most of which
> are not needed in Jammy since it does not support SME).
> [PATCH v1 2/7]: Refactors API to support the richer state management
> needed by the core functionality.
>
> **Bug fixes for the core patch:**
> [PATCH v1 6/7]: Fixes the assembly macro broken by the preemption model
> changes in [PATCH v1 4/7].
> [PATCH v1 7/7]: Fixes a critical user state reload bug introduced by the
> TIF_FOREIGN_FPSTATE management changes in [PATCH v1 4/7].
>
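
Editorial sketch (not part of the patch series): the save/restore idea
described for [PATCH v1 4/7] can be modelled in plain userspace C. Only
the names TIF_KERNEL_FPSTATE and kernel_fpsimd_state come from the cover
letter; struct task, cpu_regs, and the sim_* helpers below are
hypothetical stand-ins, and the real kernel code in
arch/arm64/kernel/fpsimd.c is likely to differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_VREGS 32

/* Hypothetical stand-in for the CPU's FPSIMD register file (32 x 128 bit). */
struct fpsimd_regs {
    uint64_t v[NUM_VREGS][2];
};

/* Hypothetical, heavily simplified task structure. */
struct task {
    bool kernel_fpstate;              /* models the TIF_KERNEL_FPSTATE flag */
    struct fpsimd_regs kernel_fpsimd; /* models kernel_fpsimd_state storage */
};

/* Register file of the single simulated CPU. */
static struct fpsimd_regs cpu_regs;

/* Mark the task as owning live kernel-mode FPSIMD state. */
static void sim_kernel_neon_begin(struct task *t) { t->kernel_fpstate = true; }

/* The task no longer needs its kernel-mode FPSIMD state. */
static void sim_kernel_neon_end(struct task *t) { t->kernel_fpstate = false; }

/*
 * Simulated context switch: save the outgoing task's registers if it is
 * in the middle of a kernel-mode FPSIMD section, and reload the incoming
 * task's registers if it was preempted inside such a section.
 */
static void sim_context_switch(struct task *prev, struct task *next)
{
    if (prev->kernel_fpstate)
        prev->kernel_fpsimd = cpu_regs;
    if (next->kernel_fpstate)
        cpu_regs = next->kernel_fpsimd;
}

/* Returns 1 if task A's register contents survive preemption by task B. */
int demo_preserved(void)
{
    struct task a = {0}, b = {0};
    int ok;

    memset(&cpu_regs, 0, sizeof(cpu_regs));

    sim_kernel_neon_begin(&a);
    cpu_regs.v[0][0] = 0x1111;      /* A's in-flight FPSIMD data */

    sim_context_switch(&a, &b);     /* A is preempted mid-section */
    sim_kernel_neon_begin(&b);
    cpu_regs.v[0][0] = 0x2222;      /* B clobbers the same register */
    sim_kernel_neon_end(&b);

    sim_context_switch(&b, &a);     /* A resumes */
    ok = (cpu_regs.v[0][0] == 0x1111);
    sim_kernel_neon_end(&a);
    return ok;
}
```

In this model, dropping the save in sim_context_switch() would leave task
A reading task B's value after it resumes, which is the kind of register
corruption the cover letter describes.
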
>
> [TEST CASE]
> Compile-tested on linux-bluefield-5.15 on the master-next branch. All
> patches compile cleanly with no warnings or errors. Prior to the patch
> series, ptp4l consistently failed within 24 hours with ENXIO errors from
> sendto() calls during DelayReq message transmission. After applying the
> fix, the system was tested for 7 consecutive days under the same
> conditions that previously triggered failures. No ptp4l synchronization
> failures, ENXIO errors from sendto() calls, or FPSIMD-related corruption
> were observed during the extended test period.
>
>
> [REGRESSION POTENTIAL]
> The patches introduce new code paths for kernel mode FPSIMD state
> management, with potential impact on context switch performance.
> However, the upstream patches are well-tested and present in mainline
> kernels, the backport maintains functional equivalence with upstream,
> and the extensive 7-day testing under the original failure conditions
> provides confidence in the implementation's stability.
>
> Ard Biesheuvel (5):
>    arm64: fpsimd: Drop unneeded 'busy' flag
>    arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
>    arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
>    arm64: fpsimd: Bring cond_yield asm macro in line with new rules
>    arm64/fpsimd: Avoid erroneous elide of user state reload
>
> Mark Brown (2):
>    arm64/sme: Implement ZA context switching
>    arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
>
>   arch/arm64/include/asm/assembler.h   |  25 ++--
>   arch/arm64/include/asm/fpsimd.h      |  15 +-
>   arch/arm64/include/asm/kvm_host.h    |   1 +
>   arch/arm64/include/asm/processor.h   |   3 +
>   arch/arm64/include/asm/simd.h        |  11 +-
>   arch/arm64/include/asm/thread_info.h |   1 +
>   arch/arm64/kernel/asm-offsets.c      |   2 -
>   arch/arm64/kernel/fpsimd.c           | 215 +++++++++++++++------------
>   arch/arm64/kvm/fpsimd.c              |  23 +--
>   9 files changed, 165 insertions(+), 131 deletions(-)
>
Acked-by: Kuba Pawlak <kuba.pawlak at canonical.com>