[SRU][N:gke][PATCH 000/106] Enable NVIDIA Grace platform for Google Kubernetes Engine
Tim Whisonant
tim.whisonant at canonical.com
Mon Jul 21 16:20:43 UTC 2025
BugLink: https://bugs.launchpad.net/bugs/2117098
SRU Justification
[Impact]
Google requested a Grace-enabled 6.8 kernel for GKE, supporting 64k
page size. This patchset targets noble:linux-gke, and it is based on
an earlier patchset of the same name that targeted GCP. All of the
patches included here are identical to those in the GCP patchset, with
the three exceptions listed below. Note that noble:gke had no 64k page
size flavor prior to this patchset.
[SRU][N:gke][PATCH 094/106] UBUNTU: [Packaging] gke: enable CONFIG_CPUFREQ_ARCH_CUR_FREQ
[SRU][N:gke][PATCH 095/106] UBUNTU: [Packaging] gke: enable CONFIG_ARM64_CONTPTE
[SRU][N:gke][PATCH 106/106] UBUNTU: [Packaging] gke: Add 64k page flavor
[Fix]
Add 64k page size flavor to noble:linux-gke.
Add select NVIDIA Grace platform patches, similar to what has been done
for kernels like noble:linux-azure-nvidia. NVIDIA provides a
document [1] listing all of the required and recommended kernel patches for
Grace enablement. Version 11 (April 25, 2025) of the document was used
when preparing this patchset, as was version
Ubuntu-azure-nvidia-6.8.0-1016.17 of the noble:linux-azure-nvidia kernel.
[1] https://docs.nvidia.com/grace-patch-config-guide.pdf
[Test Plan]
Boot testing on arm64/64k pages was performed in house. Further testing
will be done by Google in their Grace-enabled environment once available.
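As a rough illustration of the page-size part of a boot check, the sketch
below reads the page size reported by the running kernel and compares it
against 64 KiB. It is not the procedure that was used for the in-house
testing; the function names and the EXPECTED_PAGE_SIZE constant are
hypothetical and exist only for this example.

```python
import os

# Assumption for this sketch: a 64k page flavor reports 65536 bytes.
EXPECTED_PAGE_SIZE = 64 * 1024


def page_size_bytes() -> int:
    """Return the page size of the running kernel, in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


def is_64k_page_size(size: int) -> bool:
    """Return True if the given page size matches the 64k flavor."""
    return size == EXPECTED_PAGE_SIZE


if __name__ == "__main__":
    size = page_size_bytes()
    print(f"page size: {size} bytes")
    print("64k flavor" if is_64k_page_size(size) else "not a 64k flavor")
```

Run on the booted instance, this only confirms which page size the kernel
was built with; it says nothing about the Grace-specific patches.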
[Regression potential]
The regression potential is considered moderate due to the number of
patches involved and the breadth of the changes. The changes affect
pte, perf, PCI, cpufreq, ACPI, mmu, coresight, and other kernel
subsystems.
[Other]
PIT #427777553
Alexey Kardashevskiy (1):
PCI/DOE: Support discovery version 2
Arnd Bergmann (1):
arm64/io: add constant-argument check
Barry Song (1):
mm: make folio_pte_batch available outside of mm/memory.c
Beata Michalska (6):
arm64: amu: Delay allocating cpumask for AMU FIE support
cpufreq: Allow arch_freq_get_on_cpu to return an error
cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
arm64: Provide an AMU-based version of arch_freq_get_on_cpu
arm64: Update AMU-based freq scale factor on entering idle
arm64: Utilize for_each_cpu_wrap for reference lookup
Besar Wicaksono (5):
perf arm-spe: Add Neoverse-V2 to common data source encoding list
perf: arm_cspmu: nvidia: remove unsupported SCF events
perf: arm_cspmu: nvidia: fix sysfs path in the kernel doc
perf: arm_cspmu: nvidia: enable NVLINK-C2C port filtering
perf: arm_cspmu: nvidia: monitor all ports by default
Dan Williams (1):
ACPI/HMAT: Move HMAT messages to pr_debug()
David Hildenbrand (14):
arm/pgtable: define PFN_PTE_SHIFT
nios2/pgtable: define PFN_PTE_SHIFT
powerpc/pgtable: define PFN_PTE_SHIFT
riscv/pgtable: define PFN_PTE_SHIFT
s390/pgtable: define PFN_PTE_SHIFT
sparc/pgtable: define PFN_PTE_SHIFT
mm/pgtable: make pte_next_pfn() independent of set_ptes()
arm/mm: use pte_next_pfn() in set_ptes()
powerpc/mm: use pte_next_pfn() in set_ptes()
mm/memory: factor out copying the actual PTE in copy_present_pte()
mm/memory: pass PTE to copy_present_pte()
mm/memory: optimize fork() with PTE-mapped THP
mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
mm/memory: ignore writable bit in folio_pte_batch()
Gavin Shan (2):
arm64: tlb: Improve __TLBI_VADDR_RANGE()
arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES
Ian Rogers (1):
perf arm-spe/cs-etm: Directly iterate CPU maps
Ilkka Koskinen (1):
perf cs-etm: Fix the assert() to handle captured and unprocessed cpu trace
Ionela Voinescu (1):
arch_topology: init capacity_freq_ref to 0
James Clark (30):
coresight: Remove unused ETM Perf stubs
coresight: Clarify comments around the PID of the sink owner
coresight: Move struct coresight_trace_id_map to common header
coresight: Expose map arguments in trace ID API
coresight: Make CPU id map a property of a trace ID map
coresight: Make language around "activated" sinks consistent
coresight: Remove ops callback checks
coresight: Move mode to struct coresight_device
coresight: Remove the 'enable' field.
coresight: Move all sysfs code to sysfs file
coresight: Remove atomic type from refcnt
coresight: Remove unused stubs
coresight: Add explicit member initializers to coresight_dev_type
coresight: Add helper for atomically taking the device
coresight: Add a helper for getting csdev->mode
coresight: Use per-sink trace ID maps for Perf sessions
coresight: Remove pending trace ID release mechanism
coresight: Emit sink ID in the HW_ID packets
coresight: Make trace ID map spinlock local to the map
perf auxtrace: Allow number of queues to be specified
perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
perf cs-etm: Use struct perf_cpu as much as possible
perf cs-etm: Create decoders after both AUX and HW_ID search passes
perf: cs-etm: Allocate queues for all CPUs
perf: cs-etm: Move traceid_list to each queue
perf: cs-etm: Create decoders based on the trace ID mappings
perf: cs-etm: Only save valid trace IDs into files
perf: cs-etm: Support version 0.1 of HW_ID packets
perf: cs-etm: Print queue number in raw trace dump
perf arm-spe: Use old behavior when opening old SPE files
Jason Gunthorpe (5):
arm64/io: Provide a WC friendly __iowriteXX_copy()
net: hns3: Remove io_stop_wc() calls after __iowrite64_copy()
x86: Stop using weak symbols for __iowrite32_copy()
s390: Implement __iowrite32_copy()
s390: Stop using weak symbols for __iowrite64_copy()
Jie Zhan (1):
cppc_cpufreq: Remove HiSilicon CPPC workaround
Kai-Heng Feng (1):
PCI: Use downstream bridges for distributing resources
Leo Yan (8):
perf arm-spe: Rename arm_spe__synth_data_source_generic()
perf arm-spe: Rename the common data source encoding
perf arm-spe: Support metadata version 2
perf arm-spe: Introduce arm_spe__is_homogeneous()
perf arm-spe: Use metadata to decide the data source feature
perf arm-spe: Remove the unused 'midr' field
perf arm-spe: Add Cortex CPUs to common data source encoding list
perf arm-spe: Define metadata header version 2
Namhyung Kim (1):
tools/include: Sync arm64 headers with the kernel sources
Petr Vaněk (1):
mm: fix folio_pte_batch() on XEN PV
Piotr Jaroszynski (1):
Fix mmu notifiers for range-based invalidates
Ryan Roberts (20):
arm64/mm: make set_ptes() robust when OAs cross 48-bit boundary
mm: clarify the spec for set_ptes()
mm: thp: batch-collapse PMD with set_ptes()
mm: introduce pte_advance_pfn() and use for pte_next_pfn()
arm64/mm: convert pte_next_pfn() to pte_advance_pfn()
x86/mm: convert pte_next_pfn() to pte_advance_pfn()
mm: tidy up pte_next_pfn() definition
arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep)
arm64/mm: convert set_pte_at() to set_ptes(..., 1)
arm64/mm: convert ptep_clear() to ptep_get_and_clear()
arm64/mm: new ptep layer to manage contig bit
arm64/mm: split __flush_tlb_range() to elide trailing DSB
arm64/mm: wire up PTE_CONT for user mappings
arm64/mm: implement new wrprotect_ptes() batch API
arm64/mm: implement new [get_and_]clear_full_ptes() batch APIs
mm: add pte_batch_hint() to reduce scanning in folio_pte_batch()
arm64/mm: implement pte_batch_hint()
arm64/mm: __always_inline to improve fork() perf
arm64/mm: automatically fold contpte mappings
arm64/mm: export contpte symbols only to GPL users
Tim Whisonant (3):
UBUNTU: [Packaging] gke: enable CONFIG_CPUFREQ_ARCH_CUR_FREQ
UBUNTU: [Packaging] gke: enable CONFIG_ARM64_CONTPTE
UBUNTU: [Packaging] gke: Add 64k page flavor
Vidya Sagar (1):
PCI: Clear Secondary Status errors after enumeration
Documentation/admin-guide/perf/nvidia-pmu.rst | 52 +-
Documentation/admin-guide/pm/cpufreq.rst | 17 +-
arch/arm/include/asm/pgtable.h | 2 +
arch/arm/mm/mmu.c | 2 +-
arch/arm64/Kconfig | 9 +
arch/arm64/include/asm/io.h | 128 ++++
arch/arm64/include/asm/pgtable.h | 431 ++++++++++--
arch/arm64/include/asm/tlbflush.h | 68 +-
arch/arm64/kernel/efi.c | 4 +-
arch/arm64/kernel/io.c | 42 ++
arch/arm64/kernel/mte.c | 2 +-
arch/arm64/kernel/topology.c | 150 ++++-
arch/arm64/kvm/guest.c | 2 +-
arch/arm64/mm/Makefile | 1 +
arch/arm64/mm/contpte.c | 404 +++++++++++
arch/arm64/mm/fault.c | 12 +-
arch/arm64/mm/fixmap.c | 4 +-
arch/arm64/mm/hugetlbpage.c | 40 +-
arch/arm64/mm/kasan_init.c | 6 +-
arch/arm64/mm/mmu.c | 16 +-
arch/arm64/mm/pageattr.c | 6 +-
arch/arm64/mm/trans_pgd.c | 6 +-
arch/nios2/include/asm/pgtable.h | 2 +
arch/powerpc/include/asm/pgtable.h | 2 +
arch/powerpc/mm/pgtable.c | 5 +-
arch/riscv/include/asm/pgtable.h | 2 +
arch/s390/include/asm/io.h | 15 +
arch/s390/include/asm/pgtable.h | 2 +
arch/s390/pci/pci.c | 6 -
arch/sparc/include/asm/pgtable_64.h | 2 +
arch/x86/include/asm/io.h | 17 +
arch/x86/include/asm/pgtable.h | 8 +-
arch/x86/kernel/cpu/aperfmperf.c | 2 +-
arch/x86/kernel/cpu/proc.c | 7 +-
arch/x86/lib/Makefile | 1 -
arch/x86/lib/iomap_copy_64.S | 15 -
debian.gke/config/annotations | 9 +-
debian.gke/control.d/gke-64k.inclusion-list | 259 +++++++
debian.gke/control.d/vars.gke-64k | 6 +
debian.gke/rules.d/arm64.mk | 2 +-
drivers/acpi/numa/hmat.c | 24 +-
drivers/base/arch_topology.c | 8 +-
drivers/cpufreq/Kconfig.x86 | 12 +
drivers/cpufreq/cppc_cpufreq.c | 73 +-
drivers/cpufreq/cpufreq.c | 38 +-
drivers/hwtracing/coresight/coresight-core.c | 515 +-------------
drivers/hwtracing/coresight/coresight-dummy.c | 3 +-
drivers/hwtracing/coresight/coresight-etb10.c | 29 +-
.../hwtracing/coresight/coresight-etm-perf.c | 43 +-
.../hwtracing/coresight/coresight-etm-perf.h | 18 -
drivers/hwtracing/coresight/coresight-etm.h | 2 -
.../coresight/coresight-etm3x-core.c | 32 +-
.../coresight/coresight-etm3x-sysfs.c | 4 +-
.../coresight/coresight-etm4x-core.c | 35 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 1 -
drivers/hwtracing/coresight/coresight-priv.h | 8 +-
drivers/hwtracing/coresight/coresight-stm.c | 33 +-
drivers/hwtracing/coresight/coresight-sysfs.c | 392 +++++++++++
.../hwtracing/coresight/coresight-tmc-core.c | 2 +-
.../hwtracing/coresight/coresight-tmc-etf.c | 46 +-
.../hwtracing/coresight/coresight-tmc-etr.c | 38 +-
drivers/hwtracing/coresight/coresight-tmc.h | 7 +-
drivers/hwtracing/coresight/coresight-tpda.c | 13 +-
drivers/hwtracing/coresight/coresight-tpdm.c | 3 +-
drivers/hwtracing/coresight/coresight-tpiu.c | 14 +-
.../hwtracing/coresight/coresight-trace-id.c | 138 ++--
.../hwtracing/coresight/coresight-trace-id.h | 70 +-
drivers/hwtracing/coresight/ultrasoc-smb.c | 22 +-
drivers/hwtracing/coresight/ultrasoc-smb.h | 2 -
.../net/ethernet/hisilicon/hns3/hns3_enet.c | 4 -
drivers/pci/doe.c | 12 +-
drivers/pci/probe.c | 3 +
drivers/pci/setup-bus.c | 3 +-
drivers/perf/arm_cspmu/nvidia_cspmu.c | 75 +--
include/linux/coresight-pmu.h | 17 +-
include/linux/coresight.h | 151 ++---
include/linux/cpufreq.h | 2 +-
include/linux/efi.h | 5 +
include/linux/io.h | 8 +-
include/linux/pgtable.h | 65 +-
include/uapi/linux/pci_regs.h | 1 +
lib/iomap_copy.c | 13 +-
mm/huge_memory.c | 58 +-
mm/internal.h | 88 +++
mm/memory.c | 143 ++--
tools/arch/arm64/include/asm/cputype.h | 10 +
tools/include/linux/coresight-pmu.h | 17 +-
tools/perf/arch/arm/util/cs-etm.c | 307 ++++-----
tools/perf/arch/arm64/util/arm-spe.c | 8 +-
.../util/arm-spe-decoder/arm-spe-decoder.h | 18 +-
tools/perf/util/arm-spe.c | 234 ++++++-
tools/perf/util/arm-spe.h | 38 +-
tools/perf/util/auxtrace.c | 9 +-
tools/perf/util/auxtrace.h | 1 +
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 36 +-
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 2 +-
tools/perf/util/cs-etm.c | 631 +++++++++++-------
tools/perf/util/cs-etm.h | 12 +-
98 files changed, 3547 insertions(+), 1815 deletions(-)
create mode 100644 arch/arm64/mm/contpte.c
delete mode 100644 arch/x86/lib/iomap_copy_64.S
create mode 100644 debian.gke/control.d/gke-64k.inclusion-list
create mode 100644 debian.gke/control.d/vars.gke-64k
--
2.43.0