ACK/Cmnt: [SRU][N:gke][PATCH 000/106] Enable NVIDIA Grace platform for Google Kubernetes Engine
Philip Cox
philip.cox at canonical.com
Tue Jul 22 16:39:40 UTC 2025
In the changes to the annotations file, you have the note as:
note<'LP: #2111859, Google Grace'>
Please fix this to point to the correct LP bug (the BugLink of this series is LP: #2117098), and then this series is
Acked-by: Philip Cox <philip.cox at canonical.com>
On 2025-07-21 12:20 p.m., Tim Whisonant wrote:
> BugLink: https://bugs.launchpad.net/bugs/2117098
>
> SRU Justification
>
> [Impact]
>
> Google requested a Grace-enabled 6.8 kernel for GKE, supporting 64k
> page size. This patchset targets noble:linux-gke, and it is based on
> an earlier patchset of the same name that targeted GCP. All of the
> patches included here are the same as in the GCP patchset, with
> the three exceptions listed below. Note that noble:gke had no 64k page
> size flavor prior to this patchset.
>
> [SRU][N:gke][PATCH 094/106] UBUNTU: [Packaging] gke: enable CONFIG_CPUFREQ_ARCH_CUR_FREQ
> [SRU][N:gke][PATCH 095/106] UBUNTU: [Packaging] gke: enable CONFIG_ARM64_CONTPTE
> [SRU][N:gke][PATCH 106/106] UBUNTU: [Packaging] gke: Add 64k page flavor
>
> [Fix]
>
> Add 64k page size flavor to noble:linux-gke.
>
> Add select NVIDIA Grace platform patches, similar to what has been done
> for kernels like noble:linux-azure-nvidia. NVIDIA provides a
> document [1] listing all of the required and recommended kernel patches for
> Grace enablement. Version 11 (April 25, 2025) of the document was used
> when preparing this patchset, as was version
> Ubuntu-azure-nvidia-6.8.0-1016.17 of the noble:linux-azure-nvidia kernel.
>
> [1] https://docs.nvidia.com/grace-patch-config-guide.pdf
>
> [Test Plan]
>
> Boot testing on arm64/64k pages was performed in-house. Further testing
> will be done by Google in their Grace-enabled environment once available.
>
> [Regression potential]
>
> The regression potential is considered moderate due to the number of
> patches involved and the breadth of the changes. The changes affect
> pte, perf, PCI, cpufreq, ACPI, mmu, coresight, and other kernel
> subsystems.
>
> [Other]
>
> PIT #427777553
>
> Alexey Kardashevskiy (1):
> PCI/DOE: Support discovery version 2
>
> Arnd Bergmann (1):
> arm64/io: add constant-argument check
>
> Barry Song (1):
> mm: make folio_pte_batch available outside of mm/memory.c
>
> Beata Michalska (6):
> arm64: amu: Delay allocating cpumask for AMU FIE support
> cpufreq: Allow arch_freq_get_on_cpu to return an error
> cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
> arm64: Provide an AMU-based version of arch_freq_get_on_cpu
> arm64: Update AMU-based freq scale factor on entering idle
> arm64: Utilize for_each_cpu_wrap for reference lookup
>
> Besar Wicaksono (5):
> perf arm-spe: Add Neoverse-V2 to common data source encoding list
> perf: arm_cspmu: nvidia: remove unsupported SCF events
> perf: arm_cspmu: nvidia: fix sysfs path in the kernel doc
> perf: arm_cspmu: nvidia: enable NVLINK-C2C port filtering
> perf: arm_cspmu: nvidia: monitor all ports by default
>
> Dan Williams (1):
> ACPI/HMAT: Move HMAT messages to pr_debug()
>
> David Hildenbrand (14):
> arm/pgtable: define PFN_PTE_SHIFT
> nios2/pgtable: define PFN_PTE_SHIFT
> powerpc/pgtable: define PFN_PTE_SHIFT
> riscv/pgtable: define PFN_PTE_SHIFT
> s390/pgtable: define PFN_PTE_SHIFT
> sparc/pgtable: define PFN_PTE_SHIFT
> mm/pgtable: make pte_next_pfn() independent of set_ptes()
> arm/mm: use pte_next_pfn() in set_ptes()
> powerpc/mm: use pte_next_pfn() in set_ptes()
> mm/memory: factor out copying the actual PTE in copy_present_pte()
> mm/memory: pass PTE to copy_present_pte()
> mm/memory: optimize fork() with PTE-mapped THP
> mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
> mm/memory: ignore writable bit in folio_pte_batch()
>
> Gavin Shan (2):
> arm64: tlb: Improve __TLBI_VADDR_RANGE()
> arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES
>
> Ian Rogers (1):
> perf arm-spe/cs-etm: Directly iterate CPU maps
>
> Ilkka Koskinen (1):
> perf cs-etm: Fix the assert() to handle captured and unprocessed cpu trace
>
> Ionela Voinescu (1):
> arch_topology: init capacity_freq_ref to 0
>
> James Clark (30):
> coresight: Remove unused ETM Perf stubs
> coresight: Clarify comments around the PID of the sink owner
> coresight: Move struct coresight_trace_id_map to common header
> coresight: Expose map arguments in trace ID API
> coresight: Make CPU id map a property of a trace ID map
> coresight: Make language around "activated" sinks consistent
> coresight: Remove ops callback checks
> coresight: Move mode to struct coresight_device
> coresight: Remove the 'enable' field.
> coresight: Move all sysfs code to sysfs file
> coresight: Remove atomic type from refcnt
> coresight: Remove unused stubs
> coresight: Add explicit member initializers to coresight_dev_type
> coresight: Add helper for atomically taking the device
> coresight: Add a helper for getting csdev->mode
> coresight: Use per-sink trace ID maps for Perf sessions
> coresight: Remove pending trace ID release mechanism
> coresight: Emit sink ID in the HW_ID packets
> coresight: Make trace ID map spinlock local to the map
> perf auxtrace: Allow number of queues to be specified
> perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
> perf cs-etm: Use struct perf_cpu as much as possible
> perf cs-etm: Create decoders after both AUX and HW_ID search passes
> perf: cs-etm: Allocate queues for all CPUs
> perf: cs-etm: Move traceid_list to each queue
> perf: cs-etm: Create decoders based on the trace ID mappings
> perf: cs-etm: Only save valid trace IDs into files
> perf: cs-etm: Support version 0.1 of HW_ID packets
> perf: cs-etm: Print queue number in raw trace dump
> perf arm-spe: Use old behavior when opening old SPE files
>
> Jason Gunthorpe (5):
> arm64/io: Provide a WC friendly __iowriteXX_copy()
> net: hns3: Remove io_stop_wc() calls after __iowrite64_copy()
> x86: Stop using weak symbols for __iowrite32_copy()
> s390: Implement __iowrite32_copy()
> s390: Stop using weak symbols for __iowrite64_copy()
>
> Jie Zhan (1):
> cppc_cpufreq: Remove HiSilicon CPPC workaround
>
> Kai-Heng Feng (1):
> PCI: Use downstream bridges for distributing resources
>
> Leo Yan (8):
> perf arm-spe: Rename arm_spe__synth_data_source_generic()
> perf arm-spe: Rename the common data source encoding
> perf arm-spe: Support metadata version 2
> perf arm-spe: Introduce arm_spe__is_homogeneous()
> perf arm-spe: Use metadata to decide the data source feature
> perf arm-spe: Remove the unused 'midr' field
> perf arm-spe: Add Cortex CPUs to common data source encoding list
> perf arm-spe: Define metadata header version 2
>
> Namhyung Kim (1):
> tools/include: Sync arm64 headers with the kernel sources
>
> Petr Vaněk (1):
> mm: fix folio_pte_batch() on XEN PV
>
> Piotr Jaroszynski (1):
> Fix mmu notifiers for range-based invalidates
>
> Ryan Roberts (20):
> arm64/mm: make set_ptes() robust when OAs cross 48-bit boundary
> mm: clarify the spec for set_ptes()
> mm: thp: batch-collapse PMD with set_ptes()
> mm: introduce pte_advance_pfn() and use for pte_next_pfn()
> arm64/mm: convert pte_next_pfn() to pte_advance_pfn()
> x86/mm: convert pte_next_pfn() to pte_advance_pfn()
> mm: tidy up pte_next_pfn() definition
> arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep)
> arm64/mm: convert set_pte_at() to set_ptes(..., 1)
> arm64/mm: convert ptep_clear() to ptep_get_and_clear()
> arm64/mm: new ptep layer to manage contig bit
> arm64/mm: split __flush_tlb_range() to elide trailing DSB
> arm64/mm: wire up PTE_CONT for user mappings
> arm64/mm: implement new wrprotect_ptes() batch API
> arm64/mm: implement new [get_and_]clear_full_ptes() batch APIs
> mm: add pte_batch_hint() to reduce scanning in folio_pte_batch()
> arm64/mm: implement pte_batch_hint()
> arm64/mm: __always_inline to improve fork() perf
> arm64/mm: automatically fold contpte mappings
> arm64/mm: export contpte symbols only to GPL users
>
> Tim Whisonant (3):
> UBUNTU: [Packaging] gke: enable CONFIG_CPUFREQ_ARCH_CUR_FREQ
> UBUNTU: [Packaging] gke: enable CONFIG_ARM64_CONTPTE
> UBUNTU: [Packaging] gke: Add 64k page flavor
>
> Vidya Sagar (1):
> PCI: Clear Secondary Status errors after enumeration
>
> Documentation/admin-guide/perf/nvidia-pmu.rst | 52 +-
> Documentation/admin-guide/pm/cpufreq.rst | 17 +-
> arch/arm/include/asm/pgtable.h | 2 +
> arch/arm/mm/mmu.c | 2 +-
> arch/arm64/Kconfig | 9 +
> arch/arm64/include/asm/io.h | 128 ++++
> arch/arm64/include/asm/pgtable.h | 431 ++++++++++--
> arch/arm64/include/asm/tlbflush.h | 68 +-
> arch/arm64/kernel/efi.c | 4 +-
> arch/arm64/kernel/io.c | 42 ++
> arch/arm64/kernel/mte.c | 2 +-
> arch/arm64/kernel/topology.c | 150 ++++-
> arch/arm64/kvm/guest.c | 2 +-
> arch/arm64/mm/Makefile | 1 +
> arch/arm64/mm/contpte.c | 404 +++++++++++
> arch/arm64/mm/fault.c | 12 +-
> arch/arm64/mm/fixmap.c | 4 +-
> arch/arm64/mm/hugetlbpage.c | 40 +-
> arch/arm64/mm/kasan_init.c | 6 +-
> arch/arm64/mm/mmu.c | 16 +-
> arch/arm64/mm/pageattr.c | 6 +-
> arch/arm64/mm/trans_pgd.c | 6 +-
> arch/nios2/include/asm/pgtable.h | 2 +
> arch/powerpc/include/asm/pgtable.h | 2 +
> arch/powerpc/mm/pgtable.c | 5 +-
> arch/riscv/include/asm/pgtable.h | 2 +
> arch/s390/include/asm/io.h | 15 +
> arch/s390/include/asm/pgtable.h | 2 +
> arch/s390/pci/pci.c | 6 -
> arch/sparc/include/asm/pgtable_64.h | 2 +
> arch/x86/include/asm/io.h | 17 +
> arch/x86/include/asm/pgtable.h | 8 +-
> arch/x86/kernel/cpu/aperfmperf.c | 2 +-
> arch/x86/kernel/cpu/proc.c | 7 +-
> arch/x86/lib/Makefile | 1 -
> arch/x86/lib/iomap_copy_64.S | 15 -
> debian.gke/config/annotations | 9 +-
> debian.gke/control.d/gke-64k.inclusion-list | 259 +++++++
> debian.gke/control.d/vars.gke-64k | 6 +
> debian.gke/rules.d/arm64.mk | 2 +-
> drivers/acpi/numa/hmat.c | 24 +-
> drivers/base/arch_topology.c | 8 +-
> drivers/cpufreq/Kconfig.x86 | 12 +
> drivers/cpufreq/cppc_cpufreq.c | 73 +-
> drivers/cpufreq/cpufreq.c | 38 +-
> drivers/hwtracing/coresight/coresight-core.c | 515 +-------------
> drivers/hwtracing/coresight/coresight-dummy.c | 3 +-
> drivers/hwtracing/coresight/coresight-etb10.c | 29 +-
> .../hwtracing/coresight/coresight-etm-perf.c | 43 +-
> .../hwtracing/coresight/coresight-etm-perf.h | 18 -
> drivers/hwtracing/coresight/coresight-etm.h | 2 -
> .../coresight/coresight-etm3x-core.c | 32 +-
> .../coresight/coresight-etm3x-sysfs.c | 4 +-
> .../coresight/coresight-etm4x-core.c | 35 +-
> drivers/hwtracing/coresight/coresight-etm4x.h | 1 -
> drivers/hwtracing/coresight/coresight-priv.h | 8 +-
> drivers/hwtracing/coresight/coresight-stm.c | 33 +-
> drivers/hwtracing/coresight/coresight-sysfs.c | 392 +++++++++++
> .../hwtracing/coresight/coresight-tmc-core.c | 2 +-
> .../hwtracing/coresight/coresight-tmc-etf.c | 46 +-
> .../hwtracing/coresight/coresight-tmc-etr.c | 38 +-
> drivers/hwtracing/coresight/coresight-tmc.h | 7 +-
> drivers/hwtracing/coresight/coresight-tpda.c | 13 +-
> drivers/hwtracing/coresight/coresight-tpdm.c | 3 +-
> drivers/hwtracing/coresight/coresight-tpiu.c | 14 +-
> .../hwtracing/coresight/coresight-trace-id.c | 138 ++--
> .../hwtracing/coresight/coresight-trace-id.h | 70 +-
> drivers/hwtracing/coresight/ultrasoc-smb.c | 22 +-
> drivers/hwtracing/coresight/ultrasoc-smb.h | 2 -
> .../net/ethernet/hisilicon/hns3/hns3_enet.c | 4 -
> drivers/pci/doe.c | 12 +-
> drivers/pci/probe.c | 3 +
> drivers/pci/setup-bus.c | 3 +-
> drivers/perf/arm_cspmu/nvidia_cspmu.c | 75 +--
> include/linux/coresight-pmu.h | 17 +-
> include/linux/coresight.h | 151 ++---
> include/linux/cpufreq.h | 2 +-
> include/linux/efi.h | 5 +
> include/linux/io.h | 8 +-
> include/linux/pgtable.h | 65 +-
> include/uapi/linux/pci_regs.h | 1 +
> lib/iomap_copy.c | 13 +-
> mm/huge_memory.c | 58 +-
> mm/internal.h | 88 +++
> mm/memory.c | 143 ++--
> tools/arch/arm64/include/asm/cputype.h | 10 +
> tools/include/linux/coresight-pmu.h | 17 +-
> tools/perf/arch/arm/util/cs-etm.c | 307 ++++-----
> tools/perf/arch/arm64/util/arm-spe.c | 8 +-
> .../util/arm-spe-decoder/arm-spe-decoder.h | 18 +-
> tools/perf/util/arm-spe.c | 234 ++++++-
> tools/perf/util/arm-spe.h | 38 +-
> tools/perf/util/auxtrace.c | 9 +-
> tools/perf/util/auxtrace.h | 1 +
> .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 36 +-
> .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 2 +-
> tools/perf/util/cs-etm.c | 631 +++++++++++-------
> tools/perf/util/cs-etm.h | 12 +-
> 98 files changed, 3547 insertions(+), 1815 deletions(-)
> create mode 100644 arch/arm64/mm/contpte.c
> delete mode 100644 arch/x86/lib/iomap_copy_64.S
> create mode 100644 debian.gke/control.d/gke-64k.inclusion-list
> create mode 100644 debian.gke/control.d/vars.gke-64k
>