[SRU][N:linux-aws][PULL] Backport patches to support NVIDIA GB200
Magali Lemes
magali.lemes at canonical.com
Mon Mar 17 19:57:12 UTC 2025
BugLink: https://bugs.launchpad.net/bugs/2101185
[Impact]
AWS requested that the following patchsets, which are needed to support
GB200, be backported to 6.8 and newer kernels:
#1: Update SMMUv3 to the modern iommu API (part 2b/3)
(https://lore.kernel.org/linux-iommu/0-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com/)
#2: Add Tegra241 (Grace) CMDQV Support (part 1/2)
(https://lore.kernel.org/linux-iommu/cover.1724970714.git.nicolinc@nvidia.com/)
#3: iommu/tegra241-cmdqv: Fix alignment failure at max_n_shift
(https://lore.kernel.org/all/20241111030226.1940737-1-nicolinc@nvidia.com/)
[Fix]
Backporting patchset #1 required first backporting the patchsets it was
built on top of:
- All patches from "Update SMMUv3 to the modern iommu API (part 1/3)"
  (https://lore.kernel.org/all/0-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com/),
  except for "iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of
  attach_dev", which is already in the tree. In addition:
  * "iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V", as
    a fix to "iommu/arm-smmu-v3: Make STE programming independent of the
    callers".
- Parts of "Update SMMUv3 to the modern iommu API (part 2/3)"
  (https://lore.kernel.org/all/0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/).
- All patches from "Make the SMMUv3 CD logic match the new STE
  design" (part 2a/3)
  (https://lore.kernel.org/all/0-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com/).
  In addition:
  * "iommu: Introduce iommu_group_mutex_assert()", as pre-req for
    "iommu/arm-smmu-v3: Move the CD generation for SVA into a function".
  * "iommu/arm-smmu-v3: Fix access for STE.SHCFG", as pre-req for
    "iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry".
  * "iommu/arm-smmu-v3: Avoid uninitialized asid in case of error",
    as a fix for "iommu/arm-smmu-v3: Build the whole CD in
    arm_smmu_make_s1_cd()".
Other pre-reqs for patchset #1 included:
- "iommu: Pass domain to remove_dev_pasid() op", as pre-req for
  "iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain".
- A backport of "iommu: Add ops->domain_alloc_sva()", as pre-req for
  "iommu/arm-smmu-v3: Convert to domain_alloc_sva()".
Patchset #2 consisted almost entirely of clean cherry-picks; the only
exception was "iommu/arm-smmu-v3: Add acpi_smmu_iort_probe_model for
impl".
Additionally, these were needed:
* "iommu/arm-smmu-v3: Make the kunit into a module" and
  "iommu/arm-smmu-v3: Use *-y instead of *-objs in Makefile", so that
  "iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace)
  CMDQV" could be cherry-picked cleanly.
* "iommu/arm-smmu-v3: add missing MODULE_DESCRIPTION() macro", as a
  fix for "iommu/arm-smmu-v3: Make the kunit into a module".
* Other commits carrying a `Fixes` tag for commit 918eb5c856f6
  ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace)
  CMDQV"), all of which were clean cherry-picks.
Patchset #3's only patch and one fix to it were clean cherry-picks.
[Test Case]
Compile and boot tested.
Tested by AWS.
[Where problems could occur]
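These patches introduce significant refactoring and new features to the
SMMUv3 driver, and add support for Tegra241's CMDQ-Virtualization
(CMDQV). A few core IOMMU interfaces change as well (drivers/iommu/iommu.c,
include/linux/iommu.h), with a matching update to the Intel IOMMU driver.
Regressions would therefore most likely show up as IOMMU problems,
primarily on arm64 systems that use SMMUv3.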
[Other info]
SF #00404773
------------------------------------------------------------------------
The following changes since commit 21b44fd410f8f0e8330c93553ebaba64ab9dd36e:

  UBUNTU: Ubuntu-aws-6.8.0-1025.27 (2025-02-19 11:40:06 -0500)

are available in the Git repository at:

  git://git.launchpad.net/~magalilemes/ubuntu/+source/linux-aws/+git/noble gb200

for you to fetch changes up to 45b2986d7b60d6d562aac2e844632dfbc70d8115:

  iommu/tegra241-cmdqv: Read SMMU IDR1.CMDQS instead of hardcoding (2025-03-17 13:57:50 -0300)

----------------------------------------------------------------
Andy Shevchenko (1):
      iommu/arm-smmu-v3: Use *-y instead of *-objs in Makefile

Dan Carpenter (1):
      iommu/tegra241-cmdqv: Fix ioremap() error handling in probe()

Jason Gunthorpe (44):
      iommu/arm-smmu-v3: Make STE programming independent of the callers
      iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
      iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
      iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()
      iommu/arm-smmu-v3: Compute the STE only once for each master
      iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
      iommu/arm-smmu-v3: Put writing the context descriptor in the right order
      iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
      iommu/arm-smmu-v3: Remove arm_smmu_master->domain
      iommu/arm-smmu-v3: Check that the RID domain is S1 in SVA
      iommu/arm-smmu-v3: Add a global static IDENTITY domain
      iommu/arm-smmu-v3: Add a global static BLOCKED domain
      iommu/arm-smmu-v3: Use the identity/blocked domain during release
      iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize
      iommu/arm-smmu-v3: Convert to domain_alloc_paging()
      iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V
      iommu/arm-smmu-v3: Do not ATC invalidate the entire domain
      iommu/arm-smmu-v3: Add a type for the CD entry
      iommu/arm-smmu-v3: Add an ops indirection to the STE code
      iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
      iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function
      iommu/arm-smmu-v3: Consolidate clearing a CD table entry
      iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
      iommu/arm-smmu-v3: Allocate the CD table entry in advance
      iommu/arm-smmu-v3: Move the CD generation for SVA into a function
      iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
      iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry
      iommu: Add ops->domain_alloc_sva()
      iommu/arm-smmu-v3: Convert to domain_alloc_sva()
      iommu/arm-smmu-v3: Start building a generic PASID layer
      iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
      iommu/arm-smmu-v3: Make changing domains be hitless for ATS
      iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
      iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
      iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface
      iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
      iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
      iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
      iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used
      iommu/arm-smmu-v3: Test the STE S1DSS functionality
      iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
      iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
      iommu/arm-smmu-v3: Make the kunit into a module
      iommu/arm-smmu-v3: Add struct arm_smmu_impl_ops

Jeff Johnson (1):
      iommu/arm-smmu-v3: add missing MODULE_DESCRIPTION() macro

Luis Claudio R. Goncalves (1):
      iommu/tegra241-cmdqv: do not use smp_processor_id in preemptible context

Magali Lemes (1):
      UBUNTU: [Config] updateconfigs to enable CONFIG_TEGRA241_CMDQV

Mostafa Saleh (2):
      iommu/arm-smmu-v3: Fix access for STE.SHCFG
      iommu/arm-smmu-v3: Avoid uninitialized asid in case of error

Nate Watterson (1):
      iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV

Nicolin Chen (14):
      iommu/arm-smmu-v3: Issue a batch of commands to the same cmdq
      iommu/arm-smmu-v3: Pass in cmdq pointer to arm_smmu_cmdq_build_sync_cmd
      iommu/arm-smmu-v3: Pass in cmdq pointer to arm_smmu_cmdq_init
      iommu/arm-smmu-v3: Make symbols public for CONFIG_TEGRA241_CMDQV
      iommu/arm-smmu-v3: Add ARM_SMMU_OPT_TEGRA241_CMDQV
      iommu/arm-smmu-v3: Add acpi_smmu_iort_probe_model for impl
      iommu/arm-smmu-v3: Start a new batch if new command is not supported
      iommu/tegra241-cmdqv: Limit CMDs for VCMDQs of a guest owned VINTF
      iommu/tegra241-cmdqv: Fix -Wformat-truncation warnings in lvcmdq_error_header
      iommu/tegra241-cmdqv: Drop static at local variable
      iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent
      iommu/tegra241-cmdqv: Staticize cmdqv_debugfs_dir
      iommu/tegra241-cmdqv: Fix alignment failure at max_n_shift
      iommu/tegra241-cmdqv: Read SMMU IDR1.CMDQS instead of hardcoding

Vasant Hegde (1):
      iommu: Introduce iommu_group_mutex_assert()

Yi Liu (1):
      iommu: Pass domain to remove_dev_pasid() op

 MAINTAINERS                                      |    1 +
 debian.aws/config/annotations                    |    1 +
 drivers/iommu/Kconfig                            |   24 +-
 drivers/iommu/arm/arm-smmu-v3/Makefile           |    8 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c  |  526 +++++++-------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c |  571 ++++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c      | 1666 ++++++++++++++++++++++++++++++++++++++++++-----------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h      |  151 ++++--
 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c   |  912 ++++++++++++++++++++++++++++++++
 drivers/iommu/intel/iommu.c                      |   11 +-
 drivers/iommu/iommu-sva.c                        |    4 +-
 drivers/iommu/iommu.c                            |   40 +-
 include/linux/iommu.h                            |   14 +-
 13 files changed, 3054 insertions(+), 875 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
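
For anyone reviewing the request, the fetch and a basic sanity check could be scripted roughly as follows. This is only a sketch: the URL, branch name, and commit IDs are copied from the request above, while the `review_pull` function name is made up for illustration and is not part of any Ubuntu kernel tooling. It assumes it is run inside an existing linux-aws noble checkout that already contains the base commit.

```shell
#!/bin/sh
# Values copied verbatim from the pull request text.
REPO_URL='git://git.launchpad.net/~magalilemes/ubuntu/+source/linux-aws/+git/noble'
BRANCH='gb200'
BASE='21b44fd410f8f0e8330c93553ebaba64ab9dd36e'
TIP='45b2986d7b60d6d562aac2e844632dfbc70d8115'

# review_pull is a hypothetical helper name used only in this sketch.
# It is defined here but not called, so sourcing this file touches nothing.
review_pull() {
    # Fetch the branch; git records the fetched tip as FETCH_HEAD.
    git fetch "$REPO_URL" "$BRANCH" || return 1

    # Confirm the fetched tip is the commit named in the request.
    fetched=$(git rev-parse FETCH_HEAD) || return 1
    if [ "$fetched" != "$TIP" ]; then
        echo "unexpected tip: $fetched" >&2
        return 1
    fi

    # Print the number of commits on top of BASE, then the diffstat summary.
    git rev-list --count "$BASE..FETCH_HEAD"
    git diff --stat "$BASE" FETCH_HEAD | tail -n 3
}
```

Calling `review_pull` from the checkout prints the commit count and the last lines of the diffstat, which can then be compared by hand with the shortlog (whose per-author counts add up to 68) and with the diffstat summary above.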