ACK: [SRU][N][PATCH v2 0/2] PCI: Wait for device readiness with Configuration RRS

Massimiliano Pellizzer massimiliano.pellizzer at canonical.com
Tue Apr 8 07:04:05 UTC 2025


On Tue, 8 Apr 2025 at 02:03, Tim Whisonant <tim.whisonant at canonical.com> wrote:
>
> BugLink: https://bugs.launchpad.net/bugs/2106251
>
> SRU Justification:
>
> [Impact]
>
> PCI: Wait for device readiness with Configuration RRS
>
> After a device reset, delays are required before the device can
> successfully complete config accesses.  PCIe r6.0, sec 6.6, specifies some
> delays required before software can perform config accesses.  Devices that
> require more time after those delays may respond to config accesses with
> Configuration Request Retry Status (RRS) completions.
>
> Callers of pci_dev_wait() are responsible for delays until the device can
> respond to config accesses.  pci_dev_wait() waits any additional time until
> the device can successfully complete config accesses.
>
> Reading config space of devices that are not present or not ready typically
> returns ~0 (PCI_ERROR_RESPONSE).  Previously we polled the Command register
> until we got a value other than ~0.  This is sometimes a problem because
> Root Complex handling of RRS completions may include several retries and
> implementation-specific behavior that is invisible to software (see sec
> 2.3.2), so the exponential backoff in pci_dev_wait() may not work as
> intended.
>
> Linux enables Configuration RRS Software Visibility on all Root Ports that
> support it.  If it is enabled, read the Vendor ID instead of the Command
> register.  RRS completions cause immediate return of the 0x0001 reserved
> Vendor ID value, so the pci_dev_wait() backoff works correctly.
>
> When a read of Vendor ID eventually completes successfully by returning a
> non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be
> initialized and ready to respond to config requests.
>
> For conventional PCI devices or devices below Root Ports that don't support
> Configuration RRS Software Visibility, poll the Command register as before.
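
The polling rule described above can be modelled in user space. This is only a sketch, not the kernel's pci_dev_wait(): wait_ready_sketch(), mock_read() and struct mock_dev are hypothetical names made up for illustration, and the "sleep" is simulated by adding up the delays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_ID_RRS  0x0001u      /* reserved Vendor ID seen for RRS completions */
#define ERROR_RESPONSE 0xffffffffu  /* ~0, typical result of a failed config read */

/* Hypothetical config-read hook: reads Vendor ID or Command, as requested. */
typedef uint32_t (*cfg_read_fn)(void *ctx, bool read_vendor_id);

/* Hypothetical mock device: not ready for the first N reads. */
struct mock_dev {
	int reads_until_ready;
	uint32_t ready_value;   /* value returned once the device is ready */
};

static uint32_t mock_read(void *ctx, bool read_vendor_id)
{
	struct mock_dev *d = ctx;

	if (d->reads_until_ready > 0) {
		d->reads_until_ready--;
		return read_vendor_id ? VENDOR_ID_RRS : ERROR_RESPONSE;
	}
	return d->ready_value;
}

/*
 * Poll with exponential backoff (1, 2, 4, ... ms).
 * Returns the total simulated wait in ms, or -1 on timeout.
 */
static int wait_ready_sketch(cfg_read_fn read, void *ctx, bool rrs_sv,
			     int timeout_ms)
{
	int delay = 1;
	int waited = 0;

	assert(read != NULL);

	for (;;) {
		uint32_t val = read(ctx, rrs_sv);
		bool ready;

		if (rrs_sv)	/* RRS SV enabled: look at the Vendor ID */
			ready = (val & 0xffffu) != VENDOR_ID_RRS;
		else		/* otherwise: Command register, as before */
			ready = val != ERROR_RESPONSE;

		if (ready)
			return waited;
		if (delay > timeout_ms)
			return -1;

		waited += delay;	/* real code would sleep here */
		delay *= 2;
	}
}
```

With a mock device that answers with RRS three times before returning its Vendor ID, the sketch backs off for 1 + 2 + 4 ms before reporting the device ready.
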
>
> This was developed independently, but is very similar to Stanislav
> Spassov's previous work at
> https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com
>
> PCI: Avoid FLR for Mediatek MT7922 WiFi
>
> commit 81f64e925c29fe6e99f04b131fac1935ac931e81 upstream.
>
> The Mediatek MT7922 WiFi device advertises FLR support, but it apparently
> does not work, and all subsequent config reads return ~0:
>
>   pci 0000:01:00.0: [14c3:0616] type 00 class 0x028000 PCIe Endpoint
>   pciback 0000:01:00.0: not ready 65535ms after FLR; giving up
>
> After an FLR, pci_dev_wait() waits for the device to become ready.  Prior
> to d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS"),
> it polls PCI_COMMAND until it is something other than PCI_POSSIBLE_ERROR
> (~0).  If it times out, pci_dev_wait() returns -ENOTTY and
> __pci_reset_function_locked() tries the next available reset method.
> Typically this is Secondary Bus Reset, which does work, so the MT7922 is
> eventually usable.
>
> After d591f6804e7e, if Configuration Request Retry Status Software
> Visibility (RRS SV) is enabled, pci_dev_wait() polls PCI_VENDOR_ID until it
> is something other than the special 0x0001 Vendor ID that indicates a
> completion with RRS status.
>
> When RRS SV is enabled, reads of PCI_VENDOR_ID should return either 0x0001,
> i.e., the config read was completed with RRS, or a valid Vendor ID.  On the
> MT7922, it seems that all config reads after FLR return ~0 indefinitely.
> When pci_dev_wait() reads PCI_VENDOR_ID and gets 0xffff, it assumes that's
> a valid Vendor ID and the device is now ready, so it returns with success.
>
> After pci_dev_wait() returns success, we restore config space and continue.
> Since the MT7922 is not actually ready after the FLR, the restore fails and
> the device is unusable.
>
> We considered changing pci_dev_wait() to continue polling if a
> PCI_VENDOR_ID read returns either 0x0001 or 0xffff.  This "works" as it did
> before d591f6804e7e, although we have to wait for the timeout and then fall
> back to SBR.  But it doesn't work for SR-IOV VFs, which *always* return
> 0xffff as the Vendor ID.
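
The trade-off in the two paragraphs above can be put side by side in a small sketch. The helper names ready_rrs_only() and ready_strict() are hypothetical and exist only for this illustration; neither is a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_ID_RRS 0x0001u  /* reserved Vendor ID seen for RRS completions */
#define VENDOR_ID_ALL 0xffffu  /* all ones: VF Vendor ID, or a failed read */

/* Check used when RRS SV is enabled: only 0x0001 means "not ready". */
static bool ready_rrs_only(uint16_t vendor_id)
{
	return vendor_id != VENDOR_ID_RRS;
}

/* Considered alternative: also treat 0xffff as "not ready". */
static bool ready_strict(uint16_t vendor_id)
{
	return vendor_id != VENDOR_ID_RRS && vendor_id != VENDOR_ID_ALL;
}
```

ready_rrs_only(0xffff) is true, which is the MT7922 false positive: a device that answers every read with ~0 looks ready. ready_strict(0xffff) is false, which would make an SR-IOV VF appear never ready, since its Vendor ID always reads as 0xffff.
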
>
> Mark Mediatek MT7922 WiFi devices to avoid the use of FLR completely.  This
> will cause fallback to another reset method, such as SBR.
>
> [Fix]
>
> Oracular: cherry-picked from upstream
> Noble:    backported from upstream
> Jammy:    not required
> Focal:    not required
> Bionic:   not required
> Xenial:   not required
> Trusty:   not required
>
> [Test Plan]
>
> Compile and boot tested.
>
> [Where problems could occur]
>
> The changes affect the PCI core: the device wait logic and the quirk
> handling for the Mediatek MT7922 WiFi device.  Latency between device
> reset and configuration accesses may be altered.  The new quirk in
> drivers/pci/quirks.c marks MT7922 devices as not supporting FLR, so
> resets of those devices fall back to another method such as SBR.
>
> [Notes]
>
> Noble required the 2nd commit (81f64e925c29) to be cherry-picked.
> Oracular already included this commit.
>
> v2 - inspection of v1 revealed that the "cherry picked from commit"
> line in the Mediatek patch was missing the linux-6.12.y reference. It
> was also observed that mutt seemed to have added MIME-Version: 1.0 to
> the subject line of the cover letter.
>
> Bjorn Helgaas (2):
>   PCI: Wait for device readiness with Configuration RRS
>   PCI: Avoid FLR for Mediatek MT7922 WiFi
>
>  drivers/pci/pci.c    | 41 ++++++++++++++++++++++++++++-------------
>  drivers/pci/pci.h    |  5 +++++
>  drivers/pci/probe.c  |  9 +++------
>  drivers/pci/quirks.c |  3 ++-
>  include/linux/pci.h  |  1 +
>  5 files changed, 39 insertions(+), 20 deletions(-)
>
> --
> 2.43.0
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team

Acked-by: Massimiliano Pellizzer <massimiliano.pellizzer at canonical.com>

-- 
Massimiliano Pellizzer


