[SRU][N][PATCH 0/2] PCI: Wait for device readiness with Configuration RRS
Tim Whisonant
tim.whisonant at canonical.com
Fri Apr 4 17:13:25 UTC 2025
BugLink: https://bugs.launchpad.net/bugs/2106251
SRU Justification:
[Impact]
PCI: Wait for device readiness with Configuration RRS
After a device reset, delays are required before the device can
successfully complete config accesses. PCIe r6.0, sec 6.6, specifies some
delays required before software can perform config accesses. Devices that
require more time after those delays may respond to config accesses with
Configuration Request Retry Status (RRS) completions.
Callers of pci_dev_wait() are responsible for delays until the device can
respond to config accesses. pci_dev_wait() waits any additional time until
the device can successfully complete config accesses.
Reading config space of devices that are not present or not ready typically
returns ~0 (PCI_ERROR_RESPONSE). Previously we polled the Command register
until we got a value other than ~0. This is sometimes a problem because
Root Complex handling of RRS completions may include several retries and
implementation-specific behavior that is invisible to software (see sec
2.3.2), so the exponential backoff in pci_dev_wait() may not work as
intended.
Linux enables Configuration RRS Software Visibility on all Root Ports that
support it. If it is enabled, read the Vendor ID instead of the Command
register. RRS completions cause immediate return of the 0x0001 reserved
Vendor ID value, so the pci_dev_wait() backoff works correctly.
When a read of Vendor ID eventually completes successfully by returning a
non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be
initialized and ready to respond to config requests.
For conventional PCI devices or devices below Root Ports that don't support
Configuration RRS Software Visibility, poll the Command register as before.
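The polling strategy described above can be sketched in user space as
follows. This is a simplified model, not the kernel's pci_dev_wait(): the
sim_dev structure, the sim_* helpers and the simulated clock are all
hypothetical names made up for illustration, and the -1 return stands in
for the kernel's error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_ID_RRS  0x0001u      /* reserved Vendor ID seen for an RRS completion */
#define ERROR_RESPONSE 0xffffffffu  /* ~0, typical result of a failed config read */

/* Hypothetical simulated device; not a kernel structure. */
struct sim_dev {
	bool rrs_sv;             /* RRS Software Visibility enabled above the device */
	unsigned ready_after_ms; /* device answers config reads after this long */
	uint16_t vendor_id;      /* real Vendor ID once ready */
	unsigned elapsed_ms;     /* simulated clock, advanced instead of sleeping */
};

static uint32_t sim_read_vendor(const struct sim_dev *d)
{
	return d->elapsed_ms >= d->ready_after_ms ? d->vendor_id : VENDOR_ID_RRS;
}

static uint32_t sim_read_command(const struct sim_dev *d)
{
	/* 0x0006 is just an arbitrary non-~0 value for a ready device. */
	return d->elapsed_ms >= d->ready_after_ms ? 0x0006u : ERROR_RESPONSE;
}

/* Returns 0 once the device looks ready, -1 if the timeout is exceeded. */
static int sim_dev_wait(struct sim_dev *d, unsigned timeout_ms)
{
	unsigned delay = 1;

	assert(d && timeout_ms > 0);
	for (;;) {
		if (d->rrs_sv) {
			/* RRS SV enabled: anything but 0x0001 means "done". */
			if ((sim_read_vendor(d) & 0xffffu) != VENDOR_ID_RRS)
				return 0;
		} else {
			/* Otherwise poll the Command register for non-~0. */
			if (sim_read_command(d) != ERROR_RESPONSE)
				return 0;
		}
		if (delay > timeout_ms)
			return -1;
		d->elapsed_ms += delay; /* stands in for sleeping 'delay' ms */
		delay *= 2;             /* exponential backoff */
	}
}
```

In this model a device that becomes ready after 100 ms is detected at a
simulated 127 ms, because the sleeps are 1, 2, 4, ... 64 ms. If the timeout
were 60000 ms (an assumption here, not stated in this message), a device
that never becomes ready would be given up on after 65535 simulated ms.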
This was developed independently, but is very similar to Stanislav
Spassov's previous work at
https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com
PCI: Avoid FLR for Mediatek MT7922 WiFi
commit 81f64e925c29fe6e99f04b131fac1935ac931e81 upstream.
The Mediatek MT7922 WiFi device advertises FLR support, but it apparently
does not work, and all subsequent config reads return ~0:
  pci 0000:01:00.0: [14c3:0616] type 00 class 0x028000 PCIe Endpoint
  pciback 0000:01:00.0: not ready 65535ms after FLR; giving up
After an FLR, pci_dev_wait() waits for the device to become ready. Prior
to d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS"),
it polls PCI_COMMAND until it is something other than PCI_POSSIBLE_ERROR
(~0). If it times out, pci_dev_wait() returns -ENOTTY and
__pci_reset_function_locked() tries the next available reset method.
Typically this is Secondary Bus Reset, which does work, so the MT7922 is
eventually usable.
After d591f6804e7e, if Configuration Request Retry Status Software
Visibility (RRS SV) is enabled, pci_dev_wait() polls PCI_VENDOR_ID until it
is something other than the special 0x0001 Vendor ID that indicates a
completion with RRS status.
When RRS SV is enabled, reads of PCI_VENDOR_ID should return either 0x0001,
i.e., the config read was completed with RRS, or a valid Vendor ID. On the
MT7922, it seems that all config reads after FLR return ~0 indefinitely.
When pci_dev_wait() reads PCI_VENDOR_ID and gets 0xffff, it assumes that's
a valid Vendor ID and the device is now ready, so it returns with success.
After pci_dev_wait() returns success, we restore config space and continue.
Since the MT7922 is not actually ready after the FLR, the restore fails and
the device is unusable.
We considered changing pci_dev_wait() to continue polling if a
PCI_VENDOR_ID read returns either 0x0001 or 0xffff. This "works" as it did
before d591f6804e7e, although we have to wait for the timeout and then fall
back to SBR. But it doesn't work for SR-IOV VFs, which *always* return
0xffff as the Vendor ID.
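The trade-off between the two Vendor ID checks can be illustrated with a
small sketch. The function names below are hypothetical and exist only for
this example; 0x14c3 is the Mediatek Vendor ID taken from the log line
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_ID_RRS 0x0001u /* reserved value meaning "completed with RRS" */

/* Check used when RRS SV is enabled: only 0x0001 means "keep polling". */
static bool keep_polling_adopted(uint16_t vendor_id)
{
	return vendor_id == VENDOR_ID_RRS;
}

/* Alternative that was considered: also keep polling on 0xffff. */
static bool keep_polling_considered(uint16_t vendor_id)
{
	return vendor_id == VENDOR_ID_RRS || vendor_id == 0xffffu;
}

/* Walks through the cases discussed in the text. */
static void check_cases(void)
{
	const uint16_t rrs = 0x0001, mediatek = 0x14c3, all_ones = 0xffff;

	assert(keep_polling_adopted(rrs));       /* RRS: retry */
	assert(!keep_polling_adopted(mediatek)); /* valid ID: ready */

	/*
	 * 0xffff looks "ready" to the adopted check. That is correct for
	 * an SR-IOV VF, but wrong for an MT7922 that is stuck after FLR.
	 */
	assert(!keep_polling_adopted(all_ones));

	/*
	 * The considered check would poll until timeout on 0xffff. That
	 * lets the MT7922 fall back to SBR, but a VF would never be seen
	 * as ready.
	 */
	assert(keep_polling_considered(all_ones));
}
```

Since no single check on the Vendor ID value separates a healthy VF from a
stuck MT7922, the value alone cannot resolve the problem.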
Mark Mediatek MT7922 WiFi devices to avoid the use of FLR completely. This
will cause fallback to another reset method, such as SBR.
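The effect of skipping FLR can be modelled with a short user-space sketch
of the "try the next reset method on -ENOTTY" behaviour described earlier.
Everything here is hypothetical: the sim_dev fields, the sim_* functions
and the two-entry method list are stand-ins for illustration, not the
kernel's data structures or its actual list of reset methods.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulated device; not a kernel structure. */
struct sim_dev {
	bool no_flr;     /* models the "avoid FLR" marking */
	bool flr_broken; /* device advertises FLR but wedges after it */
	bool usable;     /* result seen by the user after the reset */
};

static int sim_flr(struct sim_dev *d)
{
	if (d->no_flr)
		return -ENOTTY; /* method unavailable: caller tries the next one */
	/*
	 * Models the post-d591f6804e7e case: the wait sees 0xffff, reports
	 * success, and a broken device is left unusable.
	 */
	d->usable = !d->flr_broken;
	return 0;
}

static int sim_sbr(struct sim_dev *d)
{
	d->usable = true; /* in this model, SBR always recovers the device */
	return 0;
}

/* Try each method in order; only -ENOTTY moves on to the next one. */
static int sim_reset_function(struct sim_dev *d)
{
	int (*const methods[])(struct sim_dev *) = { sim_flr, sim_sbr };
	size_t i;

	assert(d != NULL);
	for (i = 0; i < sizeof(methods) / sizeof(methods[0]); i++) {
		int rc = methods[i](d);

		if (rc != -ENOTTY)
			return rc;
	}
	return -ENOTTY;
}
```

Without the marking, the simulated reset "succeeds" via FLR and leaves the
device unusable; with it, FLR is skipped and the reset completes via SBR.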
[Fix]
Oracular: cherry-picked from upstream
Noble: backported from upstream
Jammy: not required
Focal: not required
Bionic: not required
Xenial: not required
Trusty: not required
[Test Plan]
Compile and boot tested.
[Where problems could occur]
The changes affect the PCI core and the PCI quirks for the Mediatek MT7922
WiFi device. In the PCI core, the device wait logic changes, so the latency
between a device reset and the first configuration accesses may be altered.
The new quirk marks MT7922 devices so that FLR is never used on them, and
resets of those devices fall back to another method such as SBR.
[Notes]
Noble required the 2nd commit (81f64e925c29) to be cherry-picked.
Oracular already included this commit.
Bjorn Helgaas (2):
PCI: Wait for device readiness with Configuration RRS
PCI: Avoid FLR for Mediatek MT7922 WiFi
drivers/pci/pci.c | 41 ++++++++++++++++++++++++++++-------------
drivers/pci/pci.h | 5 +++++
drivers/pci/probe.c | 9 +++------
drivers/pci/quirks.c | 3 ++-
include/linux/pci.h | 1 +
5 files changed, 39 insertions(+), 20 deletions(-)
--
2.43.0