[SRU][F][PATCH 0/3] Fix nouveau error storm and unresponsive display after desktop idle timeout

Jacob Martin jacob.martin at canonical.com
Tue Dec 10 22:03:05 UTC 2024


BugLink: https://bugs.launchpad.net/bugs/2078011

SRU Justification

[Impact]

On a system with a GV100 GPU using the nouveau driver, the display becomes
unresponsive and a storm of "nouveau 0000:07:00.0: disp: ctrl 00000080"
messages is continuously printed to dmesg once the desktop environment reaches
its idle timeout. This interferes with certification testing for the DGX
Station desktop system, as the system eventually becomes unresponsive
during testing.

[Fix]

This only affects Focal.

Backporting the following patches from the v5.6 kernel resolves the issue:
58ae5284f6 ("drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms")
5bb88d0794 ("drm/nouveau/kms/gv100-: move window ownership setup into modesetting path")
137c4ba716 ("drm/nouveau/kms/gv100-: avoid sending a core update until the first modeset")

[Test Case]

1. Install desktop environment
$ sudo apt install ubuntu-desktop

2. Configure GDM
$ sudo vim /etc/gdm3/custom.conf
  => Uncomment WaylandEnable=false
  => Configure automatic login for the `ubuntu` user by setting
        AutomaticLoginEnable = true
        AutomaticLogin = ubuntu

3. Disable display timeout
$ gsettings set org.gnome.desktop.session idle-delay 0

4. Set graphical as the default target
$ sudo systemctl set-default graphical.target

5. Reboot the system

6. Enable a 1 second display timeout and wait ~10 seconds
$ gsettings set org.gnome.desktop.session idle-delay 1

7. Verify that, with these patches applied, the display wakes up from idle and
the system remains usable, with no storm of "nouveau 0000:07:00.0:
disp: ctrl 00000080" messages in dmesg.
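
As an optional aid for step 7, the dmesg check can be scripted. The following
is only a sketch and not part of the patch series: the helper name
count_disp_errors is hypothetical, and the pattern assumes the message format
quoted above, with the PCI address left as a wildcard since it differs
between systems.

```shell
# Count nouveau display-error lines in kernel log text read from stdin.
# Assumption: the storm lines look like the message quoted in this cover
# letter, "nouveau <pci-address>: disp: ctrl 00000080".
count_disp_errors() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are no matches, so "|| true" keeps the helper's status at 0.
    grep -cE 'nouveau [0-9a-f:.]+: disp: ctrl 00000080' || true
}

# Example (reading the kernel log may require root):
#   sudo dmesg | count_disp_errors
```

On an affected kernel the count is expected to keep growing after the idle
timeout is reached; with the patches applied it should stay flat. Note that
the kernel ring buffer can wrap during a storm, so the count is a rough
indicator rather than an exact total.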

[Where things could go wrong]

These changes affect only the nouveau driver. Issues would appear as
misbehavior of the nouveau driver, most likely on NVIDIA Volta GPUs.

Ben Skeggs (3):
  drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR
    storms
  drm/nouveau/kms/gv100-: move window ownership setup into modesetting
    path
  drm/nouveau/kms/gv100-: avoid sending a core update until the first
    modeset

 drivers/gpu/drm/nouveau/dispnv50/core.h       |  6 +++++
 drivers/gpu/drm/nouveau/dispnv50/corec37d.c   | 23 +++++++++++++++----
 drivers/gpu/drm/nouveau/dispnv50/corec57d.c   |  9 ++++----
 drivers/gpu/drm/nouveau/dispnv50/disp.c       | 16 +++++++++++++
 .../gpu/drm/nouveau/nvkm/engine/disp/gv100.c  |  6 +++++
 5 files changed, 50 insertions(+), 10 deletions(-)

-- 
2.43.0



