[Bug 1874270] Re: NVMe/FC connections fail to reestablish after controller is reset
Dan Streetman
1874270 at bugs.launchpad.net
Wed Apr 7 18:09:55 UTC 2021
It looks like the service is failing because your controller is still
resetting, which appears to take several minutes. I'm not sure how the
nvme-cli tools are designed to handle such a long reset, but my first
guess would be to increase the kernel rport timeout, which from the log
output appears to be around 30 seconds. For your hardware, it seems
that timeout should be more than 180 seconds.
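
One way to inspect or raise that timeout is through sysfs. The sketch
below assumes the timeout in question is the FC transport's
per-remote-port dev_loss_tmo attribute under /sys/class/fc_remote_ports.
The function names are hypothetical, and the NVMe-FC layer's own
dev_loss_tmo may be governed separately by the driver, so treat this as
a starting point rather than a confirmed fix.

```python
from pathlib import Path

# Assumption: the FC transport class exposes one entry per remote port here.
FC_RPORT_ROOT = Path("/sys/class/fc_remote_ports")


def read_dev_loss_tmo(root=FC_RPORT_ROOT):
    """Return {rport name: dev_loss_tmo in seconds} for every rport found."""
    timeouts = {}
    for attr in sorted(Path(root).glob("rport-*/dev_loss_tmo")):
        timeouts[attr.parent.name] = int(attr.read_text().strip())
    return timeouts


def set_dev_loss_tmo(seconds, root=FC_RPORT_ROOT):
    """Write a new dev_loss_tmo to every rport and return the names updated.

    Writing needs root privileges, and the value does not survive a reboot
    or the rport being removed and re-created.
    """
    updated = []
    for attr in sorted(Path(root).glob("rport-*/dev_loss_tmo")):
        attr.write_text(f"{seconds}\n")
        updated.append(attr.parent.name)
    return updated


if __name__ == "__main__":
    # Read-only by default: just show the current values.
    for name, tmo in read_dev_loss_tmo().items():
        print(f"{name}: dev_loss_tmo={tmo}s")
```

Calling set_dev_loss_tmo(300) as root would raise the value on every
rport currently present. The figure 300 is only illustrative; the right
value depends on how long the controller reset actually takes.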
Apr 07 11:45:10 ICTM1608S01H1 root[2894793]: JD: Resetting controller A
Apr 07 11:45:28 ICTM1608S01H1 kernel: lpfc 0000:af:00.1: 5:(0):6172 NVME rescanned DID x3d0a00 port_state x2
Apr 07 11:45:28 ICTM1608S01H1 kernel: lpfc 0000:18:00.1: 1:(0):6172 NVME rescanned DID x3d0a00 port_state x2
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:28 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:28 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: io failed due to lldd error 6
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: io failed due to lldd error 6
Apr 07 11:45:29 ICTM1608S01H1 kernel: lpfc 0000:af:00.0: 4:(0):6172 NVME rescanned DID x011400 port_state x2
Apr 07 11:45:29 ICTM1608S01H1 kernel: lpfc 0000:18:00.0: 0:(0):6172 NVME rescanned DID x011400 port_state x2
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:29 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:29 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: io failed due to lldd error 6
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: io failed due to lldd error 6
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-10:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-16:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-15:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-12:0-9: blocked FC remote port time out: removing rport
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme1: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme4: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme8: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:47:07 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:07 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:08 ICTM1608S01H1 systemd-udevd[2896872]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:08 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:49:56 ICTM1608S01H1 root[2899783]: JD: Controller A online
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: nvme-subsys0 - NQN=nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: \
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme2 fc traddr=nn-0x200200a098d8580e:pn-0x202300a098d8580e host_traddr=nn-0x20000090fadcc5ce>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme3 fc traddr=nn-0x200200a098d8580e:pn-0x201300a098d8580e host_traddr=nn-0x200000109b8f2b8d>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme6 fc traddr=nn-0x200200a098d8580e:pn-0x202300a098d8580e host_traddr=nn-0x200000109b8f2b8e>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme7 fc traddr=nn-0x200200a098d8580e:pn-0x201300a098d8580e host_traddr=nn-0x20000090fadcc5cd>
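
To put numbers on the log above, here is a small sketch that computes
the intervals between the key events. The timestamps are copied by hand
from the excerpt; the dictionary keys and function names are only
illustrative.

```python
from datetime import datetime

# Timestamps taken from the journal excerpt (all on Apr 07).
EVENTS = {
    "reset_started": "11:45:10",      # "JD: Resetting controller A"
    "connectivity_lost": "11:45:28",  # first "controller connectivity lost"
    "rport_removed": "11:45:59",      # "blocked FC remote port time out: removing rport"
    "ctrl_removed": "11:46:28",       # first "dev_loss_tmo (60) expired"
    "controller_online": "11:49:56",  # "JD: Controller A online"
}


def seconds_between(start, end):
    """Elapsed whole seconds between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def summarize(events):
    """Intervals relevant to choosing a timeout, in seconds."""
    lost = events["connectivity_lost"]
    return {
        "rport_timeout": seconds_between(lost, events["rport_removed"]),
        "ctrl_removal": seconds_between(lost, events["ctrl_removed"]),
        "outage": seconds_between(lost, events["controller_online"]),
        "reset_total": seconds_between(
            events["reset_started"], events["controller_online"]
        ),
    }


if __name__ == "__main__":
    for name, secs in summarize(EVENTS).items():
        print(f"{name}: {secs}s")
```

By this reading, the rports were removed about 31 seconds after
connectivity was lost and the NVMe controllers 60 seconds after it,
while Controller A was not reported online until roughly 268 seconds
after the loss (286 seconds after the reset began).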
--
https://bugs.launchpad.net/bugs/1874270
Title:
NVMe/FC connections fail to reestablish after controller is reset
Status in nvme-cli package in Ubuntu:
Incomplete
Bug description:
My FC host can't seem to reestablish NVMe/FC connections after
resetting one of my E-Series controllers. This is with Ubuntu 20.04,
kernel-5.4.0-25-generic, and nvme-cli 1.9-1. I'm seeing this on both my
fabric-attached and direct-connect systems. These are the HBAs I'm
running with:
Emulex LPe16002B-M6 FV12.4.243.11 DV12.6.0.4 HN:ICTM1610S01H1 OS:Linux
Emulex LPe16002B-M6 FV12.4.243.11 DV12.6.0.4 HN:ICTM1610S01H1 OS:Linux
Emulex LPe32002-M2 FV12.4.243.17 DV12.6.0.4 HN:ICTM1610S01H1 OS:Linux
Emulex LPe32002-M2 FV12.4.243.17 DV12.6.0.4 HN:ICTM1610S01H1 OS:Linux
Emulex LPe35002-M2 FV12.4.243.23 DV12.6.0.4 HN:ICTM1610S01H1 OS:Linux
Emulex LPe35002-M2 FV12.4.243.23 DV12.6.0.4 HN:ICTM1610S01H1 OS:Linux
QLE2742 FW:v8.08.231 DVR:v10.01.00.19-k
QLE2742 FW:v8.08.231 DVR:v10.01.00.19-k
QLE2692 FW:v8.08.231 DVR:v10.01.00.19-k
QLE2692 FW:v8.08.231 DVR:v10.01.00.19-k
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: nvme-cli 1.9-1
ProcVersionSignature: Ubuntu 5.4.0-25.29-generic 5.4.30
Uname: Linux 5.4.0-25-generic x86_64
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
CasperMD5CheckResult: skip
Date: Wed Apr 22 09:26:00 2020
InstallationDate: Installed on 2020-04-13 (8 days ago)
InstallationMedia: Ubuntu-Server 20.04 LTS "Focal Fossa" - Alpha amd64 (20200124)
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: nvme-cli
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.nvme.hostnqn: ictm1610s01h1-hostnqn
mtime.conffile..etc.nvme.hostnqn: 2020-04-14T16:02:14.512816