[Bug 2127072] Re: NVMe/TCP on 25.10 fails to boot

Olivier Gayot 2127072 at bugs.launchpad.net
Wed Oct 8 10:49:50 UTC 2025


** Also affects: curtin
   Importance: Undecided
       Status: New

** Description changed:

  On 25.10, installing using NVMe/TCP results in a system that fails to
  boot.
  
  After transitioning from the initramfs to the real rootfs, there is a
  network drop and the system fails to recover. The following errors are
  visible:
  
  [   12.922501] nvme0c0n1: I/O Cmd(0x2) @ LBA 9945776, 168 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.922575] nvme0c0n1: I/O Cmd(0x2) @ LBA 592372, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.923408] I/O error, dev nvme0c0n1, sector 9945776 op 0x0:(READ) flags 0x2080700 phys_seg 21 prio class 2
  [   12.923936] I/O error, dev nvme0c0n1, sector 592372 op 0x0:(READ) flags 0x2080700 phys_seg 16 prio class 2
  [   12.924652] nvme0c0n1: I/O Cmd(0x2) @ LBA 5667208, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.951841] nvme0c0n1: I/O Cmd(0x2) @ LBA 5863296, 128 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.955095] I/O error, dev nvme0c0n1, sector 5667208 op 0x0:(READ) flags 0x2080700 phys_seg 4 prio class 2
  [   12.960451] I/O error, dev nvme0c0n1, sector 5863296 op 0x0:(READ) flags 0x2080700 phys_seg 16 prio class 2
  [   12.963821] nvme0c0n1: I/O Cmd(0x2) @ LBA 204992, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.926722] nvme0c0n1: I/O Cmd(0x2) @ LBA 9967528, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.969511] I/O error, dev nvme0c0n1, sector 204992 op 0x0:(READ) flags 0x2080700 phys_seg 4 prio class 2
  [   12.972241] I/O error, dev nvme0c0n1, sector 9967528 op 0x0:(READ) flags 0x2080700 phys_seg 1 prio class 2
  [   12.927391] nvme0c0n1: I/O Cmd(0x2) @ LBA 2514864, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.927942] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111536, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928080] I/O error, dev nvme0c0n1, sector 1111536 op 0x0:(READ) flags 0x2083700 phys_seg 1 prio class 2
  [   12.928256] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111560, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928257] I/O error, dev nvme0c0n1, sector 1111560 op 0x0:(READ) flags 0x2083700 phys_seg 1 prio class 2
  [   12.928288] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111584, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928259] I/O error, dev nvme0c0n1, sector 1111584 op 0x0:(READ) flags 0x2083700 phys_seg 4 prio class 2
  [   23.950951] nvme nvme0: failed nvme_keep_alive_end_io error=-10
  [   23.034258] nvme nvme0: failed to bind queue 0 socket -99
  [   23.274221] nvme nvme0: failed to bind queue 0 socket -99
  [   43.514653] nvme nvme0: failed to bind queue 0 socket -99
  [   53.754266] nvme nvme0: failed to bind queue 0 socket -99
  [   63.994488] nvme nvme0: failed to bind queue 0 socket -99
  
  After investigation, this happens because of the `set-name` directive in
  the netplan configuration.
  
  $ grep set-name /etc/netplan/00-installer-config.yaml
-        set-name enp1s0
+        set-name enp1s0
  
  which is coming from the /etc/netplan/50-cloud-init.yaml config in the
- live installer.
+ live installer since 25.10.
  
  Indeed, this directive seems to cause netplan to attempt to rename
  `nbft0` to `enp1s0` and causing the network loss that we need to avoid
  with NVMe/TCP.
  
  We have at least two options:
-  * dropping the set-name directive (recommended)
-  * force dracut to use the expected name (i.e., enp1s0) instead of `nbft0`.
+  * dropping the set-name directive (recommended)
+  * force dracut to use the expected name (i.e., enp1s0) instead of `nbft0`.

** Description changed:

  On 25.10, installing using NVMe/TCP results in a system that fails to
  boot.
  
  After transitioning from the initramfs to the real rootfs, there is a
  network drop and the system fails to recover. The following errors are
  visible:
  
  [   12.922501] nvme0c0n1: I/O Cmd(0x2) @ LBA 9945776, 168 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.922575] nvme0c0n1: I/O Cmd(0x2) @ LBA 592372, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.923408] I/O error, dev nvme0c0n1, sector 9945776 op 0x0:(READ) flags 0x2080700 phys_seg 21 prio class 2
  [   12.923936] I/O error, dev nvme0c0n1, sector 592372 op 0x0:(READ) flags 0x2080700 phys_seg 16 prio class 2
  [   12.924652] nvme0c0n1: I/O Cmd(0x2) @ LBA 5667208, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.951841] nvme0c0n1: I/O Cmd(0x2) @ LBA 5863296, 128 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.955095] I/O error, dev nvme0c0n1, sector 5667208 op 0x0:(READ) flags 0x2080700 phys_seg 4 prio class 2
  [   12.960451] I/O error, dev nvme0c0n1, sector 5863296 op 0x0:(READ) flags 0x2080700 phys_seg 16 prio class 2
  [   12.963821] nvme0c0n1: I/O Cmd(0x2) @ LBA 204992, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.926722] nvme0c0n1: I/O Cmd(0x2) @ LBA 9967528, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.969511] I/O error, dev nvme0c0n1, sector 204992 op 0x0:(READ) flags 0x2080700 phys_seg 4 prio class 2
  [   12.972241] I/O error, dev nvme0c0n1, sector 9967528 op 0x0:(READ) flags 0x2080700 phys_seg 1 prio class 2
  [   12.927391] nvme0c0n1: I/O Cmd(0x2) @ LBA 2514864, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.927942] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111536, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928080] I/O error, dev nvme0c0n1, sector 1111536 op 0x0:(READ) flags 0x2083700 phys_seg 1 prio class 2
  [   12.928256] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111560, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928257] I/O error, dev nvme0c0n1, sector 1111560 op 0x0:(READ) flags 0x2083700 phys_seg 1 prio class 2
  [   12.928288] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111584, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928259] I/O error, dev nvme0c0n1, sector 1111584 op 0x0:(READ) flags 0x2083700 phys_seg 4 prio class 2
  [   23.950951] nvme nvme0: failed nvme_keep_alive_end_io error=-10
  [   23.034258] nvme nvme0: failed to bind queue 0 socket -99
  [   23.274221] nvme nvme0: failed to bind queue 0 socket -99
  [   43.514653] nvme nvme0: failed to bind queue 0 socket -99
  [   53.754266] nvme nvme0: failed to bind queue 0 socket -99
  [   63.994488] nvme nvme0: failed to bind queue 0 socket -99
  
  After investigation, this happens because of the `set-name` directive in
  the netplan configuration.
  
  $ grep set-name /etc/netplan/00-installer-config.yaml
         set-name enp1s0
  
  which is coming from the /etc/netplan/50-cloud-init.yaml config in the
  live installer since 25.10.
  
  Indeed, this directive seems to cause netplan to attempt to rename
- `nbft0` to `enp1s0` and causing the network loss that we need to avoid
- with NVMe/TCP.
+ `nbft0` to `enp1s0` and causing the network loss that we must avoid with
+ NVMe/TCP.
  
  We have at least two options:
   * dropping the set-name directive (recommended)
   * force dracut to use the expected name (i.e., enp1s0) instead of `nbft0`.

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to curtin.
https://bugs.launchpad.net/bugs/2127072

Title:
  NVMe/TCP on 25.10 fails to boot

Status in curtin:
  New
Status in subiquity:
  New

Bug description:
  On 25.10, installing using NVMe/TCP results in a system that fails to
  boot.

  After transitioning from the initramfs to the real rootfs, there is a
  network drop and the system fails to recover. The following errors are
  visible:

  [   12.922501] nvme0c0n1: I/O Cmd(0x2) @ LBA 9945776, 168 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.922575] nvme0c0n1: I/O Cmd(0x2) @ LBA 592372, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.923408] I/O error, dev nvme0c0n1, sector 9945776 op 0x0:(READ) flags 0x2080700 phys_seg 21 prio class 2
  [   12.923936] I/O error, dev nvme0c0n1, sector 592372 op 0x0:(READ) flags 0x2080700 phys_seg 16 prio class 2
  [   12.924652] nvme0c0n1: I/O Cmd(0x2) @ LBA 5667208, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.951841] nvme0c0n1: I/O Cmd(0x2) @ LBA 5863296, 128 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.955095] I/O error, dev nvme0c0n1, sector 5667208 op 0x0:(READ) flags 0x2080700 phys_seg 4 prio class 2
  [   12.960451] I/O error, dev nvme0c0n1, sector 5863296 op 0x0:(READ) flags 0x2080700 phys_seg 16 prio class 2
  [   12.963821] nvme0c0n1: I/O Cmd(0x2) @ LBA 204992, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.926722] nvme0c0n1: I/O Cmd(0x2) @ LBA 9967528, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.969511] I/O error, dev nvme0c0n1, sector 204992 op 0x0:(READ) flags 0x2080700 phys_seg 4 prio class 2
  [   12.972241] I/O error, dev nvme0c0n1, sector 9967528 op 0x0:(READ) flags 0x2080700 phys_seg 1 prio class 2
  [   12.927391] nvme0c0n1: I/O Cmd(0x2) @ LBA 2514864, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.927942] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111536, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928080] I/O error, dev nvme0c0n1, sector 1111536 op 0x0:(READ) flags 0x2083700 phys_seg 1 prio class 2
  [   12.928256] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111560, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928257] I/O error, dev nvme0c0n1, sector 1111560 op 0x0:(READ) flags 0x2083700 phys_seg 1 prio class 2
  [   12.928288] nvme0c0n1: I/O Cmd(0x2) @ LBA 1111584, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
  [   12.928259] I/O error, dev nvme0c0n1, sector 1111584 op 0x0:(READ) flags 0x2083700 phys_seg 4 prio class 2
  [   23.950951] nvme nvme0: failed nvme_keep_alive_end_io error=-10
  [   23.034258] nvme nvme0: failed to bind queue 0 socket -99
  [   23.274221] nvme nvme0: failed to bind queue 0 socket -99
  [   43.514653] nvme nvme0: failed to bind queue 0 socket -99
  [   53.754266] nvme nvme0: failed to bind queue 0 socket -99
  [   63.994488] nvme nvme0: failed to bind queue 0 socket -99
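
  A note for reading the log above: the negative number in the
  `failed to bind queue` lines is a Linux errno, and 99 is EADDRNOTAVAIL,
  which is consistent with the interface having lost its address. A minimal
  Python sketch for decoding such lines, assuming the message format shown
  above; the helper name and the hardcoded table are illustrative only and
  not part of any existing tool.

```python
import re

# Linux errno values relevant to this log, hardcoded so the sketch does not
# depend on the platform it runs on (assumption: generic Linux numbering).
LINUX_ERRNO = {99: "EADDRNOTAVAIL (Cannot assign requested address)"}

# Matches kernel lines such as:
#   nvme nvme0: failed to bind queue 0 socket -99
BIND_RE = re.compile(r"nvme ([^:\s]+): failed to bind queue (\d+) socket (-?\d+)")


def explain_bind_failure(line):
    """Return a readable summary of a bind failure line, or None if no match."""
    m = BIND_RE.search(line)
    if m is None:
        return None
    ctrl, queue, err = m.group(1), int(m.group(2)), int(m.group(3))
    # The kernel prints the negated errno, so flip the sign before lookup.
    name = LINUX_ERRNO.get(-err, "unknown errno")
    return f"{ctrl} queue {queue}: {name}"


if __name__ == "__main__":
    print(explain_bind_failure(
        "[   23.034258] nvme nvme0: failed to bind queue 0 socket -99"))
```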

  After investigation, this happens because of the `set-name` directive
  in the netplan configuration.

  $ grep set-name /etc/netplan/00-installer-config.yaml
         set-name enp1s0

  which is coming from the /etc/netplan/50-cloud-init.yaml config in the
  live installer since 25.10.

  Indeed, this directive seems to cause netplan to attempt to rename
  `nbft0` to `enp1s0`, which causes the network loss that we must avoid
  with NVMe/TCP.

  We have at least two options:
   * drop the set-name directive (recommended)
   * force dracut to use the expected name (i.e., enp1s0) instead of `nbft0`.
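
  The first option can be sketched as follows. This is only an
  illustration in Python, assuming the netplan configuration has already
  been parsed into nested dicts and lists; the function name and the sample
  configuration (including the MAC address) are hypothetical, and this is
  not the actual subiquity or curtin change.

```python
def drop_set_name(config):
    """Return a copy of a parsed netplan config without any `set-name` keys."""
    if isinstance(config, dict):
        return {
            key: drop_set_name(value)
            for key, value in config.items()
            if key != "set-name"
        }
    if isinstance(config, list):
        return [drop_set_name(item) for item in config]
    # Scalars (strings, numbers, booleans) are returned unchanged.
    return config


if __name__ == "__main__":
    # Hypothetical config resembling what the installer might write.
    sample = {
        "network": {
            "version": 2,
            "ethernets": {
                "enp1s0": {
                    "match": {"macaddress": "52:54:00:00:00:01"},
                    "set-name": "enp1s0",
                    "dhcp4": True,
                }
            },
        }
    }
    cleaned = drop_set_name(sample)
    print(cleaned["network"]["ethernets"]["enp1s0"])
```

  The sketch leaves the `match` stanza in place, so the configuration
  still applies to the same device without asking netplan to rename it.
  Whether `match` should also be adjusted is a separate question not
  covered here.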

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/2127072/+subscriptions



