[Bug 1987663] Re: cinder-volume: "Failed to re-export volume, setting to ERROR" with "tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected" on service startup

Mauricio Faria de Oliveira 1987663 at bugs.launchpad.net
Tue Apr 30 18:09:23 UTC 2024


Verification done on {mantic,jammy,focal}-proposed.

The detailed verification steps as per comment #3
are attached. This is the summary with key steps.

With the packages from -proposed, when (re)starting {cinder-volume,tgt}.service, 
the cinder-volume service is started _after_ the tgt service, and
the volumes remain with 'in-use' status (no longer 'error' status).

Reproducer (45-second startup delay in tgt.service)
---

	FILE=/etc/systemd/system/tgt.service.d/start-delay.conf
	mkdir -p $(dirname $FILE)
	cat <<EOF > $FILE
	[Service]
	ExecStartPre=$(which sleep) 45
	EOF
	systemctl daemon-reload

	# systemctl show tgt.service | grep ^ExecStartPre=
	ExecStartPre={ path=/usr/bin/sleep ; argv[]=/usr/bin/sleep 45 ; ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0 }

        # date; systemctl restart cinder-volume.service tgt.service;
date

Before / packages from -updates:
---

... Notice that cinder-volume.service is Started _BEFORE_ tgt.service
... And the 'tgtadm' error _IS_ seen in the logs.
... And volumes are listed with 'error' Status.

mantic:

	root at cinder-mantic:~# dpkg -l | grep cinder-volume
	ii  cinder-volume                   2:23.0.0-0ubuntu1.1                     all          Cinder storage service - Volume server

	root at cinder-mantic:~# date; systemctl restart cinder-volume.service tgt.service; date
	Tue Apr 30 15:54:43 UTC 2024
	Tue Apr 30 15:55:29 UTC 2024

	root at cinder-mantic:~# journalctl -b -u cinder-volume.service -u tgt.service | grep Start | tail -3
	Apr 30 15:54:43 cinder-mantic systemd[1]: Started cinder-volume.service - OpenStack Cinder Volume.
	Apr 30 15:54:44 cinder-mantic systemd[1]: Starting tgt.service - (i)SCSI target daemon...
	Apr 30 15:55:29 cinder-mantic systemd[1]: Started tgt.service - (i)SCSI target daemon.

	root at cinder-mantic:~# grep 'ERROR cinder.volume.manager' /var/log/cinder/cinder-volume.log | grep tgtadm
	2024-04-30 15:54:54.603 13692 ERROR cinder.volume.manager     (out, err) = cinder.privsep.targets.tgt.tgtadm_show()
	2024-04-30 15:54:54.603 13692 ERROR cinder.volume.manager Command: tgtadm --lld iscsi --op show --mode target
	2024-04-30 15:54:54.603 13692 ERROR cinder.volume.manager Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'

	root at cinder-mantic:~# cinder list
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| ID                                   | Status | Name        | Size | Consumes Quota | Volume Type | Bootable | Attached to |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| 45b39ae0-2552-4703-a664-ae8de77a23a1 | error  | test-volume | 1    | True           | __DEFAULT__ | false    |             |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+


jammy:

	root at cinder-jammy:~# dpkg -l | grep cinder-volume
	ii  cinder-volume                   2:20.3.1-0ubuntu1.1                     all          Cinder storage service - Volume server
	
	root at cinder-jammy:~# date; systemctl restart cinder-volume.service tgt.service; date
	Tue Apr 30 15:54:45 UTC 2024
	Tue Apr 30 15:55:30 UTC 2024

	root at cinder-jammy:~# journalctl -b -u cinder-volume.service -u tgt.service | grep Start | tail -3
	Apr 30 15:54:45 cinder-jammy systemd[1]: Started OpenStack Cinder Volume.
	Apr 30 15:54:45 cinder-jammy systemd[1]: Starting (i)SCSI target daemon...
	Apr 30 15:55:30 cinder-jammy systemd[1]: Started (i)SCSI target daemon.

	root at cinder-jammy:~# grep 'ERROR cinder.volume.manager' /var/log/cinder/cinder-volume.log | grep tgtadm
	2024-04-30 15:54:55.918 14576 ERROR cinder.volume.manager     (out, err) = cinder.privsep.targets.tgt.tgtadm_show()
	2024-04-30 15:54:55.918 14576 ERROR cinder.volume.manager Command: tgtadm --lld iscsi --op show --mode target
	2024-04-30 15:54:55.918 14576 ERROR cinder.volume.manager Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'

	root at cinder-jammy:~# cinder list
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| ID                                   | Status | Name        | Size | Consumes Quota | Volume Type | Bootable | Attached to |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| 88024eca-203c-4f8f-9ff0-4de704041a2f | error  | test-volume | 1    | True           | __DEFAULT__ | false    |             |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+

focal:

	root at cinder-focal:~# dpkg -l | grep cinder-volume
	ii  cinder-volume                  2:16.4.2-0ubuntu2.4               all          Cinder storage service - Volume server

	root at cinder-focal:~# date; systemctl restart cinder-volume.service tgt.service; date
	Tue Apr 30 15:54:46 UTC 2024
	Tue Apr 30 15:55:31 UTC 2024

	root at cinder-focal:~# journalctl -b -u cinder-volume.service -u tgt.service | grep Start | tail -3
	Apr 30 15:54:46 cinder-focal systemd[1]: Started OpenStack Cinder Volume.
	Apr 30 15:54:46 cinder-focal systemd[1]: Starting (i)SCSI target daemon...
	Apr 30 15:55:31 cinder-focal systemd[1]: Started (i)SCSI target daemon.

	root at cinder-focal:~# grep 'ERROR cinder.volume.manager' /var/log/cinder/cinder-volume.log | grep tgtadm
	2024-04-30 15:54:55.830 15834 ERROR cinder.volume.manager     (out, err) = cinder.privsep.targets.tgt.tgtadm_show()
	2024-04-30 15:54:55.830 15834 ERROR cinder.volume.manager Command: tgtadm --lld iscsi --op show --mode target
	2024-04-30 15:54:55.830 15834 ERROR cinder.volume.manager Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'

	root at cinder-focal:~# cinder list
	+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
	| ID                                   | Status | Name        | Size | Volume Type | Bootable | Attached to |
	+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
	| 5fa41f17-afaf-4a56-8494-8c9246531523 | error  | test-volume | 1    | __DEFAULT__ | false    |             |
	+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
	
After / packages from -proposed
	
... Notice that cinder-volume.service is Started *AFTER* tgt.service now

... And now the 'tgtadm' error is *NOT* seen in the logs.

... And the volumes are listed with Status 'in-use' (no longer 'error')

mantic:

	root at cinder-mantic:~# apt-cache policy cinder-volume
	cinder-volume:
	  Installed: 2:23.0.0-0ubuntu1.2
	  Candidate: 2:23.0.0-0ubuntu1.2
	  Version table:
	 *** 2:23.0.0-0ubuntu1.2 500
		400 http://archive.ubuntu.com/ubuntu mantic-proposed/main amd64 Packages
	...

	root at cinder-mantic:~# dpkg -l | grep cinder-volume
	ii  cinder-volume                   2:23.0.0-0ubuntu1.2                     all          Cinder storage service - Volume server

	root at cinder-mantic:~# date; systemctl restart cinder-volume.service tgt.service; date
	Tue Apr 30 16:34:03 UTC 2024
	Tue Apr 30 16:34:49 UTC 2024

	root at cinder-mantic:~# journalctl -b -u cinder-volume.service -u tgt.service | grep Start | tail -3
	Apr 30 16:34:03 cinder-mantic systemd[1]: Starting tgt.service - (i)SCSI target daemon...
	Apr 30 16:34:49 cinder-mantic systemd[1]: Started tgt.service - (i)SCSI target daemon.
	Apr 30 16:34:49 cinder-mantic systemd[1]: Started cinder-volume.service - OpenStack Cinder Volume.

	(error timestamps are from the previous/non-proposed run)
	root at cinder-mantic:~# grep 'ERROR cinder.volume.manager' /var/log/cinder/cinder-volume.log | grep tgtadm
	2024-04-30 15:54:54.603 13692 ERROR cinder.volume.manager     (out, err) = cinder.privsep.targets.tgt.tgtadm_show()
	2024-04-30 15:54:54.603 13692 ERROR cinder.volume.manager Command: tgtadm --lld iscsi --op show --mode target
	2024-04-30 15:54:54.603 13692 ERROR cinder.volume.manager Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'

	root at cinder-mantic:~# cinder list
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| ID                                   | Status | Name        | Size | Consumes Quota | Volume Type | Bootable | Attached to |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| 45b39ae0-2552-4703-a664-ae8de77a23a1 | in-use | test-volume | 1    | True           | __DEFAULT__ | false    |             |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+

jammy:

	root at cinder-jammy:~# apt-cache policy cinder-volume
	cinder-volume:
	  Installed: 2:20.3.1-0ubuntu1.2
	  Candidate: 2:20.3.1-0ubuntu1.2
	  Version table:
	 *** 2:20.3.1-0ubuntu1.2 400
		400 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
	...

	root at cinder-jammy:~# dpkg -l | grep cinder-volume	
	ii  cinder-volume                   2:20.3.1-0ubuntu1.2                     all          Cinder storage service - Volume server

	root at cinder-jammy:~# date; systemctl restart cinder-volume.service tgt.service; date
	Tue Apr 30 16:34:04 UTC 2024
	Tue Apr 30 16:34:50 UTC 2024

	root at cinder-jammy:~# journalctl -b -u cinder-volume.service -u tgt.service | grep Start | tail -3
	Apr 30 16:34:04 cinder-jammy systemd[1]: Starting (i)SCSI target daemon...
	Apr 30 16:34:50 cinder-jammy systemd[1]: Started (i)SCSI target daemon.
	Apr 30 16:34:50 cinder-jammy systemd[1]: Started OpenStack Cinder Volume.

	(error timestamps are from the previous/non-proposed run)
	root at cinder-jammy:~# grep 'ERROR cinder.volume.manager' /var/log/cinder/cinder-volume.log | grep tgtadm
	2024-04-30 15:54:55.918 14576 ERROR cinder.volume.manager     (out, err) = cinder.privsep.targets.tgt.tgtadm_show()
	2024-04-30 15:54:55.918 14576 ERROR cinder.volume.manager Command: tgtadm --lld iscsi --op show --mode target
	2024-04-30 15:54:55.918 14576 ERROR cinder.volume.manager Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'

	root at cinder-jammy:~# cinder list
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| ID                                   | Status | Name        | Size | Consumes Quota | Volume Type | Bootable | Attached to |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+
	| 88024eca-203c-4f8f-9ff0-4de704041a2f | in-use | test-volume | 1    | True           | __DEFAULT__ | false    |             |
	+--------------------------------------+--------+-------------+------+----------------+-------------+----------+-------------+

focal:

	root at cinder-focal:~# apt-cache policy cinder-volume
	cinder-volume:
	  Installed: 2:16.4.2-0ubuntu2.5
	  Candidate: 2:16.4.2-0ubuntu2.5
	  Version table:
	 *** 2:16.4.2-0ubuntu2.5 500
		400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
	...

	root at cinder-focal:~# dpkg -l | grep cinder-volume	
	ii  cinder-volume                  2:16.4.2-0ubuntu2.5               all          Cinder storage service - Volume server

	root at cinder-focal:~# date; systemctl restart cinder-volume.service tgt.service; date
	Tue Apr 30 16:34:05 UTC 2024
	Tue Apr 30 16:34:51 UTC 2024

	root at cinder-focal:~# journalctl -b -u cinder-volume.service -u tgt.service | grep Start | tail -3
	Apr 30 16:34:05 cinder-focal systemd[1]: Starting (i)SCSI target daemon...
	Apr 30 16:34:51 cinder-focal systemd[1]: Started (i)SCSI target daemon.
	Apr 30 16:34:51 cinder-focal systemd[1]: Started OpenStack Cinder Volume.

	(error timestamps are from the previous/non-proposed run)
	root at cinder-focal:~# grep 'ERROR cinder.volume.manager' /var/log/cinder/cinder-volume.log | grep tgtadm
	2024-04-30 15:54:55.830 15834 ERROR cinder.volume.manager     (out, err) = cinder.privsep.targets.tgt.tgtadm_show()
	2024-04-30 15:54:55.830 15834 ERROR cinder.volume.manager Command: tgtadm --lld iscsi --op show --mode target
	2024-04-30 15:54:55.830 15834 ERROR cinder.volume.manager Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'

	root at cinder-focal:~# cinder list
	+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
	| ID                                   | Status | Name        | Size | Volume Type | Bootable | Attached to |
	+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
	| 5fa41f17-afaf-4a56-8494-8c9246531523 | in-use | test-volume | 1    | __DEFAULT__ | false    |             |
	+--------------------------------------+--------+-------------+------+-------------+----------+-------------+

** Attachment added: "2024 04 30 proposed verification mantic jammy focal.txt"
   https://bugs.launchpad.net/cloud-archive/ussuri/+bug/1987663/+attachment/5773339/+files/2024%2004%2030%20proposed%20verification%20mantic%20jammy%20focal.txt

** Tags removed: verification-needed verification-needed-focal verification-needed-jammy verification-needed-mantic
** Tags added: verification-done verification-done-focal verification-done-jammy verification-done-mantic

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to cinder in Ubuntu.
https://bugs.launchpad.net/bugs/1987663

Title:
  cinder-volume: "Failed to re-export volume, setting to ERROR" with
  "tgtadm: failed to send request hdr to tgt daemon, Transport endpoint
  is not connected" on service startup

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive antelope series:
  Triaged
Status in Ubuntu Cloud Archive bobcat series:
  In Progress
Status in Ubuntu Cloud Archive caracal series:
  Fix Released
Status in Ubuntu Cloud Archive ussuri series:
  Triaged
Status in Ubuntu Cloud Archive victoria series:
  Won't Fix
Status in Ubuntu Cloud Archive wallaby series:
  Won't Fix
Status in Ubuntu Cloud Archive xena series:
  Won't Fix
Status in Ubuntu Cloud Archive yoga series:
  In Progress
Status in Ubuntu Cloud Archive zed series:
  Won't Fix
Status in cinder package in Ubuntu:
  Fix Released
Status in cinder source package in Bionic:
  Won't Fix
Status in cinder source package in Focal:
  Fix Committed
Status in cinder source package in Jammy:
  Fix Committed
Status in cinder source package in Lunar:
  Won't Fix
Status in cinder source package in Mantic:
  Fix Committed
Status in cinder source package in Noble:
  Fix Released

Bug description:
  [Impact]

   * The cinder-volume service might fail to re-export volumes
     in-use on startup if tgt.service isn't fully started yet.

   * This affects the 'lvm' driver with 'tgtadm' target helper
     (which runs 'tgtadm' commands that need the service ready).

   * Snippets from /var/log/cinder/cinder-volume.log:

     Failed to re-export volume, setting to ERROR.
     ...
     Command: tgtadm --lld iscsi --op show --mode target
     ...
     Stderr: 'tgtadm: failed to send request hdr to tgt daemon,
     Transport endpoint is not connected\n'

   * This issue is more common in openstack compute nodes
     with networking (ovs/ovn) that takes long to startup,
     which might delay the startup of tgt.service _after_
     cinder-volume.service.

  [Test Steps]

   * Steps to reproduce are detailed in comment #3.
     Summary:

   * Install mysql, rabbitmq-server, keystone, and cinder
     (controller and storage nodes; backup node unneeded).

   * Configure cinder-volume (storage node) for LVM backend
     and tgtadm iSCSI helper (tgt.service).

   * Create a cinder volume, and configure it as 'in-use'.

   * Simulate a start delay on tgt.service with a drop-in.

   * Restart services: cinder-volume.service tgt.service

   * Check sequence of service startup.

   * Check status of the cinder volume:
     'in-use' (expected) or 'error' (bug).

   * Check log file /var/log/cinder/cinder-volume.log for
     'tgtadm: failed to send request hdr to tgt daemon'.

  [Regression Potential]

   * The fix introduces systemd unit 'After=' and 'Wants='
     properties for tgt.service in cinder-volume.service,
     thus might delay the boot process (multi-user.target).

       $ systemctl show cinder-volume.service | grep WantedBy=
       WantedBy=multi-user.target

   * However, the boot process already waits on tgt.service
     anyway, thus the difference (if any) should not be big,
     and would provide more correct behavior.

       $ systemctl show tgt.service | grep WantedBy=
       WantedBy=multi-user.target

   * If tgt.service is not present (tgt package not installed)
     _no errors_ occur, as both 'After=' and 'Wants=' are weak
     ordering/dependency properties (man 5 systemd.unit).

  [Other Info]

   * The fix uses a systemd service drop-in snippet because
     the service unit is generated by openstack-pkg-tools
     (pkgos-gen-systemd-unit) based on the 'init' service,
     and it only emits 'Wants=' for network-online.target.

   * Changing that in openstack-pkg-tools changes behavior
     in stable releases, and only manifest at build time,
     for many openstack packages that have no issues now.

   * We'll continue to pursue the general improvement in
     Debian, so it comes into Ubuntu development release,
     but for the Ubuntu stable releases, this should do.

  [Original Bug Description]

  Real-world example in comment #2.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1987663/+subscriptions




More information about the Ubuntu-openstack-bugs mailing list