[Bug 1877617] Please test proposed package
Steve Langasek
1877617 at bugs.launchpad.net
Fri May 22 22:25:24 UTC 2020
Hello Ben, or anyone else affected,
Accepted open-iscsi into bionic-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/open-iscsi/2.0.874-5ubuntu2.10 in a
few hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
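As a rough illustration of the step above, here is a minimal sketch of enabling -proposed for one release. The helper name proposed_source_line, the archive URL, the component list, and the file name under /etc/apt/sources.list.d are assumptions for illustration; the wiki page linked above remains the authoritative procedure.

```shell
#!/bin/sh
# Hypothetical helper: print an apt source line for the -proposed pocket
# of the given release. The mirror URL and component list are assumptions
# based on a stock install; adjust them to match your own sources.list.
proposed_source_line() {
    release="$1"
    printf 'deb http://archive.ubuntu.com/ubuntu/ %s-proposed main restricted universe multiverse\n' "$release"
}

# Example usage (left as comments because it modifies the system):
#   proposed_source_line bionic | sudo tee /etc/apt/sources.list.d/bionic-proposed.list
#   sudo apt-get update
#   sudo apt-get install -t bionic-proposed open-iscsi
#   apt-cache policy open-iscsi   # check which version ended up installed
```

Installing with -t limits the upgrade to the named package rather than pulling everything from -proposed, which keeps the test focused on open-iscsi.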
If this package fixes the bug for you, please add a comment to this bug
mentioning the version of the package you tested and what testing you
performed, and change the tag from verification-needed-bionic to
verification-done-bionic. If it does not fix the bug for you, please add
a comment stating that, and change the tag to
verification-failed-bionic. In either case, without details of your
testing we will not be able to proceed.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance for helping!
N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to open-iscsi in Ubuntu.
https://bugs.launchpad.net/bugs/1877617
Title:
Automatic scans cause instability for cloud use cases
Status in open-iscsi package in Ubuntu:
Triaged
Status in open-iscsi source package in Bionic:
Fix Committed
Status in open-iscsi source package in Eoan:
Fix Committed
Status in open-iscsi source package in Focal:
Fix Committed
Status in open-iscsi source package in Groovy:
Triaged
Bug description:
[Impact]
When using iSCSI storage underneath cloud applications such as
OpenStack or Kubernetes, the automatic bus scan on login causes
problems, because it results in SCSI disks being registered in the
kernel that will never get cleaned up, and when those disks are
eventually deleted off the server, I/O errors begin to accumulate,
eventually slowing down the whole SCSI subsystem, spamming the kernel
log, and causing timeouts at higher levels such that users are forced
to reboot the node to get back to a usable state.
[Test Case]
################
# To demonstrate this problem, I create a VM running Ubuntu 20.04.0
# Install both iSCSI initiator and target on this host
sudo apt-get -y install open-iscsi targetcli-fb
# Start the services
sudo systemctl start iscsid.service targetclid.service
# Create a randomly generated target IQN
TARGET_IQN=$(iscsi-iname)
# Get the initiator IQN
INITIATOR_IQN=$(sudo awk -F = '/InitiatorName=/ {print $2}' /etc/iscsi/initiatorname.iscsi)
# Set up an iSCSI target and target portal, and grant access to ourselves
sudo targetcli /iscsi create $TARGET_IQN
sudo targetcli /iscsi/$TARGET_IQN/tpg1/acls create $INITIATOR_IQN
# Create two 1GiB LUNs backed by files, and expose them through the target portal
sudo targetcli /backstores/fileio create lun1 /lun1 1G
sudo targetcli /iscsi/$TARGET_IQN/tpg1/luns create /backstores/fileio/lun1 1
sudo targetcli /backstores/fileio create lun2 /lun2 1G
sudo targetcli /iscsi/$TARGET_IQN/tpg1/luns create /backstores/fileio/lun2 2
# Truncate the kernel log so we can see messages after this point only
sudo dmesg -C
# Register the local iSCSI target with our initiator, and log in
sudo iscsiadm -m node -p 127.0.0.1 -T $TARGET_IQN -o new
sudo iscsiadm -m node -p 127.0.0.1 -T $TARGET_IQN --login
# Get the list of disks from the iSCSI session, and stash it in a variable
eval "DISKS=\$(sudo iscsiadm -m session -P3 | awk '/Attached scsi disk/ {print \$4}')"
# Print the list
echo $DISKS
# Note that there are two disks found already (the two LUNs we created
# above) despite the fact that we only just logged in.
# Now delete a LUN from the target
sudo targetcli /iscsi/$TARGET_IQN/tpg1/luns delete lun2
sudo targetcli /backstores/fileio delete lun2
# Attempt to read each of the disks
for DISK in $DISKS ; do sudo blkid /dev/$DISK || true ; done
# Look at the kernel log
dmesg
# Notice I/O errors related to the disk that the kernel remembers
################
# Now to demonstrate how this problem is fixed, I create a new Ubuntu
# 20.04.0 VM
# Add PPA with modified version of open-iscsi
sudo add-apt-repository -y ppa:bswartz/open-iscsi
sudo apt-get update
# Install both iSCSI initiator and target on this host
sudo apt-get -y install open-iscsi targetcli-fb
# Start the services
sudo systemctl start iscsid.service targetclid.service
# Set the scan option to "manual"
sudo sed -i 's/^\(node.session.scan\).*/\1 = manual/' /etc/iscsi/iscsid.conf
sudo systemctl restart iscsid.service
# Create a randomly generated target IQN
TARGET_IQN=$(iscsi-iname)
# Get the initiator IQN
INITIATOR_IQN=$(sudo awk -F = '/InitiatorName=/ {print $2}' /etc/iscsi/initiatorname.iscsi)
# Set up an iSCSI target and target portal, and grant access to ourselves
sudo targetcli /iscsi create $TARGET_IQN
sudo targetcli /iscsi/$TARGET_IQN/tpg1/acls create $INITIATOR_IQN
# Create two 1GiB LUNs backed by files, and expose them through the target portal
sudo targetcli /backstores/fileio create lun1 /lun1 1G
sudo targetcli /iscsi/$TARGET_IQN/tpg1/luns create /backstores/fileio/lun1 1
sudo targetcli /backstores/fileio create lun2 /lun2 1G
sudo targetcli /iscsi/$TARGET_IQN/tpg1/luns create /backstores/fileio/lun2 2
# Truncate the kernel log so we can see messages after this point only
sudo dmesg -C
# Register the local iSCSI target with our initiator, and log in
sudo iscsiadm -m node -p 127.0.0.1 -T $TARGET_IQN -o new
sudo iscsiadm -m node -p 127.0.0.1 -T $TARGET_IQN --login
# Get the list of disks from the iSCSI session, and stash it in a variable
eval "DISKS=\$(sudo iscsiadm -m session -P3 | awk '/Attached scsi disk/ {print \$4}')"
# Print the list
echo $DISKS
# Note that the list is empty!
# Get the iSCSI host
SCSI_HOST=$(ls /sys/class/iscsi_host)
# Specifically scan the one disk we want
sudo sh -c "echo '0 0 1' > /sys/class/scsi_host/$SCSI_HOST/scan"
# Get the list of disks from the iSCSI session, and stash it in a variable
eval "DISKS=\$(sudo iscsiadm -m session -P3 | awk '/Attached scsi disk/ {print \$4}')"
# Print the list
echo $DISKS
# This time notice there's exactly one disk
# Now delete the other LUN from the target
sudo targetcli /iscsi/$TARGET_IQN/tpg1/luns delete lun2
sudo targetcli /backstores/fileio delete lun2
# Attempt to read each of the disks
for DISK in $DISKS ; do sudo blkid /dev/$DISK || true ; done
# Look at the kernel log
dmesg
# No errors in the log
################
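The manual scan step in the second test case can be wrapped in a small helper. This is only a sketch: scan_lun and the SYSFS_ROOT override are hypothetical names introduced here so the function can be tried against a scratch directory. The "channel target lun" format written to the scan file is the same one used in the test case above.

```shell
#!/bin/sh
# scan_lun HOST CHANNEL TARGET LUN
# Ask the kernel to scan exactly one LUN on a SCSI host by writing
# "CHANNEL TARGET LUN" to that host's sysfs scan file.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
scan_lun() {
    scan_file="${SYSFS_ROOT:-/sys}/class/scsi_host/$1/scan"
    if [ ! -w "$scan_file" ]; then
        echo "scan_lun: cannot write $scan_file" >&2
        return 1
    fi
    printf '%s %s %s\n' "$2" "$3" "$4" > "$scan_file"
}

# Example usage (as root, on a host with an active iSCSI session):
#   scan_lun "$(ls /sys/class/iscsi_host)" 0 0 1
```

Scanning one LUN at a time means the kernel only registers disks that the caller asked for, which is the behavior the cloud use cases described above depend on.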
[Regression Potential]
These changes have been proven safe by 3 years of soak time in the
Red Hat ecosystem, so I don't see much risk to taking them into Ubuntu.
They apply cleanly to the most recent versions of focal, bionic, and
xenial.
The change introduces a new config option in iscsid.conf, but the default
is to do exactly what it used to do. Only users who explicitly change
this option will get altered behavior, and the behavior with the option
set is superior for the above-mentioned cloud use cases.
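To make the opt-in nature of the option concrete, here is a minimal sketch for reading and changing it. The helper names get_scan_mode and set_scan_mode are hypothetical, and the value "auto" for the default behavior is an assumption based on the upstream option; only "manual" appears in the test case above. The sketch relies on GNU sed for in-place editing.

```shell
#!/bin/sh
# get_scan_mode CONF
# Print the current value of node.session.scan, or nothing if it is unset.
get_scan_mode() {
    sed -n 's/^node\.session\.scan[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# set_scan_mode CONF MODE
# Set node.session.scan to MODE, replacing an existing line or appending
# one if the option is absent. "auto" is assumed to be the default value.
set_scan_mode() {
    conf="$1"
    mode="$2"
    case "$mode" in
        auto|manual) ;;
        *) echo "set_scan_mode: unknown mode '$mode'" >&2; return 2 ;;
    esac
    if grep -q '^node\.session\.scan' "$conf"; then
        sed -i "s/^\(node\.session\.scan\).*/\1 = $mode/" "$conf"
    else
        printf 'node.session.scan = %s\n' "$mode" >> "$conf"
    fi
}

# Example usage (as root; restart iscsid afterwards so new logins pick it up):
#   set_scan_mode /etc/iscsi/iscsid.conf manual
#   systemctl restart iscsid.service
```

Sessions that were already logged in are not touched by this; the setting is expected to matter only for node records created and logins made after the change.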
[Other Info]
Red Hat discovered this problem more than 3 years ago and fixed it
upstream.
https://bugzilla.redhat.com/show_bug.cgi?id=1422941
I had hoped that Debian would eventually pick up the version in which
it was fixed, but another LTS has gone by without picking up the newer
upstream version, and this is a critical problem, so I propose
backporting the fixes.
The 2 patches that need porting are:
https://github.com/open-iscsi/open-iscsi/pull/40
https://github.com/open-iscsi/open-iscsi/pull/49
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1877617/+subscriptions