[Bug 1914456] Re: 120 sec kernel timeout is seen during SCSI remove_device.

Guilherme G. Piccoli 1914456 at bugs.launchpad.net
Tue Feb 23 11:52:46 UTC 2021


Hi Jitendra, thanks for your prompt response. I'd like to ask you for
the following data in order to speed up the debugging process:

1) Right before the test, please collect the outputs of: "dmesg", "lspci
-vvv", "lsblk", "ls -l /sys/block", "mount"

2) After the I/O test + SCSI removal, please collect the outputs of
"dmesg", "lsblk" and "mount".
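
The two collection steps above could be scripted roughly as follows. This
is only a sketch: the function name collect_state, the labels "before" and
"after", and the default output directory are made up for illustration;
only the commands themselves come from the request above.

```shell
#!/bin/sh
# Hypothetical helper: save the requested outputs under <outdir>/<label>/.
# Label "before" also collects lspci and /sys/block, as asked in step 1.
collect_state() {
    label=$1
    outdir=${2:-./bug-1914456-data}   # assumed default location
    dest="$outdir/$label"
    mkdir -p "$dest" || return 1

    # Each command's output (and any error text) goes to its own file;
    # a failing command does not abort the rest of the collection.
    dmesg > "$dest/dmesg.txt" 2>&1 || true
    lsblk > "$dest/lsblk.txt" 2>&1 || true
    mount > "$dest/mount.txt" 2>&1 || true
    if [ "$label" = "before" ]; then
        lspci -vvv       > "$dest/lspci-vvv.txt" 2>&1 || true
        ls -l /sys/block > "$dest/sys-block.txt" 2>&1 || true
    fi
}
```

Run "collect_state before" right before the test and "collect_state after"
once the I/O test + SCSI removal is done, preferably as root so that dmesg
and lspci -vvv are not restricted.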

This is an initial data collection; it will help us understand the state
of the system before and after the issue. As for the sosreport, let's
set it aside for a while; I think this data is good enough for now.
Thanks in advance!

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to open-iscsi in Ubuntu.
https://bugs.launchpad.net/bugs/1914456

Title:
  120 sec kernel timeout is seen during SCSI remove_device.

Status in open-iscsi package in Ubuntu:
  Confirmed

Bug description:
  Ubuntu Release
  --------------
  #lsb_release -rd
  Description:    Ubuntu 18.04.5 LTS
  Release:        18.04

  #cat /proc/version_signature
  Ubuntu 4.15.0-122.124-generic 4.15.18

  Package version
  ---------------
  #apt-cache policy open-iscsi
  open-iscsi:
    Installed: 2.0.874-5ubuntu2.10
    Candidate: 2.0.874-5ubuntu2.10

  Problem statement and details
  -----------------------------
  During automated testing of a vendor's iSCSI target using the open-iscsi initiator, the initiator host reported a 120-second kernel hang. The automation test was performing a SCSI remove_device operation when the issue was observed.
  The automation test performs the following sequence of operations:
  1. Establish iSCSI session
  2. Create a bunch of iSCSI LUNs.
  3. Discover LUNs through sysfs scan
  4. Format LUNs
  5. Perform IO
  6. Remove LUN
  7. Delete LUN
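
For reference, steps 3 and 6 probably map to sysfs writes along the lines
of the sketch below. The report does not show the exact commands the
automation uses, so this is an assumption; it is at least consistent with
the sdev_store_delete frame in the stack trace, which is the handler for
the per-device "delete" attribute. The function names, the host3/sdb
examples and the SYSFS_ROOT override are hypothetical.

```shell
#!/bin/sh
# SYSFS_ROOT is a hypothetical override so the helpers can be dry-run
# against a scratch directory; on a real host it stays /sys.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Step 3: ask one SCSI host (e.g. host3) to rescan all channels,
# targets and LUNs by writing the "- - -" wildcard to its scan file.
scan_host() {
    scan_file="$SYSFS_ROOT/class/scsi_host/$1/scan"
    [ -e "$scan_file" ] || { echo "no such host: $1" >&2; return 1; }
    printf '%s\n' '- - -' > "$scan_file"
}

# Step 6: remove one SCSI disk (e.g. sdb) via its sysfs delete attribute.
remove_lun() {
    delete_file="$SYSFS_ROOT/block/$1/device/delete"
    [ -e "$delete_file" ] || { echo "no such device: $1" >&2; return 1; }
    printf '1\n' > "$delete_file"
}
```

On the affected host, "remove_lun sdb" run as root would be the kind of
write that blocks in scsi_remove_device when the hang occurs.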

  Observations from initiator host:
  1. Already discovered iSCSI LUNs went to offline state.
  2. New LUNs are not being discovered.
  3. NOP-in/NOP-out PDU exchange works fine from the iSCSI session.
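
The offline state in observation 1 can be checked from sysfs. A minimal
sketch, assuming the LUNs show up as block devices with a device/state
attribute; the function name and the optional root argument are
hypothetical and only there for illustration:

```shell
#!/bin/sh
# Print "<disk> <state>" for each block device that exposes a SCSI
# state attribute. The optional argument overrides the sysfs root
# (hypothetical, for dry runs); by default /sys is used.
list_sdev_states() {
    root=${1:-/sys}
    for state_file in "$root"/block/*/device/state; do
        [ -r "$state_file" ] || continue   # no match or not readable
        disk=${state_file#"$root"/block/}  # strip the leading path
        disk=${disk%%/*}                   # keep only the disk name
        printf '%s %s\n' "$disk" "$(cat "$state_file")"
    done
}
```

Calling "list_sdev_states" before and after the removal shows which disks
changed state.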

  Note: Single iSCSI session is present between initiator and target.

  Expected behavior
  -----------------
  SCSI remove_device should succeed and automation test should continue.
  The issue is observed even with the following commit, which contains a fix for a similar issue.
  https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/?id=27dfa4073289ee5737d45b4cfa40b11f5cdeeaa5

  Stack trace
  -----------
  [91832.800739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [91832.809982] Call Trace:
  [91832.809994]  __schedule+0x24e/0x880
  [91832.810002]  ? __enqueue_entity+0x5c/0x60
  [91832.810006]  ? select_task_rq_fair+0x642/0xab0
  [91832.810008]  schedule+0x2c/0x80
  [91832.810010]  schedule_preempt_disabled+0xe/0x10
  [91832.810012]  __mutex_lock.isra.5+0x276/0x4e0
  [91832.810017]  ? kernfs_name_hash+0x17/0x80
  [91832.810020]  __mutex_lock_slowpath+0x13/0x20
  [91832.810021]  ? __mutex_lock_slowpath+0x13/0x20
  [91832.810023]  mutex_lock+0x2f/0x40
  [91832.810030]  scsi_remove_device+0x1e/0x40
  [91832.810033]  sdev_store_delete+0x55/0xa0
  [91832.810036]  dev_attr_store+0x1b/0x30
  [91832.810039]  sysfs_kf_write+0x3c/0x50
  [91832.810040]  kernfs_fop_write+0x125/0x1a0
  [91832.810046]  __vfs_write+0x1b/0x40
  [91832.810048]  vfs_write+0xb1/0x1a0
  [91832.810050]  SyS_write+0x5c/0xe0
  [91832.810055]  do_syscall_64+0x73/0x130
  [91832.810058]  entry_SYSCALL_64_after_hwframe+0x41/0xa6

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1914456/+subscriptions


