[Bug 2073846] Re: [SRU] Fix the session workqueue thread priority setting issue for newer Linux kernels (>=6.x)

Timo Aaltonen 2073846 at bugs.launchpad.net
Fri Nov 15 15:03:18 UTC 2024


Hello Mustafa, or anyone else affected,

Accepted open-iscsi into noble-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/open-iscsi/2.1.9-3ubuntu5.2 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and what testing has
been performed on the package, and change the tag from
verification-needed-noble to verification-done-noble. If it does not fix
the bug for you, please add a comment stating that, and change the tag
to verification-failed-noble. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: open-iscsi (Ubuntu Noble)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-noble

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to open-iscsi in Ubuntu.
https://bugs.launchpad.net/bugs/2073846

Title:
  [SRU] Fix the session workqueue thread priority setting issue for
  newer Linux kernels (>=6.x)

Status in open-iscsi package in Ubuntu:
  Fix Released
Status in open-iscsi source package in Noble:
  Fix Committed
Status in open-iscsi source package in Oracular:
  Fix Released

Bug description:
  [ Impact ]

  The Linux SCSI driver uses `alloc_workqueue()` to create a kernel
  workqueue for session transmit work. On kernels older than 6.x, this
  call causes the kernel to create a dedicated worker thread for the
  workqueue. Userspace open-iscsi (version < 2.1.10) then adjusts that
  thread's nice value for performance reasons when a new iSCSI session
  is initiated. The algorithm is roughly as follows (see
  https://github.com/open-iscsi/open-iscsi/blob/2.1.9/usr/initiator.c#L1390):

  - Check if the driver in use has a write workqueue. If not, abort.
  - Open the /proc directory and iterate over its entries. For each entry:
    - Run stat() on /proc/<n>/stat.
    - Read the contents of the "stat" file, which look like the following:
      898582 (kworker/u512:1-iscsi_q_0) I 2 0 0 0 -1 69238880 0 0 0 0 0 8 0 0 20 0 1 0 52431895 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 1 0 0 17 28 0 0 0 0 0 0 0 0 0 0 0 0 0
    - Locate "(" with strchr(), starting from the beginning. Skip the process if not found.
    - Locate ")" with strchr(), starting from the position of "(". Skip the process if not found.
    - Check whether the string between "(" and ")" contains the pattern "iscsi_q_%d".
    - Check whether the number %d matches the host ID of the session.
    - If it does, take the PID of the current /proc entry and call `setpriority()` on it.
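
  As a rough illustration of the name-matching steps above, here is a
  minimal shell sketch. It is not the C code from initiator.c: the
  function names `comm_from_stat`, `wq_host_no` and `find_wq_pids` are
  made up for this example, and the sketch only prints matching PIDs
  instead of calling `setpriority()`.

  ```shell
  # Print the text between the first "(" and the next ")" of a stat line.
  comm_from_stat() {
      line=$1
      case $line in
          *"("*")"*) ;;            # both delimiters present, in order
          *) return 1 ;;           # otherwise skip this process
      esac
      rest=${line#*"("}            # drop everything up to the first "("
      comm=${rest%%")"*}           # keep what precedes the next ")"
      printf '%s\n' "$comm"
  }

  # Print N if the thread name contains "iscsi_q_N", fail otherwise.
  wq_host_no() {
      name=$1
      case $name in
          *iscsi_q_[0-9]*) ;;
          *) return 1 ;;
      esac
      tail=${name#*iscsi_q_}       # text after the first "iscsi_q_"
      digits=${tail%%[!0-9]*}      # leading run of digits
      [ -n "$digits" ] || return 1
      printf '%s\n' "$digits"
  }

  # Print the PID of every entry under proc_dir (default /proc) whose
  # thread name carries the wanted host number.
  find_wq_pids() {
      want=$1
      proc_dir=${2:-/proc}
      for stat_file in "$proc_dir"/[0-9]*/stat; do
          [ -r "$stat_file" ] || continue
          IFS= read -r line < "$stat_file" || continue
          name=$(comm_from_stat "$line") || continue
          host_no=$(wq_host_no "$name") || continue
          if [ "$host_no" = "$want" ]; then
              pid_dir=${stat_file%/stat}
              printf '%s\n' "${pid_dir##*/}"
          fi
      done
  }

  # Example with the sample line from the list above (trailing fields cut):
  comm_from_stat '898582 (kworker/u512:1-iscsi_q_0) I 2 0'
  # prints: kworker/u512:1-iscsi_q_0
  ```

  Running `find_wq_pids 0` at different moments is one way to see
  whether a thread with a matching name is currently listed under /proc.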

  So the algorithm assumes the following about the kernel workqueue
  thread:

  - It would be present in the /proc list
  - Its name would match the iscsi_q_%d pattern

  Due to the changes in how Linux workqueue threads work in v6.x, the
  priority setting approach won't work for the following reasons:

  - `alloc_workqueue()` no longer creates a dedicated thread for the workqueue. The worker thread is shared between different workqueues.
  - The worker thread is dynamically renamed to the name of the workqueue that is actively running on it.
  - The worker thread disappears from the /proc list when it is inactive.

  On these kernels, the algorithm as-is does the following:

  - If the kernel workqueue thread *by luck* happens to be running the
  iscsi task, the name matches, and the priority is set. But that's not
  what the code intends, since it also raises the priority of every
  other task scheduled onto the same workqueue thread.

  - If not, open-iscsi prints the following log message, and
  proceeds to operate as normal:

  ```
  iscsistart: Could not set session1 priority. READ/WRITE throughout and latency could be affected.
  ```

  Upstream has fixed this issue in
  https://github.com/open-iscsi/open-iscsi/pull/445. The patch changes
  the default nice value for `node.session.xmit_thread_priority` to `0`
  and skips the priority adjustment algorithm altogether when the
  priority is set to zero.

  This SRU proposes to backport this patch to the Ubuntu releases that
  use Linux kernel 6.8 and above by default and ship an open-iscsi
  version older than 2.1.10.

  [ Test Plan ]

  # Launch a test VM:
  $> lxc launch ubuntu:noble --vm iscsi-test-noble

  # Obtain a shell from the VM:
  $> lxc shell iscsi-test-noble

  # Install 'tgt' and 'open-iscsi':
  $> sudo apt -y update && sudo apt -y install tgt open-iscsi

  # Configure 'tgt':

  ## Step 1: Configure a LUN

  Add the following to '/etc/tgt/conf.d/iscsi.conf':

  <target iqn.2020-07.example.com:lun1>
       backing-store /dev/sda
       initiator-address 127.0.0.1
       incominguser iscsi-user password
       outgoinguser iscsi-target secretpass
  </target>

  (change /dev/sda to an existing device's name if it's not present)

  ## Step 2: Restart 'tgt' to make changes effective:
  $> systemctl restart tgt

  ## Step 3: Check if 'tgt' has started serving the LUN:
  $> tgtadm --mode target --op show

  (output should be similar to below)
  Target 1: iqn.2020-07.example.com:lun1
      System information:
          Driver: iscsi
          State: ready
      I_T nexus information:
      LUN information:
          LUN: 0
              Type: controller
              SCSI ID: IET     00010000
              SCSI SN: beaf10
              Size: 0 MB, Block size: 1
              Online: Yes
              Removable media: No
              Prevent removal: No
              Readonly: No
              SWP: No
              Thin-provisioning: No
              Backing store type: null
              Backing store path: None
              Backing store flags:
      Account information:
          iscsi-user
          iscsi-target (outgoing)
      ACL information:
          127.0.0.1

  # Configure 'open-iscsi':

  ## Step 1: Check whether the LUN being served by 'tgt' is
  discoverable:

  $> iscsiadm -m discovery -t st -p 127.0.0.1

  (should output the text below)
  127.0.0.1:3260,1 iqn.2020-07.example.com:lun1

  ## Step 2: Configure open-iscsi to consume the target LUN:

  Add the following line to '/etc/iscsi/initiatorname.iscsi':

  ```
  InitiatorName=iqn.2020-07.example.com:lun1
  ```

  ## Step 3: Modify the following file '/etc/iscsi/nodes/iqn.2020-07.example.com:lun1/127.0.0.1,3260,1/default':
  # (the file must already exist, it should've been automatically created after the discovery)

  Append the following lines to the end of the file, and save:

  ```
  node.session.auth.authmethod = CHAP
  node.session.auth.username = iscsi-user
  node.session.auth.password = password
  node.session.auth.username_in = iscsi-target
  node.session.auth.password_in = secretpass
  node.startup = automatic
  ```

  ## Step 4: Restart open-iscsi to make changes effective:

  $> systemctl restart open-iscsi.service iscsid

  ## Step 5: Check the outcome
  (the service status should indicate that login to 'iqn.2020-07.example.com:lun1' has been successful)

  $> systemctl status open-iscsi.service

  ● open-iscsi.service - Login to default iSCSI targets
       Loaded: loaded (/usr/lib/systemd/system/open-iscsi.service; enabled; preset: enabled)
       Active: active (exited) since Mon 2024-07-22 13:36:15 UTC; 4s ago
         Docs: man:iscsiadm(8)
               man:iscsid(8)
      Process: 3049 ExecStart=/usr/sbin/iscsiadm -m node --loginall=automatic (code=exited, status=0/SUCCESS)
      Process: 3065 ExecStart=/usr/lib/open-iscsi/activate-storage.sh (code=exited, status=0/SUCCESS)
     Main PID: 3065 (code=exited, status=0/SUCCESS)
          CPU: 4ms

  Jul 22 13:36:15 welcomed-bluebird systemd[1]: Starting open-iscsi.service - Login to default iSCSI targets...
  Jul 22 13:36:15 welcomed-bluebird iscsiadm[3049]: Logging in to [iface: default, target: iqn.2020-07.example.com:lun1, portal: 127.0.0.1,3260]
  Jul 22 13:36:15 welcomed-bluebird iscsiadm[3049]: Login to [iface: default, target: iqn.2020-07.example.com:lun1, portal: 127.0.0.1,3260] successful.
  Jul 22 13:36:15 welcomed-bluebird systemd[1]: Finished open-iscsi.service - Login to default iSCSI targets

  # (the command should list an active connection to the
  'iqn.2020-07.example.com:lun1')

  $> iscsiadm -m session -o show

  tcp: [1] 127.0.0.1:3260,1 iqn.2020-07.example.com:lun1 (non-flash)

  # Observe iscsid is complaining about priority:

  $> cat /var/log/syslog | grep "Could not set"

  2024-07-22T13:36:16.874243+00:00 welcomed-bluebird iscsid: Could not set session1 priority. READ/WRITE throughout and latency could be affected.
  2024-07-22T13:38:31.002732+00:00 welcomed-bluebird iscsid: Could not set session1 priority. READ/WRITE throughout and latency could be affected.

  ## VERIFICATION OF THE FIX ##

  # Add the PPA that includes the fix, and update the open-iscsi
  package:

  $> add-apt-repository ppa:mustafakemalgilor/lp-2073846
  $> apt update
  $> apt -y install open-iscsi

  # Edit the '/etc/iscsi/nodes/iqn.2020-07.example.com:lun1/127.0.0.1,3260,1/default' file to set the priority to 0:
  (this is needed because the file was created before the update, so the priority is explicitly set to "-20". Newly created node files should have a "node.session.xmit_thread_priority" value of "0".)

  $> sed -E -i \
     's/^(node.session.xmit_thread_priority[[:blank:]]*=[[:blank:]]*).*/\10/' \
     /etc/iscsi/nodes/iqn.2020-07.example.com\:lun1/127.0.0.1\,3260\,1/default

  # Verify that "node.session.xmit_thread_priority" is indeed set to "0":

  $> grep "node.session.xmit_thread_priority" \
     /etc/iscsi/nodes/iqn.2020-07.example.com\:lun1/127.0.0.1\,3260\,1/default

  (should output the following):
  node.session.xmit_thread_priority = 0

  # Truncate the syslog
  $> truncate -s 0 /var/log/syslog

  # Restart the service

  $> systemctl restart open-iscsi.service

  # Observe that the priority warning has disappeared:

  $> grep "iscsi\(adm\|d\).*:" /var/log/syslog
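
  (As an optional, stricter check, the absence of the warning can be
  tested directly. This is a minimal sketch: the helper name
  `priority_warning_absent` is made up for this example, and it assumes
  the warning text shown earlier in this test plan.)

  ```shell
  # Succeed only if the log file is readable and holds no priority warning.
  priority_warning_absent() {
      log_file=${1:-/var/log/syslog}
      [ -r "$log_file" ] || return 2
      # grep -q exits 0 when a match exists, so invert the result.
      ! grep -q "Could not set session[0-9]* priority" "$log_file"
  }
  ```

  For example: `priority_warning_absent && echo "no priority warning"`.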

  # Remove the node altogether:

  $> rm -rf /etc/iscsi/nodes/iqn.2020-07.example.com

  # Re-discover the node:

  $> iscsiadm -m discovery -t st -p 127.0.0.1

  # Confirm that the priority is set to "0":

  $> grep "node.session.xmit_thread_priority" \
     /etc/iscsi/nodes/iqn.2020-07.example.com\:lun1/127.0.0.1\,3260\,1/default

  (should output the following):
  node.session.xmit_thread_priority = 0

  # Confirm that the /etc/iscsi/iscsid.conf priority is set to "0":

  $> grep "node.session.xmit_thread_priority" /etc/iscsi/iscsid.conf

  (should output the following):
  node.session.xmit_thread_priority = 0

  [ Where problems could occur ]

  The change prevents a priority change that shouldn't happen in the
  first place. That might affect some workloads unknowingly depending on
  it. On the other hand, the nice setting happens intermittently (i.e.
  by luck) so the behavior right now can't be depended on anyway. The
  patch only touches the priority setting code so I wouldn't expect any
  serious breakage.

  [ Other Info ]

  Other releases that are running a 6.x kernel installed through other
  means (e.g. hw-enablement, availability) can change
  `node.session.xmit_thread_priority` from `-20` to `0` in
  `/etc/iscsi/iscsid.conf` as a workaround:

  node.session.xmit_thread_priority = 0

  which is the default priority for the workqueue threads.

  Also, the existing configuration for the nodes under
  `/etc/iscsi/nodes` path will not be altered, so they must be manually
  set to "0".
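
  A minimal sketch of that manual step for all existing node records,
  reusing the `sed` expression from the test plan. The function name
  `set_xmit_priority_zero` is made up for this example, GNU sed is
  assumed (`-E -i`), and it is worth backing up `/etc/iscsi/nodes`
  before running anything like this:

  ```shell
  set_xmit_priority_zero() {
      nodes_dir=${1:-/etc/iscsi/nodes}
      # Rewrite the value in every regular file below the nodes directory;
      # lines without the key are left as they are.
      find "$nodes_dir" -type f -exec sed -E -i \
          's/^(node\.session\.xmit_thread_priority[[:blank:]]*=[[:blank:]]*).*/\10/' {} +
  }
  ```

  Called without an argument it targets `/etc/iscsi/nodes`; passing
  another directory makes it possible to try it on a copy first.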

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/2073846/+subscriptions



