[Bug 1856871] Re: i/o error if next unused loop device is queried

Thu Dec 19 17:30:53 UTC 2019

The problem might then be with the "/dev/loop-control" device node, which
dynamically finds or allocates a free device, and also adds and removes
loop devices on the running system.

# drivers/block/loop.c

static void loop_remove(struct loop_device *lo)
{
        del_gendisk(lo->lo_disk);
        blk_cleanup_queue(lo->lo_queue);
        blk_mq_free_tag_set(&lo->tag_set);
        put_disk(lo->lo_disk);
        kfree(lo);
}

        case LOOP_CTL_REMOVE:
                ret = loop_lookup(&lo, parm);
                if (ret < 0)
                        break;
                if (lo->lo_state != Lo_unbound) {
                        ret = -EBUSY;
                        break;
                }
                if (atomic_read(&lo->lo_refcnt) > 0) {
                        ret = -EBUSY;
                        break;
                }
                lo->lo_disk->private_data = NULL;
                idr_remove(&loop_index_idr, lo->lo_number);
                loop_remove(lo);
                break;

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1856871

Title:
  i/o error if next unused loop device is queried

Status in linux package in Ubuntu:
  Incomplete
Status in snapd package in Ubuntu:
  Invalid
Status in systemd package in Ubuntu:
  New
Status in udev package in Ubuntu:
  New

Bug description:
  This is reproducible in Bionic and later.

  Here's an example running 'focal':

  $ lsb_release -cs
  focal

  $ uname -r
  5.3.0-24-generic

  The error is:
  blk_update_request: I/O error, dev loop2, sector 0

  and on a more recent kernel:

  kernel: [18135.185709] blk_update_request: I/O error, dev loop18,
  sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0

  How to trigger it:
  $ sosreport -o block

  or, more precisely, the command inside the block plugin that triggers it:
  $ parted -s $(losetup -f) unit s print

  https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52

  But if I run it on the unused loop device after that one, in this case
  /dev/loop3 (which is also unused), there are no errors.

  While I agree that sosreport shouldn't query unused loop devices,
  there is definitely something going on with the next unused loop
  device.
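
  To confirm which device logged the error after running the parted
  command above, the kernel log can be filtered. This is only a minimal
  sketch: the function name loop_io_errors is made up for illustration,
  and the pattern assumes nothing beyond the message format quoted in
  this report.

```shell
# loop_io_errors: read kernel log text on stdin and print each loop device
# named in a "blk_update_request: I/O error" message, once per device.
# Hypothetical helper; it matches only the message format quoted in this bug.
loop_io_errors() {
    sed -n 's|.*blk_update_request: I/O error, dev \(loop[0-9][0-9]*\),.*|\1|p' | sort -u
}
```

  Usage would be something like "dmesg | loop_io_errors" right after the
  parted call (reading dmesg may require root, depending on the system).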

  What differentiates loop2 from loop3 and any other unused ones?

  Three things I have noticed so far:
  * loop2 is the next unused loop device (losetup -f)
  * A reboot is needed if some loop modification (snap install, mount loop, ...) has been made at runtime
  * loop2 (or whatever the next unused one is) has some stats, as opposed to the other unused loop devices. The stats already exist right after system boot for the next unused loop device.

  /sys/block/loop2/stat
  ::::::::::::::
  2 0 10 0 1 0 0 0 0 0 0

  2  = number of read I/Os processed
  10 = number of sectors read
  1  = number of write I/Os processed

  Explanation of each column:
  https://www.kernel.org/doc/html/latest/block/stat.html

  while /dev/loop3 has none:

  /sys/block/loop3/stat
  ::::::::::::::
  0 0 0 0 0 0 0 0 0 0 0
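
  The comparison of the two stat lines above can be scripted. A rough
  sketch, assuming the field order given in the kernel stat
  documentation linked above (read I/Os, read merges, read sectors, read
  ticks, write I/Os, write merges, write sectors, ...); the function
  names decode_stat and stat_has_io are hypothetical.

```shell
# decode_stat LINE: label the read/write counters of a stat line.
# Field positions follow the kernel block/stat documentation.
decode_stat() {
    # Word-split the line into positional parameters $1..$N.
    set -- $1
    printf 'read_ios=%s read_sectors=%s write_ios=%s write_sectors=%s\n' \
        "$1" "$3" "$5" "$7"
}

# stat_has_io LINE: succeed if any counter in the line is non-zero.
stat_has_io() {
    for field in $1; do
        [ "$field" -ne 0 ] && return 0
    done
    return 1
}
```

  For example, decode_stat "$(cat /sys/block/loop2/stat)" labels the
  counters of loop2, and stat_has_io tells a touched device from an
  untouched one.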

  Which tells me that something during the boot process most likely
  acquired (on purpose or not) the next unused loop device and possibly
  didn't release it properly.
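
  One way to look for such a device right after boot is to list every
  loop device that has no backing file but already shows I/O counters.
  This is only a sketch under one assumption: that a bound loop device
  exposes loop/backing_file in sysfs and an unbound one does not. The
  function name list_unbound_with_io and its optional directory argument
  are hypothetical.

```shell
# list_unbound_with_io [SYSFS_BLOCK_DIR]
# Print "<name>: <stat fields>" for each loop device that looks unbound
# but has at least one non-zero counter in its stat file.
list_unbound_with_io() {
    sys_block=${1:-/sys/block}
    for dev in "$sys_block"/loop*; do
        [ -r "$dev/stat" ] || continue
        # Assumption: only bound devices expose loop/backing_file.
        [ -e "$dev/loop/backing_file" ] && continue
        # Any digit 1-9 in the stat line means some counter is non-zero.
        if grep -q '[1-9]' "$dev/stat"; then
            # Collapse the padded columns into single-space-separated fields.
            set -- $(cat "$dev/stat")
            printf '%s: %s\n' "${dev##*/}" "$*"
        fi
    done
}
```

  Run without arguments it scans /sys/block; on the system described here
  it would be expected to print only the device that losetup -f returns.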

  If loop2 is generating errors, and I install a snap, the snap squashfs
  will take loop2, making loop3 the next unused loop device.

  If I query loop3 with 'parted' right after, no errors.

  If I reboot and query loop3 again, I now get an error.

  To trigger the errors, the query needs to happen after a reboot, and it
  only impacts the first unused loop device available (losetup -f).

  This was tested with focal/systemd, which is very close to the latest
  upstream code.

  This has been tested with the latest v5.5 mainline kernel as well.

  For now, I don't think it's a kernel problem; I'm thinking more of a
  userspace misbehaviour when dealing with loop devices (or block
  devices) at boot.
