[Bug 2098861] Re: systemd 255.4-1ubuntu8.5 crashing on arm64 (Azure v6 VM sizes)

Wed Feb 19 16:06:18 UTC 2025

I reproduced a crash without a preceding reload, but it didn't generate
a crash dump (or the dump wasn't flushed to disk):

[   16.538616] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Starting lxd-installer.socket - Helper to install lxd snap on demand...
[   16.538640] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Starting snapd.socket - Socket activation for snappy daemon...
[   16.538659] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Listening on ssh.socket - OpenBSD Secure Shell server socket.
[   16.538679] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Listening on uuidd.socket - UUID daemon activation socket.
[   16.538697] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: open-iscsi.service - Login to default iSCSI targets was skipped because no trigger condition checks were met.
[   16.538714] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Finished blk-availability.service - Availability of block devices.
[   16.538739] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: systemd-pcrphase-sysinit.service - TPM2 PCR Barrier (Initialization) was skipped because of an unmet condition check (ConditionSecurity=measured-uki).
[   16.538758] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Listening on lxd-installer.socket - Helper to install lxd snap on demand.
[   16.538776] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Caught <SEGV>, from unknown sender process.
[   16.538837] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Caught <SEGV>, dumped core as pid 873.
[   16.538870] temp-vm-cpatterson-eastus2-t02191036311 systemd[1]: Freezing execution.
[   48.943456] temp-vm-cpatterson-eastus2-t02191036311 kernel: hv_balloon: Max. dynamic memory size: 8192 MB
[   92.985449] temp-vm-cpatterson-eastus2-t02191036311 systemd-journald[135]: Failed to send WATCHDOG=1 notification message: Connection refused
[  212.986308] temp-vm-cpatterson-eastus2-t02191036311 systemd-journald[135]: Failed to send WATCHDOG=1 notification message: Transport endpoint is not connected

https://bugs.launchpad.net/bugs/2098861

Title:
  systemd 255.4-1ubuntu8.5 crashing on arm64 (Azure v6 VM sizes)

Status in systemd package in Ubuntu:
  New

Bug description:
  Systemd appears to crash often on Azure v6 arm64 VM sizes during
  initial (provisioning) boot.  I caught the crash on my first attempt
  to repro on Standard_D2pds_v6 with
  canonical:ubuntu-24_04-lts:server-arm64:latest.

  [   14.082815] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Reloading requested from client PID 1208 ('systemctl') (unit walinuxagent.service)...
  [   14.082982] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Reloading...
  [   14.096465] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Caught <SEGV> from PID -535718928.
  ...
  [  108.535662] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Caught <SEGV>, dumped core as pid 1227.
  [  108.536032] temp-vm-cpatterson-eastus2-t02190915031 systemd[1]: Freezing execution.
  ...
  [  184.052078] temp-vm-cpatterson-eastus2-t02190915031 systemd-journald[136]: Failed to send WATCHDOG=1 notification message: Connection refused

  (gdb) bt
  #0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:39
  #1  0x0000e14d999df8e4 in missing_rt_tgsigqueueinfo (info=0xffffddda1ca0, sig=11, tid=<optimized out>, tgid=1227) at ../src/basic/missing_syscall.h:384
  #2  propagate_signal (sig=sig@entry=11, siginfo=siginfo@entry=0xffffddda1ca0) at ../src/basic/signal-util.c:301
  #3  0x0000bb509f70e9bc [PAC] in crash (sig=11, siginfo=0xffffddda1ca0, context=<optimized out>) at ../src/core/crash-handler.c:94
  #4  <signal handler called>
  #5  0x0000e14d99d2944c in unit_active_state (u=u@entry=0xbb50c8aeeb10) at ../src/core/unit.c:941
  #6  0x0000e14d99d2d454 in unit_may_gc (u=0xbb50c8aeeb10) at ../src/core/unit.c:465
  #7  0x0000e14d99d2e7d8 [PAC] in unit_add_to_gc_queue (u=u@entry=0xbb50c8aeeb10) at ../src/core/unit.c:535
  #8  0x0000e14d99d2efe0 [PAC] in unit_clear_dependencies (u=0xbb50c8b0a7c0) at ../src/core/unit.c:656
  #9  unit_free (u=0xbb50c8b0a7c0) at ../src/core/unit.c:797
  #10 0x0000e14d99ce228c [PAC] in manager_clear_jobs_and_units.part.0.lto_priv.0 (m=m@entry=0xbb50c8acf790) at ../src/core/manager.c:1594
  #11 0x0000e14d99ce2624 [PAC] in manager_clear_jobs_and_units (m=0xbb50c8acf790) at ../src/core/manager.c:1591
  #12 manager_reload (m=m@entry=0xbb50c8acf790) at ../src/core/manager.c:3568
  #13 0x0000bb509f709338 [PAC] in invoke_main_loop (ret_error_message=0xffffddda3178, ret_switch_root_init=<synthetic pointer>, ret_switch_root_dir=<synthetic pointer>, ret_fds=0xffffddda3168, ret_retval=<synthetic pointer>,
      saved_rlimit_memlock=0xffffddda31a0, saved_rlimit_nofile=0xffffddda31b0, m=0xbb50c8acf790) at ../src/core/main.c:1982
  #14 main (argc=1, argv=0xffffddda3668) at ../src/core/main.c:3106

  In this particular case the reload was requested by WALinuxAgent, but
  I have evidence of other crashes when cloud-init requests the reload:

  2025-01-02T10:18:03.007621+00:00 localhost systemd[1]: Reloading requested from client PID 871 ('systemctl') (unit cloud-init-local.service)...
  2025-01-02T10:18:03.007662+00:00 localhost systemd[1]: Reloading...
  2025-01-02T10:18:03.010753+00:00 localhost systemd[1]: Caught <SEGV>, from unknown sender process.

  In some cases I don't even see a reload before the crash; I will see
  if I can get a crash dump for those cases.
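
When sorting collected boot logs into the two cases described above (crash right after a reload vs. crash with no reload in sight), a small helper along these lines may save some time. This is only a sketch: the function name `classify_segv` and the 5-line look-back window are hypothetical choices, and the patterns are taken from the log excerpts in this report, not from any systemd interface.

```python
import re

# Patterns copied from the PID 1 messages quoted in this report.
SEGV = re.compile(r"systemd\[1\]: Caught <SEGV>")
RELOAD = re.compile(r"systemd\[1\]: Reloading\.\.\.")


def classify_segv(lines, window=5):
    """Classify the first PID 1 SEGV found in a list of log lines.

    Returns "reload" if a "Reloading..." line from systemd[1] appears
    within `window` lines before the first "Caught <SEGV>" line,
    "no-reload" if a SEGV is found without one, and None if the log
    contains no SEGV line at all. The window size is an arbitrary
    assumption for illustration.
    """
    for i, line in enumerate(lines):
        if SEGV.search(line):
            preceding = lines[max(0, i - window):i]
            if any(RELOAD.search(p) for p in preceding):
                return "reload"
            return "no-reload"
    return None
```

Feeding it the lines of a saved journal export, e.g. `classify_segv(open(path).read().splitlines())` with `path` pointing at the saved log, gives one label per boot log.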
