[Bug 2071747] [NEW] netplan causes unresponsive system with certain nsswitch config

Adam Saponara 2071747 at bugs.launchpad.net
Tue Jul 2 22:40:41 UTC 2024


Public bug reported:

A recent patch appears to chown networkd-related files to
`root:systemd-network`[1]. If nsswitch.conf is configured with
`group: systemd files`, this appears to create a circular dependency,
as systemd relies on netplan via systemd-networkd. On the next
`systemctl daemon-reload`, pid 1 invokes netplan, and netplan queries
systemd for the group info of `systemd-network`, but systemd cannot
respond yet because it is waiting on netplan. Any program making libc
calls that go through nsswitch to systemd during this time is blocked.
In the trace below, netplan is eventually sent SIGTERM by the
`(sd-executor)` process after netplan's last `epoll_wait` has blocked
for ~45s.
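
To check from another shell whether group lookups are blocked during a
reload, something like the following Python sketch may help. It assumes
the hang is reachable through an ordinary getgrnam-style lookup (Python's
`grp.getgrnam` goes through libc and therefore through nsswitch). The
helper name `group_lookup_completes` and the 5s default are illustrative
and not taken from netplan or systemd.

```python
import subprocess
import sys


def group_lookup_completes(group: str, timeout_s: float = 5.0) -> bool:
    """Return True if an NSS group lookup finishes within timeout_s seconds.

    The lookup runs in a child interpreter so that a blocked libc call
    cannot hang the caller. A group that does not exist still counts as
    "completed": only a hang is reported as False.
    """
    probe = (
        "import grp, sys\n"
        "try:\n"
        "    grp.getgrnam(sys.argv[1])\n"
        "except KeyError:\n"
        "    pass\n"
    )
    try:
        # subprocess.run kills the child and raises if the timeout expires.
        subprocess.run(
            [sys.executable, "-c", probe, group],
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return False
    return True
```

Calling `group_lookup_completes("systemd-network")` while `systemctl
daemon-reload` is running would be expected to return False on an
affected configuration, and True otherwise.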

Here is an strace log of pid 1 during a reload illustrating the problem:

```
30854<(sd-executor)> 1719955780.753479 <... waitid resumed>{si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30866, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.022023>
30854<(sd-executor)> 1719955780.753519 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30866, si_uid=0, si_status=0, si_utime=1, si_stime=3} ---
30854<(sd-executor)> 1719955780.753561 waitid(P_PID, 30856<friendly-recove>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30856, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000039>
30854<(sd-executor)> 1719955780.753646 waitid(P_PID, 30868<systemd-rc-loca>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30868, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000023>
30854<(sd-executor)> 1719955780.753711 waitid(P_PID, 30869<systemd-run-gen>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30869, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000023>
30854<(sd-executor)> 1719955780.753773 waitid(P_PID, 30861<systemd-bless-b>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30861, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000022>
30854<(sd-executor)> 1719955780.753840 waitid(P_PID, 30858<netplan>,  <unfinished ...>
30858<netplan> <snip> (netplan looking up systemd-network group)
30858<netplan> 1719955825.602714 sendto(4<UNIX-STREAM:[182429]>, "{\"method\":\"io.systemd.UserDatabase.GetMemberships\",\"parameters\":{\"groupName\":\"systemd-network\",\"service\":\"io.systemd.DynamicUser\"},\"more\":true}\0", 144, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0)>
30858<netplan> 1719955825.602771 epoll_ctl(5<anon_inode:[eventpoll]>, EPOLL_CTL_MOD, 4<UNIX-STREAM:[182429]>, {events=EPOLLIN, data={u32=3132670720, u64=106458192069376}}) = 0 <0.000010>
30858<netplan> 1719955825.602823 epoll_wait(5<anon_inode:[eventpoll]>, [], 8, 0) = 0 <0.000010>
30858<netplan> 1719955825.602859 brk(0x60d2babee000) = 0x60d2babee000 <0.000017>
30858<netplan> 1719955825.602901 recvfrom(4<UNIX-STREAM:[182429]>, 0x60d2babad2e0, 131080, MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000011>
30858<netplan> 1719955825.602951 epoll_wait(5<anon_inode:[eventpoll]>,  <unfinished ...>
30854<(sd-executor)> 1719955870.201033 <... waitid resumed>0x7fffaec9c570, WEXITED, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) <89.447162>
30854<(sd-executor)> 1719955870.201147 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
30854<(sd-executor)> 1719955870.201607 +++ killed by SIGALRM +++
30858<netplan> 1719955870.201625 <... epoll_wait resumed>0x60d2baba48b0, 8, -1) = -1 EINTR (Interrupted system call) <44.598663>
30858<netplan> 1719955870.201670 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=30854, si_uid=0} ---

```
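
To read the blocked intervals off a trace in this format without doing
the subtraction by hand, a small Python sketch like the one below can
pair each `<unfinished ...>` line with its matching `<... resumed>` line
per pid. The function name `blocked_intervals` is illustrative, and the
regular expressions assume the exact `pid<comm> timestamp syscall` layout
shown above.

```python
import re
from decimal import Decimal

# pid<comm> epoch.micros rest-of-line
LINE_RE = re.compile(r"^\s*(\d+)<[^>]*>\s+(\d+\.\d+)\s+(.*)$")
RESUMED_RE = re.compile(r"^<\.\.\. (\w+) resumed>")


def blocked_intervals(lines):
    """Return (pid, syscall, seconds) for each unfinished/resumed pair."""
    pending = {}  # pid -> (syscall name, start timestamp)
    intervals = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # e.g. the "<snip>" line, which has no timestamp
        pid, ts, rest = int(m.group(1)), Decimal(m.group(2)), m.group(3)
        if rest.rstrip().endswith("<unfinished ...>"):
            pending[pid] = (rest.split("(", 1)[0], ts)
            continue
        resumed = RESUMED_RE.match(rest)
        if resumed and pid in pending:
            name, start = pending.pop(pid)
            if name == resumed.group(1):
                intervals.append((pid, name, ts - start))
    return intervals
```

On the trace above this reports about 89.447s for the `waitid` on netplan
in pid 30854 and about 44.599s for the final `epoll_wait` in pid 30858,
in line with the durations strace printed.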

Changing the nsswitch.conf entry to `group: files systemd`, or removing
`systemd` from it, fixes the problem.
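
To check whether a given machine has the affected ordering, a rough
Python sketch follows. It only inspects the `group` line of nsswitch.conf
text; the function names are illustrative, and it assumes the usual
one-database-per-line syntax with optional `[...]` action items.

```python
import re


def group_sources(nsswitch_text: str) -> list:
    """Return the ordered list of sources on the `group:` line."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        database, sep, rest = line.partition(":")
        if sep and database.strip() == "group":
            # Drop action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def systemd_precedes_files(nsswitch_text: str) -> bool:
    """True if `systemd` is consulted before `files` for group lookups."""
    sources = group_sources(nsswitch_text)
    if "systemd" not in sources:
        return False
    if "files" not in sources:
        return True
    return sources.index("systemd") < sources.index("files")
```

For example, `systemd_precedes_files(open("/etc/nsswitch.conf").read())`
returns True for `group: systemd files` and False for
`group: files systemd`.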

Note this is not resolved by the patch added for a recent similar
bug[2].

[1] https://git.launchpad.net/~ubuntu-core-dev/netplan/+git/ubuntu/tree/debian/patches/lp2065738/0013-libnetplan-use-more-restrictive-file-permissions.patch?h=ubuntu-jammy&id=6836c2bf27a209090ed9eb2c3deceb4cb2c9d85c#n88

[2] https://bugs.launchpad.net/ubuntu/+source/netplan.io/+bug/2071333

** Affects: netplan.io (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2071747
