[Bug 2071747] [NEW] netplan causes unresponsive system with certain nsswitch config
Adam Saponara
2071747 at bugs.launchpad.net
Tue Jul 2 22:40:41 UTC 2024
Public bug reported:
A recent patch appears to chown networkd-related files to
`root:systemd-network`[1]. If nsswitch.conf is configured with
`group: systemd files`, this appears to create a circular dependency,
as systemd relies on netplan via systemd-networkd. On the next
`systemctl daemon-reload`, pid 1 invokes netplan, netplan queries
systemd for group info of `systemd-network`, but systemd cannot respond
yet as it's waiting on netplan. Any programs making libc calls that
nsswitch to systemd during this time are blocked. In the trace below,
netplan's last `epoll_wait` blocks for ~45s before the sd-executor
process (pid 30854) SIGTERMs it.
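One way to observe the stall from another process is to time a group lookup while the reload is in progress. The following is a minimal sketch, not part of netplan: `timed_group_lookup` is a hypothetical helper, and it assumes that Python's `grp.getgrnam` goes through the same libc NSS path as the lookup described above.

```python
import grp
import time


def timed_group_lookup(name, lookup=grp.getgrnam):
    """Return (entry_or_None, elapsed_seconds) for a single group lookup.

    `lookup` defaults to grp.getgrnam, which resolves the name through the
    sources listed on the `group:` line of nsswitch.conf.
    """
    start = time.monotonic()
    try:
        entry = lookup(name)
    except KeyError:
        # No configured source knows this group.
        entry = None
    return entry, time.monotonic() - start


# Example (run in a second shell while `systemctl daemon-reload` is active):
#   entry, seconds = timed_group_lookup("systemd-network")
#   print(entry, round(seconds, 3))
```

On an affected system the call would be expected to take far longer during the reload than when the system is idle; the exact duration is not something this sketch can predict.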
Here is an strace log of pid 1 during a reload illustrating the problem:
```
30854<(sd-executor)> 1719955780.753479 <... waitid resumed>{si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30866, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.022023>
30854<(sd-executor)> 1719955780.753519 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30866, si_uid=0, si_status=0, si_utime=1, si_stime=3} ---
30854<(sd-executor)> 1719955780.753561 waitid(P_PID, 30856<friendly-recove>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30856, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000039>
30854<(sd-executor)> 1719955780.753646 waitid(P_PID, 30868<systemd-rc-loca>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30868, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000023>
30854<(sd-executor)> 1719955780.753711 waitid(P_PID, 30869<systemd-run-gen>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30869, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000023>
30854<(sd-executor)> 1719955780.753773 waitid(P_PID, 30861<systemd-bless-b>, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=30861, si_uid=0, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0 <0.000022>
30854<(sd-executor)> 1719955780.753840 waitid(P_PID, 30858<netplan>, <unfinished ...>
30858<netplan> <snip> (netplan looking up systemd-network group)
30858<netplan> 1719955825.602714 sendto(4<UNIX-STREAM:[182429]>, "{\"method\":\"io.systemd.UserDatabase.GetMemberships\",\"parameters\":{\"groupName\":\"systemd-network\",\"service\":\"io.systemd.DynamicUser\"},\"more\":true}\0", 144, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0)>
30858<netplan> 1719955825.602771 epoll_ctl(5<anon_inode:[eventpoll]>, EPOLL_CTL_MOD, 4<UNIX-STREAM:[182429]>, {events=EPOLLIN, data={u32=3132670720, u64=106458192069376}}) = 0 <0.000010>
30858<netplan> 1719955825.602823 epoll_wait(5<anon_inode:[eventpoll]>, [], 8, 0) = 0 <0.000010>
30858<netplan> 1719955825.602859 brk(0x60d2babee000) = 0x60d2babee000 <0.000017>
30858<netplan> 1719955825.602901 recvfrom(4<UNIX-STREAM:[182429]>, 0x60d2babad2e0, 131080, MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000011>
30858<netplan> 1719955825.602951 epoll_wait(5<anon_inode:[eventpoll]>, <unfinished ...>
30854<(sd-executor)> 1719955870.201033 <... waitid resumed>0x7fffaec9c570, WEXITED, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) <89.447162>
30854<(sd-executor)> 1719955870.201147 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
30854<(sd-executor)> 1719955870.201607 +++ killed by SIGALRM +++
30858<netplan> 1719955870.201625 <... epoll_wait resumed>0x60d2baba48b0, 8, -1) = -1 EINTR (Interrupted system call) <44.598663>
30858<netplan> 1719955870.201670 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=30854, si_uid=0} ---
```
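The durations can be cross-checked from the timestamps in the log above. This is only a worked subtraction using values copied from the trace; the constant names are illustrative, and `Decimal` is used so the microsecond digits are not affected by floating-point rounding.

```python
from decimal import Decimal

# Timestamps copied from the strace log above (seconds since the epoch).
WAITID_START = Decimal("1719955780.753840")      # sd-executor starts waiting on netplan
EPOLL_WAIT_START = Decimal("1719955825.602951")  # netplan blocks waiting for a reply
WAITID_END = Decimal("1719955870.201033")        # sd-executor's waitid is interrupted
EPOLL_WAIT_END = Decimal("1719955870.201625")    # netplan's epoll_wait is interrupted


def elapsed(start, end):
    """Return the interval between two log timestamps, in seconds."""
    return end - start


executor_wait = elapsed(WAITID_START, WAITID_END)        # about 89.4 s
netplan_block = elapsed(EPOLL_WAIT_START, EPOLL_WAIT_END)  # about 44.6 s
print(executor_wait, netplan_block)
```

Both results agree with the call durations strace printed in angle brackets (`<89.447162>` and `<44.598663>`) to within a fraction of a millisecond.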
Changing the nsswitch.conf entry to `group: files systemd`, or removing
`systemd` from the `group` line, fixes the problem.
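To check whether a host has the ordering described above, the `group:` line can be inspected programmatically. The sketch below is an illustration only: `group_sources` and `systemd_before_files` are hypothetical helper names, and treating "`systemd` listed without `files` ahead of it" as the risky case is an assumption drawn from this report.

```python
import re


def group_sources(nsswitch_text):
    """Return the ordered source names on the `group:` line ([] if absent)."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing comments
        match = re.match(r"\s*group\s*:(.*)$", line)
        if match:
            # Remove action items such as [NOTFOUND=return] before splitting.
            return re.sub(r"\[[^\]]*\]", " ", match.group(1)).split()
    return []


def systemd_before_files(nsswitch_text):
    """True if `systemd` is consulted for groups without `files` ahead of it."""
    sources = group_sources(nsswitch_text)
    if "systemd" not in sources:
        return False
    if "files" not in sources:
        return True
    return sources.index("systemd") < sources.index("files")


# Example:
#   with open("/etc/nsswitch.conf") as f:
#       print(systemd_before_files(f.read()))
```

With the configuration from this report (`group: systemd files`) the check returns `True`; with the reordered `group: files systemd` it returns `False`.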
Note this is not resolved by the patch added for a recent similar
bug[2].
[1] https://git.launchpad.net/~ubuntu-core-dev/netplan/+git/ubuntu/tree/debian/patches/lp2065738/0013-libnetplan-use-more-restrictive-file-permissions.patch?h=ubuntu-jammy&id=6836c2bf27a209090ed9eb2c3deceb4cb2c9d85c#n88
[2] https://bugs.launchpad.net/ubuntu/+source/netplan.io/+bug/2071333
** Affects: netplan.io (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to netplan.io in Ubuntu.
Matching subscriptions: foundations-bugs
https://bugs.launchpad.net/bugs/2071747