[Bug 1696415] Re: NetworkManager does not update IPv4 address lifetime even though DHCP lease was successfully renewed
Luke Faraone
luke at faraone.cc
Tue Nov 14 22:15:15 UTC 2017
** Description changed:
- I've found an issue on some of our Xenial office machines, causing
- NetworkManager to drop its IP address lease in some cases when it
- shouldn't. I'm not sure if the actual bug is in NetworkManager or
- perhaps dbus or dhclient, but I'll do my best to help to figure out
- where it is.
+ SRU REQUEST:
+
+ Debdiff (nm-dhcp-helper.debdiff) attached.
+
+ Fixed in current Ubuntu zesty and newer: Bionic uses NM 1.8.x. This bug
+ was fixed upstream in 1.4.
+
+ [Impact]
+
+ * nm-dhcp-helper sometimes fails to notify NetworkManager of a DHCP
+ lease renewal due to a DBus race condition.
+
+ * Upstream NetworkManager 1.4 fixes the race condition by changing
+ nm-dhcp-helper's DBus notification from signal "Event" to method
+ "Notify".
+
+ * Original bug submitter backported NM 1.4's nm-dhcp-helper notification fix to NM 1.2. This SRU request applies that backported patch to Xenial's
+ NM 1.2.x.
+
+ [Test Case]
+
+ * Not reliably reproducible. Out of hundreds of machines, only a
+ dozen or so fail to notify NetworkManager of a DHCP lease
+ renewal about 30-50% of the time. (i.e. It's always the same
+ handful of machines that fail.)
+
+ * All such machines with the patched packages have been fine for weeks,
+ over many dozens of lease renewals.
+
+ [Regression Potential]
+
+ * The patch changes both nm-dhcp-helper and NetworkManager itself. As
+ soon as the new packages are unpacked, the new nm-dhcp-helper will be
+ used on DHCP lease renewals, with the new Notify mechanism. Since the
+ running, old NetworkManager is still expecting Event notifications, the
+ patched nm-dhcp-helper has fallback capability to Event.
+
+ * Once NetworkManager is restarted and is running the patched version,
+ it will have the new Notify support.
+
+ [Other Info]
+
+ * Upstream bug w/ patch:
+ https://bugzilla.gnome.org/show_bug.cgi?id=784636
+
+ * RHEL bug with links to the 1.4 commits from which the patch was
+ derived: https://bugzilla.redhat.com/show_bug.cgi?id=1373276
+
+ * NOTE: The final comment on the upstream GNOME bug claims that the
+ fix is incomplete. However, it is possible that the running
+ NetworkManager was not restarted (see Regression Potential notes above),
+ which is why nm-dhcp-helper is falling back to Event. The remainder of
+ the log messages in that final comment are from a custom wrapper the
+ submitter was running around nm-dhcp-helper. I have deployed the exact
+ same patch (without said wrapper) to real-world systems and tested
+ extensively, and see nothing but successful DHCP lease renewal
+ notifications using D-Bus Notify, not D-Bus Event.
+
+
+ ----
+ I've found an issue on some of our Xenial office machines, causing NetworkManager to drop its IP address lease in some cases when it shouldn't. I'm not sure if the actual bug is in NetworkManager or perhaps dbus or dhclient, but I'll do my best to help to figure out where it is.
What appears to happen:
* NetworkManager is informed of a new IPv4 lease.
* During the lease, dhclient keeps it fresh by renewing it using DHCPREQUESTs regularly.
* In spite of this, NetworkManager drops the IP address from the interface when the last reported lease time expires.
This happens on various machines, once every few days. We are using a
failover DHCP configuration using two machines (192.168.0.3 'bonaire'
and 192.168.0.4 'curacao').
The machine where I've done the debugging is called 'pampus'
(192.168.0.166). As you can see in the logs, at 01:21:06 NetworkManager
reports a new lease with lease time 7200.
jun 07 01:21:06 pampus dhclient[1532]: DHCPREQUEST of 192.168.0.166 on eth0 to 192.168.0.4 port 67 (xid=0x3295b440)
jun 07 01:21:06 pampus dhclient[1532]: DHCPACK of 192.168.0.166 from 192.168.0.4
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] address 192.168.0.166
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] plen 24 (255.255.255.0)
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] gateway 192.168.0.5
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] server identifier 192.168.0.4
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] lease time 7200
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] nameserver '192.168.0.3'
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] nameserver '192.168.0.4'
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] domain name 'office.screenpointmed.com'
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9531] dhcp4 (eth0): state changed bound -> bound
After this, dhclient is supposed to keep the lease fresh, which it does.
E.g. at 03:13:19 you can see a DHCPREQUEST and DHCPACK; I've seen this
DHCPACK in a tcpdump and it contains a new lease time of 7200 seconds.
jun 07 03:13:19 pampus dhclient[1532]: DHCPREQUEST of 192.168.0.166 on eth0 to 192.168.0.4 port 67 (xid=0x3295b440)
jun 07 03:13:19 pampus dhclient[1532]: DHCPACK of 192.168.0.166 from 192.168.0.4
jun 07 03:13:19 pampus dhclient[1532]: bound to 192.168.0.166 -- renewal in 2708 seconds.
However, at 03:21:07 (exactly 2 hours and 1 second after the last lease
reported by NetworkManager) Avahi and NTP report that the IP address is
gone:
jun 07 03:21:07 pampus avahi-daemon[1167]: Withdrawing address record for 192.168.0.166 on eth0.
jun 07 03:21:07 pampus avahi-daemon[1167]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.166.
jun 07 03:21:07 pampus avahi-daemon[1167]: Interface eth0.IPv4 no longer relevant for mDNS.
jun 07 03:21:08 pampus ntpd[18832]: Deleting interface #3 eth0, 192.168.0.166#123, interface stats: received=2512, sent=2549, dropped=0, active_time=111819 secs
So I suspect NetworkManager dropped the IP address from the interface,
because it wasn't informed by dhclient that the lease was renewed. The
logs don't explicitly say this, so I may have to turn on more verbose
debugging logs in NetworkManager or dhclient to verify this.
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: network-manager 1.2.6-0ubuntu0.16.04.1
ProcVersionSignature: Ubuntu 4.4.0-66.87-generic 4.4.44
Uname: Linux 4.4.0-66-generic x86_64
NonfreeKernelModules: nvidia_uvm nvidia_drm nvidia_modeset nvidia
ApportVersion: 2.20.1-0ubuntu2.6
Architecture: amd64
Date: Wed Jun 7 14:48:59 2017
IfupdownConfig:
# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback
InstallationDate: Installed on 2016-11-04 (214 days ago)
InstallationMedia: Ubuntu 14.04.5 LTS "Trusty Tahr" - Release amd64 (20160803)
IpRoute:
default via 192.168.0.5 dev eth0 proto static metric 100
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.166
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.166 metric 100
IwConfig:
lo no wireless extensions.
eth1 no wireless extensions.
eth0 no wireless extensions.
NetworkManager.state:
[main]
NetworkingEnabled=true
WirelessEnabled=true
WWANEnabled=true
WimaxEnabled=true
RfKill:
SourcePackage: network-manager
UpgradeStatus: No upgrade log present (probably fresh install)
nmcli-con:
NAME UUID TYPE TIMESTAMP TIMESTAMP-REAL AUTOCONNECT AUTOCONNECT-PRIORITY READONLY DBUS-PATH ACTIVE DEVICE STATE ACTIVE-PATH
Wired connection 1 37da1802-e1ce-3326-a6d0-f855cc32806d 802-3-ethernet 1496839466 wo 07 jun 2017 14:44:26 CEST yes 4294966297 no /org/freedesktop/NetworkManager/Settings/0 yes eth0 activated /org/freedesktop/NetworkManager/ActiveConnection/0
Wired connection 2 a040d7fe-3c52-39ba-82b8-50fad0b602c1 802-3-ethernet 1496399665 vr 02 jun 2017 12:34:25 CEST yes 4294966297 no /org/freedesktop/NetworkManager/Settings/1 no -- -- --
nmcli-dev:
DEVICE TYPE STATE DBUS-PATH CONNECTION CON-UUID CON-PATH
eth0 ethernet connected /org/freedesktop/NetworkManager/Devices/0 Wired connection 1 37da1802-e1ce-3326-a6d0-f855cc32806d /org/freedesktop/NetworkManager/ActiveConnection/0
eth1 ethernet unavailable /org/freedesktop/NetworkManager/Devices/2 -- -- --
lo loopback unmanaged /org/freedesktop/NetworkManager/Devices/1 -- -- --
nmcli-nm:
RUNNING VERSION STATE STARTUP CONNECTIVITY NETWORKING WIFI-HW WIFI WWAN-HW WWAN
running 1.2.6 connected started full enabled enabled enabled enabled enabled
-
- ------------------------------------------------
- SRU REQUEST:
-
- Debdiff (nm-dhcp-helper.debdiff) attached.
-
- Fixed in current Ubuntu zesty and newer: Bionic uses NM 1.8.x. This bug
- was fixed upstream in 1.4.
-
- [Impact]
-
- * nm-dhcp-helper sometimes fails to notify NetworkManager of a DHCP
- lease renewal due to a DBus race condition.
-
- * Upstream NetworkManager 1.4 fixes the race condition by changing
- nm-dhcp-helper's DBus notification from signal "Event" to method
- "Notify".
-
- * Original bug submitter backported NM 1.4's nm-dhcp-helper notification fix to NM 1.2. This SRU request applies that backported patch to Xenial's
- NM 1.2.x.
-
- [Test Case]
-
- * Not reliably reproducible. Out of hundreds of machines, only a
- dozen or so fail to notify NetworkManager of a DHCP lease
- renewal about 30-50% of the time. (i.e. It's always the same
- handful of machines that fail.)
-
- * All such machines with the patched packages have been fine for weeks,
- over many dozens of lease renewals.
-
- [Regression Potential]
-
- * The patch changes both nm-dhcp-helper and NetworkManager itself. As
- soon as the new packages are unpacked, the new nm-dhcp-helper will be
- used on DHCP lease renewals, with the new Notify mechanism. Since the
- running, old NetworkManager is still expecting Event notifications, the
- patched nm-dhcp-helper has fallback capability to Event.
-
- * Once NetworkManager is restarted and is running the patched version,
- it will have the new Notify support.
-
- [Other Info]
-
- * Upstream bug w/ patch:
- https://bugzilla.gnome.org/show_bug.cgi?id=784636
-
- * RHEL bug with links to the 1.4 commits from which the patch was
- derived: https://bugzilla.redhat.com/show_bug.cgi?id=1373276
-
- * NOTE: The final comment on the upstream GNOME bug claims that the
- fix is incomplete. However, it is possible that the running
- NetworkManager was not restarted (see Regression Potential notes above),
- which is why nm-dhcp-helper is falling back to Event. The remainder of
- the log messages in that final comment are from a custom wrapper the
- submitter was running around nm-dhcp-helper. I have deployed the exact
- same patch (without said wrapper) to real-world systems and tested
- extensively, and see nothing but successful DHCP lease renewal
- notifications using D-Bus Notify, not D-Bus Event.
--
You received this bug notification because you are a member of Network-
manager, which is subscribed to NetworkManager.
https://bugs.launchpad.net/bugs/1696415
Title:
NetworkManager does not update IPv4 address lifetime even though DHCP
lease was successfully renewed
To manage notifications about this bug go to:
https://bugs.launchpad.net/network-manager/+bug/1696415/+subscriptions
More information about the Ubuntu-reviews
mailing list