[Bug 2057629] [NEW] DNS busted on system that commissioned w/ a NIC MAC of all 0's
dann frazier
2057629 at bugs.launchpad.net
Tue Mar 12 13:35:27 UTC 2024
Public bug reported:
We had a weird issue where a new system we deployed with MAAS was
unusable because it couldn't resolve DNS names. Ultimately the problem
was that netplan was configured to match a USB NIC by a MAC of all 0's,
and that messes up the loopback interface (lo).
Here's a cut & paste of my diagnosis notes. I believe it to be related
to, but different from, bug 1936972.
I was verifying an SRU and hit what I think is the same problem. DNS
wasn’t working - but networking generally was. I finally noticed this:
ubuntu@hinyari:~$ ip addr
1: usb0: <LOOPBACK> mtu 65536 qdisc noqueue state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host usb0
       valid_lft forever preferred_lft forever
2: enP6p1s0f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b8:3f:d2:1d:37:40 brd ff:ff:ff:ff:ff:ff
3: enP6p1s0f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:3f:d2:1d:37:41 brd ff:ff:ff:ff:ff:ff
    inet 10.229.100.0/16 brd 10.229.255.255 scope global enP6p1s0f1np1
       valid_lft forever preferred_lft forever
    inet6 fe80::ba3f:d2ff:fe1d:3741/64 scope link
       valid_lft forever preferred_lft forever
4: enx9699ad470dd1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 96:99:ad:47:0d:d1 brd ff:ff:ff:ff:ff:ff
There is no lo. Rather, there’s a usb0 device with a MAC of all 0’s. I’m
guessing this is the host Redfish interface, which often has a NULL MAC
in hardware, and the kernel generates one randomly on boot.
The netplan config has this entry:
        usb0:
            match:
                macaddress: 00:00:00:00:00:00
            mtu: 1500
            set-name: usb0
    version: 2
So I’m guessing what is happening is that netplan is deciding that it
should rename the device that has a MAC of all 0’s to usb0. The device
called usb0 is the loopback device - the USB device is actually called
enx9699ad470dd1. Presumably it later got a random MAC assigned by the
kernel.
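The suspected mismatch can be modelled in a few lines. This is only a sketch, not netplan's actual matching code: `match_by_mac` and the interface table are hypothetical, and the pre-rename name `lo` for the loopback device is an assumption. The MAC addresses are the ones from the `ip addr` output above.

```python
# Hypothetical model of "match by MAC" renaming; not netplan's real logic.

def match_by_mac(interfaces, macaddress):
    """Return the names of all interfaces whose MAC equals `macaddress`."""
    wanted = macaddress.lower()
    return [name for name, mac in interfaces.items() if mac.lower() == wanted]

# Interface table as it presumably looked before the rename (assumption:
# the loopback device still had its usual name, lo).
interfaces = {
    "lo": "00:00:00:00:00:00",
    "enP6p1s0f0np0": "b8:3f:d2:1d:37:40",
    "enP6p1s0f1np1": "b8:3f:d2:1d:37:41",
    "enx9699ad470dd1": "96:99:ad:47:0d:d1",  # USB NIC, MAC already randomized
}

# The netplan entry asks for the device with an all-zero MAC to become usb0.
matched = match_by_mac(interfaces, "00:00:00:00:00:00")
print(matched)  # prints ['lo']
```

Under these assumptions the only device that satisfies the match is the loopback device, which would explain why it ends up named usb0.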
This presumably breaks DNS because the system is configured to use
127.0.0.53 as the DNS server. systemd-resolved should be listening there
and forwarding requests to the real DNS server. But I suspect
systemd-resolved is trying to bind to lo, which doesn’t exist.
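A quick way to spot this symptom on other machines is to look for a loopback device that is not called lo. The sketch below parses `ip addr` style text; `find_misnamed_loopback` is a hypothetical helper, and the sample is an excerpt of the output above. It only detects the renamed device and does not confirm the systemd-resolved theory.

```python
import re

# Matches the first line of each interface block in `ip addr` / `ip link`
# output, e.g. "1: usb0: <LOOPBACK> mtu 65536 ...".
_IFACE_RE = re.compile(r"^\d+:\s+([^:\s]+):\s+<([^>]*)>", re.MULTILINE)

def find_misnamed_loopback(ip_output):
    """Return names of LOOPBACK-flagged interfaces that are not called lo."""
    misnamed = []
    for name, flags in _IFACE_RE.findall(ip_output):
        if "LOOPBACK" in flags.split(",") and name != "lo":
            misnamed.append(name)
    return misnamed

sample = """\
1: usb0: <LOOPBACK> mtu 65536 qdisc noqueue state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: enP6p1s0f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:3f:d2:1d:37:41 brd ff:ff:ff:ff:ff:ff
"""
print(find_misnamed_loopback(sample))  # prints ['usb0']
```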
Perhaps MAAS, cloud-init or curtin or whatever creates the netplan
config should be told not to do this. Perhaps cdc-ether shouldn’t expose
a 0 MAC to userspace before it generates a random one. Or perhaps each
of these pieces should notice all 0's MACs and handle them specially.
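As an illustration of the "handle them specially" option, here is a minimal sketch of a guard a config generator could apply before emitting a MAC match. It is not code from MAAS, cloud-init or curtin; `UNUSABLE_MACS`, `is_usable_mac` and `ethernet_entry` are hypothetical names, and treating only the all-zero address as unusable is an assumption.

```python
# Hypothetical guard for a netplan config generator; not code from MAAS,
# cloud-init or curtin.

UNUSABLE_MACS = {"00:00:00:00:00:00"}  # assumption: only all-zero is special

def is_usable_mac(mac):
    """True if `mac` is non-empty and not a known placeholder address."""
    return bool(mac) and mac.lower() not in UNUSABLE_MACS

def ethernet_entry(name, mac, mtu=1500):
    """Build one netplan ethernets entry, skipping match/set-name
    when the recorded MAC cannot identify the device."""
    entry = {"mtu": mtu}
    if is_usable_mac(mac):
        entry["match"] = {"macaddress": mac.lower()}
        entry["set-name"] = name
    return entry

# The NIC recorded with an all-zero MAC gets no match stanza ...
print(ethernet_entry("usb0", "00:00:00:00:00:00"))
# ... while a NIC with a real MAC is matched and renamed as before.
print(ethernet_entry("enP6p1s0f1np1", "B8:3F:D2:1D:37:41"))
```

Whether such an entry should be kept without a match or dropped entirely is a decision for the affected projects; the sketch only shows where the check would sit.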
** Affects: curtin
Importance: Undecided
Status: New
** Affects: maas
Importance: Undecided
Status: New
** Affects: netplan
Importance: Undecided
Status: New
** Also affects: curtin
Importance: Undecided
Status: New
** Also affects: netplan
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to curtin.
https://bugs.launchpad.net/bugs/2057629
Title:
DNS busted on system that commissioned w/ a NIC MAC of all 0's
Status in curtin:
New
Status in MAAS:
New
Status in netplan:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/2057629/+subscriptions