[Bug 1629797] Re: resolve service in nsswitch.conf adds 25 seconds to failed lookups before systemd-resolved is up
Scott Moser
smoser at ubuntu.com
Tue Nov 22 04:42:14 UTC 2016
Marking this verified.
I booted an instance in GCE.
## launch an instance
project="smoser-00"
# from gcloud compute images list ubuntu-1604-xenial-v20161115
img="/ubuntu-os-cloud/ubuntu-1604-xenial-v20161115"
name="smfoo3"
zone="us-east1-b"
mtype="f1-micro"
gcloud compute "--project=$project" instances create "$name" \
"--zone=$zone" "--machine-type=$mtype" --network=default \
"--maintenance-policy=MIGRATE" \
--image="$img" \
--boot-disk-size=10 --boot-disk-type=pd-standard \
"--boot-disk-device-name=$name"
## ssh in
# get htools for saving logs and such
% git clone https://gist.github.com/29ea35a797c0df1fcb6ac875a024efa9.git htools
% sudo ./htools/save-old-data orig-boot
new instance local: not found
new instance net : not found
reformattable: not found
disk_setup ran: true
mounts ran: true
proc-mounts:
/etc/fstab:
% sudo ./htools/enable-proposed
deb http://us-east1.gce.archive.ubuntu.com/ubuntu/ xenial-proposed main universe
% sudo apt-get update -qy && sudo apt-get install cloud-init -qy
% dpkg-query --show cloud-init
cloud-init 0.7.8-49-g9e904bb-0ubuntu1~16.04.1
% sudo ./htools/do-reboot clean
cleared /var/lib/cloud
cleared logs
rebooting
# ssh back in
% cat /proc/uptime
29.78 19.66
% journalctl --full --no-pager | grep -i "ordering" || echo no order
no order
% journalctl --full --no-pager | grep -i "break" || echo no break
%
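
The uptime check above can be scripted. This is only a sketch: the helper name boot_time_ok is made up for illustration, and the default limit of 30 seconds is taken from the SRU test case in the bug description.

```sh
#!/bin/sh
# boot_time_ok UPTIME_LINE [LIMIT]
# Succeeds when the first field of a /proc/uptime line (seconds since boot)
# is below LIMIT. LIMIT defaults to 30, the bound from the SRU test case.
# (Hypothetical helper, not part of htools or cloud-init.)
boot_time_ok() {
    printf '%s\n' "$1" |
        awk -v limit="${2:-30}" '{ exit !(($1 + 0) < (limit + 0)) }'
}

# Example using the value captured in the session above.
if boot_time_ok "29.78 19.66"; then
    echo "boot time ok"
else
    echo "boot too slow"
fi
```

On a live system this would be called as boot_time_ok "$(cat /proc/uptime)". Since /proc/uptime is read after logging in, the value is an upper bound on the actual boot time.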
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to dbus in Ubuntu.
https://bugs.launchpad.net/bugs/1629797
Title:
resolve service in nsswitch.conf adds 25 seconds to failed lookups
before systemd-resolved is up
Status in cloud-init:
Fix Committed
Status in D-Bus:
Unknown
Status in cloud-init package in Ubuntu:
Fix Released
Status in dbus package in Ubuntu:
Won't Fix
Status in cloud-init source package in Xenial:
Confirmed
Status in cloud-init source package in Yakkety:
Fix Released
Bug description:
=== Begin SRU Template ===
[Impact]
In cases where cloud-init used DNS during early boot and the system was
configured in nsswitch.conf to use systemd-resolved, DNS lookups would
time out, making system boot terribly slow.
[Test Case]
Boot a system on GCE.
Check for WARN in /var/log/messages.
Check that time to boot is reasonable (<30 seconds). In the failure case
the times would be minutes.
[Regression Potential]
Changing the order of boot can be dangerous. There is a real chance of
regression here, but it should be fairly small, as xenial does not include
systemd-resolved usage. This was first noticed on yakkety, where it did.
[Other Info]
It seems useful to SRU this in the event that systemd-resolved is used
on 16.04, or in the case where a user upgrades components (admittedly a
small use case).
=== End SRU Template ===
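
Whether a system is exposed to the condition under [Impact] depends on the hosts line of nsswitch.conf listing the resolve service. A minimal sketch for checking that follows. The helper name uses_resolve is hypothetical, and the pattern assumes the usual whitespace-separated "hosts:" line.

```sh
#!/bin/sh
# uses_resolve [FILE]
# Succeeds if the hosts: line of FILE lists the "resolve" NSS service.
# FILE defaults to /etc/nsswitch.conf. (Hypothetical helper for illustration.)
uses_resolve() {
    grep -Eq '^hosts:.*[[:space:]]resolve([[:space:]]|$)' \
        "${1:-/etc/nsswitch.conf}" 2>/dev/null
}

if uses_resolve; then
    echo "hosts lookups go through the resolve service"
else
    echo "resolve service not listed for hosts"
fi
```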
During boot, cloud-init does DNS resolution checks to see whether particular metadata services are available (in order to determine which cloud it is running on). These checks happen before systemd-resolved is up[0], and each lookup that fails takes 25 seconds to complete.
This has substantial impact on boot time in all contexts, because
cloud-init attempts to resolve three known-invalid addresses
("does-not-exist.example.com.", "example.invalid." and a random string) to
enable it to detect when it's running in an environment where a DNS
server will always return some sort of redirect. As such, we're
talking a minimum impact of 75 seconds in all environments. This
increases when cloud-init is configured to check for multiple
environments.
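
The 75-second figure is three failed lookups at 25 seconds each. A rough sketch of that arithmetic follows, along with a way to time the lookups by hand. The function names estimate_delay and time_invalid_lookups are illustrative only and are not part of cloud-init. The random third name is left out of the timing loop.

```sh
#!/bin/sh
# estimate_delay LOOKUPS [PER_LOOKUP]
# Prints the total delay in seconds for LOOKUPS failed lookups.
# PER_LOOKUP defaults to 25, the per-lookup delay reported in this bug.
estimate_delay() {
    echo $(( $1 * ${2:-25} ))
}

# Time a lookup of each fixed known-invalid name. getent goes through
# NSS, so it follows the hosts: line in nsswitch.conf.
time_invalid_lookups() {
    for name in "does-not-exist.example.com." "example.invalid."; do
        start=$(date +%s)
        getent hosts "$name" >/dev/null 2>&1
        end=$(date +%s)
        echo "$name: $((end - start))s"
    done
}

estimate_delay 3   # prints 75
```

Each additional failed lookup adds another 25 seconds; for example, estimate_delay 4 gives 100.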
This means that yakkety is consistently taking 2-3 minutes to boot on
EC2 and GCE, compared to the ~30 seconds of the first boot and ~10
seconds thereafter in xenial.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1629797/+subscriptions