[Bug 2057965] [NEW] google-startup-scripts runs before cloud-init finished network setup
Catherine Redfield
2057965 at bugs.launchpad.net
Thu Mar 14 19:51:37 UTC 2024
Public bug reported:
New GCP dailies are failing startup-script tests because the network is
not fully set up when startup scripts are run. The failure can be
reproduced as follows:
Using startup_script.sh:
#!/bin/bash
cp /etc/apt/sources.list /tmp/startup-sources.list
$ gcloud compute instances create startup-test --image daily-ubuntu-2204-jammy-v20240314 --image-project ubuntu-os-cloud-devel --metadata-from-file=startup-script=startup_script.sh
[...]
$ ssh [INSTANCE IP]
> diff /tmp/startup-sources.list /etc/apt/sources.list
0a1,8
> ## Note, this file is written by cloud-init on first boot of an instance
> ## modifications made here will not survive a re-bundle.
> ## if you wish to make changes you can:
> ## a.) add 'apt_preserve_sources_list: true' to /etc/cloud/cloud.cfg
> ## or do the same in user-data
> ## b.) add sources in /etc/apt/sources.list.d
> ## c.) make changes to template file /etc/cloud/templates/sources.list.tmpl
>
3,4c11,12
< deb http://archive.ubuntu.com/ubuntu/ jammy main restricted
< # deb-src http://archive.ubuntu.com/ubuntu/ jammy main restricted
---
> deb http://us-central1.gce.archive.ubuntu.com/ubuntu/ jammy main restricted
> # deb-src http://us-central1.gce.archive.ubuntu.com/ubuntu/ jammy main restricted
8,9c16,17
< deb http://archive.ubuntu.com/ubuntu/ jammy-updates main restricted
< # deb-src http://archive.ubuntu.com/ubuntu/ jammy-updates main restricted
---
[...]
Earlier images (such as ubuntu-2204-jammy-v20240307 in ubuntu-os-cloud) do not show this behaviour. The regression comes from a change in ubuntu-pro 31 (see https://github.com/canonical/ubuntu-pro-client/blob/dfe1f1ed4678c50240d4e251f41d33bb4034135e/debian/changelog#L40 for details) that removes a systemd ordering on cloud-config.service. A side effect of this change was the removal of cloud-config.service (and ubuntu-advantage.service) from the systemd critical chain of google-startup-scripts.service.
On v20240307 (startup scripts execute correctly):
catred@startup-test-control:~$ systemd-analyze critical-chain google-startup-scripts.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
google-startup-scripts.service +18.262s
└─multi-user.target @28.480s
  └─ubuntu-advantage.service @28.480s
    └─cloud-config.service @27.372s +1.095s
      └─snapd.seeded.service @20.048s +7.312s
        └─snapd.service @12.469s +7.555s
          └─basic.target @11.558s
            └─sockets.target @11.540s
              └─snap.lxd.daemon.unix.socket @24.376s
                └─sysinit.target @10.825s
                  └─cloud-init.service @8.432s +2.267s
                    └─systemd-networkd-wait-online.service @6.467s +1.935s
                      └─systemd-networkd.service @6.347s +112ms
                        └─network-pre.target @6.328s
                          └─cloud-init-local.service @4.309s +2.006s
                            └─systemd-remount-fs.service @1.829s +68ms
                              └─systemd-fsck-root.service @1.587s +160ms
                                └─systemd-journald.socket @1.292s
                                  └─system.slice @1.068s
                                    └─-.slice @1.068s
On v20240314 (startup scripts fail):
catred@startup-test:~$ systemd-analyze critical-chain google-startup-scripts.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
google-startup-scripts.service +260ms
└─multi-user.target @29.237s
  └─chrony.service @30.240s +56ms
    └─basic.target @13.364s
      └─sockets.target @13.225s
        └─snap.lxd.user-daemon.unix.socket @26.765s
          └─sysinit.target @12.550s
            └─cloud-init.service @7.933s +4.503s
              └─systemd-networkd-wait-online.service @6.741s +1.171s
                └─systemd-networkd.service @6.593s +124ms
                  └─network-pre.target @6.573s
                    └─cloud-init-local.service @4.478s +2.083s
                      └─systemd-remount-fs.service @1.717s +64ms
                        └─systemd-fsck-root.service @1.510s +95ms
                          └─systemd-journald.socket @1.193s
                            └─-.mount @974ms
                              └─-.slice @974ms
This can be fixed by adding an explicit `After=cloud-config.service` to the google-startup-scripts.service file, which enforces the correct ordering between google-startup-scripts and cloud-init.
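For anyone who wants to try the ordering locally before a packaged fix lands, the same directive can probably be applied through a systemd drop-in rather than by editing the shipped unit file. The following is only a sketch under stated assumptions: UNIT_ROOT is a hypothetical variable introduced for this example (it defaults to a scratch directory so the script can be run without root), and the drop-in file name 10-after-cloud-config.conf is arbitrary. It is not the change made in the package.

```shell
#!/bin/bash
# Sketch: stage a drop-in that orders google-startup-scripts.service after
# cloud-config.service.
# ASSUMPTION: UNIT_ROOT is a name made up for this example. It defaults to a
# scratch directory; on a real instance it would be /etc/systemd/system.
UNIT_ROOT="${UNIT_ROOT:-$(mktemp -d)}"
dropin_dir="$UNIT_ROOT/google-startup-scripts.service.d"
dropin_file="$dropin_dir/10-after-cloud-config.conf"  # file name is arbitrary

mkdir -p "$dropin_dir"

# After= is ordering only: it delays the start of this unit until
# cloud-config.service has finished starting, if both are being started.
cat > "$dropin_file" <<'EOF'
[Unit]
After=cloud-config.service
EOF

echo "wrote $dropin_file"

# On a real system (as root, with UNIT_ROOT=/etc/systemd/system) follow with:
#   systemctl daemon-reload
#   systemctl show -p After google-startup-scripts.service
```

After a reboot with the drop-in in place, cloud-config.service would be expected to show up again in the `systemd-analyze critical-chain google-startup-scripts.service` output, though that has not been verified here.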
** Affects: google-guest-agent (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to google-guest-agent in Ubuntu.
https://bugs.launchpad.net/bugs/2057965
Title:
google-startup-scripts runs before cloud-init finished network setup
Status in google-guest-agent package in Ubuntu:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/google-guest-agent/+bug/2057965/+subscriptions