[Bug 1938299] Re: Unable to SSH Into Instance when deploying Impish 21.10
Chad Smith
1938299 at bugs.launchpad.net
Tue Oct 12 02:37:59 UTC 2021
To clarify the actual root cause here and reflect it back to this
original bug:
google-guest-agent defines a `PartOf=` relationship with
systemd-networkd.service[1]. This relationship means that if
systemd-networkd.service is stopped, google-guest-agent.service gets
stopped. When systemd-networkd.service is restarted, so is
google-guest-agent.
But if systemd-networkd.service is subsequently started after a previous
stop call, google-guest-agent is left in the stopped state. The call
`netplan apply` (issued by cloud-init after writing network config) in
fact calls systemctl stop systemd-networkd.service and follows it with a
'start' instead of directly invoking systemctl restart
systemd-networkd.service[2]. This leaves google-guest-agent in the
stopped state indefinitely.
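As a rough illustration of the behaviour described above, here is a toy
Python model of the propagation rules. It is not systemd code: the Unit
class, its method names and the "restart only dependents that were
running" simplification are assumptions made for this sketch.

```python
class Unit:
    """Toy model of a systemd unit (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.active = False
        self.dependents = []  # units that declare PartOf=<this unit>

    def start(self):
        # 'start' is NOT propagated through PartOf=.
        self.active = True

    def stop(self):
        self.active = False
        # 'stop' is propagated to every unit that is PartOf= this one.
        for dep in self.dependents:
            dep.stop()

    def restart(self):
        # 'restart' is propagated too; modelled here as restarting the
        # dependents that were running before the restart.
        was_active = [dep for dep in self.dependents if dep.active]
        self.stop()
        self.start()
        for dep in was_active:
            dep.restart()


def build_units():
    """Return (networkd, agent) with both units running."""
    networkd = Unit("systemd-networkd.service")
    agent = Unit("google-guest-agent.service")
    networkd.dependents.append(agent)
    networkd.start()
    agent.start()
    return networkd, agent


# Case 1: a single restart keeps the agent running.
networkd, agent = build_units()
networkd.restart()
agent_after_restart = agent.active

# Case 2: stop followed by start (the sequence netplan apply uses)
# leaves the agent stopped.
networkd, agent = build_units()
networkd.stop()
networkd.start()
agent_after_stop_start = agent.active

print("agent active after restart:   ", agent_after_restart)
print("agent active after stop+start:", agent_after_stop_start)
```

In this model the agent is still running after a single restart of
networkd, but not after a separate stop and start, which matches the
symptom seen on GCE.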
I'm not entirely sure netplan can fix this issue, because of some other
cleanup it does between the networkd stop and start, but I have
reflected this bug to the netplan.io folks and we'll see what the
consensus is about whether this can be resolved by issuing a single
"systemctl restart" instead of separate "systemctl stop" and
"systemctl start" calls.
References:
[1] https://github.com/GoogleCloudPlatform/guest-agent/blob/main/google-guest-agent.service#L13
[2] https://git.launchpad.net/ubuntu/+source/netplan.io/tree/netplan/cli/commands/apply.py?h=applied/ubuntu/devel#n169
** Also affects: netplan.io (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to google-guest-agent in Ubuntu.
Matching subscriptions: foundations-bugs
https://bugs.launchpad.net/bugs/1938299
Title:
Unable to SSH Into Instance when deploying Impish 21.10
Status in cloud-init package in Ubuntu:
Fix Released
Status in google-guest-agent package in Ubuntu:
Confirmed
Status in netplan.io package in Ubuntu:
New
Status in cloud-init source package in Bionic:
Fix Committed
Status in google-guest-agent source package in Bionic:
Confirmed
Status in netplan.io source package in Bionic:
New
Status in cloud-init source package in Focal:
Fix Committed
Status in google-guest-agent source package in Focal:
Confirmed
Status in netplan.io source package in Focal:
New
Status in cloud-init source package in Hirsute:
Fix Committed
Status in google-guest-agent source package in Hirsute:
Confirmed
Status in netplan.io source package in Hirsute:
New
Status in cloud-init source package in Impish:
Fix Released
Status in google-guest-agent source package in Impish:
Confirmed
Status in netplan.io source package in Impish:
New
Bug description:
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network
interfaces. On Ubuntu, this results in a 'netplan apply' call during
'init' stage for any Ubuntu-based distro on a datasource that has a
NETWORK dependency. On GCE, this additional 'netplan apply' conflicts
with the google-guest-agent service, resulting in an instance that
cannot be connected to.
To fix this, we added a new 'disable_network_activation' option that
can be enabled in /etc/cloud/cloud.cfg to disable the activation of
network interfaces in 'init' stage.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. The instance should launch successfully and be reachable over SSH
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
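Steps 8 and 9 can be checked mechanically. The following is a minimal
Python sketch, not part of the integration test mentioned above; the
function names find_activation_markers and check_log are made up for
this example, and only the two log strings and the log path come from
the steps.

```python
# Strings that, per steps 8 and 9, must NOT appear in the log when
# disable_network_activation is in effect.
ACTIVATION_MARKERS = (
    "['netplan', 'apply']",
    "Bringing up newly configured network interfaces",
)


def find_activation_markers(log_text):
    """Return the markers from steps 8 and 9 that occur in log_text."""
    return [marker for marker in ACTIVATION_MARKERS if marker in log_text]


def check_log(path="/var/log/cloud-init.log"):
    """Read the cloud-init log and return any markers found in it."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return find_activation_markers(log_file.read())
```

Running check_log() on the new instance should return an empty list if
network activation was skipped; any returned string points at the step
that failed.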
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not bringing them up should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't should also not generally cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are
inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a \
--network "default" --no-restart-on-failure \
--image-project ubuntu-os-cloud-devel \
--image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
This results in a successful deploy, yet the instance is inaccessible
via SSH from the end user's configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI: create
a new virtual machine using the daily-ubuntu-2110-impish-v20210720
image or later and instruct the virtual machine to import an
ssh_pub_key in the security tab. The instance will start, yet still be
inaccessible via the user's private SSH key.
The google-guest-agent.service appears to be responsible for adding
the Google project SSH keys to the instance once it's deployed. Please
see its status below, as queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END)
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1938299/+subscriptions