[Bug 1743249] Re: Failed Deployment after timeout trying to retrieve grub cfg
Jason Hobbs
jason.hobbs at canonical.com
Tue Feb 6 00:32:53 UTC 2018
@Mike, you can see the stacked response behavior in
https://bugs.launchpad.net/maas/+bug/1743249/+attachment/5046952/+files/spearow-fall-back-to-default-amd64.pcap
You can tell that packet 90573 is a response to the requests for
grub.cfg-<mac> because its destination port (25305) is the source port
those requests came from (packets 2 through 38).
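For anyone who wants to repeat that check without reading the capture by eye, here is a minimal sketch of the port-matching rule. It assumes the packets have already been reduced to (packet number, source port, destination port) records. The `Packet` type, the `responses_to` helper, and the server-side port 40000 are made up for illustration; only packet numbers 2, 38 and 90573 and port 25305 come from the capture.

```python
from dataclasses import dataclass
from typing import List

TFTP_PORT = 69  # well-known UDP port that TFTP read requests are sent to


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of one UDP packet from a capture."""
    number: int    # packet number in the capture
    src_port: int  # UDP source port
    dst_port: int  # UDP destination port


def responses_to(request: Packet, packets: List[Packet]) -> List[Packet]:
    """Return later packets addressed to the port the request was sent from."""
    return [
        p for p in packets
        if p.number > request.number and p.dst_port == request.src_port
    ]


# Illustrative records: the first and last request in the 2..38 range and
# the late response. The server port 40000 is an assumption, not taken
# from the capture.
capture = [
    Packet(2, 25305, TFTP_PORT),
    Packet(38, 25305, TFTP_PORT),
    Packet(90573, 40000, 25305),
]

for match in responses_to(capture[0], capture):
    print(match.number)
```

The retried requests are not reported as responses because they are addressed to port 69, not back to the client's port.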
On Mon, Feb 5, 2018 at 6:11 PM, Mike Pontillo
<mike.pontillo at canonical.com> wrote:
> Steve, can you be more specific about which packet capture showed the
> "stacked OACK" behavior?
>
> I looked at a packet capture Andres pointed me to, and don't see the
> "stacked OACKs" you describe. Each TFTP transaction (per RFC 1350) is
> indicated by the (source port, dest port) tuple, and I see that MAAS
> correctly OACKs each individual transaction (per RFC 2347) - not the
> retry packets within the same transaction. Subsequently (in the same
> second, after the client ACKs the data packet) it re-requests the same
> file (which is the bug in grub that I understand is fixed), and then the
> client starts a new transaction and MAAS correctly issues another OACK.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1743249
>
> Title:
> Failed Deployment after timeout trying to retrieve grub cfg
>
> Status in MAAS:
> New
> Status in grub2 package in Ubuntu:
> In Progress
>
> Bug description:
> A node failed to deploy after it failed to retrieve a grub.cfg from
> MAAS due to a timeout. In the logs, it's clear that the server tried
> to retrieve the grub cfg many times, over about 30 seconds:
>
> http://paste.ubuntu.com/26387256/
>
> We see the same thing for other hosts around the same time:
>
> http://paste.ubuntu.com/26387262/
>
> It seems like MAAS is taking way too long to respond to these
> requests.
>
> This is very similar to bug 1724677, which was happening
> pre-Meltdown/Spectre. The only difference is we don't see "[critical] TFTP
> back-end failed" in the logs anymore.
>
> I connected to the console on this system and it had errors about
> timing out retrieving the grub-cfg, then it had an error message along
> the lines of "error not an ip" and then "double free". After I
> connected but before I could get a screenshot the system rebooted and
> was directed by MAAS to power off, which it did successfully after
> booting to Linux.
>
> Full logs are available here:
> https://10.245.162.101/artifacts/14a34b5a-9321-4d1a-b2fa-ed277a020e7c/cpe_cloud_395/infra-logs.tar
>
> This is with 2.3.0-6434-gd354690-0ubuntu1~16.04.1.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/maas/+bug/1743249/+subscriptions
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to grub2 in Ubuntu.
https://bugs.launchpad.net/bugs/1743249