[Bug 1662804] [NEW] [SRU] Agent is failing to process HA router if initialize() fails

Launchpad Bug Tracker 1662804 at bugs.launchpad.net
Wed May 31 10:40:39 UTC 2017


You have been subscribed to a public bug by Edward Hope-Morley (hopem):

[Impact]

This patch resolves, amongst other things, a race condition between
router create and delete requests when using L3 HA. At the time of
backport the patch is already available from Ocata onwards and has been
verified as sufficiently minimal and safe for backport to Newton and
Mitaka. Essentially, the error case results from an incorrectly
initialised router update action being executed without proper checks,
and this patch fixes that.

[Test Case]

 * Deploy OpenStack Mitaka - http://pastebin.ubuntu.com/24637244/ - with
neutron-l3-agent configured to provide HA (VRRP) routers.

 * Repeatedly create and delete routers in rapid succession and check
that the l3 agent does not go into an infinite error loop, i.e. run
http://pastebin.ubuntu.com/24634950/ and run tail -F
/var/log/neutron/neutron-l3-agent.log on all units of the l3 agent. Also
check that qrouter- namespaces are not stacking up. For Mitaka I
typically hit the error after ~20 create/deletes.

[Regression Potential]

 * I do not envisage any regression potential from this patch.

====

When the HA router initialize() function fails for some reason (rabbitmq
restart or no ha_port), keepalived_manager or KeepalivedInstance won't
be configured. In this case, _process_router_if_compatible fails with an
exception, and _resync_router(update) then tries to process this
router again in a loop. As initialize() is only attempted once (and that
attempt failed), every retry of _process_router_if_compatible fails (no
keepalived manager or instance) and the router is never configured (see
the trace below).

2017-02-06 18:34:18.539 26120 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-114a72fe-02ae-4b87-a2e7-70f962df0951', 'ip', '-o', 'link', 'show', 'qr-e63406e1-e7'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:101
2017-02-06 18:34:18.544 26120 DEBUG neutron.agent.linux.utils [-]
Command: ['ip', 'netns', 'exec', u'qrouter-114a72fe-02ae-4b87-a2e7-70f962df0951', 'ip', '-o', 'link', 'show', u'qr-e63406e1-e7']
Exit code: 0
 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:156
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get_process'
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info Traceback (most recent call last):
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     return func(*args, **kwargs)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 744, in process
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     self._process_internal_ports(agent.pd)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 394, in _process_internal_ports
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     self.internal_network_added(p)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 275, in internal_network_added
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     self._disable_ipv6_addressing_on_interface(interface_name)
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 235, in _disable_ipv6_addressing_on_interface
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     if self._should_delete_ipv6_lladdr(ipv6_lladdr):
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 217, in _should_delete_ipv6_lladdr
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     if manager.get_process().active:
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info AttributeError: 'NoneType' object has no attribute 'get_process'
2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '114a72fe-02ae-4b87-a2e7-70f962df0951'
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 506, in _process_router_update
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 445, in _process_router_if_compatible
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._process_updated_router(router)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 459, in _process_updated_router
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     ri.process(self)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 377, in process
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     super(HaRouter, self).process(agent)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 362, in call
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self.logger(e)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 744, in process
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._process_internal_ports(agent.pd)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 394, in _process_internal_ports
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self.internal_network_added(p)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 275, in internal_network_added
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._disable_ipv6_addressing_on_interface(interface_name)
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 235, in _disable_ipv6_addressing_on_interface
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     if self._should_delete_ipv6_lladdr(ipv6_lladdr):
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 217, in _should_delete_ipv6_lladdr
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     if manager.get_process().active:
2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'get_process'
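
To illustrate the failure mode described above, here is a minimal toy
model in Python. It is not neutron code and not the actual patch: the
names ToyHaRouter, ToyAgent, make_factory, run_with_resync and the
drop_on_init_failure flag are all hypothetical. It only assumes what the
description states, namely that initialize() is attempted once per cached
router object and that a failed initialize() leaves keepalived_manager
unset. One plausible way out of the loop, sketched here as an assumption,
is to discard the half-initialised router so that the next resync starts
from scratch.

```python
class ToyHaRouter:
    """Hypothetical stand-in for an HA router (not neutron's HaRouter)."""

    def __init__(self, fail_init):
        self.keepalived_manager = None  # only set by a successful initialize()
        self._fail_init = fail_init

    def initialize(self):
        if self._fail_init:
            # Models e.g. a rabbitmq restart or a missing ha_port.
            raise RuntimeError("initialize failed")
        self.keepalived_manager = object()

    def process(self):
        if self.keepalived_manager is None:
            # Same symptom as in the trace: the manager was never created.
            raise AttributeError(
                "'NoneType' object has no attribute 'get_process'")
        return "configured"


class ToyAgent:
    """Hypothetical agent that caches router objects by id."""

    def __init__(self, drop_on_init_failure):
        self.router_info = {}
        self.drop_on_init_failure = drop_on_init_failure

    def process_router(self, router_id, factory):
        if router_id not in self.router_info:
            ri = factory()
            self.router_info[router_id] = ri
            try:
                ri.initialize()
            except Exception:
                if self.drop_on_init_failure:
                    # Assumed mitigation: forget the broken router so the
                    # next attempt creates and initialises a fresh one.
                    del self.router_info[router_id]
                raise
        return self.router_info[router_id].process()


def make_factory():
    """Return a factory whose first router fails initialize(); later ones succeed."""
    calls = {"n": 0}

    def factory():
        calls["n"] += 1
        return ToyHaRouter(fail_init=(calls["n"] == 1))

    return factory


def run_with_resync(agent, router_id, factory, max_attempts=5):
    """Retry processing like a resync loop, recording each outcome."""
    results = []
    for _ in range(max_attempts):
        try:
            results.append(agent.process_router(router_id, factory))
            break
        except Exception as exc:
            results.append(type(exc).__name__)
    return results
```

With drop_on_init_failure=False the toy agent keeps the broken router
cached, so every retry after the first ends in the same AttributeError,
which mirrors the loop in the trace. With drop_on_init_failure=True the
second attempt builds a new router and succeeds.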

** Affects: cloud-archive
     Importance: Undecided
         Status: Fix Released

** Affects: neutron
     Importance: High
     Assignee: venkata anil (anil-venkata)
         Status: Fix Released

** Affects: neutron (Ubuntu)
     Importance: Undecided
         Status: Fix Released


** Tags: in-stable-newton in-stable-ocata l3-ha sts sts-sru-needed
-- 
[SRU] Agent is failing to process HA router if initialize() fails
https://bugs.launchpad.net/bugs/1662804
You received this bug notification because you are a member of Ubuntu Sponsors Team, which is subscribed to the bug report.


