[Bug 2127061] Re: [questing][linux-6.17][openvswitch] regression causing system test failures

Frode Nordahl 2127061 at bugs.launchpad.net
Wed Oct 22 07:53:05 UTC 2025


Using retis [0] to trace the flow of the upcalls suggests that there is
some issue with how the upcall PIDs are passed.

There appear to be some successful exchanges, and then the kernel
attempts to address a PID on which there is no listener.

Excerpt from retis.log:
2025-10-22 07:43:12.193852 (14) [handler30] 263879/263930 [u] dpif_recv:recv_upcall (ovs-vswitchd)
  upcall_recv q 2917707731 pkt_size 118

2025-10-22 07:43:12.194007 (14) [handler30] 263879/263930 [u] dpif_netlink_operate__:op_flow_execute (ovs-vswitchd)
  flow_exec q 2917707731 ts 64902331497331 (0)

2025-10-22 07:43:12.194048 (14) [handler30] 263879/263930 [tp] openvswitch:ovs_do_execute_action #3b07406cd4f6ffff898e1739e5c0 (skb ffff898e1b3e5300)
  fe80::200:ff:fe00:1 > fe80::9c7a:54ff:fe41:2888 ttl 254 label 0xbc33e len 64 proto ICMPv6 (58) type 129 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0)
  exec userspace

2025-10-22 07:43:12.194055 (14) [handler30] 263879/263930 [tp] openvswitch:ovs_dp_upcall #3b07406cd4f6ffff898e1739e5c0 (skb ffff898e1b3e5300)
  fe80::200:ff:fe00:1 > fe80::9c7a:54ff:fe41:2888 ttl 254 label 0xbc33e len 64 proto ICMPv6 (58) type 129 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0)
  upcall (action) port 2122027008 cpu 14

2025-10-22 07:43:12.194073 (14) [handler30] 263879/263930 [kr] queue_userspace_packet #3b07406cd4f6ffff898e1739e5c0 (skb ffff898e1b3e5300)
  fe80::200:ff:fe00:1 > fe80::9c7a:54ff:fe41:2888 ttl 254 label 0xbc33e len 64 proto ICMPv6 (58) type 129 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0)
  upcall_enqueue (action) (14/64902331700177) q 1699949049 ret -111

2025-10-22 07:43:12.194078 (14) [handler30] 263879/263930 [kr] ovs_dp_upcall #3b07406cd4f6ffff898e1739e5c0 (skb ffff898e1b3e5300)
  fe80::200:ff:fe00:1 > fe80::9c7a:54ff:fe41:2888 ttl 254 label 0xbc33e len 64 proto ICMPv6 (58) type 129 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0)
  upcall_ret (14/64902331700177) ret -111
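
The "ret -111" in the last two events is a negated errno value. On Linux (x86-64 and other architectures using the generic errno table), 111 is ECONNREFUSED, which is consistent with a Netlink unicast to a port ID that no socket is bound to. A minimal sketch for decoding such return values; describe_ret is a hypothetical helper name, not part of retis or OVS:

```python
import errno
import os


def describe_ret(ret: int) -> tuple[str, str]:
    """Translate a kernel-style return value (0 or -errno) into (name, message)."""
    if ret >= 0:
        return ("OK", "success")
    code = -ret
    # errno.errorcode maps the numeric value to its symbolic name on this platform.
    name = errno.errorcode.get(code, f"E{code}?")
    return (name, os.strerror(code))


if __name__ == "__main__":
    # Values taken from the retis excerpts: 0 (success) and -111 (failure).
    for ret in (0, -111):
        print(ret, *describe_ret(ret))
```

The numeric-to-name mapping is platform dependent, so this should be run on the same kind of system that produced the trace.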

sudo grep 2122027008 ovs-vswitchd.log
[ no matches ]

And sure enough, none of the handlers have this PID:
2025-10-22T07:43:04.439Z|00033|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 0 has Netlink PID of 3283752141
2025-10-22T07:43:04.439Z|00034|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 1 has Netlink PID of 2358290975
2025-10-22T07:43:04.439Z|00035|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 2 has Netlink PID of 3109568256
2025-10-22T07:43:04.439Z|00036|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 3 has Netlink PID of 2254746516
2025-10-22T07:43:04.440Z|00037|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 4 has Netlink PID of 3410556952
2025-10-22T07:43:04.440Z|00038|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 5 has Netlink PID of 3253702169
2025-10-22T07:43:04.440Z|00039|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 6 has Netlink PID of 2395298204
2025-10-22T07:43:04.440Z|00040|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 7 has Netlink PID of 3325318506
2025-10-22T07:43:04.440Z|00041|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 8 has Netlink PID of 2520519280
2025-10-22T07:43:04.440Z|00042|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 9 has Netlink PID of 2818850944
2025-10-22T07:43:04.440Z|00043|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 10 has Netlink PID of 3805190526
2025-10-22T07:43:04.440Z|00044|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 11 has Netlink PID of 2287368837
2025-10-22T07:43:04.440Z|00045|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 12 has Netlink PID of 2698870990
2025-10-22T07:43:04.440Z|00046|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 13 has Netlink PID of 2259458406
2025-10-22T07:43:04.440Z|00047|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 14 has Netlink PID of 4003028982
2025-10-22T07:43:04.440Z|00048|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 15 has Netlink PID of 3620645476
2025-10-22T07:43:04.440Z|00049|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 16 has Netlink PID of 3595625422
2025-10-22T07:43:04.440Z|00050|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 17 has Netlink PID of 3138296197
2025-10-22T07:43:04.440Z|00051|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 18 has Netlink PID of 4050347252
2025-10-22T07:43:04.440Z|00052|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 19 has Netlink PID of 4240860670
2025-10-22T07:43:04.440Z|00053|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 20 has Netlink PID of 2278772482
2025-10-22T07:43:04.440Z|00054|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 21 has Netlink PID of 4216385937
2025-10-22T07:43:04.440Z|00055|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 22 has Netlink PID of 2581116576
2025-10-22T07:43:04.440Z|00056|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 23 has Netlink PID of 2290033899
2025-10-22T07:43:04.440Z|00057|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 24 has Netlink PID of 3767708888
2025-10-22T07:43:04.440Z|00058|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 25 has Netlink PID of 2336571054
2025-10-22T07:43:04.440Z|00059|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 26 has Netlink PID of 4132681928
2025-10-22T07:43:04.440Z|00060|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 27 has Netlink PID of 4089567416
2025-10-22T07:43:04.440Z|00061|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 28 has Netlink PID of 2550568625
2025-10-22T07:43:04.440Z|00062|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 29 has Netlink PID of 2814870700
2025-10-22T07:43:04.440Z|00063|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 30 has Netlink PID of 3591118026
2025-10-22T07:43:04.440Z|00064|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 31 has Netlink PID of 3797233916
2025-10-22T07:43:04.440Z|00065|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 32 has Netlink PID of 2755705116
2025-10-22T07:43:04.440Z|00066|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 33 has Netlink PID of 2655937864
2025-10-22T07:43:04.440Z|00067|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 34 has Netlink PID of 2155878095
2025-10-22T07:43:04.460Z|00068|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 35 has Netlink PID of 3240413810
2025-10-22T07:43:04.460Z|00069|dpif_netlink|DBG|Dispatch mode(per-cpu): handler 36 has Netlink PID of 2377761266
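
The manual grep above can be generalised. The following is a rough sketch, assuming the two log formats look exactly like the excerpts in this message; the function names and the regular expressions are my own and are not part of retis or OVS. It collects the handler PIDs from the ovs-vswitchd debug log and lists every upcall port in the retis output that no handler registered.

```python
import re

# Patterns derived from the log excerpts in this message (assumed stable).
HANDLER_RE = re.compile(r"handler (\d+) has Netlink PID of (\d+)")
UPCALL_RE = re.compile(r"upcall \((\w+)\) port (\d+) cpu (\d+)")


def handler_pids(vswitchd_log: str) -> dict[int, int]:
    """Return a mapping of Netlink PID -> handler index."""
    return {int(pid): int(idx) for idx, pid in HANDLER_RE.findall(vswitchd_log)}


def orphan_upcalls(retis_log: str, pids: dict[int, int]) -> list[tuple[int, int]]:
    """Return (port, cpu) for every upcall whose port has no known handler."""
    return [
        (int(port), int(cpu))
        for _kind, port, cpu in UPCALL_RE.findall(retis_log)
        if int(port) not in pids
    ]


# Sample lines copied from the excerpts in this message.
VSWITCHD_SAMPLE = """\
Dispatch mode(per-cpu): handler 14 has Netlink PID of 4003028982
Dispatch mode(per-cpu): handler 28 has Netlink PID of 2550568625
Dispatch mode(per-cpu): handler 30 has Netlink PID of 3591118026
"""
RETIS_SAMPLE = """\
  upcall (action) port 2122027008 cpu 14
  upcall (action) port 2550568625 cpu 28
"""

if __name__ == "__main__":
    pids = handler_pids(VSWITCHD_SAMPLE)
    for port, cpu in orphan_upcalls(RETIS_SAMPLE, pids):
        print(f"port {port} (cpu {cpu}) has no handler")
```

With the full ovs-vswitchd.log and retis.log read from disk in place of the sample strings, this would flag every upcall addressed to an unknown PID rather than just the one spotted by hand.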

A successful excerpt from retis.log:
2025-10-22 07:43:12.193642 (28) [ping] 264298 [tp] net:net_dev_start_xmit #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x7c/4026533121 if 2214 (vif0)

2025-10-22 07:43:12.193669 (28) [ping] 264298 [tp] net:netif_receive_skb #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0)

2025-10-22 07:43:12.193683 (28) [ping] 264298 [kr] ovs_flow_tbl_lookup_stats #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0) rxif 2213
  flow hit ufid c0bb9975-6bfb-4bac-a411-f49e6dfd126d mask 1 cache 1 flow ffff898e3be5be90 sf_acts ffff898d94e4ca80

2025-10-22 07:43:12.193690 (28) [ping] 264298 [tp] openvswitch:ovs_do_execute_action #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0) rxif 2213
  exec userspace

2025-10-22 07:43:12.193694 (28) [ping] 264298 [tp] openvswitch:ovs_dp_upcall #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0) rxif 2213
  upcall (action) port 2550568625 cpu 28

2025-10-22 07:43:12.193713 (28) [ping] 264298 [kr] queue_userspace_packet #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0) rxif 2213
  upcall_enqueue (action) (28/64902331339171) q 2917707731 ret 0

2025-10-22 07:43:12.193717 (28) [ping] 264298 [kr] ovs_dp_upcall #3b074066c7f6ffff898e1ac2f0c0 (skb ffff898e0fa61b00)
  fe80::9c7a:54ff:fe41:2888 > fe80::200:ff:fe00:1 ttl 64 label 0xbc33e len 64 proto ICMPv6 (58) type 128 code 0
  ns 0x1/4026531833 if 2213 (ovs-vif0) rxif 2213
  upcall_ret (28/64902331339171) ret 0


And sure enough, port 2550568625 matches the Netlink PID of handler 28.

[0] https://retis.readthedocs.io/en/stable/

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to openvswitch in Ubuntu.
https://bugs.launchpad.net/bugs/2127061

Title:
  [questing][linux-6.17][openvswitch] regression causing system test
  failures

Status in linux package in Ubuntu:
  New
Status in openvswitch package in Ubuntu:
  New
Status in ovn package in Ubuntu:
  New

Bug description:
  At this point in time, with the 6.17.0-5.5 kernel, the following OVN
  system tests fail on Questing:

      DNAT and SNAT on distributed router - N/S - IPv6
      Traffic to router port via LLA
      LR with SNAT fragmentation needed for external server

  Testing with a mainline v6.16 kernel makes all of the above-mentioned
  tests pass.

  Kernels from previous Ubuntu releases also make all of the
  above-mentioned tests pass.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2127061/+subscriptions



