APPLIED: [SRU][J:linux-bluefield][PATCH v1 0/1] tcp: fix forever orphan socket caused by tcp_abort
Kuba Pawlak
kuba.pawlak at canonical.com
Thu Jul 10 13:13:26 UTC 2025
On 6.07.2025 14:47, Stav Aviram wrote:
> BugLink: https://bugs.launchpad.net/bugs/2114965
>
> SRU Justification:
>
> [Impact]
> In BFB version DOCA_2.6.0_BSP_4.6.0_Ubuntu_22.04-2.20240114, container
> deletion via removal of its kubelet YAML from /etc/kubelet.d sometimes
> fails to complete. The process waits for the container to disappear from
> crictl ps, but the container remains in Running state indefinitely. This
> behavior is seen with container version 2.dev.50 and FW 32.40.0324.
> The issue appears to stem from a kernel bug affecting orphaned TCP
> sockets stuck in a zero-window state. These sockets are not closed and
> their timers are not rescheduled, leading to "forever orphan" behavior
> that prevents resource cleanup.
>
> [Fix]
> Backporting the upstream commit:
> bac76cf89816bff06c4ec2f3df97dc34e150a1c4 ("tcp: fix forever orphan socket caused by tcp_abort")
> This commit removes a conditional check on SOCK_DEAD in tcp_abort,
> allowing proper closure of orphaned sockets and preventing indefinite
> stalling. A backport, rather than a clean cherry-pick, is needed because
> the error handling and logging code in this tree differs from upstream.
>
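
For readers following the [Fix] description, here is a minimal userspace
sketch of the control-flow change. It is a simplified model, not the
kernel code and not the backported patch: every model_* name is
hypothetical, the tcp_need_reset() state check and all locking are left
out, and the "timers have nothing to re-arm on" condition reflects my
reading of the upstream commit description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model of a TCP socket. All names are hypothetical. */
struct model_sock {
	bool dead;            /* models SOCK_DEAD: the socket is orphaned */
	bool closed;          /* models the socket having been torn down */
	bool reset_sent;      /* models an active reset having been sent */
	int err;              /* models sk->sk_err */
	int packets_out;      /* models unacknowledged segments in flight */
	int write_queue_len;  /* models the length of the write queue */
};

/* Purging the write queue also clears the in-flight packet count. */
static void model_purge_write_queue(struct model_sock *sk)
{
	sk->write_queue_len = 0;
	sk->packets_out = 0;
}

/* Old behaviour: only sockets that are NOT orphaned are reset and closed.
 * An orphaned socket just has its write queue purged. */
static void model_abort_old(struct model_sock *sk, int err)
{
	assert(sk != NULL);
	if (!sk->dead) {
		sk->err = err;
		sk->reset_sent = true;
		sk->closed = true;
	}
	model_purge_write_queue(sk);
}

/* New behaviour: the SOCK_DEAD check is gone, so orphaned sockets are
 * reset and closed as well. */
static void model_abort_new(struct model_sock *sk, int err)
{
	assert(sk != NULL);
	sk->err = err;
	sk->reset_sent = true;
	sk->closed = true;
	model_purge_write_queue(sk);
}

/* In this model a "forever orphan" is an orphaned socket that was never
 * closed and has nothing left that would cause a timer to be re-armed. */
static bool model_is_forever_orphan(const struct model_sock *sk)
{
	return sk->dead && !sk->closed &&
	       sk->packets_out == 0 && sk->write_queue_len == 0;
}
```

Calling model_abort_old() on a socket with dead set leaves it open with
an empty queue, so model_is_forever_orphan() returns true. Calling
model_abort_new() on the same starting state marks it closed, so the
predicate returns false.
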
> [Test Case]
> Compile tested on linux-bluefield-5.15 on the master-next branch.
> Further testing includes reproducing the issue by removing the pod's
> YAML from /etc/kubelet.d and monitoring container termination using
> crictl ps. With the patch applied, the container should no longer
> remain stuck in Running state.
>
> [Regression Potential]
> The patch targets a specific edge case in TCP socket handling, and after
> backporting, it is as close as possible to the original upstream commit.
> However, since the change removes a check that previously avoided
> closing SOCK_DEAD sockets, there's a small risk if other kernel paths
> still rely on the earlier behavior. This could theoretically lead to
> unexpected side effects in force-close logic if assumptions about socket
> state are violated. Also, the backport is not an exact match for the
> original commit, so unexpected behavior is possible in edge cases
> related to socket teardown.
>
> Xueming Feng (1):
> tcp: fix forever orphan socket caused by tcp_abort
>
> net/ipv4/tcp.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
Applied-by: Kuba Pawlak <kuba.pawlak at canonical.com>