[SRU][focal:linux-gcp-5.15][PULL v2] TCPDirect patches

John Cabaj john.cabaj at canonical.com
Fri Sep 22 15:11:05 UTC 2023


On 9/22/23 10:00 AM, Tim Gardner wrote:
> On 9/22/23 6:36 AM, John Cabaj wrote:
>> BugLink: https://bugs.launchpad.net/ubuntu/+source/linux-gcp-5.15/+bug/2037087
>>
>> [Impact]
>>
>> * Include patches to enable TCPDirect. This is considered a v1 implementation while Google works to
>>    upstream a v2 implementation, hence the "UBUNTU: SAUCE: (no-up)" classification.
>>
>> [Fix]
>>
>> d1016eee0b1a ("UBUNTU: SAUCE: (no-up) UPSTREAM: tcp: derive delack_max from rto_min")
>> 10364f0d83ee ("UBUNTU: SAUCE: (no-up) gve: Add retry logic for recoverable adminq errors")
>> 403fb6f43a7c ("UBUNTU: SAUCE: (no-up) tcp: defer regular ACK while processing socket backlog")
>> be97d51d24da ("UBUNTU: SAUCE: (no-up) gve: Enable header-split without gve_close/gve_open")
>> 499a0ec31c74 ("UBUNTU: SAUCE: (no-up) gve: fix rx issues for skb free and append frags")
>> 89b4d1e69fac ("UBUNTU: SAUCE: (no-up) net: fix silent put_cmsg() failures")
>> d025ece61dab ("UBUNTU: SAUCE: (no-up) tcp: get rid of sysctl_tcp_adv_win_scale")
>> d1380ff2f4e3 ("UBUNTU: SAUCE: (no-up) gve: Add tx watchdog to avoid race condition on miss path")
>> a4bc2e0cd1a3 ("UBUNTU: SAUCE: (no-up) net-tcp_5k_mtu: force wscale >= 12 for active flows")
>> bbe85e8e856c ("UBUNTU: SAUCE: (no-up) net-tcp_5k_mtu: force wscale >= 12 for 4K MTU TCP flows")
>> 7d5f15f733e8 ("UBUNTU: SAUCE: (no-up) dma-buf: fix int overflow")
>> 22668842a913 ("UBUNTU: SAUCE: (no-up) gve: add flow steering and rss reset when teardown device resources")
>> 18e66e5d9ac4 ("UBUNTU: SAUCE: (no-up) net: create skb_frags_not_readable() helper")
>> 5d81162de488 ("UBUNTU: SAUCE: (no-up) net: add missing skb->devmem checks")
>> cf0bcfdf9f87 ("UBUNTU: SAUCE: (no-up) net: remove devmem check from __pskb_copy_fclone()")
>> 9ca03b5c36c7 ("UBUNTU: SAUCE: (no-up) net: allow tcp coallapsing and coallescing for devmem skbs")
>> 030aef8b0fd2 ("UBUNTU: SAUCE: (no-up) net: skb_store_bits() should succeed on devmem header")
>> 6aaf939fb49e ("UBUNTU: SAUCE: (no-up) net: fix skb_split unnecessarily setting skb->devmem")
>> 51e12a8201a1 ("UBUNTU: SAUCE: (no-up) net: skb_copy_bits() should be able to copy devmem header")
>> ea8a78e040f5 ("UBUNTU: SAUCE: (no-up) net: fix memory leaks due to skb->devmem checks")
>> 1fcd891606c0 ("UBUNTU: SAUCE: (no-up) net: fix snaplen for devmem packets")
>> e4a4e05f4831 ("UBUNTU: SAUCE: (no-up) net: keep track and avoid access of skb containing dma-buf pages.")
>> 424871c3c564 ("UBUNTU: SAUCE: (no-up) gve: implement device memory socket data path")
>> 4f8668ee4cdb ("UBUNTU: SAUCE: (no-up) gve: implement devmem socket stats")
>> abe1772b0e39 ("UBUNTU: SAUCE: (no-up) gve: add rss support")
>> c08a1adb6130 ("UBUNTU: SAUCE: (no-up) gve: add flow steering support")
>> ffa842923ad6 ("UBUNTU: SAUCE: (no-up) gve: Add header split support")
>> 940527385c4c ("UBUNTU: SAUCE: (no-up) lakitu config: enable TCP Direct configs")
>> ebfa162318f8 ("UBUNTU: SAUCE: (no-up) tcp, cos-only: revert changes to skb_zerocopy_iter_stream")
>> b87d0b0e659a ("UBUNTU: SAUCE: (no-up) tcp: let sendmsg() take file descriptors via cmsg to enable devmem Tx")
>> 1d523c425a8c ("UBUNTU: SAUCE: (no-up) net: add SO_DEVMEM_DONTNEED setsockopt to release pages")
>> 3e34b0936884 ("UBUNTU: SAUCE: (no-up) net: backport fixes to devmem TCP rx")
>> 446d7a742177 ("UBUNTU: SAUCE: (no-up) tcp: implement RX path for devmem sockets")
>> 109f2e3feb98 ("UBUNTU: SAUCE: (no-up) net: use get_file_rcu() instead of get_file for __netdev_rxq_alloc_page_from_dmabuf_pool")
>> 0d5117d4ea53 ("UBUNTU: SAUCE: (no-up) net: add netdev_rxq_alloc_page and skb->devmem")
>> d0d273544148 ("UBUNTU: SAUCE: (no-up) dmabuf: add ioctl that binds dmabuf pagepool to a netdevice")
>> 05c81288f77b ("UBUNTU: SAUCE: (no-up) dma-buf: fix int overflow in addr calculation")
>> 46366a3326d9 ("UBUNTU: SAUCE: (no-up) dma-buf: create struct pages backing a dma-buf")
>>
>> [Test Cases]
>>
>> * Compile tested
>> * Boot tested
>> * Ran ubuntu_kernel_selftests and ubuntu_performance_stress_ng test suites
>> * Tested by Google
>>
>> [Other Info]
>>
>> * The bulk of the patch set came from https://cos.googlesource.com/third_party/kernel/+log/refs/heads/tcpd/R105,
>>    but some backports were provided in the Salesforce case below.
>> * SF: #00359122
>>
>> [Where things could go wrong]
>>
>> * Most changes target Google gve driver specifically.
>> * Some changes required updates to the DMA and network implementations to enable the new API.
>> * Could lead to DMA or network instabilities.
>>
>> -v2:
>> * Fixing BugLink - no code changes
>>
>> -----
>>
>> The following changes since commit ff1e5d1fb15ae4317e2649fd369483a9e70898d7:
>>
>>    UBUNTU: Ubuntu-gcp-5.15-5.15.0-1043.51~20.04.1 (2023-09-14 15:19:39 -0300)
>>
>> are available in the git repository at:
>>
>>    blah
>>
>> for you to fetch changes up to f3581125aad7368d093c86ca0165d548ca3ef586:
>>
>>    UBUNTU: SAUCE: (no-up) UPSTREAM: tcp: derive delack_max from rto_min (2023-09-22 07:28:22 -0500)
>>
>> ----------------------------------------------------------------
>> Eric Dumazet (5):
>>        UBUNTU: SAUCE: (no-up) net-tcp_5k_mtu: force wscale >= 12 for 4K MTU TCP flows
>>        UBUNTU: SAUCE: (no-up) net-tcp_5k_mtu: force wscale >= 12 for active flows
>>        UBUNTU: SAUCE: (no-up) tcp: get rid of sysctl_tcp_adv_win_scale
>>        UBUNTU: SAUCE: (no-up) tcp: defer regular ACK while processing socket backlog
>>        UBUNTU: SAUCE: (no-up) UPSTREAM: tcp: derive delack_max from rto_min
>>
>> Jeroen de Borst (1):
>>        UBUNTU: SAUCE: (no-up) gve: Add retry logic for recoverable adminq errors
>>
>> Mina Almasry (25):
>>        UBUNTU: SAUCE: (no-up) dma-buf: create struct pages backing a dma-buf
>>        UBUNTU: SAUCE: (no-up) dma-buf: fix int overflow in addr calculation
>>        UBUNTU: SAUCE: (no-up) dmabuf: add ioctl that binds dmabuf pagepool to a netdevice
>>        UBUNTU: SAUCE: (no-up) net: add netdev_rxq_alloc_page and skb->devmem
>>        UBUNTU: SAUCE: (no-up) net: use get_file_rcu() instead of get_file for __netdev_rxq_alloc_page_from_dmabuf_pool
>>        UBUNTU: SAUCE: (no-up) tcp: implement RX path for devmem sockets
>>        UBUNTU: SAUCE: (no-up) net: backport fixes to devmem TCP rx
>>        UBUNTU: SAUCE: (no-up) net: add SO_DEVMEM_DONTNEED setsockopt to release pages
>>        UBUNTU: SAUCE: (no-up) tcp: let sendmsg() take file descriptors via cmsg to enable devmem Tx
>>        UBUNTU: SAUCE: (no-up) tcp, cos-only: revert changes to skb_zerocopy_iter_stream
>>        UBUNTU: SAUCE: (no-up) lakitu config: enable TCP Direct configs
>>        UBUNTU: SAUCE: (no-up) gve: implement devmem socket stats
>>        UBUNTU: SAUCE: (no-up) gve: implement device memory socket data path
>>        UBUNTU: SAUCE: (no-up) net: keep track and avoid access of skb containing dma-buf pages.
>>        UBUNTU: SAUCE: (no-up) net: fix snaplen for devmem packets
>>        UBUNTU: SAUCE: (no-up) net: fix memory leaks due to skb->devmem checks
>>        UBUNTU: SAUCE: (no-up) net: skb_copy_bits() should be able to copy devmem header
>>        UBUNTU: SAUCE: (no-up) net: fix skb_split unnecessarily setting skb->devmem
>>        UBUNTU: SAUCE: (no-up) net: skb_store_bits() should succeed on devmem header
>>        UBUNTU: SAUCE: (no-up) net: allow tcp coallapsing and coallescing for devmem skbs
>>        UBUNTU: SAUCE: (no-up) net: remove devmem check from __pskb_copy_fclone()
>>        UBUNTU: SAUCE: (no-up) net: add missing skb->devmem checks
>>        UBUNTU: SAUCE: (no-up) net: create skb_frags_not_readable() helper
>>        UBUNTU: SAUCE: (no-up) dma-buf: fix int overflow
>>        UBUNTU: SAUCE: (no-up) net: fix silent put_cmsg() failures
>>
>> Ziwei Xiao (7):
>>        UBUNTU: SAUCE: (no-up) gve: Add header split support
>>        UBUNTU: SAUCE: (no-up) gve: add flow steering support
>>        UBUNTU: SAUCE: (no-up) gve: add rss support
>>        UBUNTU: SAUCE: (no-up) gve: add flow steering and rss reset when teardown device resources
>>        UBUNTU: SAUCE: (no-up) gve: Add tx watchdog to avoid race condition on miss path
>>        UBUNTU: SAUCE: (no-up) gve: fix rx issues for skb free and append frags
>>        UBUNTU: SAUCE: (no-up) gve: Enable header-split without gve_close/gve_open
>>
>>   Documentation/networking/ip-sysctl.rst        |    8 +
>>   arch/x86/configs/lakitu_defconfig             | 4239 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/dma-buf/dma-buf.c                     |  378 ++++++++++++
>>   drivers/net/ethernet/google/gve/gve.h         |  134 +++++
>>   drivers/net/ethernet/google/gve/gve_adminq.c  |  348 +++++++++++-
>>   drivers/net/ethernet/google/gve/gve_adminq.h  |  130 ++++-
>>   drivers/net/ethernet/google/gve/gve_dqo.h     |    3 +
>>   drivers/net/ethernet/google/gve/gve_ethtool.c |  748 +++++++++++++++++++++++-
>>   drivers/net/ethernet/google/gve/gve_main.c    |  348 ++++++++++--
>>   drivers/net/ethernet/google/gve/gve_rx_dqo.c  |  377 ++++++++++--
>>   drivers/net/ethernet/google/gve/gve_tx_dqo.c  |   12 +-
>>   drivers/net/ethernet/google/gve/gve_utils.c   |   16 +-
>>   drivers/net/ethernet/google/gve/gve_utils.h   |    3 +
>>   include/linux/dma-buf.h                       |   70 +++
>>   include/linux/netdevice.h                     |   34 ++
>>   include/linux/skbuff.h                        |   31 +-
>>   include/linux/socket.h                        |    2 +
>>   include/linux/tcp.h                           |   18 +-
>>   include/net/netns/ipv4.h                      |    3 +-
>>   include/net/sock.h                            |   12 +-
>>   include/net/tcp.h                             |   32 +-
>>   include/uapi/asm-generic/socket.h             |    7 +
>>   include/uapi/linux/dma-buf.h                  |   14 +
>>   include/uapi/linux/uio.h                      |   11 +
>>   mm/swap.c                                     |   10 +
>>   net/core/datagram.c                           |    3 +
>>   net/core/dev.c                                |   50 ++
>>   net/core/skbuff.c                             |  104 +++-
>>   net/core/sock.c                               |   75 ++-
>>   net/ipv4/sysctl_net_ipv4.c                    |    9 +
>>   net/ipv4/tcp.c                                |  273 ++++++++-
>>   net/ipv4/tcp_input.c                          |   40 +-
>>   net/ipv4/tcp_ipv4.c                           |    9 +
>>   net/ipv4/tcp_output.c                         |   53 +-
>>   net/packet/af_packet.c                        |    4 +-
>>   35 files changed, 7404 insertions(+), 204 deletions(-)
>>   create mode 100644 arch/x86/configs/lakitu_defconfig
>>
> 
> I am confused. f/gcp-5.15 is a derived kernel. Why wouldn't you apply this patch set to j/gcp?

As was I. When I discussed starting this patch set in j/gcp -> f/gcp-5.15 with Google, they indicated that people generally use j/gcp-6.2 now. Ultimately, v2 of this work will come into l/gcp -> j/gcp-6.2. Google's concern was that someone using j/gcp with the v1 patches could update to j/gcp-6.2 and see the feature dropped, since v2 is not ready yet - so it comes down to timing.

Possibly, I could get v1 of the patches into j/gcp once the v2 submission is ready.


John