[SRU][jammy:linux-gcp][PATCH 3/3] tcp: add sysctl_tcp_rto_min_us
Vinicius Peixoto
vinicius.peixoto at canonical.com
Tue Aug 13 23:26:28 UTC 2024
From: Kevin Yang <yyd at google.com>
Add a sysctl knob that lets the user specify a default
rto_min at socket init time, rather than using the
hard-coded 200ms default rto_min.
Note that the rto_min route option has the highest precedence
for configuring this setting, followed by the TCP_BPF_RTO_MIN
socket option, followed by the tcp_rto_min_us sysctl.
Signed-off-by: Kevin Yang <yyd at google.com>
Reviewed-by: Neal Cardwell <ncardwell at google.com>
Reviewed-by: Yuchung Cheng <ycheng at google.com>
Reviewed-by: Eric Dumazet <edumazet at google.com>
Reviewed-by: Tony Lu <tonylu at linux.alibaba.com>
Reviewed-by: Jakub Kicinski <kuba at kernel.org>
Signed-off-by: David S. Miller <davem at davemloft.net>
(backported from commit f086edef71be7174a16c1ed67ac65a085cda28b1)
[vpeixoto: fixed conflicts in include/net/netns/ipv4.h due to missing
commits 18fd64d25422 ("netns-ipv4: reorganize netns_ipv4 fast path
variables"), and 1c106eb01cee ("net: ipv{6,4}: Remove the now
superfluous sentinel elements from ctl_table array"), as well as
context conflicts in net/ipv4/tcp_ipv4.c due to missing commits adding
other unrelated TCP sysctls.]
Signed-off-by: Vinicius Peixoto <vinicius.peixoto at canonical.com>
---
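Illustrative note, not part of the patch: the precedence described in the
commit message (route option, then TCP_BPF_RTO_MIN, then the sysctl) and the
microsecond-to-jiffies conversion done in tcp_init_sock() can be sketched in
user space roughly as follows. All sketch_* names are hypothetical, HZ=250 is
an assumption, and the conversion is a simplified round-up stand-in for the
kernel's usecs_to_jiffies() that ignores its clamping of very large values.

```c
#include <assert.h>

/* Assumption for illustration only: HZ = 250, so one jiffy is 4000 us. */
#define SKETCH_HZ 250
#define SKETCH_USEC_PER_JIFFY (1000000 / SKETCH_HZ)

/* Simplified stand-in for usecs_to_jiffies(): round up to whole jiffies. */
static unsigned long sketch_usecs_to_jiffies(unsigned int us)
{
	return ((unsigned long)us + SKETCH_USEC_PER_JIFFY - 1) /
	       SKETCH_USEC_PER_JIFFY;
}

/*
 * Hypothetical helper modelling the documented precedence.
 * route_rto_min and bpf_rto_min are in jiffies; 0 means "not configured"
 * in this sketch. sysctl_rto_min_us is the per-netns sysctl value.
 */
static unsigned long sketch_effective_rto_min(unsigned long route_rto_min,
					      unsigned long bpf_rto_min,
					      int sysctl_rto_min_us)
{
	/* The sysctl table entry enforces a minimum of 1 via SYSCTL_ONE. */
	assert(sysctl_rto_min_us >= 1);

	if (route_rto_min)	/* rto_min route option: highest precedence */
		return route_rto_min;
	if (bpf_rto_min)	/* TCP_BPF_RTO_MIN socket option: next */
		return bpf_rto_min;
	/* Otherwise fall back to the sysctl default read at socket init. */
	return sketch_usecs_to_jiffies((unsigned int)sysctl_rto_min_us);
}
```

Under these assumptions the default of 200000 us maps to 50 jiffies, and any
value from 1 us up to one jiffy's worth of microseconds rounds up to a single
jiffy, so settings finer than the tick length are not distinguishable in
icsk_rto_min.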
Documentation/networking/ip-sysctl.rst | 13 +++++++++++++
include/net/netns/ipv4.h | 1 +
net/ipv4/sysctl_net_ipv4.c | 8 ++++++++
net/ipv4/tcp.c | 4 +++-
net/ipv4/tcp_ipv4.c | 2 ++
5 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 7f75767a24f1..9c931bb5fdc3 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -999,6 +999,19 @@ tcp_rx_skb_cache - BOOLEAN
Default: 0 (disabled)
+tcp_rto_min_us - INTEGER
+ Minimal TCP retransmission timeout (in microseconds). Note that the
+ rto_min route option has the highest precedence for configuring this
+ setting, followed by the TCP_BPF_RTO_MIN socket option, followed by
+ this tcp_rto_min_us sysctl.
+
+ The recommended practice is to use a value less or equal to 200000
+ microseconds.
+
+ Possible Values: 1 - INT_MAX
+
+ Default: 200000
+
UDP variables
=============
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index d60a10cfc382..d35337cb7b87 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -137,6 +137,7 @@ struct netns_ipv4 {
u8 sysctl_tcp_window_scaling;
u8 sysctl_tcp_timestamps;
u8 sysctl_tcp_early_retrans;
+ int sysctl_tcp_rto_min_us;
u8 sysctl_tcp_recovery;
u8 sysctl_tcp_thin_linear_timeouts;
u8 sysctl_tcp_slow_start_after_idle;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 1f22e72074fd..f3df0a77a27d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1362,6 +1362,14 @@ static struct ctl_table ipv4_net_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = &two,
},
+ {
+ .procname = "tcp_rto_min_us",
+ .data = &init_net.ipv4.sysctl_tcp_rto_min_us,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ONE,
+ },
{ }
};
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 95398c10086c..e709031df533 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -417,6 +417,7 @@ void tcp_init_sock(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
+ int rto_min_us;
tp->out_of_order_queue = RB_ROOT;
sk->tcp_rtx_queue = RB_ROOT;
@@ -425,7 +426,8 @@ void tcp_init_sock(struct sock *sk)
INIT_LIST_HEAD(&tp->tsorted_sent_queue);
icsk->icsk_rto = TCP_TIMEOUT_INIT;
- icsk->icsk_rto_min = TCP_RTO_MIN;
+ rto_min_us = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rto_min_us);
+ icsk->icsk_rto_min = usecs_to_jiffies(rto_min_us);
icsk->icsk_delack_max = TCP_DELACK_MAX;
tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
minmax_reset(&tp->rtt_min, tcp_jiffies32, ~0U);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index e162bed1916a..37f017d6d82f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3217,6 +3217,8 @@ static int __net_init tcp_sk_init(struct net *net)
else
net->ipv4.tcp_congestion_control = &tcp_reno;
+ net->ipv4.sysctl_tcp_rto_min_us = jiffies_to_usecs(TCP_RTO_MIN);
+
return 0;
}
--
2.43.0