NACK: [SRU][O/N/J/F][PATCH 1/1] net/sched: netem: account for backlog updates from child qdisc
Koichiro Den
koichiro.den at canonical.com
Mon Feb 17 02:33:39 UTC 2025
On Fri, Jan 24, 2025 at 12:55:07PM GMT, Tim Whisonant wrote:
> From: Martin Ottens <martin.ottens at fau.de>
>
> In general, 'qlen' of any classful qdisc should keep track of the
> number of packets that the qdisc itself and all of its children holds.
> In case of netem, 'qlen' only accounts for the packets in its internal
> tfifo. When netem is used with a child qdisc, the child qdisc can use
> 'qdisc_tree_reduce_backlog' to inform its parent, netem, about created
> or dropped SKBs. This function updates 'qlen' and the backlog statistics
> of netem, but netem does not account for changes made by a child qdisc.
> 'qlen' then indicates the wrong number of packets in the tfifo.
> If a child qdisc creates new SKBs during enqueue and informs its parent
> about this, netem's 'qlen' value is increased. When netem dequeues the
> newly created SKBs from the child, the 'qlen' in netem is not updated.
> If 'qlen' reaches the configured sch->limit, the enqueue function stops
> working, even though the tfifo is not full.
>
> Reproduce the bug:
> Ensure that the sender machine has GSO enabled. Configure netem as root
> qdisc and tbf as its child on the outgoing interface of the machine
> as follows:
> $ tc qdisc add dev <oif> root handle 1: netem delay 100ms limit 100
> $ tc qdisc add dev <oif> parent 1:0 tbf rate 50Mbit burst 1542 latency 50ms
>
> Send bulk TCP traffic out via this interface, e.g., by running an iPerf3
> client on the machine. Check the qdisc statistics:
> $ tc -s qdisc show dev <oif>
>
> Statistics after 10s of iPerf3 TCP test before the fix (note that
> netem's backlog > limit, netem stopped accepting packets):
> qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
> Sent 2767766 bytes 1848 pkt (dropped 652, overlimits 0 requeues 0)
> backlog 4294528236b 1155p requeues 0
> qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
> Sent 2767766 bytes 1848 pkt (dropped 327, overlimits 7601 requeues 0)
> backlog 0b 0p requeues 0
>
> Statistics after the fix:
> qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
> Sent 37766372 bytes 24974 pkt (dropped 9, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
> Sent 37766372 bytes 24974 pkt (dropped 327, overlimits 96017 requeues 0)
> backlog 0b 0p requeues 0
>
> tbf segments the GSO SKBs (tbf_segment) and updates the netem's 'qlen'.
> The interface fully stops transferring packets and "locks". In this case,
> the child qdisc and tfifo are empty, but 'qlen' indicates the tfifo is at
> its limit and no more packets are accepted.
>
> This patch adds a counter for the entries in the tfifo. Netem's 'qlen' is
> only decreased when a packet is returned by its dequeue function, and not
> during enqueuing into the child qdisc. External updates to 'qlen' are thus
> accounted for and only the behavior of the backlog statistics changes. As
> in other qdiscs, 'qlen' then keeps track of how many packets are held in
> netem and all of its children. As before, sch->limit remains as the
> maximum number of packets in the tfifo. The same applies to netem's
> backlog statistics.
>
> Fixes: 50612537e9ab ("netem: fix classful handling")
> Signed-off-by: Martin Ottens <martin.ottens at fau.de>
> Acked-by: Jamal Hadi Salim <jhs at mojatatu.com>
> Link: https://patch.msgid.link/20241210131412.1837202-1-martin.ottens@fau.de
> Signed-off-by: Jakub Kicinski <kuba at kernel.org>
> (cherry picked from 4eb35be67e63e26beeb3004a89bc1199baee2e6e)
I believe it should be f8d4bc455047cf3903cd6f85f49978987dbb3027
Other than that LGTM.
By the way, as of today, due to the delay of review, all but Noble already
includes the fix in master-next:
Oracular : commit 9e0307e4c85e (LP: #2097332)
Jammy : commit e6e239216e3f (LP: #2095302)
Focal : commit 3ebc89b88819 (LP: #2095199)
> CVE-2024-56770
> Signed-off-by: Tim Whisonant <tim.whisonant at canonical.com>
> ---
> net/sched/sch_netem.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 39382ee1e331..3b519adc0125 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -78,6 +78,8 @@ struct netem_sched_data {
> struct sk_buff *t_head;
> struct sk_buff *t_tail;
>
> + u32 t_len;
> +
> /* optional qdisc for classful handling (NULL at netem init) */
> struct Qdisc *qdisc;
>
> @@ -382,6 +384,7 @@ static void tfifo_reset(struct Qdisc *sch)
> rtnl_kfree_skbs(q->t_head, q->t_tail);
> q->t_head = NULL;
> q->t_tail = NULL;
> + q->t_len = 0;
> }
>
> static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
> @@ -411,6 +414,7 @@ static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
> rb_link_node(&nskb->rbnode, parent, p);
> rb_insert_color(&nskb->rbnode, &q->t_root);
> }
> + q->t_len++;
> sch->q.qlen++;
> }
>
> @@ -517,7 +521,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> 1<<get_random_u32_below(8);
> }
>
> - if (unlikely(sch->q.qlen >= sch->limit)) {
> + if (unlikely(q->t_len >= sch->limit)) {
> /* re-link segs, so that qdisc_drop_all() frees them all */
> skb->next = segs;
> qdisc_drop_all(skb, sch, to_free);
> @@ -701,8 +705,8 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
> tfifo_dequeue:
> skb = __qdisc_dequeue_head(&sch->q);
> if (skb) {
> - qdisc_qstats_backlog_dec(sch, skb);
> deliver:
> + qdisc_qstats_backlog_dec(sch, skb);
> qdisc_bstats_update(sch, skb);
> return skb;
> }
> @@ -718,8 +722,7 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
>
> if (time_to_send <= now && q->slot.slot_next <= now) {
> netem_erase_head(q, skb);
> - sch->q.qlen--;
> - qdisc_qstats_backlog_dec(sch, skb);
> + q->t_len--;
> skb->next = NULL;
> skb->prev = NULL;
> /* skb->dev shares skb->rbnode area,
> @@ -746,16 +749,21 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
> if (net_xmit_drop_count(err))
> qdisc_qstats_drop(sch);
> qdisc_tree_reduce_backlog(sch, 1, pkt_len);
> + sch->qstats.backlog -= pkt_len;
> + sch->q.qlen--;
> }
> goto tfifo_dequeue;
> }
> + sch->q.qlen--;
> goto deliver;
> }
>
> if (q->qdisc) {
> skb = q->qdisc->ops->dequeue(q->qdisc);
> - if (skb)
> + if (skb) {
> + sch->q.qlen--;
> goto deliver;
> + }
> }
>
> qdisc_watchdog_schedule_ns(&q->watchdog,
> @@ -765,8 +773,10 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
>
> if (q->qdisc) {
> skb = q->qdisc->ops->dequeue(q->qdisc);
> - if (skb)
> + if (skb) {
> + sch->q.qlen--;
> goto deliver;
> + }
> }
> return NULL;
> }
> --
> 2.43.0
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
More information about the kernel-team
mailing list