NACK: [SRU][O/N/J/F][PATCH 1/1] net/sched: netem: account for backlog updates from child qdisc

Koichiro Den koichiro.den at canonical.com
Mon Feb 17 02:33:39 UTC 2025


On Fri, Jan 24, 2025 at 12:55:07PM GMT, Tim Whisonant wrote:
> From: Martin Ottens <martin.ottens at fau.de>
> 
> In general, 'qlen' of any classful qdisc should keep track of the
> number of packets that the qdisc itself and all of its children holds.
> In case of netem, 'qlen' only accounts for the packets in its internal
> tfifo. When netem is used with a child qdisc, the child qdisc can use
> 'qdisc_tree_reduce_backlog' to inform its parent, netem, about created
> or dropped SKBs. This function updates 'qlen' and the backlog statistics
> of netem, but netem does not account for changes made by a child qdisc.
> 'qlen' then indicates the wrong number of packets in the tfifo.
> If a child qdisc creates new SKBs during enqueue and informs its parent
> about this, netem's 'qlen' value is increased. When netem dequeues the
> newly created SKBs from the child, the 'qlen' in netem is not updated.
> If 'qlen' reaches the configured sch->limit, the enqueue function stops
> working, even though the tfifo is not full.
> 
> Reproduce the bug:
> Ensure that the sender machine has GSO enabled. Configure netem as root
> qdisc and tbf as its child on the outgoing interface of the machine
> as follows:
> $ tc qdisc add dev <oif> root handle 1: netem delay 100ms limit 100
> $ tc qdisc add dev <oif> parent 1:0 tbf rate 50Mbit burst 1542 latency 50ms
> 
> Send bulk TCP traffic out via this interface, e.g., by running an iPerf3
> client on the machine. Check the qdisc statistics:
> $ tc -s qdisc show dev <oif>
> 
> Statistics after 10s of iPerf3 TCP test before the fix (note that
> netem's backlog > limit, netem stopped accepting packets):
> qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
>  Sent 2767766 bytes 1848 pkt (dropped 652, overlimits 0 requeues 0)
>  backlog 4294528236b 1155p requeues 0
> qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
>  Sent 2767766 bytes 1848 pkt (dropped 327, overlimits 7601 requeues 0)
>  backlog 0b 0p requeues 0
> 
> Statistics after the fix:
> qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
>  Sent 37766372 bytes 24974 pkt (dropped 9, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
>  Sent 37766372 bytes 24974 pkt (dropped 327, overlimits 96017 requeues 0)
>  backlog 0b 0p requeues 0
> 
> tbf segments the GSO SKBs (tbf_segment) and updates the netem's 'qlen'.
> The interface fully stops transferring packets and "locks". In this case,
> the child qdisc and tfifo are empty, but 'qlen' indicates the tfifo is at
> its limit and no more packets are accepted.
> 
> This patch adds a counter for the entries in the tfifo. Netem's 'qlen' is
> only decreased when a packet is returned by its dequeue function, and not
> during enqueuing into the child qdisc. External updates to 'qlen' are thus
> accounted for and only the behavior of the backlog statistics changes. As
> in other qdiscs, 'qlen' then keeps track of  how many packets are held in
> netem and all of its children. As before, sch->limit remains as the
> maximum number of packets in the tfifo. The same applies to netem's
> backlog statistics.
> 
> Fixes: 50612537e9ab ("netem: fix classful handling")
> Signed-off-by: Martin Ottens <martin.ottens at fau.de>
> Acked-by: Jamal Hadi Salim <jhs at mojatatu.com>
> Link: https://patch.msgid.link/20241210131412.1837202-1-martin.ottens@fau.de
> Signed-off-by: Jakub Kicinski <kuba at kernel.org>
> (cherry picked from 4eb35be67e63e26beeb3004a89bc1199baee2e6e)

I believe it should be f8d4bc455047cf3903cd6f85f49978987dbb3027
Other than that LGTM. 

By the way, as of today, due to the delay of review, all but Noble already
includes the fix in master-next:

  Oracular : commit 9e0307e4c85e (LP: #2097332)
  Jammy    : commit e6e239216e3f (LP: #2095302)
  Focal    : commit 3ebc89b88819 (LP: #2095199)

> CVE-2024-56770
> Signed-off-by: Tim Whisonant <tim.whisonant at canonical.com>
> ---
>  net/sched/sch_netem.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 39382ee1e331..3b519adc0125 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -78,6 +78,8 @@ struct netem_sched_data {
>  	struct sk_buff	*t_head;
>  	struct sk_buff	*t_tail;
>  
> +	u32 t_len;
> +
>  	/* optional qdisc for classful handling (NULL at netem init) */
>  	struct Qdisc	*qdisc;
>  
> @@ -382,6 +384,7 @@ static void tfifo_reset(struct Qdisc *sch)
>  	rtnl_kfree_skbs(q->t_head, q->t_tail);
>  	q->t_head = NULL;
>  	q->t_tail = NULL;
> +	q->t_len = 0;
>  }
>  
>  static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
> @@ -411,6 +414,7 @@ static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
>  		rb_link_node(&nskb->rbnode, parent, p);
>  		rb_insert_color(&nskb->rbnode, &q->t_root);
>  	}
> +	q->t_len++;
>  	sch->q.qlen++;
>  }
>  
> @@ -517,7 +521,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>  			1<<get_random_u32_below(8);
>  	}
>  
> -	if (unlikely(sch->q.qlen >= sch->limit)) {
> +	if (unlikely(q->t_len >= sch->limit)) {
>  		/* re-link segs, so that qdisc_drop_all() frees them all */
>  		skb->next = segs;
>  		qdisc_drop_all(skb, sch, to_free);
> @@ -701,8 +705,8 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
>  tfifo_dequeue:
>  	skb = __qdisc_dequeue_head(&sch->q);
>  	if (skb) {
> -		qdisc_qstats_backlog_dec(sch, skb);
>  deliver:
> +		qdisc_qstats_backlog_dec(sch, skb);
>  		qdisc_bstats_update(sch, skb);
>  		return skb;
>  	}
> @@ -718,8 +722,7 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
>  
>  		if (time_to_send <= now && q->slot.slot_next <= now) {
>  			netem_erase_head(q, skb);
> -			sch->q.qlen--;
> -			qdisc_qstats_backlog_dec(sch, skb);
> +			q->t_len--;
>  			skb->next = NULL;
>  			skb->prev = NULL;
>  			/* skb->dev shares skb->rbnode area,
> @@ -746,16 +749,21 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
>  					if (net_xmit_drop_count(err))
>  						qdisc_qstats_drop(sch);
>  					qdisc_tree_reduce_backlog(sch, 1, pkt_len);
> +					sch->qstats.backlog -= pkt_len;
> +					sch->q.qlen--;
>  				}
>  				goto tfifo_dequeue;
>  			}
> +			sch->q.qlen--;
>  			goto deliver;
>  		}
>  
>  		if (q->qdisc) {
>  			skb = q->qdisc->ops->dequeue(q->qdisc);
> -			if (skb)
> +			if (skb) {
> +				sch->q.qlen--;
>  				goto deliver;
> +			}
>  		}
>  
>  		qdisc_watchdog_schedule_ns(&q->watchdog,
> @@ -765,8 +773,10 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
>  
>  	if (q->qdisc) {
>  		skb = q->qdisc->ops->dequeue(q->qdisc);
> -		if (skb)
> +		if (skb) {
> +			sch->q.qlen--;
>  			goto deliver;
> +		}
>  	}
>  	return NULL;
>  }
> -- 
> 2.43.0
> 
> 
> -- 
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team



More information about the kernel-team mailing list