From: Dave Taht
Date: Fri, 24 Sep 2021 18:10:24 -0700
To: ECN-Sane
Subject: [Ecn-sane] Fwd: [PATCH net-next] tcp: tracking packets with CE marks in BW rate sample
References: <20210923211706.2553282-1-luke.w.hsiao@gmail.com>
In-Reply-To: <20210923211706.2553282-1-luke.w.hsiao@gmail.com>
List-Id: Discussion of explicit congestion notification's impact on the Internet

---------- Forwarded message ---------
From: Luke Hsiao
Date: Thu, Sep 23, 2021 at 2:20 PM
Subject: [PATCH net-next] tcp: tracking packets with CE marks in BW rate sample
To: David Miller
Cc: Yuchung Cheng, Lawrence Brakmo, Neal Cardwell, Eric Dumazet, Luke Hsiao


From: Yuchung Cheng

In order to track CE marks per rate sample (one round trip), TCP needs a
per-skb header field to record the tp->delivered_ce count when the skb
was sent. To make space, we replace the "last_in_flight" field which is
used exclusively for NV congestion control. The stat needed by NV can be
alternatively approximated by existing stats tcp_sock delivered and
mss_cache.

This patch counts the number of packets delivered which have CE marks in
the rate sample, using similar approach of delivery accounting.

Cc: Lawrence Brakmo
Signed-off-by: Yuchung Cheng
Acked-by: Neal Cardwell
Signed-off-by: Eric Dumazet
Signed-off-by: Luke Hsiao
---
 include/net/tcp.h     |  9 ++++++---
 net/ipv4/tcp_input.c  | 11 +++++------
 net/ipv4/tcp_output.c |  2 --
 net/ipv4/tcp_rate.c   |  6 ++++++
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 673c3b01e287..32cf6c01f403 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -874,10 +874,11 @@ struct tcp_skb_cb {
 	__u32		ack_seq;	/* Sequence number ACK'd	*/
 	union {
 		struct {
+#define TCPCB_DELIVERED_CE_MASK ((1U<<20) - 1)
 			/* There is space for up to 24 bytes */
-			__u32 in_flight:30,/* Bytes in flight at transmit */
-			      is_app_limited:1, /* cwnd not fully used? */
-			      unused:1;
+			__u32 is_app_limited:1, /* cwnd not fully used? */
+			      delivered_ce:20,
+			      unused:11;
 			/* pkts S/ACKed so far upon tx of skb, incl retrans: */
 			__u32 delivered;
 			/* start of send pipeline phase */
@@ -1029,7 +1030,9 @@ struct ack_sample {
 struct rate_sample {
 	u64  prior_mstamp; /* starting timestamp for interval */
 	u32  prior_delivered;	/* tp->delivered at "prior_mstamp" */
+	u32  prior_delivered_ce;/* tp->delivered_ce at "prior_mstamp" */
 	s32  delivered;		/* number of packets delivered over interval */
+	s32  delivered_ce;	/* number of packets delivered w/ CE marks*/
 	long interval_us;	/* time for tp->delivered to incr "delivered" */
 	u32 snd_interval_us;	/* snd interval for delivered packets */
 	u32 rcv_interval_us;	/* rcv interval for delivered packets */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 141e85e6422b..53675e284841 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3221,7 +3221,6 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb,
 	long seq_rtt_us = -1L;
 	long ca_rtt_us = -1L;
 	u32 pkts_acked = 0;
-	u32 last_in_flight = 0;
 	bool rtt_update;
 	int flag = 0;
 
@@ -3257,7 +3256,6 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb,
 			if (!first_ackt)
 				first_ackt = last_ackt;
 
-			last_in_flight = TCP_SKB_CB(skb)->tx.in_flight;
 			if (before(start_seq, reord))
 				reord = start_seq;
 			if (!after(scb->end_seq, tp->high_seq))
@@ -3323,8 +3321,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb,
 		seq_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, first_ackt);
 		ca_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, last_ackt);
 
-		if (pkts_acked == 1 && last_in_flight < tp->mss_cache &&
-		    last_in_flight && !prior_sacked && fully_acked &&
+		if (pkts_acked == 1 && fully_acked && !prior_sacked &&
+		    (tp->snd_una - prior_snd_una) < tp->mss_cache &&
 		    sack->rate->prior_delivered + 1 == tp->delivered &&
 		    !(flag & (FLAG_CA_ALERT | FLAG_SYN_ACKED))) {
 			/* Conservatively mark a delayed ACK. It's typically
@@ -3381,9 +3379,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb,
 
 	if (icsk->icsk_ca_ops->pkts_acked) {
 		struct ack_sample sample = { .pkts_acked = pkts_acked,
-					     .rtt_us = sack->rate->rtt_us,
-					     .in_flight = last_in_flight };
+					     .rtt_us = sack->rate->rtt_us };
 
+		sample.in_flight = tp->mss_cache *
+			(tp->delivered - sack->rate->prior_delivered);
 		icsk->icsk_ca_ops->pkts_acked(sk, &sample);
 	}
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6d72f3ea48c4..fdc39b4fbbfa 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1256,8 +1256,6 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
 	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
 	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
 	if (clone_it) {
-		TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
-			- tp->snd_una;
 		oskb = skb;
 
 		tcp_skb_tsorted_save(oskb) {
diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
index 0de693565963..fbab921670cc 100644
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -65,6 +65,7 @@ void tcp_rate_skb_sent(struct sock *sk, struct sk_buff *skb)
 	TCP_SKB_CB(skb)->tx.first_tx_mstamp	= tp->first_tx_mstamp;
 	TCP_SKB_CB(skb)->tx.delivered_mstamp	= tp->delivered_mstamp;
 	TCP_SKB_CB(skb)->tx.delivered		= tp->delivered;
+	TCP_SKB_CB(skb)->tx.delivered_ce	= tp->delivered_ce;
 	TCP_SKB_CB(skb)->tx.is_app_limited	= tp->app_limited ? 1 : 0;
 }
 
@@ -86,6 +87,7 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
 
 	if (!rs->prior_delivered ||
 	    after(scb->tx.delivered, rs->prior_delivered)) {
+		rs->prior_delivered_ce  = scb->tx.delivered_ce;
 		rs->prior_delivered  = scb->tx.delivered;
 		rs->prior_mstamp     = scb->tx.delivered_mstamp;
 		rs->is_app_limited   = scb->tx.is_app_limited;
@@ -138,6 +140,10 @@ void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
 	}
 	rs->delivered   = tp->delivered - rs->prior_delivered;
 
+	rs->delivered_ce = tp->delivered_ce - rs->prior_delivered_ce;
+	/* delivered_ce occupies less than 32 bits in the skb control block */
+	rs->delivered_ce &= TCPCB_DELIVERED_CE_MASK;
+
 	/* Model sending data and receiving ACKs as separate pipeline phases
 	 * for a window. Usually the ACK phase is longer, but with ACK
 	 * compression the send phase can be longer. To be safe we use the
-- 
2.33.0.685.g46640cef36-goog


-- 
Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw

Dave Täht CEO, TekLibre, LLC
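
The masking step in the tcp_rate_gen() hunk may be easier to follow in
isolation. Below is a minimal userspace sketch, not kernel code, of why
subtracting a 20-bit snapshot from the full 32-bit counter and then
masking still yields the right CE count. It assumes a free-running 32-bit
counter standing in for tp->delivered_ce. The names snapshot_ce(),
ce_delta() and ce_delta_selfcheck() are hypothetical and do not appear in
the patch, and the sketch uses uint32_t throughout, whereas the patch
stores the result in an s32.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors TCPCB_DELIVERED_CE_MASK from the patch: the low 20 bits. */
#define DELIVERED_CE_MASK ((1U << 20) - 1)

/* Hypothetical helper: what a 20-bit per-packet field keeps of the counter. */
uint32_t snapshot_ce(uint32_t delivered_ce)
{
	return delivered_ce & DELIVERED_CE_MASK;
}

/* Hypothetical helper: CE-marked deliveries since the snapshot, modulo 2^20.
 * Unsigned subtraction wraps, so masking the difference recovers the true
 * delta as long as it is smaller than 2^20.
 */
uint32_t ce_delta(uint32_t delivered_ce_now, uint32_t snapshot)
{
	return (delivered_ce_now - snapshot) & DELIVERED_CE_MASK;
}

/* Usage example with two illustrative (made-up) counter values. */
void ce_delta_selfcheck(void)
{
	/* Counter already above 2^20 at send time; 7 more CE marks arrive. */
	assert(ce_delta(0x00300005u + 7u, snapshot_ce(0x00300005u)) == 7u);
	/* Low 20 bits wrap between send and sample; 5 more CE marks arrive. */
	assert(ce_delta(0x001FFFFEu + 5u, snapshot_ce(0x001FFFFEu)) == 5u);
}
```

Under these assumptions the delta is exact only while fewer than 2^20
CE-marked packets are delivered between the snapshot and the sample;
beyond that the masked result aliases.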