From: Bjørn Ivar Teigen
Date: Tue, 26 Oct 2021 11:04:30 +0100
To: Bob McMahon
Cc: Stuart Cheshire, starlink@lists.bufferbloat.net, Valdis Klētnieks, Make-Wifi-fast, "David P. Reed", Cake List, codel, Matt Mathis, cerowrt-devel, bloat, Neal Cardwell
Subject: Re: [Cerowrt-devel] [Starlink] [Make-wifi-fast] TCP_NOTSENT_LOWAT applied to e2e TCP msg latency
List-Id: Development issues regarding the cerowrt test router project

Hi Bob,

My name is Bjørn Ivar Teigen and I'm working on modeling and measuring WiFi MAC-layer protocol performance for my PhD.

Is it necessary to measure the latency using the TCP stream itself? I had a similar problem in the past, and solved it by doing the latency measurements using TWAMP running alongside the TCP traffic. The requirement for this to work is that the TWAMP packets are placed in the same queue(s) as the TCP traffic, and that the impact of measurement traffic is small enough so as not to interfere too much with your TCP results.
Just my two cents, hope it's helpful.
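
As a rough illustration of measuring latency with a separate probe stream, here is a minimal Python sketch. It is not TWAMP (RFC 5357): it swaps in a plain UDP echo on loopback, and the names `run_reflector`, `probe_rtts` and `demo` are hypothetical. In a real test the probes would also have to share the queue(s) of the TCP traffic, which this sketch does not attempt to arrange.

```python
import socket
import struct
import threading
import time


def run_reflector(sock: socket.socket, count: int) -> None:
    """Echo `count` probe datagrams back to whoever sent them."""
    for _ in range(count):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def probe_rtts(reflector_addr, count: int = 5, interval_s: float = 0.01):
    """Send sequence-numbered probes and return the measured RTTs in seconds."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(1.0)
        for seq in range(count):
            sent = time.monotonic()
            # Each probe carries its sequence number and send timestamp.
            s.sendto(struct.pack("!Id", seq, sent), reflector_addr)
            try:
                data, _ = s.recvfrom(2048)
            except socket.timeout:
                continue  # treat as a lost probe
            echoed_seq, echoed_sent = struct.unpack("!Id", data)
            if echoed_seq == seq:
                rtts.append(time.monotonic() - echoed_sent)
            time.sleep(interval_s)  # keep the probe load small
    return rtts


def demo():
    """Run a reflector and a prober against each other on loopback."""
    reflector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    reflector.bind(("127.0.0.1", 0))
    t = threading.Thread(target=run_reflector, args=(reflector, 5), daemon=True)
    t.start()
    rtts = probe_rtts(reflector.getsockname(), count=5)
    t.join(timeout=2.0)
    reflector.close()
    return rtts
```

Calling `demo()` returns a list of round-trip times in seconds. The probe interval and count are the knobs that trade measurement resolution against interference with the TCP flow under test.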

Bjørn

On Tue, 26 Oct 2021 at 06:32, Bob McMahon <bob.mcmahon@broadcom.com> wrote:
Thanks Stuart, this is helpful. I'm measuring the time just before the first write() (of potentially a burst of writes to achieve a burst size) per a socket fd's select event occurring with TCP_NOTSENT_LOWAT set to a small value, then sampling the RTT and CWND and providing histograms for all three, all on that event. I'm not sure of the correctness of RTT and CWND at this sample point. This is a controlled test over 802.11ax and OFDMA where the TCP acks per the WiFi clients are being scheduled by the AP using 802.11ax trigger frames, so the AP is affecting the end/end BDP per scheduling the transmits and the acks. The AP can grow the BDP or shrink it based on these scheduling decisions. From there we're trying to maximize network power (throughput/delay) for elephant flows and just latency for mouse flows. (We also plan some RF frequency stuff to per OFDMA.) Anyway, the AP based scheduling along with aggregation and OFDMA makes WiFi scheduling optimums non-obvious - at least to me - and I'm trying to provide insights into how an AP is affecting end/end performance.

The more direct approach for e2e TCP latency and network power has been to measure first write() to final read() and compute the e2e delay. This requires clock sync on the ends. (We're using ptp4l with GPS OCXO atomic references for that but this is typically only available in some labs.)
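
The two quantities described here reduce to simple arithmetic. The sketch below is illustrative only: the function names and the numbers are hypothetical, and it assumes both timestamps are taken from clocks that are already synchronized (for example via PTP), which is the hard part in practice.

```python
def e2e_delay_s(first_write_ts: float, final_read_ts: float) -> float:
    """One-way message delay; both timestamps must come from synchronized clocks."""
    delay = final_read_ts - first_write_ts
    if delay <= 0:
        raise ValueError("non-positive delay: clocks are probably not in sync")
    return delay


def network_power(throughput_bps: float, delay_s: float) -> float:
    """Network power as defined in the thread: throughput divided by delay."""
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    return throughput_bps / delay_s


# Hypothetical numbers for illustration only, not measurements from the thread.
delay = e2e_delay_s(first_write_ts=100.000, final_read_ts=100.250)
power = network_power(throughput_bps=80e6, delay_s=delay)
print(f"delay={delay:.3f} s, power={power:.3e} bit/s^2")
```

Rejecting non-positive delays is a deliberate choice: with unsynchronized clocks the subtraction can easily go negative, and failing loudly is safer than reporting a meaningless power figure.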

Bob

On Mon, Oct 25, 2021 at 8:11 PM Stuart Cheshire <cheshire@apple.com> wrote:

On 21 Oct 2021, at 17:51, Bob McMahon via Make-wifi-fast <make-wifi-fast@lists.bufferbloat.net> wrote:

> Hi All,
>
> Sorry for the spam. I'm trying to support a meaningful TCP message latency w/iperf 2 from the sender side w/o requiring e2e clock synchronization. I thought I'd try to use the TCP_NOTSENT_LOWAT event to help with this. It seems that this event goes off when the bytes are in flight vs have reached the destination network stack. If that's the case, then iperf 2 client (sender) may be able to produce the message latency by adding the drain time (write start to TCP_NOTSENT_LOWAT) and the sampled RTT.
>
> Does this seem reasonable?

I’m not 100% sure what you’re asking, but I will try to help.

When you set TCP_NOTSENT_LOWAT, the TCP implementation won’t report your endpoint as writable (e.g., via kqueue or epoll) until less than that threshold of data remains unsent. It won’t stop you writing more bytes if you want to, up to the socket send buffer size, but it won’t *ask* you for more data until the TCP_NOTSENT_LOWAT threshold is reached. In other words, the TCP implementation attempts to keep BDP bytes in flight + TCP_NOTSENT_LOWAT bytes buffered and ready to go. The BDP of bytes in flight is necessary to fill the network pipe and get good throughput. The TCP_NOTSENT_LOWAT of bytes buffered and ready to go is provided to give the source software some advance notice that the TCP implementation will soon be looking for more bytes to send, so that the buffer doesn’t run dry, thereby lowering throughput. (The old SO_SNDBUF option conflates both “bytes in flight” and “bytes buffered and ready to go” into the same number.)
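
For readers who want to try this, a minimal Python sketch of setting the option and waiting for the writable notification follows. The helper names are hypothetical, and it assumes a platform and Python build that expose `socket.TCP_NOTSENT_LOWAT` (Linux and macOS normally do). Where the option is missing, the helper simply reports failure rather than guessing a numeric value. It uses `select()` for brevity; kqueue or epoll would report writability under the same rule.

```python
import select
import socket

# None when this Python build or platform does not expose the option.
LOWAT_OPT = getattr(socket, "TCP_NOTSENT_LOWAT", None)


def set_notsent_lowat(sock: socket.socket, threshold_bytes: int) -> bool:
    """Try to set TCP_NOTSENT_LOWAT; return False if it is unavailable."""
    if LOWAT_OPT is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, LOWAT_OPT, threshold_bytes)
        return True
    except OSError:
        return False


def wait_until_writable(sock: socket.socket, timeout_s: float) -> bool:
    """Block until the kernel reports the socket writable, or time out."""
    _, writable, _ = select.select([], [sock], [], timeout_s)
    return bool(writable)


def loopback_pair():
    """Return a connected (client, server) TCP socket pair on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    return client, server
```

A typical use is `set_notsent_lowat(client, 16 * 1024)` right after connecting, followed by `wait_until_writable(client, 1.0)` before each write. The 16 KiB threshold is an arbitrary example value, not a recommendation from the thread.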

If you wait for the TCP_NOTSENT_LOWAT notification, write a chunk of n bytes of data, and then wait for the next TCP_NOTSENT_LOWAT notification, that will tell you roughly how long it took n bytes to depart the machine. You won’t know why, though. The bytes could depart the machine in response to acks indicating that the same number of bytes have been accepted at the receiver. But the bytes can also depart the machine because CWND is growing. Of course, both of those things are usually happening at the same time.
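
The wait, write, wait-again sequence can be sketched as below. This is a loopback toy under stated assumptions, not how iperf 2 implements it: the function names, the 16 KiB threshold and the 256 KiB payload are all hypothetical, and on loopback the measured time says little about a real WiFi path. `estimate_message_latency_s` encodes the sender-side estimate proposed earlier in the thread (drain time plus sampled RTT); whether that sum is a sound latency estimate is exactly the open question.

```python
import select
import socket
import threading
import time

LOWAT_OPT = getattr(socket, "TCP_NOTSENT_LOWAT", None)


def drain_time_s(sock, payload: bytes, timeout_s: float = 5.0) -> float:
    """Time from just before the write until the socket is writable again."""
    if not select.select([], [sock], [], timeout_s)[1]:
        raise TimeoutError("socket never became writable")
    start = time.monotonic()
    sock.sendall(payload)  # may block until the kernel accepts all bytes
    if not select.select([], [sock], [], timeout_s)[1]:
        raise TimeoutError("socket did not drain in time")
    return time.monotonic() - start


def estimate_message_latency_s(drain_s: float, sampled_rtt_s: float) -> float:
    """Sender-side estimate proposed in the thread: drain time plus RTT."""
    return drain_s + sampled_rtt_s


def _sink(conn, nbytes: int) -> None:
    """Read and discard up to nbytes so the sender can make progress."""
    remaining = nbytes
    while remaining > 0:
        chunk = conn.recv(65536)
        if not chunk:
            break
        remaining -= len(chunk)


def demo(payload_len: int = 256 * 1024) -> float:
    """Measure the drain time of one payload over a loopback connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    listener.close()
    if LOWAT_OPT is not None:
        try:
            sender.setsockopt(socket.IPPROTO_TCP, LOWAT_OPT, 16 * 1024)
        except OSError:
            pass  # unsupported here; select() uses default writability
    t = threading.Thread(target=_sink, args=(receiver, payload_len), daemon=True)
    t.start()
    try:
        return drain_time_s(sender, b"x" * payload_len)
    finally:
        sender.close()
        t.join(timeout=2.0)
        receiver.close()
```

As the paragraph above notes, the drain time alone does not say why the bytes left: ack clocking and CWND growth are both folded into the one number.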

How to use TCP_NOTSENT_LOWAT is explained in this video:

<https://developer.apple.com/videos/play/wwdc2015/719/?time=2199>

Later in the same video is a two-minute demo (time offset 42:00 to time offset 44:00) showing a “before and after” demo illustrating the dramatic difference this makes for screen sharing responsiveness.

<https://developer.apple.com/videos/play/wwdc2015/719/?time=2520>

Stuart Cheshire



--
Bjørn Ivar Teigen
Head of Research
+47 47335952 | bjorn@domos.no | www.domos.no
WiFi Slicing by Domos