From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp69.iad3a.emailsrvr.com (smtp69.iad3a.emailsrvr.com [173.203.187.69])
    (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by lists.bufferbloat.net (Postfix) with ESMTPS id 573A33B29D
    for ; Thu, 14 Apr 2022 12:54:33 -0400 (EDT)
Received: from app31.wa-webapps.iad3a (relay-webapps.rsapps.net [172.27.255.140])
    by smtp17.relay.iad3a.emailsrvr.com (SMTP Server) with ESMTP id 904E125D74;
    Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
Received: from deepplum.com (localhost.localdomain [127.0.0.1])
    by app31.wa-webapps.iad3a (Postfix) with ESMTP id 7A2702089B;
    Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
Received: by apps.rackspace.com
    (Authenticated sender: dpreed@deepplum.com, from: dpreed@deepplum.com)
    with HTTP; Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
X-Auth-ID: dpreed@deepplum.com
Date: Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
From: "David P. Reed"
To: "Michael Welzl"
Cc: "Sebastian Moeller" , ecn-sane@lists.bufferbloat.net
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="----=_20220414125432000000_92878"
Importance: Normal
X-Priority: 3 (Normal)
X-Type: html
In-Reply-To:
References: <4430DD9F-2556-4D38-8BE2-6609265319AF@ifi.uio.no>
    <1649778681.721621839@apps.rackspace.com>
    <0026CF35-46DF-4C0C-8FEE-B5309246C1B7@ifi.uio.no>
    <08F92DA0-1D59-4E58-A289-3D35103CF78B@gmx.de>
X-Client-IP: 209.6.168.128
Message-ID: <1649955272.49298319@apps.rackspace.com>
X-Mailer: webmail/19.0.13-RC
X-Classification-ID: 8b017c04-81c9-4038-adf3-4b79ce074e91-1-1
Subject: Re: [Ecn-sane] rtt-fairness question
X-BeenThere: ecn-sane@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Discussion of explicit congestion notification's impact on the Internet
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 14 Apr 2022 16:54:33 -0000

------=_20220414125432000000_92878
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Am I to assume, then, that routers need not pay any attention to RTT to achieve RTT-fairness?

How does a server or client (at the endpoint) adjust RTT so that it is fair?

Now RTT, technically, is just the sum of the instantaneous queue lengths in bytes along the path and the reverse path, plus a fixed wire-level delay. And routers along any path do not have correlated queue sizes.

It seems to me that RTT adjustment requires collective real-time cooperation among all-or-most future users of that path. The path is partially shared by many servers and many users, none of whom directly speak to each other.

And routers have very limited memory compared to their throughput-RTdelay product. So calculating the RTT using spin bits and UIDs for packets seems a bit much to expect all routers to do.

So, what process measures the cross-interactions among all the users of all the paths, and what control-loop (presumably stable and TCP-compatible) actually converges to RTT fairness IRL?

Today, the basis of congestion control in the Internet is that each router is a controller of all endpoint flows that share a link, and each router is free to do whatever it takes to reduce its queue length to near zero as an average on all timescales larger than about 1/10 of a second (a magic number that is directly derived from measured human brain time resolution).

So, for any two machines separated by less than 1/10 of a light-second in distance, the total queueing delay has to stabilize in about 1/10 of a second. (I'm using a light-second in a fiber medium, not free-space, as the speed of light in fiber is a lot slower than the speed of light on microwaves, as Wall Street has recently started recognizing and investing in).

I don't see how RTT-fairness can be achieved by some set of bits in the IP header.
You can't shorten RTT below about 2/10 of a second in that desired system state. You can only "lengthen" RTT by delaying packets in source or endpoint buffers, because it's unreasonable to manage all the routers.

And the endpoints that share a path can't talk to each other and reach a decision within about 2/10 of a second.

So at the very highest level, what is RTT-fairness's objective function optimizing, and how can it work?

Can it be done without any change to routers?

On Tuesday, April 12, 2022 3:07pm, "Michael Welzl" <michawe@ifi.uio.no> said:

On Apr 12, 2022, at 8:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:

Question: is QUIC actually using the spin bit as an essential part of the protocol?

The spec says it’s optional: https://www.rfc-editor.org/rfc/rfc9000.html#name-latency-spin-bit

Otherwise endpoints might just game this if faking their RTT at a router yields an advantage...

This was certainly discussed in the QUIC WG. Probably perceived as an unclear incentive, but I didn’t really follow this.
Cheers,
Michael

This is why pping's use of tcp timestamps is elegant, little incentive for the endpoints to fudge....

Regards
Sebastian

On 12 April 2022 18:00:15 CEST, Michael Welzl <michawe@ifi.uio.no> wrote:
Hi,
Who or what are you objecting against? At least nothing that I described does what you suggest.
BTW, just as a side point, for QUIC, routers can know the RTT today - using the spin bit, which was designed for that specific purpose.
Cheers,
Michael

On Apr 12, 2022, at 5:51 PM, David P.
Reed <dpreed@deepplum.com> wrote:

I strongly object to congestion control *in the network* attempting to measure RTT (which is an end-to-end comparative metric). Unless the current RTT is passed in each packet, a router cannot enforce fairness. Period.

Today, by packet drops and fair marking, information is passed to the sending nodes (eventually) about congestion. But the router can't know RTT today.

The result of *requiring* RTT fairness would be to make the random bottleneck router (chosen because it is the slowest forwarder on a contended path) become the endpoint controller.

That's the opposite of an "end-to-end resource sharing protocol".

Now, I'm not saying it is impossible - what I'm saying is that it is asking all endpoints to register with an "Internet-wide" RTT real-time tracking and control service.

This would be the technical equivalent of an ITU central control point.

So, either that, or someone will invent something I cannot imagine (a distributed, rapid-convergence algorithm that reflects to *every potential user* of a shared router along the current path the RTTs of ALL other users (and potential users)).

IMHO, the wish for RTT fairness is like saying that the entire solar system's gravitational pull should be equalized so that all planets and asteroids have fair access to 1G gravity.

On Friday, April 8, 2022 2:03pm, "Michael Welzl" <michawe@ifi.uio.no> said:
Hi,
FWIW, we have done some analysis of fairness and convergence of DCTCP in:
Peyman Teymoori, David Hayes, Michael Welzl, Stein Gjessing: "Estimating an Additive Path Cost with Explicit Congestion Notification", IEEE Transactions on Control of Network Systems, 8(2), pp. 859-871, June 2021.
DOI 10.1109/TCNS.2021.3053179
Technical report (longer version):
https://folk.universitetetioslo.no/michawe/research/publications/NUM-ECN_report_2019.pdf
and there’s also some in this paper, which first introduced our LGC mechanism:
https://ieeexplore.ieee.org/document/7796757
See the technical report on page 9, section D: a simple trick can improve DCTCP’s fairness (if that’s really the mechanism to stay with… I’m getting quite happy with the results we get with our LGC scheme :-) )

Cheers,
Michael

On Apr 8, 2022, at 6:33 PM, Dave Taht <dave.taht@gmail.com> wrote:

I have managed to drop most of my state regarding the state of various
dctcp-like solutions. At one level it's good to have not been keeping
up, washing my brain clean, as it were. For some reason or another I
went back to the original paper last week, and have been pounding
through this one again:

Analysis of DCTCP: Stability, Convergence, and Fairness

"Instead, we propose subtracting α/2 from the window size for each marked ACK,
resulting in the following simple window update equation:

One result of which I was most proud recently was of demonstrating
perfect rtt fairness in a range of 20ms to 260ms with fq_codel
( https://forum.mikrotik.com/viewtopic.php?t=179307 ) - and I'm pretty
interested in 2-260ms, but haven't got around to it.

Now, one early result from the sce vs l4s testing I recall was severe
latecomer convergence problems - something like 40s to come into flow
balance - but I can't remember what presentation, paper, or rtt that
was from.
?

Another one has been various claims towards some level of rtt
unfairness being ok, but not the actual ratio, nor (going up to the
paper's proposal above) whether that method had been tried.

My opinion has long been that any form of marking should look more
closely at the observed RTT than any fixed rate reduction method, and
compensate the paced rate to suit. But that's presently just reduced
to an opinion, not having kept up with progress on prague, dctcp-sce,
or bbrv2. As one example of ignorance, are 2 packets still paced back
to back? DRR++ + early marking seems to lead to one packet being
consistently unmarked and the other marked.

--
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org

Dave Täht CEO, TekLibre, LLC
_______________________________________________
Ecn-sane mailing list
Ecn-sane@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/ecn-sane

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

------=_20220414125432000000_92878
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

------=_20220414125432000000_92878--