From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp69.iad3a.emailsrvr.com (smtp69.iad3a.emailsrvr.com
 [173.203.187.69])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 573A33B29D
 for ; Thu, 14 Apr 2022 12:54:33 -0400 (EDT)
Received: from app31.wa-webapps.iad3a (relay-webapps.rsapps.net
 [172.27.255.140])
 by smtp17.relay.iad3a.emailsrvr.com (SMTP Server) with ESMTP id 904E125D74;
 Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
Received: from deepplum.com (localhost.localdomain [127.0.0.1])
 by app31.wa-webapps.iad3a (Postfix) with ESMTP id 7A2702089B;
 Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
Received: by apps.rackspace.com
 (Authenticated sender: dpreed@deepplum.com, from: dpreed@deepplum.com)
 with HTTP; Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
X-Auth-ID: dpreed@deepplum.com
Date: Thu, 14 Apr 2022 12:54:32 -0400 (EDT)
From: "David P. Reed"
To: "Michael Welzl"
Cc: "Sebastian Moeller" ,
 ecn-sane@lists.bufferbloat.net
MIME-Version: 1.0
Content-Type: multipart/alternative;
 boundary="----=_20220414125432000000_92878"
Importance: Normal
X-Priority: 3 (Normal)
X-Type: html
In-Reply-To:
References:
 <4430DD9F-2556-4D38-8BE2-6609265319AF@ifi.uio.no>
 <1649778681.721621839@apps.rackspace.com>
 <0026CF35-46DF-4C0C-8FEE-B5309246C1B7@ifi.uio.no>
 <08F92DA0-1D59-4E58-A289-3D35103CF78B@gmx.de>
X-Client-IP: 209.6.168.128
Message-ID: <1649955272.49298319@apps.rackspace.com>
X-Mailer: webmail/19.0.13-RC
X-Classification-ID: 8b017c04-81c9-4038-adf3-4b79ce074e91-1-1
Subject: Re: [Ecn-sane] rtt-fairness question
X-BeenThere: ecn-sane@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Discussion of explicit congestion notification's impact on the
 Internet
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 14 Apr 2022 16:54:33 -0000
------=_20220414125432000000_92878
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
=0AAm I to assume, then, that routers need not pay any attention to RTT to =
achieve RTT-fairness?=0A =0AHow does a server or client (at the endpoint) a=
djust RTT so that it is fair?=0A =0ANow RTT, technically, is just the sum o=
f the instantaneous queue lengths in bytes along the path and the reverse p=
ath, plus a fixed wire-level delay. And routers along any path do not have =
correlated queue sizes.=0A =0AIt seems to me that RTT adjustment requires c=
ollective real-time cooperation among all-or-most future users of that path=
. The path is partially shared by many servers and many users, none of who=
m directly speak to each other.=0A =0AAnd routers have very limited memory =
compared to their throughput-RTdelay product. So calculating the RTT using =
spin bits and UIDs for packets seems a bit much to expect all routers to do=
.=0A =0ASo, what process measures the cross-interactions among all the user=
s of all the paths, and what control-loop (presumably stable and TCP-compat=
ible) actually converges to RTT fairness IRL?=0A =0AToday, the basis of con=
gestion control in the Internet is that each router is a controller of all =
endpoint flows that share a link, and each router is free to do whatever =
it takes to reduce its queue length to near zero as an average on all =
timesca=
les larger than about 1/10 of a second (a magic number that is directly der=
ived from measured human brain time resolution).=0A =0ASo, for any two mach=
ines separated by less than 1/10 of a light-second in distance, the total q=
ueueing delay has to stabilize in about 1/10 of a second. (I'm using a ligh=
t-second in a fiber medium, not free-space, as the speed of light in fiber =
is a lot slower than the speed of light on microwaves, as Wall Street has r=
ecently started recognizing and investing in).=0A =0AI don't see how RTT-f=
airness can be achieved by some set of bits in the IP header. You can't sho=
rten RTT below about 2/10 of a second in that desired system state. You =
can only "lengthen" RTT by delaying packets in source or endpoint buffers, =
bec=
ause it's unreasonable to manage all the routers.=0A =0AAnd the endpoints t=
hat share a path can't talk to each other and reach a decision in about 2/1=
0 of a second.=0A =0ASo at the very highest level, what is RTT-fa=
irness's objective function optimizing, and how can it work?=0A =0ACan it b=
e done without any change to routers?=0A =0A =0A =0A =0AOn Tuesday, April 1=
2, 2022 3:07pm, "Michael Welzl" said:=0A=0A=0A=0A=0AOn =
Apr 12, 2022, at 8:52 PM, Sebastian Moeller <[ moeller0@gmx.de ]( mailto:m=
oeller0@gmx.de )> wrote:=0A=0AQuestion: is QUIC actually using the spin =
bit as an essential part of the protocol?=0AThe spec says it=E2=80=99s =
optional: [ https://www.rfc-editor.org/rfc/rfc9000.html#name-latency-spin-=
bit ]( https://www.rfc-editor.org/rfc/rfc9000.html#name-latency-spin-bit )=
=0A=0A=0AOtherwise endpoints might just game this if faking their RTT at a =
router yields an advantage...=0AThis was certainly discussed in the QUIC =
WG. Probably perceived as an unclear incentive, but I didn=E2=80=99t =
really follow this.=0ACheers,=0AMichael=0A=0A=0AThis is why pping's use of =
tcp timestamps is elegant, little incentive for the endpoints to =
fudge....=0ARegards Sebastian=0AOn 12 April 2022 18:00:15 CEST, Michael =
Welzl <[ michawe@ifi.uio.no ]( mailto:michawe@ifi.uio.no )> wrote:=0AHi,=0A=
Who or what are you objecting against? At least nothing that I described =
does what you suggest.=0ABTW, just as a side point, for QUIC, routers can =
know the RTT today - using the spin bit, which was designed for that =
specific purpose.=0ACheers,=0AMichael=0A=0A=0AOn Apr 12, 2022, at 5:51 PM, =
David P. Reed <[ dpreed@deepplum.com ]( mailto:d=
preed@deepplum.com )> wrote:=0A=0AI strongly object to congestion control *=
in the network* attempting to measure RTT (which is an end-to-end comparati=
ve metric). Unless the current RTT is passed in each packet a router =
cannot enforce fairness. Period. =0A =0AToday, by packet drops and fair =
marking, =
information is passed to the sending nodes (eventually) about congestion. B=
ut the router can't know RTT today.=0A =0AThe result of *requiring* RTT fai=
rness would be to let the random bottleneck router (chosen because it is th=
e slowest forwarder on a contended path) become the endpoint controller.=0A=
=0AThat's the opposite of an "end-to-end resource sharing protocol".=0A =
=0ANow, I'm not saying it is impossible - what I'm saying is it asks all =
endpoints to register with an "Internet-wide" RTT real-time tracking and co=
ntrol service.=0A =0AThis would be the technical equivalent of an ITU centr=
al control point.=0A =0ASo, either someone will invent something I cannot i=
magine (a distributed, rapid-convergence algorithm that reflects to *every =
potential user* of a shared router along the current path the RTT's of =
ALL other users (and potential users)).=0A =0AIMHO, the wish for RTT =
fairness i=
s like saying that the entire solar system's gravitational pull should be e=
qualized so that all planets and asteroids have fair access to 1G gravity.=
=0A =0A =0AOn Friday, April 8, 2022 2:03pm, "Michael Welzl" <[ michawe@ifi.=
uio.no ]( mailto:michawe@ifi.uio.no )> said:=0AHi,=0AFWIW, we have done som=
e analysis of fairness and convergence of DCTCP in:=0APeyman Teymoori, Davi=
d Hayes, Michael Welzl, Stein Gjessing: "Estimating an Additive Path Cost w=
ith Explicit Congestion Notification", IEEE Transactions on Control of Netw=
ork Systems, 8(2), pp. 859-871, June 2021. DOI 10.1109/TCNS.2021.3053179=0A=
Technical report (longer version):=0A[ https://folk.universitetetioslo.no/m=
ichawe/research/publications/NUM-ECN_report_2019.pdf ]( https://folk.univer=
sitetetioslo.no/michawe/research/publications/NUM-ECN_report_2019.pdf )=0Aa=
nd there=E2=80=99s also some in this paper, which first introduced our LGC =
mechanism:=0A[ https://ieeexplore.ieee.org/document/7796757 ]( https://ieee=
xplore.ieee.org/document/7796757 )=0ASee the technical report on page 9, se=
ction D: a simple trick can improve DCTCP=E2=80=99s fairness (if that=E2=
=80=99s really the mechanism to stay with=E2=80=A6 I=E2=80=99m getting qu=
ite happy with the results we get with our LGC scheme :-) )=0A=0ACheers=
,=0AMichael=0A=0AOn Apr 8, 2022, at 6:33 PM, Dave Taht <[ dave.taht@gmail.c=
om ]( mailto:dave.taht@gmail.com )> wrote:=0A=0AI have managed to drop =
most of my state regarding the state of various dctcp-like solutions. At =
one level it's good to have not been keeping up, washing my brain clean, =
as it were. For some reason or another I went back to the original paper =
last week, and have been pounding through this one again:=0AAnalysis of =
DCTCP: Stability, Convergence, and Fairness=0A"Instead, we propose =
subtracting =CE=B1/2 from the window size for each marked ACK, resulting =
in the following simple window update equation:=0AOne result of which I =
was most proud recently was of demonstrating perfect rtt fairness in a =
range of 20ms to 260ms with fq_codel ( [ https://forum.mikrotik.com/viewto=
pic.php?t=3D179307 ]( https://forum.mikrotik.com/viewtopic.php?t=3D179307 =
) ) - and I'm pretty interested in 2-260ms, but haven't got around to =
it.=0ANow, one early result from the sce vs l4s testing I recall was =
severe latecomer convergence problems - something like 40s to come into =
flow balance - but I can't remember what presentation, paper, or rtt that =
was from. ?=0AAnother one has been various claims towards some level of =
rtt unfairness being ok, but not the actual ratio, nor (going up to the =
paper's proposal above) whether that method had been tried.=0AMy opinion =
has long been that any form of marking should look more closely at the =
observed RTT than any fixed rate reduction method, and compensate the =
paced rate to suit. But that's presently just reduced to an opinion, not =
having kept up with progress on prague, dctcp-sce, or bbrv2. As one =
example of ignorance, are 2 packets still paced back to back? DRR++ + =
early marking seems to lead to one packet being consistently unmarked and =
the other marked.=0A-- =0AI tried to build a better future, a few times:=0A=
[ https://wayforward.archive.org/?site=3Dhttps%3A%2F%2Fwww.icei.org ]( htt=
ps://wayforward.archive.org/?site=3Dhttps%3A%2F%2Fwww.icei.org )=0ADave =
T=C3=A4ht CEO, TekLibre, LLC=0A=
_______________________________________________=0AEcn-sane mailing list=0A=
[ Ecn-sane@lists.bufferbloat.net ]( mailto:Ecn-sane@lists.bufferbloat.net =
)=0A[ https://lists.bufferbloat.net/listinfo/ecn-sane ]( https://lists.buf=
ferbloat.net/listinfo/ecn-sane )=0A=0A-- Sent from my Android device with =
K-9 Mail. Please excuse my brevity.
------=_20220414125432000000_92878--