From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <moeller0@gmx.de>
Received: from mout.gmx.net (mout.gmx.net [212.227.17.22])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 211CF3B2A4
 for <bloat@lists.bufferbloat.net>; Tue, 12 Jul 2022 05:47:41 -0400 (EDT)
Received: from smtpclient.apple ([134.76.241.253]) by mail.gmx.net (mrgmx105
 [212.227.17.168]) with ESMTPSA (Nemesis) id 1MaJ3t-1o5PbN0BDZ-00WCTy; Tue, 12
 Jul 2022 11:47:40 +0200
Content-Type: text/plain;
	charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.100.31\))
From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <E9AD9F79-41D6-437E-AC25-293EDDA6FBBF@ifi.uio.no>
Date: Tue, 12 Jul 2022 11:47:39 +0200
Cc: =?utf-8?Q?Dave_T=C3=A4ht?= <dave.taht@gmail.com>,
 bloat <bloat@lists.bufferbloat.net>
Content-Transfer-Encoding: quoted-printable
Message-Id: <32A9050A-B81F-4838-BE7F-691F0670DB84@gmx.de>
References: <6458C1E6-14CB-4A36-8BB3-740525755A95@ifi.uio.no>
 <CAA93jw6bE2NYSHeAcqL+w_j0Mv4KMxRWJR_AwRVT+vGCYiXg7A@mail.gmail.com>
 <7D20BEF3-8A1C-4050-AE6F-66E1B4203EE1@gmx.de>
 <4E163307-9B8A-4BCF-A2DE-8D7F3C6CCEF4@ifi.uio.no>
 <F5C9EFF0-9DEB-4843-A21E-2DB3E9E44483@gmx.de>
 <95FB54F9-973F-40DE-84BF-90D05A642D6B@ifi.uio.no>
 <DE7A9468-056D-44E5-9ADF-DC83B5C10E03@gmx.de>
 <0BAAEF4C-331B-493C-B1F5-47AA648C64F8@ifi.uio.no>
 <9DF7ADFC-B5FC-4488-AF80-A905FECC17E8@gmx.de>
 <E9AD9F79-41D6-437E-AC25-293EDDA6FBBF@ifi.uio.no>
To: Michael Welzl <michawe@ifi.uio.no>
X-Mailer: Apple Mail (2.3696.100.31)
Subject: Re: [Bloat] [iccrg] Musings on the future of Internet Congestion
 Control
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Tue, 12 Jul 2022 09:47:42 -0000

Hi Michael,


> On Jul 11, 2022, at 10:49, Michael Welzl <michawe@ifi.uio.no> wrote:
>=20
> Hi !
>=20
> A few answers below -
>=20
>=20
>> On Jul 11, 2022, at 9:33 AM, Sebastian Moeller <moeller0@gmx.de> =
wrote:
>>=20
>> HI Michael,
>>=20
>>=20
>>> On Jul 11, 2022, at 08:24, Michael Welzl <michawe@ifi.uio.no> wrote:
>>>=20
>>> Hi Sebastian,
>>>=20
>>> Neither our paper nor me are advocating one particular solution - we =
point at a problem and suggest that research on ways to solve the =
under-utilization problem might be worthwhile.
>>=20
>> 	[SM2] That is easy to agree upon, as is agreeing on improving =
slow start and trying to reduce underutilization, but actually doing is =
hard; personally I am more interested in the hard part, so I might have =
misunderstood the gist of the discussion you want to start with that =
publication.
>=20
> What you=E2=80=99re doing is jumping ahead. I suggest doing this with =
research rather than an email discussion, but that=E2=80=99s what =
we=E2=80=99re now already into.

	[SM] Sure, except my day job is in a completely different field =
(wetware instead of hard- or software), so it is very unlikely that I =
will be able to contribute meaningfully to that kind of research. I am =
not saying all I can contribute is just talk; I helped out with a few =
obscure details in the bufferbloat movement (better: anti-bufferbloat =
movement) and I did contribute a bit to sqm and the earlier cited =
autorate approach (which is topical because it is concerned with =
capacity detection), but I will not be conducting or publishing any =
CS-grade research on CC and/or slow start. Maybe a reason for me to =
respectfully bow out of this discussion as talk is cheap and easy to =
come by even without me helping.

>>> Jumping from this to discussing the pro=E2=80=99s and con=E2=80=99s =
of a potential concrete solution is quite a leap=E2=80=A6
>>>=20
>>> More below:
>>>=20
>>>=20
>>>> On Jul 10, 2022, at 11:29 PM, Sebastian Moeller <moeller0@gmx.de> =
wrote:
>>>>=20
>>>> Hi Michael,
>>>>=20
>>>>=20
>>>>> On Jul 10, 2022, at 22:01, Michael Welzl <michawe@ifi.uio.no> =
wrote:
>>>>>=20
>>>>> Hi !
>>>>>=20
>>>>>=20
>>>>>> On Jul 10, 2022, at 7:27 PM, Sebastian Moeller <moeller0@gmx.de> =
wrote:
>>>>>>=20
>>>>>> Hi Michael,
>>>>>>=20
>>>>>> so I reread your paper and stewed a bit on it.
>>>>>=20
>>>>> Many thanks for doing that! :)
>>>>>=20
>>>>>=20
>>>>>> I believe that I do not buy some of your premises.
>>>>>=20
>>>>> you say so, but I don=E2=80=99t really see much disagreement here. =
Let=E2=80=99s see:
>>>>>=20
>>>>>=20
>>>>>> e.g. you write:
>>>>>>=20
>>>>>> "We will now examine two factors that make the the present =
situation particularly worrisome. First, the way the infrastructure has =
been evolving gives TCP an increasingly large operational space in which =
it does not see any feedback at all. Second, most TCP connections are =
extremely short. As a result, it is quite rare for a TCP connection to =
even see a single congestion notification during its lifetime."
>>>>>>=20
>>>>>> And seem to see a problem that flows might be able to finish =
their data transfer business while still in slow start. I see the same =
data, but see no problem. Unless we have an oracle that tells each =
sender (over a shared bottleneck) exactly how much to send at any given =
time point, different control loops will interact on those intermediary =
nodes.
>>>>>=20
>>>>> You really say that you don=E2=80=99t see the solution. The =
problem is that capacities are underutilized, which means that flows =
take longer (sometimes, much longer!) to finish than they theoretically =
could, if we had a better solution.
>>>>=20
>>>> 	[SM] No IMHO the underutilization is the direct consequence of =
requiring a gradual filling of the "pipes" to probe the available =
capacity. I see no way how this could be done differently with the =
traffic sources/sinks being uncoordinated entities at the edge, and I =
see no way of coordinating all end points and handle all paths. In other =
words, we can fine tune parameters to tweak the probing a bit, make it =
more or less aggressive/fast, but the fact that we need to probe =
capacity somehow means underutilization can not be avoided unless we =
find a way of coordinating all of the sinks and sources. But being =
sufficiently dumb, all I can come up with is an all-knowing oracle or =
faster than light communication, and neither strikes me to be realistic =
;)
>>>=20
>>> There=E2=80=99s quite a spectrum of possibilities between an oracle =
or =E2=80=9Ccoordinating all of the sinks and sources=E2=80=9D on one =
hand, and quite =E2=80=9Cblindly=E2=80=9D probing from a constant IW on =
the other.
>>=20
>> 	[SM] You say "blindly" I say "starting from a conservative but =
reliable prior"... And what I see is that qualitatively significantly =
better approaches are not really possible, so we need to discuss small =
quantitative changes.
>=20
> More about the term =E2=80=9Cblind=E2=80=9D below:
>=20
>=20
>>> The =E2=80=9Cfine tuning=E2=80=9D that you mention is interesting =
research, IMO!
>>=20
>> 	[SM] The paper did not read that you were soliciting ideas for =
small gradual improvements to me.
>=20
> It calls for being drastic in the way we think about things, because =
it makes the argument that PEPs (different kinds of them!) might in fact =
be the right approach - but it doesn=E2=80=99t say that =E2=80=9Conly =
drastic solutions are good solutions=E2=80=9D. Our =E2=80=9CThe Way =
Forward=E2=80=9D section has 3 subsections; one of them is on end-to-end =
approaches, where we call out the RL-IW approach I mention below as one =
good way ahead. I would categorize this as =E2=80=9Csmall and =
gradual=E2=80=9D.

	[SM] Having contact with machine learning and reinforcement =
learning in my day job, I am at best cautiously optimistic for such a =
solution to end up with an improvement across the board, but again happy =
to be shown wrong. I prefer solutions with algorithms that are easier to =
interpret than what machine learning (especially with deep networks) =
ends up with.


>=20
>=20
>>>>>> I might be limited in my depth of thought here, but having each =
flow probing for capacity seems exactly the right approach... and =
doubling CWND or rate every RTT is pretty aggressive already (making =
slow start shorter by reaching capacity faster within the slow-start =
framework requires either to start with a higher initial value (what =
increasing IW tries to achieve?) or use a larger increase factor than 2 =
per RTT). I consider increased IW a milder approach than the =
alternative. And once one accepts that a gradual rate increasing is the =
way forward it falls out logically that some flows will finish before =
they reach steady state capacity especially if that flow's available =
capacity is large. So what exactly is the problem with short flows not =
reaching capacity and what alternative exists that does not lead to =
carnage if more-aggressive start-up phases drive the bottleneck load =
into emergency drop territory?
>>>>>=20
>>>>> There are various ways to do this
>>>=20
>>> [snip: a couple of concrete suggestions from me, and answers about =
what problems they might have, with requests for references from you]
>>>=20
>>> I=E2=80=99m sorry, but I wasn=E2=80=99t really going to have a =
discussion about these particular possibilities. My point was only that =
many possible directions exist - being completely =E2=80=9Cblind=E2=80=9D =
isn=E2=80=99t the only possible approach.
>>=20
>> 	[SM] Again I do not consider "blind" to be an appropriate =
qualification here.
>=20
> IW is a global constant (not truly configured the same everywhere, =
most probably for good reason!

	[SM] Arguable how "good" these reasons are. It is already =
noticeable that a number of CDNs use higher than standard IWs and =
extract more than their expected fair share of a link's capacity. That =
is certainly in the interest of the CDN and the CDN's paying customers, =
but not necessarily in the interest of the end-user accessing the data =
on the CDN. (Personally I do not care too much about this, since I use =
an fq-scheduler at my ingress and egress, so high IW flows finish faster =
if spare capacity is available and only cause queue build-up for =
themselves if there is no spare capacity).

>  but the standard suggests a globally unique value).

	[SM] One thing I learned about IETF standards in the last years =
is that not all of them are of the same quality; I will look carefully =
at each standard before accepting its recommendation as "gospel". (IMHO =
the IETF process is very welcome but assumes good faith all around, and =
hence seems easily gamed).

> =46rom then on, the cwnd is doubled a couple of times. No feedback =
about the path=E2=80=99s capacity exists - and then, the connection is =
over.

	[SM] I disagree, the flow very much knows that the reached =
CWND/sending rate was <=3D the capacity available for that flow; =
that is feedback IMHO. If you are a gambling type you could try to =
speculatively re-use that information.

> Okay, there is ONE thing that such a flow gets: the RTT. =E2=80=9CBlind =
except for RTT measurements=E2=80=9D, then.

	[SM] I guess your point is "flow does not know the maximal =
capacity it could have gotten"?

> Importantly, such a flow never learns how large its cwnd *could* have =
become without ever causing a problem. Perhaps 10 times more? 100 times?

	[SM] Sure. ATM the only way to learn a path's capacity is =
actually to saturate the path*, but if a flow is done with its data =
transfer, having it exchange dummy data just to probe capacity seems =
like a waste all around. I guess what I want to ask is, how would =
knowing how much untapped capacity was available at one point help?


*) stuff like deducing capacity from packet pair interval at the =
receiver (assuming packets sent back to back) is notoriously imprecise, =
so unless "chirping" overcomes that imprecision without costing too many =
round trips worth of noise suppression, measuring capacity by causing =
congestion is the only way. Not great.
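
(To put a number on "notoriously imprecise", a rough sketch in Python.
The packet size and the arrival gaps are made-up illustration values,
not measurements, and the function name is mine.)

```python
def packet_pair_mbps(packet_bytes, gap_us):
    """Capacity guess from one back-to-back pair, in Mbit/s.

    Bits per microsecond is the same unit as Mbit/s.
    """
    return 8 * packet_bytes / gap_us


# A 1500 byte pair over a 100 Mbit/s bottleneck should arrive 120 us
# apart; shift that gap by 20 us of timing noise in either direction.
for gap_us in (100, 120, 140):
    print(gap_us, round(packet_pair_mbps(1500, gap_us), 1))
```

So 20 us of jitter on a single pair already moves the guess by
something like 15 to 20 percent, hence the need for many samples.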


>=20
>=20
>>> Instead of answering your comments to my suggestions, let me give =
you one single concrete piece here: our reference 6, as one example of =
the kind of research that we consider worthwhile for the future:
>>>=20
>>> "X. Nie, Y. Zhao, Z. Li, G. Chen, K. Sui, J. Zhang, Z. Ye, and D. =
Pei, =E2=80=9CDynamic TCP initial windows and congestion control schemes =
through reinforcement learning,=E2=80=9D IEEE JSAC, vol. 37, no. 6, =
2019.=E2=80=9D
>>> https://1989chenguo.github.io/Publications/TCP-RL-JSAC19.pdf
>>=20
>> 	[SM] =46rom the title I predict that this is going to lean into =
the "cache" idea trying to improve the average hit rate of said cache...
>>=20
>>> This work learns a useful value of IW over time, rather than using a =
constant. One author works at Baidu, the paper uses data from Baidu, and =
it says:
>>> "TCP-RL has been deployed in one of the top global search engines =
for more than a year. Our online and testbed experiments show that for =
short flow transmission, compared with the common practice of IW =3D 10, =
TCP-RL can reduce the average transmission time by 23% to 29%.=E2=80=9D
>>>=20
>>> - so it=E2=80=99s probably fair to assume that this was (and perhaps =
still is) active in Baidu.
>>=20
>> 	[SM] This seems to confirm my prediction... however the paper =
seems to be written pretty exclusively from the view of an operator of =
server farms, not sure this approach will actually do any good for leaf =
end-points in e.g. home networks (that is for their sending behavior). I =
tend to prefer symmetric solutions, but if data center traffic can reach =
higher utilization without compromising end-user quality of experience =
and fairness, what is not to like about this. It is however fully within =
the existing slow-start framework, no?
>=20
> Yes!
>=20
>=20
>>>>>> And as an aside, a PEP (performance enhancing proxy) that does =
not enhance performance is useless at best and likely harmful (rather a =
PDP, performance degrading proxy).
>>>>>=20
>>>>> You=E2=80=99ve made it sound worse by changing the term, for =
whatever that=E2=80=99s worth. If they never help, why has anyone ever =
called them PEPs in the first place?
>>>>=20
>>>> 	[SM] I would guess because "marketing" was unhappy with =
"engineering" emphasizing the side-effects/potential problems and =
focused on the best-case scenario? ;)
>>>=20
>>> It appears that you want to just ill-talk PEPs.
>>=20
>> 	[SM] Not really, I just wanted to point out that I expect the =
term PEP to come from entities selling those products and in our current =
environment it is clear that products are named and promoted emphasizing =
the potential benefit they can bring and not by the additional risks =
they might carry (e.g. fission power plants were sold on the idea of =
essentially unlimited cheap emission free energy, and not on the =
concurrent problem with waste disposal over time frames in the order of =
the aggregate human civilisation from the bronze age). I have no beef =
with that, but I do not think the "positive" name can be taken as a sign =
that PEPs are generally liked or live up to their name (note I am also =
not saying that they do not, just that the name PEP is a rather =
unreliable predictor here).
>=20
> I don=E2=80=99t even think that this name has that kind of history. My =
point was that they=E2=80=99re called PEPs because they=E2=80=99re =
*meant* to improve performance;

	[SM] That is not really how our economic system works... =
products are primarily intended to generate more revenue than cost, it =
helps if they offer something to the customer, but that is really just a =
means to extract revenue...


> that=E2=80=99s what they=E2=80=99re designed for. You describe =E2=80=9C=
a PEP that does not enhance performance=E2=80=9D, which, to me, is like =
talking about a web server that doesn=E2=80=99t serve web pages. Sure, =
not all PEPs may always work well, but they should - that=E2=80=99s =
their raison d=E2=80=99=C3=AAtre.

	[SM] That is a very optimistic view, one I would love to be able to =
share.

>=20
>>> There are plenty of useful things that they can do and yes, I =
personally think they=E2=80=99re the way of the future - but **not** in =
their current form, where they must =E2=80=9Clie=E2=80=9D to TCP, cause =
ossification,
>>=20
>> 	[SM] Here I happily agree, if we can get the nagative =
side-effects removed that would be great, however is that actually =
feasible or just desirable?
>>=20
>>> etc. PEPs have never been considered as part of the congestion =
control design - when they came on the scene, in the IETF, they were =
despised for breaking the architecture, and then all the trouble with =
how they need to play tricks was discovered (spoofing IP addresses, =
making assumptions about header fields, and whatnot). That doesn=E2=80=99t=
 mean that a very different kind of PEP - one which is authenticated and =
speaks an agreed-upon protocol - couldn=E2=80=99t be a good solution.
>>=20
>> 	[SM] Again, I agree it could in theory especially if =
well-architected.=20
>=20
> That=E2=80=99s what I=E2=80=99m advocating.

	[SM] Well, can you give an example of an existing =
well-architected PEP as proof of principle?

>=20
>>> You=E2=80=99re bound to ask me for concrete things next, and if I =
give you something concrete (e.g., a paper on PEPs), you=E2=80=99ll find =
something bad about it
>>=20
>> 	[SM] Them are the rules of the game... however if we should play =
the game that way, I will come out of it having learned something new =
and potentially changing my opinion.
>>=20
>>> - but this is not a constructive direction of this conversation. =
Please note that I=E2=80=99m not saying =E2=80=9CPEPs are always =
good=E2=80=9D: I only say that, in my personal opinion, they=E2=80=99re =
a worthwhile direction of future research. That=E2=80=99s a very =
different statement.
>>=20
>> 	[SM] Fair enough. I am less optimistic, but happy to be =
disappointed in my pessimism.
>>=20
>>>=20
>>>>> Why do people buy these boxes?
>>>>=20
>>>> 	[SM] Because e.g. for GEO links, latency is in a range where =
default unadulterated TCP will likely choke on itself, and when faced =
with requiring customers to change/tune TCPs or having "PEP" fudge it, =
ease of use of fudging won the day. That is a generous explanation (as =
this fudging is beneficial to both the operator and most end-users), I =
can come up with less charitable theories if you want ;) .
>>>>=20
>>>>>> The network so far has been doing reasonably well with putting =
more protocol smarts at the ends than in the parts in between.
>>>>>=20
>>>>> Truth is, PEPs are used a lot: at cellular edges, at satellite =
links=E2=80=A6 because the network is *not* always doing reasonably well =
without them.
>>>>=20
>>>> 	[SM] Fair enough, I accept that there are use cases for those, =
but again, only if they actually enhance the "experience" will users be =
happy to accept them.
>>>=20
>>> =E2=80=A6 and that=E2=80=99s the only reason to deploy them, given =
that (as the name suggests) they=E2=80=99re meant to increase =
performance. I=E2=80=99d be happy to learn more about why you appear to =
hate them so much (even just anecdotal).
>>>=20
>>>> The goals of the operators and the paying customers are not always =
aligned here, a PEP might be advantageous more to the operator than the =
end-user (theoretically also the other direction, but since operators =
pay for PEPs they are unlikely to deploy those) think mandatory image =
recompression or forced video quality downscaling.... (and sure these =
are not as clear as I pitched them, if after an emergency a PEP allows =
most/all users in a cell to still send somewhat degraded images that is =
better than the network choking itself with a few high quality images, =
assuming images from the emergency are somewhat useful).
>>>=20
>>> What is this, are you inventing a (too me, frankly, strange) =
scenario where PEPs do some evil for customers yet help operators,
>>=20
>> 	[SM] This is no invention, but how capitalism works, sorry. The =
party paying for the PEP decides on using it based on the advantages it =
offers for them. E.g. a mobile carrier that (in the past) forcibly =
managed to downgrade the quality of streaming video over mobile links =
without giving the paying end-user an option to use either choppy high =
resolution or smooth low resolution video. By the way, that does not =
make the operator evil, it is just that operator and paying customers =
goals and desires are not all that well aligned (e.g. the operator wants =
to maximize revenue, the customer to minimize cost).
>=20
> You claim that these goals and desires are not well aligned (and a PEP =
is then an instrument in this evil)

	[SM] Again this is expected behavior in our economic system, I =
have not and will not classify that as "evil", but I will also not start =
believing that companies offer products just to get a warm and fuzzy =
feeling. It is part of the principle of how a market economy works that =
the goals of the entities involved are opposite of each other, that is =
how a market is supposed to optimize resource allocation.

> - do you have any proof, or even anecdotes, to support that claim?

	[SM] The claim that sellers want the highest revenue/cost ratio =
while buyers want the lowest cost/utility seems hardly controversial or =
requiring a citation.


> I would think that operators generally try to make their customers =
happy (or they would switch to different operators).  Yes there may be =
some misalignments in incentives, but I believe that these are more =
subtle points. E.g., who wants a choppy high resolution video? Do such =
users really exist?

	[SM] I might be able to buffer that choppy video well enough to =
allow smooth playback at the desired higher resolution/quality (or I =
might be happy with a few seconds to compare quality of displays), given =
that I essentially buy internet access from my mobile carrier that =
carrier should get out of my way. (However if the carrier also offers =
"video-optimization" as an opt-in feature end-users can toggle at will =
that is a different kettle of fish and something I would consider good =
service). IIRC a German carrier was simply forcing down quality for all =
video streaming at all times, mostly to minimize cost and bandwidth =
usage, which pretty much looks like an exercise to minimize operational =
cost and not to increase customer satisfaction. So yes there are =
"misalignments in incentives" that are inherent and structural to the =
way we organize our society. (I am sort of okay living with that, but I =
will not sugar coat it).


>>> or is there an anecdote here?
>>=20
>> 	[SM] I think the video downscaling thing actually happened in =
the German market, but I am not sure on the exact details, so I might =
misinterpret things a bit here. However the observation about alignment =
of goals I believe to be universally true.
>=20
> I=E2=80=99d be interested in hearing more. Was there an outcry of =
customers who wanted their choppy high resolution video back?   :-)    =
:-)

	[SM] The ISP downscaled not only when it reduced "choppy" play =
but at all times to minimize bandwidth use and increase available "free =
cell capacity" which IMHO is an economic decision to minimize costs and =
not one to improve customer experience.=20


>>>>>> I have witnessed the arguments in the "L4S wars" about how little =
processing one can ask the more central network nodes perform, e.g. flow =
queueing which would solve a lot of the issues (e.g. a hyper aggressive =
slow-start flow would mostly hurt itself if it overshoots its capacity) =
seems to be a complete no-go.
>>>>>=20
>>>>> That=E2=80=99s to do with scalability, which depends on how close =
to the network=E2=80=99s edge one is.
>>>>=20
>>>> 	[SM] I have heard the alternative that it has to do with what =
operators of core-links request from their vendors and what features =
they are willing to pay for... but this is very anecdotal as I have =
little insight into big-iron vendors or core-link operators.=20
>>>>=20
>>>>>> I personally think what we should do is have the network supply =
more information to the end points to control their behavior better. =
E.g. if we would mandate a max_queue-fill-percentage field in a protocol =
header and have each node write max(current_value_of_the_field, =
queue-filling_percentage_of_the_current_node) in every packet, end =
points could estimate how close to congestion the path is (e.g. by =
looking at the rate of %queueing changes) and tailor their =
growth/shrinkage rates accordingly, both during slow-start and during =
congestion avoidance.
>>>>>=20
>>>>> That could well be one way to go. Nice if we provoked you to =
think!
>>>>=20
>>>> 	[SM] You mostly made me realize what the recent increases in IW =
actually aim to accomplish ;)
>>>=20
>>> That=E2=80=99s fine! Increasing IW is surely a part of the solution =
space - though I advocate doing something else (as in the example above) =
than just to increase the constant in a worldwide standard.
>>=20
>> 	[SM] Happy to agree, I am not saying I think increasing IW is =
something I unconditionally support, just that I see what it offers.
>>=20
>>=20
>>>> and that current slow start seems actually better than its =
reputation; it solves a hard problem surprisingly well.
>>>=20
>>> Actually, given that the large majority of flows end somewhere in =
slow start, what makes you say that it solves it =E2=80=9Cwell=E2=80=9D?
>>=20
>> 	[SM] As I said, I accepted that there is no silver bullet, and =
hence some gradual probing with increasing CWND/rate is unavoidable =
which immediately implies that some flows will end before reaching =
capacity.
>=20
> You say =E2=80=9Csome=E2=80=9D but data says =E2=80=9Cthe large =
majority=E2=80=9D.

	[SM] Well, I am fine with that as I see no real viable =
alternative. Again we might be able to tweak the capacity search =
parameters a bit but that the ramp up takes time is a feature not a bug =
(so even if we knew the immediately available capacity, we should not =
jump there in one fell swoop, unless we are big fans of non-dampened =
oscillations); this is how multiple concurrent, unsynchronized flows =
can coexist next to each other in a reasonable fashion without requiring =
too much smarts in the network.


>> So the fact that flows end in slow-start is not a problem but part of =
the solution. I see no way of ever having all flows immediately start at =
their "stable" long-term capacity share (something that does not exist =
in the first place in environments with un-correlated and unpredictable =
cross traffic). But short of that almost all flows will need more round =
trips to finish than theoretically minimally possible. I tried to make =
that point before, and I am not saying current slow-start is 100% =
perfect, but I do not expect the possible fine-tuning to get us close =
enough to the theoretical performance of an "oracle" solution to count =
as "revolutionary" improvement.
>=20
> It doesn=E2=80=99t need to be revolutionary; I think that ways to =
learn / cache the IW are already quite useful.

	[SM] As with all speculation, that is a "gamble", where the risk and =
cost of mis-prediction need to be weighed against the advantage of =
getting it right. But I guess I agree that making IW more of a free =
parameter might be interesting and a way forward. My gut feeling tells =
me however we might want to look at pacing these out a bit, I have no =
data to back this up though.

>=20
> Now, you repeatedly mentioned that caching may not work because flows =
don=E2=80=99t always traverse the same path. True =E2=80=A6 but then, =
what about all the flows that do traverse the same bottleneck (to the =
same receiver, or set of receivers in a home), which is usually at the =
edge? That bottleneck may often be the same.

	[SM] Yes, "may" is the operative word here... but the point =
about IW is that it would need to scale inversely with the expected flow =
length (in packets or bytes): for a long-running bulk flow the current =
start-up should be just in the noise regarding completion time, while =
for a very short flow changing IW will be quite noticeable. But we =
often do not know this about flows in advance, no? Not sure that the =
optimal IW is actually determined by the bottleneck (I really do not =
know and have not thought about this so I might well be wrong here).
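
As a back-of-the-envelope illustration of why flow length matters
here, a small Python sketch of idealized slow start: no loss, no
pacing, no delayed ACKs, no capacity limit, and the handshake is not
counted. The flow sizes and the IW of 40 are arbitrary example values.

```python
def slow_start_rounds(segments, cwnd):
    """Round trips needed to send `segments`, doubling cwnd per RTT."""
    if segments < 1:
        return 0
    return 1 + slow_start_rounds(segments - cwnd, 2 * cwnd)


# flow size in segments, rounds with IW 10, rounds with IW 40
for size in (10, 30, 100, 10000):
    print(size, slow_start_rounds(size, 10), slow_start_rounds(size, 40))
```

In this toy model the 30 segment flow halves its transfer rounds with
the larger IW, while the 10000 segment flow saves two rounds out of
ten, and a real bulk flow would long since be limited by path capacity
rather than by slow start.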

> Now, if we just had an in-network device that could divide the path =
into a =E2=80=9Ccore=E2=80=9D segment where it=E2=80=99s safe to use a =
pretty large IW value, and a downstream segment where the IW value may =
need be smaller, but a certain workable range might be known to the =
device, because that devices sits right at the edge=E2=80=A6

	[SM] This seems to be problematic if end-to-end encryption is =
desired, no? But essentially this also seems to be implemented already, =
except that we call these things CDNs instead of proxies ;) (kidding!)

>>>> The max(pat_queue%) idea has been kicking around in my head ever =
since reading a paper about storing queue occupancy into packets to help =
CC along (sorry, do not recall the authors or the title right now) so =
that is not even my own original idea, but simply something I borrowed =
from smarter engineers simply because I found the data convincing and =
the theory sane. (Also because I grudgingly accept that latency =
increases measured over the internet are a tad too noisy to be easily =
useful* and too noisy for a meaningful controller based on the latency =
rate of change**)
>>>>=20
>>>>>> But alas we seem to go the path of a relative dumb 1 bit signal =
giving us an under-defined queue filling state instead and to estimate =
relative queue filling dynamics from that we need many samples (so =
literally too little too late, or L3T2), but I digress.
>>>>>=20
>>>>> Yeah you do :-)
>>>>=20
>>>> 	[SM] Less than you let on ;). If L4S gets ratified
>>>=20
>>> [snip]
>>>=20
>>> I=E2=80=99m really not interested in an L4S debate.
>>=20
>> 	[SM] I understand, however I see clear reasons why L4S is =
detrimental to your stated goals as it will make getting more information =
from the network less likely. I also tried to explain, why I believe =
that to be a theoretically viable way forwards to improve slow-start =
dynamics. Maybe show why my proposal is bunk while completely ignoring =
L4S? Or is that the kind of "particular solution" you do not want to =
discuss at the current stage?
>=20
> I=E2=80=99d say the latter. We could spend weeks of time and tonds of =
emails discussing explicit-feedback based schemes=E2=80=A6  instead, if =
you think your idea is good, why not build it, test it, and evaluate its =
trade-offs?

	[SM] In all honesty, because my day-job is in a pretty different =
field and I do not believe I can or even would want to perform =
publishable CS research after hours (let alone find a journal accepting =
layman submissions without any relevant affiliation).
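
For what it is worth, the per-hop rule behind the max queue-fill field
discussed earlier in the thread fits in a few lines. A hypothetical
Python sketch; the names are mine, the fill percentages are invented,
and no existing protocol header is implied.

```python
from functools import reduce


def stamp(field, queue_fill_percent):
    """Value one hop writes back: the larger of field and own fill."""
    return max(field, queue_fill_percent)


def path_signal(hop_fill_percents):
    """Field as seen by the receiver, starting from 0 at the sender."""
    return reduce(stamp, hop_fill_percents, 0)


# Three hops at 5, 40 and 12 percent queue fill.
print(path_signal([5, 40, 12]))
```

The end point only ever learns about the fullest queue on the path,
which is the one that matters for deciding how fast to grow.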

> I don=E2=80=99t see L4S as being *detrimental* to our stated goals, =
BTW - but, as it stands, I see limitations in its usefulness because TCP =
Prague (AFAIK) only changes Congestion Avoidance, at least up to now. =
I=E2=80=99m getting the impression that Congestion Avoidance with a =
greedy sender is a rare animal.

	[SM] Erm, DASH-type traffic seems quite common, no? There the =
individual segments transmitted can be large enough to reach (close to) =
capacity?


> Non-greedy (i.e., the sender takes a break) is a different thing again =
- various implementations exist, as do proposals for how to handle this =
=E2=80=A6 a flow with pauses is not too different from multiple =
consecutive short flows. Well, it always uses the same 5-tuple, which =
makes caching strategies more likely to succeed.

	[SM] Not having looked at that, but I would have assumed that =
TCP will not start from scratch if there was a short break in =
transmission due to the sender having nothing to send for a while? =
However, my playing around with shaper tracking on variable rate links =
like LTE or Starlink (or WiFi with mobile clients in motion) indicates =
that cached state should probably be kept only for a sufficiently short =
time, so it will not result in too much traffic when the path capacity =
has dropped during the rest period. Always starting from a low =
baseline, as TCP does, is =
clearly the conservative and prudent thing to do, unless one values =
reducing e.g. average completion time over say a high quantile.


Regards
	Sebastian


>> Anyway, thanks for your time. I fear I have made my points in the =
last mail already and am mostly repeating myself, so I would not feel =
offended in any way if you let this sub-discussion sleep and wait for =
more topical discussion entries.=20
>>=20
>>=20
>> Regards
>> 	Sebastian
>=20
> Cheers,
> Michael