From: Sebastian Moeller
Date: Tue, 12 Jul 2022 11:47:39 +0200
To: Michael Welzl
Cc: Dave Täht, bloat
Subject: Re: [Bloat] [iccrg] Musings on the future of Internet Congestion Control
Message-Id: <32A9050A-B81F-4838-BE7F-691F0670DB84@gmx.de>
List-Id: General list for discussing Bufferbloat

Hi Michael,

> On Jul 11, 2022, at 10:49, Michael Welzl wrote:
>
> Hi !
>
> A few answers below -
>
>> On Jul 11, 2022, at 9:33 AM, Sebastian Moeller wrote:
>>
>> Hi Michael,
>>
>>> On Jul 11, 2022, at 08:24, Michael Welzl wrote:
>>>
>>> Hi Sebastian,
>>>
>>> Neither our paper nor me are advocating one particular solution - we point at a problem and suggest that research on ways to solve the under-utilization problem might be worthwhile.
>>
>> [SM2] That is easy to agree upon, as is agreeing on improving slow start and trying to reduce underutilization, but actually doing it is hard; personally I am more interested in the hard part, so I might have misunderstood the gist of the discussion you want to start with that publication.
>
> What you’re doing is jumping ahead. I suggest doing this with research rather than an email discussion, but that’s what we’re now already into.

[SM] Sure, except my day job is in a completely different field (wetware instead of hard- or software), so it is very unlikely that I will be able to contribute meaningfully to that kind of research. I am not saying all I can contribute is just talk; I helped out with a few obscure details in the bufferbloat movement (better: anti-bufferbloat movement) and I did contribute a bit to sqm and the earlier cited autorate approach (which is topical because it is concerned with capacity detection), but I will not be conducting nor publishing any CS-grade paper on CC and/or slow start. Maybe a reason for me to respectfully bow out of this discussion as talk is cheap and easy to come by even without me helping.

>>> Jumping from this to discussing the pros and cons of a potential concrete solution is quite a leap…
>>>
>>> More below:
>>>
>>>> On Jul 10, 2022, at 11:29 PM, Sebastian Moeller wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>>> On Jul 10, 2022, at 22:01, Michael Welzl wrote:
>>>>>
>>>>> Hi !
>>>>>
>>>>>> On Jul 10, 2022, at 7:27 PM, Sebastian Moeller wrote:
>>>>>>
>>>>>> Hi Michael,
>>>>>>
>>>>>> so I reread your paper and stewed a bit on it.
>>>>>
>>>>> Many thanks for doing that! :)
>>>>>
>>>>>> I believe that I do not buy some of your premises.
>>>>>
>>>>> you say so, but I don’t really see much disagreement here. Let’s see:
>>>>>
>>>>>> e.g. you write:
>>>>>>
>>>>>> "We will now examine two factors that make the present situation particularly worrisome. First, the way the infrastructure has been evolving gives TCP an increasingly large operational space in which it does not see any feedback at all. Second, most TCP connections are extremely short. As a result, it is quite rare for a TCP connection to even see a single congestion notification during its lifetime."
>>>>>>
>>>>>> And seem to see a problem that flows might be able to finish their data transfer business while still in slow start. I see the same data, but see no problem. Unless we have an oracle that tells each sender (over a shared bottleneck) exactly how much to send at any given time point, different control loops will interact on those intermediary nodes.
>>>>>
>>>>> You really say that you don’t see the solution. The problem is that capacities are underutilized, which means that flows take longer (sometimes, much longer!) to finish than they theoretically could, if we had a better solution.
>>>>
>>>> [SM] No, IMHO the underutilization is the direct consequence of requiring a gradual filling of the "pipes" to probe the available capacity. I see no way how this could be done differently with the traffic sources/sinks being uncoordinated entities at the edge, and I see no way of coordinating all end points and handling all paths. In other words, we can fine tune a few parameters to tweak the probing a bit, make it more or less aggressive/fast, but the fact that we need to probe capacity somehow means underutilization can not be avoided unless we find a way of coordinating all of the sinks and sources. But being sufficiently dumb, all I can come up with is an all-knowing oracle or faster than light communication, and neither strikes me as realistic ;)
>>>
>>> There’s quite a spectrum of possibilities between an oracle or “coordinating all of the sinks and sources” on one hand, and quite “blindly” probing from a constant IW on the other.
>>
>> [SM] You say "blindly", I say "starting from a conservative but reliable prior"... And what I see is that qualitatively significantly better approaches are not really possible, so we need to discuss small quantitative changes.
>
> More about the term “blind” below:
>
>>> The “fine tuning” that you mention is interesting research, IMO!
>>
>> [SM] To me, the paper did not read as if you were soliciting ideas for small gradual improvements.
>
> It calls for being drastic in the way we think about things, because it makes the argument that PEPs (different kinds of them!) might in fact be the right approach - but it doesn’t say that “only drastic solutions are good solutions”. Our “The Way Forward” section has 3 subsections; one of them is on end-to-end approaches, where we call out the RL-IW approach I mention below as one good way ahead. I would categorize this as “small and gradual”.

[SM] Having contact with machine learning and reinforcement learning in my day job, I am at best cautiously optimistic for such a solution to end up with an improvement across the board, but again happy to be shown wrong. I prefer solutions with algorithms that are easier to interpret than what machine learning (especially with deep networks) ends up with.

>
>>>>>> I might be limited in my depth of thought here, but having each flow probing for capacity seems exactly the right approach... and doubling CWND or rate every RTT is pretty aggressive already (making slow start shorter by reaching capacity faster within the slow-start framework requires either starting with a higher initial value (what increasing IW tries to achieve?) or using a larger increase factor than 2 per RTT). I consider increased IW a milder approach than the alternative. And once one accepts that a gradual rate increase is the way forward it falls out logically that some flows will finish before they reach steady state capacity, especially if that flow's available capacity is large. So what exactly is the problem with short flows not reaching capacity and what alternative exists that does not lead to carnage if more-aggressive start-up phases drive the bottleneck load into emergency drop territory?
>>>>>
>>>>> There are various ways to do this
>>>
>>> [snip: a couple of concrete suggestions from me, and answers about what problems they might have, with requests for references from you]
>>>
>>> I’m sorry, but I wasn’t really going to have a discussion about these particular possibilities. My point was only that many possible directions exist - being completely “blind” isn’t the only possible approach.
>>
>> [SM] Again I do not consider "blind" to be an appropriate qualification here.
>
> IW is a global constant (not truly configured the same everywhere, most probably for good reason!

[SM] Arguable how "good" these reasons are. It is already noticeable that a number of CDNs use higher than standard IWs and extract more than their expected fair share of a link's capacity. That is certainly in the interest of the CDN and the CDN's paying customers, but not necessarily in the interest of the end-user accessing the data on the CDN. (Personally I do not care too much about this, since I use an fq-scheduler at my ingress and egress, so high IW flows finish faster if spare capacity is available and only cause queue build up for themselves if there is no spare capacity).

> but the standard suggests a globally unique value).

[SM] One thing I learned about IETF standards in the last years is that not all of them are of the same quality, so I will look carefully at each standard before accepting its recommendation as "gospel". (IMHO the IETF process is very welcome but assumes good faith all around, and hence seems easily gamed).

> From then on, the cwnd is doubled a couple of times. No feedback about the path’s capacity exists - and then, the connection is over.

[SM] I disagree, the flow very much knows that the reached CWND/sending rate was <= the capacity available for that flow, that is feedback IMHO. If you are a gambling type you could try to speculatively re-use that information.

> Okay, there is ONE thing that such a flow gets: the RTT. “Blind except for RTT measurements”, then.

[SM] I guess your point is "flow does not know the maximal capacity it could have gotten"?

> Importantly, such a flow never learns how large its cwnd *could* have become without ever causing a problem. Perhaps 10 times more? 100 times?

[SM] Sure. ATM the only way to learn a path's capacity is actually to saturate the path*, but if a flow is done with its data transfer, having it exchange dummy data just to probe capacity seems like a waste all around. I guess what I want to ask is, how would knowing how much untapped capacity was available at one point help?
*) stuff like deducing capacity from packet pair interval at the receiver (assuming packets sent back to back) is notoriously imprecise, so unless "chirping" overcomes that imprecision without costing too many round trips worth of noise suppression, measuring capacity by causing congestion is the only way. Not great.

>
>>> Instead of answering your comments to my suggestions, let me give you one single concrete piece here: our reference 6, as one example of the kind of research that we consider worthwhile for the future:
>>>
>>> "X. Nie, Y. Zhao, Z. Li, G. Chen, K. Sui, J. Zhang, Z. Ye, and D. Pei, “Dynamic TCP initial windows and congestion control schemes through reinforcement learning,” IEEE JSAC, vol. 37, no. 6, 2019.”
>>> https://1989chenguo.github.io/Publications/TCP-RL-JSAC19.pdf
>>
>> [SM] From the title I predict that this is going to lean into the "cache" idea trying to improve the average hit rate of said cache...
>>
>>> This work learns a useful value of IW over time, rather than using a constant. One author works at Baidu, the paper uses data from Baidu, and it says:
>>> "TCP-RL has been deployed in one of the top global search engines for more than a year. Our online and testbed experiments show that for short flow transmission, compared with the common practice of IW = 10, TCP-RL can reduce the average transmission time by 23% to 29%.”
>>>
>>> - so it’s probably fair to assume that this was (and perhaps still is) active in Baidu.
>>
>> [SM] This seems to confirm my prediction... however the paper seems to be written pretty exclusively from the view of an operator of server farms, not sure this approach will actually do any good for leaf end-points in e.g. home networks (that is for their sending behavior). I tend to prefer symmetric solutions, but if data center traffic can reach higher utilization without compromising end-user quality of experience and fairness, what is not to like about this. It is however fully within the existing slow-start framework, no?
>
> Yes!
>
>>>>>> And as an aside, a PEP (performance enhancing proxy) that does not enhance performance is useless at best and likely harmful (rather a PDP, performance degrading proxy).
>>>>>
>>>>> You’ve made it sound worse by changing the term, for whatever that’s worth. If they never help, why has anyone ever called them PEPs in the first place?
>>>>
>>>> [SM] I would guess because "marketing" was unhappy with "engineering" emphasizing the side-effects/potential problems and focused on the best-case scenario? ;)
>>>
>>> It appears that you want to just ill-talk PEPs.
>>
>> [SM] Not really, I just wanted to point out that I expect the term PEP to come from entities selling those products, and in our current environment it is clear that products are named and promoted emphasizing the potential benefit they can bring and not the additional risks they might carry (e.g. fission power plants were sold on the idea of essentially unlimited cheap emission free energy, and not on the concurrent problem with waste disposal over time frames in the order of the aggregate human civilisation from the bronze age). I have no beef with that, but I do not think that taking the "positive" name as a sign that PEPs are generally liked or live up to their name is justified (note I am also not saying that they do not, just that the name PEP is a rather unreliable predictor here).
>
> I don’t even think that this name has that kind of history. My point was that they’re called PEPs because they’re *meant* to improve performance;

[SM] That is not really how our economic system works... products are primarily intended to generate more revenue than cost, it helps if they offer something to the customer, but that is really just a means to extract revenue...

> that’s what they’re designed for. You describe “a PEP that does not enhance performance”, which, to me, is like talking about a web server that doesn’t serve web pages. Sure, not all PEPs may always work well, but they should - that’s their raison d’être.

[SM] That is a very optimistic view, which I would love to be able to share.

>
>>> There are plenty of useful things that they can do and yes, I personally think they’re the way of the future - but **not** in their current form, where they must “lie” to TCP, cause ossification,
>>
>> [SM] Here I happily agree, if we can get the negative side-effects removed that would be great, however is that actually feasible or just desirable?
>>
>>> etc. PEPs have never been considered as part of the congestion control design - when they came on the scene, in the IETF, they were despised for breaking the architecture, and then all the trouble with how they need to play tricks was discovered (spoofing IP addresses, making assumptions about header fields, and whatnot). That doesn’t mean that a very different kind of PEP - one which is authenticated and speaks an agreed-upon protocol - couldn’t be a good solution.
>>
>> [SM] Again, I agree it could in theory, especially if well-architected.
>
> That’s what I’m advocating.

[SM] Well, can you give an example of an existing well-architected PEP as proof of principle?

>
>>> You’re bound to ask me for concrete things next, and if I give you something concrete (e.g., a paper on PEPs), you’ll find something bad about it
>>
>> [SM] Them are the rules of the game... however if we should play the game that way, I will come out of it having learned something new and potentially changing my opinion.
>>
>>> - but this is not a constructive direction of this conversation. Please note that I’m not saying “PEPs are always good”: I only say that, in my personal opinion, they’re a worthwhile direction of future research. That’s a very different statement.
>>
>> [SM] Fair enough. I am less optimistic, but happy to be disappointed in my pessimism.
>>
>>>>> Why do people buy these boxes?
>>>>
>>>> [SM] Because e.g. for GEO links, latency is in a range where default unadulterated TCP will likely choke on itself, and when faced with requiring customers to change/tune TCPs or having "PEP" fudge it, ease of use of fudging won the day. That is a generous explanation (as this fudging is beneficial to both the operator and most end-users), I can come up with less charitable theories if you want ;) .
>>>>
>>>>>> The network so far has been doing reasonably well with putting more protocol smarts at the ends than in the parts in between.
>>>>>
>>>>> Truth is, PEPs are used a lot: at cellular edges, at satellite links… because the network is *not* always doing reasonably well without them.
>>>>
>>>> [SM] Fair enough, I accept that there are use cases for those, but again, only if they actually enhance the "experience" will users be happy to accept them.
>>>
>>> … and that’s the only reason to deploy them, given that (as the name suggests) they’re meant to increase performance. I’d be happy to learn more about why you appear to hate them so much (even just anecdotal).
>>>
>>>> The goals of the operators and the paying customers are not always aligned here, a PEP might be advantageous more to the operator than the end-user (theoretically also the other direction, but since operators pay for PEPs they are unlikely to deploy those), think mandatory image recompression or forced video quality downscaling.... (and sure these are not as clear as I pitched them, if after an emergency a PEP allows most/all users in a cell to still send somewhat degraded images that is better than the network choking itself with a few high quality images, assuming images from the emergency are somewhat useful).
>>>
>>> What is this, are you inventing a (to me, frankly, strange) scenario where PEPs do some evil for customers yet help operators,
>>
>> [SM] This is no invention, but how capitalism works, sorry. The party paying for the PEP decides on using it based on the advantages it offers for them. E.g. a mobile carrier that (in the past) forcibly downgraded the quality of streaming video over mobile links without giving the paying end-user an option to use either choppy high resolution or smooth low resolution video. By the way, that does not make the operator evil, it is just that the operator's and the paying customers' goals and desires are not all that well aligned (e.g. the operator wants to maximize revenue, the customer to minimize cost).
>
> You claim that these goals and desires are not well aligned (and a PEP is then an instrument in this evil)

[SM] Again this is expected behavior in our economic system, I have not and will not classify that as "evil", but I will also not start believing that companies offer products just to get a warm and fuzzy feeling. It is part of the principle of how a market economy works that the goals of the entities involved are opposite of each other, that is how a market is supposed to optimize resource allocation.
> - do you have any proof, or even anecdotes, to support that claim?

[SM] The claim that sellers want the highest revenue/cost ratio while buyers want the lowest cost/utility seems hardly controversial or requiring a citation.

> I would think that operators generally try to make their customers happy (or they would switch to different operators). Yes there may be some misalignments in incentives, but I believe that these are more subtle points. E.g., who wants a choppy high resolution video? Do such users really exist?

[SM] I might be able to buffer that choppy video well enough to allow smooth playback at the desired higher resolution/quality (or I might be happy with a few seconds to compare quality of displays); given that I essentially buy internet access from my mobile carrier, that carrier should get out of my way. (However if the carrier also offers "video-optimization" as an opt-in feature end-users can toggle at will, that is a different kettle of fish and something I would consider good service). IIRC a German carrier was simply forcing down quality for all video streaming at all times, mostly to minimize cost and bandwidth usage, which pretty much looks like an exercise to minimize operational cost and not to increase customer satisfaction. So yes there are "misalignments in incentives" that are inherent and structural to the way we organize our society. (I am sort of okay living with that, but I will not sugar coat it).

>>> or is there an anecdote here?
>>
>> [SM] I think the video downscaling thing actually happened in the German market, but I am not sure on the exact details, so I might misinterpret things a bit here. However the observation about alignment of goals I believe to be universally true.
>
> I’d be interested in hearing more. Was there an outcry of customers who wanted their choppy high resolution video back? :-) :-)

[SM] The ISP downscaled not only when it reduced "choppy" play but at all times to minimize bandwidth use and increase available "free cell capacity", which IMHO is an economic decision to minimize costs and not one to improve customer experience.

>>>>>> I have witnessed the arguments in the "L4S wars" about how little processing one can ask the more central network nodes to perform, e.g. flow queueing, which would solve a lot of the issues (e.g. a hyper aggressive slow-start flow would mostly hurt itself if it overshoots its capacity), seems to be a complete no-go.
>>>>>
>>>>> That’s to do with scalability, which depends on how close to the network’s edge one is.
>>>>
>>>> [SM] I have heard the alternative that it has to do with what operators of core-links request from their vendors and what features they are willing to pay for... but this is very anecdotal as I have little insight into big-iron vendors or core-link operators.
>>>>
>>>>>> I personally think what we should do is have the network supply more information to the end points to control their behavior better. E.g. if we would mandate a max_queue-fill-percentage field in a protocol header and have each node write max(current_value_of_the_field, queue-filling_percentage_of_the_current_node) in every packet, end points could estimate how close to congestion the path is (e.g. by looking at the rate of %queueing changes) and tailor their growth/shrinkage rates accordingly, both during slow-start and during congestion avoidance.
>>>>>
>>>>> That could well be one way to go. Nice if we provoked you to think!
>>>>
>>>> [SM] You mostly made me realize what the recent increases in IW actually aim to accomplish ;)
>>>
>>> That’s fine! Increasing IW is surely a part of the solution space - though I advocate doing something else (as in the example above) than just to increase the constant in a worldwide standard.
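
The max-queue-fill header idea quoted above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the field name `max_queue_fill_pct`, the `Packet` and `Node` classes, and in particular the sender-side `growth_factor` policy with its thresholds are hypothetical and not taken from any existing protocol or from the paper alluded to in the thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    # Hypothetical header field: highest queue fill (0-100 %) seen on the path so far.
    max_queue_fill_pct: float = 0.0


@dataclass
class Node:
    queue_len: int    # packets currently queued at this hop
    queue_limit: int  # queue capacity in packets

    def fill_pct(self) -> float:
        return 100.0 * self.queue_len / self.queue_limit

    def forward(self, pkt: Packet) -> Packet:
        # Each hop writes max(current field value, own queue fill percentage).
        pkt.max_queue_fill_pct = max(pkt.max_queue_fill_pct, self.fill_pct())
        return pkt


def send_over_path(path: List[Node]) -> Packet:
    """Pass one packet through every hop and return it as the receiver sees it."""
    pkt = Packet()
    for node in path:
        node.forward(pkt)
    return pkt


def growth_factor(prev_pct: float, curr_pct: float) -> float:
    """Assumed sender policy (illustrative thresholds): choose a per-RTT
    window multiplier from the echoed fill level and how fast it is rising."""
    rising = curr_pct - prev_pct
    if curr_pct >= 80.0:
        return 0.5  # close to overflow: back off
    if curr_pct >= 40.0 or rising > 10.0:
        return 1.0  # queue is building: hold the window
    return 2.0      # path mostly empty: keep doubling
```

The receiver would have to echo the field back to the sender, which the sketch leaves out; the point is only that a multi-bit "worst queue on the path" value lets the sender react to the trend of queue filling rather than to a single mark or drop.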
>>
>> [SM] Happy to agree, I am not saying I think increasing IW is something I unconditionally support, just that I see what it offers.
>>
>>>> and that current slow start seems actually better than its reputation; it solves a hard problem surprisingly well.
>>>
>>> Actually, given that the large majority of flows end somewhere in slow start, what makes you say that it solves it “well”?
>>
>> [SM] As I said, I accepted that there is no silver bullet, and hence some gradual probing with increasing CWND/rate is unavoidable which immediately implies that some flows will end before reaching capacity.
>
> You say “some” but data says “the large majority”.

[SM] Well, I am fine with that as I see no real viable alternative. Again we might be able to tweak the capacity search parameters a bit, but that the ramp up takes time is a feature not a bug (so even if we would know the immediately available capacity, we should not jump there in one fell swoop, unless we are big fans of non-dampened oscillations), this is how multiple concurrent not synchronized flows can coexist next to each other in a reasonable fashion without requiring too much smarts in the network.

>> So the fact that flows end in slow-start is not a problem but part of the solution. I see no way of ever having all flows immediately start at their "stable" long-term capacity share (something that does not exist in the first place in environments with un-correlated and unpredictable cross traffic). But short of that almost all flows will need more round trips to finish than theoretically minimally possible. I tried to make that point before, and I am not saying current slow-start is 100% perfect, but I do not expect the possible fine-tuning to get us close enough to the theoretical performance of an "oracle" solution to count as "revolutionary" improvement.
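
To put numbers on the point that short flows finish inside slow start, here is a minimal sketch that counts the round trips an idealized slow start needs to deliver a flow of a given size. It assumes no loss, no delayed ACKs, no pacing, no receiver window limit and a window that is multiplied by a fixed factor every RTT; the function name and its parameters are made up for illustration.

```python
def slow_start_rtts(flow_segments: int, iw: int = 10, growth: float = 2.0) -> int:
    """Round trips needed to send flow_segments segments when the window
    starts at iw and is multiplied by growth after every RTT (idealized)."""
    if iw < 1 or growth < 1.0:
        raise ValueError("iw must be >= 1 and growth must be >= 1.0")
    if flow_segments <= 0:
        return 0
    cwnd = float(iw)
    sent = 0
    rtts = 0
    while sent < flow_segments:
        sent += int(cwnd)  # one window's worth of segments per round trip
        cwnd *= growth     # exponential ramp-up, no feedback assumed
        rtts += 1
    return rtts
```

Under these assumptions a 100-segment flow needs 4 round trips with IW = 10, 5 with IW = 4 and 2 with IW = 40. The window reached when the flow ends (80 segments in the IW = 10 case) is only a lower bound on what the path could have carried, which is the information the thread argues about.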
>
> It doesn’t need to be revolutionary; I think that ways to learn / cache the IW are already quite useful.

[SM] Like all speculation that is a "gamble", where the risk and cost of mis-prediction need to be weighed against the advantage of getting it right. But I guess I agree that making IW more of a free parameter might be interesting and a way forward. My gut feeling tells me however we might want to look at pacing these out a bit, I have no data to back this up though.

>
> Now, you repeatedly mentioned that caching may not work because flows don’t always traverse the same path. True … but then, what about all the flows that do traverse the same bottleneck (to the same receiver, or set of receivers in a home), which is usually at the edge? That bottleneck may often be the same.

[SM] Yes, "may" is the operative word here... but the point about IW is that it would need to scale inversely with the expected flow length (in packets or bytes): for a long running bulk flow the current start-up should just be in the noise regarding completion time, for a very short flow however changing IW will be quite noticeable. But we often do not know this about flows in advance, no? Not sure that the optimal IW is actually determined by the bottleneck (I really do not know and have not thought about this so I might well be wrong here).

> Now, if we just had an in-network device that could divide the path into a “core” segment where it’s safe to use a pretty large IW value, and a downstream segment where the IW value may need to be smaller, but a certain workable range might be known to the device, because that device sits right at the edge…

[SM] This seems to be problematic if end-to-end encryption is desired, no? But essentially this also seems to be implemented already, except that we call these things CDNs instead of proxies ;) (kidding!)
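
The "learn / cache the IW" idea, together with the objection that re-using a cached value is a gamble, could be sketched as a per-destination cache whose entries expire quickly and fall back to a conservative default. Everything here is an assumption chosen for illustration: the class and method names, the default of 10 segments, the 30 second lifetime and the "use half of the cached window" hedge are not values from the thread, the cited paper or any TCP stack.

```python
import time
from typing import Callable, Dict, Tuple


class IwCache:
    """Hypothetical per-destination initial-window cache with expiry."""

    def __init__(self, default_iw: int = 10, ttl_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.default_iw = default_iw
        self.ttl_s = ttl_s
        self.clock = clock
        self._entries: Dict[str, Tuple[int, float]] = {}

    def record(self, dest: str, reached_cwnd: int) -> None:
        # Remember the window the most recent flow reached without loss.
        self._entries[dest] = (reached_cwnd, self.clock())

    def initial_window(self, dest: str) -> int:
        entry = self._entries.get(dest)
        if entry is None:
            return self.default_iw
        cwnd, stamp = entry
        if self.clock() - stamp > self.ttl_s:
            # Stale entry: path capacity may have dropped, so forget it.
            del self._entries[dest]
            return self.default_iw
        # Hedge the gamble: never go below the default, use half the cached window.
        return max(self.default_iw, cwnd // 2)
```

A quick usage example: after `cache.record("198.51.100.7", 80)` a new flow to the same destination would start with 40 segments, and once the lifetime has passed it would start from the default again. Keying by destination address is itself a simplification, since the shared bottleneck and not the address is what matters.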
>>>> The max(path_queue%) idea has been kicking around in my head ever since reading a paper about storing queue occupancy into packets to help CC along (sorry, do not recall the authors or the title right now) so that is not even my own original idea, but simply something I borrowed from smarter engineers because I found the data convincing and the theory sane. (Also because I grudgingly accept that latency increases measured over the internet are a tad too noisy to be easily useful* and too noisy for a meaningful controller based on the latency rate of change**)
>>>>
>>>>>> But alas we seem to go the path of a relatively dumb 1 bit signal giving us an under-defined queue filling state instead, and to estimate relative queue filling dynamics from that we need many samples (so literally too little too late, or L3T2), but I digress.
>>>>>
>>>>> Yeah you do :-)
>>>>
>>>> [SM] Less than you let on ;). If L4S gets ratified
>>>
>>> [snip]
>>>
>>> I’m really not interested in an L4S debate.
>>
>> [SM] I understand, however I see clear reasons why L4S is detrimental to your stated goals as it will make getting more information from the network less likely. I also tried to explain why I believe that to be a theoretically viable way forward to improve slow-start dynamics. Maybe show why my proposal is bunk while completely ignoring L4S? Or is that the kind of "particular solution" you do not want to discuss at the current stage?
>
> I’d say the latter. We could spend weeks of time and tons of emails discussing explicit-feedback based schemes… instead, if you think your idea is good, why not build it, test it, and evaluate its trade-offs?

[SM] In all honesty, because my day-job is in a pretty different field and I do not believe I can or even would want to perform publishable CS research after hours (let alone find a journal accepting layman submissions without any relevant affiliation).

> I don’t see L4S as being *detrimental* to our stated goals, BTW - but, as it stands, I see limitations in its usefulness because TCP Prague (AFAIK) only changes Congestion Avoidance, at least up to now. I’m getting the impression that Congestion Avoidance with a greedy sender is a rare animal.

[SM] Erm, DASH-type traffic seems quite common, no? There the individual segments transmitted can be large enough to reach (close to) capacity?

> Non-greedy (i.e., the sender takes a break) is a different thing again - various implementations exist, as do proposals for how to handle this … a flow with pauses is not too different from multiple consecutive short flows. Well, it always uses the same 5-tuple, which makes caching strategies more likely to succeed.

[SM] I have not looked at that, but I would have assumed that TCP will not start from scratch if there was a short break in transmission due to the sender having nothing to send for a while? However, my playing around with shaper tracking on variable rate links like LTE or starlink (or WiFi with mobile clients in motion) indicates that cached state should probably be kept only for a sufficiently short duration, so it will not result in too much traffic when the path capacity dropped in the rest period. Always starting from a low baseline, as TCP does, is clearly the conservative and prudent thing to do, unless one values reducing e.g. average completion time over say a high quantile.

Regards
	Sebastian

>> Anyway, thanks for your time. I fear I have made my points in the last mail already and am mostly repeating myself, so I would not feel offended in any way if you let this sub-discussion sleep and wait for more topical discussion entries.
>>
>> Regards
>> Sebastian
>
> Cheers,
> Michael