From: Sebastian Moeller
Date: Mon, 11 Jul 2022 09:33:15 +0200
To: Michael Welzl
Cc: Dave Täht, bloat
Subject: Re: [Bloat] [iccrg] Musings on the future of Internet Congestion Control

Hi Michael,

> On Jul 11, 2022, at 08:24, Michael Welzl wrote:
>
> Hi Sebastian,
>
> Neither our paper nor me are advocating one particular solution - we point at a problem and suggest that research on ways to solve the under-utilization problem might be worthwhile.

[SM2] That is easy to agree upon, as is agreeing on improving slow start and trying to reduce underutilization, but actually doing it is hard; personally I am more interested in the hard part, so I might have misunderstood the gist of the discussion you want to start with that publication.

> Jumping from this to discussing the pro's and con's of a potential concrete solution is quite a leap…
>
> More below:
>
>
>> On Jul 10, 2022, at 11:29 PM, Sebastian Moeller wrote:
>>
>> Hi Michael,
>>
>>
>>> On Jul 10, 2022, at 22:01, Michael Welzl wrote:
>>>
>>> Hi !
>>>
>>>
>>>> On Jul 10, 2022, at 7:27 PM, Sebastian Moeller wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> so I reread your paper and stewed a bit on it.
>>>
>>> Many thanks for doing that! :)
>>>
>>>
>>>> I believe that I do not buy some of your premises.
>>>
>>> you say so, but I don't really see much disagreement here. Let's see:
>>>
>>>
>>>> e.g. you write:
>>>>
>>>> "We will now examine two factors that make the present situation particularly worrisome. First, the way the infrastructure has been evolving gives TCP an increasingly large operational space in which it does not see any feedback at all. Second, most TCP connections are extremely short. As a result, it is quite rare for a TCP connection to even see a single congestion notification during its lifetime."
>>>>
>>>> And you seem to see a problem in flows being able to finish their data transfer business while still in slow start. I see the same data, but see no problem. Unless we have an oracle that tells each sender (over a shared bottleneck) exactly how much to send at any given time point, different control loops will interact on those intermediary nodes.
>>>
>>> You really say that you don't see the solution. The problem is that capacities are underutilized, which means that flows take longer (sometimes, much longer!) to finish than they theoretically could, if we had a better solution.
>>
>> [SM] No, IMHO the underutilization is the direct consequence of requiring a gradual filling of the "pipes" to probe the available capacity. I see no way how this could be done differently with the traffic sources/sinks being uncoordinated entities at the edge, and I see no way of coordinating all end points and handling all paths. In other words, we can fine-tune the parameters to tweak the probing a bit, make it more or less aggressive/fast, but the fact that we need to probe capacity somehow means underutilization cannot be avoided unless we find a way of coordinating all of the sinks and sources. But being sufficiently dumb, all I can come up with is an all-knowing oracle or faster-than-light communication, and neither strikes me as realistic ;)
>
> There's quite a spectrum of possibilities between an oracle or "coordinating all of the sinks and sources" on one hand, and quite "blindly" probing from a constant IW on the other.

[SM] You say "blindly", I say "starting from a conservative but reliable prior"... And what I see is that qualitatively significantly better approaches are not really possible, so we need to discuss small quantitative changes.
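To put a rough number on what gradual probing costs, here is a back-of-envelope sketch (plain Python; the link rate, RTT and MSS are assumptions picked purely for illustration). It counts how many RTTs classic slow start needs from IW = 10 until the congestion window covers the bottleneck BDP, and how much of the capacity is left unused on the way there:

# Back-of-envelope: the cost of probing from IW = 10 by doubling per RTT.
# All numbers below are illustrative assumptions, not measurements.
MSS = 1448          # payload bytes per segment (typical for a 1500-byte MTU)
IW = 10             # initial window in segments (RFC 6928)
RTT = 0.05          # seconds; a 50 ms path, assumed
CAPACITY = 100e6    # bottleneck rate in bit/s, assumed

bdp_segments = CAPACITY * RTT / (8 * MSS)   # segments in flight at full rate

cwnd, rtts, sent = IW, 0, 0
while cwnd < bdp_segments:
    sent += cwnd        # segments delivered during this round
    cwnd *= 2           # classic slow start: double every RTT
    rtts += 1

ideal = bdp_segments * rtts                 # what a clairvoyant sender could send
print(f"BDP: {bdp_segments:.0f} segments")
print(f"RTTs until the window covers the BDP: {rtts}")
print(f"Delivered meanwhile: {sent} of {ideal:.0f} segments "
      f"({100 * sent / ideal:.0f}% utilization)")

With the numbers assumed above this comes out at 6 RTTs and roughly a quarter of the volume a clairvoyant sender could have moved in the same time - and any scheme that still probes, just with different constants, merely shifts these numbers around rather than removing the underutilization.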
> The "fine tuning" that you mention is interesting research, IMO!

[SM] The paper did not read to me as if you were soliciting ideas for small gradual improvements.

>>>> I might be limited in my depth of thought here, but having each flow probe for capacity seems exactly the right approach... and doubling CWND or rate every RTT is pretty aggressive already (making slow start shorter by reaching capacity faster within the slow-start framework requires either starting with a higher initial value (what increasing IW tries to achieve?) or using a larger increase factor than 2 per RTT). I consider increased IW a milder approach than the alternative. And once one accepts that a gradual rate increase is the way forward, it falls out logically that some flows will finish before they reach steady-state capacity, especially if a flow's available capacity is large. So what exactly is the problem with short flows not reaching capacity, and what alternative exists that does not lead to carnage if more-aggressive start-up phases drive the bottleneck load into emergency drop territory?
>>>
>>> There are various ways to do this
>
> [snip: a couple of concrete suggestions from me, and answers about what problems they might have, with requests for references from you]
>
> I'm sorry, but I wasn't really going to have a discussion about these particular possibilities. My point was only that many possible directions exist - being completely "blind" isn't the only possible approach.

[SM] Again, I do not consider "blind" to be an appropriate qualification here.

> Instead of answering your comments to my suggestions, let me give you one single concrete piece here: our reference 6, as one example of the kind of research that we consider worthwhile for the future:
>
> "X. Nie, Y. Zhao, Z. Li, G. Chen, K. Sui, J. Zhang, Z. Ye, and D. Pei, "Dynamic TCP initial windows and congestion control schemes through reinforcement learning," IEEE JSAC, vol. 37, no. 6, 2019."
> https://1989chenguo.github.io/Publications/TCP-RL-JSAC19.pdf

[SM] From the title I predict that this is going to lean into the "cache" idea, trying to improve the average hit rate of said cache...

> This work learns a useful value of IW over time, rather than using a constant. One author works at Baidu, the paper uses data from Baidu, and it says:
> "TCP-RL has been deployed in one of the top global search engines for more than a year. Our online and testbed experiments show that for short flow transmission, compared with the common practice of IW = 10, TCP-RL can reduce the average transmission time by 23% to 29%."
>
> - so it's probably fair to assume that this was (and perhaps still is) active in Baidu.

[SM] This seems to confirm my prediction... however, the paper seems to be written pretty exclusively from the view of an operator of server farms; I am not sure this approach will do any good for leaf end-points in e.g. home networks (that is, for their sending behavior). I tend to prefer symmetric solutions, but if data-center traffic can reach higher utilization without compromising end-user quality of experience and fairness, what is not to like about it? It is, however, fully within the existing slow-start framework, no?
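To make explicit what I mean by the "cache" idea, here is a toy sketch of my own (Python; the /24 keying, the EWMA and the clamp are all invented for illustration, and this is not the TCP-RL scheme from the paper, which learns per-client-group policies with reinforcement learning): a server-side hint table that seeds IW from what recent flows towards the same client prefix managed to deliver, clamped to something safe.

# Toy per-prefix IW cache: remember roughly what recent flows to the same /24
# could deliver early on, and seed the next connection's IW from that.
# Purely illustrative; not the algorithm from the TCP-RL paper.
from collections import defaultdict
from ipaddress import ip_network

DEFAULT_IW = 10      # segments; the usual constant
IW_CAP = 100         # never seed more aggressively than this (safety clamp)
ALPHA = 0.3          # EWMA weight for new observations

class IWCache:
    def __init__(self):
        self.hint = defaultdict(lambda: float(DEFAULT_IW))   # keyed by /24 prefix

    def _key(self, client_ip: str) -> str:
        return str(ip_network(f"{client_ip}/24", strict=False))

    def initial_window(self, client_ip: str) -> int:
        # Seed from the cached estimate, but stay within [DEFAULT_IW, IW_CAP].
        return int(min(max(self.hint[self._key(client_ip)], DEFAULT_IW), IW_CAP))

    def record(self, client_ip: str, delivered_first_rtt: float) -> None:
        # Blend the newly observed first-RTT delivery (in segments) into the estimate.
        k = self._key(client_ip)
        self.hint[k] = (1 - ALPHA) * self.hint[k] + ALPHA * delivered_first_rtt

cache = IWCache()
cache.record("203.0.113.7", 40)               # an earlier flow got 40 segments through
print(cache.initial_window("203.0.113.42"))   # same /24, so seeded above the default 10

Whether something like this helps obviously hinges on the hit rate and on how stable per-prefix capacity is over time, which is exactly the part such "cache" schemes have to be clever about - and, as said, it stays entirely within the slow-start framework.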
>
>
>>>> And as an aside, a PEP (performance enhancing proxy) that does not enhance performance is useless at best and likely harmful (rather a PDP, performance degrading proxy).
>>>
>>> You've made it sound worse by changing the term, for whatever that's worth. If they never help, why has anyone ever called them PEPs in the first place?
>>
>> [SM] I would guess because "marketing" was unhappy with "engineering" emphasizing the side-effects/potential problems, and focused on the best-case scenario instead? ;)
>
> It appears that you want to just ill-talk PEPs.

[SM] Not really. I just wanted to point out that I expect the term PEP to come from the entities selling those products, and in our current environment it is clear that products are named and promoted by emphasizing the potential benefit they can bring, not the additional risks they might carry (e.g. fission power plants were sold on the idea of essentially unlimited cheap emission-free energy, and not on the concurrent problem of waste disposal over time frames on the order of the whole of human civilisation since the bronze age). I have no beef with that, but I do not think we can take the "positive" name as a sign that PEPs are generally liked or live up to their name (note I am also not saying that they do not, just that the name PEP is a rather unreliable predictor here).

> There are plenty of useful things that they can do and yes, I personally think they're the way of the future - but **not** in their current form, where they must "lie" to TCP, cause ossification,

[SM] Here I happily agree: if we can get the negative side-effects removed that would be great; however, is that actually feasible or just desirable?

> etc. PEPs have never been considered as part of the congestion control design - when they came on the scene, in the IETF, they were despised for breaking the architecture, and then all the trouble with how they need to play tricks was discovered (spoofing IP addresses, making assumptions about header fields, and whatnot). That doesn't mean that a very different kind of PEP - one which is authenticated and speaks an agreed-upon protocol - couldn't be a good solution.

[SM] Again, I agree it could in theory, especially if well-architected.

> You're bound to ask me for concrete things next, and if I give you something concrete (e.g., a paper on PEPs), you'll find something bad about it

[SM] Them's the rules of the game... however, if we play the game that way, I will come out of it having learned something new and potentially changing my opinion.

> - but this is not a constructive direction of this conversation. Please note that I'm not saying "PEPs are always good": I only say that, in my personal opinion, they're a worthwhile direction of future research. That's a very different statement.

[SM] Fair enough. I am less optimistic, but happy to be disappointed in my pessimism.

>
>>> Why do people buy these boxes?
>>
>> [SM] Because e.g. for GEO links, latency is in a range where default, unadulterated TCP will likely choke on itself, and when faced with requiring customers to change/tune their TCPs or having a "PEP" fudge it, ease of use of fudging won the day. That is a generous explanation (as this fudging is beneficial to both the operator and most end-users); I can come up with less charitable theories if you want ;).
>>
>>>> The network so far has been doing reasonably well with putting more protocol smarts at the ends than in the parts in between.
>>>
>>> Truth is, PEPs are used a lot: at cellular edges, at satellite links… because the network is *not* always doing reasonably well without them.
>>
>> [SM] Fair enough, I accept that there are use cases for those, but again, only if they actually enhance the "experience" will users be happy to accept them.
>
> … and that's the only reason to deploy them, given that (as the name suggests) they're meant to increase performance. I'd be happy to learn more about why you appear to hate them so much (even just anecdotally).
>
>> The goals of the operators and the paying customers are not always aligned here; a PEP might be advantageous more to the operator than to the end-user (theoretically also the other direction, but since operators pay for PEPs they are unlikely to deploy those). Think mandatory image recompression or forced video quality downscaling... (and sure, these are not as clear-cut as I pitched them; if after an emergency a PEP allows most/all users in a cell to still send somewhat degraded images, that is better than the network choking itself on a few high-quality images, assuming images from the emergency are somewhat useful).
>
> What is this, are you inventing a (to me, frankly, strange) scenario where PEPs do some evil for customers yet help operators,

[SM] This is no invention, but how capitalism works, sorry. The party paying for the PEP decides on using it based on the advantages it offers to them. E.g. a mobile carrier that (in the past) forcibly downgraded the quality of streaming video over mobile links without giving the paying end-user the option to choose between choppy high-resolution and smooth low-resolution video. By the way, that does not make the operator evil; it is just that the operator's and the paying customer's goals and desires are not all that well aligned (e.g. the operator wants to maximize revenue, the customer to minimize cost).

> or is there an anecdote here?

[SM] I think the video downscaling thing actually happened in the German market, but I am not sure of the exact details, so I might misinterpret things a bit here. However, the observation about alignment of goals I believe to be universally true.

>
>>>> I have witnessed the arguments in the "L4S wars" about how little processing one can ask the more central network nodes to perform; e.g. flow queueing, which would solve a lot of the issues (a hyper-aggressive slow-start flow would mostly hurt itself if it overshoots its capacity), seems to be a complete no-go.
>>>
>>> That's to do with scalability, which depends on how close to the network's edge one is.
>>
>> [SM] I have heard the alternative explanation that it has to do with what operators of core links request from their vendors and what features they are willing to pay for... but this is very anecdotal, as I have little insight into big-iron vendors or core-link operators.
>>
>>>> I personally think what we should do is have the network supply more information to the end points to control their behavior better. E.g. if we would mandate a max_queue-fill-percentage field in a protocol header and have each node write max(current_value_of_the_field, queue-filling_percentage_of_the_current_node) into every packet, end points could estimate how close to congestion the path is (e.g. by looking at the rate of change of the queueing percentage) and tailor their growth/shrinkage rates accordingly, both during slow start and during congestion avoidance.
>>>
>>> That could well be one way to go. Nice if we provoked you to think!
>>
>> [SM] You mostly made me realize what the recent increases in IW actually aim to accomplish ;)
>
> That's fine! Increasing IW is surely a part of the solution space - though I advocate doing something else (as in the example above) than just increasing the constant in a worldwide standard.

[SM] Happy to agree; I am not saying that increasing IW is something I unconditionally support, just that I see what it offers.

>> and that current slow start seems actually better than its reputation; it solves a hard problem surprisingly well.
>
> Actually, given that the large majority of flows end somewhere in slow start, what makes you say that it solves it "well"?

[SM] As I said, I accept that there is no silver bullet, and hence some gradual probing with increasing CWND/rate is unavoidable, which immediately implies that some flows will end before reaching capacity. So the fact that flows end in slow start is not a problem but part of the solution. I see no way of ever having all flows immediately start at their "stable" long-term capacity share (something that does not even exist in the first place in environments with uncorrelated and unpredictable cross traffic). But short of that, almost all flows will need more round trips to finish than the theoretical minimum. I tried to make that point before; I am not saying current slow start is 100% perfect, but I do not expect the possible fine-tuning to get us close enough to the theoretical performance of an "oracle" solution to count as a "revolutionary" improvement.

>> The max(path_queue%) idea has been kicking around in my head ever since reading a paper about storing queue occupancy in packets to help CC along (sorry, I do not recall the authors or the title right now), so it is not even my own original idea, but simply something I borrowed from smarter engineers because I found the data convincing and the theory sane. (Also because I grudgingly accept that latency increases measured over the internet are a tad too noisy to be easily useful* and too noisy for a meaningful controller based on the latency rate of change**.)
>>
>>>> But alas we seem to go down the path of a relatively dumb 1-bit signal giving us an under-defined queue-filling state instead, and to estimate relative queue-filling dynamics from that we need many samples (so literally too little too late, or L3T2), but I digress.
>>>
>>> Yeah you do :-)
>>
>> [SM] Less than you let on ;). If L4S gets ratified
>
> [snip]
>
> I'm really not interested in an L4S debate.

[SM] I understand; however, I see clear reasons why L4S is detrimental to your stated goals, as it will make getting more information from the network less likely. I also tried to explain why I believe the max_queue% field to be a theoretically viable way forward to improve slow-start dynamics. Maybe show why my proposal is bunk while completely ignoring L4S? Or is that the kind of "particular solution" you do not want to discuss at the current stage?
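To be concrete about the proposal itself, here is roughly the mechanism I have in mind, as a toy sketch in Python (the 0-100 percentage unit, the per-RTT sampling and the "ease off" threshold are all invented for illustration; this is not a wire-format proposal): every node stamps the maximum of the field and its own queue fill into the packet, and the endpoint watches how fast that path maximum grows to decide whether to keep ramping up.

# Toy model of a "max queue-fill percentage" header field.
# Every hop writes max(field, its own queue fill in %); the endpoint then
# looks at the rate of change of that maximum across consecutive RTTs.
# Field size, units and the threshold below are illustrative assumptions.

def stamp_hop(field: int, local_queue_fill_pct: int) -> int:
    """What each node on the path would do to the field (clamped to 0..100)."""
    return max(field, min(local_queue_fill_pct, 100))

def path_max_fill(per_hop_fill_pct: list) -> int:
    """Field value after traversing all hops, starting from 0 at the sender."""
    field = 0
    for fill in per_hop_fill_pct:
        field = stamp_hop(field, fill)
    return field

def growth_decision(samples: list, ramp_threshold: float = 5.0) -> str:
    """Crude endpoint heuristic: if the path's max queue fill grows faster than
    ramp_threshold percentage points per RTT, stop doubling the window."""
    if len(samples) < 2:
        return "keep doubling"
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    return "ease off" if slope > ramp_threshold else "keep doubling"

# One sample per RTT over a path with three queues whose fill grows over time:
rtt_samples = [path_max_fill(hops) for hops in
               [[2, 5, 1], [3, 12, 1], [4, 30, 2], [5, 55, 2]]]
print(rtt_samples)                    # [5, 12, 30, 55]
print(growth_decision(rtt_samples))   # ease off

The point is only that a monotone per-path maximum is cheap for nodes to maintain and gives endpoints a gradient to react to before queues overflow, which a single mark bit per packet cannot provide without many samples.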
Anyway, thanks for your time. I fear I have made my points in the last mail already and am mostly repeating myself, so I would not feel offended in any way if you let this sub-discussion sleep and wait for more topical discussion entries.

Regards
	Sebastian

>
> Cheers,
> Michael