From: Michael Welzl <michawe@ifi.uio.no>
To: Sebastian Moeller
Cc: Dave Taht, bloat <bloat@lists.bufferbloat.net>
Date: Mon, 11 Jul 2022 08:24:49 +0200
Subject: Re: [Bloat] [iccrg] Musings on the future of Internet Congestion Control

Hi Sebastian,

Neither our paper nor I am advocating one particular solution - we point at a problem and suggest that research on ways to solve the under-utilization problem might be worthwhile.

Jumping from this to discussing the pros and cons of a potential concrete solution is quite a leap… More below:

> On Jul 10, 2022, at 11:29 PM, Sebastian Moeller wrote:
>
> Hi Michael,
>
>> On Jul 10, 2022, at 22:01, Michael Welzl wrote:
>>
>> Hi!
>>
>>> On Jul 10, 2022, at 7:27 PM, Sebastian Moeller wrote:
>>>
>>> Hi Michael,
>>>
>>> so I reread your paper and stewed a bit on it.
>>
>> Many thanks for doing that! :)
>>
>>> I believe that I do not buy some of your premises.
>>
>> you say so, but I don't really see much disagreement here. Let's see:
>>
>>> e.g. you write:
>>>
>>> "We will now examine two factors that make the present situation particularly worrisome. First, the way the infrastructure has been evolving gives TCP an increasingly large operational space in which it does not see any feedback at all. Second, most TCP connections are extremely short. As a result, it is quite rare for a TCP connection to even see a single congestion notification during its lifetime."
>>> And seem to see a problem that flows might be able to finish their data transfer business while still in slow start. I see the same data, but see no problem. Unless we have an oracle that tells each sender (over a shared bottleneck) exactly how much to send at any given point in time, different control loops will interact on those intermediary nodes.
>>
>> You really say that you don't see the solution. The problem is that capacities are underutilized, which means that flows take longer (sometimes, much longer!) to finish than they theoretically could, if we had a better solution.
>
> [SM] No, IMHO the underutilization is the direct consequence of requiring a gradual filling of the "pipes" to probe the available capacity. I see no way this could be done differently with the traffic sources/sinks being uncoordinated entities at the edge, and I see no way of coordinating all end points and handling all paths. In other words, we can fine-tune parameters to tweak the probing a bit, make it more or less aggressive/fast, but the fact that we need to probe capacity somehow means underutilization cannot be avoided unless we find a way of coordinating all of the sinks and sources. But being sufficiently dumb, all I can come up with is an all-knowing oracle or faster-than-light communication, and neither strikes me as realistic ;)

There's quite a spectrum of possibilities between an oracle or "coordinating all of the sinks and sources" on one hand, and quite "blindly" probing from a constant IW on the other. The "fine tuning" that you mention is interesting research, IMO!

>>> I might be limited in my depth of thought here, but having each flow probe for capacity seems exactly the right approach... and doubling CWND or rate every RTT is pretty aggressive already (making slow start shorter by reaching capacity faster within the slow-start framework requires either starting with a higher initial value (what increasing IW tries to achieve?) or using a larger increase factor than 2 per RTT). I consider increased IW a milder approach than the alternative. And once one accepts that a gradual rate increase is the way forward, it falls out logically that some flows will finish before they reach steady-state capacity, especially if that flow's available capacity is large. So what exactly is the problem with short flows not reaching capacity, and what alternative exists that does not lead to carnage if more aggressive start-up phases drive the bottleneck load into emergency drop territory?
>>
>> There are various ways to do this [snip: a couple of concrete suggestions from me, and answers about what problems they might have, with requests for references from you]

I'm sorry, but I wasn't really going to have a discussion about these particular possibilities. My point was only that many possible directions exist - being completely "blind" isn't the only possible approach.
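Just to put a rough number on what "blind" start-up costs a short flow, here is a tiny back-of-the-envelope script (plain Python; the link rate, RTT, MSS and IW below are made-up example numbers, not measurements from our paper):

# Back-of-the-envelope: how long does an idealized slow-start flow take to
# deliver a given number of bytes, compared with a hypothetical sender that
# could use the full path capacity right away?  All numbers are assumptions.

MSS  = 1448          # payload bytes per segment (assumption)
IW   = 10            # initial window in segments
RTT  = 0.05          # 50 ms round-trip time (assumption)
RATE = 100e6 / 8     # 100 Mbit/s bottleneck, in bytes per second (assumption)

def slow_start_time(flow_bytes):
    """RTTs needed when cwnd doubles every RTT, capped by the path capacity."""
    cwnd = IW * MSS                  # bytes that can be sent in this RTT
    per_rtt_cap = RATE * RTT         # bytes the path can carry per RTT
    sent, rtts = 0.0, 0
    while sent < flow_bytes:
        sent += min(cwnd, per_rtt_cap)
        cwnd *= 2                    # slow start: double every RTT
        rtts += 1
    return rtts * RTT

def ideal_time(flow_bytes):
    """Lower bound: one RTT of start-up plus pure serialization at full rate."""
    return RTT + flow_bytes / RATE

for size in (50_000, 1_000_000, 10_000_000):     # 50 KB, 1 MB, 10 MB
    print(f"{size:>10} B: slow start {slow_start_time(size):.2f} s,"
          f" ideal {ideal_time(size):.2f} s")

This ignores queueing, losses and application behaviour entirely; the only point is that, for a short flow, almost all of the room for improvement lies in the start-up phase.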
Instead of answering your comments to my suggestions, let me give you one single concrete piece here: our reference 6, as one example of the kind of research that we consider worthwhile for the future:

"X. Nie, Y. Zhao, Z. Li, G. Chen, K. Sui, J. Zhang, Z. Ye, and D. Pei, "Dynamic TCP initial windows and congestion control schemes through reinforcement learning," IEEE JSAC, vol. 37, no. 6, 2019."
https://1989chenguo.github.io/Publications/TCP-RL-JSAC19.pdf

This work learns a useful value of IW over time, rather than using a constant. One author works at Baidu, the paper uses data from Baidu, and it says:

"TCP-RL has been deployed in one of the top global search engines for more than a year. Our online and testbed experiments show that for short flow transmission, compared with the common practice of IW = 10, TCP-RL can reduce the average transmission time by 23% to 29%."

- so it's probably fair to assume that this was (and perhaps still is) active in Baidu.
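Their system does this with reinforcement learning over per-user-group network state, so the snippet below is emphatically NOT their algorithm - it is just a few invented lines (Python again, made-up names and values) to convey the flavour of choosing IW from measured flow completion times instead of hard-coding it:

import random
from collections import defaultdict

# Toy illustration only: pick an IW per client group from a small candidate
# set, mostly using the one with the lowest mean observed completion time.
# TCP-RL itself uses reinforcement learning with a much richer state/reward.

CANDIDATE_IWS = [10, 20, 40, 80]   # segments; made-up candidate values
EPSILON = 0.1                      # probability of exploring a random IW

# group -> IW -> (number of flows, summed completion time)
stats = defaultdict(lambda: {iw: (0, 0.0) for iw in CANDIDATE_IWS})

def choose_iw(group):
    """Mostly exploit the best-known IW for this client group, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(CANDIDATE_IWS)
    def mean_fct(iw):
        count, total = stats[group][iw]
        return total / count if count else float("inf")
    return min(CANDIDATE_IWS, key=mean_fct)

def report_flow(group, iw, completion_time):
    """Feed back the completion time measured for a flow that used this IW."""
    count, total = stats[group][iw]
    stats[group][iw] = (count + 1, total + completion_time)

A real deployment would of course need safeguards (per-path caps, reacting to losses in the first RTT, and so on) - the point is merely that the "constant" in IW does not have to be a constant.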
>>> And as an aside, a PEP (performance enhancing proxy) that does not enhance performance is useless at best and likely harmful (rather a PDP, performance degrading proxy).
>>
>> You've made it sound worse by changing the term, for whatever that's worth. If they never help, why has anyone ever called them PEPs in the first place?
>
> [SM] I would guess because "marketing" was unhappy with "engineering" emphasizing the side-effects/potential problems and focused on the best-case scenario? ;)

It appears that you want to just ill-talk PEPs. There are plenty of useful things that they can do, and yes, I personally think they're the way of the future - but **not** in their current form, where they must "lie" to TCP, cause ossification, etc. PEPs have never been considered as part of the congestion control design - when they came on the scene, in the IETF, they were despised for breaking the architecture, and then all the trouble with how they need to play tricks was discovered (spoofing IP addresses, making assumptions about header fields, and whatnot). That doesn't mean that a very different kind of PEP - one which is authenticated and speaks an agreed-upon protocol - couldn't be a good solution.

You're bound to ask me for concrete things next, and if I give you something concrete (e.g., a paper on PEPs), you'll find something bad about it - but this is not a constructive direction for this conversation. Please note that I'm not saying "PEPs are always good": I only say that, in my personal opinion, they're a worthwhile direction of future research. That's a very different statement.

>> Why do people buy these boxes?
>
> [SM] Because e.g. for GEO links, latency is in a range where default, unadulterated TCP will likely choke on itself, and when faced with requiring customers to change/tune their TCPs or having a "PEP" fudge it, the ease of use of fudging won the day. That is a generous explanation (as this fudging is beneficial to both the operator and most end-users); I can come up with less charitable theories if you want ;) .
>
>>> The network so far has been doing reasonably well with putting more protocol smarts at the ends than in the parts in between.
>>
>> Truth is, PEPs are used a lot: at cellular edges, at satellite links… because the network is *not* always doing reasonably well without them.
>
> [SM] Fair enough, I accept that there are use cases for those, but again, only if they actually enhance the "experience" will users be happy to accept them.

… and that's the only reason to deploy them, given that (as the name suggests) they're meant to increase performance. I'd be happy to learn more about why you appear to hate them so much (even just anecdotal).

> The goals of the operators and the paying customers are not always aligned here; a PEP might be advantageous more to the operator than to the end-user (theoretically also the other direction, but since operators pay for PEPs they are unlikely to deploy those). Think mandatory image recompression or forced video quality downscaling... (and sure, these are not as clear-cut as I pitched them: if after an emergency a PEP allows most/all users in a cell to still send somewhat degraded images, that is better than the network choking itself on a few high-quality images, assuming images from the emergency are somewhat useful).

What is this - are you inventing a (to me, frankly, strange) scenario where PEPs do some evil for customers yet help operators, or is there an anecdote here?

>>> I have witnessed the arguments in the "L4S wars" about how little processing one can ask the more central network nodes to perform; e.g. flow queueing, which would solve a lot of the issues (e.g. a hyper-aggressive slow-start flow would mostly hurt itself if it overshoots its capacity), seems to be a complete no-go.
>>
>> That's to do with scalability, which depends on how close to the network's edge one is.
>
> [SM] I have heard the alternative explanation that it has to do with what operators of core links request from their vendors and what features they are willing to pay for... but this is very anecdotal, as I have little insight into big-iron vendors or core-link operators.
>
>>> I personally think what we should do is have the network supply more information to the end points to control their behavior better. E.g. if we would mandate a max_queue-fill-percentage field in a protocol header and have each node write max(current_value_of_the_field, queue-filling_percentage_of_the_current_node) into every packet, end points could estimate how close to congestion the path is (e.g. by looking at the rate of %queueing changes) and tailor their growth/shrinkage rates accordingly, both during slow start and during congestion avoidance.
>>
>> That could well be one way to go. Nice if we provoked you to think!
>
> [SM] You mostly made me realize what the recent increases in IW actually aim to accomplish ;)

That's fine! Increasing IW is surely a part of the solution space - though I advocate doing something else (as in the example above) than just increasing the constant in a worldwide standard.

> and that current slow start seems actually better than its reputation; it solves a hard problem surprisingly well.

Actually, given that the large majority of flows end somewhere in slow start, what makes you say that it solves it "well"?

> The max(path_queue%) idea has been kicking around in my head ever since reading a paper about storing queue occupancy in packets to help CC along (sorry, I do not recall the authors or the title right now), so it is not even my own original idea, but simply something I borrowed from smarter engineers because I found the data convincing and the theory sane. (Also because I grudgingly accept that latency increases measured over the internet are a tad too noisy to be easily useful* and too noisy for a meaningful controller based on the latency rate of change**)
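For what it's worth, here is roughly the kind of sender logic I could picture such a field enabling. It is only a sketch: the field name, the thresholds and the growth factors are all invented, and nothing like this is specified anywhere - it is just to make the mechanism concrete:

# Sketch only: a sender adapting its per-RTT window growth to a hypothetical
# "max queue fill" header field.  Every node on the path would write
# max(field, its own queue fill percentage) into the field, and the receiver
# would echo it back.  All names, thresholds and factors here are invented.

def growth_factor(fill_pct, prev_fill_pct):
    """Map the echoed path queue occupancy (0-100) and its trend to a
    multiplicative per-RTT congestion window change."""
    rising = fill_pct > prev_fill_pct
    if fill_pct < 20 and not rising:
        return 2.0    # path looks empty: keep classic slow-start doubling
    if fill_pct < 50:
        return 1.5    # a queue is forming somewhere: probe more gently
    if fill_pct < 80 and not rising:
        return 1.1    # substantial standing queue: barely grow
    return 0.9        # nearly full, or full and rising: back off before loss

class Sender:
    def __init__(self, iw_bytes=14480):
        self.cwnd = iw_bytes      # congestion window in bytes
        self.prev_fill = 0        # queue fill percentage seen one RTT ago

    def per_rtt_update(self, echoed_fill_pct):
        """Call once per RTT with the largest queue-fill value echoed back
        during that RTT."""
        self.cwnd *= growth_factor(echoed_fill_pct, self.prev_fill)
        self.prev_fill = echoed_fill_pct

Whether many flows reacting to the same signal like this would converge to something stable and fair is exactly the kind of question that needs research - but at least a sender would no longer be flying completely blind during start-up.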
>>> But alas we seem to go down the path of a relatively dumb 1-bit signal giving us an under-defined queue-filling state instead, and to estimate relative queue-filling dynamics from that we need many samples (so literally too little too late, or L3T2), but I digress.
>>
>> Yeah you do :-)
>
> [SM] Less than you let on ;). If L4S gets ratified [snip]

I'm really not interested in an L4S debate.

Cheers,
Michael