From: Jonathan Morton
Date: Wed, 13 Dec 2017 23:06:15 +0200
To: Neil Davies
Cc: dpreed@reed.com, cerowrt-devel@lists.bufferbloat.net, bloat
Subject: Re: [Bloat] [Cerowrt-devel] DC behaviors today

(Awesome development - I have a computer with a sane e-mail client again. One that doesn’t assume I want to top-post if I quote anything at all, *and* lets me type with an actual keyboard. Luxury!)

>> One of the features well observed in real measurements of real systems is that packet flows are "fractal", which means that there is a self-similarity of rate variability at all time scales from micro to macro.
>
> I remember this debate and its evolution, Hurst parameters and all that.
> I also understand that a collection of on/off Poisson sources looks fractal - I found that “the universe is fractal - live with it” ethos of limited practical use (except to help people say it was not solvable).

>> Designers may imagine that their networks have "smooth averaging" properties. There's a strong thread in networking literature that makes this pretty-much-always-false assumption the basis of protocol designs, thinking about "Quality of Service" and other sorts of things. You can teach graduate students about a reality that does not exist, and get papers accepted in conferences where the reviewers have been trained in the same tradition of unreal assumptions.
>
> Agreed - there is a massive disconnect between a lot of the literature (and the people who make their living generating it - [to those people, please don’t take offence, queueing theory is really useful; it is just that the real world is a lot more non-stationary than you model]) and reality.

Probably a lot of theoreticians would be horrified at the extent to which I ignored mathematics and relied on intuition (and observations of real traffic, i.e. eating my own dogfood) while building Cake.

That approach, however, led me to some novel algorithms and combinations thereof which seem to work well in practice, as well as to some practical observations about the present state of the Internet. I’ve also used some contributions from others, but only where they made sense at an intuitive level.

However, Cake isn’t designed to work in the datacentre. Nor is it likely to work optimally in an ISP’s core networks. The combination of features in Cake is not optimised for those environments, but rather for last-mile links, which are typically the bottlenecks experienced by ordinary Internet users. Some of Cake's algorithms could reasonably be reused in a different combination for a different environment.

> I see large scale (i.e. public internets) not as a mono-service but as a “poly service” - there are multiple demands for timeliness etc. that exist out there for “real services”.

This is definitely true. However, the biggest problem I’ve noticed is with distinguishing these traffic types from each other. In some cases there are heuristics which are accurate enough to be useful. In others, there are not. Rarely is the distinction *explicitly* marked in any way, and some common protocols explicitly obscure themselves due to historical mistreatment.

Diffserv is very hard to use in practice. There’s a controversial fact for you to chew on.

> We’ve worked with people who have created risks for Netflix delivery (accidentally I might add - they thought they were doing “the right thing”) by increasing their network infrastructure to 100G delivery everywhere. That change (combined with others made by CDN people - TCP offload engines) created so much non-stationarity in the load as to cause delay and loss spikes that *did* cause VoD playout buffers to empty. This is an example of where “more capacity” produced worse outcomes.

That’s an interesting and counter-intuitive result. I’ll hazard a guess that it had something to do with burst loss in dumb tail-drop FIFOs? Offload engines tend to produce extremely bursty traffic which - with a nod to another thread presently ongoing - makes a mockery of any ack-clocking or pacing which TCP designers normally assume is in effect.

One of the things that fq_codel and Cake can do really well is to take a deep queue full of consecutive line-rate bursts and turn them into interleaved packet streams, which are at least slightly better “paced” than the originals. They also specifically try to avoid burst loss and (at least in Cake’s case) tail loss.

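To make the interleaving of line-rate bursts concrete, here is a minimal sketch in Python. It is a plain deficit round-robin scheduler over per-flow queues, used as a simplified stand-in: the real qdiscs add flow hashing, separate handling of new and old flows, and AQM, none of which is modelled here. The function name `interleave_bursts`, the `quantum` default and the flow labels "A" and "B" are illustrative assumptions, not taken from fq_codel or Cake.

```python
from collections import OrderedDict, deque


def interleave_bursts(arrivals, quantum=1514):
    """Deficit round-robin over per-flow queues (illustrative only).

    arrivals: list of (flow_id, size_bytes) tuples in arrival order,
              all assumed to be already sitting in a deep queue.
    quantum:  bytes of credit each flow gains per round (must be > 0).
    Returns the same packets in departure order.
    """
    # Sort the backlog into one FIFO per flow, remembering first-seen order.
    queues = OrderedDict()
    for flow, size in arrivals:
        queues.setdefault(flow, deque()).append(size)

    deficit = {flow: 0 for flow in queues}
    active = deque(queues)  # flows that still have packets waiting
    departures = []

    while active:
        flow = active.popleft()
        deficit[flow] += quantum
        q = queues[flow]
        # Send packets from this flow while its credit covers them.
        while q and q[0] <= deficit[flow]:
            size = q.popleft()
            deficit[flow] -= size
            departures.append((flow, size))
        if q:
            active.append(flow)  # still backlogged: go to the back of the line
        else:
            deficit[flow] = 0    # an emptied flow does not bank credit
    return departures


# Two back-to-back bursts of four full-size packets each.
burst = [("A", 1500)] * 4 + [("B", 1500)] * 4
print([flow for flow, _ in interleave_bursts(burst)])
# → ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
```

With a quantum of roughly one full-size packet, each flow sends about one packet per round, so two consecutive bursts leave the queue alternating rather than back to back.
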
It is of course regrettable that this behaviour conflicts with the assumptions of most network acceleration hardware, and that maximum throughput might therefore be compromised. The *qualitative* behaviour is however improved.

> I would suggest that there are other ways of dealing with the impact of “peak” (i.e. where instantaneous demand exceeds supply over a long enough timescale to start affecting the most delay/loss sensitive application in the collective multiplexed stream).

Such as signalling to the traffic that congestion exists, and to please slow down a bit to make room? ECN and AQM are great ways of doing that, especially in combination with flow isolation - the latter shares out the capacity fairly on short timescales, *and* avoids the need to signal congestion to flows which are already using less than their fair share.

> I would also agree that if all the streams are of the same “bound on delay and loss” requirements (i.e. *all* Netflix) then 100%+ of all the same load (over, again the appropriate timescale - which for Netflix VoD in streaming is about 20s to 30s) then end-user disappointment is the only thing that can occur.

I think the importance of measurement timescales is consistently underrated in the industry and in academia alike. An hour-long bucket of traffic tells you about a very different set of characteristics than a millisecond-long bucket, and there are several timescales between those extremes of great practical interest.

 - Jonathan Morton