From: Jonathan Morton
To: David Lang
Cc: cake@lists.bufferbloat.net
Date: Wed, 13 May 2015 05:51:36 +0300
Subject: Re: [Cake] Control theory and congestion control
Message-Id: <7E879DBB-1EFB-46FA-9230-A5AFC97B93B0@gmail.com>
References: <152DD781-725D-4DD7-AB94-C7412D92F82C@gmx.de> <1F323E22-817A-4212-A354-C6A14D2F1DBB@gmail.com>
List-Id: Cake - FQ_codel the next generation

> On 13 May, 2015, at 02:23, David Lang wrote:
>
>> 1) The most restrictive signal seen during an RTT is the one to react to. So a "fast down" signal overrides anything else.
>
> sorry for joining in late, but I think you are modeling something that doesn't match reality.
>
> are you really going to see two bottlenecks in a given round trip (or even one connection)? Since you are ramping up fairly slowly, aren't you far more likely to only see one bottleneck (and once you get through that one, you are pretty much set through the rest of the link)

It's important to remember that link speeds can change drastically over time (usually if it's *anything* wireless), that new competing traffic might reduce the available bandwidth suddenly, and that as a result the bottleneck can *move* from an ELR-enabled queue to a different queue which might not be. I consider that far more likely than an ELR queue abruptly losing control as Sebastian originally suggested, but it looks similar to the endpoints.

So what you might have is an ELR queue happily controlling the cwnd based on the assumption that *it* is the bottleneck, which until now it has been. But *after* that queue is another one which has just *become* the bottleneck, and it's not ELR - it's plain ECN. The only way it can tell the flow to slow down is by giving "fast down" signals. But that's okay, the endpoints will react to that just as they should do, as long as they correctly interpret the most restrictive signal as being the operative one.

Or maybe the new bottleneck is a dumb FIFO. In this case, ELR will initially hold the cwnd constant, but the FIFO will fill up, increasing latency and reducing throughput at the same BDP. This will cause ELR to start giving "slow up" and then maybe "fast up" signals, and might thereby relinquish control of the flow automatically. Note that "fast up" is signalled by ELR *not modifying* any packets.

Or maybe the new bottleneck is a drop-only AQM. In that case, the first sign of it will be a dropped packet after, if anything, only a small increase in latency (ie. not enough, for long enough, for ELR to do very much about). At this point, the observable network state is indistinguishable from a randomly-lost packet, ie. not congestion related.

The safe option here is to react like an ECN-enabled flow, treating any lost packet as a "fast down" signal. An alternative is to treat a lost packet as "slow down" *if* it is accompanied by "slow up" or "hold" signals in the same RTT (ie. there's a reasonable belief that we're being properly controlled by ELR). While "slow down" doesn't react as quickly as a new bottleneck queue might prefer, it does at least respond; if enough drops appear, the ELR queue's control loop will be shifted to "fast up", relinquishing control. Or, if the AQM isn't tight enough to do that, the corresponding increase in RTT will do it instead.

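As a rough illustration of the endpoint rule discussed above (react to the most restrictive signal seen in an RTT, with two possible treatments of a lost packet), here is a minimal Python sketch. The names `Signal` and `operative_signal`, the `conservative` flag, and the numeric ordering of the five signals are hypothetical choices made for illustration only; they are not taken from any ELR specification or implementation. Falling back to "fast down" when a loss arrives without any "slow up" or "hold" signal is also an assumption.

```python
from enum import IntEnum


class Signal(IntEnum):
    """The five signals named in this thread, ordered least to most restrictive.

    The numeric ordering is an assumption made for this sketch.
    """
    FAST_UP = 0    # signalled by ELR leaving packets unmodified
    SLOW_UP = 1
    HOLD = 2
    SLOW_DOWN = 3
    FAST_DOWN = 4


def operative_signal(marks, lost=False, conservative=True):
    """Return the signal an endpoint should act on for one RTT.

    marks        -- iterable of Signal values seen on packets during the RTT
    lost         -- True if at least one packet was lost during the RTT
    conservative -- True: any loss counts as FAST_DOWN (the "safe option");
                    False: a loss counts as SLOW_DOWN when SLOW_UP or HOLD
                    was also seen in the same RTT, otherwise FAST_DOWN
                    (the fallback is an assumption of this sketch)
    """
    seen = list(marks)
    if lost:
        elr_in_control = any(s in (Signal.SLOW_UP, Signal.HOLD) for s in seen)
        if conservative or not elr_in_control:
            seen.append(Signal.FAST_DOWN)
        else:
            seen.append(Signal.SLOW_DOWN)
    # No marks at all means every packet was unmodified, i.e. "fast up".
    return max(seen, default=Signal.FAST_UP)


if __name__ == "__main__":
    # An ELR queue says "hold" while a plain-ECN queue further on gives "fast down".
    print(operative_signal([Signal.HOLD, Signal.FAST_DOWN]).name)
    # A drop-only AQM drops a packet while ELR is still signalling "hold".
    print(operative_signal([Signal.HOLD], lost=True, conservative=False).name)
```

Taking the maximum over an ordered enumeration is just one way to express "the most restrictive signal is the operative one"; a real endpoint would also have to decide how each signal maps onto a cwnd adjustment, which this sketch deliberately leaves out.
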
> (if it's a new flow, it should start slow and ramp up, so you, and the other affected flows, should all be good with a 'slow down' signal)

Given that slow-start grows the cwnd exponentially, that might not be the case after the first few RTTs. But that's all part of the control loop, and ELR would normally signal it with the CE codepoint rather than dropping packets. Sebastian's scenario of "slow down" suddenly changing to "omgwtfbbq drop everything now" within the same queue is indeed unlikely.

>> I fully appreciate that *some* network paths may be unstable, and any congestion control system will need to chase the sweet spot up and down under such conditions.
>>
>> Most of the time, however, baseline RTT is stable over timescales of the order of minutes, and available bandwidth is dictated by the last-mile link as the bottleneck. BDP and therefore the ideal cwnd is a simple function of baseline RTT and bandwidth. Hence there are common scenarios in which a steady-state condition can exist. That's enough to justify the "hold" signal.
>
> Unless you prevent other traffic from showing up on the network (phones checking e-mail, etc). I don't believe that you are ever going to have stable bandwidth available for any noticeable timeframe.

On many links, light traffic such as e-mail will disturb the balance too little to even notice, especially with flow isolation. Assuming ELR is implemented as per my later post, running without flow isolation will allow light traffic to perturb the ELR signal slightly, converting a "hold" into a random sequence of "slow up", "hold" and "slow down", but this will self-correct conservatively, with ELR transitioning to a true "slow up" briefly if required.

Of course, as with any speculation of this nature, simulations and other experiments will tell a more convincing story.

 - Jonathan Morton