From: David Lang
To: Dave Taht
Cc: cake@lists.bufferbloat.net
Date: Fri, 5 Jun 2015 20:08:54 -0700 (PDT)
Subject: Re: [Cake] lower bounds for latency
List-Id: Cake - FQ_codel the next generation

On Fri, 5 Jun 2015, Dave Taht wrote:

> On Fri, Jun 5, 2015 at 6:02 PM, David Lang wrote:
>> On Fri, 5 Jun 2015, Dave Taht wrote:
>>
>>> bob's been up to good stuff lately..
>>>
>>> http://bobbriscoe.net/projects/latency/sub-mss-w.pdf
>>
>>
>> one thing that looks wrong to me. He talks about how TCP implementations
>> cannot operate at less than two packets per RTT. It's not clear what he
>> means. Does he mean two packets in flight per RTT? or two packets worth of
>> buffering per RTT?
>
> Flight.

In that case I think his analysis of the effects of AQM is incorrect,
because an AQM only limits the buffer size on one device, not the number
of packets in flight.

He uses the example of 12 connections totaling 40Mb with a 6ms RTT. What
if the systems are in the same rack and have <1ms RTT? According to him,
TCP just won't work.

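To put rough numbers on that example, here is a minimal sketch of the
per-flow bandwidth-delay product. It assumes "40Mb" means 40 Mbit/s shared
equally by the 12 connections, and 1500-byte packets; the function name
and the packet size are illustrative choices, not taken from Bob's paper.

```python
def packets_in_flight_per_flow(total_bits_per_s, flows, rtt_s, packet_bytes=1500):
    """Bandwidth-delay product of one flow's equal share, expressed in packets.

    Assumes every flow gets total/flows of the link and that all packets
    are packet_bytes long (1500 is an assumed, typical Ethernet MTU).
    """
    per_flow_bits_per_s = total_bits_per_s / flows
    bdp_bytes = per_flow_bits_per_s * rtt_s / 8
    return bdp_bytes / packet_bytes


if __name__ == "__main__":
    # 12 flows sharing 40 Mbit/s, at the 6 ms RTT from the example
    # and at a 1 ms RTT (same-rack case).
    for rtt_ms in (6, 1):
        n = packets_in_flight_per_flow(40e6, 12, rtt_ms / 1000)
        print(f"RTT {rtt_ms} ms: {n:.2f} packets in flight per flow")
```

Under these assumptions each flow's share already holds fewer than two
full-size packets per RTT at 6ms, and only a fraction of one packet at 1ms.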
> I also disagree with this statement:
>
> "It is always wrong to send smaller packets more
> often, because the constraint may be packet pro-
> cessing, not bits."
>
> because I believe (but would have to go look at some data to make
> sure) that we're good with packet sizes down into the 300 byte range
> on most hardware, and thus could (and SHOULD) also reduce the mss in
> these cases to keep the "signal strength" up at a sustainable levels.
>
> I do recall that bittorrent used to reduce the mss and were asked to
> stop (a decade ago), when they got as far down as 600 bytes or so.
>
> but the rest I quite liked.

I think it depends on the speed of the link. At the very low end (cheap
home routers) and the very high end (multiple links >10Gb/s) the available
processing per packet may be a limit. But in the huge middle ground between
these extremes, you have quite a bit of CPU time per packet. The 3800 at
100Mb is an example of running into this limit, but at 10Mb it has no
problem. The WRT1200, with its dual-core GHz+ CPU, has a lot of processor
available for a bit more money. From there up until the large
datacenter/ISP cores with multiple 10Gb ports to manage, you have plenty
of CPU.

The other issue is the length of the 'dead air' between packets. The
current standards have this being a fixed amount of time. Combined with the
'wasted' packet header data, this means the same amount of data uses less
total time on the wire if it's sent in fewer, larger packets than in more,
smaller packets. When you are pushing the limits of the wire, this can make
a difference.

This is why wifi tries to combine multiple packets into one transmission.
We just need to break people of the mindset that it makes sense to hold off
on transmitting something "just in case" something more comes along that
could be combined with it.
Instead they need to start off transmitting the first one that comes along
without any wait, and then in the future, check the queue to see if there's
something else that can be combined with what you are ready to transmit,
and if so, send it at the same time.

Rsyslog implements exactly this algorithm in how it batches log messages
that it's processing: the first message gets the minimum delay and future
messages only get as much delay as is required to keep things moving. It
does mean that the process hits continuous processing sooner (where there
is no delay between finishing working on one batch and starting work on the
next), but that 'always busy' point is not peak throughput. As batches are
more efficient to process, the throughput keeps climbing while the latency
climbs at a much lower rate (eventually you do hit a solid wall where
larger batches don't help, so you just fall behind until traffic slows).

>>
>> Two packets in flight per RTT would make sense as a minimum, but two packets
>> worth of buffering on N devices in the path doesn't.
>>
>> using the example of a 6ms RTT. Depending on the equipment involved, this
>> could have from one to several devices handling the packets between the
>> source and the destination. Saying that each device in the path must have
>> two packets worth of buffering doesn't make sense. At a given line speed and
>> data rate, you will have X packets in flight. the number of devices between
>> the source and the destination will not change X.
>
> Flight.
>
>> If the requirement is that there are always at least two packets in flight
>> in a RTT, it doesn't then follow that both packets are going to be in the
>> buffer of the same device at the same time. I spoke with a vendor promising
>> 7ms Los Angeles to Los Vegas.
>> For the vast majority of that 7ms the packets
>> are not in the buffers of the routers, but exist only as light in the fiber
>> (I guess you could view the fiber acting as a buffer in such conditions)
>>
>> where is the disconnect between my understanding and what Bob is talking
>> about?
>
> Flight, not buffering. Redefining the goal of an aqm to keep packets
> in flight rather than achieve a fixed queuing delay is what this is
> about, and improving tcps to also keep packets in flight with
> subpacket windows is part of his answer.
>
> I like getting away from a target constant for delay (why 5ms when 5us
> is doable) and this is an interesting way to think about it from both
> ends.

I agree, the idea of trying to maintain a fixed buffer delay is not what
we're trying to do; we're trying for the minimum amount of unnecessary
buffer delay. The 'target' numbers are just the point where we say the
delay is so bad that the traffic must be slowed.

> And I was nattering about how I didn't like delayed acks just a few hours ago.

What we need is a TCP stack that can combine acks that arrive separately
over time and only send one.

David Lang
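
For reference, the "send the first item immediately, then batch whatever
has queued up in the meantime" approach described above for rsyslog can be
sketched roughly as follows. This is only an illustration of the idea, not
rsyslog's actual code; `drain_in_batches`, `simulate`, and the `max_batch`
limit are hypothetical names chosen for the example.

```python
from collections import deque


def drain_in_batches(queue, send, max_batch=32):
    """Send whatever is queued right now; never wait for more to arrive.

    The first item goes out alone with no added delay. Items that arrive
    while a send is in progress are picked up together in the next batch.
    """
    while queue:
        batch = []
        while queue and len(batch) < max_batch:
            batch.append(queue.popleft())
        send(batch)


def simulate():
    """Deterministic toy run: new messages 'arrive' while each batch is sent."""
    queue = deque(["m1"])
    arrivals = [["m2", "m3", "m4"], ["m5"], []]
    sent = []

    def send(batch):
        sent.append(list(batch))
        if arrivals:
            # Messages that showed up while this batch was on the wire.
            queue.extend(arrivals.pop(0))

    drain_in_batches(queue, send)
    return sent


if __name__ == "__main__":
    for batch in simulate():
        print(batch)
```

In the toy run the first message is sent alone, and the three that arrived
during that send go out together as one batch, so batch size grows with
load without any "just in case" wait being added.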