References: <871t5bpkc7.fsf@toke.dk> <6ADC1A9D-72C9-47A5-BDC7-94C14ED34379@gmail.com>
Date: Mon, 9 May 2016 21:59:19 -0700
From: Dave Taht
To: David Lang
Cc: Jonathan Morton, make-wifi-fast@lists.bufferbloat.net,
 "ath9k-devel@lists.ath9k.org", Randell Jesup
Subject: Re: [Make-wifi-fast] Diagram of the ath9k TX path

This is a very good overview, thank you. I'd like to take apart
station behavior on wifi with a web application... as a straw man.

On Mon, May 9, 2016 at 8:41 PM, David Lang wrote:
> On Mon, 9 May 2016, Dave Taht wrote:
>
>> On Mon, May 9, 2016 at 7:25 PM, Jonathan Morton
>> wrote:
>>>
>>>
>>>> On 9 May, 2016, at 18:35, Dave Taht wrote:
>>>>
>>>> should we always wait a little bit to see if we can form an aggregate?
>>>
>>>
>>> I thought the consensus on this front was “no”, as long as we’re making
>>> the decision when we have an immediate transmit opportunity.
>>
>>
>> I think it is more nuanced than how david lang has presented it.
>
>
> I have four reasons for arguing for no speculative delays.
>
> 1. airtime that isn't used can't be saved.
>
> 2. lower best-case latency
>
> 3. simpler code
>
> 4. clean and gradual service degradation under load.
>
> the arguments against are:
>
> 5. throughput per ms of transmit time is better if aggregation happens than
> if it doesn't.
>
> 6. if you don't transmit, some other station may choose to before you would
> have finished.
>
> #2 is obvious, but with the caveat that anytime you transmit you may be
> delaying someone else.
>
> #1 and #6 are flip sides of each other. we want _someone_ to use the
> airtime, the question is who.
>
> #3 and #4 are closely related.
>
> If you follow my approach (transmit immediately if you can, aggregate when
> you have a queue), the code really has one mode (plus queuing). "If you have
> a Transmit Opportunity, transmit up to X packets from the queue", and it
> doesn't matter if it's only one packet.
>
> If you delay the first packet to give you a chance to aggregate it with
> others, you add in the complexity and overhead of timers (including
> cancelling timers, slippage in timers, etc) and you add "first packet, start
> timers" mode to deal with.
>
> I grant you that the first approach will "saturate" the airtime at lower
> traffic levels, but at that point all the stations will start aggregating
> the minimum amount needed to keep the air saturated, while still minimizing
> latency.
>
> I then expect that application related optimizations would then further
> complicate the second approach. there are just too many cases where small
> amounts of data have to be sent and other things serialize behind them.
>
> DNS lookup to find a domain to then do a 3-way handshake to then do a
> request to see if the library has been updated since last
> cached (repeat for several libraries) to then fetch the actual page content.
> All of these things up to the actual page content could be single packets
> that have to be sent (and responded to with a single packet), waiting for
> the prior one to complete. If you add a few ms to each of these, you can
> easily hit 100ms in added latency. Once you start to try and special-case
> these sorts of things, the code complexity multiplies.

Take web page parsing as an example. The first request is a dns lookup.
The second request is a http get (which can include a few more round
trips for negotiating SSL), the next is a flurry of page parsing that
results in the internal web browser attempting to schedule its
requests best and then sending out the relevant dns and tcp flows as
best it can figure out, and then, typically several seconds of data
transfer across each set of flows.

Page paint is bound by getting the critical portions of the resulting
data parsed and laid out properly.

Now, I'd really like that early phase to be optimized by APs by
something more like SQF, where a station that appears and does a few
packet exchanges gets priority over stations taking big flows on a
more regular basis, so it more rapidly gets into flow balance with
the other stations.

(and then, for most use cases, like web, exits)

the second phase, of actual transfer, is also bound by RTT. I have no
idea how much thought wifi folk actually put into typical web transfer
delays (20-80ms), but they are there...

...

The idea of the wifi driver waiting a bit to form a better aggregate
to fit into a txop ties into two slightly different timings and flow
behaviors.

If it is taking 10ms to get a txop in the first place, taking more
time to assemble a good batch of packets to fit into "your" txop would
be good.

If it is taking 4ms to transfer your last txop, well, more packets may
arrive for you in that interval, and feed into your existing flows to
keep them going, if you defer feeding the hardware with them.

Also, classic tcp acking goes out the window with competing acks at layer 2.

I don't know if quic can do the equivalent of stretch acks...

but one layer 3 ack, block acked by layer 2 in wifi, suffices... if
you have a ton of tcp acks outstanding, block acking them all is
expensive...

> So I believe that the KISS approach ends up with a 'worse is better'
> situation.

Code is going to get more complex anyway, and there are other
optimizations that could be made.
One item I realized recently is that part of codel need not run on
every packet in every flow for stuff destined to fit into a single
txop. It is sufficient to see if it declared a drop on the first
packet in a flow destined for a given txop.

You can then mark that entire flow (in a txop) as droppable (QoSNoAck)
within that txop (as it is within an RTT, and even losing all the
packets there will only cause the rate to halve).

>
> David Lang

-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
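
A rough sketch of the per-txop codel idea above, in Python for
readability rather than driver C. Everything named here is hypothetical
and not taken from ath9k or mac80211: `Packet`, `mark_txop_batch`, the
`no_ack` field standing in for the QoSNoAck policy, and the
`codel_says_drop` callback standing in for whatever the real codel
state machine would decide. The only point it tries to show is that the
drop decision is consulted once per flow per txop (on that flow's first
packet), and the result is applied to the rest of that flow's packets
in the same txop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Packet:
    flow_id: int          # hypothetical flow identifier (e.g. a 5-tuple hash)
    enqueue_time: float   # seconds; what a codel-style check would look at
    no_ack: bool = False  # stand-in for sending with the QoSNoAck policy


def mark_txop_batch(batch: List[Packet],
                    codel_says_drop: Callable[[Packet], bool]) -> int:
    """Consult the drop decision once per flow for this txop's batch.

    The first packet seen for each flow decides for the whole flow
    within this txop. Returns how many times the decision was consulted.
    """
    decision: Dict[int, bool] = {}
    calls = 0
    for pkt in batch:
        if pkt.flow_id not in decision:
            decision[pkt.flow_id] = codel_says_drop(pkt)
            calls += 1
        # Every later packet of the same flow inherits the first verdict.
        pkt.no_ack = decision[pkt.flow_id]
    return calls


if __name__ == "__main__":
    now = 1.000
    batch = [Packet(1, 0.900), Packet(1, 0.990),
             Packet(2, 0.998), Packet(1, 0.999)]
    # Toy stand-in for codel: flag anything queued longer than 5 ms.
    # Real codel also tracks an interval and a drop schedule.
    calls = mark_txop_batch(batch, lambda p: now - p.enqueue_time > 0.005)
    print(calls, [p.no_ack for p in batch])
```

With four packets from two flows, the decision is consulted twice
instead of four times. Whether the saving matters in a real driver
depends on how expensive the codel check is relative to the rest of
the tx path, which this sketch does not try to model.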