From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dave.taht@gmail.com>
Received: from mail-oi0-x233.google.com (mail-oi0-x233.google.com
 [IPv6:2607:f8b0:4003:c06::233])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 973C63B25E
 for <make-wifi-fast@lists.bufferbloat.net>;
 Tue, 10 May 2016 00:59:20 -0400 (EDT)
Received: by mail-oi0-x233.google.com with SMTP id k142so2072069oib.1
 for <make-wifi-fast@lists.bufferbloat.net>;
 Mon, 09 May 2016 21:59:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-transfer-encoding;
 bh=5s17Szbqim1UKIsIMo81bxNvXsxzlktnrJynsuEuuvA=;
 b=cSWlYsayE45IWkW0YSRtGc5L1wpd1wFRdbnS4c8tHwuE5yLaepBoZGv0500CeLpcWi
 YSNpbpXh9j37HRk4UUaPwVM1O21Keyj15cE8ugOHVHA5VatN7VoyG4pInM4xdnhtVE+j
 qwbY9kx5yiyx1eg76S83trM+rMCffzTb4ptXf2EOMLrSMzJWOFiuIb04XGR8C6RCgbU4
 V/PCtXov30/IvmbHXBtZ/S0N/bhjoLVmq2ya/4rsETm4Bt5Wqaaua5Roym6KbNZ5Jjix
 bx5kjqxPcmVgxHOWCj4vfIqBZPH890FmOP+5UlwiXb/oErAppSf5wFafnFQWcXtgU21O
 ruOw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-transfer-encoding;
 bh=5s17Szbqim1UKIsIMo81bxNvXsxzlktnrJynsuEuuvA=;
 b=P83N6fa5Y+xmXfIZ3gK6+hYUifldofh+24RE/Ign3TQM6Cpfvz4wmQaK8cc0KMWjAB
 IZkzaMvh2FECCikBprp4qrV4pt7toTsxuFJ0/A9iqR/geEWODwuSa/Yeja68AK56PBmx
 2FYrQCCpMh3zUe9DHQ/U6eCzlaiy1P8SrrRh1rbVyjOrE1qMK+YS3SdPSqsaqk6GwRN0
 vOdkfGq8ZqgsfJP544P70DWwhXdbjvkU+F2il8QxDBI5h1i0G/0ifdNUgidsy/hnfXZD
 swqLJurm2gfPFozOCbDiXK1SvpUxx9pUG0ULZ86na1OtnCq6VeQXW/x4rsHEJ1RU9+Bv
 kOAw==
X-Gm-Message-State: AOPr4FWC6+FOXcyUymve8THaPFJq8k/aC0g+P+w2TCvs5tQrZ+F/G7VOPp9j207CU3iDYwmO7b3+x26CsgbGQA==
MIME-Version: 1.0
X-Received: by 10.157.57.200 with SMTP id y66mr1585322otb.169.1462856360005;
 Mon, 09 May 2016 21:59:20 -0700 (PDT)
Received: by 10.202.229.210 with HTTP; Mon, 9 May 2016 21:59:19 -0700 (PDT)
In-Reply-To: <alpine.DEB.2.02.1605092003130.6517@nftneq.ynat.uz>
References: <871t5bpkc7.fsf@toke.dk>
 <CAA93jw5v0orW9hHsKw7LuL7HnN2eJUU4qYuh_oPUeV84qXZutg@mail.gmail.com>
 <6ADC1A9D-72C9-47A5-BDC7-94C14ED34379@gmail.com>
 <CAA93jw5drf2MXGS6jxsY0DBS1ZgmQ3oE5EBhtXTqyZ46vFHGTg@mail.gmail.com>
 <alpine.DEB.2.02.1605092003130.6517@nftneq.ynat.uz>
Date: Mon, 9 May 2016 21:59:19 -0700
Message-ID: <CAA93jw48JU3FFJDtwvX_3AwbFYQUbzah0iWcXt6Ko1aHD2AP5g@mail.gmail.com>
From: Dave Taht <dave.taht@gmail.com>
To: David Lang <david@lang.hm>
Cc: Jonathan Morton <chromatix99@gmail.com>,
 make-wifi-fast@lists.bufferbloat.net, 
 "ath9k-devel@lists.ath9k.org" <ath9k-devel@venema.h4ckr.net>,
 Randell Jesup <rjesup@mozilla.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Make-wifi-fast] Diagram of the ath9k TX path
X-BeenThere: make-wifi-fast@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: <make-wifi-fast.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/make-wifi-fast>,
 <mailto:make-wifi-fast-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/make-wifi-fast>
List-Post: <mailto:make-wifi-fast@lists.bufferbloat.net>
List-Help: <mailto:make-wifi-fast-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/make-wifi-fast>,
 <mailto:make-wifi-fast-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Tue, 10 May 2016 04:59:20 -0000

This is a very good overview, thank you. I'd like to take apart
station behavior on wifi with a web application... as a straw man.

On Mon, May 9, 2016 at 8:41 PM, David Lang <david@lang.hm> wrote:
> On Mon, 9 May 2016, Dave Taht wrote:
>
>> On Mon, May 9, 2016 at 7:25 PM, Jonathan Morton <chromatix99@gmail.com>
>> wrote:
>>>
>>>
>>>> On 9 May, 2016, at 18:35, Dave Taht <dave.taht@gmail.com> wrote:
>>>>
>>>> should we always wait a little bit to see if we can form an aggregate?
>>>
>>>
>>> I thought the consensus on this front was “no”, as long as we’re making
>>> the decision when we have an immediate transmit opportunity.
>>
>>
>> I think it is more nuanced than how david lang has presented it.
>
>
> I have four reasons for arguing for no speculative delays.
>
> 1. airtime that isn't used can't be saved.
>
> 2. lower best-case latency
>
> 3. simpler code
>
> 4. clean and gradual service degradation under load.
>
> the arguments against are:
>
> 5. throughput per ms of transmit time is better if aggregation happens than
> if it doesn't.
>
> 6. if you don't transmit, some other station may choose to before you would
> have finished.
>
> #2 is obvious, but with the caveat that anytime you transmit you may be
> delaying someone else.
>
> #1 and #6 are flip sides of each other. we want _someone_ to use the
> airtime, the question is who.
>
> #3 and #4 are closely related.
>
> If you follow my approach (transmit immediately if you can, aggregate when
> you have a queue), the code really has one mode (plus queuing). "If you have
> a Transmit Opportunity, transmit up to X packets from the queue", and it
> doesn't matter if it's only one packet.
>
> If you delay the first packet to give you a chance to aggregate it with
> others, you add in the complexity and overhead of timers (including
> cancelling timers, slippage in timers, etc) and you add "first packet, start
> timers" mode to deal with.
>
> I grant you that the first approach will "saturate" the airtime at lower
> traffic levels, but at that point all the stations will start aggregating
> the minimum amount needed to keep the air saturated, while still minimizing
> latency.
>
> I then expect that application related optimizations would then further
> complicate the second approach. there are just too many cases where small
> amounts of data have to be sent and other things serialize behind them.
>
> DNS lookup to find a domain to then do a 3-way handshake to then do a
> request to see if the <web something> library has been updated since last
> cached (repeat for several libraries) to then fetch the actual page content.
> All of these things up to the actual page content could be single packets
> that have to be sent (and responded to with a single packet), waiting for
> the prior one to complete. If you add a few ms to each of these, you can
> easily hit 100ms in added latency. Once you start to try and special-case
> these sorts of things, the code complexity multiplies.

Take web page parsing as an example. The first request is a DNS
lookup. The second request is an HTTP GET (which can include a few more
round trips for negotiating SSL). Next comes a flurry of page parsing, in
which the web browser tries to schedule its requests as best it can and
sends out the relevant DNS and TCP flows, and then, typically, several
seconds of data transfer across each set of flows.
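
To put rough numbers on that serialized early phase, here is a
back-of-the-envelope sketch. The exchange counts and the 5ms hold time
are made-up assumptions for illustration, not measurements, and
added_latency_ms is a hypothetical helper name.

```python
def added_latency_ms(serialized_exchanges: int, hold_ms: float) -> float:
    """Extra latency when every packet in a serialized chain is held back.

    Each exchange is one request plus one response. This assumes both the
    station and the AP hold each packet for hold_ms before sending it, so
    the delay is paid twice per exchange.
    """
    return serialized_exchanges * 2 * hold_ms


# Hypothetical chain: 1 DNS lookup, 1 TCP handshake, 2 SSL round trips,
# 1 HTTP GET, 5 "has this library changed?" checks.
exchanges = 1 + 1 + 2 + 1 + 5
print(added_latency_ms(exchanges, 5.0))
```

With these assumed numbers, ten serialized exchanges at 5ms of hold
time in each direction add up to 100ms before the real transfer starts.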

Page paint is bound by getting the critical portions of the resulting
data parsed and laid out properly.

Now, I'd really like that early phase to be optimized by APs with
something more like SQF, where a station that appears and does a few
packet exchanges gets priority over stations taking big flows
on a more regular basis, so it more rapidly gets into flow balance
with the other stations.
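
A minimal sketch of that kind of SQF-style choice, assuming the AP keeps
one queue per station and measures backlog in packets (a real scheduler
would more likely count bytes or airtime). pick_station and the station
names are hypothetical, not anything from ath9k or mac80211.

```python
from collections import deque
from typing import Deque, Dict, Optional


def pick_station(queues: Dict[str, Deque]) -> Optional[str]:
    """Return the station with the shortest non-empty queue, or None.

    A station that has just appeared with a packet or two is served
    ahead of stations sitting on a long backlog of bulk traffic.
    """
    backlogged = {sta: q for sta, q in queues.items() if q}
    if not backlogged:
        return None
    return min(backlogged, key=lambda sta: len(backlogged[sta]))


queues = {
    "bulk-download": deque(range(40)),   # long-running big flow
    "new-web-client": deque(["dns"]),    # just showed up, one packet
    "idle": deque(),                     # nothing to send
}
print(pick_station(queues))
```

Here the new web client wins the next transmit slot, and once its queue
drains the bulk station gets served again.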

(and then, for most use cases, like web, exits)

The second phase, of actual transfer, is also bound by RTT. I have no
idea how much thought wifi folk actually put into typical web transfer
delays (20-80ms), but they are there...

...

The idea of the wifi driver waiting a bit to form a better aggregate
to fit into a txop ties into two slightly different timings and flow
behaviors.

If it is taking 10ms to get a txop in the first place, taking more
time to assemble a good batch of packets to fit into "your" txop would
be good.

If it is taking 4ms to transfer your last txop, well, more packets may
arrive for you in that interval, and feed into your existing flows to
keep them going, if you defer feeding the hardware with them.
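
As a toy model of deferring that decision: build the aggregate only at
the moment the txop is granted, from whatever has arrived by then. The
arrival times, the 32-packet cap, and build_aggregate itself are
assumptions for illustration, not driver code.

```python
from typing import List


def build_aggregate(arrival_times_ms: List[float],
                    txop_time_ms: float,
                    max_packets: int) -> List[float]:
    """Return the arrival times of the packets sent in one txop.

    Packets are committed only when the txop is granted, so everything
    that arrived while waiting for the air joins the aggregate, up to
    max_packets. arrival_times_ms is assumed to be sorted.
    """
    ready = [t for t in arrival_times_ms if t <= txop_time_ms]
    return ready[:max_packets]


arrivals = [0.0, 2.0, 5.0, 9.0, 12.0]
print(build_aggregate(arrivals, 10.0, 32))  # busy air: txop after 10ms
print(build_aggregate(arrivals, 0.0, 32))   # idle air: txop immediately
```

With a 10ms wait for the air, four packets go out together without any
added timer; with idle air the first packet goes out alone at once.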

Also, classic tcp acking goes out the window with competing acks at layer 2.

I don't know if quic can do the equivalent of stretch acks...

but one layer 3 ack, block acked by layer 2 in wifi, suffices... if
you have a ton of tcp acks outstanding, block acking them all is
expensive...

> So I believe that the KISS approach ends up with a 'worse is better'
> situation.

Code is going to get more complex anyway, and there are other
optimizations that could be made.

One item I realized recently is that part of codel need not run on
every packet in every flow for stuff destined to fit into a single
txop. It is sufficient to see if it declared a drop on the first
packet in a flow destined for a given txop.

You can then mark that entire flow (in a txop) as droppable (QoSNoAck)
within that txop (as it is within an RTT, and even losing all the
packets there will only cause the rate to halve).
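
A rough sketch of that idea, assuming the packets for one txop are
already selected and each carries a flow id and a sojourn time. The
predicate passed in is a stand-in for codel's real drop decision (which
also tracks an interval and drop state), and mark_txop and the no_ack
flag are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Packet = Tuple[str, float]           # (flow_id, sojourn_ms)
Marked = Tuple[str, float, bool]     # (flow_id, sojourn_ms, no_ack)


def mark_txop(packets: List[Packet],
              codel_says_drop: Callable[[float], bool]) -> List[Marked]:
    """Consult the drop decision once per flow per txop.

    Only the first packet of each flow in the txop is checked; the
    verdict is then applied to every packet of that flow in the txop.
    """
    verdict: Dict[str, bool] = {}
    marked: List[Marked] = []
    for flow_id, sojourn_ms in packets:
        if flow_id not in verdict:
            verdict[flow_id] = codel_says_drop(sojourn_ms)
        marked.append((flow_id, sojourn_ms, verdict[flow_id]))
    return marked


# Stand-in predicate: sojourn time above a 5ms target. Not real codel.
over_target = lambda sojourn_ms: sojourn_ms > 5.0
txop = [("a", 12.0), ("a", 3.0), ("b", 1.0), ("b", 9.0)]
for entry in mark_txop(txop, over_target):
    print(entry)
```

Flow "a" is marked droppable for the whole txop because its first packet
tripped the check, while flow "b" is left alone even though a later
packet of its own is over the target.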

>
> David Lang


-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org