[Cerowrt-devel] Managed to break 802.11n (on a 3800)

Thu Jan 16 18:40:08 EST 2014

On Thu, Jan 16, 2014 at 6:15 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
> Hi Dave,
>
> thanks again.
>
> On Jan 16, 2014, at 23:50 , Dave Taht <dave.taht at gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Jan 16, 2014 at 3:10 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> Hi Aaron,
>>
>>
>> On Jan 16, 2014, at 20:08 , Aaron Wood <woody77 at gmail.com> wrote:
>>
>>>
>>>>> Sebastian, after sorting out the router, it's still biased, but far
>>>>> less
>>>>> so, about a 2:1 ratio between upload and download.
>>>>
>>>> So I often see 10:1 and worse @165Mbit/s raw wireless rate
>>>
>>> I get mixed results, but they aren't good.
>>
>> It's hard to comment on each graph in email, but I'll try.
>>
>> I generally run rrul with the --disable-log option. Log scales helped back when we were still
>> comparing against pfifo fast.
>
>         Good point, I had not thought about this and just sheepishly copied the netperf-wrapper invocation from a scratch buffer…. oops.

I think --disable-log should be the default... except that for
everyone not running an AQM the results they will get
will need log scales...

>
>>
>> The really bad download graph. "Crazy results"
>>
>> Download bandwidth is bad because the upload starts and fills the queue first, the download
>> has to wait to fill the queue and generally gets dropped earlier than the upload. This is
>> one of the many reasons I don't care for IW10….
>
>         But aren't those two different queues? I am confused...

One flow using TCP is sending 66 byte acks at the same time another
flow is sending full 1500 byte packets.

so if you have "2" TCP flows, you actually have 4 flows, 2 in each
direction, one with big packets, the other with
little.

So what happens in the fq_codel case with a 300 byte quantum (flows 1
and 3 are down, 2 and 4 are data up) is that you will see, roughly, in
a packet capture, something like this:

FLOW  ACK  DATA
1     1
1     1
1     1
1     1
2          1
3     1
3     1
3     1
3     1
4          1
1     1
1     1
1     1
1     1
3     1
3     1
3     1
3     1
1     1
1     1
1     1
1     1
3     1
3     1
3     1
3     1
(repeat until you've served up 1500 bytes from each small flow, then
deliver the 1500 byte packet)
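
A rough way to reproduce this kind of interleaving is a small simulation.
This is a minimal sketch of textbook deficit round robin, not the fq_codel
code itself: it ignores CoDel dropping and fq_codel's new/old flow lists.
The function names (drr_schedule, demo) and the queue depths are made up
for illustration; the 66 and 1500 byte sizes and the 300 byte quantum are
the ones mentioned above.

```python
from collections import deque

ACK_BYTES = 66     # ack size mentioned above
DATA_BYTES = 1500  # full-size data packet mentioned above


def drr_schedule(queues, quantum, rounds):
    """Textbook deficit round robin over {flow_id: deque of packet sizes}.

    Returns the list of (flow_id, size) tuples in the order they are sent.
    """
    deficits = {flow: 0 for flow in queues}
    sent = []
    for _ in range(rounds):
        for flow, queue in queues.items():
            if not queue:
                deficits[flow] = 0  # idle flows do not bank credit
                continue
            deficits[flow] += quantum
            # Send head packets for as long as the byte credit covers them.
            while queue and queue[0] <= deficits[flow]:
                size = queue.popleft()
                deficits[flow] -= size
                sent.append((flow, size))
            if not queue:
                deficits[flow] = 0
    return sent


def demo(quantum=300, rounds=5):
    # Hypothetical backlog: flows 1 and 3 carry acks, flows 2 and 4 carry data.
    queues = {
        1: deque([ACK_BYTES] * 100),
        2: deque([DATA_BYTES] * 100),
        3: deque([ACK_BYTES] * 100),
        4: deque([DATA_BYTES] * 100),
    }
    return drr_schedule(queues, quantum, rounds)


if __name__ == "__main__":
    for flow, size in demo():
        print(flow, "ACK" if size == ACK_BYTES else "DATA")
```

In this simplified model each ack flow sends a burst of four or five acks
per round, and a 1500 byte packet only goes out once its flow has
accumulated five quanta of credit. Raising the quantum argument shortens
that wait.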

It's more complicated than this as you typically ack only every other packet...
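
To put a number on that: with one 66 byte ack per two 1500 byte packets,
the ack stream is about 2% of the data rate in the other direction. A
back-of-the-envelope sketch (the function name and the 100 Mbit/s figure
are just illustrations, and the sizes are the ones used above):

```python
ACK_BYTES = 66     # ack size used above
DATA_BYTES = 1500  # full-size data packet used above


def ack_rate_mbit(data_rate_mbit, segments_per_ack=2):
    """Ack traffic (Mbit/s) generated in the reverse direction by a data flow."""
    return data_rate_mbit * ACK_BYTES / (DATA_BYTES * segments_per_ack)


if __name__ == "__main__":
    print(ack_rate_mbit(100))     # acking every other packet
    print(ack_rate_mbit(100, 1))  # acking every packet
```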

nfq_codel is different in that it rotates the flow queue on every
packet, much like SFQ does, but still pays attention to the quantum:

FLOW  ACK  DATA
1     1
2          1
3     1
4          1
1     1
3     1
1     1
3     1
1     1
1     1         # serve up 1500 bytes of acks
2          1
4          1

SFQ would look like this:

FLOW  ACK  DATA
1     1
2          1
3     1
4          1
1     1
2          1
3     1
4          1
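
The strict per-packet rotation drawn here is easy to sketch too. Note
this is only the one-packet-per-flow-per-turn idea from the diagram, not
the Linux sch_sfq implementation (which also hashes flows, perturbs the
hash and tracks a byte allotment). The names round_robin and byte_share
are hypothetical helpers.

```python
from collections import deque

ACK_BYTES = 66     # ack size used above
DATA_BYTES = 1500  # full-size data packet used above


def round_robin(queues, packets):
    """Serve one packet per backlogged flow per turn, up to `packets` packets."""
    sent = []
    active = deque(flow for flow in queues if queues[flow])
    while active and len(sent) < packets:
        flow = active.popleft()
        sent.append((flow, queues[flow].popleft()))
        if queues[flow]:
            active.append(flow)  # still backlogged, goes to the back of the line
    return sent


def byte_share(sent):
    """Fraction of the total bytes that each flow got."""
    totals = {}
    for flow, size in sent:
        totals[flow] = totals.get(flow, 0) + size
    grand_total = sum(totals.values())
    return {flow: total / grand_total for flow, total in totals.items()}


if __name__ == "__main__":
    queues = {
        1: deque([ACK_BYTES] * 50),
        2: deque([DATA_BYTES] * 50),
        3: deque([ACK_BYTES] * 50),
        4: deque([DATA_BYTES] * 50),
    }
    sent = round_robin(queues, 8)
    print([flow for flow, _ in sent])
    print(byte_share(sent))
```

Per-packet fairness like this gives every flow the same number of
packets, so the flows with 1500 byte packets end up with nearly all of
the bytes.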

I've long meant to port over the sfq_codel implementation from ns2 but haven't
got round to it.

>> The upload gets better slowly due to how slow tcp is ramping up over the half-duplex
>> wifi channel.
>
>         Yepp, my sentiment as well, the sharing between up and down sucks badly. I naively assumed that cero would sort of manage TX-ops and share these equally between its own sending needs and the remote station… I guess wifi is too complicated (and I had thought last mile wired connectivity was wonderfully weird...)

One day... I view most of the problems modern-day wifi has as
solvable...  and critical to society that we fix them.

>
>>
>>
>>
>>
>>       I just checked again and I get crazy results for both RRUL and RRUL_NOCLASSIFICATION:
>> <rrul_noclassification_macbook_2_cerowrt_5GHz.png><rrul_macbook_2_cerowrt_5GHz.png>
>>
>> in both cases I get ~ 10:1 out-in imbalance.
>>
>> I think that with a larger quantum on the AP they will be in less imbalance, and you should try nfq_codel also.
>
>         So for this I would modify the debloat script, correct?

look at the nfq_codel_ll definitions in the file.

There is an SFQ model in there too that might be interesting to try.

the big problem is that no matter how much we twiddle up at this level
there is still far too much queuing in the
driver itself to see much effect (unless you hold the qlen vars at the
levels they are at now)

Ideally all the fq and aqm stuff happens *just before* an aggregate
gets created.

>>
>> The larger quantum will also hurt, too.... right answer has always been per station queues.
>
>         Which I will happily test once they are implemented :)

been 3 years since we started discussing it and looking for funding.

>
>>
>>
>> And even crazier just had one rrul where both in and out came up almost perfectly at 1:1
>
>         Thinking of it again, it might have been a case of really really low total bandwidth, so until this reoccurs I think it is a fluke...
>
>>
>>
>> Hmm. Wifi is weird, isn't it? It's not like ethernet at all. Too bad the universe insists on trying to
>> defy the laws of physics by trying to make it act like ethernet….
>
>         Oh, there was one news blurb last year about going full duplex on wifi, which might help to make wifi behave closer to what people nowadays associate with ethernet...
>
>>
>>
>> Interestingly, the classification really works in giving different bandwidth to the different classes. (And in rrul_noclassification, where the still-classified UDP probes make it through, the EF flow gets shorter latencies…).
>>
>> having 4 full queues and a txop each is far worse than 1 queue with better aggregation, IMHO.
>
>         So, the one queue would need to shave off all TOS (excuse my occasional shouting, but all caps is the quickest way to avoid auto correction turning my English even funnier), and have say HTB (or god forbid prio) keep some semblance of priority on the packets instead of letting wifi do its "let's waste a few tx ops" thing. Is it just me or should wifi basically get a better tx-ops scheduler?
>
>>
>>
>>       Note that measuring through cerowrt to a wired host (with too restrictive firewall settings) I get:
>> <rrul_macbook_2_cerowrt_2_happy-horse_5GHz.png><rrul_noclassification_macbook_2_cerowrt_2_happy-horse_5GHz.png>
>>
>> You are seeing the upload ramp up along tcp's lines and the download ramp down as it gets progressively more starved.
>
>         The sum seems constant, so yes.
>
>
>>
>>
>> with the MacBooks uplink still dominant (actually continually getting more bandwidth…).
>>
>> Well, you only have X bandwidth, in the air, total. A better way of saying it might be the macbook is taking better advantage
>> of its txops to ship more data in an aggregate.
>
>         Mmh, it looks like it gets more tx-ops or cero gets increasingly bad in filling its tx-ops, no?
>
>>
>> Since my only wireless connected machines are macs and nobody else complained about this issue, I assume it is an osx issue.
>>
>> I honestly think that aside from benchmarks, bandwidth is irrelevant on wifi. Lower latency is something that you
>> actually feel, and when accessing the web or doing a videoconference, that's the part that matters.
>
>         Oh, sure, and my quick and dirty real world test (bidirectional data transfer initiated from the macbook) turned out quite usable and balanced. And I only see this on the local net where wireless is the bottleneck. (Silly idea: all I need to do is switch the wired machine to 10Mbit ethernet and I will be fine :) )
>
>
>>
>> it IS possible to get the best of both worlds, but that's going to take some driver rework.
>>
>>
>> <PastedGraphic-2.tiff>
>> For comparison an RRUL test from the wired linux host to cerowrt, where things look much better...
>>
>>
>>
>>> IIRC, apple really changed something about the media access in 10.8, I'll look into that.  And see if my wife will let me install netperf on her laptop (I think it's still running 10.7)
>>
>>       Yeah, good question whether this is the same in all macosx versions. (Sooner or later I will switch to 10.9 and repeat the measurements…) The saving grace is that I usually either upload or download at home between my 2 computers, so I rarely feel the full force of this unfortunate macosx behavior. Just checked using SMB to copy a file to the wired machine and from the wired machine at the same time: it nicely splits the bandwidth evenly between up and download, so this might be netperf related...
>>
>>
>> Single threaded tests will generally work ok, which is why nobody has complained before... and which is why rrul exists, to beat things up with torrent-like, web-like, videoconferencing-like and voip-like behaviors.
>
>         And beating up it does ;)
>
> best regards
>         Sebastian
>
>
>
>>
>>>
>>>
>>>>> Also, my understanding was that with rts/cts, the router was in control
>>>>> of
>>>>> that aspect of things?
>>>>
>>>>     That is what I thought as well, but it is not what I see with osx 10.8.
>>>>
>>>
>>> It may be a case of the station aggressively asking to send, and the AP granting instead of sending data to the station that's waiting.
>>
>>       I think we agree that the AP should show more self-confidence and reject such requests more firmly :)
>>
>>
>>>
>>> It should be clear in a monitor-mode tcpdump (or a statistical summary of packets).
>>
>>       I am not really equipped to do this, with just one wireless notebook at my disposal :)
>>
>> best
>>       Sebastian
>>
>>
>> Now that you have your laptops running this stuff (AWESOME thank you!)
>>
>> If I can encourage you all to go outside, to like your nearest cybercafe, or conference center, and run your rrul tests there...
>>
>> ... you'll find out just how bad the rest of the world is... (and get some good data).
>>
>> I routinely see 4-6 seconds of latency, and a bare megabit or two of actual bandwidth. The ONLY place I've ever had decent wifi performance in a busy area has been at ietf conferences....
>>
>>
>>>
>>> --Aaron
>>
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel at lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
>>
>>
>>
>> --
>> Dave Täht
>>
>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>
