[Cerowrt-devel] Fwd: Throughput regression with `tcp: refine TSO autosizing`
Avery Pennarun
apenwarr at google.com
Sat Jan 31 22:06:59 EST 2015
I would argue that insofar as the bufferbloat project has made a
difference, it's because there was a very clear message and product:
- here's what sucks when you have bufferbloat
- here's how you can detect it
- here's how you can get rid of it
- by the way, here's which of your competitors are already beating you at it.
It turns out you don't need a standards org in order to push any of
the above things. The IEEE exists to make sure things interop at the
MAC/PHY layer. The IETF exists to make sure things interop at the
higher layers. But bufferbloat isn't about interop, it's just a thing
that happens inside gateways, so it's not something you can really
write standards about. It is something you can turn into a
competitive advantage (or disadvantage, if you're a straggler).
...but meanwhile, if we want to fix bufferbloat in wifi, nobody
actually knows how to do it, so we are still at steps 1 and 2.
This is why we're funding Dave to continue work on netperf-wrapper.
Those diagrams of latency under load are pretty convincing. The
diagrams of page load times under different levels of latency are even
more convincing. First, we prove there's a problem and a way to
measure the problem. Then hopefully more people will be interested in
solving it.
On Sat, Jan 31, 2015 at 4:51 PM, <dpreed at reed.com> wrote:
> I think we need to create an Internet-focused 802.11 working group that
> would be to the "OS wireless designers and IEEE 802.11 standards groups" as
> the WHATWG was to W3C.
>
>
>
> W3C was clueless about the real world at the point the WHATWG was created.
> And the WHATWG was a "revenge of the real" against W3C - advancing a wide
> variety of important practical innovations rather than attending endless
> standards meetings with people who were not focused on solving actually
> important problems.
>
>
>
> It took a bunch of work to get the WHATWG going, and it offended W3C, which
> became unhelpful. But the approach actually worked - we now have a Web that
> really uses browser-side expressivity, and that would never have happened if
> W3C had been left to its own devices.
>
>
>
> The WiFi consortium was an attempt to wrest control of pragmatic direction
> from 802.11 and the proprietary-divergence folks at Qualcomm, Broadcom,
> Cisco, etc. But it failed, because it became thieves on a raft, more
> focused on picking each other's pockets than on actually addressing the big
> issues.
>
>
>
> Jim has seen this play out in the Linux community around X. Though there
> are lots of interests who would benefit by moving the engineering ball
> forward, everyone resists action because it means giving up the chance at
> dominance, and the central group is far too weak to do anything beyond
> adjudicating the worst battles.
>
>
>
> When I say "we" I definitely include myself (though my time is limited due
> to other commitments and the need to support my family), but I would only
> play with people who actually are committed to making stuff happen - which
> includes raising hell with the vendors if need be, but also effective
> engineering steps that can achieve quick adoption.
>
>
>
> Sadly, and I think it is manageable at the moment, there are moves out there
> being made to get the FCC to "protect" WiFi from "interference". The current
> one came from Marriott, which petitioned the FCC for a rule making it legal
> to disrupt and block the use of WiFi in guests' rooms in their hotels,
> except through the hotel's own access points. This also needs some technical
> defense. I believe any issues with WiFi performance in actual Marriott
> hotels are due to bufferbloat in their hotel-wide systems, just as the GoGo
> issues are. But it's possible that queueing problems in their own WiFi gear
> are bad as well.
>
>
>
> I mention this because it is related, and to the layperson, or
> non-radio-knowledgeable executive, indistinguishable. It will take away the
> incentive to actually fix the 802.11 implementations to be better
> performing, making the problem seem to be a "management" issue that can be
> solved by making WiFi less interoperable and less flexible by rules, rather
> than by engineering.
>
>
>
> However, solving the problems of hotspot networks and hotel networks is
> definitely a "real world" issue, and quite along the same lines you mention,
> Dave. FQ is almost certainly a big deal both in WiFi and in the
> distribution networks behind WiFi. Co-existence is also a big deal
> (RTS/CTS-like mechanisms can go a long way to remediate hidden-terminal
> disruption of the basic protocols). Roaming and scaling need work as well.
>
>
>
> It would even be a good thing to invent pragmatic ways to provide "low rate"
> subnets and "high rate" subnets that can coexist, so that compatibility with
> ancient "b" networks need not be maintained on all nets, at great cost -
> just send beacons at a high rate, so that the "b" NICs can't see them....
> but you need pragmatic stack implementations.
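>
> As a rough illustration of what that could look like on a Linux AP running
> hostapd (an assumption here - the option names below are from hostapd.conf,
> rates are in units of 100 kbps, and the exact values should be checked
> against your version), trimming the advertised and basic rate sets to
> OFDM-only values keeps "b"-only NICs from ever joining:
>
>   # hostapd.conf fragment (illustrative): drop the 802.11b DSSS rates
>   hw_mode=g
>   supported_rates=60 90 120 180 240 360 480 540
>   basic_rates=60 120 240
>   # beacons and broadcasts now go out at a 6 Mbps OFDM basic rate,
>   # which "b"-only NICs cannot decode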
>
>
>
> But the engineering is not the only challenge. The other challenge is to
> take the initiative and get stuff deployed. In the case of bufferbloat, the
> grade currently is a "D" for deployments, maybe a "D-". Beautiful technical
> work, but the economic/business/political side of things has been poor.
> Look at how slow the IETF has been to achieve anything (the perfect is truly
> the enemy of the good, and Dave Clark's "rough consensus and running code"
> has been replaced by technocratic malaise, and by what appears to me to be a
> class of people who love traveling the world to a floating cocktail party
> without getting anything important done).
>
>
>
> The problem with communications is that you can't just ship a product with a
> new "feature", because the innovation only works if widely adopted. Since
> there is no "Linux Desktop" (and Linus hates the idea, to a large extent)
> Linux can't be the sole carrier of the idea. You pretty much need iOS and
> Android both to buy in or to provide a path for easy third-party upgrades.
> How do you do that? Well, that's where the WHATWG-type approach is
> necessary.
>
>
>
> I don't know if this can be achieved, and there are lots of details to be
> worked out. But I'll play.
>
>
>
>
> On Saturday, January 31, 2015 4:05pm, "Dave Taht" <dave.taht at gmail.com>
> said:
>
> I would like to have somehow assembled all the focused resources to make a
> go at fixing wifi, or at least to have had a f2f with a bunch of people in
> the late March timeframe. This message of mine to linux-wireless bounced for
> some reason, and I am off to log out for 10 days, so...
> see the relevant netdev thread also for more details.
>
> ---------- Forwarded message ----------
> From: Dave Taht <dave.taht at gmail.com>
> Date: Sat, Jan 31, 2015 at 12:29 PM
> Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
> To: Arend van Spriel <arend at broadcom.com>
> Cc: linux-wireless <linux-wireless at vger.kernel.org>, Michal Kazior
> <michal.kazior at tieto.com>, Eyal Perry <eyalpe at dev.mellanox.co.il>, Network
> Development <netdev at vger.kernel.org>, Eric Dumazet <eric.dumazet at gmail.com>
>
>
> The wifi industry as a whole has vastly bigger problems than achieving
> 1500Mbits in a faraday cage on a single flow.
>
> I encourage you to try tests in netperf-wrapper that explicitly test for
> latency under load, and in particular, the RTT_FAIR tests against 4 or more
> stations on a single wifi AP. You will find the results very depressing.
> Similarly, on your previous test series, a latency figure would have been
> nice to have. I just did a talk at NZNOG, where I tested the local wifi with
> less than a megabit of throughput and 3 seconds of latency, filmed here:
>
> https://plus.google.com/u/0/107942175615993706558/posts/CY8ew8MPnMt
>
> I do wish more folks were testing in busy real-world environments, like
> coffee shops, cities... really, anywhere outside a faraday cage!
>
> I am not attending netconf - I was unable to raise funds to go, and the
> program committee wanted something "new" instead of the preso I gave to the
> IEEE 802.11 working group back in September.
> (
> http://snapon.lab.bufferbloat.net/~d/ieee802.11-sept-17-2014/11-14-1265-00-0wng-More-on-Bufferbloat.pdf
> )
>
> I was very pleased with the results of that talk - the day after I gave it,
> the phrase "test for latency" showed up in a bunch of 802.11ax (the next
> generation after ac) documents. :) Still, we are stuck with the train wreck
> that is 802.11ac glommed on top of 802.11n, glommed on top of 802.11g, in
> terms of queue management, terrible uses of airtime, rate control and other
> stuff. Aruba and Meraki, in particular, took a big interest in what I'd
> outlined in the preso above (we have a half dozen less-well-baked ideas -
> that's just the easy stuff that can be done to improve wifi). I gave a
> followup at Meraki but I don't think that's online.
>
> Felix (nbd) is on vacation right now, as am I. In fact, I am going
> somewhere for a week totally lacking internet access.
>
> Presently the plan, with what budget (none) and time (very little) we have,
> is to produce a pair of proof-of-concept implementations for per-TID
> queueing (see the relevant commit by nbd), leveraging the new minstrel
> stats, the new minstrel-blues stuff, an aggregation-aware codel with a
> calculated target based on the most recently active stations, and a bunch of
> the other stuff outlined above at IEEE.
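>
> To make the "calculated target" idea concrete, here is a purely illustrative
> python sketch (not the PoC code - the function name, inputs, and the scaling
> rule are assumptions for illustration only): scale codel's target with the
> expected airtime of one aggregate times the number of recently active
> stations.
>
>   # Illustrative only: derive a codel target from per-aggregate airtime
>   # (via a minstrel-style rate estimate) and the active station count.
>   def codel_target_us(active_stations, agg_bytes, est_rate_mbps,
>                       floor_us=5000):
>       # Mbit/s == bits per microsecond, so bits / Mbit/s gives microseconds.
>       airtime_us = (agg_bytes * 8) / est_rate_mbps
>       return max(floor_us, int(active_stations * airtime_us))
>
>   # 4 active stations, 64 x 1500-byte frames per aggregate at 50 Mbit/s:
>   # one aggregate is ~15.4 ms of airtime, so the target becomes ~61 ms,
>   # far above the 5 ms default that assumes an uncontended wired link.
>   print(codel_target_us(4, 64 * 1500, 50))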
>
> It is my hope that this will start to provide accurate back pressure (or
> sufficient lack thereof for TSQ), to also improve throughput while still
> retaining low latency. But it is a certainty that we will run into more
> cross layer issues that will be a pita to resolve.
>
> Perhaps we can put together a meetup around or during ELC in California in
> March?
>
> I am really not terribly optimistic about anything other than the 2 chipsets
> we can hack on (ath9k, mt76). Negotiations to get Qualcomm to open up their
> ath10k firmware have thus far failed, nor has an ath10k-lite gotten
> anywhere. Perhaps Broadcom would be willing to open up their firmware
> sufficiently to build in a better API?
>
> A bit more below.
>
>
> On Jan 30, 2015 5:59 AM, "Arend van Spriel" <arend at broadcom.com> wrote:
>>
>> On 01/30/15 14:19, Eric Dumazet wrote:
>>>
>>> On Fri, 2015-01-30 at 11:29 +0100, Arend van Spriel wrote:
>>>
>>>> Hi Eric,
>>>>
>>>> Your suggestions are still based on the fact that you consider wireless
>>>> networking to be similar to ethernet, but as Michal indicated there are
>>>> some fundamental differences starting with CSMA/CD versus CSMA/CA. Also
>>>> the medium conditions are far from comparable.
>
> The analogy I now use for it is that switched ethernet is generally your
> classic "dumbbell" topology. Wifi is more like a "taxi-stand" topology. If
> you think about how people queue up at a taxi stand (and sometimes agree to
> share a ride), the inter-arrival and departure times of a taxi stand make
> for a better mental model.
>
> Admittedly, I seem to spend a lot of time, waiting for taxis, thinking about
> wifi.
>
>>>> There is no shielding, so
>>>> it needs to deal with interference and dynamically drops the link rate,
>>>> so transmission of packets can take several milliseconds. Then with 11n
>>>> they came up with aggregation, which sends up to 64 packets in a single
>>>> transmit over the air at a worst case of 6.5 Mbps (if I am not mistaken).
>>>> The parameter value for tcp_limit_output_bytes of 131072 means that it
>>>> allows queuing for about 1ms on a 1Gbps link, but I hope you can see
>>>> this is not realistic for dealing with all variances of the wireless
>>>> medium/standard. I suggested this as a topic for the wireless workshop in
>>>> Ottawa [1], but I cannot attend. I still hope that there will be some
>>>> discussions to get more awareness.
>
> I have sometimes hoped that TSQ could be made more a function of the number
> of active flows exiting an interface, but Eric tells me that's impossible.
> This is possibly another case where TSQ could usefully be a callback
> function... but frankly I care not a whit about maximizing single flow tcp
> throughput on wifi in a faraday cage.
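>
> For the record, the arithmetic behind the numbers being thrown around here
> is easy to check (a quick python sketch; 6.5 Mbit/s is the worst-case
> aggregate rate Arend mentions above):
>
>   # How long does tcp_limit_output_bytes' worth of data occupy the link?
>   LIMIT_BYTES = 131072
>
>   def drain_time_ms(rate_mbps, limit_bytes=LIMIT_BYTES):
>       return limit_bytes * 8 / (rate_mbps * 1e6) * 1e3
>
>   print(drain_time_ms(1000))   # ~1.05 ms at 1 Gbit/s
>   print(drain_time_ms(40000))  # ~0.026 ms (26 usec) at 40 Gbit/s
>   print(drain_time_ms(6.5))    # ~161 ms at a 6.5 Mbit/s worst-case rate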
>
>
>>>
>>> Ever heard about bufferbloat ?
>>
>>
>> Sure. I am trying to get awareness about that in our wireless
>> driver/firmware development teams. So bear with me.
>>
>>
>>> Have you read my suggestions and tried them ?
>>>
>>> You can adjust the limit per flow to pretty much whatever you want. If you
>>> need 64 packets, just do the math. If in 2018 you need 128 packets, do the
>>> math again.
>>>
>>> I am very well aware that wireless wants aggregation, thank you.
>
> I note that a lot of people testing this are getting it backwards. Usually
> it is the AP that is sending lots and lots of big packets, where the return
> path is predominantly acks from the station.
>
> I am not a huge fan of stretch acks, but certainly a little bit of thinning
> doesn't bother me on the return path there.
>
> Going the other way, particularly in a wifi world that insists on treating
> every packet as sacred (which I don't agree with at all), thinning acks can
> help, but single-stream throughput is of interest only on benchmarks. FQing
> as much as possible all the flows destined for the station in each aggregate
> masks loss and reduces the need to protect everything so much.
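>
> A toy python sketch of what I mean by FQing the flows headed to a station
> within its aggregate (illustrative only - not mac80211 code; the names and
> the 64-frame cap are just assumptions for the example):
>
>   from collections import deque
>
>   def fill_aggregate(flow_queues, max_frames=64):
>       # Round-robin one frame at a time across this station's per-flow
>       # queues, so no single flow dominates the aggregate.
>       agg = []
>       active = deque(q for q in flow_queues if q)
>       while active and len(agg) < max_frames:
>           q = active.popleft()
>           agg.append(q.popleft())
>           if q:
>               active.append(q)
>       return agg
>
>   # Three flows with very different backlogs still share the burst:
>   flows = [deque(['a'] * 100), deque(['b'] * 3), deque(['c'] * 10)]
>   print(fill_aggregate(flows, max_frames=8))
>   # -> ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b']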
>
>>
>> Sorry if I offended you. I was just giving these as an example, combined
>> with the effective rate usable on the medium, to say that the bandwidth is
>> more dynamic in wireless and as such needs dynamic changes of queue depth.
>> Now this can be done by making the fraction size used in your suggestion
>> adaptive to these conditions.
>
> Well... see above. Maybe this technique will do more of the right thing,
> but... go test.
>
>
>>
>>> 131072 bytes of queue on 40Gbit is not 1ms, but 26 usec of queueing, and
>>> we get line rate nevertheless.
>>
>>
>> I was saying it was about 1ms on *1Gbit* as the wireless TCP rates are
>> moving into that direction in 11ac.
>>
>>
>>> We need this level of shallow queues (BQL, TSQ), to get very precise rtt
>>> estimations so that TCP has good entropy for its pacing, even in the 50
>>> usec rtt ranges.
>>>
>>> If we allowed 1ms of queueing, then a 40Gbit flow would queue 5 MBytes.
>>>
>>> This was terrible, because it increased cwnd and all sender queues to
>>> insane levels.
>>
>>
>> Indeed, and that is what we would like to address in our wireless drivers.
>> I will set up some experiments using the fraction sizing and post my
>> findings. Again, sorry if I offended you.
>
> You really, really, really need to test at rates below 50 Mbit, and with
> other stations active, while doing this. It's not going to be a linear curve.
>
>
>
>>
>> Regards,
>> Arend
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo at vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Dave Täht
>
> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks