From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avery Pennarun
Date: Sat, 31 Jan 2015 22:06:59 -0500
To: David Reed
Cc: Andrew McGregor, Jesper Dangaard Brouer, Matt Mathis, "cerowrt-devel@lists.bufferbloat.net", Jonathan Morton, Tim Shepard
Subject: Re: [Cerowrt-devel] Fwd: Throughput regression with `tcp: refine TSO autosizing`
In-Reply-To: <1422741065.199624134@apps.rackspace.com>

I would argue that insofar as the bufferbloat project has made a
difference, it's because there was a very clear message and product:

- here's what sucks when you have bufferbloat
- here's how you can detect it
- here's how you can get rid of it
- by the way, here's which of your competitors are already beating you at it.

It turns out you don't need a standards org in order to push any of the
above things. The IEEE exists to make sure things interop at the
MAC/PHY layer. The IETF exists to make sure things interop at the
higher layers. But bufferbloat isn't about interop; it's just a thing
that happens inside gateways, so it's not something you can really
write standards about. It is something you can turn into a competitive
advantage (or a disadvantage, if you're a straggler).

...but meanwhile, if we want to fix bufferbloat in wifi, nobody
actually knows how to do it, so we are still at steps 1 and 2.
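Steps 1 and 2 come down to measuring latency under load: compare round-trip times on an idle link with round-trip times while the link is saturated, and the difference is the bloat. A minimal sketch of that comparison (the `induced_latency_ms` helper and the RTT samples are made up for illustration; a real test such as netperf-wrapper collects the samples while bulk flows run):

```python
# Sketch: quantify bufferbloat as the latency induced by load.
# The RTT samples here are invented for illustration; a real tool
# gathers them by pinging while bulk TCP flows saturate the link.

def induced_latency_ms(idle_rtts, loaded_rtts):
    """Median loaded RTT minus median idle RTT, in milliseconds."""
    def median(xs):
        s = sorted(xs)
        n = len(s)
        mid = n // 2
        return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2
    return median(loaded_rtts) - median(idle_rtts)

idle = [12.0, 11.5, 12.3, 11.8, 12.1]          # ms, link idle
loaded = [480.0, 510.0, 495.0, 505.0, 520.0]   # ms, link saturated

print(f"induced latency: {induced_latency_ms(idle, loaded):.1f} ms")
```

A few milliseconds of induced latency is healthy queueing; hundreds of milliseconds (or the seconds seen on real wifi below) is bufferbloat.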
This is why we're funding Dave to continue work on netperf-wrapper.
Those diagrams of latency under load are pretty convincing. The
diagrams of page load times under different levels of latency are even
more convincing. First we prove there's a problem and a way to measure
the problem; then, hopefully, more people will be interested in solving it.

On Sat, Jan 31, 2015 at 4:51 PM, wrote:
> I think we need to create an Internet-focused 802.11 working group that
> would be to the "OS wireless designers and IEEE 802.11 standards groups" as
> the WHATWG was to W3C.
>
> W3C was clueless about the real world at the point the WHATWG was created.
> And the WHATWG was a "revenge of the real" against W3C - advancing a wide
> variety of important practical innovations rather than attending endless
> standards meetings with people who were not focused on solving actually
> important problems.
>
> It took a bunch of work to get the WHATWG going, and it offended W3C, which
> became unhelpful. But the approach actually worked - we now have a Web that
> really uses browser-side expressivity, and that would never have happened
> if W3C had been left to its own devices.
>
> The WiFi consortium was an attempt to wrest control of pragmatic direction
> from 802.11 and the proprietary-divergence folks at Qualcomm, Broadcom,
> Cisco, etc. But it failed, because it became thieves on a raft, more
> focused on picking each others' pockets than on actually addressing the big
> issues.
>
> Jim has seen this play out in the Linux community around X. Though there
> are lots of interests who would benefit by moving the engineering ball
> forward, everyone resists action because it means giving up the chance at
> dominance, and the central group is far too weak to do anything beyond
> adjudicating the worst battles.
> When I say "we" I definitely include myself (though my time is limited due
> to other commitments and the need to support my family), but I would only
> play with people who actually are committed to making stuff happen - which
> includes raising hell with the vendors if need be, but also effective
> engineering steps that can achieve quick adoption.
>
> Sadly - though I think it is manageable at the moment - there are moves out
> there being made to get the FCC to "protect" WiFi from "interference". The
> current one came from Marriott, who asked the FCC for a rule making it
> legal to disrupt and block use of WiFi in people's rooms in their hotels,
> except with their own access points. This also needs some technical
> defense. I believe any issues with WiFi performance in actual Marriott
> hotels are due to bufferbloat in their hotel-wide systems, just as the
> issues with GoGo are. But it's possible that queueing problems in their own
> WiFi gear are bad as well.
>
> I mention this because it is related and, to the layperson or
> non-radio-knowledgeable executive, indistinguishable. It will take away the
> incentive to actually fix the 802.11 implementations to perform better,
> making the problem seem to be a "management" issue that can be solved by
> making WiFi less interoperable and less flexible by rule, rather than by
> engineering.
>
> However, solving the problems of hotspot networks and hotel networks is
> definitely a "real world" issue, and quite along the same lines you
> mention, Dave. FQ is almost certainly a big deal both in WiFi and in the
> distribution networks behind WiFi. Co-existence is also a big deal
> (RTS/CTS-like mechanisms can go a long way toward remediating
> hidden-terminal disruption of the basic protocols). Roaming and scaling
> need work as well.
> It would even be a good thing to invent pragmatic ways to provide "low
> rate" subnets and "high rate" subnets that can coexist, so that
> compatibility with ancient "b" networks need not be maintained on all nets,
> at great cost - just send beacons at a high rate, so that the "b" NICs
> can't see them... but you need pragmatic stack implementations.
>
> But the engineering is not the only challenge. The other challenge is to
> take the initiative and get stuff deployed. In the case of bufferbloat, the
> grade currently is a "D" for deployments, maybe a "D-". Beautiful technical
> work, but the economic/business/political side of things has been poor.
> Look at how slow the IETF has been to achieve anything (the perfect is
> truly the enemy of the good, and Dave Clark's "rough consensus and working
> code" has been replaced by technocratic malaise, and by what appears to me
> to be a class of people who love traveling the world to a floating cocktail
> party without getting anything important done).
>
> The problem with communications is that you can't just ship a product with
> a new "feature", because the innovation only works if widely adopted. Since
> there is no "Linux Desktop" (and Linus hates the idea, to a large extent),
> Linux can't be the sole carrier of the idea. You pretty much need iOS and
> Android both to buy in, or to provide a path for easy third-party upgrades.
> How do you do that? Well, that's where the WHATWG-type approach is
> necessary.
>
> I don't know if this can be achieved, and there are lots of details to be
> worked out. But I'll play.
>
> On Saturday, January 31, 2015 4:05pm, "Dave Taht" said:
>
> I would like to have somehow assembled all the focused resources to make a
> go at fixing wifi, or at least to have had a f2f with a bunch of people in
> the late March timeframe. This message of mine to linux-wireless bounced
> for some reason, and I am off to log out for 10 days, so...
> see the relevant netdev thread also for more details.
>
> ---------- Forwarded message ----------
> From: Dave Taht
> Date: Sat, Jan 31, 2015 at 12:29 PM
> Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
> To: Arend van Spriel
> Cc: linux-wireless, Michal Kazior, Eyal Perry, Network Development,
> Eric Dumazet
>
> The wifi industry as a whole has vastly bigger problems than achieving
> 1500Mbits in a faraday cage on a single flow.
>
> I encourage you to try tests in netperf-wrapper that explicitly test for
> latency under load, and in particular the RTT_FAIR tests against 4 or more
> stations on a single wifi AP. You will find the results very depressing.
> Similarly, on your previous test series, a latency figure would have been
> nice to have. I just did a talk at NZNOG, where I tested the local wifi
> with less than a mbit of throughput and 3 seconds of latency, filmed here:
>
> https://plus.google.com/u/0/107942175615993706558/posts/CY8ew8MPnMt
>
> I do wish more folk were testing in busy real-world environments, like
> coffee shops, cities... really, anywhere outside a faraday cage!
>
> I am not attending netconf - I was unable to raise funds to go, and the
> program committee wanted something "new" instead of the preso I gave the
> IEEE 802.11 working group back in September:
> http://snapon.lab.bufferbloat.net/~d/ieee802.11-sept-17-2014/11-14-1265-00-0wng-More-on-Bufferbloat.pdf
>
> I was very pleased with the results of that talk - the day after I gave it,
> the phrase "test for latency" showed up in a bunch of 802.11ax (the next
> generation after ac) documents. :) Still, we are stuck with the train wreck
> that is 802.11ac glommed on top of 802.11n, glommed on top of 802.11g, in
> terms of queue management, terrible uses of airtime, rate control and other
> stuff.
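(An aside on the arithmetic behind figures like "3 seconds of latency" here, and the tcp_limit_output_bytes numbers further down the thread: buffered bytes and link rate convert directly into queueing delay. A minimal illustrative sketch; the helper names are mine, and the rates and the 131072-byte figure are the ones quoted in this thread:)

```python
# Back-of-envelope queueing math used throughout this thread:
# a fixed amount of buffered data takes (bytes * 8 / rate) seconds to drain.

def drain_time_s(buffered_bytes, rate_bps):
    """Seconds to drain `buffered_bytes` at `rate_bps` bits/sec."""
    return buffered_bytes * 8 / rate_bps

def buffered_bytes(delay_s, rate_bps):
    """Bytes of standing queue implied by a drain delay at a given rate."""
    return delay_s * rate_bps / 8

# tcp_limit_output_bytes = 131072 (the figure debated below):
print(drain_time_s(131072, 1e9))    # about 1 ms at 1 Gbit
print(drain_time_s(131072, 40e9))   # about 26 usec at 40 Gbit

# "3 seconds of latency" on a ~1 Mbit wifi link implies roughly:
print(buffered_bytes(3.0, 1e6))     # 375000.0 bytes of standing queue
```

The same formula explains why a byte limit tuned for gigabit ethernet behaves so differently on a wifi link whose rate swings by orders of magnitude.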
> Aruba and Meraki, in particular, took a big interest in what I'd outlined
> in the preso above (we have a half dozen less well-baked ideas - that's
> just the easy stuff that can be done to improve wifi). I gave a followup at
> Meraki, but I don't think that's online.
>
> Felix (nbd) is on vacation right now, as am I. In fact I am going somewhere
> for a week totally lacking internet access.
>
> Presently the plan, with what budget (none) and time (very little) we have,
> is to produce a pair of proof-of-concept implementations for per-tid
> queuing (see the relevant commit by nbd), leveraging the new minstrel
> stats, the new minstrel-blues stuff, and an aggregation-aware codel with a
> calculated target based on the most recently active stations, plus a bunch
> of the other stuff outlined above at the IEEE.
>
> It is my hope that this will start to provide accurate backpressure (or a
> sufficient lack thereof for TSQ), to also improve throughput while still
> retaining low latency. But it is a certainty that we will run into more
> cross-layer issues that will be a pita to resolve.
>
> Can we put together a meetup around or during ELC in California in March?
>
> I am really not terribly optimistic about anything other than the 2
> chipsets we can hack on (ath9k, mt76). Negotiations to get Qualcomm to open
> up their ath10k firmware have thus far failed, nor has an ath10k-lite got
> anywhere. Perhaps Broadcom would be willing to open up their firmware
> sufficiently to build in a better API?
>
> A bit more below.
>
> On Jan 30, 2015 5:59 AM, "Arend van Spriel" wrote:
>>
>> On 01/30/15 14:19, Eric Dumazet wrote:
>>>
>>> On Fri, 2015-01-30 at 11:29 +0100, Arend van Spriel wrote:
>>>
>>>> Hi Eric,
>>>>
>>>> Your suggestions are still based on the fact that you consider wireless
>>>> networking to be similar to ethernet, but as Michal indicated there are
>>>> some fundamental differences, starting with CSMA/CD versus CSMA/CA.
>>>> Also the medium conditions are far from comparable.
>
> The analogy I now use for it is that switched ethernet is generally your
> classic "dumbbell" topology. Wifi is more like a "taxi-stand" topology. If
> you think about how people queue up at a taxi stand (and sometimes agree to
> share a ride), the inter-arrival and departure times of a taxi stand make
> for a better mental model.
>
> Admittedly, I seem to spend a lot of time waiting for taxis, thinking
> about wifi.
>
>>>> There is no shielding, so it needs to deal with interference and
>>>> dynamically drops the link rate, so transmission of packets can take
>>>> several milliseconds. Then with 11n they came up with aggregation, which
>>>> sends up to 64 packets in a single transmit over the air, at worst case
>>>> 6.5 Mbps (if I am not mistaken). The parameter value for
>>>> tcp_limit_output_bytes of 131072 means that it allows queuing for about
>>>> 1ms on a 1Gbps link, but I hope you can see this is not realistic for
>>>> dealing with all the variances of the wireless medium/standard. I
>>>> suggested this as a topic for the wireless workshop in Ottawa [1], but I
>>>> can not attend. Still, I hope that there will be some discussions to get
>>>> more awareness.
>
> I have sometimes hoped that TSQ could be made more a function of the number
> of active flows exiting an interface, but Eric tells me that's impossible.
>
> This is possibly another case where TSQ could become a callback
> function... but frankly I care not a whit about maximizing single-flow tcp
> throughput on wifi in a faraday cage.
>
>>> Ever heard about bufferbloat ?
>>
>> Sure. I am trying to get awareness about that in our wireless
>> driver/firmware development teams. So bear with me.
>>
>>> Have you read my suggestions and tried them ?
>>>
>>> You can adjust the limit per flow to pretty much whatever you want. If
>>> you need 64 packets, just do the math.
>>> If in 2018 you need 128 packets, do the math again.
>>>
>>> I am very well aware that wireless wants aggregation, thank you.
>
> I note that a lot of people testing this are getting it backwards. Usually
> it is the AP that is sending lots and lots of big packets, where the return
> path is predominately acks from the station.
>
> I am not a huge fan of stretch acks, but certainly a little bit of thinning
> doesn't bother me on the return path there.
>
> Going the other way, particularly in a wifi world that insists on treating
> every packet as sacred (which I don't agree with at all), thinning acks can
> help; but single-stream throughput is of interest only on benchmarks, and
> FQing as much as possible all the flows destined for the station in each
> aggregate masks loss and reduces the need to protect everything so much.
>
>> Sorry if I offended you. I was just giving these as examples, combined
>> with the effective rate usable on the medium, to say that the bandwidth is
>> more dynamic in wireless and as such needs dynamic change of queue depth.
>> Now this can be done by making the fraction size as used in your
>> suggestion adaptive to these conditions.
>
> Well... see above. Maybe this technique will do more of the right thing,
> but... go test.
>
>>> 131072 bytes of queue on 40Gbit is not 1ms, but 26 usec of queueing, and
>>> we get line rate nevertheless.
>>
>> I was saying it was about 1ms on *1Gbit*, as the wireless TCP rates are
>> moving in that direction with 11ac.
>>
>>> We need this level of shallow queues (BQL, TSQ) to get very precise rtt
>>> estimations so that TCP has good entropy for its pacing, even in the 50
>>> usec rtt ranges.
>>>
>>> If we allowed 1ms of queueing, then a 40Gbit flow would queue 5 MBytes.
>>>
>>> This was terrible, because it increased cwnd and all sender queues to
>>> insane levels.
>>
>> Indeed, and that is what we would like to address in our wireless drivers.
>> I will set up some experiments using the fraction sizing and post my
>> findings. Again, sorry if I offended you.
>
> You really, really, really need to test at rates below 50mbit, and with
> other stations, also while doing this. It's not going to be a linear curve.
>
>> Regards,
>> Arend
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Dave Täht
>
> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks