From: Sebastian Moeller
Date: Fri, 27 Jan 2017 15:49:20 +0100
To: Eric Dumazet
Cc: Dave Täht, bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Recommendations for fq_codel and tso/gso in 2017

Hi Eric,

quick question from the peanut gallery: on a typical home router with a
1 Gbps internal and a << 100 Mbps external interface, will giant packets
be generated by the 1 Gbps interface (with acceptable latency)? I ask
because what makes sense on a 1000 Mbps ingress link might still block a
20 Mbps WAN egress link slightly longer than one would like (to the tune
of 50 ms, just based on the bandwidth ratio?).
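A rough back-of-the-envelope check of that bandwidth-ratio argument, as a
sketch in shell arithmetic (the 64 KB super-packet and the 1514-byte frame
size are assumptions, not numbers from any particular driver):

    # Rates below are written in bits per microsecond (= Mbit/s), so
    # each result is a serialisation time in microseconds.

    # one 64 KB GSO/GRO super-packet on a 20 Mbit/s egress link: ~26 ms
    echo $(( 64 * 1024 * 8 / 20 ))

    # the same 64 KB leaving a 1 Gbit/s interface: ~0.5 ms
    echo $(( 64 * 1024 * 8 / 1000 ))

    # the 1 Mbit/s IW10 case discussed below, ten full-size 1514-byte
    # frames: ~121 ms
    echo $(( 10 * 1514 * 8 / 1 ))

So a super-packet sized for the fast side of the router can occupy the
slow side's wire for tens of milliseconds, scaling roughly with the
bandwidth ratio.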
Best Regards
	Sebastian

> On Jan 27, 2017, at 15:40, Eric Dumazet wrote:
> 
> On Thu, 2017-01-26 at 23:55 -0800, Dave Täht wrote:
>> 
>> On 1/26/17 11:21 PM, Hans-Kristian Bakke wrote:
>>> Hi
>>> 
>>> After having had some issues with inconsistent tso/gso configuration
>>> causing performance issues for sch_fq with pacing in one of my
>>> systems, I wonder if it is still recommended to disable gso/tso for
>>> interfaces used with fq_codel qdiscs and shaping using HTB etc.
>> 
>> At lower bandwidths gro can do terrible things. Say you have a 1 Mbit
>> uplink, and IW10. (At least one device (mvneta) will synthesise 64k of
>> gro packets.)
>> 
>> A single IW10 burst from one flow injects 130 ms of latency.
> 
> That is simply a sign of something bad happening at the source.
> 
> The router will spend too much time trying to fix the TCP sender by
> smoothing things.
> 
> Let's fix the root cause, instead of making everything slow or burning
> megawatts.
> 
> GRO aggregates trains of packets for the same flow, in a sub-millisecond
> window.
> 
> Why? Because GRO cannot predict the future: it cannot know when the next
> interrupt might come from the device saying: here are some additional
> packet(s). Maybe the next packet is coming in 5 seconds.
> 
> Take a look at napi_poll():
> 
> 1) If the device driver called napi_complete(), all packets are flushed
> (given) to the upper stack. No packet will wait in GRO for additional
> segments.
> 
> 2) Under flood (we exhausted the napi budget and did not call
> napi_complete()), we make sure no packet can sit in GRO for more than
> 1 ms.
> 
> Only when the device is under flood and the CPU cannot drain the RX
> queue fast enough can GRO aggregate packets more aggressively, and the
> size of GRO packets exactly fits the CPU budget.
> 
> In a nutshell, GRO is exactly the mechanism that adapts the packet sizes
> to the available CPU power.
> 
> If your CPU is really fast, then it will dequeue one packet at a time
> and GRO won't kick in.
> 
> So the real problem here is that some device drivers implemented poor
> interrupt mitigation logic, inherited from other OSes that had no GRO
> and _had_ to implement their own crap, hurting latencies.
> 
> Make sure you disable interrupt mitigation, and leave GRO enabled.
> 
> e1000e is notoriously bad for interrupt mitigation.
> 
> At Google, we let the NIC send its RX interrupt ASAP.
> 
> Every usec matters.
> 
> So the model for us is very clear: use GRO and TSO as much as we can,
> but make sure the producers (TCP senders) are smart and control their
> burst sizes.
> 
> Think about 50 Gbit and 100 Gbit, and really the question of having TSO
> and GRO or not is simply moot.
> 
> Even at 1 Gbit, GRO helps to reduce CPU cycles and thus reduce
> latencies.
> 
> Adding a sysctl to limit GRO max size would be trivial, I already
> mentioned that, but nobody cared enough to send a patch.
> 
>> 
>>> 
>>> If there is a trade-off, at which bandwidth does it generally make
>>> more sense to enable tso/gso than to have it disabled when doing
>>> HTB-shaped fq_codel qdiscs?
>> 
>> I stopped caring about tuning params at > 40 Mbit / < 10 Gbit, or
>> rather, trying to get below 200 usec of jitter|latency. (Others care.)
>> 
>> And: my expectation was generally that people would ignore our
>> recommendations on disabling offloads!
>> 
>> Yes, we should revise the sample sqm code and recommendations for a
>> post-gigabit era to not bother with changing network offloads. Were you
>> modifying the old debloat script?
>> 
>> TBF & sch_cake do peeling of gro/tso/gso back into packets, and then
>> interleave their scheduling, so GRO is both helpful (transiting the
>> stack faster) and harmless, at all bandwidths.
>> 
>> HTB doesn't peel. We just ripped out hfsc for sqm-scripts (too buggy),
>> also. Leaving: tbf + fq_codel, htb + fq_codel, and cake models there.
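For concreteness, a minimal sketch of the two pieces of advice above, with
a placeholder device name (eth0) and rate (20 Mbit); whether a NIC honours
the coalescing settings depends on its driver, and the last line assumes
the sch_cake qdisc is installed:

    # Eric's advice: leave the offloads on, but turn interrupt
    # mitigation down so GRO only merges what arrives within one
    # NAPI poll.
    ethtool -K eth0 gro on gso on tso on
    ethtool -C eth0 rx-usecs 0 rx-frames 1

    # Dave's two remaining sqm shaping models: HTB + fq_codel
    # (HTB does not peel super-packets) ...
    tc qdisc replace dev eth0 root handle 1: htb default 10
    tc class add dev eth0 parent 1: classid 1:10 htb rate 20mbit
    tc qdisc add dev eth0 parent 1:10 fq_codel

    # ... or cake, which peels gro/tso/gso back into packets itself.
    tc qdisc replace dev eth0 root cake bandwidth 20mbit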
>> ...
>> 
>> Cake is coming along nicely. I'd love a test in your 2 Gbit bonding
>> scenario, particularly in a per-host fairness test, at line or shaped
>> rates. We recently got cake working well with nat.
>> 
>> http://blog.cerowrt.org/flent/steam/down_working.svg (ignore the
>> latency figure, the 6 flows were to spots all over the world)
>> 
>>> Regards,
>>> Hans-Kristian
> 
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat