Thank you for answering!

On 27 January 2017 at 08:55, Dave Täht <dave@taht.net> wrote:


On 1/26/17 11:21 PM, Hans-Kristian Bakke wrote:
> Hi
>
> After having had some issues with inconsistent tso/gso configuration
> causing performance issues for sch_fq with pacing in one of my systems,
> I wonder if it is still recommended to disable gso/tso for interfaces
> used with fq_codel qdiscs and shaping using HTB etc.

At lower bandwidths gro can do terrible things. Say you have a 1Mbit
uplink, and IW10. (At least one device (mvneta) will synthesise 64k of
gro packets)

A single IW10 burst from one flow injects 130ms of latency.
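
The arithmetic behind that figure can be sketched as below. This is only a rough illustration: the 1514-byte frame size is an assumption (1500-byte MTU plus Ethernet header), and the helper name is made up for the example. With these assumptions an IW10 burst comes out at about 121ms, in the same ballpark as the figure quoted above; the exact number depends on the per-packet overhead and the real uplink rate. A single 64k aggregate is far worse.

```python
def serialization_delay_ms(burst_bytes: int, link_bps: float) -> float:
    """Time to clock burst_bytes onto a link running at link_bps, in milliseconds."""
    return burst_bytes * 8 / link_bps * 1000.0


LINK_BPS = 1_000_000            # the 1Mbit uplink from the example
FRAME_BYTES = 1514              # assumption: 1500-byte MTU + 14-byte Ethernet header
IW10_BYTES = 10 * FRAME_BYTES   # ten full-size segments in one initial window
GRO_BYTES = 64 * 1024           # one 64k aggregate

iw10_ms = serialization_delay_ms(IW10_BYTES, LINK_BPS)
gro_ms = serialization_delay_ms(GRO_BYTES, LINK_BPS)

print(f"IW10 burst:        {iw10_ms:.0f} ms")
print(f"64k GRO aggregate: {gro_ms:.0f} ms")
```

Any other flow queued behind such a burst waits at least that long unless the qdisc splits the aggregate back into packets first.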

>
> If there is a trade-off, at which bandwidth does it generally make more
> sense to enable tso/gso than to have it disabled when doing HTB shaped
> fq_codel qdiscs?

I stopped caring about tuning params at > 40Mbit and < 10Gbit, or rather,
about trying to get below 200usec of jitter|latency. (Others care.)

And: My expectation was generally that people would ignore our
recommendations on disabling offloads!

Yes, we should revise the sample sqm code and recommendations for a post
gigabit era to not bother with changing network offloads. Were you
modifying the old debloat script?

I just picked it up from just about every bufferbloat script or introduction I have seen in the last 4 years.
In addition, in my own testing it seemed to bring the actual bandwidth of the shaped stream a little bit closer to the bandwidth I configured in HTB. If I remember correctly, that test was done on a symmetrical link shaped to around 25 Mbit/s, so I just took it for granted.

However, the fq pacing issue I had when a bond interface with tso and gso disabled sat on top of physical NICs with tso and gso enabled made me think that disabling tso and gso is perhaps not really expected behaviour for new implementations in the Linux network stack. Perhaps it works nicely for my shaping needs, but it may also cause other, less obvious issues.


TBF & sch_Cake do peeling of gro/tso/gso back into packets, and then
interleave their scheduling, so GRO is both helpful (transiting the
stack faster) and harmless, at all bandwidths.
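
As a toy model of why peeling helps (this is not how tbf or sch_cake are actually implemented; the function names, the 1448-byte segment size and the plain round-robin are all assumptions made for illustration), splitting an aggregate back into segments lets a small packet from another flow go out after one segment instead of after the whole 64k:

```python
from collections import deque


def peel(flow, size_bytes, mss=1448):
    """Split one aggregate into (flow, segment_bytes) tuples of at most mss bytes."""
    segments = []
    remaining = size_bytes
    while remaining > 0:
        seg = min(mss, remaining)
        segments.append((flow, seg))
        remaining -= seg
    return segments


def interleave(queues):
    """Round-robin one segment at a time across the per-flow queues."""
    order = []
    active = deque(deque(q) for q in queues if q)
    while active:
        q = active.popleft()
        order.append(q.popleft())
        if q:                      # flow still has segments: back of the line
            active.append(q)
    return order


bulk = peel("bulk", 64 * 1024)     # one 64k aggregate from a bulk flow
voip = peel("voip", 200)           # one small packet from a latency-sensitive flow
schedule = interleave([bulk, voip])
position = [flow for flow, _ in schedule].index("voip")

print(f"bulk segments: {len(bulk)}, voip packet sent at position {position}")
```

Without peeling, the small packet would sit behind the entire aggregate, which is the HTB case described next.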

HTB doesn't peel. We also just ripped out hfsc from sqm-scripts (too
buggy), leaving the tbf + fq_codel, htb + fq_codel, and cake models there.

...

Cake is coming along nicely. I'd love a test in your 2Gbit bonding
scenario, particularly in a per host fairness test, at line or shaped
rates. We recently got cake working well with nat.


Is this something I can do for you? This is a system in production. It is non-critical enough to play with some qdiscs and generate some bandwidth usage, but it is still in production. It is not really possible for me to remove all other traffic and factors that may interfere with the results (or is a real-life scenario perhaps the point?). But running a few scripts is no problem if that is what is required!

http://blog.cerowrt.org/flent/steam/down_working.svg (ignore the latency
figure, the 6 flows were to spots all over the world)

> Regards,
> Hans-Kristian
>
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
>