[Cerowrt-devel] Turris Omnia

Mon Nov 7 12:41:42 EST 2016

On 11/7/16 6:30 AM, James Cloos wrote:

> but with wshaper there were 8 rather than 4 lines matching parent for
> each interface.  There still are 8 for the 3 ethernet devices, but that
> doesn't seem to be a problem....

The armada chipset in the omnia has 8 hardware queues on the ethernet
devices, so you end up with 8 fq_codel instances on them.

Until recently they did not support BQL, so fq_codel was ineffective.
A patch was submitted upstream to address this; I don't know if it has
made it into openwrt yet. I believe it shaved 10-20ms of latency off a
saturated gigE network.

A second problem is that BQL does give the right hw queues an even share
of the queue, but only "eventually".

A third problem, even with BQL, is that you end up with the birthday
problem with so few queues - two flows can easily hash into the same hw
queue. I think the designer's original intent was not to have 8
equal-weight queues, but to have them apply to levels of QoS. A secondary
intent was to map queues to cores for processing. Providing consistently
low latency or network fairness was not part of the intent. :/
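
To put a rough number on the birthday problem with 8 hw queues, here is
a small sketch. It assumes each flow hashes uniformly and independently
into one of the queues, which is an idealization and not a description
of the actual driver hash; the function name is made up for
illustration.

```python
from fractions import Fraction


def collision_probability(flows, queues=8):
    """Chance that at least two of `flows` flows land in the same queue.

    Assumes a uniform, independent hash per flow (an idealization).
    """
    if flows > queues:
        return 1.0  # pigeonhole: more flows than queues must collide
    p_all_distinct = Fraction(1)
    for i in range(flows):
        # The i-th flow must avoid the i queues already taken.
        p_all_distinct *= Fraction(queues - i, queues)
    return float(1 - p_all_distinct)


for n in range(2, 6):
    print(f"{n} flows over 8 queues: {collision_probability(n):.3f}")
```

Under that assumption two flows already share a queue one time in
eight, and with four flows a collision is more likely than not.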

Last problem - and why cake may be important to some - was that they had
*aggressively* enabled (64k!) GRO offloads in the driver, which is
helpful on benchmarks, but really hurts sqm-scripts w/fq_codel in some
cases.
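
One way to see why a 64k aggregate hurts under a shaper: fq_codel
schedules it as a single packet, so everything else waits while it
drains at the shaped rate. The sketch below just does that arithmetic.
The shaped rates and the 1514-byte frame size are my assumptions for
illustration and are not taken from this thread.

```python
def serialization_delay_ms(size_bytes, rate_bits_per_s):
    """Time to send size_bytes at rate_bits_per_s, in milliseconds."""
    return size_bytes * 8 / rate_bits_per_s * 1000.0


GRO_AGGREGATE_BYTES = 64 * 1024  # the "64k" aggregate mentioned above
FRAME_BYTES = 1514               # assumption: one full-size ethernet frame

# Assumption: example shaped rates, picked only to show the scaling.
for rate_mbit in (10, 100, 1000):
    rate = rate_mbit * 1_000_000
    gro_ms = serialization_delay_ms(GRO_AGGREGATE_BYTES, rate)
    frame_ms = serialization_delay_ms(FRAME_BYTES, rate)
    print(f"{rate_mbit:>4} Mbit/s: 64k aggregate {gro_ms:.3f} ms, "
          f"single frame {frame_ms:.3f} ms")
```

At a 10 Mbit/s shaped rate a single 64k aggregate occupies the link for
roughly 52 ms, against about 1.2 ms for one ordinary frame.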

> 
> So this seems solved.

Yea!

I note that tc -s qdisc show tends to be more revealing. If the BQL and
GRO problems still exist, you generally will never see fq_codel mark or
drop packets on the ethernet devices, even if you drive two ports into one.
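
If you want to check this across all 8 instances at once, a rough
sketch along these lines pulls the drop and ECN mark counters out of
saved tc -s qdisc show output. The helper name is made up, and the
SAMPLE text is hand-written to resemble the usual layout; it is not a
capture from an Omnia, and the exact layout may differ between iproute2
versions.

```python
import re

# Hand-written sample for illustration only, NOT real output from an Omnia.
SAMPLE = """\
qdisc mq 0: dev eth0 root
 Sent 1000 bytes 10 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1000 bytes 10 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 1514 drop_overlimit 0 new_flow_count 3 ecn_mark 0
  new_flows_len 0 old_flows_len 0
"""


def fq_codel_counters(tc_output):
    """Return one dict per fq_codel instance with its drop/mark counters."""
    results = []
    current = None
    for line in tc_output.splitlines():
        header = re.match(r"qdisc (\S+) (\S+) dev (\S+)", line)
        if header:
            # A new qdisc block starts; counters on later lines belong to it.
            current = {"kind": header.group(1), "dev": header.group(3),
                       "dropped": 0, "ecn_mark": 0}
            results.append(current)
            continue
        if current is None:
            continue
        dropped = re.search(r"dropped (\d+)", line)
        if dropped:
            current["dropped"] = int(dropped.group(1))
        marked = re.search(r"ecn_mark (\d+)", line)
        if marked:
            current["ecn_mark"] = int(marked.group(1))
    return [r for r in results if r["kind"] == "fq_codel"]


for q in fq_codel_counters(SAMPLE):
    print(q["dev"], "dropped:", q["dropped"], "ecn_mark:", q["ecn_mark"])
```

Counters that stay at zero on every instance while the link is
saturated would be consistent with the queue building up somewhere
below fq_codel.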

> Thanks.
> 
> -JimC
>