[Cerowrt-devel] squash/ignore DSCP and mangle table questions

leetminiwheat LeetMiniWheat at gmail.com
Wed Apr 22 06:17:15 EDT 2015

On Wed, Apr 22, 2015 at 5:19 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
> On Apr 22, 2015, at 02:28 , leetminiwheat <LeetMiniWheat at gmail.com> wrote:
>> Correction on P.S. section: 3 and 5t not 4 and 5t.
>         This is confusing me ;)

I just meant the switch VLAN config: configuring port 4 to anything
seems to break it. The default is 0 1 2 3 5t.

My idea was to segregate a VLAN on port 3. I'll have to give it some more thought.

> Please find attached the most recent sqm-scripts and the most recent luci-sqm (the relevant script). Please copy sqm-cbi.lua to:
> /usr/lib/lua/luci/model/cbi/sqm.lua and you should be as up to date as you can be. Make sure to also move all the files from the attached sqm-scripts folder to the matching folders on your cerowrt router; that should hopefully fix the leftover IFB issue to some degree (we currently do not clean up when an interface goes away, only when the interface gets upped again)

Thanks! much appreciated and easier than cherry picking from git,
especially since I don't have a buildroot set up at the moment

On Wed, Apr 22, 2015 at 5:03 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
> On Apr 22, 2015, at 02:19 , leetminiwheat <LeetMiniWheat at gmail.com> wrote:
>> Sorry this is getting a bit off-topic here.
>>> On Wed, Apr 15, 2015 at 5:05 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>>> On Apr 15, 2015, at 03:35 , leetminiwheat <LeetMiniWheat at gmail.com> wrote:
>>>> I assume tweaking ring parameters from default RX:128 and TX:32
>>>> doesn't matter anymore then?
>>>        As far as I know we leave that alone, see: http://www.bufferbloat.net/projects/bloat/wiki/Linux_Tips:
>>> “Set the size of the ring buffer for the network interface
>>> NOTE: THIS HACK IS NO LONGER NEEDED on many ethernet drivers in Linux 3.3, which has Byte Queue Limits instead, which does a far better job.”
>> I noticed Dave mentioned on a mailing list that changing tx ring still
>> does have some benefit, and his notes in debloat script suggest BQL
>> doesn't always work as implied.
>         Could you cite that note please? I cannot find it. @Dave, maybe you could comment on the note's applicability?

 * Byte Queue Limits is supposed to have a rate limiter that works.

It is not very effective at less than 100Mbit. I get ~32k peak there
and with GSO on, at 100Mbit, I have seen latency spikes of up to 70ms.

   (Not recently tested, however)

A per queue limit of 2-3 large packets appears to be the best
compromise at 100Mbit and below. So typically I hammer down BQL to
4.5k or 3k at < 100Mbit, and turn GSO/TSO off, and as a result see
ping against load latencies in the 1 to 2ms range, which is about
what you would expect. I have tried 1500 bytes, which limited the top
end performance to about 84Mbit.
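
To put the byte limits in that note into perspective, here is a minimal sketch of the arithmetic: the worst-case time a packet waits behind a full queue of a given size at a given line rate. The function name is made up for illustration, and reading "~32k" as roughly 32000 bytes is an assumption. It only covers serialization delay, not anything else the driver or hardware adds.

```python
def queue_delay_ms(limit_bytes: int, rate_mbit: float) -> float:
    """Time in milliseconds to drain limit_bytes at rate_mbit (Mbit/s)."""
    bits = limit_bytes * 8
    seconds = bits / (rate_mbit * 1_000_000)
    return seconds * 1000


if __name__ == "__main__":
    # 1500, 3000 and 4500 bytes are the limits mentioned in the note;
    # 32000 is an assumed reading of the "~32k peak".
    for limit in (1500, 3000, 4500, 32000):
        delay = queue_delay_ms(limit, 100)
        print(f"{limit:>6} bytes at 100 Mbit/s: {delay:.2f} ms")
```

At 100Mbit a 3k to 4.5k limit works out to a fraction of a millisecond of queueing, which fits the 1 to 2ms ping-under-load figures quoted above once the rest of the path is added.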

Can't seem to find the mailing list archive where Dave mentioned tx
ring still having some further benefit in addition to BQL.

>> Also, I tested some custom HFSC+fq_codel qos scripts here:
>> https://github.com/zcecc22/qos-nxt
>         Interesting, does his newer sqm-qos work better for you?
>> He defaults to 90% (meaning you have to adjust your b/w limits), and
>> the 2-bin codel doesn't seem to work very well. Seemed like an
>> interesting compromise between simple and simplest, but the results
>> were pretty terrible.
>         If you have a RRUL plot to share that would be helpful.

Actually, I linked the wrong QoS scripts, those were his old ones
which I haven't tested. These are the newer simplified ones:

nxt.qos plot: http://screencloud.net/v/jHza
nxt_v2.qos plot: http://screencloud.net/v/oe8x

Note: adjusted QoS caps to my full provisioned rate since these
scripts limit to 90% anyways (and I use 90% in

as you can see, lots of latency spikes. I'm not sure what these
scripts are intended to accomplish, perhaps they're more optimized for
lower speed connections. I haven't tested his older scripts but they
looked more advanced and even changed the tx rings and such, they're
located here:

>> I'd like to test CAKE more, but it seems
>> 3.10.50-1 doesn't have the required kernel support.
>> Recent changes in barrier breaker to txring seem pretty dumb, they
>> default to 256 txring now I believe, ticket here was closed with
>> "worksforme" https://dev.openwrt.org/ticket/13072 so I'm reluctant to
>> upgrade, plus I don't fully understand the extent to which Dave's
>> kernel hacks impact things. Closer inspection/comparison/diffs are
>> needed if I'm to upgrade and integrate Cero's tweaks.
>         I am probably off here, but I assume that with a properly sized BQL the actual tx ring does not matter for latency anymore and can be selected to keep the hardware happy ;)
>> Oddly enough, simplest.qos on WAN gives me better throughput/latency
>> at times (likely due to less overhead), but other times simple.qos is
>> doing what it should and giving more throughput and lower latency to
>> higher priority traffic. I seem to get better RRUL tests with LIMIT=
>> blank, and ILIMIT/ELIMIT set to auto which results in this:
>         Hmm, LIMIT defaults to 1001 basically so we can see that it was not set, but left at its default. I believe that at one time limit defaulted to 10240 packets which was too large for a wndr3700, that’s why I changed that to 1001, but I have a hard time believing that the difference between 1001 and 1024 packets should affect RRUL noticeably… but as always if the data supports that notion I am willing to adapt...

Yeah, I'm not entirely sure... I ran about 20 tests last night with
different settings but didn't extensively test auto vs 1001. I do see
this in the output, though that is likely the wireless:
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1536
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1536
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1536
SQM: egress shaping activated
SQM: Perform DSCP based filtering on ingress. (3-tier classification)
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1192
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1192
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1192
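
For comparing runs, a small sketch like the following can pull the target and bandwidth values out of output shaped like the lines above. The regular expression is inferred only from this sample, the function name is made up, and the unit of cur_bandwidth is not stated in the log, so the values are kept as plain integers.

```python
import re

# Sample lines copied from the output above.
SAMPLE = """\
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1536
SQM: egress shaping activated
SQM: Perform DSCP based filtering on ingress. (3-tier classification)
SQM: get_limit: CURLIMIT:
SQM: cur_target: auto cur_bandwidth: 1192
"""

# Pattern inferred from the sample; other sqm-scripts versions may differ.
LINE_RE = re.compile(r"cur_target:\s*(\S+)\s+cur_bandwidth:\s*(\d+)")


def parse_bandwidths(log: str) -> list[tuple[str, int]]:
    """Return (cur_target, cur_bandwidth) pairs in order of appearance."""
    return [(m.group(1), int(m.group(2))) for m in LINE_RE.finditer(log)]


if __name__ == "__main__":
    for target, bandwidth in parse_bandwidths(SAMPLE):
        print(target, bandwidth)
```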

>> image of RRUL 45s graph here with simple.qos, no tx changes, auto
>> LIMIT on FiOS 32/25 (30Mb/22.5Mb QoS): https://screencloud.net/v/tVV0
>> - looks pretty good to me,
>         So what is a bit weird is that you see the priority banding in the download direction (upper plot) but not in the egress direction, for IPv4 this typically looks different, were you by any chance using IPv6 for this test?

This was purely ipv4, verizon still has not rolled out ipv6 and I try
to remove as much of it from Cero as possible. I did notice after the
fact that the upload was a little inconsistent but not entirely sure
what that can be attributed to. I speedtest at 25mbps upload, and rate
limit to 22.5mbps, but verizon could be screwing with it. I did test
20mbps as well but didn't see any noticeable differences. I was mostly
looking at the latency and separation between priority queues. I may
run some more at lower speed again, my upload can be inconsistent
depending on destination (verizon often routes me through really cheap
backbone routes, as a result I need to ssh tunnel my gaming traffic
through a friend's server at the datacenter he works at)
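
As a quick sanity check on the numbers, here is a minimal sketch of how a shaper rate follows from a measured rate and a safety fraction. The helper name and the kbit/s output unit are assumptions for illustration, not something taken from sqm-scripts.

```python
def shaped_rate_kbit(measured_mbit: float, fraction: float = 0.9) -> int:
    """Shaper rate in kbit/s for a measured rate in Mbit/s and a fraction of it."""
    return round(measured_mbit * 1000 * fraction)


if __name__ == "__main__":
    # 25 Mbit/s measured upload at 90% gives the 22.5 Mbit/s limit used here.
    print("upload:", shaped_rate_kbit(25), "kbit/s")
    # 90% of the 32 Mbit/s download would be 28.8 Mbit/s.
    print("download at 90%:", shaped_rate_kbit(32), "kbit/s")
    # The 30 Mbit/s download setting corresponds to 30/32, about 94%.
    print("download at 30/32:", shaped_rate_kbit(32, 30 / 32), "kbit/s")
```

The same helper makes it easy to see what happens when a script applies its own 90% on top of an already reduced rate: 90% of 22.5 Mbit/s is about 20 Mbit/s.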

>> I never top out my wireless
>> since it's used only for mobile phones anyways and I run HT20 which
>> seems to be more reliable/less latency.
>         As compared to ht40? On 5GHz or on 2.4GHz?

probably applies to both, but HT40 has more overhead.

I heard some talk about cleaning up WiFi recently though, including
fast-tx paths but at the moment it only works in client/ap mode with
single client. recent kernel changes regarding WiFi seem based on some
really bad assumptions though.

>> however my guest wifi I do
>> need to limit and segregate via firewall so I have it enabled.
>         Silly question, why do you need to limit your guest wifi? At most I would disallow the guest the use of the highest priority queue (sort of keeping it as a poor man’s control plane), but otherwise let them have their fair share of the full bandwidth after all they are my guests ;) (or alternatively put that traffic into the BK queue so they will never really delay your traffic)

prioritization doesn't always work, as we can see with bittorrent
traffic. and it's not so much for "guests" but more of a hotspot since
I'm in a densely populated suburban area. Also I don't want to
encourage them to leech off me if they have their own faster WiFi, or
can just get their own home internet. I know some may disagree with me
but it IS my portion of the internet that *I* pay for. I pay probably
way more than I should for my fiber service and it seems to have
latency issues upstream with >200 simultaneous connections sending or
receiving data, something that might not ever be avoidable using QoS
on my end. I know I'm on GPON probably on a fiber switch/hub thingy
(forget the name) with 32-64 other customers, about a mile from the
nearest main office.

>> port4 is unused and likely internally
>> reserved for unknown purposes. I'm still trying to figure out how it
>> maps an interface to an actual port, since I'd like to assign a single
>> switch port to its own subnet for my FiOS router instead of
>> having to use a secondary router to clone the ge00 interface on the
>> backend router to forward FiOS ports to the verizon/FiOS MOCA bridge
>> router in order for alerts to display on set-top boxes such as caller
>> ID. There has to be a way of doing this without needing 3 routers...
>> My current thoughts are to remove the port (port3 in this case) from
>> the switch and make a new switch config with just 4 and 5t and somehow
>> make a new interface on that for the FiOS router, but assigning the
>> same ip address as the default gateway/route from ge00 on that port
>> might confuse routing. More information on their rather complicated
>> and seemingly unnecessary config with a backend router is located
>> here: http://www.dslreports.com/faq/verizonfios/3.0_Networking#16710
>         Again, this is far away from my area of expertise, but I would simply use a switch as poor man’s aggregation network between the ONT’s ethernet-port and the main-router; and then I would connect the actiontec’s wan port to the same switch. With a bit of vlan? configuration it should be possible to have the actiontec only see the main-router (to not confuse the ONT, but maybe that is not necessary), and use a fixed route from the main-router to the actiontec’s network with the devices you want to access. I guess at that level of abstraction I avoid all the pesky issues cropping up when trying to implement that ;)

Yeah that was another option I saw, of using a switch with dedicated
VLANs but I don't really have the hardware to do that. putting both on
a simple dumb switch sounds bad if they both request an IP upstream. I
will give it some more thought to somehow create a separate virtual
network inside the WNDR3800 just for the ISP router. somehow just port
forwarding alone doesn't do it.
