[Bloat] Getting started with sqm-scripts - latency good, bandwidth decimated
Alan Jenkins
alan.christopher.jenkins at gmail.com
Wed Jan 20 07:06:55 EST 2016
On 20/01/16 11:43, moeller0 wrote:
>> On Jan 20, 2016, at 11:12 , Alan Jenkins <alan.christopher.jenkins at gmail.com> wrote:
>>
>> On 19/01/2016, Brandon Applegate <brandon at burn.net> wrote:
>>> Disclaimer: if this is the wrong list for such a question - let me know.
>>> This is specifically about the sqm-scripts package...
>>>
>>> Hello,
>>>
>>> I’ve been reading all I can on the bufferbloat website and also trying to
>>> understand the evolution of the various scripts (debloat, sqm, etc).
>>>
>>> I managed to get sqm-scripts on my firewall (Ubuntu linux on a PC - no *wrt
>>> etc). Got it built with the ‘linux’ platform. Since this is Ubuntu 12.04 -
>>> I had to cheat a bit and pull down the iproute2 source from 14.04. I’ve
>>> tweaked the main sqm script to reflect this for the tc binary - this is
>>> working. I also updated my kernel to a later version that supports
>>> fq_codel.
>>>
>>> My topology is ‘on a stick’. I have one gig interface to a managed switch,
>>> on which are eth0.666 (outside/wan) and eth0.10 (inside).
>>>
>>> I have 30/5 cable service, and have tried both those values as well as 90%
>>> in my /etc/sqm/*conf file.
>>>
>>> I’ve tried both eth0 (raw/parent interface) as well as eth0.666.
>>>
>>> No matter what I do - my bandwidth is 10% of what it should be. I get
>>> approx. 3/4mbit down + 2/3mbit up on dslreports speedtest. Bufferbloat
>>> looks great though - A+.
>>>
>>> Is there something inherent I’m doing wrong ? Something to do with my ‘on a
>>> stick’ topology biting me ? Kernel version (Ubuntu’s 3.13.0-74-generic
>>> btw).
>>>
>>> Thanks in advance for any help or info (or pointer to a more appropriate
>>> list).
>> It doesn't sound like you're doing anything wrong :(.
>>
>> I would make sure to check the rates on `tc class show dev eth0.666`
>> (and ifb4eth0.666). Switching to `simplest.qos` could be easier to
>> debug. With your simple.qos, there'll be several traffic classes...
>> the `root` should be the specified `rate`, and it looks like all
>> classes save 1:11 should have a `ceil` just under the specified rate.
>>
>> Not sure how to debug qos-scripts itself. However the Gentoo wiki has
>> a 50-line script, which was corrected by dtaht :). Like simplest.qos
>> this has a single class.
>> https://wiki.gentoo.org/wiki/Traffic_shaping
>>
>> That would let you investigate the commands finely, as well as the
>> resulting state shown by `tc qdisc` and `tc class`, and really narrow
>> it down.
>>
>> `dslreports.com` will show bandwidth and latency-under-load in each
>> direction independently, so you could work on a single direction. I
>> would look at ingress only (the IFB) since that's where your bandwidth
>> decimation is so visible. E.g. just comment out the egress section,
>> to avoid distractions.
> It should be sufficient to set the egress rate to 0 then, as for sqm zero denotes no shaping and not a rate of zero kbps (good luck using TCP on a purely unidirectional link...)
>
>> I think you can run the htb without the fq_codel command at the end -
>> that is, it will default to a massive fifo, which will replace the
>> fq_codel in the output of `tc qdisc`, but to a first approximation it
>> will affect bandwidth.
> Simply put
> QDISC=pfifo_fast
> into the .conf file for eth0.666 to test this.
>
> Best Regards
> Sebastian
No offense to your work on sqm-scripts.

1) In case the problem is *outside* of sqm-scripts, it could be useful
to try the minimum commands necessary to demonstrate the problem. Maybe
it's unfair, but I assumed that running on a PC is a less tested case,
as is the "firewall on a stick" part.

Equally, AFAICT Brandon hasn't had a working AQM setup on this box
before. If the Gentoo script works, that would prove that AQM + fq_codel
can work correctly on Brandon's box.

If the Gentoo script _wasn't_ working, that's when I'd suggest tearing
it down, e.g. to eliminate fq_codel as an issue. Rather than specifying
a fifo explicitly, just don't run the fq_codel command (assuming that
actually does something sensible, which can be checked using `tc qdisc`).
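
As a rough illustration of the "minimum commands" for the ingress
direction, here is a sketch in the spirit of a single-class setup (it is
not a copy of the Gentoo script or of sqm-scripts). The `run` and
`shape_ingress` helpers, the `DRY_RUN` switch and the 27000 kbit rate
(90% of 30 Mbit) are assumptions for illustration; the interface names
are the ones from this thread. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: shape ingress on one interface via an IFB with a single HTB class.
# Prints the commands by default; set DRY_RUN=0 and run as root to apply them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: shape_ingress IFACE IFB RATE_KBIT [USE_FQ_CODEL]
shape_ingress() {
    iface=$1; ifb=$2; rate=$3; use_fq=${4:-1}

    # Create the IFB and redirect everything arriving on IFACE to it.
    run ip link add name "$ifb" type ifb
    run ip link set dev "$ifb" up
    run tc qdisc add dev "$iface" handle ffff: ingress
    run tc filter add dev "$iface" parent ffff: protocol all u32 match u32 0 0 action mirred egress redirect dev "$ifb"

    # One HTB class carrying all traffic at the requested rate.
    run tc qdisc add dev "$ifb" root handle 1: htb default 1
    run tc class add dev "$ifb" parent 1: classid 1:1 htb rate "${rate}kbit"

    # Optional leaf qdisc; skip it to leave HTB's default fifo in place.
    if [ "$use_fq" = "1" ]; then
        run tc qdisc add dev "$ifb" parent 1:1 fq_codel
    fi
}

shape_ingress eth0.666 ifb4eth0.666 27000
```

Calling `shape_ingress eth0.666 ifb4eth0.666 27000 0` leaves out the
fq_codel step, which is the "just don't run the fq_codel command" test
described above; `tc qdisc show dev ifb4eth0.666` should then show what
ended up attached.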

2) More specifically, you didn't mention trying "simplest.qos". This
would i) simplify the setup we're trying to debug, and ii) show whether
there's a bug in the more complex bandwidth calculations / assignments
in "simple.qos".
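
For what it's worth, a test config along those lines might look like the
fragment below. This is only a sketch: I'm assuming the usual sqm-scripts
variable names (`SCRIPT`, `DOWNLINK`, `UPLINK`, `QDISC`) and rates in
kbit/s, so check them against the defaults shipped with your copy. The
values are 90% of the 30 Mbit download, and zero for upload, which per
Sebastian's note means "no shaping" in that direction.

```shell
# Sketch of an /etc/sqm/*conf fragment for debugging ingress only.
# Variable names are assumed from sqm-scripts' defaults; verify locally.
SCRIPT=simplest.qos   # single class, instead of simple.qos
DOWNLINK=27000        # kbit/s, 90% of the 30 Mbit download
UPLINK=0              # zero: leave egress unshaped while testing
QDISC=fq_codel        # swap for pfifo_fast to take fq_codel out of the picture
```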

And again, I suggest starting by checking `tc class show dev eth0.666`,
because it's not a 100% obvious command, and we don't want to miss any
bad rates there :).
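
To make the rates easier to eyeball, something like this small filter
could help. It is a sketch that assumes HTB class lines of the form
`class htb <id> ... rate <r> ceil <c> ...`; the `class_rates` name is made
up here, and the exact `tc` output may differ between iproute2 versions.

```shell
#!/bin/sh
# Sketch: reduce `tc class show` output to "classid rate=... ceil=...".
# Reads tc output on stdin; non-HTB class lines are ignored.
class_rates() {
    awk '$1 == "class" && $2 == "htb" {
        rate = "-"; ceil = "-"
        # The value follows its keyword, so scan for "rate" and "ceil".
        for (i = 3; i < NF; i++) {
            if ($i == "rate") rate = $(i + 1)
            if ($i == "ceil") ceil = $(i + 1)
        }
        print $3, "rate=" rate, "ceil=" ceil
    }'
}

# Only query the devices when tc is actually installed.
if command -v tc >/dev/null 2>&1; then
    for dev in eth0.666 ifb4eth0.666; do
        echo "== $dev"
        tc class show dev "$dev" 2>/dev/null | class_rates
    done
fi
```

With simplest.qos there should be one class per device at roughly the
configured rate; with simple.qos, compare each `ceil` against the rate
you put in the config.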
Alan