* [Cake] FreeNet backhaul
From: Pete Heist @ 2018-09-06 22:37 UTC
To: Cake List
I met with the FreeNet Liberec admins earlier this week, and am just starting to get the first IRTT / SmokePing probe data from a few backhaul routers. I’ll see if I can get snapshots of the SmokePing pages public somewhere, but for now...
https://www.heistp.net/downloads/jerab_ping.pdf
https://www.heistp.net/downloads/jerab_irtt.pdf
On this router, various “events” with RTT spikes occur, and in general, UDP RTT maximums appear quite a bit higher than those for ICMP. I speculate that at least some of these events may be bloat, but can’t be sure yet.
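For reference, probes like the ones behind the irtt plot can be reproduced with something along these lines (a sketch; the hostname is made up, and exact flags depend on the irtt version):

# on the far end (the router, or a host behind it):
irtt server

# from the measurement box: one UDP probe every 200ms for a minute,
# writing JSON results to a file for later analysis
irtt client -i 200ms -d 1m -o jerab.json router.example.net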
This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
I feel the need for better data than this, but this is at least a first look for me at bloat in an ISP’s backhaul. The “elaborate” plan is to gather data for a while, deploy sqm in a couple places (I hope for cake, but the kernel will need to be upgraded on this one), and see what happens… :)
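For the sqm step, the eventual cake deployment itself would be a one-liner (a sketch; device and rate are illustrative, with the rate set a bit below the link rate so cake owns the queue, and it requires the post-upgrade kernel with the cake qdisc):

# replace the root qdisc on the backhaul-facing interface with cake
tc qdisc replace dev eth0 root cake bandwidth 90mbit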
Pete
* Re: [Cake] FreeNet backhaul
From: Dave Taht @ 2018-09-06 22:58 UTC
To: Pete Heist; +Cc: Cake List
Heh. esfq was "best in class" for a very, very long time.
I have years of mrtg data on my network that I haven't looked at in years....
On Thu, Sep 6, 2018 at 3:37 PM Pete Heist <pete@heistp.net> wrote:
>
> I met with the FreeNet Liberec admins earlier this week, and am just starting to get the first IRTT / SmokePing probe data from a few backhaul routers. I’ll see if I can get snapshots of the SmokePing pages public somewhere, but for now...
>
> https://www.heistp.net/downloads/jerab_ping.pdf
>
> https://www.heistp.net/downloads/jerab_irtt.pdf
>
> On this router, various “events” with RTT spikes occur, and in general, UDP RTT maximums appear quite a bit higher than those for ICMP. I speculate that at least some of these events may be bloat, but can’t be sure yet.
>
> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
>
> I feel the need for better data than this, but this is at least a first look for me at bloat in an ISP’s backhaul. The “elaborate” plan is to gather data for a while, deploy sqm in a couple places (I hope for cake, but the kernel will need to be upgraded on this one), and see what happens… :)
>
> Pete
>
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
* Re: [Cake] FreeNet backhaul
From: Jonathan Morton @ 2018-09-06 23:03 UTC
To: Pete Heist; +Cc: Cake List
> On 7 Sep, 2018, at 1:37 am, Pete Heist <pete@heistp.net> wrote:
>
> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
ESFQ has two important faults: it doesn't explicitly control the length of individual queues (only tail-drops when a global limit is reached), and it suffers from hash collisions at the full "birthday problem" rate. So some of your measurement traffic is likely colliding with real traffic and suffering accordingly.
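To put a rough number on that (my arithmetic, assuming the 1024 in the "flows 128/1024" esfq output shown later in this thread is the hash divisor): for n flows hashed into m buckets, the usual birthday approximation is

P(\mathrm{collision}) \approx 1 - e^{-n(n-1)/(2m)}

so with m = 1024, n = 38 concurrent flows already gives 1 - e^{-703/1024} ≈ 0.50, i.e. even odds that at least two flows share a queue.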
That still makes ESFQ far better than a dumb FIFO.
- Jonathan Morton
* Re: [Cake] FreeNet backhaul
From: Pete Heist @ 2018-09-07 10:00 UTC
To: Jonathan Morton; +Cc: Cake List
> On Sep 7, 2018, at 1:03 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 7 Sep, 2018, at 1:37 am, Pete Heist <pete@heistp.net> wrote:
>>
>> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
>
> ESFQ has two important faults: it doesn't explicitly control the length of individual queues (only tail-drops when a global limit is reached), and it suffers from hash collisions at the full "birthday problem" rate. So some of your measurement traffic is likely colliding with real traffic and suffering accordingly.
Ah, ok, that is important.
> That still makes ESFQ far better than a dumb FIFO.
I’ve heard tales of the way things were.
As a contrast, the router I’m on: https://www.heistp.net/downloads/vysina_ping.pdf The big difference here is this router’s uplink is licensed spectrum full-duplex 100Mbit, whereas Jerab from earlier is 5GHz WiFi (2x NSM5). The shift around June was an upgrade from ALIX to APU.
I haven’t seen evidence yet of backhaul links running at saturation for long periods. When I watch throughput in real time, though, I do see pulses that probably don't show up in the long-term MRTG throughput graphs. I wonder what queue lengths look like at millisecond resolution during these events.
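One crude way to look at that would be to poll the qdisc backlog from userspace (a sketch; 10 ms polling is itself coarse and not free, but it shows the idea):

# timestamp + first backlog line from tc's -s output, roughly every 10 ms
while true; do
    echo "$(date +%s.%N) $(tc -s qdisc show dev eth0 | grep backlog | head -n1)"
    sleep 0.01
done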
* Re: [Cake] FreeNet backhaul
From: Dave Taht @ 2018-09-07 13:59 UTC
To: Pete Heist; +Cc: Jonathan Morton, Cake List
You are making me pull out my mrtg stats; I'll post one. In the
debloating universe, 5 minute averages really obscure the
bufferbloat problem. What's important are drops/marks, reschedules,
queue depths, and overlimits. I get about 3000 drops/day (debloats).
I wish I could easily extrapolate what that and the reschedules mean
in terms of induced latency on other flows.
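For what it's worth, those counters are easy to scrape out of tc for graphing (a sketch against the "Sent ... (dropped X, overlimits Y requeues Z)" lines that tc -s prints):

# pull per-qdisc drop/overlimit/requeue counters into key=value form
tc -s qdisc show dev eth0 | \
  sed -n 's/.*dropped \([0-9]*\), overlimits \([0-9]*\) requeues \([0-9]*\).*/drops=\1 overlimits=\2 requeues=\3/p'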
On Fri, Sep 7, 2018 at 3:00 AM Pete Heist <pete@heistp.net> wrote:
>
>
> On Sep 7, 2018, at 1:03 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
> On 7 Sep, 2018, at 1:37 am, Pete Heist <pete@heistp.net> wrote:
>
> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
>
>
> ESFQ has two important faults: it doesn't explicitly control the length of individual queues (only tail-drops when a global limit is reached), and it suffers from hash collisions at the full "birthday problem" rate. So some of your measurement traffic is likely colliding with real traffic and suffering accordingly.
>
>
> Ah, ok, that is important.
>
> That still makes ESFQ far better than a dumb FIFO.
>
>
> I’ve heard tales of the way things were.
>
> As a contrast, the router I’m on: https://www.heistp.net/downloads/vysina_ping.pdf The big difference here is this router’s uplink is licensed spectrum full-duplex 100Mbit, whereas Jerab from earlier is 5GHz WiFi (2x NSM5). The shift around June was an upgrade from ALIX to APU.
>
> I haven’t seen evidence yet of backhaul links running at saturation for long periods. When I watch throughput in real time, though, I do see pulses that probably don't show up in the long-term MRTG throughput graphs. I wonder what queue lengths look like at millisecond resolution during these events.
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
* Re: [Cake] FreeNet backhaul
From: Pete Heist @ 2018-09-07 16:02 UTC
To: Dave Taht; +Cc: Cake List
> On Sep 7, 2018, at 3:59 PM, Dave Taht <dave.taht@gmail.com> wrote:
>
> You are making me pull out my mrtg stats; I'll post one. In the
> debloating universe, 5 minute averages really obscure the
> bufferbloat problem. What's important are drops/marks, reschedules,
> queue depths, and overlimits. I get about 3000 drops/day (debloats).
> I wish I could easily extrapolate what that and the reschedules mean
> in terms of induced latency on other flows.
The reminders are useful, but sorry if those mrtg stats give you flashbacks to 2001, XP and Napster rips. :)
As I look at different router configs, queue management looks like it’s being applied inconsistently: sometimes on the backhaul interfaces, but most often just on the customer-facing interfaces (those connected to the point-to-multipoint APs).
I would only do it on the point-to-point links, as I would think you need airtime fairness and queues per station to do much good on ptmp. Failing that, I would probably just let the APs do what they will(?) That said, the Ethernet interfaces to NSM5s are 100Mbit, and in that case, a qdisc on egress towards the AP makes sense.
The SmokePing pages should now be public. On this AP, eth0 is the ptmp radio and eth1 the backhaul, which has pfifo_fast, so yeah, there’s work to do, and quite a few overlimits on at least one hfsc qdisc...
http://smokeping.lbcfree.net/cgi-bin/smokeping.cgi?filter=jerab;target=Frantiskov.Fjerab
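Replacing pfifo_fast on eth1 wouldn't even need a kernel upgrade; stock hfsc + sfq would already be an improvement (a sketch; the 95mbit rate is illustrative, set just below the 100Mbit line rate so the shaper owns the queue):

# shape slightly below line rate, with one flow-fair leaf
tc qdisc replace dev eth1 root handle 1: hfsc default 1
tc class add dev eth1 parent 1: classid 1:1 hfsc ls m2 95mbit ul m2 95mbit
tc qdisc add dev eth1 parent 1:1 handle 11: sfq perturb 10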
[-- Attachment #2.2: tc_jerab.txt --]
FreeNetJerab:~# tc -s -d qdisc show dev eth0
qdisc prio 1: root bands 3 priomap 2 2 2 2 2 2 0 0 2 2 2 2 2 2 2 2
Sent 758499377670 bytes 553404355 pkt (dropped 243, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 11: parent 1:1 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
Sent 16655331 bytes 55403 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 12: parent 1:2 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
Sent 302689276925 bytes 227433380 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc hfsc 13: parent 1:3 default 111
Sent 455793445414 bytes 325915572 pkt (dropped 243, overlimits 233425034 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 111: parent 13:111 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
Sent 171713351551 bytes 120321398 pkt (dropped 462, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 112: parent 13:112 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
Sent 283808986833 bytes 205401869 pkt (dropped 24, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 113: parent 13:113 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
Sent 271107030 bytes 192305 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 114: parent 13:114 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
--------------------
FreeNetJerab:~# tc -s -d class show dev eth0
class prio 1:1 parent 1: leaf 11:
Sent 16655331 bytes 55403 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class prio 1:2 parent 1: leaf 12:
Sent 302689276925 bytes 227433380 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class prio 1:3 parent 1: leaf 13:
Sent 455793447887 bytes 325915583 pkt (dropped 243, overlimits 233425034 requeues 0)
backlog 0b 0p requeues 0
class hfsc 13: root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 0 level 2
class hfsc 13:1 parent 13: ls m1 0bit d 0us m2 70000Kbit ul m1 0bit d 0us m2 70000Kbit
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 105456403 work 455793447887 bytes level 1
class hfsc 13:111 parent 13:1 leaf 111: ls m1 0bit d 0us m2 34968Kbit ul m1 0bit d 0us m2 70000Kbit
Sent 171713351605 bytes 120321399 pkt (dropped 231, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 23695738 work 171713351605 bytes level 0
class hfsc 13:112 parent 13:1 leaf 112: ls m1 0bit d 0us m2 20980Kbit ul m1 0bit d 0us m2 70000Kbit
Sent 283808989252 bytes 205401879 pkt (dropped 12, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 83433005 work 283808989252 bytes level 0
class hfsc 13:113 parent 13:1 leaf 113: ls m1 0bit d 0us m2 13987Kbit ul m1 0bit d 0us m2 70000Kbit
Sent 271107030 bytes 192305 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 33674 work 271107030 bytes level 0
class hfsc 13:114 parent 13:1 leaf 114: ls m1 0bit d 0us m2 64000bit ul m1 0bit d 0us m2 128000bit
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 0 level 0
--------------------
FreeNetJerab:~# tc -s -d filter show dev eth0
filter parent 1: protocol ip pref 1 fw
filter parent 1: protocol ip pref 1 fw handle 0x1 classid 1:1
filter parent 1: protocol ip pref 49152 fw
filter parent 1: protocol ip pref 49152 fw handle 0x2 classid 1:2
--------------------
FreeNetJerab:~# tc -s -d qdisc show dev eth1
qdisc pfifo_fast 0: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 31015622705 bytes 215778566 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0