* [Cake] FreeNet backhaul
From: Pete Heist @ 2018-09-06 22:37 UTC
To: Cake List

I met with the FreeNet Liberec admins earlier this week, and am just starting to get the first IRTT / SmokePing probe data from a few backhaul routers. I’ll see if I can get snapshots of the SmokePing pages public somewhere, but for now...

https://www.heistp.net/downloads/jerab_ping.pdf
https://www.heistp.net/downloads/jerab_irtt.pdf

On this router, there are various “events” that occur with RTT spikes, and in general, UDP RTT maximums appear quite a bit higher than those for ICMP. I speculate that at least some of these events may be bloat, but can’t be sure yet.

This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.

I feel the need for better data than this, but this is at least a first look for me at bloat in an ISP’s backhaul. The “elaborate” plan is to gather data for a while, deploy sqm in a couple of places (I hope for cake, but the kernel will need to be upgraded on this one), and see what happens… :)

Pete
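For a sense of what the IRTT probes above involve, irtt runs as a client/server pair; a minimal sketch of this kind of measurement, where the host name and the interval/duration values are illustrative rather than the actual probe configuration used:

# on the far end (assumes irtt is installed on or behind the router):
irtt server

# from the monitoring host: send a small UDP probe every 200ms for one
# minute, reporting one-way delays and RTT:
irtt client -i 200ms -d 1m jerab.example.net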
* Re: [Cake] FreeNet backhaul
From: Dave Taht @ 2018-09-06 22:58 UTC
To: Pete Heist; +Cc: Cake List

Heh. esfq was "best in class" for a very, very long time. I have years of mrtg data on my network that I haven't looked at in years....

On Thu, Sep 6, 2018 at 3:37 PM Pete Heist <pete@heistp.net> wrote:
>
> I met with the FreeNet Liberec admins earlier this week, and am just starting to get the first IRTT / SmokePing probe data from a few backhaul routers. I’ll see if I can get snapshots of the SmokePing pages public somewhere, but for now...
>
> https://www.heistp.net/downloads/jerab_ping.pdf
>
> https://www.heistp.net/downloads/jerab_irtt.pdf
>
> On this router, there are various “events” that occur with RTT spikes, and in general, UDP RTT maximums appear quite a bit higher than those for ICMP. I speculate that at least some of these events may be bloat, but can’t be sure yet.
>
> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
>
> I feel the need for better data than this, but this is at least a first look for me at bloat in an ISP’s backhaul. The “elaborate” plan is to gather data for a while, deploy sqm in a couple of places (I hope for cake, but the kernel will need to be upgraded on this one), and see what happens… :)
>
> Pete

--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
* Re: [Cake] FreeNet backhaul
From: Jonathan Morton @ 2018-09-06 23:03 UTC
To: Pete Heist; +Cc: Cake List

> On 7 Sep, 2018, at 1:37 am, Pete Heist <pete@heistp.net> wrote:
>
> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.

ESFQ has two important faults: it doesn't explicitly control the length of individual queues (it only tail-drops when a global limit is reached), and it suffers from hash collisions at the full "birthday problem" rate. So some of your measurement traffic is likely colliding with real traffic and suffering accordingly.

That still makes ESFQ far better than a dumb FIFO.

 - Jonathan Morton
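To put rough numbers on the birthday-problem point: with a 1024-way hash like the esfq configs shown later in this thread, and assuming a uniform hash, the chance that at least one pair of n concurrent flows shares a bucket is roughly 1 - exp(-n(n-1)/2048), which crosses 50% at only about 38 flows. A quick sketch of that estimate:

# approximate birthday-problem collision probability for 1024 hash buckets:
awk 'BEGIN { for (n = 10; n <= 60; n += 10)
  printf "n=%2d flows: P(some collision) ~ %.2f\n", n, 1 - exp(-n*(n-1)/2048) }'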
* Re: [Cake] FreeNet backhaul
From: Pete Heist @ 2018-09-07 10:00 UTC
To: Jonathan Morton; +Cc: Cake List

> On Sep 7, 2018, at 1:03 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 7 Sep, 2018, at 1:37 am, Pete Heist <pete@heistp.net> wrote:
>>
>> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
>
> ESFQ has two important faults: it doesn't explicitly control the length of individual queues (it only tail-drops when a global limit is reached), and it suffers from hash collisions at the full "birthday problem" rate. So some of your measurement traffic is likely colliding with real traffic and suffering accordingly.

Ah, ok, that is important.

> That still makes ESFQ far better than a dumb FIFO.

I’ve heard tales of the way things were.

As a contrast, the router I’m on: https://www.heistp.net/downloads/vysina_ping.pdf

The big difference here is that this router’s uplink is licensed-spectrum full-duplex 100Mbit, whereas Jerab from earlier is 5GHz WiFi (2x NSM5). The shift around June was an upgrade from ALIX to APU.

I haven’t seen evidence yet of backhaul links running at saturation for long periods. When I watch throughputs in real time I do see pulses, though, that probably don’t show up in the long-term MRTG throughput graphs. I wonder what queue lengths look like at millisecond resolution during these events.
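That last question can be approximated with tc’s own counters; a rough sketch, assuming a GNU userland (date with %N, grep -m1) and eth0 as the interface of interest. Shell fork overhead means the real sampling period lands in the tens of milliseconds rather than the 10ms asked for, but that is still far finer than five-minute MRTG averages:

# timestamped samples of the root qdisc's backlog, roughly every 10 ms:
while true; do
  printf '%s %s\n' "$(date +%s.%N)" "$(tc -s qdisc show dev eth0 | grep -m1 backlog)"
  sleep 0.01
done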
* Re: [Cake] FreeNet backhaul
From: Dave Taht @ 2018-09-07 13:59 UTC
To: Pete Heist; +Cc: Jonathan Morton, Cake List

You are making me pull out my mrtg stats; I'll post one. In the debloating universe, 5-minute averages really obscure the bufferbloat problem. What's important are drops/marks, reschedules, queue depths, and overlimits. I get about 3000 drops/day (debloats). I wish I could easily extrapolate what that and the reschedules mean in terms of induced latency on other flows.

On Fri, Sep 7, 2018 at 3:00 AM Pete Heist <pete@heistp.net> wrote:
>
>> On Sep 7, 2018, at 1:03 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
>>
>>> On 7 Sep, 2018, at 1:37 am, Pete Heist <pete@heistp.net> wrote:
>>>
>>> This router is an old ALIX with kernel 2.6.26, but on the other hand it does have hfsc + esfq (a variant of sfq with host fairness) deployed, so if it’s actually controlling the queue, one might suspect that it could control inter-flow latency at least somewhat.
>>
>> ESFQ has two important faults: it doesn't explicitly control the length of individual queues (it only tail-drops when a global limit is reached), and it suffers from hash collisions at the full "birthday problem" rate. So some of your measurement traffic is likely colliding with real traffic and suffering accordingly.
>
> Ah, ok, that is important.
>
>> That still makes ESFQ far better than a dumb FIFO.
>
> I’ve heard tales of the way things were.
>
> As a contrast, the router I’m on: https://www.heistp.net/downloads/vysina_ping.pdf
>
> The big difference here is that this router’s uplink is licensed-spectrum full-duplex 100Mbit, whereas Jerab from earlier is 5GHz WiFi (2x NSM5). The shift around June was an upgrade from ALIX to APU.
>
> I haven’t seen evidence yet of backhaul links running at saturation for long periods. When I watch throughputs in real time I do see pulses, though, that probably don’t show up in the long-term MRTG throughput graphs. I wonder what queue lengths look like at millisecond resolution during these events.

--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
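The counters Dave lists are all present in tc's statistics output; a one-liner sketch for watching drops and overlimits per qdisc (deltas, e.g. drops per day, would still have to be computed from successive runs):

# one "Sent ... (dropped X, overlimits Y requeues Z)" line per qdisc:
tc -s qdisc show dev eth0 | grep -E '^qdisc|dropped'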
* Re: [Cake] FreeNet backhaul
From: Pete Heist @ 2018-09-07 16:02 UTC
To: Dave Taht; +Cc: Cake List

> On Sep 7, 2018, at 3:59 PM, Dave Taht <dave.taht@gmail.com> wrote:
>
> You are making me pull out my mrtg stats; I'll post one. In the debloating universe, 5-minute averages really obscure the bufferbloat problem. What's important are drops/marks, reschedules, queue depths, and overlimits. I get about 3000 drops/day (debloats). I wish I could easily extrapolate what that and the reschedules mean in terms of induced latency on other flows.

The reminders are useful, but sorry if those mrtg stats give you flashbacks to 2001, XP and Napster rips. :)

As I look at different router configs, queue management looks like it’s being applied inconsistently: sometimes on the backhaul interfaces, and most often just on the customer-facing interfaces (the interfaces connected to the point-to-multipoint APs). I would only do it on the point-to-point links, as I would think you need airtime fairness and queues per station to do much good on ptmp. Failing that, I would probably just let the APs do what they will(?) That said, the Ethernet interfaces to the NSM5s are 100Mbit, and in that case a qdisc on egress towards the AP makes sense.

SmokePing should be public. On this AP, eth0 is the ptmp radio and eth1 the backhaul, which has pfifo_fast, so yeah, there’s work to do, and quite a few overlimits on at least one hfsc qdisc...

http://smokeping.lbcfree.net/cgi-bin/smokeping.cgi?filter=jerab;target=Frantiskov.Fjerab

[-- Attachment: tc_jerab.txt --]

FreeNetJerab:~# tc -s -d qdisc show dev eth0
qdisc prio 1: root bands 3 priomap 2 2 2 2 2 2 0 0 2 2 2 2 2 2 2 2
 Sent 758499377670 bytes 553404355 pkt (dropped 243, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 11: parent 1:1 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
 Sent 16655331 bytes 55403 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 12: parent 1:2 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
 Sent 302689276925 bytes 227433380 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc hfsc 13: parent 1:3 default 111
 Sent 455793445414 bytes 325915572 pkt (dropped 243, overlimits 233425034 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 111: parent 13:111 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
 Sent 171713351551 bytes 120321398 pkt (dropped 462, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 112: parent 13:112 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
 Sent 283808986833 bytes 205401869 pkt (dropped 24, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 113: parent 13:113 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
 Sent 271107030 bytes 192305 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc esfq 114: parent 13:114 quantum 1514b limit 128p flows 128/1024 perturb 10sec hash: dst
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
--------------------
FreeNetJerab:~# tc -s -d class show dev eth0
class prio 1:1 parent 1: leaf 11:
 Sent 16655331 bytes 55403 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class prio 1:2 parent 1: leaf 12:
 Sent 302689276925 bytes 227433380 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class prio 1:3 parent 1: leaf 13:
 Sent 455793447887 bytes 325915583 pkt (dropped 243, overlimits 233425034 requeues 0)
 backlog 0b 0p requeues 0
class hfsc 13: root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 0 level 2
class hfsc 13:1 parent 13: ls m1 0bit d 0us m2 70000Kbit ul m1 0bit d 0us m2 70000Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 105456403 work 455793447887 bytes level 1
class hfsc 13:111 parent 13:1 leaf 111: ls m1 0bit d 0us m2 34968Kbit ul m1 0bit d 0us m2 70000Kbit
 Sent 171713351605 bytes 120321399 pkt (dropped 231, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 23695738 work 171713351605 bytes level 0
class hfsc 13:112 parent 13:1 leaf 112: ls m1 0bit d 0us m2 20980Kbit ul m1 0bit d 0us m2 70000Kbit
 Sent 283808989252 bytes 205401879 pkt (dropped 12, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 83433005 work 283808989252 bytes level 0
class hfsc 13:113 parent 13:1 leaf 113: ls m1 0bit d 0us m2 13987Kbit ul m1 0bit d 0us m2 70000Kbit
 Sent 271107030 bytes 192305 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 33674 work 271107030 bytes level 0
class hfsc 13:114 parent 13:1 leaf 114: ls m1 0bit d 0us m2 64000bit ul m1 0bit d 0us m2 128000bit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 0 level 0
--------------------
FreeNetJerab:~# tc -s -d filter show dev eth0
filter parent 1: protocol ip pref 1 fw
filter parent 1: protocol ip pref 1 fw handle 0x1 classid 1:1
filter parent 1: protocol ip pref 49152 fw
filter parent 1: protocol ip pref 49152 fw handle 0x2 classid 1:2
--------------------
FreeNetJerab:~# tc -s -d qdisc show dev eth1
qdisc pfifo_fast 0: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 31015622705 bytes 215778566 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
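Given that eth1’s root qdisc is pfifo_fast, the “work to do” could start with a single command per interface; a sketch rather than the plan itself, where 95mbit is an illustrative shaper rate just under the 100Mbit link, and cake assumes a far newer kernel than 2.6.26:

# after a kernel upgrade: shape just below the link rate so cake owns the queue
tc qdisc replace dev eth1 root cake bandwidth 95mbit

# on the existing 2.6.26 kernel, plain sfq at least restores per-flow fairness
tc qdisc replace dev eth1 root sfq perturb 10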