Lets make wifi fast again!
* [Make-wifi-fast] babeld patch enabling ecn
From: Dave Taht @ 2018-08-28 16:45 UTC (permalink / raw)
  To: Make-Wifi-fast, cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 2461 bytes --]

In cleaning up the lab and some long out of tree patches, I guess I
should make a push for the following patch to be more thoroughly
tested in OpenWrt. For review here first before running that
gauntlet....

[PATCH] Disable CS6 and enable ECN in Babeld
This one line patch disables CS6 marking and enables ECN in babeld.

ECN decouples "packet loss from congestion" from "loss due to bad connectivity".

It also moves unicast babel packets into the best effort queue.
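
To illustrate why dropping the CS6 mark moves these packets out of the VO
queue, here is a minimal sketch. It assumes the common default where, with
no qos-map installed, the top three bits of the traffic class byte are taken
as the 802.1D user priority, which is then mapped to an 802.11 access
category. The enum and function names are hypothetical and are not taken
from babeld or the kernel.

```c
#include <stdint.h>

/* 802.11 access categories, highest priority first. */
enum wifi_ac { AC_VO, AC_VI, AC_BE, AC_BK };

/* Hypothetical helper: map a traffic class byte to an access category.
 * Assumption: no qos-map is installed, so the user priority is simply
 * the top three bits of the byte. */
static enum wifi_ac tclass_to_ac(uint8_t tclass)
{
    /* User priority (0..7) to access category. */
    static const enum wifi_ac up_to_ac[8] = {
        AC_BE, AC_BK, AC_BK, AC_BE, AC_VI, AC_VI, AC_VO, AC_VO
    };
    return up_to_ac[tclass >> 5];
}
```

Under that assumption, 0xc0 (CS6) lands in AC_VO, while 0x02 (ECT only)
lands in AC_BE.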

The good:

* OpenWrt is fully fq_codeled and doesn't pay much attention to diffserv
* ECN'd Routes stay up even under extreme congestion
* ECN'd Packet loss returns to a measure of connectivity only
* Killing CS6 saves bandwidth - The 802.11n VO queue (where CS6 falls
  normally) cannot aggregate.
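
The difference ECN makes at a congested queue can be sketched roughly as
follows. This is a simplified model of what an ECN-aware AQM such as
fq_codel does when it decides a packet should signal congestion. It is not
the kernel's code, and the names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define ECN_MASK    0x03  /* low two bits of the traffic class byte */
#define ECN_NOT_ECT 0x00  /* sender is not ECN-capable */
#define ECN_CE      0x03  /* congestion experienced */

/* Hypothetical model: called when the AQM wants to signal congestion.
 * Returns true if the packet has to be dropped. Otherwise CE is set in
 * *tclass and the packet is still delivered. */
static bool signal_congestion(uint8_t *tclass)
{
    if ((*tclass & ECN_MASK) == ECN_NOT_ECT)
        return true;        /* not ECN-capable: the only signal is a drop */
    *tclass |= ECN_CE;      /* ECN-capable: mark and keep the packet */
    return false;
}
```

A packet sent with 0xc0 is dropped in this model, while one sent with 0x02
is delivered with its ECN field rewritten to CE. The sketch ignores queue
overflow, where packets are dropped regardless of ECN.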


I would support a default qos-map for OpenWrt 802.11n devices that
essentially disables the VO queue, as better aggregation works so much
better post fq_codel for wifi on ath*, and the VO queue works poorly in
general on all devices (in fact I'd disable VI and BK universally also) -
and then keep CS6 + ECN.


The bad:

* Babel does not do anything to reduce its rate on receipt of CE
  or modify its metrics

Given a choice between losing core connectivity under congestion or not...

* a babeld instance using ECN over a fq_codel'd link will always be
  more reachable than a non-ecn'd one

You can argue that a fq_codel'd link is generically faster than one that is not.

* CS6 does help somewhat on ethernet switches but it's largely been immeasurable
* Lacking an effective response to ECN, large babel networks can fill their FQ'd
  queue with undroppable packets. This problem is generic, actually, ECN or no,
  as the multicast queue is infinite and not fq_codeled.
  Babel protocol packets, however, are light and a single flow, so babel flooding
  will be unnoticeable except to itself, and if it gets truly out of hand
  the bulk dropper should kick in.
* Babeld should also schedule hellos independently of route announcements
  and manage the route announcement queue better


After 5 years of running this patch in my deployment of babel + fq_codel,
IMHO this is the best of multiple bad alternatives, and it dramatically
improves network reliability.


It is the rough equivalent of adding a minimal "control plane" for
critical packets.


-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

[-- Attachment #2: 0001-Disable-CS6-and-enable-ECN-in-Babeld.patch --]
[-- Type: application/octet-stream, Size: 3304 bytes --]

From 4ac61f74fc0f2ecc49b5304038e3077aae8f7b04 Mon Sep 17 00:00:00 2001
From: Dave Taht <dave@taht.net>
Date: Tue, 28 Aug 2018 09:38:26 -0700
Subject: [PATCH] Disable CS6 and enable ECN in Babeld

This one line patch disables CS6 marking and enables ECN in babeld.

ECN decouples "packet loss from congestion" from "loss due to bad connectivity".

It also moves unicast babel packets into the best effort queue.

The good:

* OpenWrt is fully fq_codeled and doesn't pay much attention to diffserv
* ECN'd Routes stay up even under extreme congestion
* ECN'd Packet loss returns to a measure of connectivity only
* Killing CS6 saves bandwidth - The 802.11n VO queue (where CS6 falls normally)
  cannot aggregate.

I would support a default qos-map for Openwrt 802.11n devices essentially
disabling the VO queue, as better aggregation works so much better,
(In fact I'd disable VI and BK universally also),
post fq_codel for wifi on ath*, and poorly in general on all devices -
  and then keep CS6 + ECN.

The bad:

* Babel does not do anything to reduce its rate on receipt of CE
  or modify its metrics

Given a choice between losing core connectivity under congestion or not...

* a babeld instance using ECN over a fq_codel'd link will always be
  more reachable than a non-ecn'd one
you can argue that a fq_codel'd link is generically faster than one that is not.
* CS6 does help somewhat on ethernet switches but it's largely been immeasurable
* Lacking an effective response to ECN large babel networks can fill it's FQ'd
  queue with undroppable packets. This problem is generic, actually, ECN or no,
  as the multicast queue is infinite and not fq_codeled.
  babel protocol packets however are light, a single flow, and babel flooding
  will be unnoticible except to itself, and if it gets truly out of hand
  the bulk dropper should kick in.
* Babeld should also independently schedule hellos from route announcements
  and manage the route announcement queue better

After 5 years in my deployment of babel + fq_codel running this patch IMHO
this is the best of multiple bad alternatives, and dramatically improves
network reliability.

It is the rough equivalent of adding a minimal "control plane" for critical packets.

Otherwise congested routers can (and do) "fall off the net" when they shouldn't.

Other routing protocol stacks (OSPF, BATMAN, BMX, ISIS) should look deeply
into the benefits and pitfalls of ECN.

We've long been exploring alternatives in babel (RTT metrics, unicast hellos, etc)
but none have arrived yet. There are also some statistically sound
means of randomizing and interleaving route announcements to get past congestive
drops that could be used...

... but this works "good enough", for now.

Submitted after a rapidly aborted attempt at deploying vanilla openwrt 18.06
on my production 280+ route network.
---
 net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net.c b/net.c
index 1e5890d..88bbab3 100644
--- a/net.c
+++ b/net.c
@@ -46,7 +46,7 @@ babel_socket(int port)
     int s, rc;
     int saved_errno;
     int one = 1, zero = 0;
-    const int ds = 0xc0;        /* CS6 - Network Control */
+    const int ds = 0x02;        /* ECT - Enable ECN */
 
     s = socket(PF_INET6, SOCK_DGRAM, 0);
     if(s < 0)
-- 
2.15.1 (Apple Git-101)
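
For readers unfamiliar with the two constants in the diff: the byte carries
the DSCP in its upper six bits and the ECN field in its lower two. A minimal
sketch of that split follows. The helper names are hypothetical and do not
appear in babeld.

```c
#include <stdint.h>

/* Upper six bits: DSCP. Lower two bits: ECN field. */
static uint8_t tclass_dscp(uint8_t tclass) { return tclass >> 2; }
static uint8_t tclass_ecn(uint8_t tclass)  { return tclass & 0x03; }

/* Build a traffic class byte from a DSCP and an ECN codepoint. */
static uint8_t tclass_make(uint8_t dscp, uint8_t ecn)
{
    return (uint8_t)((dscp << 2) | (ecn & 0x03));
}
```

So 0xc0 is DSCP 48 (CS6) with ECN 0 (Not-ECT), and 0x02 is DSCP 0 (best
effort) with ECN 2 (ECT(0)). The "keep CS6 + ECN" variant mentioned in the
message would be tclass_make(48, 2), i.e. 0xc2.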



* Re: [Make-wifi-fast] [Cerowrt-devel] babeld patch enabling ecn
From: Toke Høiland-Jørgensen @ 2018-08-28 19:05 UTC (permalink / raw)
  To: Dave Taht, Make-Wifi-fast, cerowrt-devel

Dave Taht <dave.taht@gmail.com> writes:

>   as the multicast queue is infinite and not fq_codeled.

No it isn't. Multicast has its own fully fq-codel'ed queue...

-Toke


* Re: [Make-wifi-fast] [Cerowrt-devel] babeld patch enabling ecn
From: Dave Taht @ 2018-08-28 19:12 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Make-Wifi-fast, cerowrt-devel

On Tue, Aug 28, 2018 at 12:05 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Dave Taht <dave.taht@gmail.com> writes:
>
> >   as the multicast queue is infinite and not fq_codeled.
>
> No it isn't. Multicast has its own fully fq-codel'ed queue...

Used to be infinite. Does it adjust the target?
> -Toke




* Re: [Make-wifi-fast] [Cerowrt-devel] babeld patch enabling ecn
From: Toke Høiland-Jørgensen @ 2018-08-28 20:32 UTC (permalink / raw)
  To: Dave Taht; +Cc: Make-Wifi-fast, cerowrt-devel

Dave Taht <dave.taht@gmail.com> writes:

> On Tue, Aug 28, 2018 at 12:05 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Dave Taht <dave.taht@gmail.com> writes:
>>
>> >   as the multicast queue is infinite and not fq_codeled.
>>
>> No it isn't. Multicast has its own fully fq-codel'ed queue...
>
> Used to be infinite.

Not since we enabled fq-codel'ed queues.

> Does it adjust the target?

Don't think so...

-Toke


* Re: [Make-wifi-fast] [Cerowrt-devel] babeld patch enabling ecn
From: Dave Taht @ 2018-08-28 20:37 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Make-Wifi-fast, cerowrt-devel

On Tue, Aug 28, 2018 at 1:32 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Dave Taht <dave.taht@gmail.com> writes:
>
> > On Tue, Aug 28, 2018 at 12:05 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> >>
> >> Dave Taht <dave.taht@gmail.com> writes:
> >>
> >> >   as the multicast queue is infinite and not fq_codeled.
> >>
> >> No it isn't. Multicast has its own fully fq-codel'ed queue...
> >
> > Used to be infinite.
>
> Not since we enabled fq-codel'ed queues.

Bulk dropper is also still just the qdisc.

> > Does it adjust the target?
>
> Don't think so...

Might explain how disastrous my test was.

>
> -Toke


