From: Dave Taht <dave.taht@gmail.com>
To: Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>,
cerowrt-devel@lists.bufferbloat.net
Subject: [Make-wifi-fast] babeld patch enabling ecn
Date: Tue, 28 Aug 2018 09:45:43 -0700
Message-ID: <CAA93jw46K7weUKq2tem9MbX87Cz0gDVqefmdTwF+Ee_V64ohfQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2461 bytes --]
In cleaning up the lab and some long out-of-tree patches, I guess I
should make a push for the following patch to be more thoroughly
tested in OpenWrt. Posting it for review here first, before running that
gauntlet....
[PATCH] Disable CS6 and enable ECN in Babeld
This one line patch disables CS6 marking and enables ECN in babeld.
ECN decouples "packet loss from congestion" from "loss due to bad connectivity".
It also moves unicast babel packets into the best effort queue.
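For readers who do not read traffic class bytes daily, here is a small sketch of how the two constants in the patch decompose. The byte holds a 6-bit DSCP in the upper bits and the 2-bit ECN field (RFC 3168) in the lower bits. The helper and enum names below are made up for illustration and are not part of babeld.

```c
#include <assert.h>
#include <stdint.h>

/* ECN codepoints from RFC 3168: the low two bits of the traffic class byte. */
enum ecn { ECN_NOT_ECT = 0x0, ECN_ECT1 = 0x1, ECN_ECT0 = 0x2, ECN_CE = 0x3 };

/* DSCP values used in this discussion: the upper six bits of the byte. */
enum { DSCP_BEST_EFFORT = 0, DSCP_CS6 = 48 };

/* Combine a DSCP and an ECN codepoint into one traffic class byte. */
static inline uint8_t traffic_class(unsigned dscp, enum ecn ecn)
{
    assert(dscp < 64);
    return (uint8_t)((dscp << 2) | ((unsigned)ecn & 0x3));
}

/* Split a traffic class byte back into its two fields. */
static inline unsigned tc_dscp(uint8_t tc) { return tc >> 2; }
static inline enum ecn tc_ecn(uint8_t tc) { return (enum ecn)(tc & 0x3); }
```

With these helpers, traffic_class(DSCP_CS6, ECN_NOT_ECT) is 0xc0 (the old value) and traffic_class(DSCP_BEST_EFFORT, ECN_ECT0) is 0x02 (the new one). The DSCP of zero is why the patch also lands the packets in best effort. A CS6 + ECN variant would be 0xc2.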
The good:
* OpenWrt is fully fq_codeled and doesn't pay much attention to diffserv
* ECN'd Routes stay up even under extreme congestion
* ECN'd Packet loss returns to a measure of connectivity only
* Killing CS6 saves bandwidth - The 802.11n VO queue (where CS6 falls
normally) cannot aggregate.
I would support a default qos-map for OpenWrt 802.11n devices that essentially
disables the VO queue, since aggregation works so much better
post fq_codel for wifi on ath* (in fact I'd disable VI and BK universally also,
as they work poorly in general on all devices),
and then keep CS6 + ECN.
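To make the "where CS6 falls normally" remark concrete, here is a rough sketch of the queue selection. It assumes no qos-map is configured and that the wifi stack derives the 802.1d user priority from the top three bits of the DS byte (my understanding of the Linux default), followed by the standard WMM priority-to-access-category table. The function and enum names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* The four WMM access categories (hardware queues) on 802.11n. */
enum wmm_ac { AC_BK, AC_BE, AC_VI, AC_VO };

/* Assumed default: user priority = top three bits of the DS byte,
 * ignoring the two ECN bits. */
static inline unsigned user_priority(uint8_t tclass)
{
    return (unsigned)(tclass & 0xfc) >> 5;
}

/* Standard WMM mapping from 802.1d user priority to access category. */
static inline enum wmm_ac wmm_access_category(unsigned up)
{
    assert(up < 8);
    switch (up) {
    case 1: case 2: return AC_BK;   /* background */
    case 0: case 3: return AC_BE;   /* best effort */
    case 4: case 5: return AC_VI;   /* video */
    default:        return AC_VO;   /* 6 and 7: voice */
    }
}
```

Under those assumptions 0xc0 (CS6) maps to user priority 6 and so to AC_VO, while 0x02 maps to user priority 0 and AC_BE. A CS6 + ECN marking (0xc2) would still land in AC_VO unless a qos-map redirects it.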
The bad:
* Babel does not do anything to reduce its rate on receipt of CE
or modify its metrics
Given a choice between losing core connectivity under congestion or not...
* a babeld instance using ECN over a fq_codel'd link will always be
more reachable than a non-ecn'd one
you can argue that a fq_codel'd link is generically faster than one that is not.
* CS6 does help somewhat on ethernet switches but it's largely been immeasurable
* Lacking an effective response to ECN, large babel networks can fill their FQ'd
queue with undroppable packets. This problem is generic, actually, ECN or no,
as the multicast queue is infinite and not fq_codeled.
Babel protocol packets, however, are light and form a single flow, so babel flooding
will be unnoticeable except to itself, and if it gets truly out of hand
the bulk dropper should kick in.
* Babeld should also independently schedule hellos from route announcements
and manage the route announcement queue better
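As a sketch of the missing piece in the first point above: before babeld could react to CE at all, it would have to see the mark. On Linux an IPv6 UDP socket can ask for the traffic class of each received packet with IPV6_RECVTCLASS and read it from the ancillary data returned by recvmsg(). None of this is in babeld today. The helper names are hypothetical, and what to do once CE is seen (slow down, adjust metrics) is deliberately left open.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Ask the kernel to attach the traffic class to every received packet. */
static inline int enable_recv_tclass(int s)
{
    int one = 1;
    return setsockopt(s, IPPROTO_IPV6, IPV6_RECVTCLASS, &one, sizeof(one));
}

/* Return the traffic class byte found in the ancillary data of a message
 * filled in by recvmsg(), or -1 if the kernel did not supply one. */
static inline int msg_tclass(struct msghdr *msg)
{
    struct cmsghdr *c;

    assert(msg != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IPV6 && c->cmsg_type == IPV6_TCLASS &&
            c->cmsg_len >= CMSG_LEN(sizeof(int))) {
            int tc;
            memcpy(&tc, CMSG_DATA(c), sizeof(tc));
            return tc & 0xff;
        }
    }
    return -1;
}

/* True if the traffic class carries the Congestion Experienced codepoint. */
static inline int is_ce(int tclass)
{
    return tclass >= 0 && (tclass & 0x3) == 0x3;
}
```

A receive path would call enable_recv_tclass() once on the socket, pass a control buffer to recvmsg(), and then check is_ce(msg_tclass(&msg)) per packet.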
After 5 years of running this patch in my deployment of babel + fq_codel, IMHO
this is the best of multiple bad alternatives, and it dramatically improves
network reliability.
It is the rough equivalent of adding a minimal "control plane" for
critical packets.
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
[-- Attachment #2: 0001-Disable-CS6-and-enable-ECN-in-Babeld.patch --]
[-- Type: application/octet-stream, Size: 3304 bytes --]
From 4ac61f74fc0f2ecc49b5304038e3077aae8f7b04 Mon Sep 17 00:00:00 2001
From: Dave Taht <dave@taht.net>
Date: Tue, 28 Aug 2018 09:38:26 -0700
Subject: [PATCH] Disable CS6 and enable ECN in Babeld
This one line patch disables CS6 marking and enables ECN in babeld.
ECN decouples "packet loss from congestion" from "loss due to bad connectivity".
It also moves unicast babel packets into the best effort queue.
The good:
* OpenWrt is fully fq_codeled and doesn't pay much attention to diffserv
* ECN'd Routes stay up even under extreme congestion
* ECN'd Packet loss returns to a measure of connectivity only
* Killing CS6 saves bandwidth - The 802.11n VO queue (where CS6 falls normally)
cannot aggregate.
I would support a default qos-map for OpenWrt 802.11n devices that essentially
disables the VO queue, since aggregation works so much better
post fq_codel for wifi on ath* (in fact I'd disable VI and BK universally also,
as they work poorly in general on all devices),
and then keep CS6 + ECN.
The bad:
* Babel does not do anything to reduce its rate on receipt of CE
or modify its metrics
Given a choice between losing core connectivity under congestion or not...
* a babeld instance using ECN over a fq_codel'd link will always be
more reachable than a non-ecn'd one
you can argue that a fq_codel'd link is generically faster than one that is not.
* CS6 does help somewhat on ethernet switches but it's largely been immeasurable
* Lacking an effective response to ECN, large babel networks can fill their FQ'd
queue with undroppable packets. This problem is generic, actually, ECN or no,
as the multicast queue is infinite and not fq_codeled.
Babel protocol packets, however, are light and form a single flow, so babel flooding
will be unnoticeable except to itself, and if it gets truly out of hand
the bulk dropper should kick in.
* Babeld should also independently schedule hellos from route announcements
and manage the route announcement queue better
After 5 years of running this patch in my deployment of babel + fq_codel, IMHO
this is the best of multiple bad alternatives, and it dramatically improves
network reliability.
It is the rough equivalent of adding a minimal "control plane" for critical packets.
Otherwise congested routers can (and do) "fall off the net" when they shouldn't.
Other routing protocol stacks (OSPF, BATMAN, BMX, ISIS) should look deeply
into the benefits and pitfalls of ECN.
We've long been exploring alternatives in babel (RTT metrics, unicast hellos, etc)
but none have arrived yet. There are also some statistically sound
means of randomizing and interleaving route announcements to get past congestive
drops that could be used...
... but this works "good enough", for now.
Submitted after a rapidly aborted attempt at deploying vanilla openwrt 18.06
on my production 280+ route network.
---
net.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net.c b/net.c
index 1e5890d..88bbab3 100644
--- a/net.c
+++ b/net.c
@@ -46,7 +46,7 @@ babel_socket(int port)
     int s, rc;
     int saved_errno;
     int one = 1, zero = 0;
-    const int ds = 0xc0;        /* CS6 - Network Control */
+    const int ds = 0x02;        /* ECT - Enable ECN */
 
     s = socket(PF_INET6, SOCK_DGRAM, 0);
     if(s < 0)
--
2.15.1 (Apple Git-101)