battling with babel and route changes
Dave Taht
dave.taht at gmail.com
Sun Jun 19 09:44:53 EDT 2011
While I am totally in love with babel as a routing protocol, at the
traffic volumes I'm regularly generating now
it's fairly trivial for a route change to stop a TCP stream in its
tracks. [1] I'm simply not familiar enough with netlink
to fix it, or even to know whether it can be fixed...
0) babel keeps all the routing information in its head. In particular,
it does not use the kernel metrics:
1) babel installs IPv4 routes with a metric of 0, IPv6 routes with a
metric of 1024
2) doing a route change via netlink seems to require a different metric
(or at least it did, several years ago, when the code was written)
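To make point 2 concrete, here is a toy model, not kernel code. It assumes a route is identified by (destination, prefix length, metric), which is what the behaviour described above suggests. All names (toy_add, toy_flush, toy_count) are hypothetical. Under that assumption, adding a replacement route with the same metric fails with EEXIST, so the old route has to be flushed first, and for a moment the prefix has no route at all.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_MAX 8

/* Toy stand-in for a kernel routing table; all names are hypothetical. */
struct toy_route {
    int used;
    unsigned char dest[16];
    int plen;
    int metric;
    unsigned char gate[16];
    int ifindex;
};

static struct toy_route toy_table[TOY_MAX];

/* Find the route with this exact (dest, plen, metric) key, or NULL. */
static struct toy_route *
toy_find(const unsigned char *dest, int plen, int metric)
{
    int i;
    for(i = 0; i < TOY_MAX; i++) {
        struct toy_route *r = &toy_table[i];
        if(r->used && r->plen == plen && r->metric == metric &&
           memcmp(r->dest, dest, 16) == 0)
            return r;
    }
    return NULL;
}

/* Add a route; fails with EEXIST if the key is already present
   (ASSUMPTION: the gateway is not part of the key). */
static int
toy_add(const unsigned char *dest, int plen, int metric,
        const unsigned char *gate, int ifindex)
{
    int i;
    assert(plen >= 0 && plen <= 128);
    if(toy_find(dest, plen, metric) != NULL) {
        errno = EEXIST;
        return -1;
    }
    for(i = 0; i < TOY_MAX; i++) {
        struct toy_route *r = &toy_table[i];
        if(!r->used) {
            r->used = 1;
            memcpy(r->dest, dest, 16);
            r->plen = plen;
            r->metric = metric;
            memcpy(r->gate, gate, 16);
            r->ifindex = ifindex;
            return 0;
        }
    }
    errno = ENOSPC;
    return -1;
}

/* Remove a route; fails with ESRCH if the key is not present. */
static int
toy_flush(const unsigned char *dest, int plen, int metric)
{
    struct toy_route *r = toy_find(dest, plen, metric);
    if(r == NULL) {
        errno = ESRCH;
        return -1;
    }
    r->used = 0;
    return 0;
}

/* Count the routes for this prefix, whatever their metric.  A count
   of zero is the window in which packets have nowhere to go. */
static int
toy_count(const unsigned char *dest, int plen)
{
    int i, n = 0;
    for(i = 0; i < TOY_MAX; i++) {
        struct toy_route *r = &toy_table[i];
        if(r->used && r->plen == plen && memcmp(r->dest, dest, 16) == 0)
            n++;
    }
    return n;
}
```

With different metrics, toy_add followed by toy_flush never lets toy_count drop to zero. With equal metrics the only order that works in this model is toy_flush then toy_add, and toy_count is zero in between.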
So I've been playing with the experimental babelz from Juliusz's darcs
repo, which can be obtained with
darcs get http://www.pps.jussieu.fr/~jch/software/repos/babelz/
The relevant bit of code is in kernel_netlink.c, where kernel_route()
calls itself recursively to handle a route change as an add plus a
flush, instead of changing the route in place.
    if(operation == ROUTE_MODIFY) {
        if(newmetric == metric && memcmp(newgate, gate, 16) == 0 &&
           newifindex == ifindex)
            return 0;
        /* It is better to add the new route before removing the old
           one, to avoid losing packets.  However, this only appears
           to work if the metrics are different. */
        if(newmetric != metric) {
            rc = kernel_route(ROUTE_ADD, dest, plen,
                              newgate, newifindex, newmetric,
                              NULL, 0, 0);
            if(rc < 0 && errno != EEXIST)
                return rc;
            rc = kernel_route(ROUTE_FLUSH, dest, plen,
                              gate, ifindex, metric,
                              NULL, 0, 0);
            if(rc < 0 && (errno == ENOENT || errno == ESRCH))
                rc = 1;
        } else {
            rc = kernel_route(ROUTE_FLUSH, dest, plen,
                              gate, ifindex, metric,
                              NULL, 0, 0);
            rc = kernel_route(ROUTE_ADD, dest, plen,
                              newgate, newifindex, newmetric,
                              NULL, 0, 0);
            if(rc < 0 && errno == EEXIST)
                rc = 1;
        }
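One avenue that may be worth testing is netlink's NLM_F_REPLACE flag, which is what `ip route replace` sets on an RTM_NEWROUTE request. The sketch below only builds such a request for an IPv4 route; it does not send it, and whether it closes the gap in the equal-metric case above is untested here. The helper names (add_attr, build_replace_v4) are hypothetical and are not part of babel, and RTPROT_BOOT is just a placeholder protocol value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: append one attribute to a netlink message.
   Returns 0, or -1 if it would not fit in maxlen bytes. */
static int
add_attr(struct nlmsghdr *nh, size_t maxlen, unsigned short type,
         const void *data, size_t len)
{
    size_t off = NLMSG_ALIGN(nh->nlmsg_len);
    size_t need = RTA_ALIGN(RTA_LENGTH(len));
    struct rtattr *rta;

    if(off + need > maxlen)
        return -1;
    rta = (struct rtattr *)((char *)nh + off);
    rta->rta_type = type;
    rta->rta_len = (unsigned short)RTA_LENGTH(len);
    memcpy(RTA_DATA(rta), data, len);
    nh->nlmsg_len = (uint32_t)(off + need);
    return 0;
}

/* Hypothetical helper: build an IPv4 "replace route" request in buf,
   which must be suitably aligned for a struct nlmsghdr.
   Returns the message length, or -1 if buf is too small. */
static int
build_replace_v4(void *buf, size_t buflen,
                 const unsigned char dest[4], int plen,
                 const unsigned char gate[4], int ifindex,
                 uint32_t metric)
{
    struct nlmsghdr *nh = buf;
    struct rtmsg *rtm;
    int oif = ifindex;

    assert(plen >= 0 && plen <= 32);
    if(buflen < (size_t)NLMSG_LENGTH(sizeof(struct rtmsg)))
        return -1;
    memset(buf, 0, buflen);

    nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    nh->nlmsg_type = RTM_NEWROUTE;
    /* CREATE|REPLACE without EXCL: create the route if it is missing,
       otherwise replace the existing one. */
    nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK |
                      NLM_F_CREATE | NLM_F_REPLACE;

    rtm = NLMSG_DATA(nh);
    rtm->rtm_family = AF_INET;
    rtm->rtm_dst_len = (unsigned char)plen;
    rtm->rtm_table = RT_TABLE_MAIN;
    rtm->rtm_protocol = RTPROT_BOOT;    /* ASSUMPTION: placeholder */
    rtm->rtm_scope = RT_SCOPE_UNIVERSE;
    rtm->rtm_type = RTN_UNICAST;

    if(add_attr(nh, buflen, RTA_DST, dest, 4) < 0 ||
       add_attr(nh, buflen, RTA_GATEWAY, gate, 4) < 0 ||
       add_attr(nh, buflen, RTA_OIF, &oif, sizeof(oif)) < 0 ||
       add_attr(nh, buflen, RTA_PRIORITY, &metric, sizeof(metric)) < 0)
        return -1;
    return (int)nh->nlmsg_len;
}
```

The resulting buffer would be written to a socket opened with socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE), which needs root, and the kernel's ACK read back to check for an error. Compared with the flush-then-add branch above, the point of trying this is that the change is a single request rather than two.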
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com
1: Lest you think this is not relevant to bufferbloat... I am
generating saturating loads across two test networks [2], while trying
to get real work done - saving the mice packets - you know, typing,
listening to streaming radio - and fixing bloat wherever it appears -
and I keep stumbling across problems like this one (assuming it's
real) that aren't related to bloat but affect the tests.
I THINK, but am not prepared to discuss, that I've found a
long-standing bug with DHCP, for example. Working on it...
2: Test network #1 - 8 wndr3700s and a PC at Georgia Tech
Test network #2 - Linux laptop, 2 wndr3700s, 3 nanostation M5s,
1 android, 1 ipad, 1 iphone, 1 nokia 770, 1 windows XP box,
1 vista box, and soon - 3 OLPCs.
I'm getting to where I can grant permission for others to hack on the
first network.
3: While this might be better on the babel list, I thought that
perhaps in-depth netlink knowledge was here...