[Cerowrt-devel] tinc vpn: adding dscp passthrough (priorityinherit), ecn, and fq_codel support

Wed Dec 3 03:07:59 EST 2014

I have long included tinc in the cerowrt project as a lighter weight,
meshy alternative to conventional vpns.

I sat down a few days ago to think about how to make vpn connections
work better through fq_codel, and decided I should maybe hack on a vpn
to do the job. So I picked up tinc's source code for the first time,
got it working on IPv6 as a switch in a matter of minutes between two
endpoints (very impressed, thx!), and started hacking at it to see
where I would get.

This is partially the outgrowth of looking at an ietf document on ecn
encapsulation and vpns....

https://tools.ietf.org/html/rfc6040

Experimental patches so far are at:

https://github.com/dtaht/tinc

I successfully converted tinc to use sendmsg and recvmsg, to acquire
(at least on linux) the TTL/hop limit and IP_TOS/IPV6_TCLASS packet
fields as well as SO_TIMESTAMPNS timestamps, and to use a higher
resolution internal clock. I also got passing through the dscp values
to work, but:

A) encapsulation of ecn capable marked packets, with the marking made
available in the outer header, doesn't work well without correct
decapsulation.

The outer packet gets marked, but by default the marking doesn't make
it back into the inner packet when decoded.

see:

http://snapon.lab.bufferbloat.net/~d/tinc/ecn.png
(packets get marked but not decapsulated, and never dropped, so they
just keep accumulating delay over this path)

vs

http://snapon.lab.bufferbloat.net/~d/tinc/noecn.png

So communicating somehow that a path can take ecn (and/or diffserv
markings) is needed between tinc daemons. I thought of perhaps
crafting a special icmp message marked with CE but am open to ideas
that would be backward compatible.

It IS nice to be able to operate with near zero packet loss.....
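
For reference, the decapsulation behaviour that RFC 6040 asks for in
(A) fits in a few lines. This is only a Python sketch of the RFC's
decapsulation table, not the code in the patches above; the function
names are made up for illustration, and keeping the inner DSCP on
decapsulation is an assumption rather than something the RFC mandates.

```python
# ECN codepoints: the low two bits of the IPv4 TOS / IPv6 traffic class byte.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def split_tos(tos):
    """Split a TOS/TCLASS byte into (dscp, ecn)."""
    return tos >> 2, tos & 0x03


def encapsulate_tos(inner_tos):
    """Outer TOS on encapsulation: copy the ECN field (RFC 6040 normal
    mode) and, as an assumed policy, pass the DSCP through as well."""
    return inner_tos & 0xFF


def decapsulate_ecn(inner_ecn, outer_ecn):
    """Combine inner and outer ECN fields per the RFC 6040 decapsulation
    table. Returns the ECN field for the forwarded inner packet, or None
    if the packet should be dropped."""
    if inner_ecn == NOT_ECT:
        # The inner flow cannot understand a mark, so a CE mark on the
        # outer header has to become a drop.
        return None if outer_ecn == CE else NOT_ECT
    if outer_ecn == CE:
        return CE            # propagate the congestion mark inward
    if inner_ecn == ECT0 and outer_ecn == ECT1:
        return ECT1          # ECT(1) on the outer wins over ECT(0)
    return inner_ecn


def decapsulate_tos(inner_tos, outer_tos):
    """Return the inner packet's new TOS byte, or None to drop it.
    Keeps the inner DSCP (assumption) and merges the ECN fields."""
    dscp, inner_ecn = split_tos(inner_tos)
    _, outer_ecn = split_tos(outer_tos)
    ecn = decapsulate_ecn(inner_ecn, outer_ecn)
    if ecn is None:
        return None
    return (dscp << 2) | ecn


# EF + ECT(0) inner, CE-marked outer -> EF + CE (0xBB)
marked = decapsulate_tos(0xBA, 0xBB)
```

The drop case is the one that matters for backward compatibility: a
mark on the outer header of a packet whose inner flow is not ecn
capable can only be signalled by dropping it.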

B) I have long theorized that a lot of userspace vpns bottleneck on
the read and encapsulate step and, being strict FIFOs, gradually
accumulate delay until finally they run out of read socket buffer
space and start dropping packets.

So I had a couple of thoughts about using multiple rx queues in the
vtun interface, and/or trying to read more than one packet at a time
(via recvmmsg) and doing some level of fair queueing and queue
management (codel) inside tinc itself. I think that's pretty doable
without modifying the protocol at all, but I'm not sure of its value
until I saturate some cpu more. (And if you thought recvmsg was
complex, look at recvmmsg.)
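
To make the "fair queueing inside the daemon" idea concrete, here is a
rough Python sketch of deficit round robin over per-flow queues. It is
an illustration under assumptions, not tinc code: the class and method
names are invented, flows are keyed directly rather than hashed into a
fixed number of buckets, and the codel part (dropping or marking on
sojourn time) is left out entirely.

```python
from collections import deque


class FlowQueues:
    """Per-flow queues served by deficit round robin (DRR)."""

    def __init__(self, quantum=1514):
        self.quantum = quantum   # byte credit handed out per round
        self.queues = {}         # flow key -> deque of packets (bytes)
        self.deficit = {}        # flow key -> remaining byte credit
        self.active = deque()    # round-robin order of backlogged flows

    def enqueue(self, flow_key, packet):
        if flow_key not in self.queues:
            # New flow: start it with one quantum of credit.
            self.queues[flow_key] = deque()
            self.deficit[flow_key] = self.quantum
            self.active.append(flow_key)
        self.queues[flow_key].append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if nothing is queued."""
        while self.active:
            key = self.active[0]
            queue = self.queues[key]
            if self.deficit[key] < len(queue[0]):
                # Out of credit: top up and let the next flow go first.
                self.deficit[key] += self.quantum
                self.active.rotate(-1)
                continue
            packet = queue.popleft()
            self.deficit[key] -= len(packet)
            if not queue:
                # Flow drained: forget its state.
                self.active.popleft()
                del self.queues[key]
                del self.deficit[key]
            return packet
        return None


# A bulk flow queues three big packets before a sparse flow queues one
# small packet; DRR lets the small packet out after the first big one.
fq = FlowQueues(quantum=1000)
bulk = [bytes([i]) * 1000 for i in range(3)]
small = b"x" * 100
for pkt in bulk:
    fq.enqueue("bulk", pkt)
fq.enqueue("sparse", small)
order = [fq.dequeue() for _ in range(4)]
```

In a strict FIFO the small packet would wait behind all three big
ones, which is exactly the delay accumulation described in (B).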

C) Moving forward, in this case, it looks like I am bottlenecked on my
gateway anyway (only eating 36% of cpu at this speed, and not showing
any substantial delays with SO_TIMESTAMPNS, though I haven't fully
checked that).

http://snapon.lab.bufferbloat.net/~d/tinc2/native_ipv6.png

http://snapon.lab.bufferbloat.net/~d/tinc2/tunneled_classified.png

I am a little puzzled as to how well tinc handles out of order packet
delivery (the EF, BE, and BK (CS1) diffserv queues are handled
differently by the shaper on the gateway)...

and:

D) The bottleneck link above is actually not tinc but the gateway, and
as the gateway reverts to codel behavior on a single encapsulated flow
encapsulating all the other flows, we end up with about 40ms of
induced delay on this test. While I have a better codel (gets below
20ms latency, not deployed), *fq*_codel, by identifying individual
flows, gets the induced delay on those flows down below 5ms.

At one level, tinc being so nicely meshy means that the "fq" part of
fq_codel on the gateway will have more chance to work against the
multiple vpn flows it generates for all the potential vpn endpoints...

but at another... lookie here! ipv6! 2^64 addresses or more to use!
and port space to burn! What if I could make tinc open up 1024 ports
per connection, and have it fq all its flows over those? What could
go wrong?
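
One way the port-spreading idea could look, as a Python sketch only:
hash each inner flow's 5-tuple onto one of 1024 outer UDP ports, so
the gateway's fq_codel sees many outer flows instead of one. Nothing
here exists in tinc; the base port, the function name, and the use of
SHA-256 (a real implementation would presumably want a cheaper keyed
hash) are all assumptions made for illustration.

```python
import hashlib
import ipaddress
import struct

PORTS_PER_PEER = 1024   # from the "1024 ports per connection" idea
BASE_PORT = 20000       # hypothetical start of the outer port range


def outer_port(src, dst, proto, sport, dport,
               base=BASE_PORT, count=PORTS_PER_PEER):
    """Pick the outer UDP port for an inner flow.

    The same inner 5-tuple always maps to the same outer port, so the
    packets of one inner flow stay on one outer flow.
    """
    key = (ipaddress.ip_address(src).packed
           + ipaddress.ip_address(dst).packed
           + struct.pack("!BHH", proto, sport, dport))
    digest = hashlib.sha256(key).digest()
    return base + int.from_bytes(digest[:4], "big") % count


# Two packets of the same inner TCP flow land on the same outer port.
first = outer_port("fd00::1", "fd00::2", 6, 40000, 443)
second = outer_port("fd00::1", "fd00::2", 6, 40000, 443)
```

Pinning each inner flow to a single outer port is a deliberate choice
in this sketch: spraying one flow's packets across ports would invite
reordering within that flow, which is one obvious answer to "what
could go wrong".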

-- 
Dave Täht

http://www.bufferbloat.net