From: Dave Taht
To: tinc-devel@tinc-vpn.org
Cc: cerowrt-devel@lists.bufferbloat.net
Date: Wed, 3 Dec 2014 00:07:59 -0800
Subject: [Cerowrt-devel] tinc vpn: adding dscp passthrough (priorityinherit), ecn, and fq_codel support

I have long included tinc in the cerowrt project as a lighter weight, meshy alternative to conventional vpns. I sat down a few days ago to think about how to make vpn connections work better through fq_codel, and decided I should maybe hack on a vpn to do the job.

So I picked up tinc's source code for the first time, got it working on IPv6 as a switch in a matter of minutes between two endpoints (very impressed, thx!), and started hacking at it to see where I would get.

This is partially the outgrowth of looking at an ietf document on ecn encapsulation and vpns:

https://tools.ietf.org/html/rfc6040

Experimental patches so far are at:

https://github.com/dtaht/tinc

I successfully converted tinc to use sendmsg and recvmsg, to acquire (at least on linux) the TTL/hoplimit and IP_TOS/IPV6_TCLASS packet fields as well as SO_TIMESTAMPNS, and to use a higher resolution internal clock (a rough sketch of the ancillary data handling is below). Passing the dscp values through works too, but:

A) Encapsulating ecn-capable-marked packets, so that the marking is available in the outer header, works poorly without correct decapsulation. The outer packet gets marked, but by default the marking doesn't make it back into the inner packet when decoded. See:

http://snapon.lab.bufferbloat.net/~d/tinc/ecn.png
# packets get marked but not decapsulated - and never dropped, so they just keep accumulating delay over this path....

vs

http://snapon.lab.bufferbloat.net/~d/tinc/noecn.png

So some way of communicating between tinc daemons that a path can take ecn (and/or diffserv markings) is needed. I thought of perhaps crafting a special icmp message marked with CE, but am open to ideas that would be backward compatible.
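For reference, the ancillary data handling I mean is roughly the following. This is a minimal sketch rather than the actual patch: linux-only, no error handling, the function names are made up, and the ipv6 side (IPV6_RECVTCLASS/IPV6_TCLASS, IPV6_RECVHOPLIMIT/IPV6_HOPLIMIT) works analogously:

/* sketch: read the tos byte, ttl, and a kernel rx timestamp from
 * ancillary data on an ipv4 udp socket. linux-only, no error handling. */
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <time.h>

/* ask the kernel to deliver the metadata, at socket setup time */
static void enable_meta(int fd)
{
    int on = 1;
    setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on));
    setsockopt(fd, IPPROTO_IP, IP_RECVTTL, &on, sizeof(on));
    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(on));
}

static ssize_t recv_with_meta(int fd, void *buf, size_t len,
                              int *tos, int *ttl, struct timespec *stamp)
{
    char cbuf[256];
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };
    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return n;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TOS)
            *tos = *(unsigned char *)CMSG_DATA(c);       /* dscp + ecn bits */
        else if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TTL)
            memcpy(ttl, CMSG_DATA(c), sizeof(*ttl));     /* 32-bit ttl */
        else if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPNS)
            memcpy(stamp, CMSG_DATA(c), sizeof(*stamp)); /* kernel rx time */
    }
    return n;
}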
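And for (A), the egress behaviour rfc 6040 asks for boils down to something like this - again just a sketch, applied to the decrypted inner packet before it goes back to the tun/tap device, and glossing over the ipv4 header checksum update and where the ipv6 traffic class bits live:

/* sketch of the rfc 6040 egress (decapsulation) rule. the two low bits
 * of the tos byte are the ecn field. */
#include <stdint.h>

#define ECN_NOT_ECT 0x0
#define ECN_ECT_1   0x1
#define ECN_ECT_0   0x2
#define ECN_CE      0x3

/* outer_ecn comes from the tos/tclass byte recovered via the cmsgs above.
 * returns 0 to forward the (possibly remarked) inner packet, -1 to drop. */
static int ecn_decapsulate(uint8_t outer_ecn, uint8_t *inner_tos)
{
    uint8_t inner_ecn = *inner_tos & 0x3;

    switch (outer_ecn & 0x3) {
    case ECN_CE:
        if (inner_ecn == ECN_NOT_ECT)
            return -1;                        /* congestion was signalled, but the
                                                 inner packet can't carry CE: drop */
        *inner_tos |= ECN_CE;                 /* propagate CE into the inner header */
        break;
    case ECN_ECT_1:
        if (inner_ecn == ECN_ECT_0)           /* ECT(0) -> ECT(1) per the rfc */
            *inner_tos = (*inner_tos & ~0x3) | ECN_ECT_1;
        break;
    default:                                  /* not-ect or ect(0) outer: */
        break;                                /* leave the inner field alone */
    }
    return 0;
}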
It IS nice to be able to operate with near zero packet loss.....

B) I have long theorized that a lot of userspace vpns bottleneck on the read-and-encapsulate step, and, being strict FIFOs, gradually accumulate delay until they finally run out of read socket buffer space and start dropping packets.

So I had a couple of thoughts about using multiple rx queues in the vtun interface, and/or reading more than one packet at a time (via recvmmsg) and doing some level of fair queueing and queue management (codel) inside tinc itself (a rough sketch of the recvmmsg side is at the bottom of this mail). I think that's pretty doable without modifying the protocol at all, but I'm not sure of its value until I saturate some cpu more.

(and if you thought recvmsg was complex, look at recvmmsg)

C) Moving forward, in this case it looks like I am bottlenecked on my gateway anyway (only eating 36% of cpu at this speed), and not showing any substantial delays with SO_TIMESTAMPNS (but I haven't fully checked that).

http://snapon.lab.bufferbloat.net/~d/tinc2/native_ipv6.png
http://snapon.lab.bufferbloat.net/~d/tinc2/tunneled_classified.png

I am a little puzzled as to how well tinc handles out-of-order packet delivery (the EF, BE, and BK (CS1) diffserv queues are handled differently by the shaper on the gateway)... and:

D) The bottleneck link above is actually not tinc but the gateway, and as the gateway reverts to plain codel behavior on the single encapsulated flow carrying all the other flows, we end up with about 40ms of induced delay on this test. While I have a better codel (gets below 20ms latency, not deployed), *fq*_codel, by identifying individual flows, gets the induced delay on those flows down below 5ms.

At one level, tinc being so nicely meshy means that the "fq" part of fq_codel on the gateway will have more chance to work against the multiple vpn flows it generates for all the potential vpn endpoints... but at another... lookie here! ipv6! 2^64 addresses or more to use! and port space to burn! What if I could make tinc open up 1024 ports per connection, and have it fq all its flows over those? What could go wrong? (very rough sketch at the bottom of this mail)

--
Dave Täht
http://www.bufferbloat.net
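For (B) above, the batched read I have in mind looks roughly like this. A sketch only, linux-specific and untested; enqueue_by_flow() is a made-up placeholder for whatever fq/codel stage would live inside tinc, and the per-packet tos/timestamp cmsg handling from the earlier sketch would hang off each msg_hdr:

/* sketch: pull a batch of datagrams off the udp socket in one syscall,
 * then hand them to a per-flow queueing stage instead of processing them
 * strictly fifo. */
#define _GNU_SOURCE
#include <string.h>
#include <stdint.h>
#include <sys/socket.h>

#define BATCH 64
#define MTU   1500

struct pkt {
    uint8_t data[MTU];
    size_t  len;
};

/* placeholder for the hypothetical fq/codel stage inside tinc: classify on
 * the inner header, queue per flow, drain with codel doing any dropping */
static void enqueue_by_flow(struct pkt *p)
{
    (void)p;
}

static int read_batch(int fd, struct pkt *pkts)
{
    struct mmsghdr msgs[BATCH];
    struct iovec   iovs[BATCH];

    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < BATCH; i++) {
        iovs[i].iov_base = pkts[i].data;
        iovs[i].iov_len  = MTU;
        msgs[i].msg_hdr.msg_iov    = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
        /* per-packet tos/timestamp cmsgs would go in msg_hdr.msg_control */
    }

    /* up to BATCH packets per syscall; nonblocking so the main loop never
     * stalls when the socket runs dry */
    int n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);

    for (int i = 0; i < n; i++) {
        pkts[i].len = msgs[i].msg_len;
        enqueue_by_flow(&pkts[i]);
    }
    return n;
}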
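And the "port space to burn" idea, very roughly: hash each inner flow onto one of a block of local udp source ports, so that a 5-tuple hash along the path sees many tunnel flows instead of one encapsulated blob. Everything here (the hash, the base port, the block size) is made up for illustration, and it ignores NAT and how the other end would agree to listen on those ports:

/* sketch: spread inner flows across a block of local udp source ports */
#include <stddef.h>
#include <stdint.h>

#define NPORTS    1024     /* "1024 ports per connection" */
#define BASE_PORT 50000    /* arbitrary base of the port block */

/* stand-in flow hash over the inner packet's addresses+ports (fnv-1a);
 * anything reasonably uniform and stable per flow would do */
static uint32_t flow_hash(const uint8_t *key, size_t len)
{
    uint32_t h = 2166136261u;
    while (len--)
        h = (h ^ *key++) * 16777619u;
    return h;
}

/* pick the local source port to emit this inner flow's packets on */
static uint16_t pick_source_port(const uint8_t *flow_key, size_t keylen)
{
    return (uint16_t)(BASE_PORT + flow_hash(flow_key, keylen) % NPORTS);
}

(linux's fq_codel hashes on the 5-tuple, so varying the source port like this should be enough to land the tunnel's inner flows in different queues at the gateway.)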