From: Guus Sliepen
To: tinc-devel@tinc-vpn.org
Cc: "cerowrt-devel@lists.bufferbloat.net"
Date: Wed, 3 Dec 2014 13:02:46 +0100
Subject: Re: [Cerowrt-devel] tinc vpn: adding dscp passthrough (priorityinherit), ecn, and fq_codel support
Message-ID: <20141203120246.GO10533@sliepen.org>

On Wed, Dec 03, 2014 at 12:07:59AM -0800, Dave Taht wrote:

[...]
> https://github.com/dtaht/tinc
>
> I successfully converted tinc to use sendmsg and recvmsg, acquire (at
> least on linux) the TTL/Hoplimit and IP_TOS/IPv6_TCLASS packet fields,

Windows does not have sendmsg()/recvmsg(), but the BSDs support it.
> as well as SO_TIMESTAMPNS, and use a higher resolution internal clock.
> Got passing through the dscp values to work also, but:
>
> A) encapsulation of ecn capable marked packets, and availability in
> the outer header, without correct decapsulationm doesn't work well.
>
> The outer packet gets marked, but by default the marking doesn't make
> it back into the inner packet when decoded.

Is the kernel stripping the ECN bits provided by userspace? In the code
in your git branch you strip the ECN bits out yourself.

> So communicating somehow that a path can take ecn (and/or diffserv
> markings) is needed between tinc daemons. I thought of perhaps
> crafting a special icmp message marked with CE but am open to ideas
> that would be backward compatible.

PMTU probes are used to discover whether UDP works and how big the path
MTU is; maybe they could be used to discover whether ECN works as well?
Set one of the ECN bits on some of the PMTU probes, and if you receive a
probe with that ECN bit set, also set it on the probe reply. If you
successfully receive a reply with ECN bits set, then you know ECN works.
Since the remote side just echoes the contents of the probe, you could
also put a copy of the ECN bits in the probe payload, and then you can
detect if the ECN bits got zeroed. You can also define an OPTION_ECN in
src/connection.h, so nodes can announce their support for ECN, but that
should not be necessary I think.

> B) I have long theorized that a lot of userspace vpns bottleneck on
> the read and encapsulate step, and being strict FIFOs,
> gradually accumulate delay until finally they run out of read socket
> buffer space and start dropping packets.

Well, encryption and decryption take a lot of CPU time, but context
switches are also bad.

Tinc is treating UDP in a strictly FIFO way, but actually it does use a
RED algorithm when tunneling over TCP.
That said, it only looks at its own buffers to determine when to drop
packets, and those only come into play once the kernel's TCP buffers are
filled.

> so I had a couple thoughts towards using multiple rx queues in the
> vtun interface, and/or trying to read more than one packet at a time
> (via recvmmsg) and do some level of fair queueing and queue management
> (codel) inside tinc itself. I think that's
> pretty doable without modifying the protocol any, but I'm not sure of
> it's value until I saturate some cpu more.

I'd welcome any work in this area :)

> (and if you thought recvmsg was complex, look at recvmmsg)

It seems someone is already working on that, see
https://github.com/jasdeep-hundal/tinc.

> D)
>
> the bottleneck link above is actually not tinc but the gateway, and as
> the gateway reverts to codel behavior on a single encapsulated flow
> encapsulating all the other flows, we end up with about 40ms of
> induced delay on this test. While I have a better codel (gets below
> 20ms latency, not deployed), *fq*_codel by identifying individual
> flows gets the induced delay on those flows down below 5ms.

But that should improve with ECN if fq_codel is configured to use that,
right?

> At one level, tinc being so nicely meshy means that the "fq" part of
> fq_codel on the gateway will have more chance to work against the
> multiple vpn flows it generates for all the potential vpn endpoints...
>
> but at another... lookie here! ipv6! 2^64 addresses or more to use!
> and port space to burn! What if I could make tinc open up 1024 ports
> per connection, and have it fq all it's flows over those? What could
> go wrong?

Right, hash the header of the original packets, and then select a port
or address based on the hash? What about putting that hash in the flow
label of outer packets? Any routers that would actually treat those as
separate flows?
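
To make the hashing idea concrete, here is a rough sketch in C. It is not
code from tinc or from either git branch; the function names, the choice
of FNV-1a as the hash, and the restriction to inner IPv4 packets are all
assumptions made for illustration. It hashes the protocol, addresses and
(for TCP/UDP) ports of the inner packet, then maps the hash either onto
one of N outer ports or onto a 20-bit IPv6 flow label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of tinc. */

/* 32-bit FNV-1a, continued from a running hash value. */
static uint32_t fnv1a(const uint8_t *data, size_t len, uint32_t hash) {
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;
    }
    return hash;
}

/* Hash the flow-identifying fields of an inner IPv4 packet.
 * Returns 0 if the buffer is too short or is not IPv4. */
static uint32_t flow_hash_ipv4(const uint8_t *pkt, size_t len) {
    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;

    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;   /* header length in bytes */
    if (ihl < 20 || len < ihl)
        return 0;

    uint32_t h = 2166136261u;                   /* FNV offset basis */
    h = fnv1a(pkt + 9, 1, h);                   /* protocol */
    h = fnv1a(pkt + 12, 8, h);                  /* source + destination address */

    uint8_t proto = pkt[9];
    uint16_t frag_off = (uint16_t)(((pkt[6] & 0x1f) << 8) | pkt[7]);

    /* Ports are only present in the first fragment of TCP (6) or UDP (17). */
    if ((proto == 6 || proto == 17) && frag_off == 0 && len >= ihl + 4)
        h = fnv1a(pkt + ihl, 4, h);             /* source + destination port */

    return h;
}

/* Map a hash onto one of nports consecutive outer ports starting at base.
 * Assumes base + nports does not exceed 65536. */
static uint16_t pick_port(uint32_t hash, uint16_t base, uint16_t nports) {
    assert(nports > 0);
    return (uint16_t)(base + hash % nports);
}

/* Map a hash onto the 20-bit IPv6 flow label field. */
static uint32_t pick_flow_label(uint32_t hash) {
    return hash & 0xfffffu;
}
```

With 1024 ports per connection, something like
pick_port(flow_hash_ipv4(pkt, len), 40000, 1024) would keep each inner
flow on one outer port, so packets within a flow are not spread across
queues. Whether routers on the path look at the flow label at all is
still the open question.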
-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen