Date: Wed, 3 Dec 2014 19:45:09 -0500 (EST)
From: dpreed@reed.com
To: "Dave Taht"
References: <20141203120246.GO10533@sliepen.org>
	<892513fe-8e57-4ee9-be7d-423a3afb4fba@reed.com>
Message-ID: <1417653909.838517290@apps.rackspace.com>
Cc: Guus Sliepen <guus@tinc-vpn.org>, tinc-devel@tinc-vpn.org,
	cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] tinc vpn: adding dscp passthrough
	(priorityinherit), ecn, and fq_codel support
List-Id: Development issues regarding the cerowrt test router project
Awesome start on the issue, in your note, Dave. Tor needs to change for
several reasons - not that it isn't great, but with IPv6 and other things
coming on line, plus the understanding of fq_codel's rationale, plus ... -
the world can do much better. Same with VPNs.

I hope we can set our sights on a convergent target that doesn't get bogged
down in the tradeoffs that were made when VPNs were originally proposed.
The world is no longer a bunch of disconnected networks protected by
Cheswick firewalls. Cheswick said they were only temporary, and they've
outlived their usefulness - they actually create security risks more than
they fix them (centralizing security creates points of failure and attack
that exponentially decrease the attackers' work factor). To some extent
that is also true for Tor after these many years.

By putting the intelligence about security in the network, you basically do
all the bad things that the end-to-end argument encourages you to avoid. We
could also put congestion control in the network by re-creating admission
control and requiring contractual agreements to carry traffic across every
intermediary. But I think that basically destroys almost all the value of
an "inter" net. It makes it a balkanized proprietary set of subnets that
have dozens of reasons why you can't connect with anyone else, and no way
to be free to connect.


On Wednesday, December 3, 2014 2:44pm, "Dave Taht" <dave.taht@gmail.com> said:
> On Wed, Dec 3, 2014 at 6:17 AM, David P. Reed <dpreed@reed.com> wrote:
> > Tor needs this stuff very badly.
>
> Tor has many, many problematic behaviors relevant to congestion control
> in general. Let me paste a bit of private discussion I'd had on it in a second,
> but a very good paper that touched upon it all was:
>
> DefenestraTor: Throwing out Windows in Tor
> http://www.cypherpunks.ca/~iang/pubs/defenestrator.pdf
>
> Honestly tor needs to move to udp, and hide in all the upcoming
> webrtc traffic....
>
> http://blog.mozilla.org/futurereleases/2014/10/16/test-the-new-firefox-hello-webrtc-feature-in-firefox-beta/
>
> webrtc needs some sort of non-centralized rendezvous mechanism, but I am REALLY
> happy to see calls and video stay entirely inside my network when they can be
> negotiated as such.
>
> https://plus.google.com/u/0/107942175615993706558/posts/M4xUtpCKJ4P
>
> And of course, people are busily reinventing torrent in webrtc without
> paying attention to congestion control at all.
>
> https://github.com/feross/webtorrent/issues/39
>
> Giving access to udp to javascript programmers... what could go wrong?
> :/
>
> > I do wonder whether we should focus on vpn's rather than end to end
> > encryption that does not leak secure information through from inside as the
> > plan seems to do.
>
> "plan"?
>
> I like e2e encryption. I also like overlay networks. And meshes.
> And working dns and service discovery. And low latency.
>
> vpns are useful abstractions for sharing an address space you
> may not want to share more widely.
>
> and: I've taken a lot of flack about how fq doesn't help on conventional
> vpns, and well, just came up with an unconventional vpn idea,
> that might have some legs here... (certainly in my case tinc
> as constructed already, no patches, solves hooking together the
> 12 networks I have around the globe, mostly)
>
> As for "leaking information", packet size and frequency is generally
> an obvious indicator of a given traffic type, some padding added or
> no. There is one piece of plaintext
> in tinc (the seqno), also. It also uses a fixed port number for both
> sides of the connection (perhaps it shouldn't)
>
> So I don't necessarily see a difference between sending a whole lot of
> varying data on one tuple
>
> 2001:db8::1 <-> 2001:db8:1::1 on port 655
>
> vs
>
> 2001:db8::1 <-> 2001:db8:1::1 port 655
> 2001:db8::2 <-> 2001:db8:1::1 port 655
> 2001:db8::3 <-> 2001:db8:1::1 port 655
> 2001:db8::4 <-> 2001:db8:1::1 port 655
> ....
>
> which solves the fq problem on a vpn like tinc neatly. A security feature
> could be source specific routing where we send stuff over different paths
> from different ipv6 source addresses... and mixing up the src/dest ports
> more but that complexifies the fq portion of the algo.... my thought
> for an initial implementation is to just hard code the ipv6 address range.
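A minimal sketch of the flow-pinning half of that idea (assuming Linux, and
assuming the 1024 source addresses are already usable on the host - see the
AnyIP sketch further down; fnv1a() and send_flowpinned() are illustrative
names, not tinc code):

    #define _GNU_SOURCE             /* for struct in6_pktinfo on glibc */
    #include <netinet/in.h>
    #include <string.h>
    #include <stdint.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define NSRC 1024               /* the 2001:db8::/118 example above */

    static uint32_t fnv1a(const uint8_t *p, size_t n) {
        uint32_t h = 2166136261u;
        while (n--) h = (h ^ *p++) * 16777619u;
        return h;
    }

    /* Hash the inner packet's flow tuple and pin one of NSRC source
     * addresses via IPV6_PKTINFO, so an fq_codel gateway hashing the
     * outer 5-tuple sees NSRC distinct flows instead of one. */
    static ssize_t send_flowpinned(int fd, const void *pkt, size_t len,
                                   const struct sockaddr_in6 *dst,
                                   const struct in6_addr src[NSRC],
                                   const uint8_t *flowkey, size_t keylen)
    {
        struct in6_pktinfo pi = { .ipi6_addr = src[fnv1a(flowkey, keylen) % NSRC] };
        union { char buf[CMSG_SPACE(sizeof pi)]; struct cmsghdr align; } u;
        struct iovec iov = { .iov_base = (void *)pkt, .iov_len = len };
        struct msghdr msg = {
            .msg_name = (void *)dst, .msg_namelen = sizeof *dst,
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = u.buf, .msg_controllen = sizeof u.buf,
        };
        struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
        cm->cmsg_level = IPPROTO_IPV6;
        cm->cmsg_type  = IPV6_PKTINFO;
        cm->cmsg_len   = CMSG_LEN(sizeof pi);
        memcpy(CMSG_DATA(cm), &pi, sizeof pi);
        return sendmsg(fd, &msg, 0);
    }

The flowkey here would be the inner packet's addresses, ports, and protocol.
The kernel rejects source addresses it doesn't consider local, which is
exactly the problem discussed next.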
> I think however that adding tons and tons of ipv6 addresses to a given
> interface is probably slow,
> and might break things like nd and/or multicast...
>
> what would be cooler would be if you could allocate an entire /64 (or
> /118) to the vpn daemon
>
> bindtoaddress(2001:db8::/118) (give me all the data for 1024 ips)
>
> but I am not sure how to go about doing that..
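The AnyIP trick on Linux gets most of the way there already. A hedged
sketch (generic Linux, not anything tinc does today): route the whole
prefix to the local table, bind one wildcard socket, and recover the
per-packet destination address with IPV6_RECVPKTINFO:

    /* Assumed one-time setup outside the daemon:
     *     ip -6 route add local 2001:db8::/118 dev lo
     * after which the kernel accepts packets for all 1024 addresses. */
    #define _GNU_SOURCE
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>

    static int open_anyip_socket(uint16_t port) {
        int fd = socket(AF_INET6, SOCK_DGRAM, 0);
        int on = 1;
        /* attach each datagram's destination address as ancillary data */
        setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof on);
        struct sockaddr_in6 sa = { .sin6_family = AF_INET6,
                                   .sin6_port   = htons(port),
                                   .sin6_addr   = in6addr_any };
        bind(fd, (struct sockaddr *)&sa, sizeof sa);
        return fd;   /* one socket now hears all 1024 ips */
    }

Receives then carry an IPV6_PKTINFO cmsg saying which of the addresses was
hit, and replies can pin that same address as their source, as in the
sketch above.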
> ...moving back to a formerly private discussion about tor's woes...
>
> "This conversation is a bit separate from #11197 (which is an
> implementation issue in obfsproxy), so separate discussion somewhere
> would probably be required.
>
> So, there appears to be a slight misconception on how tor traffic
> travels across the Internet that I will attempt to clarify, and
> hopefully not get too terribly wrong.
>
> Each step of a given connection over tor involves multiple TCP/IP
> connections. To use a standard example of someone trying to watch Cat
> Videos on the "real internet", it will look approximately like thus:
>
> Client <-> Guard <-> Relay <-> Exit <-> Cat Videos
>
> Each step is a separate TCP/IP connection, authenticated and encrypted
> via TLS (TLS is likewise terminated at each hop). Using a pluggable
> transport encapsulates the first hop's TLS session with a different
> protocol, be it obfs2, obfs3, or something else.
>
> The cat videos are passed through this path of many TCP/IP connections
> across things called Circuits that are created/extended by the Client
> one hop at a time (so in the example above, the kitty cats travel
> across 4 TCP/IP connections, relaying data across a Circuit that spans
> from the Client to the Exit. If my art skills were up to it, I would
> draw a diagram.).
>
> Circuits are currently required to provide reliable, in-order delivery.
>
> In addition to the standard congestion control provided by TCP/IP on a
> per-hop basis, there is Circuit level flow control *and* "end to end"
> flow control in the form of RELAY_SENDME cells, but given that multiple
> circuits can end up being multiplexed over a singular TCP/IP
> connection, propagation of these RELAY_SENDME cells can get delayed due
> to HOL issues.
>
> So, with that quick and dirty overview out of the way:
>
> * "Ah so if ecn is enabled it can be used?"
>
> ECN will be used if it is enabled, *but* the congestion information
> will not get propagated to the source/destination of a given stream.
>
> * "Does it retain iw10 (the Linux default nowadays sadly)?"
>
> Each TCP/IP connection, if sent from a host that uses an obnoxiously
> large initial window, will have an obnoxiously large initial window.
>
> It is worth noting that since multiple Circuits originating from
> potentially numerous clients can and will reuse existing TCP/IP
> connections if able to (see 5.3.1 of the tor spec), dropping packets
> between tor relays is kind of bad, because all of the separate
> encapsulated flows sharing the singular TCP/IP link will suffer (ECN
> would help here). This situation is rather unfortunate, as the good
> active queue management algorithms drop packets (when ECN is not
> available).
>
> A better summary of tor's flow control/bufferbloat woes is given in:
>
> DefenestraTor: Throwing out Windows in Tor
> http://www.cypherpunks.ca/~iang/pubs/defenestrator.pdf
>
> The N23 algorithm suggested in the paper did not end up getting
> implemented into Tor, but I do not remember the reason off the top of
> my head."
>
> > On Dec 3, 2014, Guus Sliepen <guus@tinc-vpn.org> wrote:
> >>
> >> On Wed, Dec 03, 2014 at 12:07:59AM -0800, Dave Taht wrote:
> >>
> >> [...]
> >>>
> >>> https://github.com/dtaht/tinc
> >>>
> >>> I successfully converted tinc to use sendmsg and recvmsg, acquire (at
> >>> least on linux) the TTL/Hoplimit and IP_TOS/IPv6_TCLASS packet fields,
> >>
> >> Windows does not have sendmsg()/recvmsg(), but the BSDs support it.
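For reference, the receive side of that on Linux is the RFC 3542
ancillary-data dance (IPv4 uses IP_RECVTOS/IP_RECVTTL instead; the BSDs
differ in detail, and Windows would need WSARecvMsg). A sketch:

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    static void want_tclass_hoplimit(int fd) {
        int on = 1;
        setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS,   &on, sizeof on);
        setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPLIMIT, &on, sizeof on);
    }

    /* Returns packet length; fills in tclass (DSCP+ECN) and hoplimit
     * from the ancillary data the kernel attaches after the above. */
    static ssize_t recv_with_meta(int fd, void *buf, size_t len,
                                  int *tclass, int *hoplimit) {
        union { char buf[CMSG_SPACE(sizeof(int)) * 2]; struct cmsghdr align; } u;
        struct iovec iov = { .iov_base = buf, .iov_len = len };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                              .msg_control = u.buf, .msg_controllen = sizeof u.buf };
        ssize_t n = recvmsg(fd, &msg, 0);
        if (n < 0) return n;
        for (struct cmsghdr *cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
            if (cm->cmsg_level != IPPROTO_IPV6) continue;
            if (cm->cmsg_type == IPV6_TCLASS)   memcpy(tclass,   CMSG_DATA(cm), sizeof *tclass);
            if (cm->cmsg_type == IPV6_HOPLIMIT) memcpy(hoplimit, CMSG_DATA(cm), sizeof *hoplimit);
        }
        return n;
    }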
> >>> as well as SO_TIMESTAMPNS, and use a higher resolution internal clock.
> >>> Got passing through the dscp values to work also, but:
> >>>
> >>> A) encapsulation of ecn capable marked packets, and availability in
> >>> the outer header, without correct decapsulation, doesn't work well.
> >>>
> >>> The outer packet gets marked, but by default the marking doesn't make
> >>> it back into the inner packet when decoded.
> >>
> >> Is the kernel stripping the ECN bits provided by userspace? In the code
> >> in your git branch you strip the ECN bits out yourself.
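The decapsulation behavior being asked for is essentially RFC 6040's
"normal mode", and done by hand in the daemon it is only a few lines. A
sketch, assuming the inner packet is IPv6 and the outer traffic class was
recovered via ancillary data as above:

    #include <stdint.h>

    #define ECN_MASK 0x03
    #define ECN_CE   0x03

    /* RFC 6040-style combine on decap: outer CE + ECN-capable inner
     * means the inner packet must leave marked CE. The IPv6 traffic
     * class straddles bytes 0 and 1 of the header. */
    static void ecn_decap_ipv6(uint8_t *inner, uint8_t outer_tclass) {
        uint8_t tc  = (uint8_t)((inner[0] << 4) | (inner[1] >> 4));
        uint8_t ecn = tc & ECN_MASK;
        if ((outer_tclass & ECN_MASK) == ECN_CE && ecn != 0) {
            tc |= ECN_CE;
            inner[0] = (uint8_t)((inner[0] & 0xF0) | (tc >> 4));
            inner[1] = (uint8_t)((tc << 4) | (inner[1] & 0x0F));
        }
        /* RFC 6040 also says outer CE + inner not-ECT should be dropped,
         * since the congestion signal cannot be propagated. */
    }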
> >>> So communicating somehow that a path can take ecn (and/or diffserv
> >>> markings) is needed between tinc daemons. I thought of perhaps
> >>> crafting a special icmp message marked with CE but am open to ideas
> >>> that would be backward compatible.
> >>
> >> PMTU probes are used to discover whether UDP works and how big the path
> >> MTU is, maybe it could be used to discover whether ECN works as well?
> >> Set one of the ECN bits on some of the PMTU probes, and if you receive a
> >> probe with that ECN bit set, also set it on the probe reply. If you
> >> successfully receive a reply with ECN bits set, then you know ECN works.
> >> Since the remote side just echoes the contents of the probe, you could
> >> also put a copy of the ECN bits in the probe payload, and then you can
> >> detect if the ECN bits got zeroed. You can also define an OPTION_ECN in
> >> src/connection.h, so nodes can announce their support for ECN, but that
> >> should not be necessary, I think.
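A sketch of what the probe side of that scheme could look like (this is
the proposal, not existing tinc code; ECT(0) is 0x02 in the two low ECN
bits, and echoing the seen bits in the first payload byte is an invented
detail):

    #include <netinet/in.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/socket.h>

    /* Mark outgoing probes ECT(0). A socket-level IPV6_TCLASS sticks
     * for later sends; per-packet marking would use a cmsg instead. */
    static void mark_probes_ect0(int fd) {
        int tclass = 0x02;                /* DSCP 0, ECN = ECT(0) */
        setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tclass, sizeof tclass);
    }

    /* Probe reply: echo the ECN field we saw, both in the header and in
     * the payload, so the prober can tell "never arrived marked" apart
     * from "bleached on the return path". */
    static void answer_probe(int fd, const struct sockaddr_in6 *peer,
                             uint8_t *reply, size_t len, int seen_tclass) {
        int tclass = seen_tclass & 0x03;
        setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tclass, sizeof tclass);
        reply[0] = (uint8_t)(seen_tclass & 0x03);
        sendto(fd, reply, len, 0, (const struct sockaddr *)peer, sizeof *peer);
    }

If a reply comes back with ECT still set in the header, the path takes
ECN; if the payload byte says the bits arrived but the header bits are
gone, something on the return path bleached them.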
> >>> B) I have long theorized that a lot of userspace vpns bottleneck on
> >>> the read and encapsulate step, and being strict FIFOs,
> >>> gradually accumulate delay until finally they run out of read socket
> >>> buffer space and start dropping packets.
> >>
> >> Well, encryption and decryption takes a lot of CPU time, but context
> >> switches are also bad.
> >>
> >> Tinc is treating UDP in a strictly FIFO way, but actually it does use a
> >> RED algorithm when tunneling over TCP. That said, it only looks at its
> >> own buffers to determine when to drop packets, and those only come into
> >> play once the kernel's TCP buffers are filled.
> >>
> >>> so I had a couple thoughts towards using multiple rx queues in the
> >>> vtun interface, and/or trying to read more than one packet at a time
> >>> (via recvmmsg) and do some level of fair queueing and queue management
> >>> (codel) inside tinc itself. I think that's
> >>> pretty doable without modifying the protocol any, but I'm not sure of
> >>> its value until I saturate some cpu more.
> >>
> >> I'd welcome any work in this area :)
> >>
> >>> (and if you thought recvmsg was complex, look at recvmmsg)
> >>
> >> It seems someone is already working on that, see
> >> https://github.com/jasdeep-hundal/tinc.
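For the record, the batched-read half of that is small; the fq/codel
machinery would sit on top of it. A sketch of draining up to 64 datagrams
per syscall with recvmmsg (Linux 2.6.33+, glibc 2.12+):

    #define _GNU_SOURCE             /* for recvmmsg and struct mmsghdr */
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define BATCH 64
    #define BUFSZ 2048

    /* Returns how many packets arrived; msgs[i].msg_len is each one's
     * size. Each packet would then be hashed into a per-flow queue for
     * the fq/codel step. */
    static int read_batch(int fd, uint8_t bufs[BATCH][BUFSZ],
                          struct iovec iovs[BATCH], struct mmsghdr msgs[BATCH]) {
        memset(msgs, 0, BATCH * sizeof *msgs);
        for (int i = 0; i < BATCH; i++) {
            iovs[i].iov_base = bufs[i];
            iovs[i].iov_len  = BUFSZ;
            msgs[i].msg_hdr.msg_iov    = &iovs[i];
            msgs[i].msg_hdr.msg_iovlen = 1;
        }
        int n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
        return n < 0 ? 0 : n;
    }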
> >>> D)
> >>>
> >>> the bottleneck link above is actually not tinc but the gateway, and as
> >>> the gateway reverts to codel behavior on a single encapsulated flow
> >>> encapsulating all the other flows, we end up with about 40ms of
> >>> induced delay on this test. While I have a better codel (gets below
> >>> 20ms latency, not deployed), *fq*_codel by identifying individual
> >>> flows gets the induced delay on those flows down below 5ms.
> >>
> >> But that should improve with ECN if fq_codel is configured to use that,
> >> right?
> >>
> >>> At one level, tinc being so nicely meshy means that the "fq" part of
> >>> fq_codel on the gateway will have more chance to work against the
> >>> multiple vpn flows it generates for all the potential vpn endpoints...
> >>>
> >>> but at another... lookie here! ipv6! 2^64 addresses or more to use!
> >>> and port space to burn! What if I could make tinc open up 1024 ports
> >>> per connection, and have it fq all its flows over those? What could
> >>> go wrong?
> >>
> >> Right, hash the header of the original packets, and then select a port
> >> or address based on the hash? What about putting that hash in the flow
> >> label of outer packets? Any routers that would actually treat those as
> >> separate flows?
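The port flavor of that hash is a one-liner on top of the FNV hash from
the source-address sketch above (the base port and the 1024-wide range are
the example's numbers, purely illustrative):

    #include <stddef.h>
    #include <stdint.h>

    /* Map the inner flow tuple onto one of 1024 UDP ports. */
    static uint16_t pick_port(const uint8_t *flowkey, size_t keylen,
                              uint16_t base) {
        uint32_t h = 2166136261u;
        while (keylen--) h = (h ^ *flowkey++) * 16777619u;
        return (uint16_t)(base + (h % 1024));
    }

Putting the hash in the flow label instead is possible on Linux via the
flow-label manager (IPV6_FLOWLABEL_MGR plus IPV6_FLOWINFO_SEND), but
whether routers treat distinct labels as distinct flows is hit-or-miss;
fq_codel as commonly deployed hashes the 5-tuple, which is why varying
ports or source addresses works today.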
> >
> > -- Sent from my Android device with K-@ Mail. Please excuse my brevity.
> >
> > _______________________________________________
> > Cerowrt-devel mailing list
> > Cerowrt-devel@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
> --
> Dave Täht
>
> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks