From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dpreed@reed.com>
Received: from smtp113.iad3a.emailsrvr.com (smtp113.iad3a.emailsrvr.com
	[173.203.187.113])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by huchra.bufferbloat.net (Postfix) with ESMTPS id 6A55C21F2E2
	for <cerowrt-devel@lists.bufferbloat.net>;
	Mon, 26 Jan 2015 16:12:16 -0800 (PST)
Received: from smtp23.relay.iad3a.emailsrvr.com (localhost.localdomain
	[127.0.0.1])
	by smtp23.relay.iad3a.emailsrvr.com (SMTP Server) with ESMTP id
	AB07C28023C; Mon, 26 Jan 2015 19:12:15 -0500 (EST)
Received: from app24.wa-webapps.iad3a (relay-webapps.rsapps.net
	[172.27.255.140])
	by smtp23.relay.iad3a.emailsrvr.com (SMTP Server) with ESMTP id
	8926B280213; Mon, 26 Jan 2015 19:12:15 -0500 (EST)
X-Sender-Id: dpreed@reed.com
Received: from app24.wa-webapps.iad3a (relay-webapps.rsapps.net
	[172.27.255.140]) by 0.0.0.0:25 (trex/5.4.2);
	Tue, 27 Jan 2015 00:12:15 GMT
Received: from reed.com (localhost.localdomain [127.0.0.1])
	by app24.wa-webapps.iad3a (Postfix) with ESMTP id 744948003E;
	Mon, 26 Jan 2015 19:12:15 -0500 (EST)
Received: by apps.rackspace.com
	(Authenticated sender: dpreed@reed.com, from: dpreed@reed.com) 
	with HTTP; Mon, 26 Jan 2015 19:12:15 -0500 (EST)
Date: Mon, 26 Jan 2015 19:12:15 -0500 (EST)
From: dpreed@reed.com
To: "Dave Taht" <dave.taht@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_20150126191215000000_81962"
Importance: Normal
X-Priority: 3 (Normal)
X-Type: html
In-Reply-To: <CAA93jw4ero-ApHh8nX-S1vPbfZ1RzUNEkG9nh8wy3=Gw6DCsjA@mail.gmail.com>
References: <54B5D28A.3010906@gmail.com> 
	<7B1EA8F0-FCB6-4A37-950F-2558FC751DE8@gmail.com> 
	<54C038D0.1000305@gmail.com> 
	<alpine.DEB.2.02.1501211553090.21864@nftneq.ynat.uz> 
	<54C0BD22.3000608@gmail.com> 
	<alpine.DEB.2.02.1501220110170.19609@nftneq.ynat.uz> 
	<54C13F47.1010203@gmail.com> <1422111577.328132080@apps.rackspace.com> 
	<alpine.DEB.2.02.1501242029320.19609@nftneq.ynat.uz> 
	<1422217048.025611275@apps.rackspace.com> 
	<alpine.DEB.2.02.1501251538031.19609@nftneq.ynat.uz> 
	<1422237076.005718796@apps.rackspace.com> 
	<CAA93jw4DYgbv0oFwOfJmDfnOfAz6VYAdv9BcgS51sNg-rEopCA@mail.gmail.com> 
	<alpine.DEB.2.02.1501251821500.19609@nftneq.ynat.uz> 
	<CAA93jw4wr+UeSdYK+y8U2S6K=KCiKN0Z7LUn__aAjBTw2qowrg@mail.gmail.com> 
	<1422242279.46066942@apps.rackspace.com> 
	<CAA93jw4ero-ApHh8nX-S1vPbfZ1RzUNEkG9nh8wy3=Gw6DCsjA@mail.gmail.com>
X-Auth-ID: dpreed@reed.com
Message-ID: <1422317535.474322223@apps.rackspace.com>
X-Mailer: webmail/11.3.10-RC
Cc: "cerowrt-devel@lists.bufferbloat.net" <cerowrt-devel@lists.bufferbloat.net>
Subject: Re: [Cerowrt-devel] Recording RF management info _and_
 associated traffic?
X-BeenThere: cerowrt-devel@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Development issues regarding the cerowrt test router project
	<cerowrt-devel.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/cerowrt-devel>,
	<mailto:cerowrt-devel-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/cerowrt-devel>
List-Post: <mailto:cerowrt-devel@lists.bufferbloat.net>
List-Help: <mailto:cerowrt-devel-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/cerowrt-devel>,
	<mailto:cerowrt-devel-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Tue, 27 Jan 2015 00:12:46 -0000

------=_20150126191215000000_81962
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

=0AWell, we all may want to agree to disagree.  I don't buy the argument th=
at hash tables are slow compared to the TCAMs - and even if cache misses ha=
ppened, a hash table is still O(1) - you look at exactly one memory address=
 on the average in a hash table - that's the point of it.  The constant fac=
tor is the speed of memory - not terribly slow by any means.=0A =0ATo get i=
nto this deeper would require actual measurements, of which I am a great fa=
n.  But your handwaves are pretty unquantitative, Dave, so at best they are=
 similar to mine.  I'm very measurement focused, being part hardware archit=
ecture guy.=0A =0ADavid - my comment about HP doing layer 3 switching in TC=
AMs just was there to point out that there's nothing magic about layer 2.  =
I was not suggesting that they don't use proprietary binary blobs, because =
they do.  But so do the TCAM programs in layer 2 devices.=0A =0ADave - you =
are conflating the implementation technique with the routing algorithm =
when you focus on "prefix matching" as being hard to do.  It's not hard =
to invent=
 a performant algorithm to do that combined with a hash table.  A simple wa=
y to do that is to treat the address one is looking up as several addresses=
 (of shorter prefixes of the address).  Then look each one up separately by=
 its hash.  It's still O(1) if you do that, just a larger constant =
factor. I=
 assume you don't actually think it is optimal to do linear searches on the=
 routing table like hosts sometimes do.  Linear search is not necessary.=0A=
 =0AThere is literally nothing magical about looking up 48-bit random Ether=
net addresses in a LAN.=0A =0AAs far as NAT'ing is concerned - that is done=
 by the gateways.  It's possible in principle to create a distributed NAT f=
ace to an Enterprise - if you do so, then roaming within the enterprise jus=
t amounts to telling the NAT face about the new internal IP address that co=
rresponds to the old one - an update of one address translation with anothe=
r.=0A =0AThis is how phones roam, by the way. They update their location vi=
a an HLR as they roam.=0A =0A=0A=0AOn Sunday, January 25, 2015 10:45pm, "Da=
ve Taht" <dave.taht@gmail.com> said:=0A=0A=0A=0A> On Sun, Jan 25, 2015 at 7=
:17 PM, <dpreed@reed.com> wrote:=0A> > Looking up an address in a routing t=
able is O(1) if the routing table is a=0A> > hash table. That's much more e=
fficient than a TCAM. My simple example just=0A> > requires a delete/insert=
 at each node's route lookup table.=0A> =0A> Regrettably it is not O(1) onc=
e you take into account the cpu cache hierarchy,=0A> or the potential colli=
sions you will have once you shrink the hash to=0A> something reasonable.=
=0A> =0A> Also I think you are ignoring the problem of covering routes. Say=
 I have to=0A> get something to a.b.c.z/32. I do a lookup of that and find =
nothing. I then=0A> look to find a.b.c.z/31 and find nothing, then /30, the=
n /29, /28, until I find=0A> a hit for the next hop. Now you can of course =
do a binary search for likely=0A> subprefixes, but in any case, the search =
is not O(1).=0A> =0A> In terms of cache efficient data structures, a straig=
ht hash is not the way=0A> to go, of late I have been trying to wrap my hea=
d around the hat-trie as=0A> possibly being useful in these circumstances.=
=0A> =0A> Now, if you think about limiting the domain of the problem to som=
ething=0A> greater than the typical mac table, but less than the whole inte=
rnet,=0A> it starts looking more reasonable to have a 1:1 ratio of destinat=
ion=0A> IPs to hash table entries for lookups, but updates have to probe/ch=
ange=0A> large segments of the table in order to deal with covering prefixe=
s.=0A> =0A> > My point was about collections of WLAN's bridged together. Lo=
ok at what=0A> > happens (at the packet/radio layer) when a new node joins =
a bridged set of=0A> > WLANs using STP. It is not exactly simple to rebuild=
 the Ethernet layer's=0A> > bridge routing tables in a complex network. And=
 the limit of 4096 entries=0A> > in many inexpensive switches is not a triv=
ial limit.=0A> =0A> Agreed. But see http://en.wikipedia.org/wiki/Virtual_Ex=
tensible_LAN=0A> =0A> >=0A> >=0A> >=0A> > Routers used to be memory-starved=
 (a small number of KB of RAM was the=0A> > norm). Perhaps the thinking the=
n (back before 2000) has not been revised,=0A> > even though the hardware i=
s a lot more capacious.=0A> =0A> The profit margins have not been revised.=
=0A> =0A> I would not mind, incidentally expanding the scope of the fqswitc=
h project to=0A> try to build something that would scale up at l3 farther t=
han we've ever seen=0A> before, however funding for needed gear like:=0A> =
=0A> http://www.eetimes.com/document.asp?doc_id=3D1321334=0A> =0A> and time=
, and fpga expertise, is lacking. I am currently distracted by=0A> evaluati=
ng=0A> a very cool new cpu architecture ( see=0A> http://www.millcomputing.=
com/wiki/Memory )=0A> and even as nifty as that is I foresee a need for a l=
ot of dedicated packet=0A> processing logic and memories to get into the 40=
GBit+ range.=0A> >=0A> >=0A> > Remember, the Ethernet layer in WLANs is imp=
lemented by microcontrollers,=0A> > typically not very capable ones, plus T=
CAMs which are pretty limited in=0A> > their flexibility.=0A> =0A> I do ten=
d to think that the next era of SDN enabled hardware will eventually=0A> le=
ad to more innovation in both the control and data plane - however it=0A> s=
eems we are still in a "me-too" phase=0A> of development of openvswitch (bt=
w: there is a new software switch for=0A> linux called rocker we should loo=
k at, and make sure runs fq_codel), and=0A> a long way from flexibly progra=
mmable switch hardware in general.=0A> =0A> http://openvswitch.org/pipermai=
l/dev/2014-September/045084.html=0A> >=0A> >=0A> >=0A> > While it is tempti=
ng to use the "pre-packaged, proprietary" Ethernet switch=0A> > functionali=
ty, routing gets you out of the binary blobs, and lets you be a=0A> > lot =
smarter and more scalable. Given that it does NOT cost more to do=0A> > rou=
ting at the IP layer, building complex Ethernet bridging is not obviously=
=0A> > a win.=0A> =0A> SDN is certainly a way out of this mess. Eventually.=
 But I fear we are making=0A> all the same mistakes over again, and making =
slower hardware, where in the=0A> end, it needs to be faster, to win.=0A> =
=0A> >=0A> >=0A> > BTW, TCAMs are used in IP layer switching, too, and also=
 are used in packet=0A> > filtering. Maybe not in cheap consumer switches, =
but lots of Gigabit=0A> > switches implement IP layer switching and filteri=
ng. At HP, their switches=0A> > routinely did all their IP layer switching =
entirely in TCAMs.=0A> =0A> Yep. I really wish big, fat TCAMS were standard=
 equipment.=0A> =0A> >=0A> >=0A> > On Sunday, January 25, 2015 9:58pm, "Dav=
e Taht" <dave.taht@gmail.com>=0A> said:=0A> >=0A> >> On Sun, Jan 25, 2015 a=
t 6:43 PM, David Lang <david@lang.hm> wrote:=0A> >> > On Sun, 25 Jan 2015, =
Dave Taht wrote:=0A> >> >=0A> >> >> To your roaming point, yes this is cert=
ainly one place where=0A> migrating=0A> >> >> bridged vms across machines b=
reaks down, and yet more and more=0A> vm=0A> >> >> layers are doing it. I w=
ould certainly prefer routing in this=0A> case.=0A> >> >=0A> >> >=0A> >> > =
What's the difference between "roaming" and moving a VM from one=0A> place=
=0A> >> > in=0A> >> > the network to another?=0A> >>=0A> >> I think most pe=
ople think of "roaming" as moving fairly rapidly from one=0A> >> piece of e=
dge connectivity to another, and moving a vm is a great deal=0A> >> more=0A=
> >> permanent operation.=0A> >>=0A> >> > As far as layer 2 vs layer 3 goes=
. If you try to operate at layer 3,=0A> you=0A> >> > are=0A> >> > going to =
have quite a bit of smarts in the endpoint. Even if it's=0A> only=0A> >> > =
connected via a single link. If you think about it, even if your=0A> =
network=
=0A> >> > routing tables list every machine in our environment individually=
,=0A> you=0A> >> > still=0A> >> > have a problem of what gateway the endpoi=
nt uses. It would have to=0A> >> > change=0A> >> > every time it moved. Sin=
ce DHCP doesn't update frequently enough to=0A> be=0A> >> > transparent, yo=
u would need to have each endpoint running a routing=0A> >> > protocol.=0A>=
 >>=0A> >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend =
on the=0A> >> routing=0A> >> protocol to supply that. In terms of each vm r=
unning a routing protocol,=0A> >> well, no, I would rely on the underlying =
bare metal OS to be doing=0A> >> that, supplying=0A> >> the FIB tables to t=
he overlying vms, if they need it, but otherwise the=0A> >> vms=0A> >> just=
 see a "default" route and don't bother with it. They do need to=0A> >> inf=
orm the=0A> >> bare metal OS (better term for this please? hypervisor?) of =
what IPs=0A> they=0A> >> own.=0A> >>=0A> >> static default gateways are evi=
l. and easily disabled. in linux you=0A> >> merely comment=0A> >> out the "=
routers" in /etc/dhcp/dhclient.conf, in openwrt, set=0A> >> "defaultroute 0=
" for the=0A> >> interface fetching dhcp.=0A> >>=0A> >> When a box migrates=
, it tells the hypervisor its addresses, and then=0A> that=0A> >> box=0A> =
>> propagates out the route change to elsewhere.=0A> >>=0A> >> >=0A> >> > T=
his can work for individual hobbyists, but not when you need to=0A> support=
=0A> >> > random devices (how would you configure an iPhone to support this=
?)=0A> >>=0A> >> Carefully. :)=0A> >>=0A> >> I do note that this stuff does=
 (or at least did) work on some of the=0A> open=0A> >> source variants of a=
ndroid. I would rather like it if android added ipv6=0A> >> tethering soon,=
 and made it possible to mesh together multiple phones.=0A> >>=0A> >> >=0A>=
 >> >=0A> >> > Letting the layer 2 equipment deal with the traffic within t=
he=0A> building=0A> >> > and=0A> >> > invoking layer 3 to go outside the bu=
ilding (or to a different=0A> security=0A> >> > domain) makes a lot of sens=
e. Even if that means that layer 2 within=0A> a=0A> >> > building looks ver=
y similar to what layer 3 used to look like around=0A> a=0A> >> > city.=0A>=
 >>=0A> >> Be careful what you wish for.=0A> >>=0A> >> >=0A> >> >=0A> >> > =
back to the topic of wifi, I'm not aware of any APs that participate=0A> in=
=0A> >> > the=0A> >> > switch protocols at this level. I also don't know of=
 any reasonably=0A> >> > priced=0A> >> > switches that can do anything smar=
ter than plain spanning tree when=0A> >> > connected through multiple paths=
 (I'd love to learn otherwise)=0A> >> >=0A> >> > David Lang=0A> >>=0A> >>=
=0A> >>=0A> >> --=0A> >> Dave T=C3=A4ht=0A> >>=0A> >> thttp://www.bufferblo=
at.net/projects/bloat/wiki/Upcoming_Talks=0A> >>=0A> =0A> =0A> =0A> --=0A> =
Dave T=C3=A4ht=0A> =0A> thttp://www.bufferbloat.net/projects/bloat/wiki/Upc=
oming_Talks=0A> 
------=_20150126191215000000_81962--
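
The hash-per-prefix-length lookup described in the message above can be
sketched as follows. This is a minimal illustration in Python, not code from
the thread: the class name PrefixHashTable, its insert/lookup methods, and the
IPv4-only 32-bit masking are all assumptions made for the example. It keeps
one hash table (a dict) per prefix length and probes from /32 down to /0, so a
lookup costs at most 33 hash probes no matter how many routes are stored.

```python
import ipaddress


class PrefixHashTable:
    """Hypothetical longest-prefix-match table: one dict per prefix length."""

    def __init__(self):
        # tables[plen] maps a masked 32-bit network address to a next hop.
        self.tables = {}

    def insert(self, cidr, next_hop):
        # ip_network() rejects CIDR strings that have host bits set.
        net = ipaddress.ip_network(cidr)
        table = self.tables.setdefault(net.prefixlen, {})
        table[int(net.network_address)] = next_hop

    def lookup(self, addr):
        # Assumption: IPv4 only, so addresses fit in 32 bits.
        ip = int(ipaddress.IPv4Address(addr))
        # Treat the address as 33 candidate prefixes, longest first,
        # and do one hash probe per prefix length that has any routes.
        for plen in range(32, -1, -1):
            table = self.tables.get(plen)
            if table is None:
                continue
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hop = table.get(ip & mask)
            if hop is not None:
                return hop
        return None  # no covering route, not even a default


routes = PrefixHashTable()
routes.insert("10.0.0.0/8", "hop-A")
routes.insert("10.1.0.0/16", "hop-B")
routes.insert("0.0.0.0/0", "hop-default")
print(routes.lookup("10.1.2.3"))
```

The fixed upper bound on probes is the "larger constant factor" from the
message. The sketch says nothing about cache misses or hash collisions, which
are the points raised in the quoted reply and which would need measurement.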