[Cerowrt-devel] Recording RF management info _and_ associated traffic?

Development issues regarding the cerowrt test router project

* [Cerowrt-devel] Recording RF management info _and_ associated traffic?
@ 2015-01-14  2:20 Richard Smith
  2015-01-20 16:59 ` Rich Brown
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Smith @ 2015-01-14  2:20 UTC (permalink / raw)
  To: cerowrt-devel

I'm trying to track down some wireless problems we are having at
work.  At random times our 5GHz WLANs just go to hell.

I've been sniffing in monitor mode, which has been quite enlightening:
there's certainly a lot more going on in the 5GHz channels than I was
expecting.

Monitor mode shows me loads of stuff I didn't know was there but what it 
doesn't show me is how all that other traffic interacts with the traffic 
on my ESS.

From what I've been reading it seems like with most cards you can't
grab the 802.11 management info and actual traffic on the network at the
same time.

Is this possible with a WNDR3[78]00 CeroWRT (or openWRT) setup?

-- 
Richard A. Smith

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-14  2:20 [Cerowrt-devel] Recording RF management info _and_ associated traffic? Richard Smith
@ 2015-01-20 16:59 ` Rich Brown
  2015-01-21 23:40   ` Richard Smith
  0 siblings, 1 reply; 43+ messages in thread
From: Rich Brown @ 2015-01-20 16:59 UTC (permalink / raw)
  To: Richard Smith; +Cc: cerowrt-devel


On Jan 13, 2015, at 9:20 PM, Richard Smith <smithbone@gmail.com> wrote:

> I'm trying to track down some poor wireless issues we are having at work.  At random times the 5Ghz WLANs we have just go to hell.
> 
> I've been sniffing in monitor mode which has been quite enlightening there's certainly a lot more going on in the 5Ghz channels than I was expecting.
> 
> Monitor mode shows me loads of stuff I didn't know was there but what it doesn't show me is how all that other traffic interacts with the traffic on my ESS.
> 
> From what I've been reading it seems like with most cards you can't grab the 802.11 management info and actual traffic on the network at the same time.
> 
> Is this possible with a WNDR3[78]00 CeroWRT (or openWRT) setup?


One of the first things I would do is a Wifi site survey, to look for conflicts between access points/channels, etc. Two recommendations for tools:

MacOSX: WiFi Explorer from Adrian Granados - US$4.99 from the Mac App Store. http://www.adriangranados.com/apps/wifi-explorer
Android: WiFi Analyzer from farproc - Donationware from the Android store. https://sites.google.com/site/farproc/wifi-analyzer

Rich


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-20 16:59 ` Rich Brown
@ 2015-01-21 23:40   ` Richard Smith
  2015-01-21 23:58     ` David Lang
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Smith @ 2015-01-21 23:40 UTC (permalink / raw)
  To: Rich Brown, Richard Smith; +Cc: cerowrt-devel

On 01/20/2015 11:59 AM, Rich Brown wrote:

> One of the first things I would do is a Wifi site survey, to look for
> conflicts between access points/channels, etc. Two recommendations
> for tools:
>
> MacOSX: WiFi Explorer from Adrian Granados - US$4.99 from the Mac App
> Store. http://www.adriangranados.com/apps/wifi-explorer Android: WiFi
> Analyzer from farproc - Donationware from the Android store.
> https://sites.google.com/site/farproc/wifi-analyzer

Thanks for the suggestion. I've done that.

I'll offer up a 3rd choice which I have been using.  Horst.
Runs on OpenWrt perfectly and free.  I've not tried it on CeroWrt yet 
but I don't see why it would not work.

http://br1.einfach.org/tech/horst/

With horst I've verified that the 3 APs we are running are all on 5GHz
channels that don't have another AP on them.  We are up on the 10th
floor of a tower type building and 3 of our walls have large windows 
with clear views of surrounding buildings.  We are higher than most of 
the stuff around us.

So when I scan I do see a lot of intermittent probes or wifi traffic
from other things but nothing chronic.  I haven't been able to run a scan
when it all goes to hell though.

Horst shows me the DATA or QDATA packets along with the radiotap info,
and that their contents are encrypted, but I've not figured out how to
capture decrypted traffic at the same time as the radiotap info.  This
would let me see exactly what sort of dynamic was happening on our network.
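
As a rough illustration of what a monitor-mode capture exposes, the sketch
below classifies one captured frame as management, control, or data and
reports whether its payload is marked protected (encrypted).  It assumes each
captured packet is an 802.11 frame preceded by a radiotap header.  The name
classify_frame and the sample bytes are made up for illustration, and nothing
here decrypts traffic.

```python
import struct

FRAME_TYPES = {0: "management", 1: "control", 2: "data"}


def classify_frame(packet: bytes):
    """Return (type_name, subtype, protected) for one radiotap-prefixed frame."""
    # Radiotap header starts with: version (u8), pad (u8),
    # total header length (u16 LE), present-flags bitmap (u32 LE).
    if len(packet) < 8:
        raise ValueError("too short for a radiotap header")
    _version, _pad, rt_len, _present = struct.unpack_from("<BBHI", packet, 0)
    if len(packet) < rt_len + 2:
        raise ValueError("truncated 802.11 header")
    # The 16-bit 802.11 frame control field follows the radiotap header.
    (fc,) = struct.unpack_from("<H", packet, rt_len)
    frame_type = (fc >> 2) & 0x3      # bits 2-3: type
    subtype = (fc >> 4) & 0xF         # bits 4-7: subtype
    protected = bool(fc & 0x4000)     # bit 14: Protected Frame flag
    return FRAME_TYPES.get(frame_type, "other"), subtype, protected


# Hypothetical packet: minimal 8-byte radiotap header + QoS data frame control
# with the Protected bit set (illustrative bytes, not a real capture).
sample = bytes([0, 0, 8, 0, 0, 0, 0, 0]) + bytes([0x88, 0x40])
print(classify_frame(sample))  # prints ('data', 8, True)
```

Counting frames per type and per transmitter with something like this shows
who is using airtime, but the encrypted data payloads stay opaque.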

As an aside if any of the gurus here are near the Boston, MA area and
want to do small business Wi-Fi consulting let me know.  We will gladly
pay someone to fix our wireless.

--
Richard A. Smith

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-21 23:40   ` Richard Smith
@ 2015-01-21 23:58     ` David Lang
  2015-01-22  9:04       ` Richard Smith
  0 siblings, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-21 23:58 UTC (permalink / raw)
  To: Richard Smith; +Cc: cerowrt-devel

On Wed, 21 Jan 2015, Richard Smith wrote:

> On 01/20/2015 11:59 AM, Rich Brown wrote:
>
>> One of the first things I would do is a Wifi site survey, to look for
>> conflicts between access points/channels, etc. Two recommendations
>> for tools:
>> 
>> MacOSX: WiFi Explorer from Adrian Granados - US$4.99 from the Mac App
>> Store. http://www.adriangranados.com/apps/wifi-explorer Android: WiFi
>> Analyzer from farproc - Donationware from the Android store.
>> https://sites.google.com/site/farproc/wifi-analyzer
>
> Thanks for the suggestion. I've done that.
>
> I'll offer up a 3rd choice which I have been using.  Horst.
> Runs on OpenWrt perfectly and free.  I've not tried it on CeroWrt yet but I 
> don't see why it would not work.
>
> http://br1.einfach.org/tech/horst/
>
> With horst I've verified that the 3 APs we are running are all on 5GHz
> channels that don't have another AP on them.  We are up on the 10th floor of
> a tower type building and 3 of our walls have large windows with clear views 
> of surrounding buildings.  We are higher than most of the stuff around us.

Ok, this would suggest that you are unlikely to have interference causing your 
problems. I don't have the earlier part of this thread still in my mailbox, what 
is the problem that you are trying to solve again?

When you do a wifi survey, you are not just looking at one spot, or near the APs 
for what you see. You should also be going to all the areas your users are going 
to be trying to access your network and see if you have a strong enough signal 
from at least one AP everywhere. Also note that if you have high-power APs, you 
may hear a signal from them, but they may not be able to hear the signal from 
the mobile device very well. Mobile devices tend to have lousy antennas, and try
to operate at lower power levels to save battery power. So you may need to look
at the stats on the AP showing the signal it sees from the client.
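
The asymmetry described above can be put into numbers with a simple link
budget.  This is only a sketch: it assumes free-space path loss and 0 dBi
antennas, and the transmit powers, distance, and frequency are hypothetical
values, not measurements from this network.

```python
import math


def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB (distance in metres, frequency in Hz)."""
    return 20 * math.log10(distance_m) + 20 * math.log10(freq_hz) - 147.55


def received_dbm(tx_dbm: float, distance_m: float, freq_hz: float) -> float:
    """Received power with 0 dBi antennas and free-space loss only."""
    return tx_dbm - fspl_db(distance_m, freq_hz)


# Hypothetical values chosen for illustration.
AP_TX_DBM = 20.0       # a "high power" AP
CLIENT_TX_DBM = 12.0   # a battery-saving mobile client
DISTANCE_M = 12.0
FREQ_HZ = 5.2e9

downlink = received_dbm(AP_TX_DBM, DISTANCE_M, FREQ_HZ)
uplink = received_dbm(CLIENT_TX_DBM, DISTANCE_M, FREQ_HZ)
print(f"client hears AP at {downlink:.1f} dBm")
print(f"AP hears client at {uplink:.1f} dBm")
print(f"asymmetry: {downlink - uplink:.1f} dB")
```

The path loss is the same in both directions, so the gap between the two
readings is just the difference in transmit power.  That is why the signal
bars on the client can look fine while the AP struggles to hear it.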

Assuming that you have enough signal, the next question is how many people are 
going to be trying to use the network at one time. You may be better off with 
more APs operating at lower power levels so that you have fewer people talking 
to each one.

David Lang

> So when I scan I do see a lot of intermittent probes or wifi traffic
> from other things but nothing chronic.  I haven't been able to run a scan when
> it all goes to hell though.
>
> With horst it shows me the DATA or QDATA packet among the radiotap info and 
> that the contents are encrypted  but I've not figured out how to capture 
> decrypted traffic at the same time as radiotap info.  This would let me see 
> exactly what sort of dynamic was happening from our network.
>
> As an aside if any of the gurus here are near the Boston, MA area and
> want to do small business Wi-Fi consulting let me know.  We will gladly
> pay someone to fix our wireless.
>
> --
> Richard A. Smith

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-21 23:58     ` David Lang
@ 2015-01-22  9:04       ` Richard Smith
  2015-01-22  9:18         ` David Lang
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Smith @ 2015-01-22  9:04 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

On 01/21/2015 06:58 PM, David Lang wrote:
> On Wed, 21 Jan 2015, Richard Smith wrote:
>

Thanks for the response.  First I want to say that I'm sensitive to the 
fact that this is the Cerowrt-devel list and not the small business WiFi 
help list.  If things go too far off-topic or people get tired of the 
discussion let me know and I'll take it off the list.

> Ok, this would suggest that you are unlikely to have interference
> causing your problems. I don't have the earlier part of this thread
> still in my mailbox, what is the problem that you are trying to solve
> again?

I didn't really describe the problem(s) in detail (see above note) but 
I'll provide a detailed description of my woes.

We have a small network of about 30 people or so with ~60 devices
connected, most of which are wireless of some sort (both 2.4GHz and
5GHz).  Here are my issues + my story. :)

1) Periodic reports of poor "Internet". However, it's not the Internet
uplink.  I set up a netperf-wrapper test that goes off every 10 minutes
with a brief speed+latency test to a well connected host.  Tracked
across several weeks, the uplink/downlink was always exactly as expected.
So I'm suspecting it's poor wireless rather than poor Internet.

2) Occasional total loss of WiFi.  This is a bit fuzzy since I have
multiple hardware permutations and currently no consistent failure.
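
One way to check weeks of periodic test results for dips is to compare each
sample against the median.  The sketch below assumes the results have already
been reduced to (label, Mbps) pairs.  The function name, the 0.8 threshold,
and the sample numbers are invented for illustration and are not
netperf-wrapper's output format.

```python
from statistics import median


def flag_slow_samples(samples, threshold=0.8):
    """Return the (label, mbps) samples that fall below threshold * median."""
    if not samples:
        return []
    baseline = median(mbps for _label, mbps in samples)
    return [(label, mbps) for label, mbps in samples
            if mbps < threshold * baseline]


# Invented sample data: one reading every 10 minutes.
readings = [
    ("09:00", 58.0),
    ("09:10", 60.0),
    ("09:20", 59.0),
    ("09:30", 21.0),
    ("09:40", 61.0),
]
print(flag_slow_samples(readings))  # prints [('09:30', 21.0)]
```

An empty result over several weeks, run from a wired host, supports the
conclusion that the uplink itself is not the problem.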

The story:

Originally we had an Engenius 2.4/5Ghz AP and a Netgear AP/router (WiFi 
turned off).  I can't remember the original router model number. I 
didn't set any of the original hardware up.

Several times a week the Engenius AP would stop passing traffic.  A 
power cycle or reboot would fix it.  The Engenius forums had lots of 
people reporting similar problems.  We did firmware upgrades which 
seemed to help but not eliminate the issue.

Sometime later we added VoIP phones.  But bufferbloat in the cable modem 
caused large latencies under load and VoIP was unhappy.

Enter the trusty WNDR3700v2 from my stash with OpenWRT (pre-barrier 
breaker build).  I replaced both the original router and the Engenius AP 
with it.

QoS solved the VoIP issues and for the most part wireless was happy.  Still,
5GHz would occasionally stop working, but much less frequently than with
the Engenius.  Rebooting the box would fix it.  I suspected the single
box running all the AP + DHCP + DNS + routing may not have had the 
resources for our load or perhaps the pre-release of barrier breaker had 
issues.

I replaced the routing/DHCP/DNS/QoS portion with an x86 box running OpenWRT
x86 (using released barrier breaker, but locally built).   Now the
WNDR3700v2 was just an AP.  This also allowed us to actually get our rated
cable modem speed.  QoS on the wndr was capping out at ~60Mbps, a well
known limit among members of this list.

Around the same time I also added a 2nd AP on a different 5GHz channel
(TP-Link AC1750) to spread the connected clients across multiple
channels.  They have different ESSIDs.  Things seem to be happy.  I got
the TP-Link because it's on target to be supported by OpenWRT and has
3 external antennas, which I thought might provide a path for different
antenna testing.

Recently, we picked up the 11th floor as well and moved many people up 
there.  I got a 3rd AP (another TP-Link AC1750) and set that one up on a 
free channel with a different ESSID.

Then about a week before my original post I got notified that Internet 
was down.  Both 10th floor APs had stopped working.  The 11th floor 
(where I am) was still working.   On the 10th floor, I could connect to 
the  TP-link via its IP address on its wired interface but it did not 
seem to be passing wireless traffic. A reboot fixed it.

The WNDR3700 was completely unresponsive both via WiFi and when I tried
its IP connected directly to its switch with a Cat-5.  I also have a
serial port mod on that wndr3700 so I connected up to that instead.

From the serial port everything appeared to be running fine, only no
traffic would pass on the bridge.  Dropping the interfaces with ifconfig and
then bringing them back up had no effect and I didn't see anything
unusual in the system logs.  A power cycle fixed it.  I've never seen my
wndr3700 do something like that.

So then I really began to wonder... that's 3 different hardware vendors
with 3 very different firmwares that all had similar issues, 2 of them
at exactly the same time.

I considered the possibility of a power event but the 2 APs are on 
different circuits and in physically different locations.  The power 
connection for the wndr3700 also has the x86 router, 2 switches, the 
cable modem, and a linux box plugged up and all of those devices were 
still working.

That's when I figured I needed to start looking at what was going on in 
RF land.  At that time I didn't have anything like horst to be able to 
verify that wireless really was broken and not some other mysterious 
network gremlin. So I started tooling up.  When it happens again I can 
investigate deeper.   I have a 2nd wndr3700v2 at my disposal, set up in
monitor mode on that channel, that I can run horst on when the next total
loss happens.

It's not happened again.  While I'm waiting I've been trying to look
into issue 1 by trying to understand what is really happening on the RF
channel it's on.  Thus my query about wanting to see associated network
traffic decoded along with the radiotap info.

> When you do a wifi survey, you are not just looking at one spot, or near
> the APs for what you see. You should also be going to all the areas your
> users are going to be trying to access your network and see if you have
> a strong enough signal from at least one AP everywhere.

I have taken readings at multiple points in the office but it was not a
very rigorous survey. I should repeat with more care.  The wireless
signal indicators on most clients I've messed with show good strength.

Our floor(s) are fairly small and almost completely open. There are no 
cubicles and very few internal walls. There are some offices and 
conference rooms but each of them have large walls of glass that look 
into the center of the room.   The only big obstruction is a large 
concrete pillar in the center of the room.  The 10th floor TPlink AP is 
located in a ceiling cable tray very close to the center of the room. 
All the stations are in about a 40 foot radius and all but 1 or 2 have 
line of sight to the AP.  The wndr3700 is in a closet on the side of the 
room with other equipment so it might be 80 feet away from the furthest 
station or so.

> Also note that
> if you have high-power APs,

What Tx level qualifies as a high-power AP?  The wndr says 50mW.  The
tplink just gives me low, medium, and high as choices.  It's still at the
default of high.
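
Transmit power is usually compared in dBm rather than mW, so a small
conversion helps when lining up figures from different devices.  This is only
the standard unit conversion.  The helper names are illustrative, and the
tplink's low/medium/high labels cannot be converted this way without vendor
documentation.

```python
import math


def mw_to_dbm(milliwatts: float) -> float:
    """Convert power in milliwatts to dBm."""
    return 10 * math.log10(milliwatts)


def dbm_to_mw(dbm: float) -> float:
    """Convert power in dBm to milliwatts."""
    return 10 ** (dbm / 10)


# 50 mW is the figure reported by the wndr.
print(f"{mw_to_dbm(50):.1f} dBm")  # prints 17.0 dBm
```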

> you may hear a signal from them, but they
> may not be able to hear the signal from the mobile device very well.
> Mobile devices tend to have lousy antennas, and try to operate at lower
> power levels to save battery power. So you may need to look at the stats
> on the AP showing the signal it sees from the client.

I can see those for things connected to the wndr unit but sadly the 
stock tplink firmware does not show me rx strength.

Can I perhaps approximate signal strength by looking at the bitrate for 
packets that station sends?  The theory being that higher quality RF 
links should use the higher bitrate encodings when sending.

If need be I can move the wndr to the same location as the tplink and 
then have stations connect to the wndr so I can watch the rx signal 
strength.

> Assuming that you have enough signal, the next question is how many
> people are going to be trying to use the network at one time. You may be
> better off with more APs operating at lower power levels so that you
> have fewer people talking to each one.

The tplink is better located so in general people tend to use that one
over the wndr. Last check it has around 20 stations connected to it
during the day. The rest are connected to the 2 other APs.

Thanks again for any insights you have.

Lastly, I've been doing some reading on getting enterprise class APs 
from Cisco, HP, etc.  A large number of them seem to require a lot of 
extra infrastructure running wireless controllers and special software 
you have to run to set them up.

Any recommendations for something that's a step above consumer grade 
devices but that does not require additional controllers or licensed 
software would be appreciated.

-- 
Richard A. Smith

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-22  9:04       ` Richard Smith
@ 2015-01-22  9:18         ` David Lang
  2015-01-22 18:19           ` Richard Smith
  0 siblings, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-22  9:18 UTC (permalink / raw)
  To: Richard Smith; +Cc: cerowrt-devel

On Thu, 22 Jan 2015, Richard Smith wrote:

> On 01/21/2015 06:58 PM, David Lang wrote:
>> On Wed, 21 Jan 2015, Richard Smith wrote:
>> 
>
> Thanks for the response.  First I want to say that I'm sensitive to the fact 
> that this is the Cerowrt-devel list and not the small business WiFi help 
> list.  If things go too far off-topic or people get tired of the discussion 
> let me know and I'll take it off the list.
>
>> Ok, this would suggest that you are unlikely to have interference
>> causing your problems. I don't have the earlier part of this thread
>> still in my mailbox, what is the problem that you are trying to solve
>> again?
>
> I didn't really describe the problem(s) in detail (see above note) but I'll 
> provide a detailed description of my woes.
>
> We have a small network of about 30 people or so with ~60 devices connected,
> most of which are wireless of some sort (both 2.4GHz and 5GHz).  Here are my
> issues + my story. :)
>
> 1) Periodic reports of poor "Internet". However, it's not the Internet uplink.
> I set up a netperf-wrapper test that goes off every 10 minutes with a brief
> speed+latency test to a well connected host.  Tracked across several weeks,
> the uplink/downlink was always exactly as expected.  So I'm suspecting it's
> poor wireless rather than poor Internet.
>
> 2) Occasional total loss of WiFi.  This is a bit fuzzy since I have multiple
> hardware permutations and currently no consistent failure.
>
> The story:
>
> Originally we had an Engenius 2.4/5Ghz AP and a Netgear AP/router (WiFi 
> turned off).  I can't remember the original router model number. I didn't set 
> any of the original hardware up.
>
> Several times a week the Engenius AP would stop passing traffic.  A power 
> cycle or reboot would fix it.  The Engenius forums had lots of people 
> reporting similar problems.  We did firmware upgrades which seemed to help 
> but not eliminate the issue.
>
> Sometime later we added VoIP phones.  But bufferbloat in the cable modem 
> caused large latencies under load and VoIP was unhappy.
>
> Enter the trusty WNDR3700v2 from my stash with OpenWRT (pre-barrier breaker 
> build).  I replaced both the original router and the Engenius AP with it.
>
> QoS solved VoIP issues and for the most part wireless was happy.  Still 
> occasionally though 5Ghz would stop working but much less frequent than the 
> Engenius.  Rebooting the box would fix it.  I suspected the single box 
> running all the AP + DHCP + DNS + routing may not have had the resources for 
> our load or perhaps the pre-release of barrier breaker had issues.
>
> Replaced the routing/DHCP/DNS/QoS portion with a x86 box running OpenWRT x86 
> (using released barrier breaker, but locally built).   Now the WNDR3700v2 was 
> just an AP.  This also allowed us to actually get our rated cable modem speed.
> QoS on the wndr was capping out at ~60Mbps, a well known limit among members 
> of this list.
>
> Around the same time I also added a 2nd AP on a different 5Ghz channel 
> (TP-Link AC1750) to spread the connected clients across multiple channels. 
> They have different ESSIDs.  Things seem to be happy.  I got the TP-Link
> because it's on target to be supported by OpenWRT and has 3 external antennas,
> which I thought might provide a path for different antenna testing.
>
> Recently, we picked up the 11th floor as well and moved many people up there. 
> I got a 3rd AP (another TP-Link AC1750) and set that one up on a free channel 
> with a different ESSID.

I like to put all the APs on the same ESSID so that people can roam between 
them. This requires that the APs act as bridges to a dedicated common network, 
not as routers.

> Then about a week before my original post I got notified that Internet was 
> down.  Both 10th floor APs had stopped working.  The 11th floor (where I am) 
> was still working.   On the 10th floor, I could connect to the  TP-link via 
> its IP address on its wired interface but it did not seem to be passing 
> wireless traffic. A reboot fixed it.

There has been an ongoing bug with Apple devices on 5GHz that causes the wifi
chipset to lock up. We think we've fixed it in the current Cerowrt, but I don't
know what kernel versions have this problem. This is likely to affect multiple 
vendors who use the same chipset (check the openwrt hardware list for details of 
the chipsets in each model)

> The WNDR3700 was completely unresponsive both via WiFi and when I tried its 
> IP connected directly to its switch with a Cat-5.  I also have a serial port
> mod on that wndr3700 so I connected up to that instead.

hmm, it's not common to have it be unresponsive on the wired network.

> From the serial port everything appeared to be running fine, only no traffic
> would pass on the bridge.  Dropping the interfaces with ifconfig and then bringing
> them back up had no effect and I didn't see anything unusual in the system 
> logs.  A power cycle fixed it.  I've never seen my wndr3700 do something like 
> that.
>
> So then I really began to wonder... that's 3 different hardware vendors with
> 3 very different firmwares that all had similar issues, 2 of them at
> exactly the same time.
>
> I considered the possibility of a power event but the 2 APs are on different 
> circuits and in physically different locations.  The power connection for the 
> wndr3700 also has the x86 router, 2 switches, the cable modem, and a linux 
> box plugged up and all of those devices were still working.
>
> That's when I figured I needed to start looking at what was going on in RF 
> land.  At that time I didn't have anything like horst to be able to verify 
> that wireless really was broken and not some other mysterious network 
> gremlin. So I started tooling up.  When it happens again I can investigate 
> deeper.   I have a 2nd wndr3700v2 at my disposal, set up in monitor mode on
> that channel, that I can run horst on when the next total loss happens.
>
> It's not happened again.  While I'm waiting I've been trying to look into
> issue 1 by trying to understand what is really happening on the RF channel it's
> on.  Thus my query about wanting to see associated network traffic decoded
> along with the radiotap info.
>
>> When you do a wifi survey, you are not just looking at one spot, or near
>> the APs for what you see. You should also be going to all the areas your
>> users are going to be trying to access your network and see if you have
>> a strong enough signal from at least one AP everywhere.
>
> I have taken readings at multiple points in the office but it was not a very
> rigorous survey. I should repeat with more care.  The wireless signal
> indicators on most clients I've messed with show good strength.
>
> Our floor(s) are fairly small and almost completely open. There are no 
> cubicles and very few internal walls. There are some offices and conference 
> rooms but each of them have large walls of glass that look into the center of 
> the room.   The only big obstruction is a large concrete pillar in the center 
> of the room.  The 10th floor TPlink AP is located in a ceiling cable tray 
> very close to the center of the room. All the stations are in about a 40 foot 
> radius and all but 1 or 2 have line of sight to the AP.  The wndr3700 is in a 
> closet on the side of the room with other equipment so it might be 80 feet 
> away from the furthest station or so.

this doesn't sound unreasonable unless your users are trying to use a LOT of 
bandwidth (although the fact that you refer to the 50Mb bottleneck indicates 
that you may be)

>> Also note that
>> if you have high-power APs,
>
> What Tx level qualifies as a high-power AP?  The wndr says 50mW.  The tplink 
> just gives me low, medium, and high as choices.  It's still at the default of
> high.
>
>> you may hear a signal from them, but they
>> may not be able to hear the signal from the mobile device very well.
>> Mobile devices tend to have lousy antennas, and try to operate at lower
>> power levels to save battery power. So you may need to look at the stats
>> on the AP showing the signal it sees from the client.
>
> I can see those for things connected to the wndr unit but sadly the stock 
> tplink firmware does not show me rx strength.
>
> Can I perhaps approximate signal strength by looking at the bitrate for 
> packets that station sends?  The theory being that higher quality RF links 
> should use the higher bitrate encodings when sending.

not reliably, too many other things factor in to that.

> If need be I can move the wndr to the same location as the tplink and then 
> have stations connect to the wndr so I can watch the rx signal strength.
>
>> Assuming that you have enough signal, the next question is how many
>> people are going to be trying to use the network at one time. You may be
>> better off with more APs operating at lower power levels so that you
>> have fewer people talking to each one.
>
> The tplink is better located so in general people tend to use that one over 
> the wndr. Last check it has around 20 stations connected to it during the
> day. The rest are connected to the 2 other APs.
>
> Thanks again for any insights you have.
>
> Lastly, I've been doing some reading on getting enterprise class APs from 
> Cisco, HP, etc.  A large number of them seem to require a lot of extra 
> infrastructure running wireless controllers and special software you have to 
> run to set them up.
>
> Any recommendations for something that's a step above consumer grade devices 
> but that does not require additional controllers or licensed software would 
> be appreciated.

There is a lot of room with consumer grade equipment from where you currently 
are. The "Enterprise Grade" systems do have a lot of infrastructure to 
coordinate the different APs.

David Lang


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-22  9:18         ` David Lang
@ 2015-01-22 18:19           ` Richard Smith
  2015-01-22 22:09             ` David Lang
                               ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Richard Smith @ 2015-01-22 18:19 UTC (permalink / raw)
  To: David Lang, Richard Smith; +Cc: cerowrt-devel

On 01/22/2015 04:18 AM, David Lang wrote:

>> Recently, we picked up the 11th floor as well and moved many people up
>> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
>> a free channel with a different ESSID.
>
> I like to put all the APs on the same ESSID so that people can roam
> between them. This requires that the APs act as bridges to a dedicated
> common network, not as routers.

That's the ultimate plan but for convenience of being able to easily 
select what AP I'm talking to or to be able to tell folks to move from 
one to another I've got them on different ESSIDs.  It also helps me keep 
track of what RF channel things are on.

>> Then about a week before my original post I got notified that Internet
>> was down.  Both 10th floor APs had stopped working.  The 11th floor
>> (where I am) was still working.   On the 10th floor, I could connect
>> to the  TP-link via its IP address on its wired interface but it did
>> not seem to be passing wireless traffic. A reboot fixed it.
>
> There has been an ongoing bug with Apple devices on 5Ghz that causes the
> wifi chipset to lockup. We think we've fixed it in the current Cerowrt,
> but I don't know what kernel versions have this problem. This is likely
> to affect multiple vendors who use the same chipset (check the openwrt
> hardware list for details of the chipsets in each model)

Oooohhh!  That could be it. We have a _lot_ of Apple devices.  Most of
the company uses a MacBook or Air, a large number of people have
iPhones, and we use iPods for some of our testing.   I'll go dig through
the OpenWrt hardware list and get the details.

>> The WNDR3700 was completely unresponsive both via WiFi and when I
>> tried its IP connected directly to its switch with a Cat-5.  I also
>> have a serial port mod on that wndr3700 so I connected up to that
>> instead.
>
> hmm, it's not common to have it be unresponsive on the wired network.

It's uncommon to me. :)  This unit has travelled with me for years while
I worked for OLPC and it's seen a lot of different wireless environments.
Granted, never one with this many Apple clients.  Usually 7-8
Linux/Windows machines and a pile of XOs.

So this happened a lot at your SCALE setups?

>> room. All the stations are in about a 40 foot radius and all but 1 or
>> 2 have line of sight to the AP.  The wndr3700 is in a closet on the
>> side of the room with other equipment so it might be 80 feet away from
>> the furthest station or so.
>
> this doesn't sound unreasonable unless your users are trying to use a
> LOT of bandwidth (although the fact that you refer to the 50Mb
> bottleneck indicates that you may be)

The bottleneck was just a nice side effect.  We don't use that much 
traffic.  I only noticed the limit once I started running 
netperf-wrapper tests from a wired host.

Occasionally there will be some big download that eats up bandwidth, but
when I watch the throughput during the day we peak up into the 40Mbps
range while the average is < 10Mbps (download).

>> Can I perhaps approximate signal strength by looking at the bitrate
>> for packets that station sends?  The theory being that higher quality
>> RF links should use the higher bitrate encodings when sending.
>
> not reliably, too many other things factor in to that.

Indeed. Horst tells me I basically have 2 rates happening on the tplink,
6Mbps and 24Mbps, with a few 12Mbps in there.

>> If need be I can move the wndr to the same location as the tplink and
>> then have stations connect to the wndr so I can watch the rx signal
>> strength.

Looks like that's what I'll have to do.

> There is a lot of room with consumer grade equipment from where you
> currently are. The "Enterprise Grade" systems do have a lot of
> infrastructure to coordinate the different APs.

Thanks for the ray of hope.  Yeah I don't need all the multi-AP 
coordination handoff stuff.

-- 
Richard A. Smith

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-22 18:19           ` Richard Smith
@ 2015-01-22 22:09             ` David Lang
  2015-01-22 22:55               ` Roman Toledo Casabona
  2015-01-24 14:59             ` dpreed
  2015-01-25  8:07             ` Outback Dingo
  2 siblings, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-22 22:09 UTC (permalink / raw)
  To: Richard Smith; +Cc: cerowrt-devel

On Thu, 22 Jan 2015, Richard Smith wrote:

>>> The WNDR3700 was completely unresponsive both via WiFi and when I
>>> tried its IP connected directly to it's switch with a Cat-5.  I also
>>> have a serial port mod on that wndr3700 so I connected up to that
>>> instead.
>> 
>> hmm, it's not common to have it be unresponsive on the wired network.
>
> It's uncommon to me. :)  This unit has travelled with me for years while I 
> worked for OLPC and its see a lot of different wireless environments. 
> Granted never one with this many apple clients.  Usually 7-8 Linux/Windows 
> machines and a pile of XOs.
>
> So this happened a lot at your SCALE setups?

two years ago we had a problem with the APs dropping off, but last year 
everything worked wonderfully.

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-22 22:09             ` David Lang
@ 2015-01-22 22:55               ` Roman Toledo Casabona
  0 siblings, 0 replies; 43+ messages in thread
From: Roman Toledo Casabona @ 2015-01-22 22:55 UTC (permalink / raw)
  To: Richard Smith, David Lang; +Cc: cerowrt-devel

I guess Yahoo email is not liked as a source for my reply. Can you kindly remove me from further notices, as I'm too overloaded with work to follow your project?

Thank you, and excuse me for stepping on this conversation; maybe this will get through.

Sorry, we were unable to deliver your message to the following address.

<majordomo@vger.kernel.org>:
Remote host said:
553 5.7.1 Hello [72.30.239.75], for your MAIL FROM address <rtoledo2002@yahoo.com> policy analysis reported: Your address is not liked source for email
[MAIL_FROM]

--- Below this line is a copy of the message.



unsubscribe netdev

in  the body of a message to majordomo@vger.kernel.org




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-22 18:19           ` Richard Smith
  2015-01-22 22:09             ` David Lang
@ 2015-01-24 14:59             ` dpreed
  2015-01-24 15:30               ` Kelvin Edmison
  2015-01-25  4:35               ` David Lang
  2015-01-25  8:07             ` Outback Dingo
  2 siblings, 2 replies; 43+ messages in thread
From: dpreed @ 2015-01-24 14:59 UTC (permalink / raw)
  To: Richard Smith; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 1302 bytes --]


On Thursday, January 22, 2015 1:19pm, "Richard Smith" <smithbone@gmail.com> said:
 

> On 01/22/2015 04:18 AM, David Lang wrote:
> 
> >> Recently, we picked up the 11th floor as well and moved many people up
> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
> >> a free channel with a different ESSID.
> >
> > I like to put all the APs on the same ESSID so that people can roam
> > between them. This requires that the APs act as bridges to a dedicated
> > common network, not as routers.
> 
> That's the ultimate plan but for convenience of being able to easily
> select what AP I'm talking to or to be able to tell folks to move from
> one to another I've got them on different ESSIDs. It also helps me keep
> track of what RF channel things are on.


A side comment, meant to discourage continuing to bridge rather than route.
There's no reason that the AP's cannot have different IP addresses, but a common ESSID.  Roaming between them would be like roaming among mesh subnets. Assuming you are securing your APs' air interfaces using encryption over the air, you are already re-authenticating as you move from AP to AP.  So using routing rather than bridging is a good idea for all the reasons that routing rather than bridging is better for mesh.
 

[-- Attachment #2: Type: text/html, Size: 2103 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-24 14:59             ` dpreed
@ 2015-01-24 15:30               ` Kelvin Edmison
  2015-01-25  4:35               ` David Lang
  1 sibling, 0 replies; 43+ messages in thread
From: Kelvin Edmison @ 2015-01-24 15:30 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 1888 bytes --]


> On Jan 24, 2015, at 9:59 AM, dpreed@reed.com wrote:
> 
> On Thursday, January 22, 2015 1:19pm, "Richard Smith" <smithbone@gmail.com> said:
>  
> > On 01/22/2015 04:18 AM, David Lang wrote:
> > 
> > >> Recently, we picked up the 11th floor as well and moved many people up
> > >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
> > >> a free channel with a different ESSID.
> > >
> > > I like to put all the APs on the same ESSID so that people can roam
> > > between them. This requires that the APs act as bridges to a dedicated
> > > common network, not as routers.
> > 
> > That's the ultimate plan but for convenience of being able to easily
> > select what AP I'm talking to or to be able to tell folks to move from
> > one to another I've got them on different ESSIDs. It also helps me keep
> > track of what RF channel things are on.
> 
> A side comment, meant to discourage continuing to bridge rather than route.
> There's no reason that the AP's cannot have different IP addresses, but a common ESSID.  Roaming between them would be like roaming among mesh subnets. Assuming you are securing your APs' air interfaces using encryption over the air, you are already re-authenticating as you move from AP to AP.  So using routing rather than bridging is a good idea for all the reasons that routing rather than bridging is better for mesh.
> 

Have the MDNS problems been addressed?  The last time I had a go with CeroWRT (about 6 months ago) the problems were too severe for me to keep using it.  I had to fall back to a bridged setup for my primarily Mac environment. 

I'm a long-time Linux user-space developer but am a complete newbie when it comes to developing for CeroWRT. If someone can point me at the right spot to start working on the MDNS issues then I'll see if I can do anything to help.  

Regards,
  Kelvin


[-- Attachment #2: Type: text/html, Size: 2972 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-24 14:59             ` dpreed
  2015-01-24 15:30               ` Kelvin Edmison
@ 2015-01-25  4:35               ` David Lang
  2015-01-25  5:02                 ` Dave Taht
  2015-01-25 20:17                 ` dpreed
  1 sibling, 2 replies; 43+ messages in thread
From: David Lang @ 2015-01-25  4:35 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

On Sat, 24 Jan 2015, dpreed@reed.com wrote:

> On Thursday, January 22, 2015 1:19pm, "Richard Smith" <smithbone@gmail.com> said:
> 
>
>> On 01/22/2015 04:18 AM, David Lang wrote:
>> 
>> >> Recently, we picked up the 11th floor as well and moved many people up
>> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
>> >> a free channel with a different ESSID.
>> >
>> > I like to put all the APs on the same ESSID so that people can roam
>> > between them. This requires that the APs act as bridges to a dedicated
>> > common network, not as routers.
>> 
>> That's the ultimate plan but for convenience of being able to easily
>> select what AP I'm talking to or to be able to tell folks to move from
>> one to another I've got them on different ESSIDs. It also helps me keep
>> track of what RF channel things are on.
>
>
> A side comment, meant to discourage continuing to bridge rather than route.
>
> There's no reason that the AP's cannot have different IP addresses, but a 
> common ESSID.  Roaming between them would be like roaming among mesh subnets. 
> Assuming you are securing your APs' air interfaces using encryption over the 
> air, you are already re-authenticating as you move from AP to AP.  So using 
> routing rather than bridging is a good idea for all the reasons that routing 
> rather than bridging is better for mesh.

The problem with doing this is that all existing TCP connections will break when 
you move from one AP to another and while some apps will quickly notice this and 
establish new connections, there are many apps that will not and this will cause 
noticeable disruption to the user.

Bridging allows the connections to remain intact. The wifi stack re-negotiates 
the encryption, but the encapsulated IP packets don't change.
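
A toy sketch of why the address change matters, under the usual assumption that a TCP endpoint identifies a connection by its 4-tuple. The addresses, ports, and the Server class are invented for illustration; this is not a model of any real stack.

```python
from collections import namedtuple

# A TCP connection is identified by its 4-tuple.
Conn = namedtuple("Conn", "src_ip src_port dst_ip dst_port")


class Server:
    """Toy server that tracks established connections by 4-tuple."""

    def __init__(self):
        self.established = set()

    def accept(self, conn):
        self.established.add(conn)

    def receive(self, conn):
        # A segment for an unknown 4-tuple is answered with a reset.
        return "ACK" if conn in self.established else "RST"


server = Server()
before = Conn("10.0.1.23", 50000, "192.0.2.10", 22)
server.accept(before)

# Bridged roam: same layer-2 network, the client keeps its address.
bridged = before
# Routed roam: the new AP hands out an address from its own subnet.
routed = before._replace(src_ip="10.0.2.57")

print(server.receive(bridged))  # ACK
print(server.receive(routed))   # RST
```

In the routed case the old connection is simply unknown to the far end, so the application has to notice and reconnect.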

I do this with the wifi on its own VLAN (actually separate VLANs for 2.4 and 
5GHz) and have the APs configured not to relay broadcast traffic from one 
wireless user to another. This cuts down a LOT on the problems of broadcasts.

In about a month I'm going to be running the wireless network for SCaLE again, 
and I would be happy to instrument the network to gather whatever info anyone is 
interested in. I will be using ~50 APs to handle the ~2800 or so devices that 
show up, with the footprint of each AP roughly covering a small meeting room 
(larger rooms have 2 APs in them, the largest room has 3, and I'm adding APs 
this year to cover the hallways better because the ones in the rooms aren't 
doing well enough at the low power settings I'm using)

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  4:35               ` David Lang
@ 2015-01-25  5:02                 ` Dave Taht
  2015-01-25  5:04                   ` Dave Taht
  2015-01-25  6:44                   ` David Lang
  2015-01-25 20:17                 ` dpreed
  1 sibling, 2 replies; 43+ messages in thread
From: Dave Taht @ 2015-01-25  5:02 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

On Sat, Jan 24, 2015 at 8:35 PM, David Lang <david@lang.hm> wrote:
> On Sat, 24 Jan 2015, dpreed@reed.com wrote:
>
>> On Thursday, January 22, 2015 1:19pm, "Richard Smith"
>> <smithbone@gmail.com> said:
>>
>>
>>> On 01/22/2015 04:18 AM, David Lang wrote:
>>>
>>> >> Recently, we picked up the 11th floor as well and moved many people up
>>> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
>>> >> a free channel with a different ESSID.
>>> >
>>> > I like to put all the APs on the same ESSID so that people can roam
>>> > between them. This requires that the APs act as bridges to a dedicated
>>> > common network, not as routers.
>>>
>>> That's the ultimate plan but for convenience of being able to easily
>>> select what AP I'm talking to or to be able to tell folks to move from
>>> one to another I've got them on different ESSIDs. It also helps me keep
>>> track of what RF channel things are on.

My usual use case for using different APs is to find an error in the campus.

When someone tells me that "Lupin-lodge" is down, I know exactly which machine
to check. If everything was named Lupin, I'd have to check far more than one
AP, and to ask approximately where on the campus they were.

>>
>>
>>
>> A side comment, meant to discourage continuing to bridge rather than
>> route.
>>
>> There's no reason that the AP's cannot have different IP addresses, but a
>> common ESSID.  Roaming between them would be like roaming among mesh
>> subnets. Assuming you are securing your APs' air interfaces using encryption
>> over the air, you are already re-authenticating as you move from AP to AP.
>> So using routing rather than bridging is a good idea for all the reasons
>> that routing rather than bridging is better for mesh.
>
>
> The problem with doing this is that all existing TCP connections will break
> when you move from one AP to another and while some apps will quickly notice
> this and establish new connections, there are many apps that will not and
> this will cause noticable disruption to the user.

I am under the impression that network-manager and linux, at least,
tend to renegotiate IPv6 addresses on a down/up, and preserve ipv4.

>
> Bridgeing allows the connections to remain intact. The wifi stack
> re-negotiates the encryption, but the encapsulated IP packets don't change.

While I actually agree with dlang on having all the same ssid and
bridging, and not routing, at a conference, as well as with the idea
of disabling broadcast (and I assume direct connectivity between two
people seated side by side), it is a pita:

More than once I've wanted to share a git tree with someone right next
to me. I try to hand them my ip to grab the tree, and they can't even
ping me, so I end up uploading it somewhere, and he or she downloading it
from there. Similarly, breaking interconnectivity precludes sane usage
of in-conference

In my case, since choosing to live in a routed, rather than bridged
world, I have modified the nailed up tools I use to be more
connectionless. Instead of ssh (tcp), I use mosh-multipath (udp),
which is far superior for interactive shells in lousy wifi
environments. For vpns, I switched to tinc, which will attempt direct
connections over udp, and tcp on both ipv4 and ipv6. For access to
google, I adopted quic in my chrome browser. Since doing all these
things I rarely notice losing a nailed up connection or migrating from
AP to AP. Additionally I use babel (where I control the network) and
ad-hoc wifi to transparently migrate from AP to AP, and (often) from
AP to wired to AP to wired as I change locations, also with no loss in
connectivity.

I don't expect the SCaLE userbase to have made these adjustments in behavior. :/

>
> I do this with the wifi on it's own VLAN (actually separate VLANs for 2.4
> and 5GHz) and have the APs configured not to relay broadcast traffic from
> one wireless user to another. This cuts down a LOT on the problems of
> broadcasts.
>
> In about a month I'm going to be running the wireless network for SCaLE
> again, and I would be happy to instrament the network to gather whatever
> info anyone is interested in. I will be using ~50 APs to handle the ~2800 or

I will look into some tools bismark and others have.

Will you attempt to deploy ipv6?

> so devices that show up, with the footprint of each AP roughly covering a
> small meeting room (larger rooms have 2 APs in them, the largest room has 3,
> and I'm adding APs this year to cover the hallways better because the ones
> in the rooms aren't doing well enough at the low power settings I'm using)

I am of course interested in how fq_codel performs on your ISP link, and
are you planning on running it for your wifi?

> David Lang
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel

-- 
Dave Täht

thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  5:02                 ` Dave Taht
@ 2015-01-25  5:04                   ` Dave Taht
  2015-01-25  6:44                   ` David Lang
  1 sibling, 0 replies; 43+ messages in thread
From: Dave Taht @ 2015-01-25  5:04 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

On Sat, Jan 24, 2015 at 9:02 PM, Dave Taht <dave.taht@gmail.com> wrote:
> On Sat, Jan 24, 2015 at 8:35 PM, David Lang <david@lang.hm> wrote:
>> On Sat, 24 Jan 2015, dpreed@reed.com wrote:
>>
>>> On Thursday, January 22, 2015 1:19pm, "Richard Smith"
>>> <smithbone@gmail.com> said:
>>>
>>>
>>>> On 01/22/2015 04:18 AM, David Lang wrote:
>>>>
>>>> >> Recently, we picked up the 11th floor as well and moved many people up
>>>> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
>>>> >> a free channel with a different ESSID.
>>>> >
>>>> > I like to put all the APs on the same ESSID so that people can roam
>>>> > between them. This requires that the APs act as bridges to a dedicated
>>>> > common network, not as routers.
>>>>
>>>> That's the ultimate plan but for convenience of being able to easily
>>>> select what AP I'm talking to or to be able to tell folks to move from
>>>> one to another I've got them on different ESSIDs. It also helps me keep
>>>> track of what RF channel things are on.
>
> My usual use case for using different APs is to find an error in the campus.
>
> When someone tells me that "Lupin-lodge" is down, I know exactly which machine
> to check. If everything was named Lupin, I'd have to check far more
> than one AP, and
> to ask approximately where on the campus they were.
>
>>>
>>>
>>>
>>> A side comment, meant to discourage continuing to bridge rather than
>>> route.
>>>
>>> There's no reason that the AP's cannot have different IP addresses, but a
>>> common ESSID.  Roaming between them would be like roaming among mesh
>>> subnets. Assuming you are securing your APs' air interfaces using encryption
>>> over the air, you are already re-authenticating as you move from AP to AP.
>>> So using routing rather than bridging is a good idea for all the reasons
>>> that routing rather than bridging is better for mesh.
>>
>>
>> The problem with doing this is that all existing TCP connections will break
>> when you move from one AP to another and while some apps will quickly notice
>> this and establish new connections, there are many apps that will not and
>> this will cause noticable disruption to the user.
>
> I am under the impression that network-manager and linux, at least,
> tend to renegotiate
> IPv6 addresses on an down/up, and preserve ipv4.
>
>>
>> Bridgeing allows the connections to remain intact. The wifi stack
>> re-negotiates the encryption, but the encapsulated IP packets don't change.
>
> While I actually agree with dlang on having all the same ssid and
> bridging, and not routing, on a conference, as well as with the idea
> of disabling broadcast (and I assume direct connectivity between two
> people seated side by side), it is a pita:
>
> More than once I've wanted to share a git tree with someone right next
> to me. I try to hand them my ip to grab the tree, and they can't even
> ping me, so I end uploading it somewhere, and he or she downloading it
> from there. Similarly, breaking interconnectivity precludes sane usage
> of in-conference

oops, hit send too early. "Of in-conference tools like webrtc, which
would otherwise seek a direct path, as well as other p2p things like
chat based on that".

> In my case, since choosing to live in a routed, rather than bridged
> world, I have modified the nailed up tools I use to be more
> connectionless. Instead of ssh (tcp), I use mosh-multipath (udp),
> which is far superior for interactive shells in lousy wifi
> environments. For vpns, I switched to tinc, which will attempt direct
> connections over udp, and tcp on both ipv4 and ipv6. For access to
> google, I adopted quic in my chrome browser. Since doing all these
> things I rarely notice losing a nailed up connection or migrating from
> AP to AP. Additionally I use babel (where I control the network) and
> ad-hoc wifi to transparently migrate from AP to AP, and (often) from
> AP to wired to AP to wired as I change locations, also with no loss in
> connectivity.
>
> I don't expect the scale userbase to have made these adjustments in behavior. :/
>
>>
>> I do this with the wifi on it's own VLAN (actually separate VLANs for 2.4
>> and 5GHz) and have the APs configured not to relay broadcast traffic from
>> one wireless user to another. This cuts down a LOT on the problems of
>> broadcasts.
>>
>> In about a month I'm going to be running the wireless network for SCaLE
>> again, and I would be happy to instrament the network to gather whatever
>> info anyone is interested in. I will be using ~50 APs to handle the ~2800 or
>
> I will look into some tools bismark and others have.
>
> Will you attempt to deploy ipv6?
>
>> so devices that show up, with the footprint of each AP roughly covering a
>> small meeting room (larger rooms have 2 APs in them, the largest room has 3,
>> and I'm adding APs this year to cover the hallways better because the ones
>> in the rooms aren't doing well enough at the low power settings I'm using)
>
> I am of course interested in how fq_codel performs on your ISP link, and
> are you planning on running it for your wifi?
>
>> David Lang
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>
>
> --
> Dave Täht
>
> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks



-- 
Dave Täht

thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  5:02                 ` Dave Taht
  2015-01-25  5:04                   ` Dave Taht
@ 2015-01-25  6:44                   ` David Lang
  2015-01-25  7:06                     ` David Lang
       [not found]                     ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>
  1 sibling, 2 replies; 43+ messages in thread
From: David Lang @ 2015-01-25  6:44 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

On Sat, 24 Jan 2015, Dave Taht wrote:

>>> A side comment, meant to discourage continuing to bridge rather than
>>> route.
>>>
>>> There's no reason that the AP's cannot have different IP addresses, but a
>>> common ESSID.  Roaming between them would be like roaming among mesh
>>> subnets. Assuming you are securing your APs' air interfaces using encryption
>>> over the air, you are already re-authenticating as you move from AP to AP.
>>> So using routing rather than bridging is a good idea for all the reasons
>>> that routing rather than bridging is better for mesh.
>>
>>
>> The problem with doing this is that all existing TCP connections will break
>> when you move from one AP to another and while some apps will quickly notice
>> this and establish new connections, there are many apps that will not and
>> this will cause noticable disruption to the user.
>
> I am under the impression that network-manager and linux, at least,
> tend to renegotiate
> IPv6 addresses on an down/up, and preserve ipv4.

It can't preserve the ipv4 address if you end up on a different network address 
range (and trying to have lots of separate networks with the same IP addresses 
would mean that you have to do NAT at each network, and if you did that, then 
when you ended up on a different AP with the same IP address, the NAT tables 
would not have records of your connections and they would terminate the 
connections when you tried to send the next packets).
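
A toy sketch of the NAT case described above, assuming each AP keeps its own translation table and nothing is shared between them. The NatAP class, the addresses, and the port numbers are all hypothetical, and real connection tracking is considerably more involved.

```python
class NatAP:
    """Toy per-AP NAT: (client ip, client port) -> public source port."""

    def __init__(self, name):
        self.name = name
        self.table = {}
        self.next_port = 40000  # arbitrary starting port for the sketch

    def new_connection(self, client_ip, client_port):
        # First packet of a flow: allocate a mapping and remember it.
        key = (client_ip, client_port)
        self.table[key] = self.next_port
        self.next_port += 1
        return self.table[key]

    def forward(self, client_ip, client_port):
        # Mid-stream packet: only flows this AP has seen can be translated.
        return self.table.get((client_ip, client_port))


# Both APs hand out the same private range (hypothetical addresses).
ap1, ap2 = NatAP("ap1"), NatAP("ap2")

client = ("192.168.1.50", 50000)
ap1.new_connection(*client)      # connection opened while on ap1

print(ap1.forward(*client))      # 40000: ap1 knows the flow
print(ap2.forward(*client))      # None: ap2 has no record after the roam
```

Even with an identical client address on both sides of the roam, the second AP has no state for the flow, which is the failure described above.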

>> Bridgeing allows the connections to remain intact. The wifi stack
>> re-negotiates the encryption, but the encapsulated IP packets don't change.
>
> While I actually agree with dlang on having all the same ssid and
> bridging, and not routing, on a conference, as well as with the idea
> of disabling broadcast (and I assume direct connectivity between two
> people seated side by side), it is a pita:
>
> More than once I've wanted to share a git tree with someone right next
> to me. I try to hand them my ip to grab the tree, and they can't even
> ping me, so I end uploading it somewhere, and he or she downloading it
> from there. Similarly, breaking interconnectivity precludes sane usage
> of in-conference

True, it also blocks some abuse. People who really want direct connectivity can 
establish it as an ad-hoc network.

For the normal user that we are trying to support at a conference, it's a win.

I'll note that we also block streaming sites (which has the side effect of 
blocking some useful sites that share the same IPs, Amazon for example) to help 
make things better for everyone else, even at the cost of limiting what some 
people are able to do. Bandwidth is limited compared to the number of people we 
have, and we have to make choices.

We do provide a local mirror of the Debian-based distros so that people can do 
the updates that they always tend to do at the conference (we would do the same 
for Fedora, but they make it too hard to do so).

> In my case, since choosing to live in a routed, rather than bridged
> world, I have modified the nailed up tools I use to be more
> connectionless. Instead of ssh (tcp), I use mosh-multipath (udp),
> which is far superior for interactive shells in lousy wifi
> environments. For vpns, I switched to tinc, which will attempt direct
> connections over udp, and tcp on both ipv4 and ipv6. For access to
> google, I adopted quic in my chrome browser. Since doing all these
> things I rarely notice losing a nailed up connection or migrating from
> AP to AP. Additionally I use babel (where I control the network) and
> ad-hoc wifi to transparently migrate from AP to AP, and (often) from
> AP to wired to AP to wired as I change locations, also with no loss in
> connectivity.
>
> I don't expect the scale userbase to have made these adjustments in behavior. :/

:-)

>>
>> I do this with the wifi on it's own VLAN (actually separate VLANs for 2.4
>> and 5GHz) and have the APs configured not to relay broadcast traffic from
>> one wireless user to another. This cuts down a LOT on the problems of
>> broadcasts.
>>
>> In about a month I'm going to be running the wireless network for SCaLE
>> again, and I would be happy to instrament the network to gather whatever
>> info anyone is interested in. I will be using ~50 APs to handle the ~2800 or
>
> I will look into some tools bismark and others have.
>
> Will you attempt to deploy ipv6?

We have been offering IPv6 routable addresses for a few years.

>> so devices that show up, with the footprint of each AP roughly covering a
>> small meeting room (larger rooms have 2 APs in them, the largest room has 3,
>> and I'm adding APs this year to cover the hallways better because the ones
>> in the rooms aren't doing well enough at the low power settings I'm using)
>
> I am of course interested in how fq_codel performs on your ISP link, and
> are you planning on running it for your wifi?

I'm running OpenWRT on the APs but haven't done anything in particular to 
activate it. I'll check what we have on the firewall (a fairly up to date Debian 
build).

What's the best way to monitor the queues?

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  6:44                   ` David Lang
@ 2015-01-25  7:06                     ` David Lang
       [not found]                     ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>
  1 sibling, 0 replies; 43+ messages in thread
From: David Lang @ 2015-01-25  7:06 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

On Sat, 24 Jan 2015, David Lang wrote:

> On Sat, 24 Jan 2015, Dave Taht wrote:
>
>> I am of course interested in how fq_codel performs on your ISP link, and
>> are you planning on running it for your wifi?
>
> I'm running OpenWRT on the APs but haven't done anything in particular to 
> activate it. I'll check what we have on the firewall (a fairly up to day 
> Debian build)
>
> What's the best way to monitor the queues?

For that matter, are there any other monitoring or stats that you would like 
me to gather? I'm using WNDR3800 and WNDR3700v2 APs.

Especially anything related to gathering stats related to fast wifi.

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

[parent not found: <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>]

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
       [not found]                     ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>
@ 2015-01-25  7:59                       ` Dave Taht
  2015-01-25  9:39                       ` David Lang
  1 sibling, 0 replies; 43+ messages in thread
From: Dave Taht @ 2015-01-25  7:59 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 11230 bytes --]

This got mangled by my IP addr filter

On Jan 24, 2015 11:56 PM, "Dave Taht" <dave.taht@gmail.com> wrote:
>
> I want to make clear that I support dlang's design in the abstract... and
> am just arguing because it is a slow day.
>
> On Sat, Jan 24, 2015 at 10:44 PM, David Lang <david@lang.hm> wrote:
> > On Sat, 24 Jan 2015, Dave Taht wrote:
> >
> > >>>> A side comment, meant to discourage continuing to bridge rather than
> > >>>> route.
> > >>>>
> > >>>> There's no reason that the AP's cannot have different IP addresses, but
> > >>>> a common ESSID. Roaming between them would be like roaming among mesh
> > >>>> subnets. Assuming you are securing your APs' air interfaces using
> > >>>> encryption over the air, you are already re-authenticating as you move
> > >>>> from AP to AP. So using routing rather than bridging is a good idea for
> > >>>> all the reasons that routing rather than bridging is better for mesh.
> > >>>
> > >>>
> > >>>
> > >>> The problem with doing this is that all existing TCP connections will
> > >>> break when you move from one AP to another and while some apps will
> > >>> quickly notice this and establish new connections, there are many apps
> > >>> that will not and this will cause noticeable disruption to the user.
> >>
> >>
> >> I am under the impression that network-manager and linux, at least,
> >> tend to renegotiate
> >> IPv6 addresses on an down/up, and preserve ipv4.
> >
> >
> > It can't preserve the ipv4 address if you end up on a different network
> > address range (and trying to have lots of separate networks with the same
> > IP addresses would mean that you have to do NAT at each network, and if you
> > did that, then when you ended up on a different AP with the same IP
> > address, the NAT tables would not have records of your connections and they
> > would terminate the connections when you tried to send the next packets).
>
> Hmm? The first thing I ever do to a router is renumber it to a unique IP
> address range, and rename the subnet in dns to something unique. The 3 sed
> lines for this are on a cerowrt web page somewhere. Adding ipv6 statically
> is a pita, but doable with care and a uci script, and mildly more doable as
> hnetd matures.
>
> I run local dns services on each in the hope that at least some will be
> cached, and a local dhcp server to serve addresses out of that range. I
> turn off dhcp default route fetching on each router's external interface and
> use babel instead to find the right route(s) out of the system.
>
> On the NAT front, there is no nat on the internal routers, just a flat
> address space (a /14 in my case). I push all the nat to the main egress
> gateway(s), and in a case like yours would probably use multiple external
> IPs and dnat rather than masquerade the entire subnet on one to free up
> port space. You rapidly run out of ports in a natted environment with that
> many users. I've had to turn down NAT timeouts for udp in particular to
> truly unreasonable levels otherwise (20 seconds in some cases)
>
> Doing this I can get a quick status on what is up with "ip route", and by
> monitoring the activity on each ip range, see if traffic is actually being
> passed, a failure of a given gateway fails over to another, and so on.
> There's a couple snmp hacks to do things like monitor active leases, and
> smokeping/mrtg to access other stats. There's a couple beagles that are on
> wifi that I ping on some APs. The beagles have not been very reliable for
> me, so they switch on and off with digiloggers gear when they fail a local
> ping. In fact the main logging beagle failed entirely the other month, sigh.
>
> I use the ad-hoc links on cerowrt as backups (if they lose ethernet
> connectivity) and extenders (if there is no ethernet connectivity), and (as
> I have 5 different comcast exit nodes spread throughout the network), use
> babel-pinger on each to see if they are up, and insert default routes into
> the mix that are automatically the shortest "distance" between the node and
> exit gateway. If one gw goes down (usually) all the traffic ends up
> switching to the next nearest default gateway switching over in 16 seconds
> or so, breaking all the nat associations for the net they were on (sigh),
> as well as ipv6 native stuff, but it's happened so often without me
> noticing it that it's nice not to worry.
>
> (I have a mostly failed attempt in play for doing better with ipv6 and
> hnetd on a couple of exit nodes, but that isn't solid enough to deploy as
> yet, so it's only sort of working in the yurtlab. I really wish I could buy
> PI space for ipv6 somehow)
>
> (I have been fiddling with dns anycast to try to get more redundancy on the
> main dns gateways. That works pretty good)
>
> Now, your method is simpler! (although mine is mostly scripted) I imagine
> you bridge everything on a vlan, and use a central dhcp/dns server to serve
> up dhcp across (say) a 10/16 subnet. And by blocking local
> multicast/broadcast, in particular, this scales across the 3k user
> population. You've got a critical single point of failure in your gateway,
> but at least that's only one, and I imagine you have that duplicated.
>
> (In contrast my network is always broken somewhere, but unless two critical
> nodes break, it's pretty redundant and loss is confined to a single AP -
> my biggest problem is that I need to upgrade the firmware on about half the
> network - which involves climbing trees - and my plan was to deploy hnetd
> last year so I could roll out ipv6)
>
> How do you deal with a dead AP that is not actually connecting with traffic?
>
> >>> Bridgeing allows the connections to remain intact. The wifi stack
> >>> re-negotiates the encryption, but the encapsulated IP packets don't
> >>> change.
> >>
> >>
> >> While I actually agree with dlang on having all the same ssid and
> >> bridging, and not routing, on a conference, as well as with the idea
> >> of disabling broadcast (and I assume direct connectivity between two
> >> people seated side by side), it is a pita:
> >>
> >> More than once I've wanted to share a git tree with someone right next
> >> to me. I try to hand them my ip to grab the tree, and they can't even
> >> ping me, so I end uploading it somewhere, and he or she downloading it
> >> from there. Similarly, breaking interconnectivity precludes sane usage
> >> of in-conference
> >
> >
> > True, it also blocks some abuse. People who really want direct
> > connectivity can establish it as an ad-hoc network.
>
> yes, I've often draped an ethernet cable between seats. :)
>
> >
> > For the normal user that we are trying to support at a conference, it's a
> > win.
> >
> > I'll note that we also block streaming sites (which has the side effect of
> > blocking some useful sites that share the same IPs, Amazon for example) to
> > help make things better for everyone else, even at the cost of limiting
> > what some people are able to do. Bandwidth is limited compared to the
> > number of people we have, and we have to make choices.
>
> Blocking ads is also effective.
>
> > We do provide a local mirror of the debian based distros so that people
> > can do the updates that they always tend to do at the conference (we would
> > do the same for Fedora, but they make it too hard to do so)
> >
> >> In my case, since choosing to live in a routed, rather than bridged
> >> world, I have modified the nailed up tools I use to be more
> >> connectionless. Instead of ssh (tcp), I use mosh-multipath (udp),
> >> which is far superior for interactive shells in lousy wifi
> >> environments. For vpns, I switched to tinc, which will attempt direct
> >> connections over udp, and tcp on both ipv4 and ipv6. For access to
> >> google, I adopted quic in my chrome browser. Since doing all these
> >> things I rarely notice losing a nailed up connection or migrating from
> >> AP to AP. Additionally I use babel (where I control the network) and
> >> ad-hoc wifi to transparently migrate from AP to AP, and (often) from
> >> AP to wired to AP to wired as I change locations, also with no loss in
> >> connectivity.
> >>
> >> I don't expect the scale userbase to have made these adjustments in
> >> behavior. :/
> >
> >
> > :-)
>
> It wouldn't hurt to recommend these tools (notably quic and mosh) to
> conference participants. Both are pretty awesome.
>
> >
> >>>
> > >>> I do this with the wifi on its own VLAN (actually separate VLANs for
> > >>> 2.4 and 5GHz) and have the APs configured not to relay broadcast traffic
> > >>> from one wireless user to another. This cuts down a LOT on the problems
> > >>> of broadcasts.
> > >>>
> > >>> In about a month I'm going to be running the wireless network for SCaLE
> > >>> again, and I would be happy to instrument the network to gather whatever
> > >>> info anyone is interested in. I will be using ~50 APs to handle the ~2800
> > >>> or
> >>
> >>
> >> I will look into some tools bismark and others have.
> >>
> >> Will you attempt to deploy ipv6?
> >
> >
> > We have been offering IPv6 routable addresses for a few years.
>
> How many do you get and from whom?
>
> If I had time (doubtful) and budget (even more doubtful) I'd try to make
> it to SCaLE to observe and help out.
>
> > >>> so devices that show up, with the footprint of each AP roughly covering
> > >>> a small meeting room (larger rooms have 2 APs in them, the largest room
> > >>> has 3, and I'm adding APs this year to cover the hallways better because
> > >>> the ones in the rooms aren't doing well enough at the low power settings
> > >>> I'm using)
> > >>
> > >>
> > >> I am of course interested in how fq_codel performs on your ISP link, and
> > >> are you planning on running it for your wifi?
> > >
> > >
> > > I'm running OpenWRT on the APs but haven't done anything in particular to
> > > activate it.
>
> fq_codel is on by default in Barrier breaker and later on all interfaces. I
> note that it doesn't scale anywhere near as well as we would like under
> contention, but that work is only beginning in chaos calmer. A thought I've
> had in an environment such as yours would be to rate limit each AP's
> ingress/egress ethernet interface to, say, 20mbits, thus pushing all the
> potential bloat to sqm on ethernet and out of the wifi (which would
> generally run faster). Might even force uploads from the users lower, also
> (say 10mbit). Might not, and just rely on people retaining low
> expectations. :)
>
> Was it on openwrt last year?
>
> > I'll check what we have on the firewall (a fairly up to date
> > Debian build)
>
> fq_codel has been a part of that for a long time.
>
> I'd port over the sqm-scripts and use those, it's only a 1 line change.
>
> > What's the best way to monitor the queues?
>
> On each router?
>
> I tend to use pdsh a lot, setting up a /etc/genders file for them all so I
> can do a
>
> pdsh tc qdisc show dev wlan0 # or uptime or cat /etc/dhcp.leases | wc -l or
> whatever
>
> Been meaning to get around to something that used snmp instead for a while.
>
> >
> > David Lang
>
> --
> Dave Täht
>
> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
       [not found]                     ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>
  2015-01-25  7:59                       ` Dave Taht
@ 2015-01-25  9:39                       ` David Lang
  2015-01-25 15:03                         ` Chuck Anderson
  1 sibling, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-25  9:39 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, Dave Taht wrote:

> I want to make clear that I support dlang's design in the abstract... and
> am just arguing because it is a slow day.

I welcome challenges to the design, it's how I improve things :-)

> On Sat, Jan 24, 2015 at 10:44 PM, David Lang <david@lang.hm> wrote:
>> On Sat, 24 Jan 2015, Dave Taht wrote:
>>

to clarify, the chain of comments was

1. instead of bridging I should route

2. network manager would preserve the IPv4 address to prevent breaking 
established connections.

I was explaining how that can't work. If you are moving between different 
networks, each routed independently, they either need to have different address 
ranges (in which case the old IP just won't work), or they would each need to 
NAT to get to the outside (in which case the IP may stay the same, but the 
connections will break since the new router wouldn't have the NAT entries for 
the existing connections)
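
A toy sketch of the second case (same client IP, separate NAT at each AP). This 
is not real conntrack code; the class, the addresses and the port numbers are 
all made up for illustration. It only shows that a mid-stream packet finds no 
state on the new router:

```python
class NatRouter:
    """Hypothetical per-AP NAT box that keeps its own private state table."""

    def __init__(self, name):
        self.name = name
        self.table = {}         # (client_ip, client_port, dst) -> public port
        self.next_port = 40000  # arbitrary starting port for the sketch

    def new_connection(self, client_ip, client_port, dst):
        # The first packet of a flow (e.g. a TCP SYN) creates the NAT entry.
        key = (client_ip, client_port, dst)
        self.table[key] = self.next_port
        self.next_port += 1
        return self.table[key]

    def established_packet(self, client_ip, client_port, dst):
        # A mid-stream packet is only translated if an entry already exists.
        return self.table.get((client_ip, client_port, dst))


ap1 = NatRouter("ap1")
ap2 = NatRouter("ap2")
flow = ("192.168.1.50", 51515, ("198.51.100.7", 443))

ap1.new_connection(*flow)
print(ap1.established_packet(*flow))  # 40000: ap1 knows the flow
print(ap2.established_packet(*flow))  # None: after roaming, ap2 has no entry
```

Sharing the NAT state between routers would avoid this, but that is extra 
machinery that bridging does not need.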

> Hmm? The first thing I ever do to a router is renumber it to a unique IP 
> address range, and rename the subnet in dns to something unique. The 3 sed 
> lines for this are on a cerowrt web page somewhere. Adding ipv6 statically is 
> a pita, but doable with care and a uci script, and mildly more doable as hnetd 
> matures.
>
> I run local dns services on each in the hope that at least some will be
> cached, and a local dhcp server to serve addresses out of that range. I
> turn off dhcp default route fetching on each routers external interface and
> use babel instead to find the right route(s) out of the system.
>
> On the NAT front, there is no nat on the internal routers, just a flat
> address space (172.20.0.0/14 in my case). I push all the nat to the main
> egress gateway(s), and in a case like yours would probably use multiple
> external IPs and dnat rather than masquerade the entire subnet on one to
> free up port space. You rapidly run out of ports in a natted environment
> with that many users. I've had to turn down NAT timeouts for udp in
> particular to truly unreasonable levels otherwise (20 seconds in some cases)

hmm, we haven't seen anything like this, but it could be a problem we haven't 
noticed because we haven't been looking for it.

> Doing this I can get a quick status on what is up with "ip route", and by
> monitoring the activity on each ip range, see if traffic is actually being
> passed, a failure of a given gateway fails over to another, and so on.
> There's a couple snmp hacks to do things like monitor active leases, and
> smokeping/mrtg to access other stats. There's a couple beagles that are on
> wifi that I ping on some APs. The beagles have not been very reliable for
> me, so they switch on and off with digiloggers gear when they fail a local
> ping. In fact the main logging beagle failed entirely the other month, sigh.
>
> I use the ad-hoc links on cerowrt as backups (if they lose ethernet
> connectivity) and extenders (if there is no ethernet connectivity), and (as
> I have 5 different comcast exit nodes spread throughout the network), use
> babel-pinger on each to see if they are up, and insert default routes into
> the mix that are automatically the shortest "distance" between the node and
> exit gateway. If one gw goes down (usually) all the traffic ends up
> switching to the next nearest default gateway switching over in 16 seconds
> or so, breaking all the nat associations for the net they were on (sigh),
> as well as ipv6 native stuff, but it's happened so often without me
> noticing it that it's nice not to worry.
>
> (I have a mostly failed attempt in play for doing better with ipv6 and
> hnetd on a couple of exit nodes, but that isn't solid enough to deploy as
> yet, so it's only sort of working in the yurtlab. I really wish I could buy
> PI space for ipv6 somehow)
>
> (I have been fiddling with dns anycast to try to get more redundancy on the
> main dns gateways. That works pretty good)
>
> Now, your method is simpler! (although mine is mostly scripted) I imagine
> you bridge everything on a vlan, and use a central dhcp/dns server to serve
> up dhcp across (say) a 10.0.0.0/16 subnet. And by blocking local
> multicast/broadcast, in particular, this scales across the 3k user
> population. You've got a critical single point of failure in your gateway,
> but at least that's only one, and I imagine you have that duplicated.

I have two wifi vlans, one for 5GHz (ESSID SCALE), and one for 2.4GHz (ESSID 
SCALE-slow; there are no speed limits, but the name does a great job of 
encouraging everyone who can to use 5GHz :-) ). There is a central DHCP server 
and firewall that allocates addresses across a /17 for each of the two 
networks. We don't set up active failover, but we have a spare box that we can 
put in if needed.
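
As a quick check of how much room a /17 gives, here is a small sketch using 
Python's ipaddress module. The 10.0.0.0/17 prefix is a placeholder, not the 
range actually used at SCaLE:

```python
import ipaddress

# Hypothetical prefix; only the /17 length comes from the description above.
net = ipaddress.ip_network("10.0.0.0/17")

# Subtract the network and broadcast addresses to get assignable hosts.
usable = net.num_addresses - 2
print(net.num_addresses)  # 32768
print(usable)             # 32766
```

That is per wifi network, so each of the two VLANs has far more addresses than 
the few thousand devices expected.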

The APs don't have any IP addresses on either wireless network. They have an IP 
on a different VLAN that's used for management only. Makes it a bit harder for 
any attackers to do anything to them.

Remember, we only need it to work for a few days at a time.

> (In contrast my network is always broken somewhere, but unless two critical
> nodes break, it's pretty redundant and loss is confined to a single AP -
> my biggest problem is that I need to upgrade the firmware on about half the
> network - which involves climbing trees - and my plan was to deploy hnetd
> last year so I could roll out ipv6)
>
> How do you deal with a dead AP that is not actually connecting with traffic?

We use Nagios-type monitoring to detect that the AP isn't reachable on the 
wired network, and we send a runner to find out what's happening. About three 
years ago we had a lot of problems with people unplugging the APs for some 
reason.

>> For the normal user that we are trying to support at a conference, it's a
>> win.
>>
>> I'll note that we also block streaming sites (which has the side effect of 
>> blocking some useful sites that share the same IPs, Amazon for example) to 
>> help make things better for everyone else, even at the cost of limiting what 
>> some people are able to do. Bandwidth is limited compared to the number of 
>> people we have, and we have to make choices.
>
> Blocking ads is also effective.

We use DNS to block things like this (or actually redirect the DNS to point to a 
server that serves an image saying that they are being blocked by SCaLE), and 
then we block port 53 to the outside to force people to use our DNS servers. 
Somewhat heavy handed, but it works.

>>> Will you attempt to deploy ipv6?
>>
>>
>> We have been offering IPv6 routable addresses for a few years.
>
> How many do you get and from whom?

I don't remember at the moment.

>>> I am of course interested in how fq_codel performs on your ISP link, and
>>> are you planning on running it for your wifi?
>>
>>
>> I'm running OpenWRT on the APs but haven't done anything in particular to
>> activate it.
>
> fq_codel is on by default in Barrier breaker and later on all interfaces. I
> note that it doesn't scale anywhere near as well as we would like under
> contention but that work is only beginning in chaos calmer. A thought I've had in an
> environment such as yours would be to rate limit each AP's ingress/egress
> ethernet interface to, say, 20mbits, thus pushing all the potential bloat
> to sqm on ethernet and out of the wifi (which would generally run faster).
> Might even force uploads from the users lower, also (say 10mbit). Might
> not, and just rely on people retaining low expectations. :)
>
> Was it on openwrt last year?

yes, most of what I did on the wireless side is in the paper at 
https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david_wireless

The first year I did the network I had a total of one month to plan and buy 
APs, so I was running stock firmware. The second year I used DD-WRT and was 
very unhappy with it. I've been running OpenWRT since.

>> I'll check what we have on the firewall (a fairly up to date
>> Debian build)
>
> fq_codel has been a part of that for a long time.
>
> I'd port over the sqm-scripts and use those, it's only a 1 line change.
>
>> What's the best way to monitor the queues?
>
> On each router?
>
> I tend to use pdsh a lot, setting up a /etc/genders file for them all so I
> can do a
>
> pdsh tc qdisc show dev wlan0 # or uptime or cat /etc/dhcp.leases | wc -l or
> whatever
>
> Been meaning to get around to something that used snmp instead for a while.

I'm gathering info on each AP about the number of users currently connected and 
the bandwidth used on all ports. I also have a central log from all APs which 
shows the MAC addresses as they associate with each AP.

So collecting the data in one place is the easy part. What I don't know is 
what I need to gather from where, and with what commands. Any suggestions for 
this are very welcome.
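
One possible starting point, as a sketch rather than anything agreed in this 
thread: "tc -s qdisc show dev <iface>" prints per-qdisc counters (drops, 
backlog, ECN marks) that can be parsed and shipped to the central collector. 
The sample text below is illustrative only; the numbers are made up and the 
exact layout may differ between tc versions:

```python
import re

# Illustrative sample of `tc -s qdisc show dev eth0` output (made-up numbers).
SAMPLE = """\
qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1480035 bytes 9876 pkt (dropped 12, overlimits 0 requeues 3)
 backlog 3028b 2p requeues 3
  maxpacket 1514 drop_overlimit 0 new_flow_count 421 ecn_mark 7
  new_flows_len 0 old_flows_len 1
"""


def parse_tc_stats(text):
    """Pull the counters of a single qdisc out of tc's text output."""
    stats = {}
    m = re.search(
        r"Sent (\d+) bytes (\d+) pkt "
        r"\(dropped (\d+), overlimits (\d+) requeues (\d+)\)",
        text,
    )
    if m:
        keys = ("sent_bytes", "sent_pkts", "dropped", "overlimits", "requeues")
        stats.update(zip(keys, map(int, m.groups())))
    # tc may abbreviate the byte size (e.g. "15Kb"), so keep it as a string.
    m = re.search(r"backlog (\S+) (\d+)p", text)
    if m:
        stats["backlog_size"] = m.group(1)
        stats["backlog_pkts"] = int(m.group(2))
    for key in ("drop_overlimit", "new_flow_count", "ecn_mark",
                "new_flows_len", "old_flows_len"):
        m = re.search(rf"\b{key} (\d+)", text)
        if m:
            stats[key] = int(m.group(1))
    return stats


stats = parse_tc_stats(SAMPLE)
print(stats["dropped"], stats["ecn_mark"], stats["backlog_pkts"])  # 12 7 2
```

Sampling these counters periodically on each AP and on the firewall, and 
looking at how drops, ECN marks and backlog change over time, would show 
whether the queues are actually being exercised.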

David Lang

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  9:39                       ` David Lang
@ 2015-01-25 15:03                         ` Chuck Anderson
  0 siblings, 0 replies; 43+ messages in thread
From: Chuck Anderson @ 2015-01-25 15:03 UTC (permalink / raw)
  To: cerowrt-devel

On Sun, Jan 25, 2015 at 01:39:32AM -0800, David Lang wrote:
> On Sun, 25 Jan 2015, Dave Taht wrote:
> 
> >I want to make clear that I support dlang's design in the abstract... and
> >am just arguing because it is a slow day.
> 
> I welcome challenges to the design, it's how I improve things :-)
> 
> >On Sat, Jan 24, 2015 at 10:44 PM, David Lang <david@lang.hm> wrote:
> >>On Sat, 24 Jan 2015, Dave Taht wrote:
> >>
> 
> to clarify, the chain of comments was
> 
> 1. instead of bridging I should route
> 
> 2. network manager would preserve the IPv4 address to prevent
> breaking established connections.
> 
> I was explaining how that can't work. If you are moving between
> different networks, each routed independently, they either need to
> have different address ranges (in which case the old IP just won't
> work), or they would each need to NAT to get to the outside (in
> which case the IP may stay the same, but the connections will break
> since the new router wouldn't have the NAT entries for the existing
> connections)

To keep your IP when roaming:

3. The old school way: use mobile IP or some other tunneling mechanism
   (or VPN) so you can keep your same IP.

4. Use a "virtual subnet" model similar to:

https://tools.ietf.org/html/draft-ietf-l3vpn-virtual-subnet-03

The draft is focused on data centers and VM migration, but the problem
is the same with client migration/mobility.  I would argue that it is
even easier to "discover" the location of a client with Wi-Fi because
of the association/authentication handshake with the AP rather than
relying on a Gratuitous ARP/ND or LLDP, VSI, etc.

5. Use LISP:

http://en.wikipedia.org/wiki/Locator/Identifier_Separation_Protocol
http://lispmob.org/ (supported on OpenWRT)

Has anyone played with this?

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  4:35               ` David Lang
  2015-01-25  5:02                 ` Dave Taht
@ 2015-01-25 20:17                 ` dpreed
  2015-01-25 23:21                   ` Aaron Wood
  2015-01-25 23:57                   ` David Lang
  1 sibling, 2 replies; 43+ messages in thread
From: dpreed @ 2015-01-25 20:17 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 2970 bytes --]

Disagree. See below.

On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> said:

> On Sat, 24 Jan 2015, dpreed@reed.com wrote:
> > A side comment, meant to discourage continuing to bridge rather than route.
> >
> > There's no reason that the AP's cannot have different IP addresses, but a
> > common ESSID. Roaming between them would be like roaming among mesh subnets.
> > Assuming you are securing your APs' air interfaces using encryption over the
> > air, you are already re-authenticating as you move from AP to AP. So using
> > routing rather than bridging is a good idea for all the reasons that routing
> > rather than bridging is better for mesh.
> 
> The problem with doing this is that all existing TCP connections will break when
> you move from one AP to another and while some apps will quickly notice this and
> establish new connections, there are many apps that will not and this will cause
> noticable disruption to the user.
> 
> Bridgeing allows the connections to remain intact. The wifi stack re-negotiates
> the encryption, but the encapsulated IP packets don't change.

There is no reason why one cannot set up an enterprise network to support roaming while maintaining the property that IP addresses don't change while roaming from AP to AP.  Here's a simple concept that amounts to moving what would be in the Ethernet bridging tables up to the IP layer.

All addresses in the enterprise are assigned from a common prefix (XXX/16 in IPv4, perhaps).  Routing in each access point is used to decide whether to send the packet on its LAN, or to reflect it to another LAN.  A node's preferred location would be updated by the endpoint itself, sending its current location to its current access point (via ARP or some other protocol).   The access point that hears of a new node that it can reach tells all the other access points that the node is attached to it.  Delivery of a packet to a node is done by the access point that receives the packet by looking up the destination IP address in its local table, and sending it to the access point that currently has the destination IP address.
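
A minimal sketch of that idea, assuming an in-memory table and direct method calls in place of a real announcement protocol. The class name, method names and addresses are hypothetical and not taken from any existing routing daemon:

```python
class AccessPoint:
    """Hypothetical AP keeping a per-host location table at the IP layer."""

    def __init__(self, name):
        self.name = name
        self.peers = []     # the other access points in the enterprise
        self.location = {}  # client IP -> name of the AP currently serving it

    def client_attached(self, ip):
        # The client announces itself here (via ARP or some other protocol);
        # this AP records it and tells every other AP.
        self.location[ip] = self.name
        for peer in self.peers:
            peer.location[ip] = self.name

    def next_hop(self, dst_ip):
        # Deliver on the local LAN, or name the AP to forward to.
        owner = self.location.get(dst_ip)
        if owner is None:
            return None
        return "local" if owner == self.name else owner


def build_mesh(*names):
    aps = [AccessPoint(n) for n in names]
    for ap in aps:
        ap.peers = [p for p in aps if p is not ap]
    return aps


ap1, ap2, ap3 = build_mesh("ap1", "ap2", "ap3")
ap1.client_attached("10.0.5.20")
print(ap3.next_hop("10.0.5.20"))  # ap1
ap2.client_attached("10.0.5.20")  # the client roams; its IP does not change
print(ap3.next_hop("10.0.5.20"))  # ap2
print(ap2.next_hop("10.0.5.20"))  # local
```

Note that every AP holds one entry per attached client, which is the same per-host state an Ethernet bridge keeps per MAC address.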

This is far better than "bridging" at the Ethernet level from a functionality point of view - it is using routing, not bridging.  Bridging at the Ethernet level uses Ethernet's STP feature, which doesn't work very well in collections of wireless LANs (it is slow to recalculate when something moves, because it was designed for unplugging and plugging actual cables and for moving a host from one physical location to another).

IMO, Ethernet sometimes aspires to solve problems that are already well-solved in the Internet protocols. (for example the 802.11s mess which tries to do a mesh entirely in the Ethernet layer, and fails pretty miserably).

Of course that's only my opinion, but I think it applies to overuse of bridging at the Ethernet layer when there are better approaches at the next layer up.

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25 20:17                 ` dpreed
@ 2015-01-25 23:21                   ` Aaron Wood
  2015-01-25 23:57                   ` David Lang
  1 sibling, 0 replies; 43+ messages in thread
From: Aaron Wood @ 2015-01-25 23:21 UTC (permalink / raw)
  To: David Reed; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 1241 bytes --]

On Sun, Jan 25, 2015 at 12:17 PM, <dpreed@reed.com> wrote:

> There is no reason why one cannot set up an enterprise network to support
> roaming, yet maintaining the property that IP addresses don't change while
> roaming from AP to AP.  Here's a simple concept, that amounts to moving
> what would be in the Ethernet bridging tables up to the IP layer.
>
>
>
> All addresses in the enterprise are assigned from a common prefix (XXX/16
> in IPv4, perhaps).  Routing in each access point is used to decide whether
> to send the packet on its LAN, or to reflect it to another LAN.  A node's
> preferred location would be updated by the endpoint itself, sending its
> current location to its current access point (via ARP or some other
> protocol).   The access point that hears of a new node that it can reach
> tells all the other access points that the node is attached to it.
> Delivery of a packet to a node is done by the access point that receives
> the packet by looking up the destination IP address in its local table, and
> sending it to the access point that currently has the destination IP
> address.
>

I'm not familiar with routing protocols.  Do any of the current ones do
this, or is this an idea for a new protocol?

-Aaron

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25 20:17                 ` dpreed
  2015-01-25 23:21                   ` Aaron Wood
@ 2015-01-25 23:57                   ` David Lang
  2015-01-26  1:51                     ` dpreed
  2015-01-26  4:25                     ` Valdis.Kletnieks
  1 sibling, 2 replies; 43+ messages in thread
From: David Lang @ 2015-01-25 23:57 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, dpreed@reed.com wrote:

> Disagree. See below.
>
>
> On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> said:
>
>
>
>> On Sat, 24 Jan 2015, dpreed@reed.com wrote:
>> > A side comment, meant to discourage continuing to bridge rather than route.
>> >
>> > There's no reason that the AP's cannot have different IP addresses, but a
>> > common ESSID. Roaming between them would be like roaming among mesh subnets.
>> > Assuming you are securing your APs' air interfaces using encryption over the
>> > air, you are already re-authenticating as you move from AP to AP. So using
>> > routing rather than bridging is a good idea for all the reasons that routing
>> > rather than bridging is better for mesh.
>> 
>> The problem with doing this is that all existing TCP connections will break when
>> you move from one AP to another and while some apps will quickly notice this and
>> establish new connections, there are many apps that will not and this will cause
>> noticable disruption to the user.
>> 
>> Bridgeing allows the connections to remain intact. The wifi stack re-negotiates
>> the encryption, but the encapsulated IP packets don't change.
>
>
> There is no reason why one cannot set up an enterprise network to support 
> roaming, yet maintaining the property that IP addresses don't change while 
> roaming from AP to AP.  Here's a simple concept, that amounts to moving what 
> would be in the Ethernet bridging tables up to the IP layer.
> 
> All addresses in the enterprise are assigned from a common prefix (XXX/16 in 
> IPv4, perhaps).  Routing in each access point is used to decide whether to 
> send the packet on its LAN, or to reflect it to another LAN.  A node's 
> preferred location would be updated by the endpoint itself, sending its 
> current location to its current access point (via ARP or some other protocol). 
> The access point that hears of a new node that it can reach tells all the 
> other access points that the node is attached to it.  Delivery of a packet to 
> a node is done by the access point that receives the packet by looking up the 
> destination IP address in its local table, and sending it to the access point 
> that currently has the destination IP address.
> 
> This is far better than "bridging" at the Ethernet level from a functionality 
> point of view - it is using routing, not bridging.  Bridging at the Ethernet 
> level uses Ethernet's STP feature, which doesn't work very well in collections 
> of wireless LAN's (it is slow to recalculate when something moves, because it 
> was designed for unplug/plug of actual cables, and moving the host from one 
> physical location to another).
> 
> IMO, Ethernet sometimes aspires to solve problems that are already well-solved 
> in the Internet protocols. (for example the 802.11s mess which tries to do a 
> mesh entirely in the Ethernet layer, and fails pretty miserably).
>
> Of course that's only my opinion, but I think it applies to overuse of 
> bridging at the Ethernet layer when there are better approaches at the next 
> layer up.

Unless you are going to have your routing tables handle every address in your 
network separately (and fix all the software that depends on broadcasts) you are 
going to have trouble trying to do this at the IP layer.

The 'modern Enterprise' datacenter has lots of large machines that get sliced 
into multiple virtual machines. For redundancy purposes you want the virtual 
machines used for a particular job to be spread across as many of these 
physical machines as possible, spread around your datacenter.

Switches in this environment are becoming layer 2 routers. They are connected 
together with multiple links providing redundant paths around the network. This 
isn't being done with Spanning Tree because Spanning Tree only allows one path 
to exist at once, and that is inefficient and creates bottlenecks. As a result, 
they are now keeping all these links live at the same time and using least cost 
paths to route the layer 2 traffic across the switches.
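
To make the contrast with Spanning Tree concrete, here is a minimal Python sketch. It assumes a made-up fabric of two leaf switches that both connect to the same four spine switches, with every link at equal cost. The switch names and the count_shortest_paths helper are hypothetical and only illustrate the idea; this is not how any particular switch implements it. Spanning Tree would leave exactly one leaf-to-leaf path forwarding, while a least-cost multipath scheme can keep every equal-cost path in use.

```python
from collections import deque


def count_shortest_paths(links, src, dst):
    """Return (hop_count, number_of_equal_cost_shortest_paths) from src to dst.

    Every link is assumed to have the same cost, so a breadth-first search
    is enough to find the least-cost paths.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    dist = {src: 0}   # hops from src to each switch
    paths = {src: 1}  # number of distinct shortest paths to each switch
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in dist:
                # First time we reach nxt: it inherits node's path count.
                dist[nxt] = dist[node] + 1
                paths[nxt] = paths[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short way of reaching nxt.
                paths[nxt] += paths[node]
    return dist.get(dst), paths.get(dst, 0)


# Hypothetical fabric: two leaf switches, each wired to the same four spines.
LINKS = [(leaf, spine)
         for leaf in ("leaf1", "leaf2")
         for spine in ("spine1", "spine2", "spine3", "spine4")]

hops, usable_paths = count_shortest_paths(LINKS, "leaf1", "leaf2")
print(f"{usable_paths} equal-cost paths of {hops} hops; Spanning Tree would use 1")
```

In this toy topology the multipath approach has four paths to spread traffic over where Spanning Tree has one.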

It's fair to argue that this is abuse of layer 2, but changing the software that 
operates at higher layers is difficult, while making these changes at layer 2 is 
completely transparent to the higher layers. That makes using this layer 2 
capability pragmatically a far better choice.

The Computer Scientist will cringe at the 'hacks' that this introduces, but 
there is far more progress made when new capabilities can be added in a way 
that's transparent to other layers of the stack than when it requires major 
changes to how things work.

The software layer is the worst to try and force fundamental changes to. You 
would be horrified to learn how old some of the software is that's running major 
jobs at large companies. Even if the software is in continuous development, the 
age of the core software frequently shows.

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25 23:57                   ` David Lang
@ 2015-01-26  1:51                     ` dpreed
  2015-01-26  2:09                       ` David Lang
  2015-01-26  2:19                       ` Dave Taht
  2015-01-26  4:25                     ` Valdis.Kletnieks
  1 sibling, 2 replies; 43+ messages in thread
From: dpreed @ 2015-01-26  1:51 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 5955 bytes --]


If you are using Ethernet bridging, your Ethernet switches are doing exactly this at the Ethernet layer... they have large tables of MAC addresses that are known throughout the network, and for each MAC address in the Enterprise, they have the next hop destination.
 
So IP routing tables, one IP address per destination in the Enterprise, would occupy no more space than do the Ethernet routing tables....  so any argument about space efficiency is mooted.
 
This is why bridging is no better than routing - you have to solve the same problem at one layer or the other. The Ethernet layer's "solution" is actually very suboptimal, especially when roaming is going on.
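
A rough Python sketch of the IP-layer roaming table described earlier in the thread may help here. It assumes each access point simply keeps a map from host IP address to the access point that host is currently attached to, and tells every peer when a host attaches. The AccessPoint class, its method names, and the addresses are all hypothetical; a real deployment would use an actual routing protocol rather than direct calls between objects. The point is only that the table has one entry per host, just as a bridge's MAC table does.

```python
class AccessPoint:
    """Toy access point that tracks host locations by IP address."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # other AccessPoint objects in the enterprise
        self.location = {}   # host IP -> name of the AP the host is attached to

    def attach(self, host_ip):
        """A host has associated with this AP: record it and tell every peer."""
        self.location[host_ip] = self.name
        for peer in self.peers:
            peer.location[host_ip] = self.name

    def next_hop(self, dst_ip):
        """Return 'local', the name of the AP to forward to, or None if unknown."""
        owner = self.location.get(dst_ip)
        if owner is None:
            return None
        return "local" if owner == self.name else owner


ap1, ap2 = AccessPoint("ap1"), AccessPoint("ap2")
ap1.peers, ap2.peers = [ap2], [ap1]

ap1.attach("10.1.2.3")            # host first appears at ap1
print(ap2.next_hop("10.1.2.3"))   # ap2 forwards towards ap1

ap2.attach("10.1.2.3")            # host roams to ap2, keeping its IP address
print(ap1.next_hop("10.1.2.3"))   # ap1 now forwards towards ap2
print(len(ap1.location))          # still one table entry for the one host
```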



[-- Attachment #2: Type: text/html, Size: 7783 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  1:51                     ` dpreed
@ 2015-01-26  2:09                       ` David Lang
  2015-01-26  4:33                         ` Valdis.Kletnieks
  2015-01-26  2:19                       ` Dave Taht
  1 sibling, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-26  2:09 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, dpreed@reed.com wrote:

> If you are using Ethernet bridging, your Ethernet switches are doing exactly 
> this at the Ethernet layer... they have large tables of MAC addresses that are 
> known throughout the network, and for each MAC address in the Enterprise, they 
> have the next hop destination.
> 
> So IP routing tables, one IP address per destination in the Enterprise, would 
> occupy no more space than do the Ethernet routing tables....  so any argument 
> about space efficiency is mooted.

The difference is that the switches and their protocols have been designed from 
the beginning for this scale of operation; IP routing protocols are designed to 
track far fewer endpoints.

> This is why bridging is no better than routing - you have to solve the same 
> problem at one layer or the other. The Ethernet layer's "solution" is actually 
> very suboptimal, especially when roaming is going on.

Well, doing it at the Ethernet layer rather than the IP layer avoids the need to 
change your software, and that's a significant win.

Other than 'tradition' or 'layering violation', why is it any better to solve 
this at the IP layer than the MAC layer?

David Lang


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  2:09                       ` David Lang
@ 2015-01-26  4:33                         ` Valdis.Kletnieks
  2015-01-26  4:44                           ` David Lang
  0 siblings, 1 reply; 43+ messages in thread
From: Valdis.Kletnieks @ 2015-01-26  4:33 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 565 bytes --]

On Sun, 25 Jan 2015 18:09:59 -0800, David Lang said:
> The difference is that the switches and their protocols have been designed from
> the beginning for this scale of operation, IP routing protocols are designed for
> much fewer endpoints to track.

Anybody who's carrying a full routing table was swallowing on the order
of 528,833 routes (as of Friday's "weekly routing table report" posted
to NANOG).  Pretty much everybody and their pet llama accepts full tables
these days.

You know anybody who's doing that many entries in an L2 Ethernet broadcast
domain?


[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  4:33                         ` Valdis.Kletnieks
@ 2015-01-26  4:44                           ` David Lang
  2015-01-27  0:14                             ` dpreed
  0 siblings, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-26  4:44 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, Valdis.Kletnieks@vt.edu wrote:

> On Sun, 25 Jan 2015 18:09:59 -0800, David Lang said:
>> The difference is that the switches and their protocols have been designed from
>> the beginning for this scale of operation, IP routing protocols are designed for
>> much fewer endpoints to track.
>
> Anybody who's carrying a full routing table was swallowing on the order
> of 528,833 routes (as of Friday's "weekly routing table report" posted
> to NANOG).  Pretty much everybody and their pet llama accepts full tables
> these days.
>
> You know anybody who's doing that many entries in an L2 Ethernet broadcast
> domain?

The full IP routing tables are something that you normally only have to deal 
with in a few devices at the perimeter of your network.

What is being talked about here is routing each /32 IP address individually 
throughout your network so that any IP address can be connected anywhere and 
have it 'just work' as far as the client on that IP is concerned.
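
As a small illustration of what a per-host /32 route means next to ordinary prefixes, here is a Python sketch using the standard ipaddress module. The route table, next-hop names, and the lookup helper are made up for the example, and a real router would use a trie or TCAM rather than a linear scan. It only shows that longest-prefix match lets a /32 entry override the covering prefix, so moving a host means rewriting one entry.

```python
import ipaddress


def lookup(routes, dst):
    """Longest-prefix match: return the next hop of the most specific route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical table: a default route, the enterprise prefix, one host route.
ROUTES = {
    "0.0.0.0/0": "border-router",
    "10.1.0.0/16": "core",
    "10.1.2.3/32": "ap-east",
}

print(lookup(ROUTES, "10.1.2.3"))   # the /32 host route wins
print(lookup(ROUTES, "10.1.9.9"))   # falls back to the /16
print(lookup(ROUTES, "192.0.2.1"))  # falls back to the default route

# The host moves: only its /32 entry changes, its address does not.
ROUTES["10.1.2.3/32"] = "ap-west"
print(lookup(ROUTES, "10.1.2.3"))
```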

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  4:44                           ` David Lang
@ 2015-01-27  0:14                             ` dpreed
  2015-01-27  0:23                               ` David Lang
  0 siblings, 1 reply; 43+ messages in thread
From: dpreed @ 2015-01-27  0:14 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 1237 bytes --]


And having every /48 MAC address in your enterprise tracked is cheaper?



[-- Attachment #2: Type: text/html, Size: 1906 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-27  0:14                             ` dpreed
@ 2015-01-27  0:23                               ` David Lang
  0 siblings, 0 replies; 43+ messages in thread
From: David Lang @ 2015-01-27  0:23 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

It doesn't get mixed in with tracking Internet routes as well.

On Mon, 26 Jan 2015, dpreed@reed.com wrote:

> And having every /48 MAC address in your enterprise tracked is cheaper?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  1:51                     ` dpreed
  2015-01-26  2:09                       ` David Lang
@ 2015-01-26  2:19                       ` Dave Taht
  2015-01-26  2:43                         ` David Lang
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Taht @ 2015-01-26  2:19 UTC (permalink / raw)
  To: David Reed; +Cc: Alexander Duyck, cerowrt-devel

Two notes:

1)

Switches all have a very fast (t)CAM based lookup for mac addresses
and vlan tags. The typical size for these is around 4096 entries per
vlan, although the next generation VXLAN standard will push this to a
lot more bits.

Routing, on the other hand, requires a lot more storage that is
difficult to search in linear time and, worse, requires that a layer
three device retain tables for ipv4, ipv6, and "other". Furthermore it
requires that every device that needs it participate in the routing
protocol - of which there are dozens - where spanning tree only has a
few variants and improvements. I don't know the extent to which

2) I am no fan of the various things I see being built on top of VXLAN
(see conga) - but it is a prevailing trend. I am a partial advocate of
moving all the routing support to the servers, and letting the
switches remain pretty dumb. There has been a lot of good work in this
area in Linux of late, as Alexander has successfully cut the cost of a
routing lookup that falls through to default from several hundred ns
to around 16ns on the high end Intel chips. I look forward to testing
that on the next round of cerowrt.

This is still a great deal slower than a switch can find the right mac
address (well, depending on how you measure it). And still needs a
commonly agreed upon routing protocol to fill the fib tables. Most
routing protocols do not fail over very quickly either, with typical
timeouts measured in 10s of seconds. On my very long todo list would
be one day trying to get babel to fail over or otherwise switch ideal
routes in under 40ms in a 10gigE environment - and even that is too
slow, and going faster would require changing the babel protocol,
which has a minimum time representation of 10ms. It would be an
interesting research project for someone to attempt high speed routing
in a data center virtual machine environment, instead of bridging.
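
For a sense of scale on those failover times, here is a back-of-the-envelope Python sketch. It assumes the worst case of a fully loaded 10 Gbit/s link whose traffic is all lost until the route changes; the bytes_in_flight helper is hypothetical and the 1500-byte frame size is just the usual Ethernet payload size used for a rough frame count.

```python
def bytes_in_flight(link_bits_per_s, outage_ms):
    """Bytes a saturated link would carry during outage_ms milliseconds."""
    # Divide by 8 for bits per byte and by 1000 for milliseconds per second.
    return link_bits_per_s * outage_ms // 8000


TEN_GIGE = 10_000_000_000  # 10 Gbit/s

# 10 ms, the 40 ms target mentioned above, and a 10 second protocol timeout.
for outage_ms in (10, 40, 10_000):
    lost = bytes_in_flight(TEN_GIGE, outage_ms)
    frames = lost // 1500
    print(f"{outage_ms:>6} ms -> {lost:,} bytes (~{frames:,} full-size frames)")
```

Under these assumptions even a 40ms failover costs on the order of 50 megabytes of traffic on a busy link.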

To your roaming point, yes this is certainly one place where migrating
bridged vms across machines breaks down, and yet more and more vm
layers are doing it. I would certainly prefer routing in this case.




-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  2:19                       ` Dave Taht
@ 2015-01-26  2:43                         ` David Lang
  2015-01-26  2:58                           ` Dave Taht
  0 siblings, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-26  2:43 UTC (permalink / raw)
  To: Dave Taht; +Cc: Alexander Duyck, cerowrt-devel

On Sun, 25 Jan 2015, Dave Taht wrote:

> To your roaming point, yes this is certainly one place where migrating
> bridged vms across machines breaks down, and yet more and more vm
> layers are doing it. I would certainly prefer routing in this case.

What's the difference between "roaming" and moving a VM from one place in the 
network to another?

As far as layer 2 vs layer 3 goes: if you try to operate at layer 3, you are 
going to need quite a bit of smarts in the endpoint, even if it's only connected 
via a single link. If you think about it, even if your network routing tables 
list every machine in our environment individually, you still have a problem of 
what gateway the endpoint uses. It would have to change every time it moved. 
Since DHCP doesn't update frequently enough to be transparent, you would need to 
have each endpoint running a routing protocol.

This can work for individual hobbyists, but not when you need to support random 
devices (how would you configure an iPhone to support this?)

Letting the layer 2 equipment deal with the traffic within the building and 
invoking layer 3 to go outside the building (or to a different security domain) 
makes a lot of sense. Even if that means that layer 2 within a building looks 
very similar to what layer 3 used to look like around a city.

Back to the topic of wifi: I'm not aware of any APs that participate in the 
switch protocols at this level. I also don't know of any reasonably priced 
switches that can do anything smarter than plain spanning tree when connected 
through multiple paths (I'd love to learn otherwise).

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  2:43                         ` David Lang
@ 2015-01-26  2:58                           ` Dave Taht
  2015-01-26  3:17                             ` dpreed
  2015-01-26  3:19                             ` David Lang
  0 siblings, 2 replies; 43+ messages in thread
From: Dave Taht @ 2015-01-26  2:58 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote:
> On Sun, 25 Jan 2015, Dave Taht wrote:
>
>> To your roaming point, yes this is certainly one place where migrating
>> bridged vms across machines breaks down, and yet more and more vm
>> layers are doing it. I would certainly prefer routing in this case.
>
>
> What's the difference between "roaming" and moving a VM from one place in
> the network to another?

I think most people think of "roaming" as moving fairly rapidly from one
piece of edge connectivity to another, and moving a vm is a great deal more
permanent operation.

> As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, you are
> going to have quite a bit of smarts in the endpoint. Even if it's only
> connected vi a single link. If you think about it, even if your network
> routing tables list every machine in our environment individually, you still
> have a problem of what gateway the endpoint uses. It would have to change
> every time it moved. Since DHCP doesn't update frequently enough to be
> transparent, you would need to have each endpoint running a routing
> protocol.

Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the routing
protocol to supply that. In terms of each vm running a routing protocol,
well, no, I would rely on the underlying bare metal OS to be doing that,
supplying the FIB tables to the overlying vms, if they need it, but otherwise
the vms just see a "default" route and don't bother with it. They do need to
inform the bare metal OS (better term for this please? hypervisor?) of what
IPs they own.

Static default gateways are evil, and easily disabled. In linux you merely
comment out the "routers" in /etc/dhcp/dhclient.conf; in openwrt, set
"defaultroute 0" for the interface fetching dhcp.

When a box migrates, it tells the hypervisor its addresses, and then that box
propagates out the route change to elsewhere.

>
> This can work for individual hobbiests, but not when you need to support
> random devices (how would you configure an iPhone to support this?)

Carefully. :)

I do note that this stuff does (or at least did) work on some of the open
source variants of android. I would rather like it if android added ipv6
tethering soon, and made it possible to mesh together multiple phones.

>
>
> Letting the layer 2 equipment deal with the traffic within the building and
> invoking layer 3 to go outside the building (or to a different security
> domain) makes a lot of sense. Even if that means that layer 2 within a
> building looks very similar to what layer 3 used to look like around a city.

Be careful what you wish for.

>
>
> back to the topic of wifi, I'm not aware of any APs that participate in the
> switch protocols at this level. I also don't know of any reasonably priced
> switches that can do anything smarter than plain spanning tree when
> connected through multiple paths (I'd love to learn otherwise)
>
> David Lang

-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  2:58                           ` Dave Taht
@ 2015-01-26  3:17                             ` dpreed
  2015-01-26  3:32                               ` David Lang
  2015-01-26  3:45                               ` Dave Taht
  2015-01-26  3:19                             ` David Lang
  1 sibling, 2 replies; 43+ messages in thread
From: dpreed @ 2015-01-26  3:17 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 4895 bytes --]


Looking up an address in a routing table is O(1) if the routing table is a hash table.  That's much more efficient than a TCAM.  My simple example just requires a delete/insert at each node's route lookup table.
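
A minimal sketch of that delete/insert idea, assuming each node keeps one exact-match entry per host in a hash table. The class name, method names, and the "ap-1"/"ap-2" next-hop labels are hypothetical and only there for illustration.

```python
class HostRouteTable:
    """Hypothetical per-node table: one exact-match entry per host address."""

    def __init__(self):
        # host address (string) -> next hop (string); a dict is a hash table
        self._next_hop = {}

    def lookup(self, host):
        # A single hash probe on average, independent of table size.
        return self._next_hop.get(host)

    def move(self, host, new_next_hop):
        # Roaming is a delete followed by an insert for the same key.
        self._next_hop.pop(host, None)
        self._next_hop[host] = new_next_hop


table = HostRouteTable()
table.move("10.0.0.7", "ap-1")   # host first appears behind ap-1
table.move("10.0.0.7", "ap-2")   # host roams to ap-2
print(table.lookup("10.0.0.7"))
```

This only covers exact host entries; covering prefixes are a separate problem.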
 
My point was about collections of WLANs bridged together.  Look at what happens (at the packet/radio layer) when a new node joins a bridged set of WLANs using STP.  It is not exactly simple to rebuild the Ethernet layer's bridge routing tables in a complex network.  And the limit of 4096 entries in many inexpensive switches is not a trivial limit.
 
Routers used to be memory-starved (a small number of KB of RAM was the norm).  Perhaps the thinking then (back before 2000) has not been revised, even though the hardware is a lot more capacious.
 
Remember, the Ethernet layer in WLANs is implemented by microcontrollers, typically not very capable ones, plus TCAMs which are pretty limited in their flexibility.
 
While it is tempting to use the "pre-packaged, proprietary" Ethernet switch functionality, routing gets you out of the binary blobs, and lets you be a lot smarter and more scalable.  Given that it does NOT cost more to do routing at the IP layer, building complex Ethernet bridging is not obviously a win.
 
BTW, TCAMs are used in IP layer switching, too, and also are used in packet filtering.  Maybe not in cheap consumer switches, but lots of Gigabit switches implement IP layer switching and filtering.  At HP, their switches routinely did all their IP layer switching entirely in TCAMs.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  3:17                             ` dpreed
@ 2015-01-26  3:32                               ` David Lang
  2015-01-26  3:45                               ` Dave Taht
  1 sibling, 0 replies; 43+ messages in thread
From: David Lang @ 2015-01-26  3:32 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, dpreed@reed.com wrote:

> Looking up an address in a routing table is o(1) if the routing table is a 
> hash table.  That's much more efficient than a TCAM.  My simple example just 
> requires a delete/insert at each node's route lookup table.
> 
> My point was about collections of WLAN's bridged together.  Look at what 
> happens (at the packet/radio layer) when a new node joins a bridged set of 
> WLANs using STP.  It is not exactly simple to rebuild the Ethernet layer's 
> bridge routing tables in a complex network.

How would it be any easier to rebuild the routing table? (even ignoring the 
question of what the devices use as their gateway)

> And the limit of 4096 entries in many inexpensive switches is not a trivial 
> limit.

Getting a similar number of ports that can all be routed is significantly more 
expensive. Yes, the mid-range switches can run layer 3 routing, but they are far 
less efficient at doing so than they are at switching.

> Routers used to be memory-starved (a small number of KB of RAM was the norm). 
> Perhaps the thinking then (back before 2000) has not been revised, even though 
> the hardware is a lot more capacious.

Well, you do have to remember that most of the routing protocols were designed 
in the days of those limits.

> Remember, the Ethernet layer in WLANs is implemented by microcontrollers, 
> typically not very capable ones, plus TCAMs which are pretty limited in their 
> flexibility.
> 
> While it is tempting to use the "pre-packaged, proprietary" Ethernet switch 
> functionality, routing gets you out of the binary blobs, and let's you be a 
> lot smarter and more scalable.

How do I run my own software on an HP switch to eliminate the binary blobs? How 
do I get similar performance on something with a dozen or more ports? From a 
theoretical point of view, you are absolutely correct, but there isn't an open 
equivalent available. This is even before you start talking about what's coded 
into the ASICs on the higher end switches, which, while they are limited in what 
they can do, will massively outperform the other options within those limits.

>  Given that it does NOT cost more to do routing 
> at the IP layer, building complex Ethernet bridging is not obviously a win.

OK, if it's not more expensive to do this, exactly how would I set it up? 
Remember that I have no ability to make any changes to the clients (iPhones, 
Android, Linux, Windows, Macs). I can't have them all running a routing protocol 
to have them figure out what gateway to use as they move from AP to AP.

Not using 'cheap' commodity switches would make it more expensive (in my case we 
invested in buying a bunch of HP switches a couple of years ago).

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  3:17                             ` dpreed
  2015-01-26  3:32                               ` David Lang
@ 2015-01-26  3:45                               ` Dave Taht
  2015-01-27  0:12                                 ` dpreed
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Taht @ 2015-01-26  3:45 UTC (permalink / raw)
  To: David Reed; +Cc: cerowrt-devel

On Sun, Jan 25, 2015 at 7:17 PM,  <dpreed@reed.com> wrote:
> Looking up an address in a routing table is o(1) if the routing table is a
> hash table.  That's much more efficient than a TCAM.  My simple example just
> requires a delete/insert at each node's route lookup table.

Regrettably it is not O(1) once you take into account the cpu cache hierarchy,
or the potential collisions you will have once you shrink the hash to
something reasonable.

Also I think you are ignoring the problem of covering routes. Say I have to
get something to a.b.c.z/32. I do a lookup of that and find nothing. I then
look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I find
a hit for the next hop. Now you can of course do a binary search for likely
subprefixes, but in any case, the search is not O(1).
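
To make the probe count concrete, here is a rough sketch of longest-prefix match built from one hash table per prefix length, probed from /32 downward as described above. It is IPv4-only, the class and next-hop names are hypothetical, and it is not how any particular router implements its FIB.

```python
import ipaddress


class PerLengthHashFib:
    """Hypothetical FIB: one hash table per IPv4 prefix length (0..32)."""

    def __init__(self):
        # index = prefix length; each dict maps network address (int) -> next hop
        self._tables = [dict() for _ in range(33)]

    def add(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        self._tables[net.prefixlen][int(net.network_address)] = next_hop

    def lookup(self, address):
        """Return (next_hop, probes), trying /32, then /31, ... down to /0."""
        addr = int(ipaddress.ip_address(address))
        probes = 0
        for length in range(32, -1, -1):
            # Keep the top `length` bits of the address, clear the rest.
            mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
            probes += 1
            hop = self._tables[length].get(addr & mask)
            if hop is not None:
                return hop, probes
        return None, probes


fib = PerLengthHashFib()
fib.add("10.1.0.0/16", "gw-a")
fib.add("0.0.0.0/0", "gw-default")
print(fib.lookup("10.1.2.3"))    # matches the /16 after probing /32 down to /16
print(fib.lookup("192.0.2.1"))   # falls through to the default route
```

Each probe is a constant-time hash lookup on average, but the number of probes depends on how long the matching prefix is, with at most 33 for IPv4.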

In terms of cache efficient data structures, a straight hash is not the way
to go; of late I have been trying to wrap my head around the hat-trie as
possibly being useful in these circumstances.

Now, if you think about limiting the domain of the problem to something
greater than the typical mac table, but less than the whole internet,
it starts looking more reasonable to have a 1:1 ratio of destination
IPs to hash table entries for lookups, but updates have to probe/change
large segments of the table in order to deal with covering prefixes.
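
A small sketch of that trade-off, assuming the table holds one entry per destination address and remembers the prefix length that produced it. The function name and next-hop labels are hypothetical. Lookups become a single probe, but installing a covering prefix has to visit every address it covers.

```python
import ipaddress


def install(host_table, prefix, next_hop):
    """Expand `prefix` into per-address entries; return how many were written.

    host_table maps IPv4Address -> (prefix_length, next_hop). An existing
    entry from a more specific (longer) prefix is left alone.
    """
    net = ipaddress.ip_network(prefix)
    touched = 0
    for addr in net:  # every address in the prefix, network/broadcast included
        current = host_table.get(addr)
        if current is None or current[0] <= net.prefixlen:
            host_table[addr] = (net.prefixlen, next_hop)
            touched += 1
    return touched


table = {}
print(install(table, "10.0.0.7/32", "ap-2"))   # one host route: 1 write
print(install(table, "10.0.0.0/24", "core"))   # covering /24: 255 writes
print(table[ipaddress.ip_address("10.0.0.7")][1])
print(table[ipaddress.ip_address("10.0.0.8")][1])
```

The write count grows with the size of the covering prefix, which is the update cost being pointed at here.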

> My point was about collections of WLAN's bridged together.  Look at what
> happens (at the packet/radio layer) when a new node joins a bridged set of
> WLANs using STP.  It is not exactly simple to rebuild the Ethernet layer's
> bridge routing tables in a complex network.  And the limit of 4096 entries
> in many inexpensive switches is not a trivial limit.

Agreed. But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN

>
>
>
> Routers used to be memory-starved (a small number of KB of RAM was the
> norm).  Perhaps the thinking then (back before 2000) has not been revised,
> even though the hardware is a lot more capacious.

The profit margins have not been revised.

I would not mind, incidentally, expanding the scope of the fqswitch project to
try to build something that would scale up at l3 farther than we've ever seen
before, however funding for needed gear like:

http://www.eetimes.com/document.asp?doc_id=1321334

and time, and fpga expertise, is lacking. I am currently distracted by
evaluating a very cool new cpu architecture
( see http://www.millcomputing.com/wiki/Memory )
and even as nifty as that is I foresee a need for a lot of dedicated packet
processing logic and memories to get into the 40GBit+ range.
>
>
> Remember, the Ethernet layer in WLANs is implemented by microcontrollers,
> typically not very capable ones, plus TCAMs which are pretty limited in
> their flexibility.

I do tend to think that the next era of SDN enabled hardware will eventually
lead to more innovation in both the control and data plane - however it
seems we are still in a "me-too" phase of development of openvswitch
(btw: there is a new software switch for linux called rocker we should look
at, and make sure runs fq_codel), and a long way from flexibly programmable
switch hardware in general.

http://openvswitch.org/pipermail/dev/2014-September/045084.html
>
>
>
> While it is tempting to use the "pre-packaged, proprietary" Ethernet switch
> functionality, routing gets you out of the binary blobs, and let's you be a
> lot smarter and more scalable.  Given that it does NOT cost more to do
> routing at the IP layer, building complex Ethernet bridging is not obviously
> a win.

SDN is certainly a way out of this mess. Eventually. But I fear we are making
all the same mistakes over again, and making slower hardware, where in the
end, it needs to be faster, to win.

>
>
> BTW, TCAMs are used in IP layer switching, too, and also are used in packet
> filtering.  Maybe not in cheap consumer switches, but lots of Gigabit
> switches implement IP layer switching and filtering.  At HP, their switches
> routinely did all their IP layer switching entirely in TCAMs.

Yep. I really wish big, fat TCAMs were standard equipment.




-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  3:45                               ` Dave Taht
@ 2015-01-27  0:12                                 ` dpreed
  2015-01-27  0:31                                   ` David Lang
  2015-01-27  0:36                                   ` Dave Taht
  0 siblings, 2 replies; 43+ messages in thread
From: dpreed @ 2015-01-27  0:12 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 10339 bytes --]


Well, we all may want to agree to disagree.  I don't buy the argument that hash tables are slow compared to the TCAMs - and even if cache misses happened, a hash table is still O(1) - you look at exactly one memory address on the average in a hash table - that's the point of it.  The constant factor is the speed of memory - not terribly slow by any means.
 
To get into this deeper would require actual measurements, of which I am a great fan.  But your handwaves are pretty unquantitative, Dave, so at best they are similar to mine.  I'm very measurement focused, being part hardware architecture guy.
 
David - my comment about HP doing layer 3 switching in TCAMs just was there to point out that there's nothing magic about layer 2.  I was not suggesting that they don't use proprietary binary blobs, because they do.  But so do the TCAM programs in layer 2 devices.
 
Dave - you are conflating the routing algorithm with its implementation technique when you focus on "prefix matching" as being hard to do.  It's not hard to invent a performant algorithm to do that combined with a hash table.  A simple way to do that is to treat the address one is looking up as several addresses (of shorter prefixes of the address).  Then look each one up separately by its hash.  It's still O(1) if you do that, just a larger constant factor. I assume you don't actually think it is optimal to do linear searches on the routing table like hosts sometimes do.  Linear search is not necessary.
 
There is literally nothing magical about looking up 48-bit random Ethernet addresses in a LAN.
 
As far as NAT'ing is concerned - that is done by the gateways.  It's possible in principle to create a distributed NAT face to an Enterprise - if you do so, then roaming within the enterprise just amounts to telling the NAT face about the new internal IP address that corresponds to the old one - an update of one address translation with another.
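
A minimal sketch of what that translation update could look like, assuming a single logical table of public-to-internal bindings. The class, its methods, and the example addresses are hypothetical, and how the table would be distributed across gateways is left out.

```python
class NatFace:
    """Hypothetical NAT table: (public ip, port) -> (internal ip, port)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, public, internal):
        self._bindings[public] = internal

    def inbound(self, public):
        # Translate a packet arriving at the public side, if a binding exists.
        return self._bindings.get(public)

    def roam(self, old_internal_ip, new_internal_ip):
        """Repoint every binding of the old internal address to the new one."""
        moved = 0
        for public, (ip, port) in list(self._bindings.items()):
            if ip == old_internal_ip:
                self._bindings[public] = (new_internal_ip, port)
                moved += 1
        return moved


nat = NatFace()
nat.bind(("198.51.100.1", 40000), ("10.0.1.7", 5000))
nat.bind(("198.51.100.1", 40001), ("10.0.1.9", 5000))
print(nat.roam("10.0.1.7", "10.0.2.7"))       # number of bindings repointed
print(nat.inbound(("198.51.100.1", 40000)))   # now reaches the new internal address
```

The public side of each binding does not change, so outside peers keep using the same address and port.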
 
This is how phones roam, by the way. They update their location via an HLR as they roam.
 



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-27  0:12                                 ` dpreed
@ 2015-01-27  0:31                                   ` David Lang
  2015-01-27  0:36                                   ` Dave Taht
  1 sibling, 0 replies; 43+ messages in thread
From: David Lang @ 2015-01-27  0:31 UTC (permalink / raw)
  To: dpreed; +Cc: cerowrt-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 9889 bytes --]

On Mon, 26 Jan 2015, dpreed@reed.com wrote:

> As far as NAT'ing is concerned - that is done by the gateways.  It's possible 
> in principle to create a distributed NAT face to an Enterprise - if you do so, 
> then roaming within the enterprise just amounts to telling the NAT face about 
> the new internal IP address that corresponds to the old one - an update of one 
> address translation with another.

Remember that the claim was that you could have the APs route, not bridge, but
let a device move from being connected to one AP to being connected to another
AP without it needing to change its IP address and without the connections
using that IP address getting broken.

How you would do this is the problem. Getting traffic to the device could be
done if you detect its movement and change your IP routing tables, but getting
data from the device is going to be harder because the device is going to keep
sending traffic to the same gateway. So you either need to pull layer 2 tricks
to get the packets to the right gateway before processing them, or you need the
new AP to handle packets sent to the IP address of the old AP. If you do NAT or
stateful packet filtering on the AP, you also need that state to get
migrated somehow.
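
A minimal sketch of the state-migration half of this problem, assuming a
site that keeps one host route per client and per-AP NAT state. The names
AccessPoint, RoutedSite and roam are hypothetical, made up for illustration;
they are not taken from CeroWrt or any real routing daemon, and the sketch
does not address the client still sending to the old gateway.

```python
class AccessPoint:
    """Hypothetical AP that routes and keeps its own NAT state."""

    def __init__(self, name):
        self.name = name
        # (client_ip, client_port) -> public port; illustrative only
        self.nat_state = {}


class RoutedSite:
    """Hypothetical site-wide table of host routes (client IP -> AP)."""

    def __init__(self):
        self.host_routes = {}

    def attach(self, client_ip, ap):
        self.host_routes[client_ip] = ap

    def roam(self, client_ip, new_ap, migrate_state=True):
        old_ap = self.host_routes.get(client_ip)
        # Step 1: repoint the host route so inbound traffic finds the client.
        self.host_routes[client_ip] = new_ap
        # Step 2: move the per-client NAT entries, or existing flows break.
        if migrate_state and old_ap is not None and old_ap is not new_ap:
            keys = [k for k in old_ap.nat_state if k[0] == client_ip]
            for key in keys:
                new_ap.nat_state[key] = old_ap.nat_state.pop(key)
```

With migrate_state=False the route follows the client but the NAT entries
stay behind on the old AP, which is the failure described above.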

> This is how phones roam, by the way. They update their location via an HLR as 
> they roam.

The phones get a new IP address as they roam and break existing connections,
don't they? The software either gets a notification that the network has changed
and connects again, or the connections end up timing out. Right?

David Lang

>
>
> On Sunday, January 25, 2015 10:45pm, "Dave Taht" <dave.taht@gmail.com> said:
>
>
>
>> On Sun, Jan 25, 2015 at 7:17 PM, <dpreed@reed.com> wrote:
>> > Looking up an address in a routing table is o(1) if the routing table is a
>> > hash table. That's much more efficient than a TCAM. My simple example just
>> > requires a delete/insert at each node's route lookup table.
>> 
>> Regrettably it is not O(1) once you take into account the cpu cache hierarchy,
>> or the potential collisions you will have once you shrink the hash to
>> something reasonable.
>> 
>> Also I think you are ignoring the problem of covering routes. Say I have to
>> get something to a.b.c.z/32. I do a lookup of that and find nothing. I then
>> look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I find
>> a hit for the next hop. Now you can of course do a binary search for likely
>> subprefixes, but in any case, the search is not O(1).
>> 
>> In terms of cache efficient data structures, a straight hash is not the way
>> to go, of late I have been trying to wrap my head around the hat-trie as
>> possibly being useful in these circumstances.
>> 
>> Now, if you think about limiting the domain of the problem to something
>> greater than the typical mac table, but less than the whole internet,
>> it starts looking more reasonable to have a 1x1 ratio of destination
>> IPs to hash table entries for lookups, but updates have to probe/change
>> large segments of the table in order to deal with covering prefixes.
>> 
>> > My point was about collections of WLAN's bridged together. Look at what
>> > happens (at the packet/radio layer) when a new node joins a bridged set of
>> > WLANs using STP. It is not exactly simple to rebuild the Ethernet layer's
>> > bridge routing tables in a complex network. And the limit of 4096 entries
>> > in many inexpensive switches is not a trivial limit.
>> 
>> Agreed. But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN
>> 
>> >
>> >
>> >
>> > Routers used to be memory-starved (a small number of KB of RAM was the
>> > norm). Perhaps the thinking then (back before 2000) has not been revised,
>> > even though the hardware is a lot more capacious.
>> 
>> The profit margins have not been revised.
>> 
>> I would not mind, incidentally expanding the scope of the fqswitch project ot
>> try to build something that would scale up at l3 farther than we've ever seen
>> before, however funding for needed gear like:
>> 
>> http://www.eetimes.com/document.asp?doc_id=1321334
>> 
>> and time, and fpga expertise, is lacking. I am currently distracted by
>> evaluating
>> a very cool new cpu architecture ( see
>> http://www.millcomputing.com/wiki/Memory )
>> and even as nifty as that is I foresee a need for a lot of dedicated packet
>> processing logic and memories to get into the 40GBit+ range.
>> >
>> >
>> > Remember, the Ethernet layer in WLANs is implemented by microcontrollers,
>> > typically not very capable ones, plus TCAMs which are pretty limited in
>> > their flexibility.
>> 
>> I do tend to think that the next era of SDN enabled hardware will eventually
>> lead to more innovation in both the control and data plane - however it
>> seems we are still in a "me-too" phase
>> of development of openvswitch (btw: there is a new software switch for
>> linux called rocker we should look at, and make sure runs fq_codel), and
>> a long way from flexibly programmable switch hardware in general.
>> 
>> http://openvswitch.org/pipermail/dev/2014-September/045084.html
>> >
>> >
>> >
>> > While it is tempting to use the "pre-packaged, proprietary" Ethernet switch
>> > functionality, routing gets you out of the binary blobs, and let's you be a
>> > lot smarter and more scalable. Given that it does NOT cost more to do
>> > routing at the IP layer, building complex Ethernet bridging is not obviously
>> > a win.
>> 
>> SDN is certainly a way out of this mess. Eventually. But I fear we are making
>> all the same mistakes over again, and making slower hardware, where in the
>> end, it needs to be faster, to win.
>> 
>> >
>> >
>> > BTW, TCAMs are used in IP layer switching, too, and also are used in packet
>> > filtering. Maybe not in cheap consumer switches, but lots of Gigabit
>> > switches implement IP layer switching and filtering. At HP, their switches
>> > routinely did all their IP layer switching entirely in TCAMs.
>> 
>> Yep. I really wish big, fat TCAMS were standard equipment.
>> 
>> >
>> >
>> > On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com>
>> said:
>> >
>> >> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote:
>> >> > On Sun, 25 Jan 2015, Dave Taht wrote:
>> >> >
>> >> >> To your roaming point, yes this is certainly one place where
>> migrating
>> >> >> bridged vms across machines breaks down, and yet more and more
>> vm
>> >> >> layers are doing it. I would certainly prefer routing in this
>> case.
>> >> >
>> >> >
>> >> > What's the difference between "roaming" and moving a VM from one
>> place
>> >> > in
>> >> > the network to another?
>> >>
>> >> I think most people think of "roaming" as moving fairly rapidly from one
>> >> piece of edge connectivity to another, and moving a vm is a great deal
>> >> more
>> >> permanent operation.
>> >>
>> >> > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3,
>> you
>> >> > are
>> >> > going to have quite a bit of smarts in the endpoint. Even if it's
>> only
>> >> > connected vi a single link. If you think about it, even if your
>> network
>> >> > routing tables list every machine in our environment individually,
>> you
>> >> > still
>> >> > have a problem of what gateway the endpoint uses. It would have to
>> >> > change
>> >> > every time it moved. Since DHCP doesn't update frequently enough to
>> be
>> >> > transparent, you would need to have each endpoint running a routing
>> >> > protocol.
>> >>
>> >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the
>> >> routing
>> >> protocol to supply that. In terms of each vm running a routing protocol,
>> >> well, no, I would rely on the underlying bare metal OS to be doing
>> >> that, supplying
>> >> the FIB tables to the overlying vms, if they need it, but otherwise the
>> >> vms
>> >> just see a "default" route and don't bother with it. They do need to
>> >> inform the
>> >> bare metal OS (better term for this please? hypervisor?) of what IPs
>> they
>> >> own.
>> >>
>> >> static default gateways are evil. and easily disabled. in linux you
>> >> merely comment
>> >> out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set
>> >> "defaultroute 0" for the
>> >> interface fetching dhcp.
>> >>
>> >> When a box migrates, it tells the hypervisor it's addresses, and then
>> that
>> >> box
>> >> propagates out the route change to elsewhere.
>> >>
>> >> >
>> >> > This can work for individual hobbiests, but not when you need to
>> support
>> >> > random devices (how would you configure an iPhone to support this?)
>> >>
>> >> Carefully. :)
>> >>
>> >> I do note that this stuff does (or at least did) work on some of the
>> open
>> >> source variants of android. I would rather like it if android added ipv6
>> >> tethering soon, and made it possible to mesh together multiple phones.
>> >>
>> >> >
>> >> >
>> >> > Letting the layer 2 equipment deal with the traffic within the
>> building
>> >> > and
>> >> > invoking layer 3 to go outside the building (or to a different
>> security
>> >> > domain) makes a lot of sense. Even if that means that layer 2 within
>> a
>> >> > building looks very similar to what layer 3 used to look like around
>> a
>> >> > city.
>> >>
>> >> Be careful what you wish for.
>> >>
>> >> >
>> >> >
>> >> > back to the topic of wifi, I'm not aware of any APs that participate
>> in
>> >> > the
>> >> > switch protocols at this level. I also don't know of any reasonably
>> >> > priced
>> >> > switches that can do anything smarter than plain spanning tree when
>> >> > connected through multiple paths (I'd love to learn otherwise)
>> >> >
>> >> > David Lang
>> >>
>> >>
>> >>
>> >> --
>> >> Dave Täht
>> >>
>> >> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
>> >>
>> 
>> 
>> 
>> --
>> Dave Täht
>> 
>> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
>>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-27  0:12                                 ` dpreed
  2015-01-27  0:31                                   ` David Lang
@ 2015-01-27  0:36                                   ` Dave Taht
  1 sibling, 0 replies; 43+ messages in thread
From: Dave Taht @ 2015-01-27  0:36 UTC (permalink / raw)
  To: David Reed, Jesper Dangaard Brouer; +Cc: cerowrt-devel

Jesper now cc'd.

On Tue, Jan 27, 2015 at 1:12 PM,  <dpreed@reed.com> wrote:
> Well, we all may want to agree to disagree.  I don't buy the argument that
> hash tables are slow compared to the TCAMs - and even if cache misses
> happened, a hash table is still o(1) - you look at exactly one memory
> address on the average in a hash table - that's the point of it.  The
> constant factor is the speed of memory - not terribly slow by any means.
>
>
>
> To get into this deeper would require actual measurements, of which I am a
> great fan.  But your handwaves are pretty unquantitative, Dave, so at best
> they are similar to mine.  I'm very measurement focused, being part hardware
> architecture guy.

Two of the people doing serious optimization and measurement of linux
network behavior are now cc'd. (tho they might want to read back on
the thread). Jesper, in particular, has been working on speeding up
10GigE in preparation for 100GigE and gave a great preso at lca:

http://lwn.net/Articles/629155/ (see slides, video)

Related to that was:

http://lwn.net/Articles/629152/

And Alexander has been working specifically on dramatically improving
routing cache lookups. He tells me:

"The amount of gain seen will vary based on the routing configuration
of each system.  The biggest gain in all of this is that the
prefix-matching/backtrace portion of the look-up was reduced from
O(N^2) to O(N).  So on my test systems that were configured with
rather large tries I saw a reduction from 380ns to 16ns for performing
a prefix-match/backtrace.  What this means for most end users is that
anything that falls back to the default route on the system should
take significantly less time for look-up in the fib tables.

A hit at depth 7 in my trie costs about 31ns, though I think that
might be a cache warm hit versus a cache cold look-up.  Though you
have to keep in mind if you are dealing with a routing table that is
the main trie, and not the local trie.  That means to get a "hit" on
any route you are still going to have to have a failed look-up in the
local trie first and the size of that trie depends on the number of
local addresses you have configured on the system.

My second set of patches should cut that by about 25% to 50% since I
am dropping a couple of unnecessary items from the look-up process and
compressing things so that the pointer to the next tnode and the key
info for that tnode should always be in the same cache-line.

The second set of patches [if they work out] should reduce the
cache utilization by up to half.  Basically it consists of pushing the
key and key information up to the same cache-line that pointer for the
tnode/leaf lives on.  However I have to sort out some RCU ugliness
that adds since I have to RCU protect the key information."

My main complaint about the work so far is that no-one has been
measuring the total system costs (and latency) of the time it takes
from when a packet enters the system to the time it departs. I am
pretty sure that immense call path could be additionally optimized...
(last I recall it transited a minimum of 34 functions)

My secondary complaint is all the work is being tested on 64 bit
hardware with huge caches.

Somewhat relevant to that:

It has long been my hope that we would see per-packet timestamping
become the default at ingress (from the card or host application's
interaction with the stack), and then merely checked at egress through
codel, rather than the fq_codel queue merely measuring itself.
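
A minimal sketch of that idea, assuming a single FIFO between ingress and
egress. Packet, TimestampedPath and the injectable clock are hypothetical
names for illustration; this is not codel and not kernel code, it only
shows the measurement: stamp once at ingress, subtract once at egress.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    payload: bytes
    ingress_ns: int = 0  # stamped once, when the packet enters the system


class TimestampedPath:
    """Hypothetical path: stamp at ingress, read the stamp at egress."""

    def __init__(self, clock=time.monotonic_ns):
        self.clock = clock  # injectable so the sketch can be tested
        self.queue = deque()

    def ingress(self, payload):
        pkt = Packet(payload, self.clock())
        self.queue.append(pkt)
        return pkt

    def egress(self):
        pkt = self.queue.popleft()
        # Total time in the system, not just time spent in one queue.
        sojourn_ns = self.clock() - pkt.ingress_ns
        return pkt, sojourn_ns
```

An AQM at egress could then compare sojourn_ns against its target instead
of timing only its own queue.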

>
>
> David - my comment about HP doing layer 3 switching in TCAMs just was there
> to point out that there's nothing magic about layer 2.  I was not suggesting
> that they don't use proprietary binary blobs, because they do.  But so do
> the TCAM programs in layer 2 devices.
>
>
>
> Dave - you are conflating the implementation technique of the routing
> algorithm when you focus on "prefix matching" as being hard to do.  It's not
> hard to invent a performant algorithm to do that combined with a hash table.
> A simple way to do that is to treat the address one is looking up as several
> addresses (of shorter prefixes of the address).  Then look each one up
> separately by its hash.  Its still o(1) if you do that, just a larger
> constant factor. I assume you don't actually think it is optimal to do
> linear searches on the routing table like hosts sometimes do.  Linear search
> is not necessary.

I am tracking Alexander's fine work closely. See recent commits to the
net-next tree.
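
For reference, a minimal sketch of the scheme David describes (one hash
table per prefix length, probed from most to least specific), assuming
IPv4 only. HashLpmTable and its methods are hypothetical names; this is
not how the Linux fib trie works, and it says nothing about cache
behaviour, which is the part in dispute.

```python
import ipaddress


class HashLpmTable:
    """Longest-prefix match with one dict per prefix length (IPv4 only)."""

    BITS = 32

    def __init__(self):
        # prefix length -> {network address as int -> next hop}
        self.tables = {}

    def add_route(self, cidr, next_hop):
        net = ipaddress.ip_network(cidr)
        table = self.tables.setdefault(net.prefixlen, {})
        table[int(net.network_address)] = next_hop

    def lookup(self, addr):
        value = int(ipaddress.ip_address(addr))
        # One hash probe per populated prefix length, longest first.
        for plen in sorted(self.tables, reverse=True):
            mask = ((1 << plen) - 1) << (self.BITS - plen)
            hop = self.tables[plen].get(value & mask)
            if hop is not None:
                return hop
        return None
```

The number of probes is bounded by the number of distinct prefix lengths
in use (at most 33 for IPv4), which is the "larger constant factor"; a
miss at every length is the worst case.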
>
>
> There is literally nothing magical about looking up 48-bit random Ethernet
> addresses in a LAN.

The difference between 48 bits and 128 bits is quite large.

>
>
> As far as NAT'ing is concerned - that is done by the gateways.  It's
> possible in principle to create a distributed NAT face to an Enterprise - if
> you do so, then roaming within the enterprise just amounts to telling the
> NAT face about the new internal IP address that corresponds to the old one -
> an update of one address translation with another.
>
>
>
> This is how phones roam, by the way. They update their location via an HLR
> as they roam.
>
>
>
>
>
> On Sunday, January 25, 2015 10:45pm, "Dave Taht" <dave.taht@gmail.com> said:
>
>> On Sun, Jan 25, 2015 at 7:17 PM, <dpreed@reed.com> wrote:
>> > Looking up an address in a routing table is o(1) if the routing table is
>> > a
>> > hash table. That's much more efficient than a TCAM. My simple example
>> > just
>> > requires a delete/insert at each node's route lookup table.
>>
>> Regrettably it is not O(1) once you take into account the cpu cache
>> hierarchy,
>> or the potential collisions you will have once you shrink the hash to
>> something reasonable.
>>
>> Also I think you are ignoring the problem of covering routes. Say I have
>> to
>> get something to a.b.c.z/32. I do a lookup of that and find nothing. I
>> then
>> look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I
>> find
>> a hit for the next hop. Now you can of course do a binary search for
>> likely
>> subprefixes, but in any case, the search is not O(1).
>>
>> In terms of cache efficient data structures, a straight hash is not the
>> way
>> to go, of late I have been trying to wrap my head around the hat-trie as
>> possibly being useful in these circumstances.
>>
>> Now, if you think about limiting the domain of the problem to something
>> greater than the typical mac table, but less than the whole internet,
>> it starts looking more reasonable to have a 1x1 ratio of destination
>> IPs to hash table entries for lookups, but updates have to probe/change
>> large segments of the table in order to deal with covering prefixes.
>>
>> > My point was about collections of WLAN's bridged together. Look at what
>> > happens (at the packet/radio layer) when a new node joins a bridged set
>> > of
>> > WLANs using STP. It is not exactly simple to rebuild the Ethernet
>> > layer's
>> > bridge routing tables in a complex network. And the limit of 4096
>> > entries
>> > in many inexpensive switches is not a trivial limit.
>>
>> Agreed. But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN
>>
>> >
>> >
>> >
>> > Routers used to be memory-starved (a small number of KB of RAM was the
>> > norm). Perhaps the thinking then (back before 2000) has not been
>> > revised,
>> > even though the hardware is a lot more capacious.
>>
>> The profit margins have not been revised.
>>
>> I would not mind, incidentally expanding the scope of the fqswitch project
>> ot
>> try to build something that would scale up at l3 farther than we've ever
>> seen
>> before, however funding for needed gear like:
>>
>> http://www.eetimes.com/document.asp?doc_id=1321334
>>
>> and time, and fpga expertise, is lacking. I am currently distracted by
>> evaluating
>> a very cool new cpu architecture ( see
>> http://www.millcomputing.com/wiki/Memory )
>> and even as nifty as that is I foresee a need for a lot of dedicated
>> packet
>> processing logic and memories to get into the 40GBit+ range.
>> >
>> >
>> > Remember, the Ethernet layer in WLANs is implemented by
>> > microcontrollers,
>> > typically not very capable ones, plus TCAMs which are pretty limited in
>> > their flexibility.
>>
>> I do tend to think that the next era of SDN enabled hardware will
>> eventually
>> lead to more innovation in both the control and data plane - however it
>> seems we are still in a "me-too" phase
>> of development of openvswitch (btw: there is a new software switch for
>> linux called rocker we should look at, and make sure runs fq_codel), and
>> a long way from flexibly programmable switch hardware in general.
>>
>> http://openvswitch.org/pipermail/dev/2014-September/045084.html
>> >
>> >
>> >
>> > While it is tempting to use the "pre-packaged, proprietary" Ethernet
>> > switch
>> > functionality, routing gets you out of the binary blobs, and let's you
>> > be a
>> > lot smarter and more scalable. Given that it does NOT cost more to do
>> > routing at the IP layer, building complex Ethernet bridging is not
>> > obviously
>> > a win.
>>
>> SDN is certainly a way out of this mess. Eventually. But I fear we are
>> making
>> all the same mistakes over again, and making slower hardware, where in the
>> end, it needs to be faster, to win.
>>
>> >
>> >
>> > BTW, TCAMs are used in IP layer switching, too, and also are used in
>> > packet
>> > filtering. Maybe not in cheap consumer switches, but lots of Gigabit
>> > switches implement IP layer switching and filtering. At HP, their
>> > switches
>> > routinely did all their IP layer switching entirely in TCAMs.
>>
>> Yep. I really wish big, fat TCAMS were standard equipment.
>>
>> >
>> >
>> > On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com>
>> said:
>> >
>> >> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote:
>> >> > On Sun, 25 Jan 2015, Dave Taht wrote:
>> >> >
>> >> >> To your roaming point, yes this is certainly one place where
>> migrating
>> >> >> bridged vms across machines breaks down, and yet more and more
>> vm
>> >> >> layers are doing it. I would certainly prefer routing in this
>> case.
>> >> >
>> >> >
>> >> > What's the difference between "roaming" and moving a VM from one
>> place
>> >> > in
>> >> > the network to another?
>> >>
>> >> I think most people think of "roaming" as moving fairly rapidly from
>> >> one
>> >> piece of edge connectivity to another, and moving a vm is a great deal
>> >> more
>> >> permanent operation.
>> >>
>> >> > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3,
>> you
>> >> > are
>> >> > going to have quite a bit of smarts in the endpoint. Even if it's
>> only
>> >> > connected vi a single link. If you think about it, even if your
>> network
>> >> > routing tables list every machine in our environment individually,
>> you
>> >> > still
>> >> > have a problem of what gateway the endpoint uses. It would have to
>> >> > change
>> >> > every time it moved. Since DHCP doesn't update frequently enough to
>> be
>> >> > transparent, you would need to have each endpoint running a routing
>> >> > protocol.
>> >>
>> >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the
>> >> routing
>> >> protocol to supply that. In terms of each vm running a routing
>> >> protocol,
>> >> well, no, I would rely on the underlying bare metal OS to be doing
>> >> that, supplying
>> >> the FIB tables to the overlying vms, if they need it, but otherwise the
>> >> vms
>> >> just see a "default" route and don't bother with it. They do need to
>> >> inform the
>> >> bare metal OS (better term for this please? hypervisor?) of what IPs
>> they
>> >> own.
>> >>
>> >> static default gateways are evil. and easily disabled. in linux you
>> >> merely comment
>> >> out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set
>> >> "defaultroute 0" for the
>> >> interface fetching dhcp.
>> >>
>> >> When a box migrates, it tells the hypervisor it's addresses, and then
>> that
>> >> box
>> >> propagates out the route change to elsewhere.
>> >>
>> >> >
>> >> > This can work for individual hobbiests, but not when you need to
>> support
>> >> > random devices (how would you configure an iPhone to support this?)
>> >>
>> >> Carefully. :)
>> >>
>> >> I do note that this stuff does (or at least did) work on some of the
>> open
>> >> source variants of android. I would rather like it if android added
>> >> ipv6
>> >> tethering soon, and made it possible to mesh together multiple phones.
>> >>
>> >> >
>> >> >
>> >> > Letting the layer 2 equipment deal with the traffic within the
>> building
>> >> > and
>> >> > invoking layer 3 to go outside the building (or to a different
>> security
>> >> > domain) makes a lot of sense. Even if that means that layer 2 within
>> a
>> >> > building looks very similar to what layer 3 used to look like around
>> a
>> >> > city.
>> >>
>> >> Be careful what you wish for.
>> >>
>> >> >
>> >> >
>> >> > back to the topic of wifi, I'm not aware of any APs that participate
>> in
>> >> > the
>> >> > switch protocols at this level. I also don't know of any reasonably
>> >> > priced
>> >> > switches that can do anything smarter than plain spanning tree when
>> >> > connected through multiple paths (I'd love to learn otherwise)
>> >> >
>> >> > David Lang
>> >>
>> >>
>> >>
>> >> --
>> >> Dave Täht
>> >>
>> >> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
>> >>
>>
>>
>>
>> --
>> Dave Täht
>>
>> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
>>



-- 
Dave Täht

thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  2:58                           ` Dave Taht
  2015-01-26  3:17                             ` dpreed
@ 2015-01-26  3:19                             ` David Lang
  1 sibling, 0 replies; 43+ messages in thread
From: David Lang @ 2015-01-26  3:19 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, Dave Taht wrote:

> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote:
>> On Sun, 25 Jan 2015, Dave Taht wrote:
>>
>>> To your roaming point, yes this is certainly one place where migrating
>>> bridged vms across machines breaks down, and yet more and more vm
>>> layers are doing it. I would certainly prefer routing in this case.
>>
>>
>> What's the difference between "roaming" and moving a VM from one place in
>> the network to another?
>
> I think most people think of "roaming" as moving fairly rapidly from one
> piece of edge connectivity to another, and moving a vm is a great deal more
> permanent operation.

There are two different types of roaming.

You have the case like I deal with at SCaLE, where you are moving within one
network (within one site).

Then you have the case where you are moving between sites.

Within one site, roaming and migrating VMs are pretty much the same problem, and
handling it at layer 2 makes a lot of sense (how frequently the migrations
happen, and how permanent they are, varies, both for wifi nodes and VMs).

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25 23:57                   ` David Lang
  2015-01-26  1:51                     ` dpreed
@ 2015-01-26  4:25                     ` Valdis.Kletnieks
  2015-01-26  4:39                       ` David Lang
  1 sibling, 1 reply; 43+ messages in thread
From: Valdis.Kletnieks @ 2015-01-26  4:25 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 421 bytes --]

On Sun, 25 Jan 2015 15:57:01 -0800, David Lang said:

> The Computer Scientist will cringe at the 'hacks' that this introduces, but
> there is far more progress made when new capabilities can be added in a way
> that's transparent to other layers of the stack then when it requires major
> changes to how things work.

Otherwise known as the "Just throw an F5 in front of the whole mess" school
of network design... :)



[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  4:25                     ` Valdis.Kletnieks
@ 2015-01-26  4:39                       ` David Lang
  2015-01-26 16:42                         ` Valdis.Kletnieks
  0 siblings, 1 reply; 43+ messages in thread
From: David Lang @ 2015-01-26  4:39 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: cerowrt-devel

On Sun, 25 Jan 2015, Valdis.Kletnieks@vt.edu wrote:

>> The Computer Scientist will cringe at the 'hacks' that this introduces, but
>> there is far more progress made when new capabilities can be added in a way
>> that's transparent to other layers of the stack then when it requires major
>> changes to how things work.
>
> Otherwise known as the "Just throw an F5 in front of the whole mess" school
> of network design... :)

Much as you may hate the abuse of standards and protocols that F5 and other load
balancers use to trick both clients and servers into operating without knowing
that there are multiple machines serving a website, they do make things a lot
better than if you tried to make a website reliable and scalable without them.

"theoretically better" is trumped by "it works" any day. For something that's 
theoretically better to win it needs to be implemented and be better in practice 
as well.

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-26  4:39                       ` David Lang
@ 2015-01-26 16:42                         ` Valdis.Kletnieks
  0 siblings, 0 replies; 43+ messages in thread
From: Valdis.Kletnieks @ 2015-01-26 16:42 UTC (permalink / raw)
  To: David Lang; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 474 bytes --]

On Sun, 25 Jan 2015 20:39:23 -0800, David Lang said:

> Much as you may hate the abuse of standards and protocols that F5 and other load
> balancers use to trick both clients and servers into operating without knowing
> that there are multiple machines serving a website, they do make things a lot
> more better than if you tried make a website reliable and scale without them.

Oh, I'm fully aware of that.  We have several of the beasts across the
hall from my office. :)

[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-22 18:19           ` Richard Smith
  2015-01-22 22:09             ` David Lang
  2015-01-24 14:59             ` dpreed
@ 2015-01-25  8:07             ` Outback Dingo
  2015-01-30 16:14               ` Richard Smith
  2 siblings, 1 reply; 43+ messages in thread
From: Outback Dingo @ 2015-01-25  8:07 UTC (permalink / raw)
  To: Richard Smith; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 4520 bytes --]

My first and only thought is oversaturation of your network. I don't see
any enterprise-grade APs listed for your 30+ users. How many connections and
how many users? Are they all trying to download/move data at the same time?

On Fri, Jan 23, 2015 at 5:19 AM, Richard Smith <smithbone@gmail.com> wrote:

> On 01/22/2015 04:18 AM, David Lang wrote:
>
>  Recently, we picked up the 11th floor as well and moved many people up
>>> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on
>>> a free channel with a different ESSID.
>>>
>>
>> I like to put all the APs on the same ESSID so that people can roam
>> between them. This requires that the APs act as bridges to a dedicated
>> common network, not as routers.
>>
>
> That's the ultimate plan but for convenience of being able to easily
> select what AP I'm talking to or to be able to tell folks to move from one
> to another I've got them on different ESSIDs.  It also helps me keep track
> of what RF channel things are on.
>
>  Then about a week before my original post I got notified that Internet
>>> was down.  Both 10th floor APs had stopped working.  The 11th floor
>>> (where I am) was still working.   On the 10th floor, I could connect
>>> to the  TP-link via its IP address on its wired interface but it did
>>> not seem to be passing wireless traffic. A reboot fixed it.
>>>
>>
>> There has been an ongoing bug with Apple devices on 5Ghz that causes the
>> wifi chipset to lockup. We think we've fixed it in the current Cerowrt,
>> but I don't know what kernel versions have this problem. This is likely
>> to affect multiple vendors who use the same chipset (check the openwrt
>> hardware list for details of the chipsets in each model)
>>
>
> Oooohhh!  That could be it. We have a _lot_ of Apple devices.  Most of the
> company uses MacBook,or Air and a large number of people have iPhones and
> we use iPods for some of our testing.   I'll go dig through the openWRT and
> get the details.
>
>  The WNDR3700 was completely unresponsive both via WiFi and when I
>>> tried its IP connected directly to it's switch with a Cat-5.  I also
>>> have a serial port mod on that wndr3700 so I connected up to that
>>> instead.
>>>
>>
>> hmm, it's not common to have it be unresponsive on the wired network.
>>
>
> It's uncommon to me. :)  This unit has travelled with me for years while I
> worked for OLPC and its see a lot of different wireless environments.
>  Granted never one with this many apple clients.  Usually 7-8 Linux/Windows
> machines and a pile of XOs.
>
> So this happened a lot at your SCALE setups?
>
>  room. All the stations are in about a 40 foot radius and all but 1 or
>>> 2 have line of sight to the AP.  The wndr3700 is in a closet on the
>>> side of the room with other equipment so it might be 80 feet away from
>>> the furthest station or so.
>>>
>>
>> this doesn't sound unreasonable unless your users are trying to use a
>> LOT of bandwidth (although the fact that you refer to the 50Mb
>> bottleneck indicates that you may be)
>>
>
> The bottleneck was just a nice side effect.  We don't use that much
> traffic.  I only noticed the limit once I started running netperf-wrapper
> tests from a wired host.
>
> Occasionally there will be some big download that eats up bandwidth, but
> when I watch the throughput during the day we peak up into the 40Mbps range
> while the average is < 10Mbps (Download).
>
>>> Can I perhaps approximate signal strength by looking at the bitrate
>>> for packets that station sends?  The theory being that higher quality
>>> RF links should use the higher bitrate encodings when sending.
>>>
>>
>> not reliably, too many other things factor in to that.
>>
>
> Indeed. Horst tells me I basically have 2 rates happening on the tplink:
> 6Mbps and 24Mbps, with a few 12Mbps in there.
>
>>> If need be I can move the wndr to the same location as the tplink and
>>> then have stations connect to the wndr so I can watch the rx signal
>>> strength.
>>>
>>
> Looks like that's what I'll have to do.
>
>> There is a lot of room with consumer grade equipment from where you
>> currently are. The "Enterprise Grade" systems do have a lot of
>> infrastructure to coordinate the different APs.
>>
>
> Thanks for the ray of hope.  Yeah I don't need all the multi-AP
> coordination handoff stuff.
>
> --
> Richard A. Smith
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>


* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic?
  2015-01-25  8:07             ` Outback Dingo
@ 2015-01-30 16:14               ` Richard Smith
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Smith @ 2015-01-30 16:14 UTC (permalink / raw)
  To: Outback Dingo, Richard Smith; +Cc: cerowrt-devel

On 01/25/2015 03:07 AM, Outback Dingo wrote:

> My first and only thought is oversaturation on your network. I
> don't see any enterprise grade APs listed for 30+ users. How
> many connections and how many users? Are they all trying to
> download/move data at the same time?

Looking at the leases file we have 90 devices getting IPs.  That's about 
right for the 30 or so people + all the other devices connected.
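
As a rough illustration of that kind of count (not the exact method used here), the sketch below tallies unique MAC addresses. It assumes a dnsmasq-style leases file with one IPv4 lease per line in the form `expiry mac ip hostname client-id`; the function name and the sample data are hypothetical.

```python
def count_leases(text):
    """Return the set of unique MAC addresses in dnsmasq-style lease text.

    Assumption: each IPv4 lease line looks like
    "<expiry> <mac> <ip> <hostname> <client-id>".
    """
    macs = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or short lines and the "duid" line dnsmasq writes for DHCPv6.
        if len(fields) < 3 or fields[0] == "duid":
            continue
        # Normalise case so the same device is not counted twice.
        macs.add(fields[1].lower())
    return macs


# Hypothetical sample data; a real file would be read from the DHCP server.
sample = """\
1422635640 00:11:22:33:44:55 192.168.1.10 macbook-1 01:00:11:22:33:44:55
1422635700 66:77:88:99:aa:bb 192.168.1.11 iphone-2 *
1422635700 66:77:88:99:AA:BB 192.168.1.12 iphone-2 *
"""

print(len(count_leases(sample)))
```

Counting unique MACs rather than lines avoids double-counting a device that holds more than one lease.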

The users are split up now on 3 APs all on different channels.

1 AP  on the 11th floor: (tplink stock) 16-20 clients.
2 APs on the 10th floor: (tplink stock and Wndr3700v2 OpenWRT)
  each 10th floor AP has 10ish clients

You can see all APs from both floors, but the AP not on the floor with 
you has a pretty low signal.  Low but still usable.

From watching what's going on at the radiotap level via horst I don't 
see a very high level of utilisation but I've still not been able to 
catch things in the act of a total fail yet.

-- 
Richard A. Smith


Thread overview: 43+ messages
2015-01-14  2:20 [Cerowrt-devel] Recording RF management info _and_ associated traffic? Richard Smith
2015-01-20 16:59 ` Rich Brown
2015-01-21 23:40   ` Richard Smith
2015-01-21 23:58     ` David Lang
2015-01-22  9:04       ` Richard Smith
2015-01-22  9:18         ` David Lang
2015-01-22 18:19           ` Richard Smith
2015-01-22 22:09             ` David Lang
2015-01-22 22:55               ` Roman Toledo Casabona
2015-01-24 14:59             ` dpreed
2015-01-24 15:30               ` Kelvin Edmison
2015-01-25  4:35               ` David Lang
2015-01-25  5:02                 ` Dave Taht
2015-01-25  5:04                   ` Dave Taht
2015-01-25  6:44                   ` David Lang
2015-01-25  7:06                     ` David Lang
     [not found]                     ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>
2015-01-25  7:59                       ` Dave Taht
2015-01-25  9:39                       ` David Lang
2015-01-25 15:03                         ` Chuck Anderson
2015-01-25 20:17                 ` dpreed
2015-01-25 23:21                   ` Aaron Wood
2015-01-25 23:57                   ` David Lang
2015-01-26  1:51                     ` dpreed
2015-01-26  2:09                       ` David Lang
2015-01-26  4:33                         ` Valdis.Kletnieks
2015-01-26  4:44                           ` David Lang
2015-01-27  0:14                             ` dpreed
2015-01-27  0:23                               ` David Lang
2015-01-26  2:19                       ` Dave Taht
2015-01-26  2:43                         ` David Lang
2015-01-26  2:58                           ` Dave Taht
2015-01-26  3:17                             ` dpreed
2015-01-26  3:32                               ` David Lang
2015-01-26  3:45                               ` Dave Taht
2015-01-27  0:12                                 ` dpreed
2015-01-27  0:31                                   ` David Lang
2015-01-27  0:36                                   ` Dave Taht
2015-01-26  3:19                             ` David Lang
2015-01-26  4:25                     ` Valdis.Kletnieks
2015-01-26  4:39                       ` David Lang
2015-01-26 16:42                         ` Valdis.Kletnieks
2015-01-25  8:07             ` Outback Dingo
2015-01-30 16:14               ` Richard Smith
