* [Cerowrt-devel] Recording RF management info _and_ associated traffic? @ 2015-01-14 2:20 Richard Smith 2015-01-20 16:59 ` Rich Brown 0 siblings, 1 reply; 43+ messages in thread From: Richard Smith @ 2015-01-14 2:20 UTC (permalink / raw) To: cerowrt-devel I'm trying to track down some wireless problems we are having at work. At random times the 5GHz WLANs we have just go to hell. I've been sniffing in monitor mode, which has been quite enlightening; there's certainly a lot more going on in the 5GHz channels than I was expecting. Monitor mode shows me loads of stuff I didn't know was there, but what it doesn't show me is how all that other traffic interacts with the traffic on my ESS. From what I've been reading, it seems that with most cards you can't grab the 802.11 management info and the actual traffic on the network at the same time. Is this possible with a WNDR3[78]00 CeroWrt (or OpenWrt) setup? -- Richard A. Smith ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-14 2:20 [Cerowrt-devel] Recording RF management info _and_ associated traffic? Richard Smith @ 2015-01-20 16:59 ` Rich Brown 2015-01-21 23:40 ` Richard Smith 0 siblings, 1 reply; 43+ messages in thread From: Rich Brown @ 2015-01-20 16:59 UTC (permalink / raw) To: Richard Smith; +Cc: cerowrt-devel On Jan 13, 2015, at 9:20 PM, Richard Smith <smithbone@gmail.com> wrote: > I'm trying to track down some poor wireless issues we are having at work. At random times the 5Ghz WLANs we have just go to hell. > > I've been sniffing in monitor mode which has been quite enlightening there's certainly a lot more going on in the 5Ghz channels than I was expecting. > > Monitor mode shows me loads of stuff I didn't know was there but what it doesn't show me is how all that other traffic interacts with the traffic on my ESS. > > From what I've been reading it seems like you with most cards you can't grab the 802.11 management info and actual traffic on the network at the same time. > > Is this possible with a WNDR3[78]00 CeroWRT (or openWRT) setup? One of the first things I would do is a Wifi site survey, to look for conflicts between access points/channels, etc. Two recommendations for tools: MacOSX: WiFi Explorer from Adrian Granados - US$4.99 from the Mac App Store. http://www.adriangranados.com/apps/wifi-explorer Android: WiFi Analyzer from farproc - Donationware from the Android store. https://sites.google.com/site/farproc/wifi-analyzer Rich ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-20 16:59 ` Rich Brown @ 2015-01-21 23:40 ` Richard Smith 2015-01-21 23:58 ` David Lang 0 siblings, 1 reply; 43+ messages in thread From: Richard Smith @ 2015-01-21 23:40 UTC (permalink / raw) To: Rich Brown, Richard Smith; +Cc: cerowrt-devel On 01/20/2015 11:59 AM, Rich Brown wrote: > One of the first things I would do is a Wifi site survey, to look for > conflicts between access points/channels, etc. Two recommendations > for tools: > > MacOSX: WiFi Explorer from Adrian Granados - US$4.99 from the Mac App > Store. http://www.adriangranados.com/apps/wifi-explorer Android: WiFi > Analyzer from farproc - Donationware from the Android store. > https://sites.google.com/site/farproc/wifi-analyzer Thanks for the suggestion. I've done that. I'll offer up a 3rd choice which I have been using: horst. It runs perfectly on OpenWrt and is free. I've not tried it on CeroWrt yet but I don't see why it would not work. http://br1.einfach.org/tech/horst/ With horst I've verified that the 3 APs we are running are all on 5GHz channels that don't have another AP on them. We are up on the 10th floor of a tower-type building and 3 of our walls have large windows with clear views of surrounding buildings. We are higher than most of the stuff around us. So when I scan I do see a lot of intermittent probes or WiFi traffic from other things, but nothing chronic. I haven't been able to run a scan when it all goes to hell, though. Horst shows me the DATA or QDATA packets among the radiotap info, and that the contents are encrypted, but I've not figured out how to capture decrypted traffic at the same time as the radiotap info. That would let me see exactly what sort of dynamic was happening on our network. As an aside, if any of the gurus here are near the Boston, MA area and want to do small business Wi-Fi consulting, let me know. We will gladly pay someone to fix our wireless. -- Richard A. 
Smith ^ permalink raw reply [flat|nested] 43+ messages in thread
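For readers following along: the "radiotap info" mentioned above is a small per-frame header that the driver prepends in monitor mode, carrying things like the data rate, channel frequency, and received signal strength. The following is a minimal sketch, not taken from horst or any other tool in this thread, of how the first few fields could be pulled out of a raw captured frame. The function name `parse_radiotap` is hypothetical, only the TSFT, Flags, Rate, Channel, and dBm antenna signal fields are handled, and it stops rather than guess if the obsolete FHSS field is present. Note also that 802.11n frames often report an MCS index in a separate field instead of the legacy Rate field, so `rate_mbps` may come back as `None`.

```python
import struct


def parse_radiotap(frame: bytes) -> dict:
    """Extract rate, channel frequency and signal from a radiotap header.

    Illustrative sketch only: handles a handful of the defined fields.
    """
    # Fixed part: version (u8), pad (u8), total header length (u16 LE),
    # first "present" bitmap (u32 LE).
    version, _pad, length, present = struct.unpack_from("<BBHI", frame, 0)
    if version != 0 or length > len(frame):
        raise ValueError("not a radiotap header")

    offset = 8
    word = present
    while word & (1 << 31):  # bit 31 set: another present word follows
        (word,) = struct.unpack_from("<I", frame, offset)
        offset += 4

    def align(off: int, n: int) -> int:
        # Fields are aligned relative to the start of the radiotap header.
        return (off + n - 1) & ~(n - 1)

    info = {"header_len": length, "rate_mbps": None,
            "freq_mhz": None, "signal_dbm": None}

    if present & (1 << 0):  # TSFT: u64, 8-byte aligned
        offset = align(offset, 8) + 8
    if present & (1 << 1):  # Flags: u8
        offset += 1
    if present & (1 << 2):  # Rate: u8 in units of 500 kb/s
        info["rate_mbps"] = frame[offset] * 0.5
        offset += 1
    if present & (1 << 3):  # Channel: u16 frequency + u16 flags, 2-byte aligned
        offset = align(offset, 2)
        (info["freq_mhz"],) = struct.unpack_from("<H", frame, offset)
        offset += 4
    if present & (1 << 4):  # FHSS (obsolete): stop here rather than guess
        return info
    if present & (1 << 5):  # dBm antenna signal: s8
        (info["signal_dbm"],) = struct.unpack_from("<b", frame, offset)
    return info


# Hand-built example header: Flags, Rate (24 Mbps), Channel (5180 MHz), signal -60 dBm.
example = (struct.pack("<BBHI", 0, 0, 15, 0x2E) + bytes([0, 48])
           + struct.pack("<HH", 5180, 0x0140) + struct.pack("<b", -60))
result = parse_radiotap(example)
```

The 802.11 frame itself starts at `header_len` bytes into the capture. Its payload stays encrypted on a protected network, which is the gap described in the message above: the radiotap metadata is readable, the data is not.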
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-21 23:40 ` Richard Smith @ 2015-01-21 23:58 ` David Lang 2015-01-22 9:04 ` Richard Smith 0 siblings, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-21 23:58 UTC (permalink / raw) To: Richard Smith; +Cc: cerowrt-devel On Wed, 21 Jan 2015, Richard Smith wrote: > On 01/20/2015 11:59 AM, Rich Brown wrote: > >> One of the first things I would do is a Wifi site survey, to look for >> conflicts between access points/channels, etc. Two recommendations >> for tools: >> >> MacOSX: WiFi Explorer from Adrian Granados - US$4.99 from the Mac App >> Store. http://www.adriangranados.com/apps/wifi-explorer Android: WiFi >> Analyzer from farproc - Donationware from the Android store. >> https://sites.google.com/site/farproc/wifi-analyzer > > Thanks for the suggestion. I've done that. > > I'll offer up a 3rd choice which I have been using. Horst. > Runs on OpenWrt perfectly and free. I've not tried it on CeroWrt yet but I > don't see why it would not work. > > http://br1.einfach.org/tech/horst/ > > With horst I've verified that the 3 AP's we are running are all on 5Ghz > channels that don't have another AP on them. We are up on the 10th floor of > a tower type building and 3 of our walls have large windows with clear views > of surrounding buildings. We are higher than most of the stuff around us. Ok, this would suggest that you are unlikely to have interference causing your problems. I don't have the earlier part of this thread still in my mailbox, what is the problem that you are trying to solve again? When you do a wifi survey, you are not just looking at one spot, or near the APs for what you see. You should also be going to all the areas your users are going to be trying to access your network and see if you have a strong enough signal from at least one AP everywhere. 
Also note that if you have high-power APs, you may hear a signal from them, but they may not be able to hear the signal from the mobile device very well. Mobile devices tend to have lousy antennas, and try to operate at lower power levels to save battery power. So you may need to look at the stats on the AP showing the signal it sees from the client. Assuming that you have enough signal, the next question is how many people are going to be trying to use the network at one time. You may be better off with more APs operating at lower power levels so that you have fewer people talking to each one. David Lang > So when I scan I do see a lot of intermittent probes or wifi traffic > from other things but nothing cronic. I haven't been able to run a scan when > it all goes to hell though. > > With horst it shows me the DATA or QDATA packet among the radiotap info and > that the contents are encrypted but I've not figured out how to capture > decrypted traffic at the same time as radiotap info. This would let me see > exactly what sort of dynamic was happening from our network. > > As an aside if any of the gurus here are near the Boston, MA area and > want to do small business Wi-Fi consulting let me know. We will gladly > pay someone to fix our wireless. > > -- > Richard A. Smith ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-21 23:58 ` David Lang @ 2015-01-22 9:04 ` Richard Smith 2015-01-22 9:18 ` David Lang 0 siblings, 1 reply; 43+ messages in thread From: Richard Smith @ 2015-01-22 9:04 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel On 01/21/2015 06:58 PM, David Lang wrote: > On Wed, 21 Jan 2015, Richard Smith wrote: > Thanks for the response. First I want to say that I'm sensitive to the fact that this is the Cerowrt-devel list and not the small business WiFi help list. If things go too far off-topic or people get tired of the discussion, let me know and I'll take it off the list. > Ok, this would suggest that you are unlikely to have interference > causing your problems. I don't have the earlier part of this thread > still in my mailbox, what is the problem that you are trying to solve > again? I didn't really describe the problem(s) in detail (see above note) but I'll provide a detailed description of my woes. We have a small network of about 30 people or so with ~60 devices connected, most of which are wireless of some sort (both 2.4GHz and 5GHz). Here are my issues + my story. :) 1) Periodic reports of poor "Internet". However, it's not the Internet uplink. I set up a netperf-wrapper test that goes off every 10 minutes with a brief speed+latency test to a well-connected host. Tracked across several weeks, the uplink/downlink was always exactly as expected. So I'm suspecting it's poor wireless rather than poor Internet. 2) Occasional total loss of WiFi. This is a bit fuzzy since I have multiple hardware permutations and currently no consistent failure. The story: Originally we had an Engenius 2.4/5GHz AP and a Netgear AP/router (WiFi turned off). I can't remember the original router model number. I didn't set any of the original hardware up. Several times a week the Engenius AP would stop passing traffic. A power cycle or reboot would fix it. 
The Engenius forums had lots of people reporting similar problems. We did firmware upgrades which seemed to help but did not eliminate the issue. Sometime later we added VoIP phones. But bufferbloat in the cable modem caused large latencies under load and VoIP was unhappy. Enter the trusty WNDR3700v2 from my stash with OpenWrt (pre-Barrier Breaker build). I replaced both the original router and the Engenius AP with it. QoS solved the VoIP issues and for the most part wireless was happy. Still, occasionally 5GHz would stop working, but much less frequently than with the Engenius. Rebooting the box would fix it. I suspected the single box running all of AP + DHCP + DNS + routing may not have had the resources for our load, or perhaps the pre-release of Barrier Breaker had issues. I replaced the routing/DHCP/DNS/QoS portion with an x86 box running OpenWrt x86 (using released Barrier Breaker, but locally built). Now the WNDR3700v2 was just an AP. This also allowed us to actually get our rated cable modem speed. QoS on the wndr was capping out at ~60Mbps, a well known limit among members of this list. Around the same time I also added a 2nd AP on a different 5GHz channel (TP-Link AC1750) to spread the connected clients across multiple channels. They have different ESSIDs. Things seemed to be happy. I got the TP-Link because it's on target to be supported by OpenWrt and has 3 external antennas, which I thought might provide a path for different antenna testing. Recently, we picked up the 11th floor as well and moved many people up there. I got a 3rd AP (another TP-Link AC1750) and set that one up on a free channel with a different ESSID. Then about a week before my original post I got notified that Internet was down. Both 10th floor APs had stopped working. The 11th floor (where I am) was still working. On the 10th floor, I could connect to the TP-Link via its IP address on its wired interface but it did not seem to be passing wireless traffic. A reboot fixed it. 
The WNDR3700 was completely unresponsive both via WiFi and when I tried its IP connected directly to its switch with a Cat-5. I also have a serial port mod on that wndr3700 so I connected up to that instead. From the serial port everything appeared to be running fine, only no traffic would pass on the bridge. Dropping the interfaces with ifconfig and then bringing them back up had no effect and I didn't see anything unusual in the system logs. A power cycle fixed it. I've never seen my wndr3700 do something like that. So then I really began to wonder... that's 3 different hardware vendors with 3 very different firmwares that all had similar issues, 2 of them at exactly the same time. I considered the possibility of a power event but the 2 APs are on different circuits and in physically different locations. The power connection for the wndr3700 also has the x86 router, 2 switches, the cable modem, and a Linux box plugged in, and all of those devices were still working. That's when I figured I needed to start looking at what was going on in RF land. At that time I didn't have anything like horst to be able to verify that wireless really was broken and not some other mysterious network gremlin. So I started tooling up. When it happens again I can investigate deeper. I have a 2nd wndr3700v2 at my disposal, set up in monitor mode on that channel, that I can run horst on when the next total loss happens. It's not happened again. While I'm waiting I've been trying to look into issue 1 by trying to understand what is really happening on the RF channel it's on. Thus my query about wanting to see associated network traffic decoded along with the radiotap info. > When you do a wifi survey, you are not just looking at one spot, or near > the APs for what you see. You should also be going to all the areas your > users are going to be trying to access your network and see if you have > a strong enough signal from at least one AP everywhere. 
I have taken readings at multiple points in the office but it was not a very rigorous survey. I should repeat with more care. The wireless signal indicators most clients I've messed with show good strength. Our floor(s) are fairly small and almost completely open. There are no cubicles and very few internal walls. There are some offices and conference rooms but each of them have large walls of glass that look into the center of the room. The only big obstruction is a large concrete pillar in the center of the room. The 10th floor TPlink AP is located in a ceiling cable tray very close to the center of the room. All the stations are in about a 40 foot radius and all but 1 or 2 have line of sight to the AP. The wndr3700 is in a closet on the side of the room with other equipment so it might be 80 feet away from the furthest station or so. > Also note that > if you have high-power APs, What Tx level qualifies as a high-power AP? The wndr says 50mW. The tplink just gives me low,medium,and high as choices. It's still at the default of high. > you may hear a signal from them, but they > may not be able to hear the signal from the mobile device very well. > Mobile devices tend to have lousy antennas, and try to operate a lower > power levels to save battery power. So you may need to look at the stats > on the AP showing the signal it sees from the client. I can see those for things connected to the wndr unit but sadly the stock tplink firmware does not show me rx strength. Can I perhaps approximate signal strength by looking at the bitrate for packets that station sends? The theory being that higher quality RF links should use the higher bitrate encodings when sending. If need be I can move the wndr to the same location as the tplink and then have stations connect to the wndr so I can watch the rx signal strength. > Assuming that you have enough signal, the next question is how many > people are going to be trying to use the network at one time. 
You may be > better off with more APs operating at lower power levels so that you > have fewer people talking to each one. The tplink is better located so in general people tend to use that one over the wndr. Last check it had around 20 stations connected to it during the day. The rest are connected to the 2 other APs. Thanks again for any insights you have. Lastly, I've been doing some reading on getting enterprise-class APs from Cisco, HP, etc. A large number of them seem to require a lot of extra infrastructure: wireless controllers and special software you have to run to set them up. Any recommendations for something that's a step above consumer-grade devices but that does not require additional controllers or licensed software would be appreciated. -- Richard A. Smith ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-22 9:04 ` Richard Smith @ 2015-01-22 9:18 ` David Lang 2015-01-22 18:19 ` Richard Smith 0 siblings, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-22 9:18 UTC (permalink / raw) To: Richard Smith; +Cc: cerowrt-devel On Thu, 22 Jan 2015, Richard Smith wrote: > On 01/21/2015 06:58 PM, David Lang wrote: >> On Wed, 21 Jan 2015, Richard Smith wrote: >> > > Thanks for the response. First I want to say that I'm sensitive to the fact > that this is the Cerowrt-devel list and not the small business WiFi help > list. If things go too far off-topic or people get tired of the discussion > let me know and I'll take it off the list. > >> Ok, this would suggest that you are unlikely to have interference >> causing your problems. I don't have the earlier part of this thread >> still in my mailbox, what is the problem that you are trying to solve >> again? > > I didn't really describe the problem(s) in detail (see above note) but I'll > provide a detailed description of my woes. > > We have a small network of about 30 people or so with ~60 devices connected. > Most of which are wireless of some sort (both 2.4Ghz and 5hz). Here's my > issues + my story. :) > > 1) Periodic reports of poor "Internet". However, its not the Internet uplink. > I setup a netperf-wrapper test that goes off every 10 minutes with a brief > speed+latency test to a well connected host. Tracked across several weeks > the uplink/downlink always exactly as expected. So I'm suspecting it's poor > wireless rather than poor Internet. > > 2) Occasional total loss of WiFi. This a bit fuzzy since I have multiple > hardware permutations and currently no consistent failure. > > The story: > > Originally we had an Engenius 2.4/5Ghz AP and a Netgear AP/router (WiFi > turned off). I can't remember the original router model number. I didn't set > any of the original hardware up. 
> > Several times a week the Engenius AP would stop passing traffic. A power > cycle or reboot would fix it. The Engenius forums had lots of people > reporting similar problems. We did firmware upgrades which seemed to help > but not eliminate the issue. > > Sometime later we added VoIP phones. But bufferbloat in the cable modem > caused large latencies under load and VoIP was unhappy. > > Enter the trusty WNDR3700v2 from my stash with OpenWRT (pre-barrier breaker > build). I replaced both the original router and the Engenius AP with it. > > QoS solved VoIP issues and for the most part wireless was happy. Still > occasionally though 5Ghz would stop working but much less frequent than the > Engenius. Rebooting the box would fix it. I suspected the single box > running all the AP + DHCP + DNS + routing may not have had the resources for > our load or perhaps the pre-release of barrier breaker had issues. > > Replaced the routing/DHCP/DNS/QoS portion with a x86 box running OpenWRT x86 > (using released barrier breaker, but locally built). Now the WNDR3700v2 was > just an AP. This also allowed us actually get our rated cable modem speed. > QoS on the wndr was capping out at ~60Mbps, a well known limit among members > of this list. > > Around the same time I also added a 2nd AP on a different 5Ghz channel > (TP-Link AC1750) to spread the connected clients across multiple channels. > They have different ESSIDs. Things seem to be happy. I got the the TP-Link > because its on target to be supported by OpenWRT and has 3 external antennas > which I though might provided a path for different antenna testing. > > Recently, we picked up the 11th floor as well and moved many people up there. > I got a 3rd AP (another TP-Link AC1750) and set that one up on a free channel > with a different ESSID. I like to put all the APs on the same ESSID so that people can roam between them. This requires that the APs act as bridges to a dedicated common network, not as routers. 
> Then about a week before my original post I got notified that Internet was > down. Both 10th floor APs had stopped working. The 11th floor (where I am) > was still working. On the 10th floor, I could connect to the TP-link via > its IP address on its wired interface but it did not seem to be passing > wireless traffic. A reboot fixed it. There has been an ongoing bug with Apple devices on 5Ghz that causes the wifi chipset to lockup. We think we've fixed it in the current Cerowrt, but I don't know what kernel versions have this problem. This is likely to affect multiple vendors who use the same chipset (check the openwrt hardware list for details of the chipsets in each model) > The WNDR3700 was completely unresponsive both via WiFi and when I tried its > IP connected directly to it's switch with a Cat-5. I also have a serial port > mod on that wndr3700 so I connected up to that instead. hmm, it's not common to have it be unresponsive on the wired network. > From the serial port everything appeared to be running fine only no would > pass on the bridge. Dropping the interfaces with ifconfig and then bringing > them back up had no effect and I didn't see anything unusual in the system > logs. A power cycle fixed it. I've never seen my wndr3700 do something like > that. > > So then I really began to wonder... that's 3 different hardware vendors with > 3 very different firmware's all that had similar issues. 2 of them at > exactly the same time. > > I considered the possibility of a power event but the 2 APs are on different > circuits and in physically different locations. The power connection for the > wndr3700 also has the x86 router, 2 switches, the cable modem, and a linux > box plugged up and all of those devices were still working. > > That's when I figured I needed to start looking at what was going on in RF > land. 
At that time I didn't have anything like horst to be able to verify > that wireless really was broken and not some other mysterious network > gremlin. So I started tooling up. When it happens again I can investigate > deeper. I have a 2nd wndr3700v2 at my disposal set up in monitor on that > channel that I can run horst on when the next total loss happens. > > It's not happened again. While I'm waiting I've been trying to look into > issue 1 by trying to understand what is really happing on the RF channel its > on. Thus my query about wanting to see associated network traffic decoded > along with the radiotap info. > >> When you do a wifi survey, you are not just looking at one spot, or near >> the APs for what you see. You should also be going to all the areas your >> users are going to be trying to access your network and see if you have >> a strong enough signal from at least one AP everywhere. > > I have taken readings at multiple points in the office but it was not a very > rigorous survey. I should repeat with more care. The wireless signal > indicators most clients I've messed with show good strength. > > Our floor(s) are fairly small and almost completely open. There are no > cubicles and very few internal walls. There are some offices and conference > rooms but each of them have large walls of glass that look into the center of > the room. The only big obstruction is a large concrete pillar in the center > of the room. The 10th floor TPlink AP is located in a ceiling cable tray > very close to the center of the room. All the stations are in about a 40 foot > radius and all but 1 or 2 have line of sight to the AP. The wndr3700 is in a > closet on the side of the room with other equipment so it might be 80 feet > away from the furthest station or so. 
this doesn't sound unreasonable unless your users are trying to use a LOT of bandwidth (although the fact that you refer to the 50Mb bottleneck indicates that you may be) >> Also note that >> if you have high-power APs, > > What Tx level qualifies as a high-power AP? The wndr says 50mW. The tplink > just gives me low,medium,and high as choices. It's still at the default of > high. > >> you may hear a signal from them, but they >> may not be able to hear the signal from the mobile device very well. >> Mobile devices tend to have lousy antennas, and try to operate a lower >> power levels to save battery power. So you may need to look at the stats >> on the AP showing the signal it sees from the client. > > I can see those for things connected to the wndr unit but sadly the stock > tplink firmware does not show me rx strength. > > Can I perhaps approximate signal strength by looking at the bitrate for > packets that station sends? The theory being that higher quality RF links > should use the higher bitrate encodings when sending. not reliably, too many other things factor in to that. > If need be I can move the wndr to the same location as the tplink and then > have stations connect to the wndr so I can watch the rx signal strength. > >> Assuming that you have enough signal, the next question is how many >> people are going to be trying to use the network at one time. You may be >> better off with more APs operating at lower power levels so that you >> have fewer people talking to each one. > > The tplink is better located so in general people tend to use that one over > the the wndr. Last check it has around 20 stations connected to it during the > day. The rest are connected to the 2 other APs. > > Thanks again for any insights you have. > > Lastly, I've been doing some reading on getting enterprise class APs from > Cisco, HP, etc. 
A large number of them seem to require a lot of extra > infrastructure running wireless controllers and special software you have to > run to set them up. > > Any recommendations for something that's a step above consumer grade devices > but that does not require additional controllers or licensed software would > be appreciated. There is a lot of room with consumer grade equipment from where you currently are. The "Enterprise Grade" systems do have a lot of infrastructure to coordinate the different APs. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-22 9:18 ` David Lang @ 2015-01-22 18:19 ` Richard Smith 2015-01-22 22:09 ` David Lang ` (2 more replies) 0 siblings, 3 replies; 43+ messages in thread From: Richard Smith @ 2015-01-22 18:19 UTC (permalink / raw) To: David Lang, Richard Smith; +Cc: cerowrt-devel On 01/22/2015 04:18 AM, David Lang wrote: >> Recently, we picked up the 11th floor as well and moved many people up >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on >> a free channel with a different ESSID. > > I like to put all the APs on the same ESSID so that people can roam > between them. This requires that the APs act as bridges to a dedicated > common network, not as routers. That's the ultimate plan, but for the convenience of being able to easily select which AP I'm talking to, or to be able to tell folks to move from one to another, I've got them on different ESSIDs. It also helps me keep track of what RF channel things are on. >> Then about a week before my original post I got notified that Internet >> was down. Both 10th floor APs had stopped working. The 11th floor >> (where I am) was still working. On the 10th floor, I could connect >> to the TP-link via its IP address on its wired interface but it did >> not seem to be passing wireless traffic. A reboot fixed it. > > There has been an ongoing bug with Apple devices on 5Ghz that causes the > wifi chipset to lockup. We think we've fixed it in the current Cerowrt, > but I don't know what kernel versions have this problem. This is likely > to affect multiple vendors who use the same chipset (check the openwrt > hardware list for details of the chipsets in each model) Oooohhh! That could be it. We have a _lot_ of Apple devices. Most of the company uses a MacBook or Air, a large number of people have iPhones, and we use iPods for some of our testing. I'll go dig through the OpenWrt hardware list and get the details. 
>> The WNDR3700 was completely unresponsive both via WiFi and when I >> tried its IP connected directly to it's switch with a Cat-5. I also >> have a serial port mod on that wndr3700 so I connected up to that >> instead. > > hmm, it's not common to have it be unresponsive on the wired network. It's uncommon to me. :) This unit has travelled with me for years while I worked for OLPC and it's seen a lot of different wireless environments. Granted, never one with this many Apple clients. Usually 7-8 Linux/Windows machines and a pile of XOs. So this happened a lot at your SCALE setups? >> room. All the stations are in about a 40 foot radius and all but 1 or >> 2 have line of sight to the AP. The wndr3700 is in a closet on the >> side of the room with other equipment so it might be 80 feet away from >> the furthest station or so. > > this doesn't sound unreasonable unless your users are trying to use a > LOT of bandwidth (although the fact that you refer to the 50Mb > bottleneck indicates that you may be) The bottleneck was just a nice side effect. We don't use that much traffic. I only noticed the limit once I started running netperf-wrapper tests from a wired host. Occasionally there will be some big download that eats up bandwidth, but when I watch the throughput during the day we peak up into the 40Mbps range while the average is < 10Mbps (download). >> Can I perhaps approximate signal strength by looking at the bitrate >> for packets that station sends? The theory being that higher quality >> RF links should use the higher bitrate encodings when sending. > > not reliably, too many other things factor in to that. Indeed. Horst tells me I basically have 2 rates happening on the tplink: 6Mbps and 24Mbps, with a few 12Mbps frames in there. >> If need be I can move the wndr to the same location as the tplink and >> then have stations connect to the wndr so I can watch the rx signal >> strength. Looks like that's what I'll have to do. 
> There is a lot of room with consumer grade equipment from where you > currently are. The "Enterprise Grade" systems do have a lot of > infrastructure to coordinate the different APs. Thanks for the ray of hope. Yeah I don't need all the multi-AP coordination handoff stuff. -- Richard A. Smith ^ permalink raw reply [flat|nested] 43+ messages in thread
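To put the 6Mbps / 12Mbps / 24Mbps observation above in perspective, it can help to weight each rate by the air time its frames occupy rather than by frame count. The following is a rough sketch under stated assumptions: the input is a list of (rate, frame length) pairs that someone has already exported from a monitor-mode capture, the function name `airtime_by_rate` is hypothetical, and the estimate covers payload bits only, ignoring preambles, ACKs, inter-frame spacing, and contention.

```python
from collections import defaultdict


def airtime_by_rate(frames):
    """Summarize frames by PHY rate.

    frames: iterable of (rate_mbps, length_bytes) pairs.
    Returns {rate_mbps: (frame_count, payload_airtime_microseconds)}.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for rate_mbps, length_bytes in frames:
        totals[rate_mbps][0] += 1
        # bits divided by Mbit/s gives microseconds on the air
        totals[rate_mbps][1] += length_bytes * 8 / rate_mbps
    return {rate: (count, usec) for rate, (count, usec) in totals.items()}


# Made-up sample: one 1500-byte frame at 6 Mbps, two at 24 Mbps.
summary = airtime_by_rate([(6, 1500), (24, 1500), (24, 1500)])
```

In this made-up sample the single 6Mbps frame occupies twice the payload air time of both 24Mbps frames combined, so a modest share of low-rate frames can account for much of a channel's busy time. As noted in the exchange above, the rate alone is still not a reliable proxy for signal strength.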
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-22 18:19 ` Richard Smith @ 2015-01-22 22:09 ` David Lang 2015-01-22 22:55 ` Roman Toledo Casabona 2015-01-24 14:59 ` dpreed 2015-01-25 8:07 ` Outback Dingo 2 siblings, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-22 22:09 UTC (permalink / raw) To: Richard Smith; +Cc: cerowrt-devel On Thu, 22 Jan 2015, Richard Smith wrote: >>> The WNDR3700 was completely unresponsive both via WiFi and when I >>> tried its IP connected directly to it's switch with a Cat-5. I also >>> have a serial port mod on that wndr3700 so I connected up to that >>> instead. >> >> hmm, it's not common to have it be unresponsive on the wired network. > > It's uncommon to me. :) This unit has travelled with me for years while I > worked for OLPC and its see a lot of different wireless environments. > Granted never one with this many apple clients. Usually 7-8 Linux/Windows > machines and a pile of XOs. > > So this happened a lot at your SCALE setups? two years ago we had a problem with the APs dropping off, but last year everything worked wonderfully. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-22 22:09 ` David Lang @ 2015-01-22 22:55 ` Roman Toledo Casabona 0 siblings, 0 replies; 43+ messages in thread From: Roman Toledo Casabona @ 2015-01-22 22:55 UTC (permalink / raw) To: Richard Smith, David Lang; +Cc: cerowrt-devel I guess Yahoo email is not liked as a source for my reply. Can you kindly remove me from further notices, as I'm too overloaded with work to follow your project? Thank you, and excuse me for stepping on this conversation; maybe this will get through. Sorry, we were unable to deliver your message to the following address. <majordomo@vger.kernel.org>: Remote host said: 553 5.7.1 Hello [72.30.239.75], for your MAIL FROM address <rtoledo2002@yahoo.com> policy analysis reported: Your address is not liked source for email [MAIL_FROM] --- Below this line is a copy of the message. Date: Thu, 22 Jan 2015 14:52:37 -0800 From: Roman Toledo Casabona <rtoledo2002@yahoo.com> Subject: unsubscribe netdev To: majordomo@vger.kernel.org unsubscribe netdev in the body of a message to majordomo@vger.kernel.org -------------------------------------------- On Thu, 1/22/15, David Lang <david@lang.hm> wrote: Subject: Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? To: "Richard Smith" <smithbone@gmail.com> Cc: "cerowrt-devel@lists.bufferbloat.net" <cerowrt-devel@lists.bufferbloat.net> Date: Thursday, January 22, 2015, 2:09 PM On Thu, 22 Jan 2015, Richard Smith wrote: >>> The WNDR3700 was completely unresponsive both via WiFi and when I >>> tried its IP connected directly to it's switch with a Cat-5. I also >>> have a serial port mod on that wndr3700 so I connected up to that >>> instead. >> >> hmm, it's not common to have it be unresponsive on the wired network. > > It's uncommon to me. :) This unit has travelled with me for years while I > worked for OLPC and its see a lot of different wireless environments. > Granted never one with this many apple clients. Usually 7-8 Linux/Windows > machines and a pile of XOs. > > So this happened a lot at your SCALE setups? two years ago we had a problem with the APs dropping off, but last year everything worked wonderfully. David Lang _______________________________________________ Cerowrt-devel mailing list Cerowrt-devel@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/cerowrt-devel ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-22 18:19 ` Richard Smith 2015-01-22 22:09 ` David Lang @ 2015-01-24 14:59 ` dpreed 2015-01-24 15:30 ` Kelvin Edmison 2015-01-25 4:35 ` David Lang 2015-01-25 8:07 ` Outback Dingo 2 siblings, 2 replies; 43+ messages in thread From: dpreed @ 2015-01-24 14:59 UTC (permalink / raw) To: Richard Smith; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 1302 bytes --] On Thursday, January 22, 2015 1:19pm, "Richard Smith" <smithbone@gmail.com> said: > On 01/22/2015 04:18 AM, David Lang wrote: > > >> Recently, we picked up the 11th floor as well and moved many people up > >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on > >> a free channel with a different ESSID. > > > > I like to put all the APs on the same ESSID so that people can roam > > between them. This requires that the APs act as bridges to a dedicated > > common network, not as routers. > > That's the ultimate plan but for convenience of being able to easily > select what AP I'm talking to or to be able to tell folks to move from > one to another I've got them on different ESSIDs. It also helps me keep > track of what RF channel things are on. A side comment, meant to discourage continuing to bridge rather than route. There's no reason that the AP's cannot have different IP addresses, but a common ESSID. Roaming between them would be like roaming among mesh subnets. Assuming you are securing your APs' air interfaces using encryption over the air, you are already re-authenticating as you move from AP to AP. So using routing rather than bridging is a good idea for all the reasons that routing rather than bridging is better for mesh. [-- Attachment #2: Type: text/html, Size: 2103 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-24 14:59 ` dpreed @ 2015-01-24 15:30 ` Kelvin Edmison 2015-01-25 4:35 ` David Lang 1 sibling, 0 replies; 43+ messages in thread From: Kelvin Edmison @ 2015-01-24 15:30 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 1888 bytes --] > On Jan 24, 2015, at 9:59 AM, dpreed@reed.com wrote: > > On Thursday, January 22, 2015 1:19pm, "Richard Smith" <smithbone@gmail.com> said: > > > On 01/22/2015 04:18 AM, David Lang wrote: > > > > >> Recently, we picked up the 11th floor as well and moved many people up > > >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on > > >> a free channel with a different ESSID. > > > > > > I like to put all the APs on the same ESSID so that people can roam > > > between them. This requires that the APs act as bridges to a dedicated > > > common network, not as routers. > > > > That's the ultimate plan but for convenience of being able to easily > > select what AP I'm talking to or to be able to tell folks to move from > > one to another I've got them on different ESSIDs. It also helps me keep > > track of what RF channel things are on. > > A side comment, meant to discourage continuing to bridge rather than route. > There's no reason that the AP's cannot have different IP addresses, but a common ESSID. Roaming between them would be like roaming among mesh subnets. Assuming you are securing your APs' air interfaces using encryption over the air, you are already re-authenticating as you move from AP to AP. So using routing rather than bridging is a good idea for all the reasons that routing rather than bridging is better for mesh. > Have the MDNS problems been addressed? The last time I had a go with CeroWRT (about 6 months ago) the problems were too severe for me to keep using it. I had to fall back to a bridged setup for my primarily Mac environment. 
I'm a long-time Linux user-space developer but am a complete newbie when it comes to developing for CeroWRT. If someone can point me at the right spot to start working on the MDNS issues then I'll see if I can do anything to help. Regards, Kelvin [-- Attachment #2: Type: text/html, Size: 2972 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-24 14:59 ` dpreed 2015-01-24 15:30 ` Kelvin Edmison @ 2015-01-25 4:35 ` David Lang 2015-01-25 5:02 ` Dave Taht 2015-01-25 20:17 ` dpreed 1 sibling, 2 replies; 43+ messages in thread From: David Lang @ 2015-01-25 4:35 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel On Sat, 24 Jan 2015, dpreed@reed.com wrote: > On Thursday, January 22, 2015 1:19pm, "Richard Smith" <smithbone@gmail.com> said: > > >> On 01/22/2015 04:18 AM, David Lang wrote: >> >> >> Recently, we picked up the 11th floor as well and moved many people up >> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on >> >> a free channel with a different ESSID. >> > >> > I like to put all the APs on the same ESSID so that people can roam >> > between them. This requires that the APs act as bridges to a dedicated >> > common network, not as routers. >> >> That's the ultimate plan but for convenience of being able to easily >> select what AP I'm talking to or to be able to tell folks to move from >> one to another I've got them on different ESSIDs. It also helps me keep >> track of what RF channel things are on. > > > A side comment, meant to discourage continuing to bridge rather than route. > > There's no reason that the AP's cannot have different IP addresses, but a > common ESSID. Roaming between them would be like roaming among mesh subnets. > Assuming you are securing your APs' air interfaces using encryption over the > air, you are already re-authenticating as you move from AP to AP. So using > routing rather than bridging is a good idea for all the reasons that routing > rather than bridging is better for mesh. The problem with doing this is that all existing TCP connections will break when you move from one AP to another and while some apps will quickly notice this and establish new connections, there are many apps that will not and this will cause noticable disruption to the user. 
Bridging allows the connections to remain intact. The wifi stack re-negotiates the encryption, but the encapsulated IP packets don't change. I do this with the wifi on its own VLAN (actually separate VLANs for 2.4 and 5GHz) and have the APs configured not to relay broadcast traffic from one wireless user to another. This cuts down a LOT on the problems of broadcasts. In about a month I'm going to be running the wireless network for SCaLE again, and I would be happy to instrument the network to gather whatever info anyone is interested in. I will be using ~50 APs to handle the ~2800 or so devices that show up, with the footprint of each AP roughly covering a small meeting room (larger rooms have 2 APs in them, the largest room has 3, and I'm adding APs this year to cover the hallways better because the ones in the rooms aren't doing well enough at the low power settings I'm using). David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
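The point made in the message above, that roaming between routed APs breaks existing TCP connections while roaming between bridged APs does not, can be illustrated with a minimal Python sketch. It assumes the usual model in which a TCP connection is identified by its (local IP, local port, remote IP, remote port) tuple. All addresses, prefix lengths, and function names here are hypothetical and chosen only for illustration; they are not taken from anyone's actual network in this thread.

```python
import ipaddress


def same_subnet(old_ip, new_ip, prefix_len):
    """Return True if new_ip lies in the IPv4 network that old_ip belongs to."""
    old_net = ipaddress.ip_network(f"{old_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(new_ip) in old_net


def connection_survives(conn, new_local_ip):
    """A TCP connection is keyed by its 4-tuple, so it can only continue
    if the client still owns the same local IP after roaming."""
    local_ip, _local_port, _remote_ip, _remote_port = conn
    return local_ip == new_local_ip


# Hypothetical SSH session opened while associated with the first AP.
conn = ("10.0.1.20", 50000, "192.0.2.10", 22)

# Bridged APs: one shared subnet, the client keeps its address.
print(connection_survives(conn, "10.0.1.20"))

# Routed APs: the second AP serves 10.0.2.0/24, so the client is renumbered.
print(same_subnet("10.0.1.20", "10.0.2.20", 24))
print(connection_survives(conn, "10.0.2.20"))
```

In the routed case the client lands in a different /24, receives a new address, and the old 4-tuple no longer matches anything the client owns, which is why applications that do not reconnect on their own appear to hang.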
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 4:35 ` David Lang @ 2015-01-25 5:02 ` Dave Taht 2015-01-25 5:04 ` Dave Taht 2015-01-25 6:44 ` David Lang 2015-01-25 20:17 ` dpreed 1 sibling, 2 replies; 43+ messages in thread From: Dave Taht @ 2015-01-25 5:02 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel On Sat, Jan 24, 2015 at 8:35 PM, David Lang <david@lang.hm> wrote: > On Sat, 24 Jan 2015, dpreed@reed.com wrote: > >> On Thursday, January 22, 2015 1:19pm, "Richard Smith" >> <smithbone@gmail.com> said: >> >> >>> On 01/22/2015 04:18 AM, David Lang wrote: >>> >>> >> Recently, we picked up the 11th floor as well and moved many people up >>> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on >>> >> a free channel with a different ESSID. >>> > >>> > I like to put all the APs on the same ESSID so that people can roam >>> > between them. This requires that the APs act as bridges to a dedicated >>> > common network, not as routers. >>> >>> That's the ultimate plan but for convenience of being able to easily >>> select what AP I'm talking to or to be able to tell folks to move from >>> one to another I've got them on different ESSIDs. It also helps me keep >>> track of what RF channel things are on. My usual use case for using different APs is to find an error in the campus. When someone tells me that "Lupin-lodge" is down, I know exactly which machine to check. If everything was named Lupin, I'd have to check far more than one AP, and to ask approximately where on the campus they were. >> >> >> >> A side comment, meant to discourage continuing to bridge rather than >> route. >> >> There's no reason that the AP's cannot have different IP addresses, but a >> common ESSID. Roaming between them would be like roaming among mesh >> subnets. Assuming you are securing your APs' air interfaces using encryption >> over the air, you are already re-authenticating as you move from AP to AP. 
>> So using routing rather than bridging is a good idea for all the reasons >> that routing rather than bridging is better for mesh. > > > The problem with doing this is that all existing TCP connections will break > when you move from one AP to another and while some apps will quickly notice > this and establish new connections, there are many apps that will not and > this will cause noticable disruption to the user. I am under the impression that network-manager and linux, at least, tend to renegotiate IPv6 addresses on an down/up, and preserve ipv4. > > Bridgeing allows the connections to remain intact. The wifi stack > re-negotiates the encryption, but the encapsulated IP packets don't change. While I actually agree with dlang on having all the same ssid and bridging, and not routing, on a conference, as well as with the idea of disabling broadcast (and I assume direct connectivity between two people seated side by side), it is a pita: More than once I've wanted to share a git tree with someone right next to me. I try to hand them my ip to grab the tree, and they can't even ping me, so I end uploading it somewhere, and he or she downloading it from there. Similarly, breaking interconnectivity precludes sane usage of in-conference In my case, since choosing to live in a routed, rather than bridged world, I have modified the nailed up tools I use to be more connectionless. Instead of ssh (tcp), I use mosh-multipath (udp), which is far superior for interactive shells in lousy wifi environments. For vpns, I switched to tinc, which will attempt direct connections over udp, and tcp on both ipv4 and ipv6. For access to google, I adopted quic in my chrome browser. Since doing all these things I rarely notice losing a nailed up connection or migrating from AP to AP. 
Additionally I use babel (where I control the network) and ad-hoc wifi to transparently migrate from AP to AP, and (often) from AP to wired to AP to wired as I change locations, also with no loss in connectivity. I don't expect the scale userbase to have made these adjustments in behavior. :/ > > I do this with the wifi on it's own VLAN (actually separate VLANs for 2.4 > and 5GHz) and have the APs configured not to relay broadcast traffic from > one wireless user to another. This cuts down a LOT on the problems of > broadcasts. > > In about a month I'm going to be running the wireless network for SCaLE > again, and I would be happy to instrament the network to gather whatever > info anyone is interested in. I will be using ~50 APs to handle the ~2800 or I will look into some tools bismark and others have. Will you attempt to deploy ipv6? > so devices that show up, with the footprint of each AP roughly covering a > small meeting room (larger rooms have 2 APs in them, the largest room has 3, > and I'm adding APs this year to cover the hallways better because the ones > in the rooms aren't doing well enough at the low power settings I'm using) I am of course interested in how fq_codel performs on your ISP link, and are you planning on running it for your wifi? > David Lang > > _______________________________________________ > Cerowrt-devel mailing list > Cerowrt-devel@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cerowrt-devel -- Dave Täht thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 5:02 ` Dave Taht @ 2015-01-25 5:04 ` Dave Taht 2015-01-25 6:44 ` David Lang 1 sibling, 0 replies; 43+ messages in thread From: Dave Taht @ 2015-01-25 5:04 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel On Sat, Jan 24, 2015 at 9:02 PM, Dave Taht <dave.taht@gmail.com> wrote: > On Sat, Jan 24, 2015 at 8:35 PM, David Lang <david@lang.hm> wrote: >> On Sat, 24 Jan 2015, dpreed@reed.com wrote: >> >>> On Thursday, January 22, 2015 1:19pm, "Richard Smith" >>> <smithbone@gmail.com> said: >>> >>> >>>> On 01/22/2015 04:18 AM, David Lang wrote: >>>> >>>> >> Recently, we picked up the 11th floor as well and moved many people up >>>> >> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on >>>> >> a free channel with a different ESSID. >>>> > >>>> > I like to put all the APs on the same ESSID so that people can roam >>>> > between them. This requires that the APs act as bridges to a dedicated >>>> > common network, not as routers. >>>> >>>> That's the ultimate plan but for convenience of being able to easily >>>> select what AP I'm talking to or to be able to tell folks to move from >>>> one to another I've got them on different ESSIDs. It also helps me keep >>>> track of what RF channel things are on. > > My usual use case for using different APs is to find an error in the campus. > > When someone tells me that "Lupin-lodge" is down, I know exactly which machine > to check. If everything was named Lupin, I'd have to check far more > than one AP, and > to ask approximately where on the campus they were. > >>> >>> >>> >>> A side comment, meant to discourage continuing to bridge rather than >>> route. >>> >>> There's no reason that the AP's cannot have different IP addresses, but a >>> common ESSID. Roaming between them would be like roaming among mesh >>> subnets. 
Assuming you are securing your APs' air interfaces using encryption >>> over the air, you are already re-authenticating as you move from AP to AP. >>> So using routing rather than bridging is a good idea for all the reasons >>> that routing rather than bridging is better for mesh. >> >> >> The problem with doing this is that all existing TCP connections will break >> when you move from one AP to another and while some apps will quickly notice >> this and establish new connections, there are many apps that will not and >> this will cause noticable disruption to the user. > > I am under the impression that network-manager and linux, at least, > tend to renegotiate > IPv6 addresses on an down/up, and preserve ipv4. > >> >> Bridgeing allows the connections to remain intact. The wifi stack >> re-negotiates the encryption, but the encapsulated IP packets don't change. > > While I actually agree with dlang on having all the same ssid and > bridging, and not routing, on a conference, as well as with the idea > of disabling broadcast (and I assume direct connectivity between two > people seated side by side), it is a pita: > > More than once I've wanted to share a git tree with someone right next > to me. I try to hand them my ip to grab the tree, and they can't even > ping me, so I end uploading it somewhere, and he or she downloading it > from there. Similarly, breaking interconnectivity precludes sane usage > of in-conference oops, hit send too early. "Of in-conference tools like webrtc, which would otherwise seek a direct path, as well as other p2p things like chat based on that". > In my case, since choosing to live in a routed, rather than bridged > world, I have modified the nailed up tools I use to be more > connectionless. Instead of ssh (tcp), I use mosh-multipath (udp), > which is far superior for interactive shells in lousy wifi > environments. For vpns, I switched to tinc, which will attempt direct > connections over udp, and tcp on both ipv4 and ipv6. 
For access to > google, I adopted quic in my chrome browser. Since doing all these > things I rarely notice losing a nailed up connection or migrating from > AP to AP. Additionally I use babel (where I control the network) and > ad-hoc wifi to transparently migrate from AP to AP, and (often) from > AP to wired to AP to wired as I change locations, also with no loss in > connectivity. > > I don't expect the scale userbase to have made these adjustments in behavior. :/ > >> >> I do this with the wifi on it's own VLAN (actually separate VLANs for 2.4 >> and 5GHz) and have the APs configured not to relay broadcast traffic from >> one wireless user to another. This cuts down a LOT on the problems of >> broadcasts. >> >> In about a month I'm going to be running the wireless network for SCaLE >> again, and I would be happy to instrament the network to gather whatever >> info anyone is interested in. I will be using ~50 APs to handle the ~2800 or > > I will look into some tools bismark and others have. > > Will you attempt to deploy ipv6? > >> so devices that show up, with the footprint of each AP roughly covering a >> small meeting room (larger rooms have 2 APs in them, the largest room has 3, >> and I'm adding APs this year to cover the hallways better because the ones >> in the rooms aren't doing well enough at the low power settings I'm using) > > I am of course interested in how fq_codel performs on your ISP link, and > are you planning on running it for your wifi? > >> David Lang >> >> _______________________________________________ >> Cerowrt-devel mailing list >> Cerowrt-devel@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/cerowrt-devel > > > > -- > Dave Täht > > thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks -- Dave Täht thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 5:02 ` Dave Taht 2015-01-25 5:04 ` Dave Taht @ 2015-01-25 6:44 ` David Lang 2015-01-25 7:06 ` David Lang [not found] ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com> 1 sibling, 2 replies; 43+ messages in thread From: David Lang @ 2015-01-25 6:44 UTC (permalink / raw) To: Dave Taht; +Cc: cerowrt-devel On Sat, 24 Jan 2015, Dave Taht wrote: >>> A side comment, meant to discourage continuing to bridge rather than >>> route. >>> >>> There's no reason that the AP's cannot have different IP addresses, but a >>> common ESSID. Roaming between them would be like roaming among mesh >>> subnets. Assuming you are securing your APs' air interfaces using encryption >>> over the air, you are already re-authenticating as you move from AP to AP. >>> So using routing rather than bridging is a good idea for all the reasons >>> that routing rather than bridging is better for mesh. >> >> >> The problem with doing this is that all existing TCP connections will break >> when you move from one AP to another and while some apps will quickly notice >> this and establish new connections, there are many apps that will not and >> this will cause noticable disruption to the user. > > I am under the impression that network-manager and linux, at least, > tend to renegotiate > IPv6 addresses on an down/up, and preserve ipv4. It can't preserve the ipv4 address if you end up on a different network address range (and trying to have lots of separate networks with the same IP addresses would mean that you have to do NAT at each network, and if you did that, then when you ended up on a different AP with the same IP address, the NAT tables would not have records of your connections and they would terminate the connections when you tried to send the next packets. >> Bridgeing allows the connections to remain intact. 
The wifi stack >> re-negotiates the encryption, but the encapsulated IP packets don't change. > > While I actually agree with dlang on having all the same ssid and > bridging, and not routing, on a conference, as well as with the idea > of disabling broadcast (and I assume direct connectivity between two > people seated side by side), it is a pita: > > More than once I've wanted to share a git tree with someone right next > to me. I try to hand them my ip to grab the tree, and they can't even > ping me, so I end uploading it somewhere, and he or she downloading it > from there. Similarly, breaking interconnectivity precludes sane usage > of in-conference True, it also blocks some abuse. People who really want direct connectivity can establish it as an ad-hoc network. For the normal user that we are trying to support at a conference, it's a win. I'll note that we also block streaming sites (which has the side effect of blocking some useful sites that share the same IPs, Amazon for example) to help make things better for everyone else, even at the cost of limiting what some people are able to do. Bandwidth is limited compared to the number of people we have, and we have to make choices. We do provide a local mirror of the debian based distros so that people can do the updates that they always tend to do at the conference (we would do the same for Fedora, but they make it too hard to do so) > In my case, since choosing to live in a routed, rather than bridged > world, I have modified the nailed up tools I use to be more > connectionless. Instead of ssh (tcp), I use mosh-multipath (udp), > which is far superior for interactive shells in lousy wifi > environments. For vpns, I switched to tinc, which will attempt direct > connections over udp, and tcp on both ipv4 and ipv6. For access to > google, I adopted quic in my chrome browser. Since doing all these > things I rarely notice losing a nailed up connection or migrating from > AP to AP. 
Additionally I use babel (where I control the network) and > ad-hoc wifi to transparently migrate from AP to AP, and (often) from > AP to wired to AP to wired as I change locations, also with no loss in > connectivity. > > I don't expect the scale userbase to have made these adjustments in behavior. :/ :-) >> >> I do this with the wifi on it's own VLAN (actually separate VLANs for 2.4 >> and 5GHz) and have the APs configured not to relay broadcast traffic from >> one wireless user to another. This cuts down a LOT on the problems of >> broadcasts. >> >> In about a month I'm going to be running the wireless network for SCaLE >> again, and I would be happy to instrament the network to gather whatever >> info anyone is interested in. I will be using ~50 APs to handle the ~2800 or > > I will look into some tools bismark and others have. > > Will you attempt to deploy ipv6? We have been offering IPv6 routable addresses for a few years. >> so devices that show up, with the footprint of each AP roughly covering a >> small meeting room (larger rooms have 2 APs in them, the largest room has 3, >> and I'm adding APs this year to cover the hallways better because the ones >> in the rooms aren't doing well enough at the low power settings I'm using) > > I am of course interested in how fq_codel performs on your ISP link, and > are you planning on running it for your wifi? I'm running OpenWRT on the APs but haven't done anything in particular to activate it. I'll check what we have on the firewall (a fairly up to day Debian build) What's the best way to monitor the queues? David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
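The question above about monitoring the queues is not answered in this part of the thread. One common approach, offered here only as a sketch and not as anyone's recommendation from the list, is to poll `tc -s qdisc show dev <iface>` and pull out the fq_codel counters. The sample text below is made-up illustrative output, the interface name is an assumption, and the helper names are hypothetical.

```python
import re
import subprocess

# Illustrative (not real) output in the shape printed by `tc -s qdisc show`.
SAMPLE = """\
qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 123456 bytes 789 pkt (dropped 3, overlimits 0 requeues 1)
 backlog 3028b 2p requeues 1
  maxpacket 1514 drop_overlimit 0 new_flow_count 42 ecn_mark 5
  new_flows_len 0 old_flows_len 1
"""

PATTERNS = {
    "sent_bytes": r"Sent (\d+) bytes",
    "sent_packets": r"bytes (\d+) pkt",
    "dropped": r"\(dropped (\d+),",
    "backlog_packets": r"backlog \d+\w* (\d+)p",
    "drop_overlimit": r"drop_overlimit (\d+)",
    "new_flow_count": r"new_flow_count (\d+)",
    "ecn_mark": r"ecn_mark (\d+)",
}


def parse_fq_codel_stats(text):
    """Extract counters for the first qdisc found in `tc -s` output."""
    stats = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            stats[name] = int(match.group(1))
    return stats


def read_qdisc_stats(iface):
    """Run tc on a live system; iface (e.g. "eth0") is an assumption."""
    result = subprocess.run(
        ["tc", "-s", "qdisc", "show", "dev", iface],
        capture_output=True, text=True, check=True,
    )
    return parse_fq_codel_stats(result.stdout)


if __name__ == "__main__":
    print(parse_fq_codel_stats(SAMPLE))
```

Sampling these counters periodically and graphing the deltas for drops, ECN marks, and backlog would give a rough picture of queue behavior over the course of an event. Note that the parser only looks at the first qdisc in the output, so an interface with several qdiscs would need the text split per qdisc first.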
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 6:44 ` David Lang @ 2015-01-25 7:06 ` David Lang [not found] ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com> 1 sibling, 0 replies; 43+ messages in thread From: David Lang @ 2015-01-25 7:06 UTC (permalink / raw) To: Dave Taht; +Cc: cerowrt-devel On Sat, 24 Jan 2015, David Lang wrote: > On Sat, 24 Jan 2015, Dave Taht wrote: > >> I am of course interested in how fq_codel performs on your ISP link, and >> are you planning on running it for your wifi? > > I'm running OpenWRT on the APs but haven't done anything in particular to > activate it. I'll check what we have on the firewall (a fairly up to day > Debian build) > > What's the best way to monitor the queues? For that matter, are there any other monitoring or stats that you would like me to gather? I'm using WNDR3800 and WNDR3700v2 APs. I'd especially welcome suggestions for stats related to fast wifi. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com>]
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? [not found] ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com> @ 2015-01-25 7:59 ` Dave Taht 2015-01-25 9:39 ` David Lang 1 sibling, 0 replies; 43+ messages in thread From: Dave Taht @ 2015-01-25 7:59 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 11230 bytes --] This got mangled by my IP addr filter On Jan 24, 2015 11:56 PM, "Dave Taht" <dave.taht@gmail.com> wrote: > > I want to make clear that I support dlang's design in the abstract... and am just arguing because it is a slow day. > > On Sat, Jan 24, 2015 at 10:44 PM, David Lang <david@lang.hm> wrote: > > On Sat, 24 Jan 2015, Dave Taht wrote: > > > >>>> A side comment, meant to discourage continuing to bridge rather than > >>>> route. > >>>> > >>>> There's no reason that the AP's cannot have different IP addresses, but > >>>> a > >>>> common ESSID. Roaming between them would be like roaming among mesh > >>>> subnets. Assuming you are securing your APs' air interfaces using > >>>> encryption > >>>> over the air, you are already re-authenticating as you move from AP to > >>>> AP. > >>>> So using routing rather than bridging is a good idea for all the reasons > >>>> that routing rather than bridging is better for mesh. > >>> > >>> > >>> > >>> The problem with doing this is that all existing TCP connections will > >>> break > >>> when you move from one AP to another and while some apps will quickly > >>> notice > >>> this and establish new connections, there are many apps that will not and > >>> this will cause noticable disruption to the user. > >> > >> > >> I am under the impression that network-manager and linux, at least, > >> tend to renegotiate > >> IPv6 addresses on an down/up, and preserve ipv4. 
> > > > > > It can't preserve the ipv4 address if you end up on a different network > > address range (and trying to have lots of separate networks with the same IP > > addresses would mean that you have to do NAT at each network, and if you did > > that, then when you ended up on a different AP with the same IP address, the > > NAT tables would not have records of your connections and they would > > terminate the connections when you tried to send the next packets. > > Hmm? The first thing I ever do to a router is renumber it to a unique IP address range, > and rename the subnet in dns to something unique. The 3 sed lines for this are on a cerowrt web page somewhere. Adding ipv6 statically is a pita, but doable with care and a uci script, and mildly more doable as hnetd matures. > > I run local dns services on each in the hope that at least some will be cached, and a local dhcp server to serve addresses out of that range. I turn off dhcp default route fetching on each routers external interface and use babel instead to find the right route(s) out of the system. > > On the NAT front, there is no nat on the internal routers, just a flat address space (a /14 in my case). I push all the nat to the main egress gateway(s), and in a case like yours would probably use multiple external IPs and dnat rather than masquarade the entire subnet on one to free up port space. You rapidly run out of ports in a natted evironment with that many users. I've had to turn down NAT timeouts for udp in particular to truly unreasonable levels otherwise (20 seconds in some cases) > > Doing this I can get a quick status on what is up with "ip route", and by monitoring the activity on each ip range, see if traffic is actually being passed, a failure of a given gateway fails over to another, and so on. There's a couple snmp hacks to do things like monitor active leases, and smokeping/mrtg to access other stats. There's a couple beagles that are on wifi that I ping on some APs. 
The beagles have not been very reliable for me, so they switch on and off with digiloggers gear when they fail a local ping. In fact the main logging beagle failed entirely the other month, sigh. > > I use the ad-hoc links on cerowrt as backups (if they lose ethernet connectivity) and extenders (if there is no ethernet connectivity), and (as I have 5 different comcast exit nodes spread throughout the network), use babel-pinger on each to see if they are up, and insert default routes into the mix that are automatically the shortest "distance" between the node and exit gateway. If one gw goes down (usually) all the traffic ends up switching to the next nearest default gateway switching over in 16 seconds or so, breaking all the nat associations for the net they were on (sigh), as well as ipv6 native stuff, but it's happened so often without me noticing it that it's nice not to worry. > > (I have a mostly failed attempt in play for doing better with ipv6 and hnetd on a couple of exit nodes, but that isn't solid enough to deploy as yet, so it's only sort of working in the yurtlab. I really wish I could buy PI space for ipv6 somehow) > > (I have been fiddling with dns anycast to try to get more redundancy on the main dns gateways. That works pretty good) > > Now, your method is simpler! (although mine is mostly scripted) I imagine you bridge everything on a vlan, and use a central dhcp/dns server to serve up dhcp across (say) a 10/16 subnet. And by blocking local multicast/broadcast, in particular, this scales across the 3k user population. You've got a critical single point of failure in your gateway, but at least that's only one, and I imagine you have that duplicated. 
> > (In contrast my network is always broken somewhere, but unless two critical nodes break, it's pretty redundant and loss is confined to a single AP - my biggest problem is that I need to upgrade the firmware on about half the network - which involves climbing trees - and my plan was to deploy hnetd last year so I could roll out ipv6) > > How do you deal with a dead AP that is not actually connecting with traffic? > > >>> Bridging allows the connections to remain intact. The wifi stack > >>> re-negotiates the encryption, but the encapsulated IP packets don't > >>> change. > >> > >> > >> While I actually agree with dlang on having all the same ssid and > >> bridging, and not routing, at a conference, as well as with the idea > >> of disabling broadcast (and I assume direct connectivity between two > >> people seated side by side), it is a pita: > >> > >> More than once I've wanted to share a git tree with someone right next > >> to me. I try to hand them my ip to grab the tree, and they can't even > >> ping me, so I end up uploading it somewhere, and he or she downloads it > >> from there. Similarly, breaking interconnectivity precludes sane usage > >> of in-conference > > > > > > True, it also blocks some abuse. People who really want direct connectivity > > can establish it as an ad-hoc network. > > yes, I've often draped an ethernet cable between seats. :) > > > > > For the normal user that we are trying to support at a conference, it's a > > win. > > > > I'll note that we also block streaming sites (which has the side effect of > > blocking some useful sites that share the same IPs, Amazon for example) to > > help make things better for everyone else, even at the cost of limiting what > > some people are able to do. Bandwidth is limited compared to the number of > > people we have, and we have to make choices. > > Blocking ads is also effective. 
> > > We do provide a local mirror of the debian based distros so that people can > > do the updates that they always tend to do at the conference (we would do > > the same for Fedora, but they make it too hard to do so) > > > >> In my case, since choosing to live in a routed, rather than bridged > >> world, I have modified the nailed up tools I use to be more > >> connectionless. Instead of ssh (tcp), I use mosh-multipath (udp), > >> which is far superior for interactive shells in lousy wifi > >> environments. For vpns, I switched to tinc, which will attempt direct > >> connections over udp, and tcp on both ipv4 and ipv6. For access to > >> google, I adopted quic in my chrome browser. Since doing all these > >> things I rarely notice losing a nailed up connection or migrating from > >> AP to AP. Additionally I use babel (where I control the network) and > >> ad-hoc wifi to transparently migrate from AP to AP, and (often) from > >> AP to wired to AP to wired as I change locations, also with no loss in > >> connectivity. > >> > >> I don't expect the SCaLE userbase to have made these adjustments in > >> behavior. :/ > > > > > > :-) > > It wouldn't hurt to recommend these tools (notably quic and mosh) to conference > participants. Both are pretty awesome. > > > > >>> > >>> I do this with the wifi on its own VLAN (actually separate VLANs for 2.4 > >>> and 5GHz) and have the APs configured not to relay broadcast traffic from > >>> one wireless user to another. This cuts down a LOT on the problems of > >>> broadcasts. > >>> > >>> In about a month I'm going to be running the wireless network for SCaLE > >>> again, and I would be happy to instrument the network to gather whatever > >>> info anyone is interested in. I will be using ~50 APs to handle the ~2800 > >>> or > >> > >> > >> I will look into some tools bismark and others have. > >> > >> Will you attempt to deploy ipv6? > > > > > > We have been offering IPv6 routable addresses for a few years. 
> > How many do you get and from whom? > > If I had time (doubtful) and budget (even more doubtful) I'd try to make it to SCaLE to observe and help out. > > >>> so devices that show up, with the footprint of each AP roughly covering a > >>> small meeting room (larger rooms have 2 APs in them, the largest room has > >>> 3, > >>> and I'm adding APs this year to cover the hallways better because the > >>> ones > >>> in the rooms aren't doing well enough at the low power settings I'm > >>> using) > >> > >> > >> I am of course interested in how fq_codel performs on your ISP link, and > >> are you planning on running it for your wifi? > > > > > > I'm running OpenWRT on the APs but haven't done anything in particular to > > activate it. > > fq_codel is on by default in Barrier Breaker and later on all interfaces. I note that it doesn't scale anywhere near as well as we would like under contention but that work is only beginning in Chaos Calmer. A thought I've had in an environment such as yours would be to rate limit each AP's ingress/egress ethernet interface to, say, 20 Mbit/s, thus pushing all the potential bloat to sqm on ethernet and out of the wifi (which would generally run faster). Might even force uploads from the users lower, also (say 10 Mbit/s). Might not, and just rely on people retaining low expectations. :) > > Was it on openwrt last year? > > > I'll check what we have on the firewall (a fairly up to date > > Debian build) > > fq_codel has been a part of that for a long time. > > I'd port over the sqm-scripts and use those, it's only a 1 line change. > > > What's the best way to monitor the queues? > > On each router? > > I tend to use pdsh a lot, setting up a /etc/genders file for them all so I can do a > > pdsh tc qdisc show dev wlan0 # or uptime or cat /etc/dhcp.leases | wc -l or whatever > > Been meaning to get around to something that used snmp instead for a while. 
> > > > David Lang > > -- > Dave Täht > > http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks [-- Attachment #2: Type: text/html, Size: 13637 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
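The "renumber each router to a unique IP address range" step in the message above can be sketched as follows. This is only an illustration, not the cerowrt sed script mentioned there. The 172.20.0.0/14 prefix is the one that appears in the quoted copy of this message later in the thread; the /24-per-router size and the router names are assumptions made up for the example.

```python
import ipaddress

def carve_ranges(flat_prefix, routers, new_prefix=24):
    """Give each router its own non-overlapping subnet out of one flat prefix."""
    flat = ipaddress.ip_network(flat_prefix)
    # subnets() yields the child networks lazily, in address order.
    subnets = flat.subnets(new_prefix=new_prefix)
    return {name: next(subnets) for name in routers}

# Hypothetical router names; a /24 per router is an assumed size.
plan = carve_ranges("172.20.0.0/14", ["ap-hall", "ap-room1", "ap-room2"])
for name, net in plan.items():
    # Subtract the network and broadcast addresses to get usable hosts.
    print(name, net, "usable hosts:", net.num_addresses - 2)
```

Because every range comes out of the same flat prefix, the internal routers can route between them without NAT, which matches the "flat address space, NAT only at the egress" layout described above.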
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? [not found] ` <CAA93jw64KjW-JjLxB3i_ZK348NCyJYSQACFO34MaUsBBWyZ+pA@mail.gmail.com> 2015-01-25 7:59 ` Dave Taht @ 2015-01-25 9:39 ` David Lang 2015-01-25 15:03 ` Chuck Anderson 1 sibling, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-25 9:39 UTC (permalink / raw) To: Dave Taht; +Cc: cerowrt-devel On Sun, 25 Jan 2015, Dave Taht wrote: > I want to make clear that I support dlang's design in the abstract... and > am just arguing because it is a slow day. I welcome challenges to the design, it's how I improve things :-) > On Sat, Jan 24, 2015 at 10:44 PM, David Lang <david@lang.hm> wrote: >> On Sat, 24 Jan 2015, Dave Taht wrote: >> to clarify, the chain of comments was 1. instead of bridging I should route 2. network manager would preserve the IPv4 address to prevent breaking established connections. I was explaining how that can't work. If you are moving between different networks, each routed independently, they either need to have different address ranges (in which case the old IP just won't work), or they would each need to NAT to get to the outside (in which case the IP may stay the same, but the connections will break since the new router wouldn't have the NAT entries for the existing connections) > Hmm? The first thing I ever do to a router is renumber it to a unique IP > address range, and rename the subnet in dns to something unique. The 3 sed > lines for this are on a cerowrt web page somewhere. Adding ipv6 statically is > a pita, but doable with care and a uci script, and mildly more doable as hnetd > matures. > > I run local dns services on each in the hope that at least some will be > cached, and a local dhcp server to serve addresses out of that range. I > turn off dhcp default route fetching on each routers external interface and > use babel instead to find the right route(s) out of the system. 
> > On the NAT front, there is no nat on the internal routers, just a flat > address space (172.20.0.0/14 in my case). I push all the nat to the main > egress gateway(s), and in a case like yours would probably use multiple > external IPs and dnat rather than masquarade the entire subnet on one to > free up port space. You rapidly run out of ports in a natted evironment > with that many users. I've had to turn down NAT timeouts for udp in > particular to truly unreasonable levels otherwise (20 seconds in some cases) hmm, we haven't seen anything like this, but it could be a problem we haven't noticed because we haven't been looking for it. > Doing this I can get a quick status on what is up with "ip route", and by > monitoring the activity on each ip range, see if traffic is actually being > passed, a failure of a given gateway fails over to another, and so on. > There's a couple snmp hacks to do things like monitor active leases, and > smokeping/mrtg to access other stats. There's a couple beagles that are on > wifi that I ping on some APs. The beagles have not been very reliable for > me, so they switch on and off with digiloggers gear when they fail a local > ping. In fact the main logging beagle failed entirely the other month, sigh. > > I use the ad-hoc links on cerowrt as backups (if they lose ethernet > connectivity) and extenders (if there is no ethernet connectivity), and (as > I have 5 different comcast exit nodes spread throughout the network), use > babel-pinger on each to see if they are up, and insert default routes into > the mix that are automatically the shortest "distance" between the node and > exit gateway. If one gw goes down (usually) all the traffic ends up > switching to the next nearest default gateway switching over in 16 seconds > or so, breaking all the nat associations for the net they were on (sigh), > as well as ipv6 native stuff, but it's happened so often without me > noticing it that it's nice not to worry. 
> > (I have a mostly failed attempt in play for doing better with ipv6 and > hnetd on a couple of exit nodes, but that isn't solid enough to deploy as > yet, so it's only sort of working in the yurtlab. I really wish I could buy > PI space for ipv6 somehow) > > (I have been fiddling with dns anycast to try to get more redundancy on the > main dns gateways. That works pretty good) > > Now, your method is simpler! (although mine is mostly scripted) I imagine > you bridge everything on a vlan, and use a central dhcp/dns server to serve > up dhcp across (say) a 10.0.0.0/16 subnet. And by blocking local > multicast/broadcast, in particular, this scales across the 3k user > population. You've got a critical single point of failure in your gateway, > but at least that's only one, and I imagine you have that duplicated. I have two wifi vlans, one for 5GHz (ESSID SCALE), and one for 2.4GHz (ESSID SCALE-slow, no speed limits, but it does a great job of encouraging everyone who can to use 5GHz :-) ) There is a central DHCP server and firewall that allocates addresses across a /17 for each of the two networks. We don't set up active failover, but we have a spare box that we can put in if needed. The APs don't have any IP addresses on either wireless network. They have an IP on a different VLAN that's used for management only. Makes it a bit harder for any attackers to do anything to them. Remember, we only need it to work for a few days at a time. > (In contrast my network is always broken somewhere, but unless two critical > nodes break, it's pretty redundant and loss is confined to a single AP - > my biggest problem is that I need to upgrade the firmware on about half the > network - which involves climbing trees - and my plan was to deploy hnetd > last year so I could roll out ipv6) > > How do you deal with a dead AP that is not actually connecting with traffic? 
Nagios type monitoring to detect that the AP isn't reachable on the wired network and we send a runner to find out what's happening. About three years ago we had a lot of problems with people unplugging the APs for some reason. >> For the normal user that we are trying to support at a conference, it's a >> win. >> >> I'll note that we also block streaming sites (which has the side effect of >> blocking some useful sites that share the same IPs, Amazon for example) to >> help make things better for everyone else, even at the cost of limiting what >> some people are able to do. Bandwidth is limited compared to the number of >> people we have, and we have to make choices. > > Blocking ads is also effective. We use DNS to block things like this (or actually redirect the DNS to point to a server that serves an image saying that they are being blocked by SCaLE), and then we block port 53 to the outside to force people to use our DNS servers. Somewhat heavy handed, but it works. >>> Will you attempt to deploy ipv6? >> >> >> We have been offering IPv6 routable addresses for a few years. > > How many do you get and from whom? I don't remember at the moment. >>> I am of course interested in how fq_codel performs on your ISP link, and >>> are you planning on running it for your wifi? >> >> >> I'm running OpenWRT on the APs but haven't done anything in particular to >> activate it. > > fq_codel is on by default in Barrier breaker and later on all interfaces. I > note that it doesn't scale anywhere near as we would like under contention > but that work is only beginning in chaos calmer. A thought I've had in an > environment such as yours would be to rate limit each AP's ingress/egress > ethernet interface to, say, 20mbits, thus pushing all the potential bloat > to sqm on ethernet and out of the wifi (which would generally run faster). > Might even force uploads from the users lower, also (say 10mbit). Might > not, and just rely on people retaining low expectations. 
:) > > Was it on openwrt last year? Yes, most of what I did on the wireless side is in the paper at https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david_wireless The first year I did the network I had a total of one month to plan and buy APs, so I was running stock firmware; the second year I used DD-WRT and was very unhappy with it. I've been running OpenWRT since. >> I'll check what we have on the firewall (a fairly up to date >> Debian build) > > fq_codel has been a part of that for a long time. > > I'd port over the sqm-scripts and use those, it's only a 1 line change. > >> What's the best way to monitor the queues? > > On each router? > > I tend to use pdsh a lot, setting up a /etc/genders file for them all so I > can do a > > pdsh tc qdisc show dev wlan0 # or uptime or cat /etc/dhcp.leases | wc -l or > whatever > > Been meaning to get around to something that used snmp instead for a while. I'm gathering info on each AP about the number of users currently connected and the bandwidth used on all ports. I also have a central log from all APs which shows the MAC addresses as they associate with each AP. So collecting the data to one place is the easy part; what I don't know is what I need to gather from where with what commands. Any suggestions for this are very welcome. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
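On the "what to gather, with what commands" question above, one possible starting point is the statistics form of the tc command quoted in the thread (`tc -s qdisc show dev <iface>`), with the drop, backlog and ECN-mark counters pulled out of the text on the central collector. The following is a minimal sketch under stated assumptions: the SAMPLE text is illustrative and was not captured from a real AP, the field layout is assumed to match common iproute2 fq_codel output and may differ between versions, and `parse_tc_stats` is a hypothetical helper name.

```python
import re

# Illustrative output only; not taken from a real access point.
SAMPLE = """\
qdisc fq_codel 0: dev wlan0 root refcnt 2 limit 1024p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 123456 bytes 789 pkt (dropped 12, overlimits 0 requeues 3)
 backlog 4542b 3p requeues 3
  maxpacket 1514 drop_overlimit 0 new_flow_count 40 ecn_mark 7
  new_flows_len 0 old_flows_len 1
"""

# One regex per counter of interest. Backlog values printed with a unit
# suffix (for example "12Kb") are not handled here and are simply skipped.
FIELDS = {
    "sent_bytes": r"Sent (\d+) bytes",
    "sent_pkts": r"bytes (\d+) pkt",
    "dropped": r"dropped (\d+)",
    "backlog_bytes": r"backlog (\d+)b",
    "backlog_pkts": r"backlog \d+b (\d+)p",
    "ecn_mark": r"ecn_mark (\d+)",
}

def parse_tc_stats(text):
    """Extract integer counters from the stats of a single qdisc."""
    stats = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text)
        if match:
            stats[name] = int(match.group(1))
    return stats

stats = parse_tc_stats(SAMPLE)
print(stats)
```

Sampling these counters periodically per AP and logging the deltas next to the existing per-AP user counts would show which APs are dropping or marking packets under load. The parser only looks at the first qdisc in the text, so multi-queue interfaces would need the output split per qdisc first.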
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 9:39 ` David Lang @ 2015-01-25 15:03 ` Chuck Anderson 0 siblings, 0 replies; 43+ messages in thread From: Chuck Anderson @ 2015-01-25 15:03 UTC (permalink / raw) To: cerowrt-devel On Sun, Jan 25, 2015 at 01:39:32AM -0800, David Lang wrote: > On Sun, 25 Jan 2015, Dave Taht wrote: > > >I want to make clear that I support dlang's design in the abstract... and > >am just arguing because it is a slow day. > > I welcome challenges to the design, it's how I improve things :-) > > >On Sat, Jan 24, 2015 at 10:44 PM, David Lang <david@lang.hm> wrote: > >>On Sat, 24 Jan 2015, Dave Taht wrote: > >> > > to clarify, the chain of comments was > > 1. instead of bridging I should route > > 2. network manager would preserve the IPv4 address to prevent > breaking established connections. > > I was explaining how that can't work. If you are moving between > different networks, each routed independently, they either need to > have different address ranges (in which case the old IP just won't > work), or they would each need to NAT to get to the outside (in > which case the IP may stay the same, but the connections will break > since the new router wouldn't have the NAT entries for the existing > connections) To keep your IP when roaming: 3. The old school way: use mobile IP or some other tunneling mechanism (or VPN) so you can keep your same IP. 4. Use a "virtual subnet" model similar to: https://tools.ietf.org/html/draft-ietf-l3vpn-virtual-subnet-03 The draft is focused on data centers and VM migration, but the problem is the same with client migration/mobility. I would argue that it is even easier to "discover" the location of a client with Wi-Fi because of the association/authentication handshake with the AP rather than relying on a Gratuitous ARP/ND or LLDP, VSI, etc. 5. 
Use LISP: http://en.wikipedia.org/wiki/Locator/Identifier_Separation_Protocol http://lispmob.org/ (supported on OpenWRT) Has anyone played with this? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 4:35 ` David Lang 2015-01-25 5:02 ` Dave Taht @ 2015-01-25 20:17 ` dpreed 2015-01-25 23:21 ` Aaron Wood 2015-01-25 23:57 ` David Lang 1 sibling, 2 replies; 43+ messages in thread From: dpreed @ 2015-01-25 20:17 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 2970 bytes --] Disagree. See below. On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> said: > On Sat, 24 Jan 2015, dpreed@reed.com wrote: > > A side comment, meant to discourage continuing to bridge rather than route. > > > > There's no reason that the AP's cannot have different IP addresses, but a > > common ESSID. Roaming between them would be like roaming among mesh subnets. > > Assuming you are securing your APs' air interfaces using encryption over the > > air, you are already re-authenticating as you move from AP to AP. So using > > routing rather than bridging is a good idea for all the reasons that routing > > rather than bridging is better for mesh. > > The problem with doing this is that all existing TCP connections will break when > you move from one AP to another and while some apps will quickly notice this and > establish new connections, there are many apps that will not and this will cause > noticable disruption to the user. > > Bridgeing allows the connections to remain intact. The wifi stack re-negotiates > the encryption, but the encapsulated IP packets don't change. There is no reason why one cannot set up an enterprise network to support roaming, yet maintaining the property that IP addresses don't change while roaming from AP to AP. Here's a simple concept, that amounts to moving what would be in the Ethernet bridging tables up to the IP layer. All addresses in the enterprise are assigned from a common prefix (XXX/16 in IPv4, perhaps). 
Routing in each access point is used to decide whether to send the packet on its LAN, or to reflect it to another LAN. A node's preferred location would be updated by the endpoint itself, sending its current location to its current access point (via ARP or some other protocol). The access point that hears of a new node that it can reach tells all the other access points that the node is attached to it. Delivery of a packet to a node is done by the access point that receives the packet by looking up the destination IP address in its local table, and sending it to the access point that currently has the destination IP address. This is far better than "bridging" at the Ethernet level from a functionality point of view - it is using routing, not bridging. Bridging at the Ethernet level uses Ethernet's STP feature, which doesn't work very well in collections of wireless LAN's (it is slow to recalculate when something moves, because it was designed for unplug/plug of actual cables, and moving the host from one physical location to another). IMO, Ethernet sometimes aspires to solve problems that are already well-solved in the Internet protocols. (for example the 802.11s mess which tries to do a mesh entirely in the Ethernet layer, and fails pretty miserably). Of course that's only my opinion, but I think it applies to overuse of bridging at the Ethernet layer when there are better approaches at the next layer up. [-- Attachment #2: Type: text/html, Size: 4368 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
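The per-host scheme described in the message above (each AP learns which hosts are attached to it, tells the other APs, and forwards by destination IP lookup) can be modelled in a few lines. This is a toy sketch, not an existing routing protocol or anyone's implementation: the `AccessPoint` class, the AP names and the 10.0.3.7 address are all made up for illustration, and the "tell all the other access points" step is reduced to a direct table update.

```python
class AccessPoint:
    """Toy model of one AP holding a per-host location table."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # the other APs reachable over the wired side
        self.location = {}   # host IP -> name of the AP it is attached to

    def host_attached(self, host_ip):
        """A host associated here: record it locally and tell every peer."""
        self.location[host_ip] = self.name
        for peer in self.peers:
            peer.location[host_ip] = self.name

    def next_hop(self, dst_ip):
        """Return 'local' for delivery on this AP's LAN, else the owning AP."""
        owner = self.location.get(dst_ip)
        if owner is None:
            return None  # unknown host; a real design needs a fallback
        return "local" if owner == self.name else owner

ap1, ap2 = AccessPoint("ap1"), AccessPoint("ap2")
ap1.peers, ap2.peers = [ap2], [ap1]

ap1.host_attached("10.0.3.7")        # host first associates with ap1
print(ap2.next_hop("10.0.3.7"))      # ap2 forwards toward ap1
ap2.host_attached("10.0.3.7")        # host roams to ap2 and keeps its IP
print(ap1.next_hop("10.0.3.7"))      # ap1 now forwards toward ap2
```

The table holds one entry per host, which is the same order of state a bridge keeps per MAC address; what the sketch leaves out is how updates are distributed reliably and how stale entries age out.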
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 20:17 ` dpreed @ 2015-01-25 23:21 ` Aaron Wood 2015-01-25 23:57 ` David Lang 1 sibling, 0 replies; 43+ messages in thread From: Aaron Wood @ 2015-01-25 23:21 UTC (permalink / raw) To: David Reed; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 1241 bytes --] On Sun, Jan 25, 2015 at 12:17 PM, <dpreed@reed.com> wrote: > There is no reason why one cannot set up an enterprise network to support > roaming, yet maintaining the property that IP addresses don't change while > roaming from AP to AP. Here's a simple concept, that amounts to moving > what would be in the Ethernet bridging tables up to the IP layer. > > > > All addresses in the enterprise are assigned from a common prefix (XXX/16 > in IPv4, perhaps). Routing in each access point is used to decide whether > to send the packet on its LAN, or to reflect it to another LAN. A node's > preferred location would be updated by the endpoint itself, sending its > current location to its current access point (via ARP or some other > protocol). The access point that hears of a new node that it can reach > tells all the other access points that the node is attached to it. > Delivery of a packet to a node is done by the access point that receives > the packet by looking up the destination IP address in its local table, and > sending it to the access point that currently has the destination IP > address. > I'm not familiar with routing protocols. Do any of the current ones do this, or is this an idea for a new protocol? -Aaron [-- Attachment #2: Type: text/html, Size: 1910 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 20:17 ` dpreed 2015-01-25 23:21 ` Aaron Wood @ 2015-01-25 23:57 ` David Lang 2015-01-26 1:51 ` dpreed 2015-01-26 4:25 ` Valdis.Kletnieks 1 sibling, 2 replies; 43+ messages in thread From: David Lang @ 2015-01-25 23:57 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel On Sun, 25 Jan 2015, dpreed@reed.com wrote: > Disagree. See below. > > > On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> said: > > > >> On Sat, 24 Jan 2015, dpreed@reed.com wrote: >> > A side comment, meant to discourage continuing to bridge rather than route. >> > >> > There's no reason that the AP's cannot have different IP addresses, but a >> > common ESSID. Roaming between them would be like roaming among mesh subnets. >> > Assuming you are securing your APs' air interfaces using encryption over the >> > air, you are already re-authenticating as you move from AP to AP. So using >> > routing rather than bridging is a good idea for all the reasons that routing >> > rather than bridging is better for mesh. >> >> The problem with doing this is that all existing TCP connections will break when >> you move from one AP to another and while some apps will quickly notice this and >> establish new connections, there are many apps that will not and this will cause >> noticable disruption to the user. >> >> Bridgeing allows the connections to remain intact. The wifi stack re-negotiates >> the encryption, but the encapsulated IP packets don't change. > > > There is no reason why one cannot set up an enterprise network to support > roaming, yet maintaining the property that IP addresses don't change while > roaming from AP to AP. Here's a simple concept, that amounts to moving what > would be in the Ethernet bridging tables up to the IP layer. > > All addresses in the enterprise are assigned from a common prefix (XXX/16 in > IPv4, perhaps). 
Routing in each access point is used to decide whether to > send the packet on its LAN, or to reflect it to another LAN. A node's > preferred location would be updated by the endpoint itself, sending its > current location to its current access point (via ARP or some other protocol). > The access point that hears of a new node that it can reach tells all the > other access points that the node is attached to it. Delivery of a packet to > a node is done by the access point that receives the packet by looking up the > destination IP address in its local table, and sending it to the access point > that currently has the destination IP address. > > This is far better than "bridging" at the Ethernet level from a functionality > point of view - it is using routing, not bridging. Bridging at the Ethernet > level uses Ethernet's STP feature, which doesn't work very well in collections > of wireless LAN's (it is slow to recalculate when something moves, because it > was designed for unplug/plug of actual cables, and moving the host from one > physical location to another). > > IMO, Ethernet sometimes aspires to solve problems that are already well-solved > in the Internet protocols. (for example the 802.11s mess which tries to do a > mesh entirely in the Ethernet layer, and fails pretty miserably). > > Of course that's only my opinion, but I think it applies to overuse of > bridging at the Ethernet layer when there are better approaches at the next > layer up. Unless you are going to have your routing tables handle every address in your network separately (and fix all the software that depends on broadcasts) you are going to have trouble trying to do this at the IP layer. The 'modern Enterprise' datacenter has lots of large machines that get sliced into multiple virtual machines. For redundancy purposes you want to have the machines used for a particular job to be spread across as many of these machines as possible, spread around your datacenter. 
Switches in this environment are becoming layer 2 routers. They are connected together with multiple links providing redundant paths around the network. This isn't being done with Spanning Tree because Spanning Tree only allows one path to exist at once, and that is inefficient and creates bottlenecks. As a result, they are now keeping all these links live at the same time and using least cost paths to route the layer 2 traffic across the switches. It's fair to argue that this is an abuse of layer 2, but changing the software that operates at higher layers is difficult, whereas making these changes at layer 2 is completely transparent to the higher layers, so using this layer 2 capability is pragmatically a far better choice. The Computer Scientist will cringe at the 'hacks' that this introduces, but there is far more progress made when new capabilities can be added in a way that's transparent to other layers of the stack than when it requires major changes to how things work. The software layer is the worst to try and force fundamental changes to. You would be horrified to learn how old some of the software is that's running major jobs at large companies. Even if the software is in continuous development, the age of the core software frequently shows. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 23:57 ` David Lang @ 2015-01-26 1:51 ` dpreed 2015-01-26 2:09 ` David Lang 2015-01-26 2:19 ` Dave Taht 2015-01-26 4:25 ` Valdis.Kletnieks 1 sibling, 2 replies; 43+ messages in thread From: dpreed @ 2015-01-26 1:51 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 5955 bytes --] If you are using Ethernet bridging, your Ethernet switches are doing exactly this at the Ethernet layer... they have large tables of MAC addresses that are known throughout the network, and for each MAC address in the Enterprise, they have the next hop destination. So IP routing tables, one IP address per destination in the Enterprise, would occupy no more space than do the Ethernet routing tables.... so any argument about space efficiency is mooted. This is why bridging is no better than routing - you have to solve the same problem at one layer or the other. The Ethernet layer's "solution" is actually very suboptimal, especially when roaming is going on. On Sunday, January 25, 2015 6:57pm, "David Lang" <david@lang.hm> said: > On Sun, 25 Jan 2015, dpreed@reed.com wrote: > > > Disagree. See below. > > > > > > On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> > said: > > > > > > > >> On Sat, 24 Jan 2015, dpreed@reed.com wrote: > >> > A side comment, meant to discourage continuing to bridge rather than > route. > >> > > >> > There's no reason that the AP's cannot have different IP addresses, > but a > >> > common ESSID. Roaming between them would be like roaming among mesh > subnets. > >> > Assuming you are securing your APs' air interfaces using encryption > over the > >> > air, you are already re-authenticating as you move from AP to AP. So > using > >> > routing rather than bridging is a good idea for all the reasons that > routing > >> > rather than bridging is better for mesh. 
> >> > >> The problem with doing this is that all existing TCP connections will > break when > >> you move from one AP to another and while some apps will quickly notice > this and > >> establish new connections, there are many apps that will not and this > will cause > >> noticable disruption to the user. > >> > >> Bridgeing allows the connections to remain intact. The wifi stack > re-negotiates > >> the encryption, but the encapsulated IP packets don't change. > > > > > > There is no reason why one cannot set up an enterprise network to support > > roaming, yet maintaining the property that IP addresses don't change while > > roaming from AP to AP. Here's a simple concept, that amounts to moving what > > would be in the Ethernet bridging tables up to the IP layer. > > > > All addresses in the enterprise are assigned from a common prefix (XXX/16 in > > IPv4, perhaps). Routing in each access point is used to decide whether to > > send the packet on its LAN, or to reflect it to another LAN. A node's > > preferred location would be updated by the endpoint itself, sending its > > current location to its current access point (via ARP or some other > protocol). > > The access point that hears of a new node that it can reach tells all the > > other access points that the node is attached to it. Delivery of a packet to > > a node is done by the access point that receives the packet by looking up the > > destination IP address in its local table, and sending it to the access point > > that currently has the destination IP address. > > > > This is far better than "bridging" at the Ethernet level from a functionality > > point of view - it is using routing, not bridging. Bridging at the Ethernet > > level uses Ethernet's STP feature, which doesn't work very well in > collections > > of wireless LAN's (it is slow to recalculate when something moves, because it > > was designed for unplug/plug of actual cables, and moving the host from one > > physical location to another). 
> > > > IMO, Ethernet sometimes aspires to solve problems that are already > well-solved > > in the Internet protocols. (for example the 802.11s mess which tries to do a > > mesh entirely in the Ethernet layer, and fails pretty miserably). > > > > Of course that's only my opinion, but I think it applies to overuse of > > bridging at the Ethernet layer when there are better approaches at the next > > layer up. > > Unless you are going to have your routing tables handle every address in your > network separately (and fix all the software that depends on broadcasts) you are > going to have trouble trying to do this at the IP layer. > > The 'modern Enterprise' datacenter has lots of large machines that get sliced > into multiple virtual machines. For redundancy purposes you want to have the > machines used for a particular job to be spread across as many of these machines > as possible, spread around your datacenter. > > Switches in this environment are becoming layer 2 routers. They are connected > together with multiple links providing redundant paths around the network. This > isn't being done with Spanning Tree because Spanning Tree only allows one path > to exist at once, and that is inefficient and creates bottlenecks. As a result, > they are now keeping all these links live at the same time and using least cost > paths to route the layer 2 traffic across the switches. > > It's fair to argue that this is abuse of layer 2, but the difficulties in having > to change the software operating at higher layers vs the fact that making these > changes at the layer 2 level is completely transparent to the higher layers make > it so that using this layer 2 capability is pragmantically a far better choice. > > The Computer Scientist will cringe at the 'hacks' that this introduces, but > there is far more progress made when new capabilities can be added in a way > that's transparent to other layers of the stack then when it requires major > changes to how things work. 
> > The software layer is the worst to try and force fundamental changes to. You > would be horrified to learn how old some of the software is that's running major > jobs at large companies. Even if the software is in continuous development, the > age of the core software frequently shows. > > David Lang > [-- Attachment #2: Type: text/html, Size: 7783 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
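For readers following the thread, the per-host scheme dpreed describes in the quoted text (each access point keeps a table of host IP to serving AP, and announces a host to the other APs when it attaches) can be sketched roughly as below. This is only an illustration under stated assumptions: the `AccessPoint` class, the `mesh` helper, the AP names and the 10.0.0.5 address are all hypothetical, and the "announcement" is a direct in-memory update rather than any real protocol.

```python
class AccessPoint:
    """Hypothetical AP holding a per-host location table (host IP -> AP name)."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # other access points in the same enterprise
        self.location = {}   # host IP -> name of the AP currently serving it

    def associate(self, host_ip):
        # The host attached here: record it locally, then tell every peer.
        self.location[host_ip] = self.name
        for peer in self.peers:
            peer.location[host_ip] = self.name

    def next_hop(self, dst_ip):
        # Hash-table lookup on the destination IP address.
        owner = self.location.get(dst_ip)
        if owner is None:
            return "unknown"
        return "local" if owner == self.name else owner


def mesh(*names):
    """Build fully connected APs; returns a dict of name -> AccessPoint."""
    aps = [AccessPoint(n) for n in names]
    for ap in aps:
        ap.peers = [p for p in aps if p is not ap]
    return {ap.name: ap for ap in aps}


aps = mesh("ap1", "ap2", "ap3")
aps["ap1"].associate("10.0.0.5")   # host joins at ap1
aps["ap2"].associate("10.0.0.5")   # host roams to ap2, keeping its IP address
print(aps["ap3"].next_hop("10.0.0.5"))
```

A roam here is a single overwrite of one table entry on each AP, which is the "delete/insert" cost argued about later in the thread. The sketch says nothing about how the host picks a gateway, which is the objection David Lang raises.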
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 1:51 ` dpreed @ 2015-01-26 2:09 ` David Lang 2015-01-26 4:33 ` Valdis.Kletnieks 2015-01-26 2:19 ` Dave Taht 1 sibling, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-26 2:09 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel On Sun, 25 Jan 2015, dpreed@reed.com wrote: > If you are using Ethernet bridging, your Ethernet switches are doing exactly > this at the Ethernet layer... they have large tables of MAC addresses that are > known throughout the network, and for each MAC address in the Enterprise, they > have the next hop destination. > > So IP routing tables, one IP address per destination in the Enterprise, would > occupy no more space than do the Ethernet routing tables.... so any argument > about space efficiency is mooted. The difference is that the switches and their protocols have been designed from the beginning for this scale of operation, IP routing protocols are designed for much fewer endpoints to track. > This is why bridging is no better than routing - you have to solve the same > problem at one layer or the other. The Ethernet layer's "solution" is actually > very suboptimal, especially when roaming is going on. well, the fact that doing it at the ethernet layer rather than the IP layer avoids the need to change your software, that's a significant win. Other than 'tradition' or "layering violation', why is it any better to solve this at the IP layer than the MAC layer? David Lang > > On Sunday, January 25, 2015 6:57pm, "David Lang" <david@lang.hm> said: > > > >> On Sun, 25 Jan 2015, dpreed@reed.com wrote: >> >> > Disagree. See below. >> > >> > >> > On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> >> said: >> > >> > >> > >> >> On Sat, 24 Jan 2015, dpreed@reed.com wrote: >> >> > A side comment, meant to discourage continuing to bridge rather than >> route. 
>> >> > >> >> > There's no reason that the AP's cannot have different IP addresses, >> but a >> >> > common ESSID. Roaming between them would be like roaming among mesh >> subnets. >> >> > Assuming you are securing your APs' air interfaces using encryption >> over the >> >> > air, you are already re-authenticating as you move from AP to AP. So >> using >> >> > routing rather than bridging is a good idea for all the reasons that >> routing >> >> > rather than bridging is better for mesh. >> >> >> >> The problem with doing this is that all existing TCP connections will >> break when >> >> you move from one AP to another and while some apps will quickly notice >> this and >> >> establish new connections, there are many apps that will not and this >> will cause >> >> noticable disruption to the user. >> >> >> >> Bridgeing allows the connections to remain intact. The wifi stack >> re-negotiates >> >> the encryption, but the encapsulated IP packets don't change. >> > >> > >> > There is no reason why one cannot set up an enterprise network to support >> > roaming, yet maintaining the property that IP addresses don't change while >> > roaming from AP to AP. Here's a simple concept, that amounts to moving what >> > would be in the Ethernet bridging tables up to the IP layer. >> > >> > All addresses in the enterprise are assigned from a common prefix (XXX/16 in >> > IPv4, perhaps). Routing in each access point is used to decide whether to >> > send the packet on its LAN, or to reflect it to another LAN. A node's >> > preferred location would be updated by the endpoint itself, sending its >> > current location to its current access point (via ARP or some other >> protocol). >> > The access point that hears of a new node that it can reach tells all the >> > other access points that the node is attached to it. 
Delivery of a packet to >> > a node is done by the access point that receives the packet by looking up the >> > destination IP address in its local table, and sending it to the access point >> > that currently has the destination IP address. >> > >> > This is far better than "bridging" at the Ethernet level from a functionality >> > point of view - it is using routing, not bridging. Bridging at the Ethernet >> > level uses Ethernet's STP feature, which doesn't work very well in >> collections >> > of wireless LAN's (it is slow to recalculate when something moves, because it >> > was designed for unplug/plug of actual cables, and moving the host from one >> > physical location to another). >> > >> > IMO, Ethernet sometimes aspires to solve problems that are already >> well-solved >> > in the Internet protocols. (for example the 802.11s mess which tries to do a >> > mesh entirely in the Ethernet layer, and fails pretty miserably). >> > >> > Of course that's only my opinion, but I think it applies to overuse of >> > bridging at the Ethernet layer when there are better approaches at the next >> > layer up. >> >> Unless you are going to have your routing tables handle every address in your >> network separately (and fix all the software that depends on broadcasts) you are >> going to have trouble trying to do this at the IP layer. >> >> The 'modern Enterprise' datacenter has lots of large machines that get sliced >> into multiple virtual machines. For redundancy purposes you want to have the >> machines used for a particular job to be spread across as many of these machines >> as possible, spread around your datacenter. >> >> Switches in this environment are becoming layer 2 routers. They are connected >> together with multiple links providing redundant paths around the network. This >> isn't being done with Spanning Tree because Spanning Tree only allows one path >> to exist at once, and that is inefficient and creates bottlenecks. 
As a result, >> they are now keeping all these links live at the same time and using least cost >> paths to route the layer 2 traffic across the switches. >> >> It's fair to argue that this is abuse of layer 2, but the difficulties in having >> to change the software operating at higher layers vs the fact that making these >> changes at the layer 2 level is completely transparent to the higher layers make >> it so that using this layer 2 capability is pragmantically a far better choice. >> >> The Computer Scientist will cringe at the 'hacks' that this introduces, but >> there is far more progress made when new capabilities can be added in a way >> that's transparent to other layers of the stack then when it requires major >> changes to how things work. >> >> The software layer is the worst to try and force fundamental changes to. You >> would be horrified to learn how old some of the software is that's running major >> jobs at large companies. Even if the software is in continuous development, the >> age of the core software frequently shows. >> >> David Lang >> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 2:09 ` David Lang @ 2015-01-26 4:33 ` Valdis.Kletnieks 2015-01-26 4:44 ` David Lang 0 siblings, 1 reply; 43+ messages in thread From: Valdis.Kletnieks @ 2015-01-26 4:33 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 565 bytes --] On Sun, 25 Jan 2015 18:09:59 -0800, David Lang said: > The difference is that the switches and their protocols have been designed from > the beginning for this scale of operation, IP routing protocols are designed for > much fewer endpoints to track. Anybody who's carrying a full routing table is swallowing on the order of 528,833 routes (as of Friday's "weekly routing table report" posted to NANOG). Pretty much everybody and their pet llama accepts full tables these days. Do you know anybody who's doing that many entries in an L2 Ethernet broadcast domain? [-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 4:33 ` Valdis.Kletnieks @ 2015-01-26 4:44 ` David Lang 2015-01-27 0:14 ` dpreed 0 siblings, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-26 4:44 UTC (permalink / raw) To: Valdis.Kletnieks; +Cc: cerowrt-devel On Sun, 25 Jan 2015, Valdis.Kletnieks@vt.edu wrote: > On Sun, 25 Jan 2015 18:09:59 -0800, David Lang said: >> The difference is that the switches and their protocols have been designed from >> the beginning for this scale of operation, IP routing protocols are designed for >> much fewer endpoints to track. > > Anybody who's carrying a full routing table was swallowing on the order > of 528,833 routes (as of Friday's "weekly routing table report" posted > to NANOG). Pretty much everybody and their pet llama accepts full tables > these days. > > You know anybody who's doing that many entries in an L2 Ethernet broadcast > domain? The full IP routing tables are something that you normally only have to deal with in a few devices at the perimeter of your network. What is being talked about here is routing each /32 IP address individually throughout your network, so that any IP address can be connected anywhere and 'just work' as far as the client on that IP is concerned. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
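To make the "/32 per host" idea concrete: with longest-prefix matching, a host route for a single address wins over the covering enterprise prefix, which in turn wins over the default route. The sketch below uses Python's standard `ipaddress` module; the prefixes, the next-hop labels ("border", "core", "ap7") and the linear scan are illustrative assumptions only, not how a production router stores or searches its FIB.

```python
import ipaddress


def lookup(table, dst):
    """Return the next hop of the most specific prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    best = None  # (network, next_hop) of the longest match seen so far
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical table: default route, one enterprise aggregate, one host route.
table = {
    "0.0.0.0/0": "border",
    "10.1.0.0/16": "core",
    "10.1.2.3/32": "ap7",   # the roaming host, tracked individually
}

for dst in ("10.1.2.3", "10.1.2.4", "192.0.2.1"):
    print(dst, "->", lookup(table, dst))
```

Routing every host individually means one such /32 entry per attached device on every router inside the enterprise, which is the table-size cost being debated in the surrounding messages.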
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 4:44 ` David Lang @ 2015-01-27 0:14 ` dpreed 2015-01-27 0:23 ` David Lang 0 siblings, 1 reply; 43+ messages in thread From: dpreed @ 2015-01-27 0:14 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 1237 bytes --] And having every /48 MAC address in your enterprise tracked is cheaper? On Sunday, January 25, 2015 11:44pm, "David Lang" <david@lang.hm> said: > On Sun, 25 Jan 2015, Valdis.Kletnieks@vt.edu wrote: > > > On Sun, 25 Jan 2015 18:09:59 -0800, David Lang said: > >> The difference is that the switches and their protocols have been > designed from > >> the beginning for this scale of operation, IP routing protocols are > designed for > >> much fewer endpoints to track. > > > > Anybody who's carrying a full routing table was swallowing on the order > > of 528,833 routes (as of Friday's "weekly routing table report" posted > > to NANOG). Pretty much everybody and their pet llama accepts full tables > > these days. > > > > You know anybody who's doing that many entries in an L2 Ethernet broadcast > > domain? > > The full IP routing tables are something that you normally only have to deal > with in a few devices at the perimeter of your network. > > What is being talked about here is routing each /32 IP address individually > throughout your network so that any IP address can be connected anywhere and > have it 'just work' as far as the client on that IP is concerned. > > David Lang > [-- Attachment #2: Type: text/html, Size: 1906 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-27 0:14 ` dpreed @ 2015-01-27 0:23 ` David Lang 0 siblings, 0 replies; 43+ messages in thread From: David Lang @ 2015-01-27 0:23 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel It doesn't get mixed in with tracking Internet routes as well. On Mon, 26 Jan 2015, dpreed@reed.com wrote: > And having every /48 MAC address in your enterprise tracked is cheaper? > > > On Sunday, January 25, 2015 11:44pm, "David Lang" <david@lang.hm> said: > > > >> On Sun, 25 Jan 2015, Valdis.Kletnieks@vt.edu wrote: >> >> > On Sun, 25 Jan 2015 18:09:59 -0800, David Lang said: >> >> The difference is that the switches and their protocols have been >> designed from >> >> the beginning for this scale of operation, IP routing protocols are >> designed for >> >> much fewer endpoints to track. >> > >> > Anybody who's carrying a full routing table was swallowing on the order >> > of 528,833 routes (as of Friday's "weekly routing table report" posted >> > to NANOG). Pretty much everybody and their pet llama accepts full tables >> > these days. >> > >> > You know anybody who's doing that many entries in an L2 Ethernet broadcast >> > domain? >> >> The full IP routing tables are something that you normally only have to deal >> with in a few devices at the perimeter of your network. >> >> What is being talked about here is routing each /32 IP address individually >> throughout your network so that any IP address can be connected anywhere and >> have it 'just work' as far as the client on that IP is concerned. >> >> David Lang >> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 1:51 ` dpreed 2015-01-26 2:09 ` David Lang @ 2015-01-26 2:19 ` Dave Taht 2015-01-26 2:43 ` David Lang 1 sibling, 1 reply; 43+ messages in thread From: Dave Taht @ 2015-01-26 2:19 UTC (permalink / raw) To: David Reed; +Cc: Alexander Duyck, cerowrt-devel Two notes: 1) Switches all have a very fast (t)CAM based lookup for mac addresses and vlan tags. The typical size for these is around 4096 entries per vlan, although the next generation VXLAN standard will push this to a lot more bits. Routing, on the other hand, requires a lot more storage, that is difficult to search in linear time, and worse, requires that a layer three retain tables for ipv4, ipv6, and "other". Furthermore it requires that every device that needs it participate in the routing protocol - of which there are dozens - where spanning tree only has a few variants and improvements. I don't know the extent to which 2) I am no fan of the various things I see being built on top of VXLAN (see conga) - but it is a prevailing trend. I am a partial advocate of moving all the routing support to the servers, and letting the switches remain pretty dumb. There has been a lot of good work in this area in Linux of late, as alexander has successfully cut the cost of a a routing lookup that falls through to default from several hundred ns to, like 16ns on the high end intel chips. I look forward to testing that on the next round of cerowrt. This is still a great deal slower than a switch can find the right mac address (well, depending on how you measure it). And still needs a commonly agreed upon routing protocol to fill the fib tables. Most routing protocols do not fail over very quickly either, with typical timeouts measured in 10s of seconds. 
On my very long todo list would be one day trying to get babel to fail over or otherwise switch ideal routes in under 40ms in a 10gigE environment - and even that is too slow, and going faster would require changing the babel protocol, which has a minimum time representation of 10ms. It would be an interesting research project for someone to attempt high speed routing in a data center virtual machine environment, instead of bridging. To your roaming point, yes this is certainly one place where migrating bridged vms across machines breaks down, and yet more and more vm layers are doing it. I would certainly prefer routing in this case. On Sun, Jan 25, 2015 at 5:51 PM, <dpreed@reed.com> wrote: > If you are using Ethernet bridging, your Ethernet switches are doing exactly > this at the Ethernet layer... they have large tables of MAC addresses that > are known throughout the network, and for each MAC address in the > Enterprise, they have the next hop destination. > > > > So IP routing tables, one IP address per destination in the Enterprise, > would occupy no more space than do the Ethernet routing tables.... so any > argument about space efficiency is mooted. > > > > This is why bridging is no better than routing - you have to solve the same > problem at one layer or the other. The Ethernet layer's "solution" is > actually very suboptimal, especially when roaming is going on. > > > > On Sunday, January 25, 2015 6:57pm, "David Lang" <david@lang.hm> said: > >> On Sun, 25 Jan 2015, dpreed@reed.com wrote: >> >> > Disagree. See below. >> > >> > >> > On Saturday, January 24, 2015 11:35pm, "David Lang" <david@lang.hm> >> said: >> > >> > >> > >> >> On Sat, 24 Jan 2015, dpreed@reed.com wrote: >> >> > A side comment, meant to discourage continuing to bridge rather than >> route. >> >> > >> >> > There's no reason that the AP's cannot have different IP addresses, >> but a >> >> > common ESSID. Roaming between them would be like roaming among mesh >> subnets. 
>> >> > Assuming you are securing your APs' air interfaces using encryption >> over the >> >> > air, you are already re-authenticating as you move from AP to AP. So >> using >> >> > routing rather than bridging is a good idea for all the reasons that >> routing >> >> > rather than bridging is better for mesh. >> >> >> >> The problem with doing this is that all existing TCP connections will >> break when >> >> you move from one AP to another and while some apps will quickly notice >> this and >> >> establish new connections, there are many apps that will not and this >> will cause >> >> noticable disruption to the user. >> >> >> >> Bridgeing allows the connections to remain intact. The wifi stack >> re-negotiates >> >> the encryption, but the encapsulated IP packets don't change. >> > >> > >> > There is no reason why one cannot set up an enterprise network to >> > support >> > roaming, yet maintaining the property that IP addresses don't change >> > while >> > roaming from AP to AP. Here's a simple concept, that amounts to moving >> > what >> > would be in the Ethernet bridging tables up to the IP layer. >> > >> > All addresses in the enterprise are assigned from a common prefix >> > (XXX/16 in >> > IPv4, perhaps). Routing in each access point is used to decide whether >> > to >> > send the packet on its LAN, or to reflect it to another LAN. A node's >> > preferred location would be updated by the endpoint itself, sending its >> > current location to its current access point (via ARP or some other >> protocol). >> > The access point that hears of a new node that it can reach tells all >> > the >> > other access points that the node is attached to it. Delivery of a >> > packet to >> > a node is done by the access point that receives the packet by looking >> > up the >> > destination IP address in its local table, and sending it to the access >> > point >> > that currently has the destination IP address. 
>> > >> > This is far better than "bridging" at the Ethernet level from a >> > functionality >> > point of view - it is using routing, not bridging. Bridging at the >> > Ethernet >> > level uses Ethernet's STP feature, which doesn't work very well in >> collections >> > of wireless LAN's (it is slow to recalculate when something moves, >> > because it >> > was designed for unplug/plug of actual cables, and moving the host from >> > one >> > physical location to another). >> > >> > IMO, Ethernet sometimes aspires to solve problems that are already >> well-solved >> > in the Internet protocols. (for example the 802.11s mess which tries to >> > do a >> > mesh entirely in the Ethernet layer, and fails pretty miserably). >> > >> > Of course that's only my opinion, but I think it applies to overuse of >> > bridging at the Ethernet layer when there are better approaches at the >> > next >> > layer up. >> >> Unless you are going to have your routing tables handle every address in >> your >> network separately (and fix all the software that depends on broadcasts) >> you are >> going to have trouble trying to do this at the IP layer. >> >> The 'modern Enterprise' datacenter has lots of large machines that get >> sliced >> into multiple virtual machines. For redundancy purposes you want to have >> the >> machines used for a particular job to be spread across as many of these >> machines >> as possible, spread around your datacenter. >> >> Switches in this environment are becoming layer 2 routers. They are >> connected >> together with multiple links providing redundant paths around the network. >> This >> isn't being done with Spanning Tree because Spanning Tree only allows one >> path >> to exist at once, and that is inefficient and creates bottlenecks. As a >> result, >> they are now keeping all these links live at the same time and using least >> cost >> paths to route the layer 2 traffic across the switches. 
>> >> It's fair to argue that this is abuse of layer 2, but the difficulties in >> having >> to change the software operating at higher layers vs the fact that making >> these >> changes at the layer 2 level is completely transparent to the higher >> layers make >> it so that using this layer 2 capability is pragmantically a far better >> choice. >> >> The Computer Scientist will cringe at the 'hacks' that this introduces, >> but >> there is far more progress made when new capabilities can be added in a >> way >> that's transparent to other layers of the stack then when it requires >> major >> changes to how things work. >> >> The software layer is the worst to try and force fundamental changes to. >> You >> would be horrified to learn how old some of the software is that's running >> major >> jobs at large companies. Even if the software is in continuous >> development, the >> age of the core software frequently shows. >> >> David Lang >> > > > _______________________________________________ > Cerowrt-devel mailing list > Cerowrt-devel@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cerowrt-devel > -- Dave Täht thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 2:19 ` Dave Taht @ 2015-01-26 2:43 ` David Lang 2015-01-26 2:58 ` Dave Taht 0 siblings, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-26 2:43 UTC (permalink / raw) To: Dave Taht; +Cc: Alexander Duyck, cerowrt-devel On Sun, 25 Jan 2015, Dave Taht wrote: > To your roaming point, yes this is certainly one place where migrating > bridged vms across machines breaks down, and yet more and more vm > layers are doing it. I would certainly prefer routing in this case. What's the difference between "roaming" and moving a VM from one place in the network to another? As far as layer 2 vs layer 3 goes: if you try to operate at layer 3, you are going to need quite a bit of smarts in the endpoint, even if it's only connected via a single link. If you think about it, even if your network routing tables list every machine in your environment individually, you still have the problem of which gateway the endpoint uses. It would have to change every time the endpoint moved. Since DHCP doesn't update frequently enough to be transparent, you would need to have each endpoint running a routing protocol. This can work for individual hobbyists, but not when you need to support random devices (how would you configure an iPhone to support this?) Letting the layer 2 equipment deal with the traffic within the building and invoking layer 3 to go outside the building (or to a different security domain) makes a lot of sense, even if that means that layer 2 within a building looks very similar to what layer 3 used to look like around a city. Back to the topic of wifi: I'm not aware of any APs that participate in the switch protocols at this level. I also don't know of any reasonably priced switches that can do anything smarter than plain spanning tree when connected through multiple paths (I'd love to learn otherwise). David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 2:43 ` David Lang @ 2015-01-26 2:58 ` Dave Taht 2015-01-26 3:17 ` dpreed 2015-01-26 3:19 ` David Lang 0 siblings, 2 replies; 43+ messages in thread From: Dave Taht @ 2015-01-26 2:58 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: > On Sun, 25 Jan 2015, Dave Taht wrote: > >> To your roaming point, yes this is certainly one place where migrating >> bridged vms across machines breaks down, and yet more and more vm >> layers are doing it. I would certainly prefer routing in this case. > > > What's the difference between "roaming" and moving a VM from one place in > the network to another? I think most people think of "roaming" as moving fairly rapidly from one piece of edge connectivity to another, and moving a vm is a great deal more permanent operation. > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, you are > going to have quite a bit of smarts in the endpoint. Even if it's only > connected vi a single link. If you think about it, even if your network > routing tables list every machine in our environment individually, you still > have a problem of what gateway the endpoint uses. It would have to change > every time it moved. Since DHCP doesn't update frequently enough to be > transparent, you would need to have each endpoint running a routing > protocol. Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the routing protocol to supply that. In terms of each vm running a routing protocol, well, no, I would rely on the underlying bare metal OS to be doing that, supplying the FIB tables to the overlying vms, if they need it, but otherwise the vms just see a "default" route and don't bother with it. They do need to inform the bare metal OS (better term for this please? hypervisor?) of what IPs they own. static default gateways are evil. and easily disabled. 
In Linux you merely comment out the "routers" in /etc/dhcp/dhclient.conf; in OpenWrt, set "defaultroute 0" for the interface fetching dhcp. When a box migrates, it tells the hypervisor its addresses, and then that box propagates out the route change to elsewhere. > > This can work for individual hobbiests, but not when you need to support > random devices (how would you configure an iPhone to support this?) Carefully. :) I do note that this stuff does (or at least did) work on some of the open source variants of android. I would rather like it if android added ipv6 tethering soon, and made it possible to mesh together multiple phones. > > > Letting the layer 2 equipment deal with the traffic within the building and > invoking layer 3 to go outside the building (or to a different security > domain) makes a lot of sense. Even if that means that layer 2 within a > building looks very similar to what layer 3 used to look like around a city. Be careful what you wish for. > > > back to the topic of wifi, I'm not aware of any APs that participate in the > switch protocols at this level. I also don't know of any reasonably priced > switches that can do anything smarter than plain spanning tree when > connected through multiple paths (I'd love to learn otherwise) > > David Lang -- Dave Täht thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 2:58 ` Dave Taht @ 2015-01-26 3:17 ` dpreed 2015-01-26 3:32 ` David Lang 2015-01-26 3:45 ` Dave Taht 2015-01-26 3:19 ` David Lang 1 sibling, 2 replies; 43+ messages in thread From: dpreed @ 2015-01-26 3:17 UTC (permalink / raw) To: Dave Taht; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 4895 bytes --] Looking up an address in a routing table is o(1) if the routing table is a hash table. That's much more efficient than a TCAM. My simple example just requires a delete/insert at each node's route lookup table. My point was about collections of WLAN's bridged together. Look at what happens (at the packet/radio layer) when a new node joins a bridged set of WLANs using STP. It is not exactly simple to rebuild the Ethernet layer's bridge routing tables in a complex network. And the limit of 4096 entries in many inexpensive switches is not a trivial limit. Routers used to be memory-starved (a small number of KB of RAM was the norm). Perhaps the thinking then (back before 2000) has not been revised, even though the hardware is a lot more capacious. Remember, the Ethernet layer in WLANs is implemented by microcontrollers, typically not very capable ones, plus TCAMs which are pretty limited in their flexibility. While it is tempting to use the "pre-packaged, proprietary" Ethernet switch functionality, routing gets you out of the binary blobs, and let's you be a lot smarter and more scalable. Given that it does NOT cost more to do routing at the IP layer, building complex Ethernet bridging is not obviously a win. BTW, TCAMs are used in IP layer switching, too, and also are used in packet filtering. Maybe not in cheap consumer switches, but lots of Gigabit switches implement IP layer switching and filtering. At HP, their switches routinely did all their IP layer switching entirely in TCAMs. 
On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com> said: > On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: > > On Sun, 25 Jan 2015, Dave Taht wrote: > > > >> To your roaming point, yes this is certainly one place where migrating > >> bridged vms across machines breaks down, and yet more and more vm > >> layers are doing it. I would certainly prefer routing in this case. > > > > > > What's the difference between "roaming" and moving a VM from one place in > > the network to another? > > I think most people think of "roaming" as moving fairly rapidly from one > piece of edge connectivity to another, and moving a vm is a great deal more > permanent operation. > > > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, you are > > going to have quite a bit of smarts in the endpoint. Even if it's only > > connected vi a single link. If you think about it, even if your network > > routing tables list every machine in our environment individually, you still > > have a problem of what gateway the endpoint uses. It would have to change > > every time it moved. Since DHCP doesn't update frequently enough to be > > transparent, you would need to have each endpoint running a routing > > protocol. > > Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the routing > protocol to supply that. In terms of each vm running a routing protocol, > well, no, I would rely on the underlying bare metal OS to be doing > that, supplying > the FIB tables to the overlying vms, if they need it, but otherwise the vms > just see a "default" route and don't bother with it. They do need to inform the > bare metal OS (better term for this please? hypervisor?) of what IPs they own. > > static default gateways are evil. and easily disabled. in linux you > merely comment > out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set > "defaultroute 0" for the > interface fetching dhcp. 
> > When a box migrates, it tells the hypervisor it's addresses, and then that box > propagates out the route change to elsewhere. > > > > > This can work for individual hobbiests, but not when you need to support > > random devices (how would you configure an iPhone to support this?) > > Carefully. :) > > I do note that this stuff does (or at least did) work on some of the open > source variants of android. I would rather like it if android added ipv6 > tethering soon, and made it possible to mesh together multiple phones. > > > > > > > Letting the layer 2 equipment deal with the traffic within the building and > > invoking layer 3 to go outside the building (or to a different security > > domain) makes a lot of sense. Even if that means that layer 2 within a > > building looks very similar to what layer 3 used to look like around a city. > > Be careful what you wish for. > > > > > > > back to the topic of wifi, I'm not aware of any APs that participate in the > > switch protocols at this level. I also don't know of any reasonably priced > > switches that can do anything smarter than plain spanning tree when > > connected through multiple paths (I'd love to learn otherwise) > > > > David Lang > > > > -- > Dave Täht > > thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks > [-- Attachment #2: Type: text/html, Size: 7011 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 3:17 ` dpreed @ 2015-01-26 3:32 ` David Lang 2015-01-26 3:45 ` Dave Taht 1 sibling, 0 replies; 43+ messages in thread From: David Lang @ 2015-01-26 3:32 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel On Sun, 25 Jan 2015, dpreed@reed.com wrote: > Looking up an address in a routing table is o(1) if the routing table is a > hash table. That's much more efficient than a TCAM. My simple example just > requires a delete/insert at each node's route lookup table. > > My point was about collections of WLAN's bridged together. Look at what > happens (at the packet/radio layer) when a new node joins a bridged set of > WLANs using STP. It is not exactly simple to rebuild the Ethernet layer's > bridge routing tables in a complex network. How would it be any easier to rebuild the routing table? (even ignoring the question of what the devices use as their gateway) > And the limit of 4096 entries in many inexpensive switches is not a trivial > limit. Getting similar number of ports that all can be routed is significantly more expensive. Yes, the mid-range switches can run layer 3 routing, but they are far less efficient at doing so than they are at switching. > Routers used to be memory-starved (a small number of KB of RAM was the norm). > Perhaps the thinking then (back before 2000) has not been revised, even though > the hardware is a lot more capacious. well, you do have to remember that most of the routing protocols were designed in the days of those limits. > Remember, the Ethernet layer in WLANs is implemented by microcontrollers, > typically not very capable ones, plus TCAMs which are pretty limited in their > flexibility. > > While it is tempting to use the "pre-packaged, proprietary" Ethernet switch > functionality, routing gets you out of the binary blobs, and let's you be a > lot smarter and more scalable. 
How do I run my own software on an HP switch to eliminate the binary blobs? How do I get similar performance on something with a dozen or more ports?

From a theoretical point of view, you are absolutely correct, but there isn't an open equivalent available. This is even before you start talking about what's coded into the ASICs on the higher-end switches, which, while limited in what they can do, will massively outperform the other options within those limits.

> Given that it does NOT cost more to do routing
> at the IP layer, building complex Ethernet bridging is not obviously a win.

OK, if it's not more expensive to do this, exactly how would I set it up? Remember that I have no ability to make any changes to the clients (iPhones, Android, Linux, Windows, Macs). I can't have them all running a routing protocol to figure out what gateway to use as they move from AP to AP.

Not using 'cheap' commodity switches would make it more expensive (in my case we invested in buying a bunch of HP switches a couple of years ago).

David Lang
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 3:17 ` dpreed 2015-01-26 3:32 ` David Lang @ 2015-01-26 3:45 ` Dave Taht 2015-01-27 0:12 ` dpreed 1 sibling, 1 reply; 43+ messages in thread From: Dave Taht @ 2015-01-26 3:45 UTC (permalink / raw) To: David Reed; +Cc: cerowrt-devel On Sun, Jan 25, 2015 at 7:17 PM, <dpreed@reed.com> wrote: > Looking up an address in a routing table is o(1) if the routing table is a > hash table. That's much more efficient than a TCAM. My simple example just > requires a delete/insert at each node's route lookup table. Regrettably it is not O(1) once you take into account the cpu cache hierarchy, or the potential collisions you will have once you shrink the hash to something reasonable. Also I think you are ignoring the problem of covering routes. Say I have to get something to a.b.c.z/32. I do a lookup of that and find nothing. I then look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I find a hit for the next hop. Now you can of course do a binary search for likely subprefixes, but in any case, the search is not O(1). In terms of cache efficient data structures, a straight hash is not the way to go, of late I have been trying to wrap my head around the hat-trie as possibly being useful in these circumstances. Now, if you think about limiting the domain of the problem to something greater than the typical mac table, but less than the whole internet, it starts looking more reasonable to have a 1x1 ratio of destination IPs to hash table entries for lookups, but updates have to probe/change large segments of the table in order to deal with covering prefixes. > My point was about collections of WLAN's bridged together. Look at what > happens (at the packet/radio layer) when a new node joins a bridged set of > WLANs using STP. It is not exactly simple to rebuild the Ethernet layer's > bridge routing tables in a complex network. 
And the limit of 4096 entries
> in many inexpensive switches is not a trivial limit.

Agreed. But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN

> Routers used to be memory-starved (a small number of KB of RAM was the
> norm). Perhaps the thinking then (back before 2000) has not been revised,
> even though the hardware is a lot more capacious.

The profit margins have not been revised.

I would not mind, incidentally, expanding the scope of the fqswitch project to try to build something that would scale up at L3 farther than we've ever seen before. However, funding for needed gear like

http://www.eetimes.com/document.asp?doc_id=1321334

and time, and FPGA expertise, is lacking. I am currently distracted by evaluating a very cool new CPU architecture (see http://www.millcomputing.com/wiki/Memory), and even as nifty as that is, I foresee a need for a lot of dedicated packet processing logic and memories to get into the 40Gbit+ range.

> Remember, the Ethernet layer in WLANs is implemented by microcontrollers,
> typically not very capable ones, plus TCAMs which are pretty limited in
> their flexibility.

I do tend to think that the next era of SDN-enabled hardware will eventually lead to more innovation in both the control and data plane. However, it seems we are still in a "me-too" phase of development of openvswitch (btw: there is a new software switch for Linux called rocker we should look at, and make sure it runs fq_codel), and a long way from flexibly programmable switch hardware in general.

http://openvswitch.org/pipermail/dev/2014-September/045084.html

> While it is tempting to use the "pre-packaged, proprietary" Ethernet switch
> functionality, routing gets you out of the binary blobs, and let's you be a
> lot smarter and more scalable. Given that it does NOT cost more to do
> routing at the IP layer, building complex Ethernet bridging is not obviously
> a win.

SDN is certainly a way out of this mess. Eventually.
But I fear we are making all the same mistakes over again, and making slower hardware, where in the end, it needs to be faster, to win. > > > BTW, TCAMs are used in IP layer switching, too, and also are used in packet > filtering. Maybe not in cheap consumer switches, but lots of Gigabit > switches implement IP layer switching and filtering. At HP, their switches > routinely did all their IP layer switching entirely in TCAMs. Yep. I really wish big, fat TCAMS were standard equipment. > > > On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com> said: > >> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: >> > On Sun, 25 Jan 2015, Dave Taht wrote: >> > >> >> To your roaming point, yes this is certainly one place where migrating >> >> bridged vms across machines breaks down, and yet more and more vm >> >> layers are doing it. I would certainly prefer routing in this case. >> > >> > >> > What's the difference between "roaming" and moving a VM from one place >> > in >> > the network to another? >> >> I think most people think of "roaming" as moving fairly rapidly from one >> piece of edge connectivity to another, and moving a vm is a great deal >> more >> permanent operation. >> >> > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, you >> > are >> > going to have quite a bit of smarts in the endpoint. Even if it's only >> > connected vi a single link. If you think about it, even if your network >> > routing tables list every machine in our environment individually, you >> > still >> > have a problem of what gateway the endpoint uses. It would have to >> > change >> > every time it moved. Since DHCP doesn't update frequently enough to be >> > transparent, you would need to have each endpoint running a routing >> > protocol. >> >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the >> routing >> protocol to supply that. 
In terms of each vm running a routing protocol, >> well, no, I would rely on the underlying bare metal OS to be doing >> that, supplying >> the FIB tables to the overlying vms, if they need it, but otherwise the >> vms >> just see a "default" route and don't bother with it. They do need to >> inform the >> bare metal OS (better term for this please? hypervisor?) of what IPs they >> own. >> >> static default gateways are evil. and easily disabled. in linux you >> merely comment >> out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set >> "defaultroute 0" for the >> interface fetching dhcp. >> >> When a box migrates, it tells the hypervisor it's addresses, and then that >> box >> propagates out the route change to elsewhere. >> >> > >> > This can work for individual hobbiests, but not when you need to support >> > random devices (how would you configure an iPhone to support this?) >> >> Carefully. :) >> >> I do note that this stuff does (or at least did) work on some of the open >> source variants of android. I would rather like it if android added ipv6 >> tethering soon, and made it possible to mesh together multiple phones. >> >> > >> > >> > Letting the layer 2 equipment deal with the traffic within the building >> > and >> > invoking layer 3 to go outside the building (or to a different security >> > domain) makes a lot of sense. Even if that means that layer 2 within a >> > building looks very similar to what layer 3 used to look like around a >> > city. >> >> Be careful what you wish for. >> >> > >> > >> > back to the topic of wifi, I'm not aware of any APs that participate in >> > the >> > switch protocols at this level. 
I also don't know of any reasonably >> > priced >> > switches that can do anything smarter than plain spanning tree when >> > connected through multiple paths (I'd love to learn otherwise) >> > >> > David Lang >> >> >> >> -- >> Dave Täht >> >> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks >> -- Dave Täht thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 3:45 ` Dave Taht @ 2015-01-27 0:12 ` dpreed 2015-01-27 0:31 ` David Lang 2015-01-27 0:36 ` Dave Taht 0 siblings, 2 replies; 43+ messages in thread From: dpreed @ 2015-01-27 0:12 UTC (permalink / raw) To: Dave Taht; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 10339 bytes --]

Well, we all may want to agree to disagree. I don't buy the argument that hash tables are slow compared to the TCAMs - and even if cache misses happened, a hash table is still O(1) - you look at exactly one memory address on the average in a hash table - that's the point of it. The constant factor is the speed of memory - not terribly slow by any means.

To get into this deeper would require actual measurements, of which I am a great fan. But your handwaves are pretty unquantitative, Dave, so at best they are similar to mine. I'm very measurement focused, being part hardware architecture guy.

David - my comment about HP doing layer 3 switching in TCAMs just was there to point out that there's nothing magic about layer 2. I was not suggesting that they don't use proprietary binary blobs, because they do. But so do the TCAM programs in layer 2 devices.

Dave - you are conflating the routing algorithm with its implementation technique when you focus on "prefix matching" as being hard to do. It's not hard to invent a performant algorithm to do that combined with a hash table. A simple way to do that is to treat the address one is looking up as several addresses (one for each shorter prefix of the address). Then look each one up separately by its hash. It's still O(1) if you do that, just with a larger constant factor. I assume you don't actually think it is optimal to do linear searches on the routing table like hosts sometimes do. Linear search is not necessary.

There is literally nothing magical about looking up 48-bit random Ethernet addresses in a LAN.
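One way to read the "several addresses, each looked up by its hash" suggestion above is one hash table per prefix length, probed from longest to shortest. The following is a minimal sketch under that assumption, for IPv4 only; `PrefixHashTable` and the next-hop names are hypothetical and do not come from the thread or from any real router implementation.

```python
import ipaddress


class PrefixHashTable:
    """Hypothetical longest-prefix match built from per-length hash tables."""

    def __init__(self):
        self._tables = {}   # prefix length -> {masked address (int): next hop}
        self._lengths = []  # prefix lengths in use, longest first

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)  # raises ValueError if host bits are set
        if net.prefixlen not in self._tables:
            self._tables[net.prefixlen] = {}
            self._lengths = sorted(self._tables, reverse=True)
        self._tables[net.prefixlen][int(net.network_address)] = next_hop

    def lookup(self, addr):
        a = int(ipaddress.ip_address(addr))
        # At most 33 probes for IPv4 (/32 down to /0), however many
        # routes are stored: a bounded but larger constant factor.
        for length in self._lengths:
            mask = ((1 << length) - 1) << (32 - length)
            hit = self._tables[length].get(a & mask)
            if hit is not None:
                return hit
        return None


routes = PrefixHashTable()
routes.insert("10.0.0.0/8", "core")
routes.insert("10.1.2.0/24", "ap-a")
routes.insert("10.1.2.7/32", "ap-b")
print(routes.lookup("10.1.2.9"))
```

The number of probes depends on how many distinct prefix lengths are in use, not on the number of routes. The sketch does not address the cache-behaviour objection raised earlier in the thread; that would need measurement.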
As far as NAT'ing is concerned - that is done by the gateways. It's possible in principle to create a distributed NAT face to an Enterprise - if you do so, then roaming within the enterprise just amounts to telling the NAT face about the new internal IP address that corresponds to the old one - an update of one address translation with another. This is how phones roam, by the way. They update their location via an HLR as they roam. On Sunday, January 25, 2015 10:45pm, "Dave Taht" <dave.taht@gmail.com> said: > On Sun, Jan 25, 2015 at 7:17 PM, <dpreed@reed.com> wrote: > > Looking up an address in a routing table is o(1) if the routing table is a > > hash table. That's much more efficient than a TCAM. My simple example just > > requires a delete/insert at each node's route lookup table. > > Regrettably it is not O(1) once you take into account the cpu cache hierarchy, > or the potential collisions you will have once you shrink the hash to > something reasonable. > > Also I think you are ignoring the problem of covering routes. Say I have to > get something to a.b.c.z/32. I do a lookup of that and find nothing. I then > look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I find > a hit for the next hop. Now you can of course do a binary search for likely > subprefixes, but in any case, the search is not O(1). > > In terms of cache efficient data structures, a straight hash is not the way > to go, of late I have been trying to wrap my head around the hat-trie as > possibly being useful in these circumstances. > > Now, if you think about limiting the domain of the problem to something > greater than the typical mac table, but less than the whole internet, > it starts looking more reasonable to have a 1x1 ratio of destination > IPs to hash table entries for lookups, but updates have to probe/change > large segments of the table in order to deal with covering prefixes. > > > My point was about collections of WLAN's bridged together. 
Look at what > > happens (at the packet/radio layer) when a new node joins a bridged set of > > WLANs using STP. It is not exactly simple to rebuild the Ethernet layer's > > bridge routing tables in a complex network. And the limit of 4096 entries > > in many inexpensive switches is not a trivial limit. > > Agreed. But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN > > > > > > > > > Routers used to be memory-starved (a small number of KB of RAM was the > > norm). Perhaps the thinking then (back before 2000) has not been revised, > > even though the hardware is a lot more capacious. > > The profit margins have not been revised. > > I would not mind, incidentally expanding the scope of the fqswitch project ot > try to build something that would scale up at l3 farther than we've ever seen > before, however funding for needed gear like: > > http://www.eetimes.com/document.asp?doc_id=1321334 > > and time, and fpga expertise, is lacking. I am currently distracted by > evaluating > a very cool new cpu architecture ( see > http://www.millcomputing.com/wiki/Memory ) > and even as nifty as that is I foresee a need for a lot of dedicated packet > processing logic and memories to get into the 40GBit+ range. > > > > > > Remember, the Ethernet layer in WLANs is implemented by microcontrollers, > > typically not very capable ones, plus TCAMs which are pretty limited in > > their flexibility. > > I do tend to think that the next era of SDN enabled hardware will eventually > lead to more innovation in both the control and data plane - however it > seems we are still in a "me-too" phase > of development of openvswitch (btw: there is a new software switch for > linux called rocker we should look at, and make sure runs fq_codel), and > a long way from flexibly programmable switch hardware in general. 
> > http://openvswitch.org/pipermail/dev/2014-September/045084.html > > > > > > > > While it is tempting to use the "pre-packaged, proprietary" Ethernet switch > > functionality, routing gets you out of the binary blobs, and let's you be a > > lot smarter and more scalable. Given that it does NOT cost more to do > > routing at the IP layer, building complex Ethernet bridging is not obviously > > a win. > > SDN is certainly a way out of this mess. Eventually. But I fear we are making > all the same mistakes over again, and making slower hardware, where in the > end, it needs to be faster, to win. > > > > > > > BTW, TCAMs are used in IP layer switching, too, and also are used in packet > > filtering. Maybe not in cheap consumer switches, but lots of Gigabit > > switches implement IP layer switching and filtering. At HP, their switches > > routinely did all their IP layer switching entirely in TCAMs. > > Yep. I really wish big, fat TCAMS were standard equipment. > > > > > > > On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com> > said: > > > >> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: > >> > On Sun, 25 Jan 2015, Dave Taht wrote: > >> > > >> >> To your roaming point, yes this is certainly one place where > migrating > >> >> bridged vms across machines breaks down, and yet more and more > vm > >> >> layers are doing it. I would certainly prefer routing in this > case. > >> > > >> > > >> > What's the difference between "roaming" and moving a VM from one > place > >> > in > >> > the network to another? > >> > >> I think most people think of "roaming" as moving fairly rapidly from one > >> piece of edge connectivity to another, and moving a vm is a great deal > >> more > >> permanent operation. > >> > >> > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, > you > >> > are > >> > going to have quite a bit of smarts in the endpoint. Even if it's > only > >> > connected vi a single link. 
If you think about it, even if your > network > >> > routing tables list every machine in our environment individually, > you > >> > still > >> > have a problem of what gateway the endpoint uses. It would have to > >> > change > >> > every time it moved. Since DHCP doesn't update frequently enough to > be > >> > transparent, you would need to have each endpoint running a routing > >> > protocol. > >> > >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the > >> routing > >> protocol to supply that. In terms of each vm running a routing protocol, > >> well, no, I would rely on the underlying bare metal OS to be doing > >> that, supplying > >> the FIB tables to the overlying vms, if they need it, but otherwise the > >> vms > >> just see a "default" route and don't bother with it. They do need to > >> inform the > >> bare metal OS (better term for this please? hypervisor?) of what IPs > they > >> own. > >> > >> static default gateways are evil. and easily disabled. in linux you > >> merely comment > >> out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set > >> "defaultroute 0" for the > >> interface fetching dhcp. > >> > >> When a box migrates, it tells the hypervisor it's addresses, and then > that > >> box > >> propagates out the route change to elsewhere. > >> > >> > > >> > This can work for individual hobbiests, but not when you need to > support > >> > random devices (how would you configure an iPhone to support this?) > >> > >> Carefully. :) > >> > >> I do note that this stuff does (or at least did) work on some of the > open > >> source variants of android. I would rather like it if android added ipv6 > >> tethering soon, and made it possible to mesh together multiple phones. > >> > >> > > >> > > >> > Letting the layer 2 equipment deal with the traffic within the > building > >> > and > >> > invoking layer 3 to go outside the building (or to a different > security > >> > domain) makes a lot of sense. 
Even if that means that layer 2 within > a > >> > building looks very similar to what layer 3 used to look like around > a > >> > city. > >> > >> Be careful what you wish for. > >> > >> > > >> > > >> > back to the topic of wifi, I'm not aware of any APs that participate > in > >> > the > >> > switch protocols at this level. I also don't know of any reasonably > >> > priced > >> > switches that can do anything smarter than plain spanning tree when > >> > connected through multiple paths (I'd love to learn otherwise) > >> > > >> > David Lang > >> > >> > >> > >> -- > >> Dave Täht > >> > >> thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks > >> > > > > -- > Dave Täht > > thttp://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks > [-- Attachment #2: Type: text/html, Size: 14397 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-27 0:12 ` dpreed @ 2015-01-27 0:31 ` David Lang 2015-01-27 0:36 ` Dave Taht 1 sibling, 0 replies; 43+ messages in thread From: David Lang @ 2015-01-27 0:31 UTC (permalink / raw) To: dpreed; +Cc: cerowrt-devel [-- Attachment #1: Type: TEXT/PLAIN, Size: 9889 bytes --]

On Mon, 26 Jan 2015, dpreed@reed.com wrote:

> As far as NAT'ing is concerned - that is done by the gateways. It's possible
> in principle to create a distributed NAT face to an Enterprise - if you do so,
> then roaming within the enterprise just amounts to telling the NAT face about
> the new internal IP address that corresponds to the old one - an update of one
> address translation with another.

Remember that the claim was that you could have the APs route, not bridge, but let a device move from one AP to another without needing to change its IP address and without breaking the connections using that IP address.

How you would do this is the problem.

Getting traffic to the device could be done if you detect its movement and change your IP routing tables, but getting data from the device is going to be harder, because the device is going to keep sending traffic to the same gateway. So you either need to pull layer 2 tricks to get the packets to the right gateway before processing them, or you need the new AP to handle packets sent to the IP address of the old AP.

If you do NAT or stateful packet filtering on the AP, you also need that state to get migrated somehow.

> This is how phones roam, by the way. They update their location via an HLR as
> they roam.

The phones get a new IP address as they roam and break existing connections, don't they? The software either gets a notification that the network has changed and connects again, or the connections end up timing out. Right?
David Lang > > > On Sunday, January 25, 2015 10:45pm, "Dave Taht" <dave.taht@gmail.com> said: > > > >> On Sun, Jan 25, 2015 at 7:17 PM, <dpreed@reed.com> wrote: >> > Looking up an address in a routing table is o(1) if the routing table is a >> > hash table. That's much more efficient than a TCAM. My simple example just >> > requires a delete/insert at each node's route lookup table. >> >> Regrettably it is not O(1) once you take into account the cpu cache hierarchy, >> or the potential collisions you will have once you shrink the hash to >> something reasonable. >> >> Also I think you are ignoring the problem of covering routes. Say I have to >> get something to a.b.c.z/32. I do a lookup of that and find nothing. I then >> look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I find >> a hit for the next hop. Now you can of course do a binary search for likely >> subprefixes, but in any case, the search is not O(1). >> >> In terms of cache efficient data structures, a straight hash is not the way >> to go, of late I have been trying to wrap my head around the hat-trie as >> possibly being useful in these circumstances. >> >> Now, if you think about limiting the domain of the problem to something >> greater than the typical mac table, but less than the whole internet, >> it starts looking more reasonable to have a 1x1 ratio of destination >> IPs to hash table entries for lookups, but updates have to probe/change >> large segments of the table in order to deal with covering prefixes. >> >> > My point was about collections of WLAN's bridged together. Look at what >> > happens (at the packet/radio layer) when a new node joins a bridged set of >> > WLANs using STP. It is not exactly simple to rebuild the Ethernet layer's >> > bridge routing tables in a complex network. And the limit of 4096 entries >> > in many inexpensive switches is not a trivial limit. >> >> Agreed. 
But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN >> >> > >> > >> > >> > Routers used to be memory-starved (a small number of KB of RAM was the >> > norm). Perhaps the thinking then (back before 2000) has not been revised, >> > even though the hardware is a lot more capacious. >> >> The profit margins have not been revised. >> >> I would not mind, incidentally expanding the scope of the fqswitch project ot >> try to build something that would scale up at l3 farther than we've ever seen >> before, however funding for needed gear like: >> >> http://www.eetimes.com/document.asp?doc_id=1321334 >> >> and time, and fpga expertise, is lacking. I am currently distracted by >> evaluating >> a very cool new cpu architecture ( see >> http://www.millcomputing.com/wiki/Memory ) >> and even as nifty as that is I foresee a need for a lot of dedicated packet >> processing logic and memories to get into the 40GBit+ range. >> > >> > >> > Remember, the Ethernet layer in WLANs is implemented by microcontrollers, >> > typically not very capable ones, plus TCAMs which are pretty limited in >> > their flexibility. >> >> I do tend to think that the next era of SDN enabled hardware will eventually >> lead to more innovation in both the control and data plane - however it >> seems we are still in a "me-too" phase >> of development of openvswitch (btw: there is a new software switch for >> linux called rocker we should look at, and make sure runs fq_codel), and >> a long way from flexibly programmable switch hardware in general. >> >> http://openvswitch.org/pipermail/dev/2014-September/045084.html >> > >> > >> > >> > While it is tempting to use the "pre-packaged, proprietary" Ethernet switch >> > functionality, routing gets you out of the binary blobs, and let's you be a >> > lot smarter and more scalable. Given that it does NOT cost more to do >> > routing at the IP layer, building complex Ethernet bridging is not obviously >> > a win. 
>> >> SDN is certainly a way out of this mess. Eventually. But I fear we are making >> all the same mistakes over again, and making slower hardware, where in the >> end, it needs to be faster, to win. >> >> > >> > >> > BTW, TCAMs are used in IP layer switching, too, and also are used in packet >> > filtering. Maybe not in cheap consumer switches, but lots of Gigabit >> > switches implement IP layer switching and filtering. At HP, their switches >> > routinely did all their IP layer switching entirely in TCAMs. >> >> Yep. I really wish big, fat TCAMS were standard equipment. >> >> > >> > >> > On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com> >> said: >> > >> >> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: >> >> > On Sun, 25 Jan 2015, Dave Taht wrote: >> >> > >> >> >> To your roaming point, yes this is certainly one place where >> migrating >> >> >> bridged vms across machines breaks down, and yet more and more >> vm >> >> >> layers are doing it. I would certainly prefer routing in this >> case. >> >> > >> >> > >> >> > What's the difference between "roaming" and moving a VM from one >> place >> >> > in >> >> > the network to another? >> >> >> >> I think most people think of "roaming" as moving fairly rapidly from one >> >> piece of edge connectivity to another, and moving a vm is a great deal >> >> more >> >> permanent operation. >> >> >> >> > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, >> you >> >> > are >> >> > going to have quite a bit of smarts in the endpoint. Even if it's >> only >> >> > connected vi a single link. If you think about it, even if your >> network >> >> > routing tables list every machine in our environment individually, >> you >> >> > still >> >> > have a problem of what gateway the endpoint uses. It would have to >> >> > change >> >> > every time it moved. 
Since DHCP doesn't update frequently enough to >> be >> >> > transparent, you would need to have each endpoint running a routing >> >> > protocol. >> >> >> >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the >> >> routing >> >> protocol to supply that. In terms of each vm running a routing protocol, >> >> well, no, I would rely on the underlying bare metal OS to be doing >> >> that, supplying >> >> the FIB tables to the overlying vms, if they need it, but otherwise the >> >> vms >> >> just see a "default" route and don't bother with it. They do need to >> >> inform the >> >> bare metal OS (better term for this please? hypervisor?) of what IPs >> they >> >> own. >> >> >> >> static default gateways are evil. and easily disabled. in linux you >> >> merely comment >> >> out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set >> >> "defaultroute 0" for the >> >> interface fetching dhcp. >> >> >> >> When a box migrates, it tells the hypervisor it's addresses, and then >> that >> >> box >> >> propagates out the route change to elsewhere. >> >> >> >> > >> >> > This can work for individual hobbiests, but not when you need to >> support >> >> > random devices (how would you configure an iPhone to support this?) >> >> >> >> Carefully. :) >> >> >> >> I do note that this stuff does (or at least did) work on some of the >> open >> >> source variants of android. I would rather like it if android added ipv6 >> >> tethering soon, and made it possible to mesh together multiple phones. >> >> >> >> > >> >> > >> >> > Letting the layer 2 equipment deal with the traffic within the >> building >> >> > and >> >> > invoking layer 3 to go outside the building (or to a different >> security >> >> > domain) makes a lot of sense. Even if that means that layer 2 within >> a >> >> > building looks very similar to what layer 3 used to look like around >> a >> >> > city. >> >> >> >> Be careful what you wish for. 
>> >> >> >> > >> >> > >> >> > back to the topic of wifi, I'm not aware of any APs that participate >> in >> >> > the >> >> > switch protocols at this level. I also don't know of any reasonably >> >> > priced >> >> > switches that can do anything smarter than plain spanning tree when >> >> > connected through multiple paths (I'd love to learn otherwise) >> >> > >> >> > David Lang >> >> >> >> >> >> >> >> -- >> >> Dave Täht >> >> >> >> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks >> >> >> >> >> >> -- >> Dave Täht >> >> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks >> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-27 0:12 ` dpreed 2015-01-27 0:31 ` David Lang @ 2015-01-27 0:36 ` Dave Taht 1 sibling, 0 replies; 43+ messages in thread From: Dave Taht @ 2015-01-27 0:36 UTC (permalink / raw) To: David Reed, Jesper Dangaard Brouer; +Cc: cerowrt-devel Jesper now cc'd. On Tue, Jan 27, 2015 at 1:12 PM, <dpreed@reed.com> wrote: > Well, we all may want to agree to disagree. I don't buy the argument that > hash tables are slow compared to the TCAMs - and even if cache misses > happened, a hash table is still o(1) - you look at exactly one memory > address on the average in a hash table - that's the point of it. The > constant factor is the speed of memory - not terribly slow by any means. > > > > To get into this deeper would require actual measurements, of which I am a > great fan. But your handwaves are pretty unquantitative, Dave, so at best > they are similar to mine. I'm very measurement focused, being part hardware > architecture guy. Two of the people doing serious optimization and measurement of linux network behavior are now cc'd. (tho they might want to read back on the thread). Jesper, in particular, has been working on speeding up 10GigE in preparation for 100GigE and gave a great preso at lca: http://lwn.net/Articles/629155/ (see slides, video) Relative to that was: http://lwn.net/Articles/629152/ And alexander has been working specifically on dramatically improving routing cache lookups. He tells me: "The amount of gain seen will vary based on the routing configuration of each system. The biggest gain in all of this is that the prefix-matching/backtrace portion of the look-up was reduced from O(N^2) to O(N). So on my test systems that were configured with rather large tries I saw a reduction from 380ns to 16ns for performing a prefix-match/backtrace. 
What this means for most end users is that anything that falls back to the default route on the system should take significantly less time for look-up in the fib tables. A hit at depth 7 in my trie costs about 31ns, though I think that might be a cache warm hit versus a cache cold look-up. Though you have to keep in mind if you are dealing with a routing table that is the main trie, and not the local trie. That means to get a "hit" only any route you are still going to have to have a failed look-up in the local trie first and the size of that trie depends on the number of local addresses you have configured on the system. My second set of patches should cut that by about 25% to 50% since I am dropping a couple of unnecessary items from the look-up process and compressing things so that the pointer to the next tnode and the key info for that tnode should always be in the same cache-line. the second set of patches [if they work out] that should reduce the cache utilization by up to half. Basically it consists of pushing the key and key information up to the same cache-line that pointer for the tnode/leaf lives on. However I have to sort out some RCU ugliness that adds since I have to RCU protect the key information." My main complaint about the work so far is that no-one has been measuring the total system costs (and latency) of the time it takes from when a packet enters the system to the time it departs. I am pretty sure that immense call path could be additionally optimized... (last I recall it transited a minimum of 34 functions) My secondary complaint is all the work is being tested on 64 bit hardware with huge caches. Somewhat relevant to that: It has long been my hope that we would see per-packet timestamping become the default at ingress (from the card or host application's interaction with the stack), and then merely checked at egress through codel, rather than the fq_codel queue merely measuring itself. 
> > > David - my comment about HP doing layer 3 switching in TCAMs just was there > to point out that there's nothing magic about layer 2. I was not suggesting > that they don't use proprietary binary blobs, because they do. But so do > the TCAM programs in layer 2 devices. > > > > Dave - you are conflating the implementation technique of the routing > algorithm when you focus on "prefix matching" as being hard to do. It's not > hard to invent a performant algorithm to do that combined with a hash table. > A simple way to do that is to treat the address one is looking up as several > addresses (of shorter prefixes of the address). Then look each one up > separately by its hash. Its still o(1) if you do that, just a larger > constant factor. I assume you don't actually think it is optimal to do > linear searches on the routing table like hosts sometimes do. Linear search > is not necessary. I am tracking alexander's fine work closely. See recent commits to the net-next tree. > > > There is literally nothing magical about looking up 48-bit random Ethernet > addresses in a LAN. The difference between 48 bits and 128 bits is quite large. > > > As far as NAT'ing is concerned - that is done by the gateways. It's > possible in principle to create a distributed NAT face to an Enterprise - if > you do so, then roaming within the enterprise just amounts to telling the > NAT face about the new internal IP address that corresponds to the old one - > an update of one address translation with another. > > > > This is how phones roam, by the way. They update their location via an HLR > as they roam. > > > > > > On Sunday, January 25, 2015 10:45pm, "Dave Taht" <dave.taht@gmail.com> said: > >> On Sun, Jan 25, 2015 at 7:17 PM, <dpreed@reed.com> wrote: >> > Looking up an address in a routing table is o(1) if the routing table is >> > a >> > hash table. That's much more efficient than a TCAM. My simple example >> > just >> > requires a delete/insert at each node's route lookup table. 
>> >> Regrettably it is not O(1) once you take into account the cpu cache >> hierarchy, >> or the potential collisions you will have once you shrink the hash to >> something reasonable. >> >> Also I think you are ignoring the problem of covering routes. Say I have >> to >> get something to a.b.c.z/32. I do a lookup of that and find nothing. I >> then >> look to find a.b.c.z/31 and find nothing, then /30, then /29, /28, until I >> find >> a hit for the next hop. Now you can of course do a binary search for >> likely >> subprefixes, but in any case, the search is not O(1). >> >> In terms of cache efficient data structures, a straight hash is not the >> way >> to go, of late I have been trying to wrap my head around the hat-trie as >> possibly being useful in these circumstances. >> >> Now, if you think about limiting the domain of the problem to something >> greater than the typical mac table, but less than the whole internet, >> it starts looking more reasonable to have a 1x1 ratio of destination >> IPs to hash table entries for lookups, but updates have to probe/change >> large segments of the table in order to deal with covering prefixes. >> >> > My point was about collections of WLAN's bridged together. Look at what >> > happens (at the packet/radio layer) when a new node joins a bridged set >> > of >> > WLANs using STP. It is not exactly simple to rebuild the Ethernet >> > layer's >> > bridge routing tables in a complex network. And the limit of 4096 >> > entries >> > in many inexpensive switches is not a trivial limit. >> >> Agreed. But see http://en.wikipedia.org/wiki/Virtual_Extensible_LAN >> >> > >> > >> > >> > Routers used to be memory-starved (a small number of KB of RAM was the >> > norm). Perhaps the thinking then (back before 2000) has not been >> > revised, >> > even though the hardware is a lot more capacious. >> >> The profit margins have not been revised. 
>> >> I would not mind, incidentally expanding the scope of the fqswitch project >> ot >> try to build something that would scale up at l3 farther than we've ever >> seen >> before, however funding for needed gear like: >> >> http://www.eetimes.com/document.asp?doc_id=1321334 >> >> and time, and fpga expertise, is lacking. I am currently distracted by >> evaluating >> a very cool new cpu architecture ( see >> http://www.millcomputing.com/wiki/Memory ) >> and even as nifty as that is I foresee a need for a lot of dedicated >> packet >> processing logic and memories to get into the 40GBit+ range. >> > >> > >> > Remember, the Ethernet layer in WLANs is implemented by >> > microcontrollers, >> > typically not very capable ones, plus TCAMs which are pretty limited in >> > their flexibility. >> >> I do tend to think that the next era of SDN enabled hardware will >> eventually >> lead to more innovation in both the control and data plane - however it >> seems we are still in a "me-too" phase >> of development of openvswitch (btw: there is a new software switch for >> linux called rocker we should look at, and make sure runs fq_codel), and >> a long way from flexibly programmable switch hardware in general. >> >> http://openvswitch.org/pipermail/dev/2014-September/045084.html >> > >> > >> > >> > While it is tempting to use the "pre-packaged, proprietary" Ethernet >> > switch >> > functionality, routing gets you out of the binary blobs, and let's you >> > be a >> > lot smarter and more scalable. Given that it does NOT cost more to do >> > routing at the IP layer, building complex Ethernet bridging is not >> > obviously >> > a win. >> >> SDN is certainly a way out of this mess. Eventually. But I fear we are >> making >> all the same mistakes over again, and making slower hardware, where in the >> end, it needs to be faster, to win. >> >> > >> > >> > BTW, TCAMs are used in IP layer switching, too, and also are used in >> > packet >> > filtering. 
Maybe not in cheap consumer switches, but lots of Gigabit >> > switches implement IP layer switching and filtering. At HP, their >> > switches >> > routinely did all their IP layer switching entirely in TCAMs. >> >> Yep. I really wish big, fat TCAMS were standard equipment. >> >> > >> > >> > On Sunday, January 25, 2015 9:58pm, "Dave Taht" <dave.taht@gmail.com> >> said: >> > >> >> On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: >> >> > On Sun, 25 Jan 2015, Dave Taht wrote: >> >> > >> >> >> To your roaming point, yes this is certainly one place where >> migrating >> >> >> bridged vms across machines breaks down, and yet more and more >> vm >> >> >> layers are doing it. I would certainly prefer routing in this >> case. >> >> > >> >> > >> >> > What's the difference between "roaming" and moving a VM from one >> place >> >> > in >> >> > the network to another? >> >> >> >> I think most people think of "roaming" as moving fairly rapidly from >> >> one >> >> piece of edge connectivity to another, and moving a vm is a great deal >> >> more >> >> permanent operation. >> >> >> >> > As far as layer 2 vs layer 3 goes. If you try to operate at layer 3, >> you >> >> > are >> >> > going to have quite a bit of smarts in the endpoint. Even if it's >> only >> >> > connected vi a single link. If you think about it, even if your >> network >> >> > routing tables list every machine in our environment individually, >> you >> >> > still >> >> > have a problem of what gateway the endpoint uses. It would have to >> >> > change >> >> > every time it moved. Since DHCP doesn't update frequently enough to >> be >> >> > transparent, you would need to have each endpoint running a routing >> >> > protocol. >> >> >> >> Hmm? I don't ever use a dhcp-supplied default gateway, I depend on the >> >> routing >> >> protocol to supply that. 
In terms of each vm running a routing >> >> protocol, >> >> well, no, I would rely on the underlying bare metal OS to be doing >> >> that, supplying >> >> the FIB tables to the overlying vms, if they need it, but otherwise the >> >> vms >> >> just see a "default" route and don't bother with it. They do need to >> >> inform the >> >> bare metal OS (better term for this please? hypervisor?) of what IPs >> they >> >> own. >> >> >> >> static default gateways are evil. and easily disabled. in linux you >> >> merely comment >> >> out the "routers" in /etc/dhcp/dhclient.conf, in openwrt, set >> >> "defaultroute 0" for the >> >> interface fetching dhcp. >> >> >> >> When a box migrates, it tells the hypervisor it's addresses, and then >> that >> >> box >> >> propagates out the route change to elsewhere. >> >> >> >> > >> >> > This can work for individual hobbiests, but not when you need to >> support >> >> > random devices (how would you configure an iPhone to support this?) >> >> >> >> Carefully. :) >> >> >> >> I do note that this stuff does (or at least did) work on some of the >> open >> >> source variants of android. I would rather like it if android added >> >> ipv6 >> >> tethering soon, and made it possible to mesh together multiple phones. >> >> >> >> > >> >> > >> >> > Letting the layer 2 equipment deal with the traffic within the >> building >> >> > and >> >> > invoking layer 3 to go outside the building (or to a different >> security >> >> > domain) makes a lot of sense. Even if that means that layer 2 within >> a >> >> > building looks very similar to what layer 3 used to look like around >> a >> >> > city. >> >> >> >> Be careful what you wish for. >> >> >> >> > >> >> > >> >> > back to the topic of wifi, I'm not aware of any APs that participate >> in >> >> > the >> >> > switch protocols at this level. 
I also don't know of any reasonably >> >> > priced >> >> > switches that can do anything smarter than plain spanning tree when >> >> > connected through multiple paths (I'd love to learn otherwise) >> >> > >> >> > David Lang >> >> >> >> >> >> >> >> -- >> >> Dave Täht >> >> >> >> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks >> >> >> >> >> >> -- >> Dave Täht >> >> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks >> -- Dave Täht http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ^ permalink raw reply [flat|nested] 43+ messages in thread
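As a rough illustration of the hash-table versus prefix-matching exchange in the message above, here is a minimal Python sketch of longest-prefix match built from one hash table per prefix length. It is IPv4-only, and the class name `HashLpmTable`, its methods, and the gateway labels are hypothetical; none of this is taken from the kernel FIB code discussed in the thread. Each probe is a constant-time hash lookup on average, but one lookup may need a probe for every configured prefix length, which is the "larger constant factor" and the "covering routes" concern in concrete form.

```python
import ipaddress


class HashLpmTable:
    """Longest-prefix match using one dict per prefix length (IPv4 only)."""

    def __init__(self):
        # tables[plen] maps a masked network address (as an int) to a next hop.
        self.tables = {}

    def add_route(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        per_length = self.tables.setdefault(net.prefixlen, {})
        per_length[int(net.network_address)] = next_hop

    def lookup(self, address):
        """Return (next_hop, probes); probes counts the hash lookups made."""
        addr = int(ipaddress.ip_address(address))
        probes = 0
        # Probe the longest configured prefix length first, then shorter ones.
        for plen in sorted(self.tables, reverse=True):
            mask = ((1 << plen) - 1) << (32 - plen)
            probes += 1
            hop = self.tables[plen].get(addr & mask)
            if hop is not None:
                return hop, probes
        return None, probes


if __name__ == "__main__":
    table = HashLpmTable()
    table.add_route("0.0.0.0/0", "gw-default")  # hypothetical next hops
    table.add_route("10.0.0.0/8", "gw-a")
    table.add_route("10.1.2.0/24", "gw-b")
    for dest in ("10.1.2.7", "10.9.9.9", "192.0.2.1"):
        print(dest, table.lookup(dest))
```

With only three prefix lengths configured, a destination that falls through to the default route costs three probes. A full IPv4 table could need up to 33 and an IPv6 one up to 129 unless the lengths are searched more cleverly, for example with the binary search over prefix lengths mentioned in the quoted reply.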
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 2:58 ` Dave Taht 2015-01-26 3:17 ` dpreed @ 2015-01-26 3:19 ` David Lang 1 sibling, 0 replies; 43+ messages in thread From: David Lang @ 2015-01-26 3:19 UTC (permalink / raw) To: Dave Taht; +Cc: cerowrt-devel On Sun, 25 Jan 2015, Dave Taht wrote: > On Sun, Jan 25, 2015 at 6:43 PM, David Lang <david@lang.hm> wrote: >> On Sun, 25 Jan 2015, Dave Taht wrote: >> >>> To your roaming point, yes this is certainly one place where migrating >>> bridged vms across machines breaks down, and yet more and more vm >>> layers are doing it. I would certainly prefer routing in this case. >> >> >> What's the difference between "roaming" and moving a VM from one place in >> the network to another? > > I think most people think of "roaming" as moving fairly rapidly from one > piece of edge connectivity to another, and moving a vm is a great deal more > permanent operation. There are two different types of roaming. First, there is the case I deal with at SCaLE, where you are moving within one network (within one site). Then there is the case where you are moving between sites. Within one site, roaming and migrating VMs are pretty much the same problem, and handling it at layer 2 makes a lot of sense (how frequently the migrations happen, and how permanent they are, varies for both wifi nodes and VMs). David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 23:57 ` David Lang 2015-01-26 1:51 ` dpreed @ 2015-01-26 4:25 ` Valdis.Kletnieks 2015-01-26 4:39 ` David Lang 1 sibling, 1 reply; 43+ messages in thread From: Valdis.Kletnieks @ 2015-01-26 4:25 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 421 bytes --] On Sun, 25 Jan 2015 15:57:01 -0800, David Lang said: > The Computer Scientist will cringe at the 'hacks' that this introduces, but > there is far more progress made when new capabilities can be added in a way > that's transparent to other layers of the stack then when it requires major > changes to how things work. Otherwise known as the "Just throw an F5 in front of the whole mess" school of network design... :) [-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 4:25 ` Valdis.Kletnieks @ 2015-01-26 4:39 ` David Lang 2015-01-26 16:42 ` Valdis.Kletnieks 0 siblings, 1 reply; 43+ messages in thread From: David Lang @ 2015-01-26 4:39 UTC (permalink / raw) To: Valdis.Kletnieks; +Cc: cerowrt-devel On Sun, 25 Jan 2015, Valdis.Kletnieks@vt.edu wrote: >> The Computer Scientist will cringe at the 'hacks' that this introduces, but >> there is far more progress made when new capabilities can be added in a way >> that's transparent to other layers of the stack then when it requires major >> changes to how things work. > > Otherwise known as the "Just throw an F5 in front of the whole mess" school > of network design... :) Much as you may hate the abuse of standards and protocols that F5 and other load balancers use to trick both clients and servers into operating without knowing that there are multiple machines serving a website, they do make things a lot better than if you tried to make a website reliable and scalable without them. "Theoretically better" is trumped by "it works" any day. For something that's theoretically better to win, it needs to be implemented and be better in practice as well. David Lang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-26 4:39 ` David Lang @ 2015-01-26 16:42 ` Valdis.Kletnieks 0 siblings, 0 replies; 43+ messages in thread From: Valdis.Kletnieks @ 2015-01-26 16:42 UTC (permalink / raw) To: David Lang; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 474 bytes --] On Sun, 25 Jan 2015 20:39:23 -0800, David Lang said: > Much as you may hate the abuse of standards and protocols that F5 and other load > balancers use to trick both clients and servers into operating without knowing > that there are multiple machines serving a website, they do make things a lot > better than if you tried to make a website reliable and scalable without them. Oh, I'm fully aware of that. We have several of the beasts across the hall from my office. :) [-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-22 18:19 ` Richard Smith 2015-01-22 22:09 ` David Lang 2015-01-24 14:59 ` dpreed @ 2015-01-25 8:07 ` Outback Dingo 2015-01-30 16:14 ` Richard Smith 2 siblings, 1 reply; 43+ messages in thread From: Outback Dingo @ 2015-01-25 8:07 UTC (permalink / raw) To: Richard Smith; +Cc: cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 4520 bytes --] my first initial and only thought is over stauration on your network, i dont see anything of enterprise grade APs listed with 30+ users, how many connections and how many users? are they all trying to download/move data at the same time. On Fri, Jan 23, 2015 at 5:19 AM, Richard Smith <smithbone@gmail.com> wrote: > On 01/22/2015 04:18 AM, David Lang wrote: > > Recently, we picked up the 11th floor as well and moved many people up >>> there. I got a 3rd AP (another TP-Link AC1750) and set that one up on >>> a free channel with a different ESSID. >>> >> >> I like to put all the APs on the same ESSID so that people can roam >> between them. This requires that the APs act as bridges to a dedicated >> common network, not as routers. >> > > That's the ultimate plan but for convenience of being able to easily > select what AP I'm talking to or to be able to tell folks to move from one > to another I've got them on different ESSIDs. It also helps me keep track > of what RF channel things are on. > > Then about a week before my original post I got notified that Internet >>> was down. Both 10th floor APs had stopped working. The 11th floor >>> (where I am) was still working. On the 10th floor, I could connect >>> to the TP-link via its IP address on its wired interface but it did >>> not seem to be passing wireless traffic. A reboot fixed it. >>> >> >> There has been an ongoing bug with Apple devices on 5Ghz that causes the >> wifi chipset to lockup. 
We think we've fixed it in the current Cerowrt, >> but I don't know what kernel versions have this problem. This is likely >> to affect multiple vendors who use the same chipset (check the openwrt >> hardware list for details of the chipsets in each model) >> > > Oooohhh! That could be it. We have a _lot_ of Apple devices. Most of the > company uses MacBook,or Air and a large number of people have iPhones and > we use iPods for some of our testing. I'll go dig through the openWRT and > get the details. > > The WNDR3700 was completely unresponsive both via WiFi and when I >>> tried its IP connected directly to it's switch with a Cat-5. I also >>> have a serial port mod on that wndr3700 so I connected up to that >>> instead. >>> >> >> hmm, it's not common to have it be unresponsive on the wired network. >> > > It's uncommon to me. :) This unit has travelled with me for years while I > worked for OLPC and its see a lot of different wireless environments. > Granted never one with this many apple clients. Usually 7-8 Linux/Windows > machines and a pile of XOs. > > So this happened a lot at your SCALE setups? > > room. All the stations are in about a 40 foot radius and all but 1 or >>> 2 have line of sight to the AP. The wndr3700 is in a closet on the >>> side of the room with other equipment so it might be 80 feet away from >>> the furthest station or so. >>> >> >> this doesn't sound unreasonable unless your users are trying to use a >> LOT of bandwidth (although the fact that you refer to the 50Mb >> bottleneck indicates that you may be) >> > > The bottleneck was just a nice side effect. We don't use that much > traffic. I only noticed the limit once I started running netperf-wrapper > tests from a wired host. > > Occasional there will be some big download that eats up bandwidth, but > when I watch the throughput during the day we peak up in to the 40Mbps but > the average is < 10Mbps (Download). 
> > Can I perhaps approximate signal strength by looking at the bitrate >>> for packets that station sends? The theory being that higher quality >>> RF links should use the higher bitrate encodings when sending. >>> >> >> not reliably, too many other things factor in to that. >> > > Indeed. Horst tells me I basically have 2 rates happening on the tplink > 6Mbs and 24Mbps with a few 12Mbps in there. > > If need be I can move the wndr to the same location as the tplink and >>> then have stations connect to the wndr so I can watch the rx signal >>> strength. >>> >> > Looks like that's what I'll have to do. > > There is a lot of room with consumer grade equipment from where you >> currently are. The "Enterprise Grade" systems do have a lot of >> infrastructure to coordinate the different APs. >> > > Thanks for the ray of hope. Yeah I don't need all the multi-AP > coordination handoff stuff. > > -- > Richard A. Smith > > _______________________________________________ > Cerowrt-devel mailing list > Cerowrt-devel@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cerowrt-devel > [-- Attachment #2: Type: text/html, Size: 6990 bytes --] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cerowrt-devel] Recording RF management info _and_ associated traffic? 2015-01-25 8:07 ` Outback Dingo @ 2015-01-30 16:14 ` Richard Smith 0 siblings, 0 replies; 43+ messages in thread From: Richard Smith @ 2015-01-30 16:14 UTC (permalink / raw) To: Outback Dingo, Richard Smith; +Cc: cerowrt-devel On 01/25/2015 03:07 AM, Outback Dingo wrote: > my first initial and only thought is over stauration on your network, i > dont see anything of enterprise grade APs listed with 30+ users, how > many connections and how many users? are they all trying to > download/move data at the same time. Looking at the leases file, we have 90 devices getting IPs. That's about right for the 30 or so people plus all the other devices connected. The users are now split across 3 APs, all on different channels. 1 AP on the 11th floor (tplink stock): 16-20 clients. 2 APs on the 10th floor (tplink stock and WNDR3700v2 OpenWRT): each 10th-floor AP has 10ish clients. You can see all APs from both floors, but the AP not on the floor with you has a pretty low signal. Low but still usable. From watching what's going on at the radiotap level via horst, I don't see a very high level of utilisation, but I've still not been able to catch things in the act of a total fail yet. -- Richard A. Smith ^ permalink raw reply [flat|nested] 43+ messages in thread