From: "Fred Baker (fred)"
To: Dave Taht
Cc: cerowrt-devel, bloat
Date: Tue, 1 Jul 2014 16:38:47 +0000
Subject: Re: [Cerowrt-devel] [Bloat] viability of the data center in the internet of the future

On Jul 1, 2014, at 1:37 AM, Dave Taht wrote:

> On Sat, Jun 28, 2014 at 5:50 PM, Fred Baker (fred) wrote:
>> There is in fact a backbone. Once upon a time, it was run by a single company, BBN. Then it was more like five, and then ... and now it's 169. There are, if the BGP report (http://seclists.org/nanog/2014/Jun/495) is to be believed, 47136 ASNs in the system, of which 35929 don't show up as transit for anyone and are therefore presumably edge networks and potentially multihomed, and of those, 16325 only announce a single prefix. Of the 6101 ASNs that show up as transit, 169 ONLY show up as transit. Yes, the core is 169 ASNs, and it's not a little dot off to the side. If you want to know where it is, do a traceroute (tracert on Windows).
> The fact that the internet has grown to 10+ billion devices (by some
> estimates), and from 1 transit provider to only 169 doesn't impress
> me. There are 206 countries in the world...

Did I say that there was only one transit provider? I said there were 169 AS's that, in potaroo's equivalent of route views, *only* show up as transit. There are, this morning, 195 transit-only AS's, 40724 origin-only AS's (AS's that are only found at the edge), and 6573 AS's that show up as both origin AS's and transit AS's.

http://bgp.potaroo.net/as2.0/bgp-active.html

> It is a shame that multi-homing has never been easily obtainable nor
> widely available; it would be nice to be able to have multiple links
> for any business critically dependent on the continuous operation of
> the internet and cloud.

Actually, it is pretty common. Again, from potaroo.net, there are 30620 origin AS's announced via a single AS path. The implication is that there are 40724-30620=10104 origin AS's being announced to AS65000 via multiple AS paths. I don't know whether they or their upstreams are multi-homed, but I'll bet a significant subset of them are multihomed.
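For concreteness, the bookkeeping behind those numbers can be sketched in a few lines of Python. This is only an illustration of the counting, not potaroo's actual method; the input format (one space-separated AS path per line, rightmost AS being the origin) and the sample paths are assumptions:

# Sketch: classify ASes as origin-only, transit-only, or both from a set of
# AS paths, and find origin ASes seen via more than one distinct path
# (a crude multihoming hint). Input format and sample data are assumed.
from collections import defaultdict

def classify(paths):
    origins, transits = set(), set()
    paths_per_origin = defaultdict(set)
    for path in paths:
        asns = path.split()
        if not asns:
            continue
        origin = asns[-1]              # rightmost AS originates the prefix
        origins.add(origin)
        transits.update(asns[:-1])     # every other AS on the path is transit
        paths_per_origin[origin].add(tuple(asns))
    origin_only = origins - transits
    transit_only = transits - origins
    both = origins & transits
    multipath = {a for a, p in paths_per_origin.items() if len(p) > 1}
    return origin_only, transit_only, both, multipath

sample = ["65000 3356 174 64512", "65000 1239 64512", "65000 3356 64513"]
origin_only, transit_only, both, multipath = classify(sample)
print(len(origin_only), len(transit_only), len(both), len(multipath))

Run over a full route-views-style dump, the three set sizes correspond to the origin-only, transit-only, and "both" counts above, and the last set is the population announced via more than one AS path.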
>> I'll give you two, one through Cisco and one through my residential provider.
>>
>> traceroute to reed.com (67.223.249.82), 64 hops max, 52 byte packets
>>  1  sjc-fred-881.cisco.com (10.19.64.113)  1.289 ms  12.000 ms  1.130 ms

> This is through your vpn?

Yes

>>  2  sjce-access-hub1-tun10.cisco.com (10.27.128.1)  47.661 ms  45.281 ms  42.995 ms
>>  3  ...
>> 11  sjck-isp-gw1-ten1-1-0.cisco.com (128.107.239.217)  44.972 ms  45.094 ms  43.670 ms
>> 12  tengige0-2-0-0.gw5.scl2.alter.net (152.179.99.153)  48.806 ms  49.338 ms  47.975 ms
>> 13  0.xe-9-1-0.br1.sjc7.alter.net (152.63.51.101)  43.998 ms  45.595 ms  49.838 ms
>> 14  206.111.6.121.ptr.us.xo.net (206.111.6.121)  52.110 ms  45.492 ms  47.373 ms
>> 15  207.88.14.225.ptr.us.xo.net (207.88.14.225)  126.696 ms  124.374 ms  127.983 ms
>> 16  te-2-0-0.rar3.washington-dc.us.xo.net (207.88.12.70)  127.639 ms  132.965 ms  131.415 ms
>> 17  te-3-0-0.rar3.nyc-ny.us.xo.net (207.88.12.73)  129.747 ms  125.680 ms  123.907 ms
>> 18  ae0d0.mcr1.cambridge-ma.us.xo.net (216.156.0.26)  125.009 ms  123.152 ms  126.992 ms
>> 19  ip65-47-145-6.z145-47-65.customer.algx.net (65.47.145.6)  118.244 ms  118.024 ms  117.983 ms
>> 20  * * *
>> 21  209.59.211.175 (209.59.211.175)  119.378 ms  *  122.057 ms
>> 22  reed.com (67.223.249.82)  120.051 ms  120.146 ms  118.672 ms
>>
>> traceroute to reed.com (67.223.249.82), 64 hops max, 52 byte packets
>>  1  10.0.2.1 (10.0.2.1)  1.728 ms  1.140 ms  1.289 ms
>>  2  10.6.44.1 (10.6.44.1)  122.289 ms  126.330 ms  14.782 ms

> ^^^^^ is this a wireless hop or something? Seeing your traceroute jump
> all the way to 122+ms strongly suggests you are either wireless or
> non-pied/fq_codeled.

The zeroth hop is wireless - I pull my Ethernet plug and turn on the wifi interface, which is instantiated by two Apple Airport APs in the home. 10.0.2.1 is the residential slice of my router. To be honest, I'm hard-pressed to say what 10.6.44.1 is; I suspect it's an address of my CMTS. The address *I* have for my CMTS is 98.173.193.1, and my address in that subnet is 98.173.193.12. If you want my guess, Cox is returning an RFC 1918 address to prevent non-customers from pinging it.

--- 10.6.44.1 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 7.668/10.102/12.012/1.520 ms

--- 98.173.193.1 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 8.414/30.501/120.407/41.031 ms

and 98.173.193.1 doesn't show up in my traceroute.

Absent per-hop timestamps, I'm not in a position to say where the delay came from. For all I know, it has something to do with the Wifi in the house. Wifi can have really strange delays. Whatever.

>>  3  ip68-4-12-20.oc.oc.cox.net (68.4.12.20)  13.208 ms  12.667 ms  8.941 ms
>>  4  ip68-4-11-96.oc.oc.cox.net (68.4.11.96)  17.025 ms  13.911 ms  13.835 ms
>>  5  langbprj01-ae1.rd.la.cox.net (68.1.1.13)  131.855 ms  14.677 ms  129.860 ms
>>  6  68.105.30.150 (68.105.30.150)  16.750 ms  31.627 ms  130.134 ms
>>  7  ae11.cr2.lax112.us.above.net (64.125.21.173)  40.754 ms  31.873 ms  130.246 ms
>>  8  ae3.cr2.iah1.us.above.net (64.125.21.85)  162.884 ms  77.157 ms  69.431 ms
>>  9  ae14.cr2.dca2.us.above.net (64.125.21.53)  97.115 ms  113.428 ms  80.068 ms
>> 10  ae8.mpr4.bos2.us.above.net.29.125.64.in-addr.arpa (64.125.29.33)  109.957 ms  124.964 ms  122.447 ms
>> 11  *  64.125.69.90.t01470-01.above.net (64.125.69.90)  86.163 ms  103.232 ms
>> 12  250.252.148.207.static.yourhostingaccount.com (207.148.252.250)  111.068 ms  119.984 ms  114.022 ms
>> 13  209.59.211.175 (209.59.211.175)  103.358 ms  87.412 ms  86.345 ms
>> 14  reed.com (67.223.249.82)  87.276 ms  102.752 ms  86.800 ms

> Doing me to you:
>
> d@ida:$ traceroute -n 68.4.12.20

Through Cox:

--- 68.4.12.20 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 12.954/16.348/28.209/4.777 ms

traceroute to 68.4.12.20 (68.4.12.20), 64 hops max, 52 byte packets
 1  10.0.2.1  1.975 ms  9.026 ms  1.397 ms
 2  * * *
 3  * * *

Traceroute to Facebook works, though:

traceroute www.facebook.com
traceroute to star.c10r.facebook.com (31.13.77.65), 64 hops max, 52 byte packets
 1  10.0.2.1 (10.0.2.1)  1.490 ms  1.347 ms  0.934 ms
 2  10.6.44.1 (10.6.44.1)  9.253 ms  11.308 ms  10.974 ms
 3  ip68-4-12-20.oc.oc.cox.net (68.4.12.20)  11.275 ms  13.531 ms  20.180 ms
 4  ip68-4-11-96.oc.oc.cox.net (68.4.11.96)  18.901 ms  13.013 ms  18.723 ms
 5  sanjbprj01-ae0.0.rd.sj.cox.net (68.1.5.184)  29.397 ms  28.944 ms  30.062 ms
 6  sv1.br01.sjc1.tfbnw.net (206.223.116.166)  31.011 ms  31.082 ms
    sv1.pr02.tfbnw.net (206.223.116.153)  32.035 ms
 7  ae1.bb01.sjc1.tfbnw.net (74.119.76.23)  32.932 ms  33.251 ms
    po126.msw01.05.sjc1.tfbnw.net (31.13.31.131)  31.822 ms
 8  edge-star-shv-05-sjc1.facebook.com (31.13.77.65)  38.234 ms  44.150 ms  31.165 ms

So it's not that the router is dropping incoming ICMP.
Through Cisco:

--- 68.4.12.20 ping statistics ---
10 packets transmitted, 0 packets received, 100.0% packet loss

traceroute to 68.4.12.20 (68.4.12.20), 64 hops max, 52 byte packets
 1  10.19.64.113  1.173 ms  0.932 ms  0.932 ms
 2  10.27.128.1  36.256 ms  36.478 ms  37.376 ms
 3  10.20.1.205  35.831 ms  36.211 ms  36.090 ms
 4  171.69.14.249  36.084 ms  36.345 ms  37.889 ms
 5  171.69.14.206  38.342 ms  37.791 ms  39.771 ms
 6  171.69.7.178  37.699 ms  36.662 ms  41.758 ms
 7  128.107.236.39  43.112 ms  36.401 ms  39.407 ms
 8  128.107.239.6  35.576 ms  35.092 ms  37.770 ms
 9  128.107.239.218  35.846 ms  35.337 ms  36.488 ms
10  128.107.239.250  35.504 ms  36.924 ms  39.353 ms
11  128.107.239.217  36.881 ms  38.063 ms  37.892 ms
12  152.179.99.153  38.745 ms  39.754 ms  39.665 ms
13  152.63.51.97  38.322 ms  37.466 ms  41.380 ms
14  129.250.9.249  39.924 ms  40.913 ms  39.690 ms
15  129.250.5.52  46.302 ms  43.463 ms  39.334 ms
16  129.250.6.10  49.332 ms  45.380 ms  47.309 ms
17  129.250.5.86  46.556 ms  48.806 ms
    129.250.5.70  48.635 ms
18  129.250.6.181  48.020 ms
    129.250.6.203  47.502 ms  47.111 ms
19  129.250.194.166  47.373 ms  48.532 ms  48.723 ms
20  68.1.0.179  66.514 ms
    68.1.0.185  63.758 ms
    68.1.0.189  61.326 ms
21  * * *

> Using ping rather than traceroute I get a typical min RTT to you
> of 32ms.
>
> As the crow drives between santa barbara and los gatos (280 miles), at
> the speed of light in cable, we have roughly 4ms of RTT between us, or
> 28ms of induced latency due to the characteristics of the underlying
> media technologies, and the quality and limited quantity of the
> interconnects.
>
> A number I've long longed to have from fios, dsl, and cable is a
> measurement of "cross-town" latency - in the prior age of
> circuit-switched networks, I can't imagine it being much higher than
> 4ms, and local telephony used to account for a lot of calls.
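As a rough sanity check on that 4 ms figure (assuming propagation at about two-thirds of c in fiber or coax; the mileage and speed here are approximations, not measurements):

# Back-of-envelope propagation delay over ~280 miles of cable plant.
distance_km = 280 * 1.609                     # ~450 km between the endpoints
speed_km_per_ms = 300_000 * (2 / 3) / 1000    # ~200 km per millisecond in cable
one_way_ms = distance_km / speed_km_per_ms    # ~2.3 ms one way
rtt_ms = 2 * one_way_ms                       # ~4.5 ms round trip
print(f"one-way ~{one_way_ms:.1f} ms, RTT ~{rtt_ms:.1f} ms")

That is consistent with the quoted figure, and it leaves most of the observed ~32 ms minimum RTT to the access and interconnect equipment rather than to distance.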
Well, if it's of any interest, I once upon a time had a fractional T-1 to the home (a different one, but here in Santa Barbara), and ping RTT to Cisco was routinely 30ish ms, much as it is now through Cox. I did have it jump once to about 600 ms, and I called to complain.

> Going cable to cable, between two comcast cablemodems on (so far as I
> know) different CMTSes, the 20 miles between los gatos and scotts
> valley:
>
> 1  50.197.142.150  0.794 ms  0.692 ms  0.517 ms
> 2  67.180.184.1  19.266 ms  18.397 ms  8.726 ms
> 3  68.85.102.173  14.953 ms  9.347 ms  10.213 ms
> 4  69.139.198.146  20.477 ms
>    69.139.198.142  12.434 ms
>    69.139.198.138  16.116 ms
> 5  68.87.226.205  17.850 ms  15.375 ms  13.954 ms
> 6  68.86.142.250  28.254 ms  33.133 ms  28.546 ms
> 7  67.180.229.17  21.987 ms  23.831 ms  27.354 ms
>
> gfiber testers are reporting 3-5ms RTT to speedtest (co-lo'd in their
> data center), which is a very encouraging statistic, but I don't have
> subscriber-2-subscriber numbers there. Yet.

>> Cisco->AlterNet->XO->ALGX is one path, and Cox->AboveNet->presumably ALGX is another. They both traverse the core.
>>
>> Going to bufferbloat.net, I actually do skip the core in one path. Through Cisco, I go through CoreSite and Hurricane Electric and finally into ISC. ISC, it turns out, is a Cox customer; taking my residential path, since Cox serves us both, the traffic never goes upstream from Cox.
>>
>> Yes, there are CDNs. I don't think you'd like the way Video/IP and especially adaptive bitrate video - Netflix, Youtube, etc - worked if they didn't exist.

> I totally favor CDNs of all sorts. My worry - not successfully
> mirrored in the fast/slow lane debate - was over the vertical
> integration of certain providers preventing future CDN deployments of
> certain kinds of content.

Personally, I think most of that is blarney. A contract to colo a CDN provider is money for the service provider. I haven't noticed any service providers turning down money.

>> Akamai is probably the prototypical one, and when they deployed theirs it made the Internet quite a bit snappier - and that helped the economics of Internet sales. Google and Facebook actually do operate large data centers, but a lot of their common content (or at least Google's) is in CDNlets. NetFlix uses several CDNs, or so I'm told; the best explanation I have found of their issues with Comcast and Level 3 is at http://www.youtube.com/watch?v=tR1sLLOYxnY (and it has imperfections). And yes, part of the story is business issues over CDNs. Netflix's data traverses the core once to each CDN download server, and from the server to its customers.

> Yes, that description mostly mirrors my understanding, and the viewpoint we
> put forth in the Wired article, which I hoped helped to defuse the hysteria.
>
> Then what gfiber published shortly afterwards on their co-lo policy
> scored some points, I thought.
>
> http://googlefiberblog.blogspot.com/2014/05/minimizing-buffering.html
>
> In addition to the wayward political arguments, what bothered me
> about level3's argument is that they made unsubstantiated claims about
> packet loss and latency that I'd have loved to hear more about,
> notably whether or not they had any AQM in place.

Were I Netflix and company, and for that matter Youtube, I would handle delay at the TCP sender by using a delay-based TCP congestion control algorithm. There is at least one common data center provider that I think does that; they told me that they had purchased a congestion control algorithm (although the guy I was speaking with didn't know what they bought or from whom), and the only one I know of that is for sale in that sense is a pretty effective delay-based algorithm. The point of TCP congestion control is to maximize throughput while protecting the Internet. I would argue that it SHOULD be to maximize throughput while minimizing latency. Rant available on request.
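To make the delay-based idea concrete, here is a minimal sketch in the spirit of TCP Vegas. It illustrates the general technique only - it is not the purchased algorithm mentioned above, whose details I don't know - and the window units and the alpha/beta thresholds are arbitrary. The sender keeps the minimum RTT it has seen as an estimate of the uncongested path delay, and backs the window off as soon as measured RTT indicates queue buildup, rather than waiting for loss:

# Vegas-style illustration: adjust a congestion window from RTT samples.
# base_rtt approximates the uncongested path RTT; the gap between expected
# and actual throughput estimates how many packets are sitting in queues.
class DelayBasedCC:
    def __init__(self, alpha=2.0, beta=4.0):
        self.cwnd = 10.0         # congestion window, in packets
        self.base_rtt = None     # lowest RTT observed so far
        self.alpha = alpha       # grow while fewer than alpha packets queued
        self.beta = beta         # shrink once more than beta packets queued

    def on_rtt_sample(self, rtt):
        if self.base_rtt is None or rtt < self.base_rtt:
            self.base_rtt = rtt
        expected = self.cwnd / self.base_rtt          # pkts/s with no queueing
        actual = self.cwnd / rtt                      # pkts/s actually achieved
        queued = (expected - actual) * self.base_rtt  # est. packets in queue
        if queued < self.alpha:
            self.cwnd += 1.0                          # path looks clear: probe
        elif queued > self.beta:
            self.cwnd = max(2.0, self.cwnd - 1.0)     # queue building: back off
        # in between: hold the window steady

cc = DelayBasedCC()
for rtt in [0.030, 0.031, 0.030, 0.045, 0.060, 0.032]:   # RTT samples, seconds
    cc.on_rtt_sample(rtt)
    print(f"rtt={rtt*1000:.0f} ms  cwnd={cc.cwnd:.1f}")

The contrast with loss-based control is that the window comes down as soon as queueing delay shows up, rather than only after a buffer somewhere overflows; that is what keeps the latency side of the trade under control.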
>> The IETF uses a CDN, as of recently. It's called Cloudflare.
>>
>> One of the places I worry is Chrome and Silk's SPDY Proxies, which are somewhere in Google and Amazon respectively.

> Well, the current focus on e2e encryption everywhere is breaking good
> old fashioned methods of minimizing dns and web traffic inside an
> organization and coping with odd circumstances like satellite links. I
> liked web proxies; they were often capable of reducing traffic by 10s
> of percentage points, reducing latency enormously for lossy or satellite
> links, and were frequently used by large organizations (like schools)
> to manage content.

Well, yes. They also have the effect of gerrymandering routing. All traffic through a proxy could go directly to the destination but goes first to the proxy. If the proxy is on the path, well and good. If it's off-path, it adds RTT.

>> Chrome and Silk send https and SPDY traffic directly to the targeted service, but http traffic to their proxies, which do their magic and send the result back. One of the potential implications is that instead of going to the CDN nearest me, it then goes to the CDN nearest the proxy. That's not good for me. I just hope that the CDNs I use accept https from me, because that will give me the best service (and btw encrypts my data).
>>
>> Blind men and elephants, and they're all right.