From: dpreed@reed.com
To: "Jim Gettys"
Cc: "cerowrt-devel@lists.bufferbloat.net", Bill Ver Steeg (versteb), bloat
Date: Thu, 14 May 2015 10:28:00 -0400 (EDT)
Subject: Re: [Cerowrt-devel] [Bloat] better business bufferbloat monitoring tools?
Message-ID: <1431613680.888530951@apps.rackspace.com>
List-Id: Development issues regarding the cerowrt test router project

Tools, tools, tools. Make it trivially easy to capture packets in the home (don't require cerowrt, for obvious reasons). For example, an iPhone app that does a tcpdump and sends it to us would be fantastic for diagnosing "make wifi fast" issues and also bufferbloat issues. Give feedback that is helpful to everyone who contributes data. (That's what made Netalyzr work so well... you got feedback ASAP that could be used to understand your own situation.)

Not sure an iPhone app can be disseminated. An Android app might be, as could a MacBook app and a Windows app.

Linux/FreeBSD options: one could generate a memstick app that would boot Linux on a standard Windows laptop to run tcpdump and upload the results, or something that would run in Parallels or VMware Fusion on a Mac.

I've started looking at a hardware measurement platform for my "make WiFi fast" work. Currently it looks like a Rangely board will do the trick, but that won't scale well outside my home, since the hardware costs a few hundred bucks.


On Wednesday, May 13, 2015 11:30am, "Jim Gettys" <jg@freedesktop.org> said:

On Wed, May 13, 2015 at 9:20 AM, Bill Ver Steeg (versteb) <versteb@cisco.com> wrote:
Time scales are important. Any time you use TCP to send a moderately large file, you drive the link into congestion. Sometimes this is for a few milliseconds per hour and sometimes this is for tens of minutes per hour.

For instance, watching a 3 Mbps video (Netflix/YouTube/whatever) on a 4 Mbps link with no cross traffic can cause significant bloat, particularly on older tail-drop middleboxes. The host code does an HTTP GET every N seconds and drives the link as hard as it can until it gets the video chunk. It waits a second or two and then does it again. Rinse and repeat. You end up with a very characteristic delay plot. The bloat starts at 0, builds until the middlebox provides congestion feedback, then sawtooths around at about the buffer size. When the burst ends, the middlebox burns down its buffer and bloat goes back to zero. Wait a second or two and do it again.

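The delay plot described above can be reproduced with a toy model. The sketch below is not from the thread: it is a crude fluid simulation that assumes a hypothetical 4-second chunk period, a 1 Mbit tail-drop buffer, and an AIMD-like sender that ramps linearly and halves its rate on overflow. Real TCP stacks and real video players behave differently; only the shape of the curve is meant to carry over.

```python
def simulate_abr_delay(link_bps=4e6, video_bps=3e6, chunk_period_s=4.0,
                       buffer_bits=1e6, start_bps=1e6, ramp_bps_per_s=8e6,
                       duration_s=12.0, dt=0.001):
    """Return (time_s, queue_delay_s) samples for an on/off ABR-style sender.

    All parameter values are illustrative assumptions, not measurements.
    """
    chunk_bits = video_bps * chunk_period_s  # one chunk holds chunk_period_s of video
    to_send = chunk_bits        # bits of the current chunk not yet accepted by the queue
    next_request = chunk_period_s
    rate = start_bps
    queue = 0.0
    samples = []
    for i in range(int(round(duration_s / dt))):
        t = i * dt
        if t >= next_request:               # player issues the next HTTP GET
            to_send += chunk_bits
            rate = start_bps
            next_request += chunk_period_s
        offered = min(rate * dt, to_send)
        to_send -= offered
        queue += offered
        if queue > buffer_bits:             # tail drop: excess is lost and resent later
            to_send += queue - buffer_bits
            queue = buffer_bits
            rate = max(rate / 2.0, start_bps)    # multiplicative decrease
        elif to_send > 0:
            rate += ramp_bps_per_s * dt          # additive increase
        queue = max(0.0, queue - link_bps * dt)  # bottleneck drains the queue
        samples.append((t, queue / link_bps))
    return samples


samples = simulate_abr_delay()
peak = max(delay for _, delay in samples)
idle = min(delay for t, delay in samples if 3.5 <= t < 4.0)
print(f"peak queueing delay {peak * 1000:.0f} ms, "
      f"delay between bursts {idle * 1000:.0f} ms")
```

With these defaults the peak delay comes out close to buffer_bits / link_bps (about 250 ms), and the delay drops back to zero in the gap before each new request. Plotting the samples shows the ramp, the sawtooth near the buffer limit, and the burn-down.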
It's time to do some packet traces to see what the video providers are doing. In YouTube's case, I believe the traffic is using the new sch_fq qdisc, which does packet pacing; but exactly how this plays out by the time packets reach the home isn't entirely clear to me. Other video providers/CDNs may or may not have started generating clues.

Also note that so far, no one is trying to pace the IW (initial window) transmission at all.

You can't fix this by adding bandwidth to the link. The endpoint's TCP sessions will simply ramp up to fill the link. You will shorten the congested phase of the cycle, but TCP will ALWAYS FILL THE LINK (given enough time to ramp up).

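A back-of-the-envelope check of "shorten the congested phase": if the sender saturates the link for the whole burst (the assumption in the paragraph above, with ramp-up ignored), the fraction of each cycle spent congested is simply the video rate divided by the link rate. The 20 Mbps figure below is a hypothetical upgrade, not a number from the thread.

```python
def congested_fraction(video_bps, link_bps):
    """Fraction of each chunk cycle during which the link is saturated,
    assuming the sender fills the link for the entire burst."""
    if video_bps <= 0 or link_bps <= 0:
        raise ValueError("rates must be positive")
    return min(1.0, video_bps / link_bps)


for link_mbps in (4, 20):
    frac = congested_fraction(3e6, link_mbps * 1e6)
    print(f"{link_mbps} Mbps link: congested {frac:.0%} of the time")
```

Under that assumption the 3 Mbps stream keeps a 4 Mbps link congested 75% of the time and a 20 Mbps link 15% of the time. The congested phase shrinks but does not disappear.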
That has been the behavior in the past, but it's no longer safe to presume. We shouldn't tar everyone with the same brush; rather, we should do a bit of science, and then try to hold the feet of those who do not "play nice" with the network to the fire.

Some packet captures in the home can easily sort this out.
Jim

The new AQM (and FQ_AQM) algorithms do a much better job of controlling the oscillatory bloat, but you can still see ABR video patterns in the delay figures.

Bvs

-----Original Message-----
From: bloat-bounces@lists.bufferbloat.net [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Dave Taht
Sent: Tuesday, May 12, 2015 12:00 PM
To: bloat; cerowrt-devel@lists.bufferbloat.net
Subject: [Bloat] better business bufferbloat monitoring tools?

One thread bothering me on dslreports.com is that some folk seem to think you only get bufferbloat if you stress test the network, whereas transient bufferbloat is happening all the time, everywhere.

On one of my main sqm'd network gateways, day in, day out, it reports about 6000 drops or ECN marks on ingress, and about 300 on egress.
Before I doubled the bandwidth that main box got, the drop rate used to be much higher, and a great deal of the bloat, drops, etc., has now moved into the wifi APs deeper in the network, where I am not monitoring it effectively.
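Counting ECN marks from a packet capture needs very little code, since the ECN field is the low two bits of the second byte of the IPv4 header (RFC 3168). The sketch below is an illustration, not the tooling used on that gateway. It assumes the caller has already extracted raw IPv4 headers from a capture, and the function names (including the fake_header helper) are made up for this example.

```python
from collections import Counter

# ECN codepoints from RFC 3168: low two bits of the former IPv4 TOS byte.
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def ecn_codepoint(ipv4_header: bytes) -> str:
    """Return the ECN codepoint name carried in a raw IPv4 header."""
    if len(ipv4_header) < 20 or ipv4_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return ECN_NAMES[ipv4_header[1] & 0b11]


def count_ecn(headers):
    """Tally ECN codepoints over an iterable of raw IPv4 headers."""
    return Counter(ecn_codepoint(h) for h in headers)


def fake_header(tos: int) -> bytes:
    """Hypothetical helper: a minimal 20-byte IPv4 header with the given TOS byte."""
    return bytes([0x45, tos]) + bytes(18)


counts = count_ecn([fake_header(0x00), fake_header(0x02),
                    fake_header(0x03), fake_header(0x03)])
print(counts["CE"], "of", sum(counts.values()), "packets carry a CE mark")
```

A CE count per interval, plotted next to the drop count, is the kind of series the monitoring tools discussed here could graph alongside bandwidth.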

I would love to see tools like mrtg, cacti, nagios and smokeping[1] be more closely integrated, with bloat-related plugins, and in particular, as things like fq_codel and other ECN-enabled AQMs deploy, start also tracking congestive events like loss and ECN CE markings on the bandwidth tracking graphs.

This would counteract to some extent the classic 5-minute bandwidth summaries everyone looks at, which hide real traffic bursts, latencies and loss at sub-5-minute timescales.

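To see how much a 5-minute summary can hide, take a hypothetical 4 Mbps link that is idle except for one 10-second burst at line rate. The per-second samples below are invented for illustration, and summarize is a made-up helper, not part of any of the tools named above.

```python
def summarize(samples_bps, window_s=300):
    """Return (mean_bps, peak_bps) for each window of per-second samples."""
    summaries = []
    for start in range(0, len(samples_bps), window_s):
        window = samples_bps[start:start + window_s]
        summaries.append((sum(window) / len(window), max(window)))
    return summaries


LINK_BPS = 4e6
per_second = [0.0] * 300
per_second[100:110] = [LINK_BPS] * 10   # one 10-second burst at line rate

mean_bps, peak_bps = summarize(per_second)[0]
print(f"5-minute mean: {mean_bps / LINK_BPS:.1%} of link, "
      f"1-second peak: {peak_bps / LINK_BPS:.0%}")
```

The 5-minute graph would show roughly 3% utilization even though the link was saturated, and possibly bloated, for ten seconds.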
mrtg and cacti rely on SNMP. While loss statistics are deeply part of SNMP, I am not aware of there being a MIB for CE events, and a quick Google search was unrevealing. Is there one?

There is also a need for more cross-network monitoring, using tools such as those in this excellent paper:

http://www.caida.org/publications/papers/2014/measurement_analysis_internet_interconnection/measurement_analysis_internet_interconnection.pdf

[1] The network monitoring tools market is quite vast and has many commercial applications, like intermapper, forks of nagios, vendor-specific products from Cisco, etc. Far too many to list, and so far as I know, none are reporting ECN-related stats, nor combining latency and loss with bandwidth graphs. I would love to know if any products, commercial or open source, did....

--
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67
_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat

_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel