From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 5 Apr 2012 22:33:47 -0400 (EDT)
From: dpreed@reed.com
To: "Dave Taht"
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Cero-state this week and last
Message-ID: <1333679627.997611294@apps.rackspace.com>
List-Id: Development issues regarding the cerowrt test router project
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_20120405223347000000_65732"

------=_20120405223347000000_65732
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

A small suggestion.

Create a regression test suite, and require contributors to *pass* the test with each submitted patch set.

Be a damned Nazi about checkins that don't meet this criterion - eliminate the right to check in code for anyone who contributes something that breaks functionality.

Every project leader discovers this.  Programmers are *lazy* and refuse to check their inputs unless you shame them into compliance.
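
One way to picture such a gate, as a minimal sketch only: the function names and the idea of handing the test command in as an argument are assumptions made for illustration, not anything CeroWrt or OpenWrt is known to ship. The script runs a test command against a patched tree and accepts the patch set only if that command exits with status 0.

```python
import subprocess
import sys


def run_regression_suite(test_command, workdir="."):
    """Run the suite and return (passed, combined_output).

    test_command is a hypothetical placeholder for whatever the
    project's real regression suite would be.
    """
    result = subprocess.run(
        test_command,
        cwd=workdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode == 0, result.stdout


def gate_patch_set(test_command, workdir="."):
    """Return "accept" if the suite passes, otherwise "reject"."""
    passed, output = run_regression_suite(test_command, workdir)
    if not passed:
        # Show the failing output so the contributor can see what broke.
        sys.stderr.write(output)
    return "accept" if passed else "reject"


if __name__ == "__main__":
    # Stand-in commands: a trivially passing and a trivially failing "suite".
    print(gate_patch_set([sys.executable, "-c", "raise SystemExit(0)"]))
    print(gate_patch_set([sys.executable, "-c", "raise SystemExit(1)"]))
```

Keying the decision on the exit status keeps the gate independent of the test framework: anything that can be run from a command line and fails with a non-zero status can be plugged in.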


-----Original Message-----
From: "Dave Taht" <dave.taht@gmail.com>
Sent: Thursday, April 5, 2012 10:27pm
To: cerowrt-devel@lists.bufferbloat.net
Subject: [Cerowrt-devel] Cero-state this week and last


I attended the ietf conference in Paris (virtually), particularly ccrg
and homenet.

I do encourage folk to pay attention to homenet if possible, as laying
out what home networks will look like in the next 10 years is proving
to be a hairball.
ccrg was productive.

Some news:

I have been spending time fixing some infrastructural problems.

1) After being blindsided by more continuous integration problems in
the last month than in the last 5, I found out that one of the root
causes was that the openwrt build cluster had declined in size from 8
boxes to 1(!!), and time between successful automated builds was in
some cases over a month.

The risk of going from 1 to 0 build slaves seemed untenable. So I sprang
into action, scammed two boxes and travis has tossed them into the
cluster. Someone else volunteered a box.

I am a huge proponent of continuous integration on complex projects.
http://en.wikipedia.org/wiki/Continuous_integration

Building all the components of an OS like openwrt correctly, all the
time, with the dozens of developers involved, with a minimum delta
between commit, breakage, and fix, is really key to keeping the
relatively simple task we face in bufferbloat.net simple: merely layering
on components and fixes that improve the state of the art in networking.

The tgrid is still looking quite bad at the moment.

http://buildbot.openwrt.org:8010/tgrid

There's still a huge backlog of breakage.

But I hope it gets better. Certainly building a full cluster of build
boxes or vms (openwrt@HOME!!) would help a lot more.

If anyone would like to help hardware wise, or learn more about how to
manage a build cluster using buildbot, please contact travis
<thepeople AT openwrt.org>

2) Bloatlab #1 has been completely rewired and rebuilt and most of
the routers in there reflashed to Cerowrt-3.3.1-2 or later. They
survived some serious network abuse over the last couple days
(ironically the only router that crashed was the last rc6 box I had in
the mix - and not due to a network fault! I ran it out of flash with a
logging tool).

To deal with the complexity in there (there's also a sub-lab for some
sdnat and PCP testing), I ended up with a new ipv6 /48 and some better
ways to route that I'll write up soon.

3) I did finally get back to fully working builds for the ar71xx
(cerowrt) architecture a few days ago. I also have a working 3.3.1
kernel for the x86_64 build I use to test the server side.
(bufferbloat is NOT just a router problem. Fixing all sides of a
connection helps a lot). That + a new iproute2 + the debloat script
and YOU TOO can experience orders of magnitude less latency....

http://europa.lab.bufferbloat.net/debloat/ has that 3.3.1 kernel for x86_64

Most of the past week has been backwards rather than forwards, but it
was negative in a good way, mostly.

I'm sorry it's been three weeks without a viable build for others to test.

4) today's build: http://huchra.bufferbloat.net/~cero1/3.3/3.3.1-4/

+ Linux 3.3.1 (this is missing the sfq patch I liked, but it's good enough)
+ Working wifi is back
+ No more fiddling with ethtool tx rings (up to 64 from 2. BQL does
this job better)
+ TCP CUBIC is now the default (no longer westwood)
after 15+ years of misplaced faith in delay based tcp for wireless,
I've collected enough data to convince me that cubic wins. all the
time.
+ alttcp enabled (making it easy to switch)
+ latest netperf from svn (yea! remotely changeable diffserv settings
for a test tool!)
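
For anyone who wants to compare congestion control algorithms on a test host, here is a rough sketch. It assumes a Linux machine: the two /proc paths and the per-socket TCP_CONGESTION option are standard Linux interfaces, but the helper names are made up for illustration, and this is independent of alttcp, whose workings are not described in this message.

```python
import os
import socket

AVAILABLE_PATH = "/proc/sys/net/ipv4/tcp_available_congestion_control"
DEFAULT_PATH = "/proc/sys/net/ipv4/tcp_congestion_control"


def parse_algorithms(text):
    """Split the space-separated sysctl value into a list of names."""
    return text.split()


def choose_algorithm(preferred, available):
    """Return the first preferred algorithm the kernel offers, or None."""
    for name in preferred:
        if name in available:
            return name
    return None


def open_socket_with(algorithm):
    """Create a TCP socket that uses `algorithm` for this socket only."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                    algorithm.encode("ascii"))
    return sock


if __name__ == "__main__":
    # Only attempt the live part where the Linux interfaces exist.
    if os.path.exists(AVAILABLE_PATH) and hasattr(socket, "TCP_CONGESTION"):
        with open(AVAILABLE_PATH) as f:
            available = parse_algorithms(f.read())
        with open(DEFAULT_PATH) as f:
            print("system default:", f.read().strip())
        choice = choose_algorithm(["cubic", "westwood", "reno"], available)
        if choice is not None:
            try:
                with open_socket_with(choice) as sock:
                    raw = sock.getsockopt(socket.IPPROTO_TCP,
                                          socket.TCP_CONGESTION, 16)
                    print("this socket uses:", raw.rstrip(b"\x00").decode())
            except OSError as err:
                # The kernel may refuse algorithms it does not allow.
                print("could not select", choice, "-", err)
```

Setting the algorithm per socket leaves the system-wide default alone, which makes it convenient for side-by-side measurements against the same kernel.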

- still horrible dependencies on time. You pretty much have to get on
it and do a rndc validation disable multiple times, restart ntp
multiple times, killall named multiple times to get anywhere if you
want to get dns inside of 10 minutes.

At this point sometimes I just turn off named in /etc/xinetd.d/named
and turn on port 53 for dnsmasq... but usually after flashing it the
first time, wait 10 minutes (let it clean flash), reboot, wait another
10, then it works. Drives me crazy... Once it's up and has valid time
and is working, dnssec works great but....
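
Why the clock matters so much here, as a hedged illustration: DNSSEC signatures (RRSIG records) carry an inception time and an expiration time, and a validator rejects a signature when its own clock falls outside that window. A box that boots with its clock far in the past (say, at the 1970 epoch) therefore cannot validate anything until its time is set, and if the ntp servers are configured by name, setting the time needs DNS first. The sketch below models only that window check, in simplified form; the function name and the sample dates are invented for illustration and are not taken from bind or from CeroWrt.

```python
from datetime import datetime, timezone


def signature_usable(now, inception, expiration):
    """Return True if `now` lies inside the signature validity window."""
    return inception <= now <= expiration


# Invented sample window, roughly matching the date of this thread.
INCEPTION = datetime(2012, 4, 1, tzinfo=timezone.utc)
EXPIRATION = datetime(2012, 5, 1, tzinfo=timezone.utc)

# Clock shortly after boot on a box that has no valid time yet.
boot_clock = datetime(1970, 1, 1, 0, 5, tzinfo=timezone.utc)
# Clock once ntp has managed to set the time.
synced_clock = datetime(2012, 4, 5, 22, 27, tzinfo=timezone.utc)

if __name__ == "__main__":
    print("at boot:", signature_usable(boot_clock, INCEPTION, EXPIRATION))
    print("after sync:", signature_usable(synced_clock, INCEPTION, EXPIRATION))
```

Under these assumed dates the check fails with the boot-time clock and succeeds once the clock is synced, which matches the behaviour described above: validation only starts working after the box has valid time.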

+ way cool new stuff in dnsmasq for ra and AAAA records
- huge dependency on keeping bind in there
- aqm-scripts. I have not succeeded in making hfsc work right. Period.
+ HTB (vs hfsc) is proving far more tractable. SFQRED is scaling
better than I'd dreamed. Maybe eric dreamed this big, I didn't.
- http://www.bufferbloat.net/issues/352
+ Added some essential randomness back into the entropy pool
- hostapd really acts up at high rates with the hack in there for more
entropy (From the openwrt mainline)
+ named caching the roots idea discarded in favor of classic '.'


--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel
------=_20120405223347000000_65732--