From: Dave Taht
To: cerowrt-devel@lists.bufferbloat.net, cerowrt@lists.bufferbloat.net
Date: Mon, 27 Aug 2012 16:15:13 -0700
Subject: [Cerowrt-devel] development snapshot of cerowrt-3.3.8-21 released

I spent the last two weeks hunting down memory-related issues and trying to fold in some development work I already had going.
As best as I can tell, the core memory issues are killed dead; whether this is due to freeing up tons of memory or to the various other changes remains to be determined.

But: Under *no circumstances* install this release on your default router.

I'm putting this out primarily because I'm seeing an odd behavior on the iwl card I have, but not on the ath9ks (yet! still testing). If someone with a 3rd type of wireless card is out there, anything non-iwl or non-linux, please beat this up.

Radical changes:

+ bind replaced with dnsmasq (bind available as an option)
+ support for AAAA naming and RA announcements in dnsmasq
+ Implementation of experimental codel code from kathie's ns2 work
+ fq_codel engages codel sooner
+ fq_codel has a CS1 deprioritization hack
+ codel and fq_codel shrink skbs under overload
+ debloat has reduced defaults for packet limits
+ debloat uses qlens of 2,4,12,12 (up from 2,3,3,3)
+ qos-scripts has reduced defaults
+ strongswan available again

- I haven't looked at the hurricane ipv6 issue
- dlna, upnp, either
- didn't fix ath9k to use smaller allocations
- no tcp small queues

Big bugs remain.

1) htb does weird things at all bandwidths, and with all qdiscs, not just codel/fq_codel. It may well have been doing this for a while (like, months), which would explain a lot. hfsc is also being weird. (hfsc is used by qos-scripts, htb by simple_qos)

2) wifi vs the x86 iwl card.

This is the error that I get in /var/log/messages, and the only way to get connectivity back and clear it is to reboot the *x86* box.

[67046.216150] iwlwifi 0000:03:00.0: fail to flush all tx fifo queues
[67048.224185] iwlwifi 0000:03:00.0: fail to flush all tx fifo queues
[67056.868185] iwlwifi 0000:03:00.0: fail to flush all tx fifo queues

This is how I get the error, inside of about 30 seconds.
(where the ip is the router's ip)

netperf -Y CS5,CS5 -l 120 -H 172.20.42.65 -t TCP_STREAM &
netperf -Y EF,EF -l 120 -H 172.20.42.65 -t TCP_STREAM &
netperf -Y CS1,CS1 -l 120 -H 172.20.42.65 -t TCP_STREAM &
netperf -Y CS0,CS0 -l 120 -H 172.20.42.65 -t TCP_STREAM &

The above saturates the EF and VI queues, and for some reason starves the BE and BK queues (on iwl)...

Packet traces strongly indicate that it's the iwl that's hosed: in this tcpdump, it is receiving packets from cero but is no longer able to transmit them.

15:26:15.889147 ARP, Request who-has ida.home.lan tell 172.20.11.97, length 28
15:26:15.889185 ARP, Reply ida.home.lan is-at 00:26:c6:42:76:e2 (oui Unknown), length 28

(and I'm not running codel/fq_codel on the x86 box, either, on this test)

I will refine this bug report more over time and get it to the linux-wireless mailing list. I just need to set up more boxes.

The same codel-related patch set for linux-3.6-rc3 x86 is now up as "codel2-ns2", where htb, the codel patches, etc. are *just fine*, over ethernet. htb on x86/that version is also just fine. I'm pretty happy with this patch set; it feels like an improvement (at least on x86) over codel and fq_codel from before.

http://snapon.lab.bufferbloat.net/~cero1/deb/

But on cero, htb has got extraordinary delays that shouldn't be there.

And 3.3.8-21 is at (have I given you enough warning yet?)

http://snapon.lab.bufferbloat.net/~cero1/3.3/3.3.8-21/

Up next for me is backing off to a way earlier version of cero, and incrementally adding stuff back in. But first up is a dip in the pool, and beating my head against a tree. Or vice versa. And if I'm lucky some x86 boxes will arrive soon and I can go build those instead of going nutso on this.

-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out with fq_codel!"
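
For anyone trying to reproduce the iwl stall from another box, the four netperf invocations in the report could be generated by a small loop along these lines. This is only a sketch: the function name netperf_cmds and its two arguments are hypothetical, and it uses nothing beyond the netperf flags already shown in the report (-Y, -l, -H, -t). It prints the commands rather than starting any traffic.

```shell
#!/bin/sh
# Hypothetical helper (not part of cerowrt or netperf): print one netperf
# command per DSCP class, in the same order as the original report.
netperf_cmds() {
    host="$1"          # router IP under test (172.20.42.65 in the report)
    secs="${2:-120}"   # test length in seconds; the report used 120
    for dscp in CS5 EF CS1 CS0; do
        printf 'netperf -Y %s,%s -l %s -H %s -t TCP_STREAM &\n' \
            "$dscp" "$dscp" "$secs" "$host"
    done
}

# Dry run: show the commands without sending any packets.
netperf_cmds 172.20.42.65 120
```

The printed lines can be pasted into a shell on the client to start all four streams at once; changing the host or duration argument adapts the same test to another router.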