From: Dave Taht
To: "David P. Reed"
Cc: "cerowrt-devel@lists.bufferbloat.net", bloat
Date: Thu, 12 Jun 2014 14:46:18 -0700
Subject: Re: [Cerowrt-devel] BQL, txqueue lengths and the internet of things
List-Id: Development issues regarding the cerowrt test router project

On Wed, Jun 11, 2014 at 6:05 PM, David P.
Reed wrote:

> Maybe you can do a quick blog howto?

I am thinking of a writeup, yes. But I am low on time this month. I'd like
to target a publication outside the normal bufferbloat community. The folk
working on beaglebone stuff DO tend to care about latency and realtime
behavior a lot, and are willing to resort to programming the PRU to do
robotic pwm projects and projects like

http://www.nycresistor.com/2013/09/12/octoscroller/

... so it would be my hope that community would "get bufferbloat" at a whole
different level than we do.

In my case I have a longstanding interest in reliably transporting real
time audio, which usually has latency constraints below 2ms, no matter what.
The BQL patch with sch_fq makes the difference between success and failure
for that. Hmm. Maybe AES.

> I'd bet the same could be done for
> raspberry pi and perhaps my other toy the wandboard which has a gigE adapter

Well, measure first? Many of the low end devices can't achieve gigE in the
first place, rendering the BQL method somewhat moot. If you can provide a
pointer to the right driver I can take a look. (dmesg | grep eth)

I've long planned on doing a BQL'd driver for the zedboard/zynq and
parallella. I've looked over that code thoroughly; all I need is a board to
test and spare time. (In fact for the latter, I'd like to take a stab at
writing better ethernet hardware.)

> and Scsi making it a nice iscsi target or nfs server.

We took an initial stab at BQL'ing the usbnet driver about a year back.
There were too many possible error-out conditions to get a proper accounting
(at the time, anyway). Fixing this would not only make the Pi better;
hundreds of usb ethernet devices exist, and the vast majority of lte devices
are hooked up via this driver.

The numbers I just got for usb latency on that were for 100+mbit operation,
and the amount of buffering is fixed...

https://plus.google.com/u/0/107942175615993706558/posts/Cpd76KHUbpp

> De bloating the world... One step at a time.
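
On the "measure first" point above: a minimal sketch, assuming a Linux host
with sysfs mounted, of how one might read an interface's txqueuelen and its
byte_queue_limits counters before deciding whether a driver needs work. The
function names and the "eth0" default are made up for illustration. As far
as I know the byte_queue_limits directory shows up on any kernel built with
CONFIG_BQL, whether or not the driver does BQL accounting, so treat a
"limit" and "inflight" that stay at 0 under load as a hint that the driver
lacks support, not as proof.

```python
from pathlib import Path

BQL_FIELDS = ("limit", "limit_min", "limit_max", "inflight")


def tx_queue_len(iface, sysfs_root="/sys/class/net"):
    """Return the interface's txqueuelen (in packets) as reported by sysfs."""
    return int((Path(sysfs_root) / iface / "tx_queue_len").read_text().strip())


def bql_status(iface, sysfs_root="/sys/class/net"):
    """Map each tx queue name to its byte_queue_limits counters.

    Returns {} when the interface or the byte_queue_limits directories
    are absent (for example, a kernel built without CONFIG_BQL).
    """
    status = {}
    queues = Path(sysfs_root) / iface / "queues"
    for bql_dir in sorted(queues.glob("tx-*/byte_queue_limits")):
        counters = {}
        for field in BQL_FIELDS:
            path = bql_dir / field
            if path.is_file():
                counters[field] = int(path.read_text().strip())
        status[bql_dir.parent.name] = counters
    return status


if __name__ == "__main__":
    iface = "eth0"  # assumption: use the name shown by `dmesg | grep eth`
    if (Path("/sys/class/net") / iface).exists():
        print(iface, "txqueuelen:", tx_queue_len(iface))
        for queue, counters in bql_status(iface).items():
            print(queue, counters)
    else:
        print(iface, "not found")
```

Run it once idle and once while saturating the link; the interesting part is
whether "inflight" and "limit" move at all.
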

Adding BQL is easy, and most of the work can be done with code inspection.
The huge problem is you absolutely have to be able to test the device and
driver under a variety of circumstances after you patch it in. (What took me
the longest was finding a correct toolchain for kernel builds, actually, and
then finding a bug in dma padding that I'd missed until, after a whole bunch
of printks, enlightenment dawned.)

I estimate that doing the beaglebone BQL patch cost me 30 hrs of time, or
about 6k at my current billing rate. (While the 160,000+ present users of
the beaglebone might get back 20ms under load on a regular basis, that's 30
hours of my life I'll never have back.)

I figure getting the patch accepted in mainline will take another 10, and
for it to flow out to the beaglebone userbase in less than a year I'd have
to convince the maintainers to incorporate it out of tree.

The new rev C beaglebone only ships with 3.8.x as its default kernel,
although modern kernels are readily available via Robert Nelson's work
( https://rcn-ee.net/deb/wheezy-armhf/ ), and building your own is more or
less correctly documented here:

http://eewiki.net/display/linuxonarm/BeagleBone+Black

So... Were it a new device, and a driver in development, and I'd had data
sheets, and a working compiler, it probably would have taken much less.

The right people to do this work are the chipmakers writing drivers before
they ever hit the mainline, or places like linaro.org that are doing tons of
arm work. It's certainly my hope that, now that there's a demonstrable proof
of concept and of the benefit, TI will retrofit similar code to the other
devices that it makes and do the needed testing. (But I'm not holding my
breath.)

So BQL-on-everything is something of a project that needs to get driven
somehow... and either way of getting eyeballs and resources on it, whether
just doing it or creating publicity around the need for it so others do it,
has a cost.

...
and fixing wifi is going to be a lot harder than fixing usb.

>
> On Jun 11, 2014, Dave Taht wrote:
>>
>> The bloat problem and solutions are not just limited to fixing
>> routers, but hosts.
>>
>> Nearly every low end board I've seen out there forgoes a gigE ethernet
>> interface in favor of a lower power and cost 100mbit interface.
>>
>> No distro I've seen modifies the default pfifo txqueuelen from the
>> current 1000 packet default down to a more reasonable 100 packet
>> default in that case. And, while many ethernet devices in this
>> category are hooked up via usb (and currently hard to add BQL support
>> to), some are not, and byte queue limit support can be easily added to
>> those.
>>
>> Sadly, byte queue limits (BQL) are only implemented on a bunch of top
>> end ethernet drivers. (about 10, last I looked)
>>
>> I needed a break from big problems, so a couple late nights later, I
>> have a very small patch adding support for BQL to the beaglebone
>> black:
>>
>>
>> http://snapon.lab.bufferbloat.net/~d/0001-Add-BQL-support-to-cpsw-beaglebone-driver.patch
>>
>> And the results were quite pleasing at 100mbit. BQL holds things down
>> to two full size packets in the tx ring and we see an enormous
>> improvement in bidirectional throughput, jitter, and latency.
>>
>> http://snapon.lab.bufferbloat.net/~d/beagle_bql/bql_makes_a_difference.png
>> http://snapon.lab.bufferbloat.net/~d/beagle_bql/beaglebonewins.png
>>
>> The default linux behavior ( pfifo_fast, txqueue 1000 ) prior to this
>> patch looked pretty awful:
>>
>>
>> http://snapon.lab.bufferbloat.net/~d/beagle_nobql/pfifo_nobql_tsq3028txqueue1000.svg
>>
>> and went to looking like this:
>>
>>
>> http://snapon.lab.bufferbloat.net/~d/beagle_bql/pfifo_bql_tsq3028txqueue1000.svg
>>
>> And adding the new fq scheduler looked like this:
>>
>> http://snapon.lab.bufferbloat.net/~d/beagle_bql/fq_bql_tsq3028.svg
>>
>> (fq_codel was similar)
>>
>> The fact that we don't achieve full upload throughput on this last
>> test is probably
>> due to having a tail dropping switch in the way, and/or some dma dequeuing
>> cleanup conflicts between the low level transmit and receive queues on
>> this device (they share an interrupt AND use napi which seems
>> puzzling).
>>
>> But any day I can get a 4-10x improvement in latency and throughput is
>> a good day. One IoT device down, thousands to go. It would be nice if
>> the chipmakers were incorporating bql into boxes destined for the
>> internet of things.
>
>
> -- Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
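
For a feel of why the quoted message wants txqueuelen 100 rather than 1000
at 100mbit, and why two packets in the tx ring is better still, here is a
back-of-the-envelope sketch. It computes how long a completely full queue of
full-size frames takes to drain. The 1514-byte frame size is an assumption
(1500-byte MTU plus ethernet header, ignoring preamble, FCS and inter-frame
gap), and the function name is made up. These are upper bounds for a full
queue, not measurements; the measured behavior is in the plots linked above.

```python
def queue_delay_ms(packets, link_mbit, frame_bytes=1514):
    """Time in ms to drain `packets` full-size frames at `link_mbit`.

    frame_bytes=1514 is an assumed full-size ethernet frame; on-the-wire
    overhead (preamble, FCS, inter-frame gap) is deliberately ignored.
    """
    bits = packets * frame_bytes * 8
    return bits / (link_mbit * 1_000_000) * 1000


if __name__ == "__main__":
    cases = (
        ("txqueuelen 1000", 1000),
        ("txqueuelen 100", 100),
        ("BQL, ~2 packets", 2),
    )
    for label, packets in cases:
        delay = queue_delay_ms(packets, 100)
        print(f"{label:>16}: {delay:7.2f} ms at 100mbit")
```

With those assumptions a full 1000 packet queue is about 121 ms of delay at
100mbit, 100 packets is about 12 ms, and two packets is about 0.24 ms.
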