From: Dave Taht
To: Jim Gettys
Cc: codel@lists.bufferbloat.net, Dave Täht
Date: Thu, 3 May 2012 14:35:25 -0700
Subject: Re: [Codel] [PATCH] Preliminary codel implementation
In-Reply-To: <4FA2EEF8.2070109@freedesktop.org>

On Thu, May 3, 2012 at 1:47 PM, Jim Gettys wrote:
> There is a note in the article at this point in its exposition of the
> pseudo code:
>
> "Note that for embedded systems or kernel implementation that the
> inverse *sqrt* can be computed efficiently using only integer
> multiplication."
>
> That will likely be faster than the division by sqrt; if/when this
> optimisation is done, we should leave comments about why it is division
> by a square root, to correspond to TCP's control law.
>
> Before we upstream this, I'd recommend taking the comments documenting
> the algorithm from the article and including them.  I think Kathie had
> to trim on the slides.
>                                    - Jim

1) My own hurry is because I want to see this work outside of a model,
before Monday.

2) I strongly believe in correctness first, optimizations last. I guess
I'm the only guy here that kind of wishes we were using floating point
for this function!

3) I also would massively prefer that we measure queue length from
actual ingress (e.g. the application or incoming network card) to
egress. I regard scribbling on the cb as a hack; I'd rather use a
normalized hardware timestamp gained on ingress. There are 3+ pages of
code that need to be traversed to get from there to there, and those
delays are significant (as we proved last year) and not accounted for
in a fluid model.

4) Despite Eric's request that I rebase on net-next, there's plenty of
time for that, after the code is proven correct and stable. I'm very
glad we're at a point where it will be very easy to port forward to
net-next, but I'm totally unwilling to cope with the random breakage
that goes with stepping over that edge. I have a dozen machines in the
lab, set up on 3.3.4, ready to let me do head-to-head comparisons of
the various technologies in play, on three different architectures,
using 3 forms of wireless and 2 forms of ethernet.
(If I can beg pity from anyone, one of those arches has 300+ out of
tree patches and it's a cast iron bitch to work on net-next in it)

Yes, we absolutely want something to go upstream. It doesn't need to
happen next week or even next month. I'd like a sane API and naming
scheme, and a man page to go with it. I happen to be fond of ECN for
long RTTs, and think it has great potential for wireless. I know others
are not, and the code structure turned out to make it hard to put that
in...

I didn't get sufficient time to evaluate sfq+red before it went
upstream, and I really regret that. There is plenty of time in
net-next's upcoming window, regardless, to get something into Linux.
I'd like to see someone working on BSD, and on an ns3 model that can
integrate with the wifi model.

Let's try to do some quality work here! I'd like very much to explore,
for example, what happens in a massive dropping scenario, such as what
Eric alludes to might happen on bql + gigE. And to have effective
measurements of what the actual overhead of a sqrt and divide is. So
far as I know, on modern arches it's like 11-25 clocks, much of which
is lost in the noise on a superscalar architecture. This particular
portion of the algorithm is not triggered particularly often.

I'd like to run a real 1000 streams of real LEDBAT through it, VoIP,
web access, and games... play with prioritization and FQ techniques,
etc. But first up, the simplest possible implementation that's correct?
OK?

Measure first. Get it correct. Then analyze the real behavior. Then
optimize.... Wash, rinse, repeat. This is such an old lesson I can't
believe you guys are so over-focused on it! A sqrt seems correct to me
for TCP traffic, but what about everything else?

I very much appreciate the code review that happened earlier today, and
hopefully my next attempt will be closer to correct. (In particular I
hadn't realized how gnarly dealing with ktime was.) But: I assure you
all I NEVER get it right on the first couple of tries.
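For anyone following along, here is a minimal sketch of the control law
under discussion: the next drop is scheduled at now + interval /
sqrt(count). It is an illustration only, not the code from the patch.
The names isqrt_u64 and control_law, the nanosecond units, the 8-bit
fixed-point scaling, and the handling of count == 0 are all assumptions
made for the example. It deliberately uses a plain integer square root
and a divide (the simple version), not the multiply-only inverse sqrt
that the article mentions.

```c
#include <stdint.h>

/* Floor of the square root of n, computed bit by bit with integer
 * arithmetic only (no floating point, as in kernel code). */
static uint64_t isqrt_u64(uint64_t n)
{
    uint64_t root = 0;
    uint64_t bit = 1ULL << 62;  /* highest power of four in 64 bits */

    while (bit > n)
        bit >>= 2;
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

/* Hypothetical helper: time of the next drop, now + interval/sqrt(count).
 * Both sides are scaled by 2^8 so that sqrt(2), sqrt(3), ... are not
 * truncated to 1. Assumption: count == 0 is treated as 1. */
static uint64_t control_law(uint64_t now_ns, uint64_t interval_ns,
                            uint32_t count)
{
    uint64_t root;

    if (count == 0)
        count = 1;
    root = isqrt_u64((uint64_t)count << 16);   /* ~ sqrt(count) * 256 */
    return now_ns + (interval_ns << 8) / root;
}
```

With a 100 ms interval, count = 4 halves the spacing between drops to
50 ms and count = 100 cuts it to 10 ms. Whether the sqrt and divide
here cost enough to matter is exactly the thing to measure before
replacing them.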
And I wasn't planning on being the guy to write the code; I just
couldn't sleep, not seeing, feeling, experiencing it work.

So I'd like to set up a git repo for this and iproute2 ASAP. I'd be
willing to wait until after Monday and just rely on code review.

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net