[Codel] [PATCH] Preliminary codel implementation
Jim Gettys
jg at freedesktop.org
Thu May 3 19:12:09 EDT 2012
On 05/03/2012 05:35 PM, Dave Taht wrote:
> On Thu, May 3, 2012 at 1:47 PM, Jim Gettys
>> There is a note in the article at this point in its exposition of the
>> pseudo code:
>>
>> "Note that for embedded systems or kernel implementation that the
>> inverse *sqrt* can be computed efficiently using only integer
>> multiplication."
>>
>> That will likely be faster than the division by sqrt; if/when this
>> optimisation is done, we should leave comments about why it is division
>> by a square root, to correspond to TCP's control law.
>>
>> Before we upstream this, I'd recommend taking the comments documenting
>> the algorithm from the article and including them. I think Kathie had
>> to trim on the slides.
>> - Jim
>
> 1) My own hurry is because I want to see this work outside of a model,
> before Monday.
>
> 2) I strongly believe in correctness first, optimizations last. I
> guess I'm the only guy here that kind of wishes we were using floating
> point for this function!
Go get some sleep.
I sent the mail I did as I did not want someone reading these archives
next week to presume that a square root function, in this context, is an
expensive operation. Not everyone will wade through the article to come
across that implementation note.
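
For anyone reading the archive later, here is a minimal sketch of what
that note could look like in C. It is not taken from the article or
from Dave's patch: the function names, the Q0.32 fixed-point layout,
and the choice of one Newton-Raphson step per increment of count are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One Newton-Raphson step toward y = 1/sqrt(count), with y held as a
 * Q0.32 fixed-point fraction (0xFFFFFFFF is just under 1.0):
 *
 *     y_new = y * (3 - count * y^2) / 2
 *
 * Only multiplies, shifts and one subtraction; no divide, no sqrt.
 * Hypothetical helper; assumes y is already a rough estimate for count.
 */
static uint32_t inv_sqrt_step(uint32_t y, uint32_t count)
{
    assert(count > 0);

    /* y^2, still Q0.32 */
    uint32_t y2 = (uint32_t)(((uint64_t)y * y) >> 32);
    /* 3 - count * y^2, as Q32.32 */
    uint64_t val = (3ULL << 32) - (uint64_t)count * y2;

    val >>= 2;                         /* headroom for the next multiply */
    val = (val * y) >> (32 - 2 + 1);   /* times y, then the divide by 2 */
    return (uint32_t)val;
}

/*
 * Track 1/sqrt(count) as count grows from 1, one step per increment.
 * The estimate is coarse for the first few counts and tightens quickly.
 */
static uint32_t inv_sqrt_track(uint32_t count)
{
    uint32_t y = 0xFFFFFFFFu;          /* ~1.0 == 1/sqrt(1) */

    for (uint32_t c = 2; c <= count; c++)
        y = inv_sqrt_step(y, c);
    return y;
}

/*
 * The control law, next drop = t + interval / sqrt(count), written as a
 * multiply by the reciprocal. The division by a square root is kept in
 * this form to correspond to TCP's control law.
 */
static uint64_t control_law(uint64_t t, uint64_t interval, uint32_t inv_sqrt)
{
    return t + ((interval * inv_sqrt) >> 32);
}
```

With a 100 ms interval expressed in nanoseconds and count at 100,
control_law(now, 100000000, inv_sqrt_track(100)) lands roughly 10 ms
after now. A real implementation would keep the running estimate in
its per-queue state rather than recomputing it from 1 each time.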
- Jim
> 3) I also would massively prefer that we measure queue length from
> actual ingress (e.g. the application or incoming network card),
> to egress. I regard scribbling on the cb as a hack; I'd rather use a
> normalized hardware timestamp gained on ingress.
>
> There are 3+ pages of code that need to be traversed to get from there
> to there, and those delays are significant (as we proved last year)
> and not accounted for in a fluid model.
>
> 4) Despite Eric's request that I rebase on net-next, there's plenty of
> time for that, after the code is proven correct, and stable.
>
> I'm very glad we're at a point where it will be very easy to port
> forward to net-next, but I'm totally unwilling to cope with random
> breakage that goes with stepping over that edge.
>
> I have a dozen machines in the lab, set up on 3.3.4, ready to let me do
> head to head comparisons of various technologies in play, on three
> different architectures, using 3 forms of wireless and 2 forms of
> ethernet.
>
> (If I can beg pity from anyone, one of those arches has 300+ out of
> tree patches and it's a cast iron bitch to work on net-next in it)
>
> Yes, we absolutely want something to go upstream. It doesn't need to
> happen next week or even next month. I'd like a sane API and naming
> scheme, and a man page to go with it.
>
> I happen to be fond of ECN for long RTTs, and think it has
> great potential for wireless, I know others are not,
> and the code structure turned out to be hard to put that in...
>
> I didn't get sufficient time to evaluate sfq+red before it went
> upstream, and I really regret that.
>
> There is plenty of time in net-next's upcoming window, regardless,
> to get something into Linux. I'd like to see someone working on BSD,
> and on an ns3 model that can integrate with the wifi model.
>
> Let's try to do some quality work here!
>
> I'd like very much to explore, for example, what happens in a massive
> dropping scenario, such as what Eric alludes to might happen on bql +
> gigE. And to have effective measurements of what the actual overhead
> of a sqrt and divide is - so far as I know on modern arches it's like
> 11-25 clocks, much of which is lost in the noise on superscalar
> architecture. This particular portion of the algorithm is not
> triggered particularly often.
>
> I'd like to run a real 1000 streams of real ledbat through it, voip,
> web access, and games... play with prioritization and FQ techniques,
> etc.
>
> but first up the simplest possible implementation that's correct?
> OK?
>
> Measure first. Get it correct. Then analyze the real behavior. Then
> optimize.... Wash, rinse, repeat. This is such an old lesson I can't
> believe you guys are so overfocused on it! A sqrt seems correct to me
> for TCP traffic but what about everything else?
>
> I very much appreciate the code review that happened earlier today,
> and hopefully my next attempt will be closer to correct.
>
> (In particular I hadn't realized how gnarly dealing with ktime was)
>
> But: I assure you all I NEVER get it right on the first couple of tries.
>
> And I wasn't planning on being the guy to write the code, I just
> couldn't sleep, not seeing, feeling, experiencing, it work.
>
> So I'd like to set up a git repo for this and iproute2 ASAP.
>
> I'd be willing to wait until after Monday and just rely on code review.
>
>