* [Codel] fp sqrt vis int sqrt? @ 2012-05-04 15:11 Dave Taht 2012-05-04 15:14 ` Jim Gettys 0 siblings, 1 reply; 18+ messages in thread From: Dave Taht @ 2012-05-04 15:11 UTC (permalink / raw) To: codel The linux kernel has no floating point in it, so I'd substituted the internal int_sqrt as a substitute, just to get something to work. static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t) { return ktime_add_ns(t, q->interval / int_sqrt(q->count)); } Often ns2 models use floating point. Having not seen the model I don't know that for sure. The series for an integer sqrt is far more 'chunky' than a fp one. int sqrt 1,2,3,4 = 1 1 1 2 fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2 and gets even more chunky as you get larger values, eg, sqrt(36 through 48) = 6, sqrt(49) = 7 So we could precalculate the interval/sqrt(count) using floating point in the control law calculation, early, during qdisc setup, thus neatly avoiding both the divide and sqrt in the control law path. Or we could do fixed point. Kathie: In the sim, in various simulations, what is the dynamic range of count? And is it floating point sqrt? -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 15:11 [Codel] fp sqrt vis int sqrt? Dave Taht @ 2012-05-04 15:14 ` Jim Gettys 2012-05-04 15:19 ` Dave Taht 0 siblings, 1 reply; 18+ messages in thread From: Jim Gettys @ 2012-05-04 15:14 UTC (permalink / raw) To: Dave Taht; +Cc: codel On 05/04/2012 11:11 AM, Dave Taht wrote: > The linux kernel has no floating point in it, so I'd substituted the > internal int_sqrt as a substitute, > just to get something to work. > > static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t) > { > return ktime_add_ns(t, q->interval / int_sqrt(q->count)); > } > > Often ns2 models use floating point. Having not seen the model I don't > know that for sure. > > The series for an integer sqrt is far more 'chunky' than a fp one. > > int sqrt 1,2,3,4 = 1 1 1 2 > fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2 > > and gets even more chunky as you get larger values, eg, sqrt(36 > through 48) = 6, sqrt(49) = 7 > > So we could precalculate the interval/sqrt(count) using floating point > in the control law calculation, > early, during qdisc setup, thus neatly avoiding both the divide and > sqrt in the control law path. > > Or we could do fixed point. > > Kathie: In the sim, in various simulations, what is the dynamic range > of count? And is it floating point sqrt? > The Wikipedia article on fast inverse sqrt is enlightening on computing this really fast. See: http://en.wikipedia.org/wiki/Fast_inverse_square_root This has little to do with whether an integer or floating point value is better algorithmically. - Jim ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 15:14 ` Jim Gettys @ 2012-05-04 15:19 ` Dave Taht 2012-05-04 15:26 ` Jim Gettys 0 siblings, 1 reply; 18+ messages in thread From: Dave Taht @ 2012-05-04 15:19 UTC (permalink / raw) To: Jim Gettys; +Cc: codel On Fri, May 4, 2012 at 8:14 AM, Jim Gettys <jg@freedesktop.org> wrote: > On 05/04/2012 11:11 AM, Dave Taht wrote: >> The linux kernel has no floating point in it, so I'd substituted the >> internal int_sqrt as a substitute, >> just to get something to work. >> >> static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t) >> { >> return ktime_add_ns(t, q->interval / int_sqrt(q->count)); >> } >> >> Often ns2 models use floating point. Having not seen the model I don't >> know that for sure. >> >> The series for an integer sqrt is far more 'chunky' than a fp one. >> >> int sqrt 1,2,3,4 = 1 1 1 2 >> fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2 >> >> and gets even more chunky as you get larger values, eg, sqrt(36 >> through 48) = 6, sqrt(49) = 7 >> >> So we could precalculate the interval/sqrt(count) using floating point >> in the control law calculation, >> early, during qdisc setup, thus neatly avoiding both the divide and >> sqrt in the control law path. >> >> Or we could do fixed point. >> >> Kathie: In the sim, in various simulations, what is the dynamic range >> of count? And is it floating point sqrt? >> > The Wikipedia article on fast inverse sqrt is enlightening on computing > this really fast. > > See: > > http://en.wikipedia.org/wiki/Fast_inverse_square_root > > This has little to do with whether an integer or floating point value is > better algorithmically. Neat trick, that. I'm delighted the wikipedia article kept the fully commented original code around! That said, you can't use FP at all without special handling in the kernel. It's certainly a terrible idea on a hot path... > - Jim -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 15:19 ` Dave Taht @ 2012-05-04 15:26 ` Jim Gettys 2012-05-04 15:56 ` Eric Dumazet 0 siblings, 1 reply; 18+ messages in thread From: Jim Gettys @ 2012-05-04 15:26 UTC (permalink / raw) To: Dave Taht; +Cc: codel On 05/04/2012 11:19 AM, Dave Taht wrote: > On Fri, May 4, 2012 at 8:14 AM, Jim Gettys <jg@freedesktop.org> wrote: >> On 05/04/2012 11:11 AM, Dave Taht wrote: >>> The linux kernel has no floating point in it, so I'd substituted the >>> internal int_sqrt as a substitute, >>> just to get something to work. >>> >>> static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t) >>> { >>> return ktime_add_ns(t, q->interval / int_sqrt(q->count)); >>> } >>> >>> Often ns2 models use floating point. Having not seen the model I don't >>> know that for sure. >>> >>> The series for an integer sqrt is far more 'chunky' than a fp one. >>> >>> int sqrt 1,2,3,4 = 1 1 1 2 >>> fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2 >>> >>> and gets even more chunky as you get larger values, eg, sqrt(36 >>> through 48) = 6, sqrt(49) = 7 >>> >>> So we could precalculate the interval/sqrt(count) using floating point >>> in the control law calculation, >>> early, during qdisc setup, thus neatly avoiding both the divide and >>> sqrt in the control law path. >>> >>> Or we could do fixed point. >>> >>> Kathie: In the sim, in various simulations, what is the dynamic range >>> of count? And is it floating point sqrt? >>> >> The Wikipedia article on fast inverse sqrt is enlightening on computing >> this really fast. >> >> See: >> >> http://en.wikipedia.org/wiki/Fast_inverse_square_root >> >> This has little to do with whether an integer or floating point value is >> better algorithmically. > Neat trick, that. > > I'm delighted the wikipedia article kept the fully commented original > code around! I don't think that is the "original", which predates quake by a lot according to the article. > > That said, you can't use FP at all without special handling in the > kernel. It's certainly a terrible idea on a hot path... I suspect there are fully integer implementations kicking around... It's just that x86 finally got fast enough FP that doing one multiply in floating point beat the alternative. - Jim ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 15:26 ` Jim Gettys @ 2012-05-04 15:56 ` Eric Dumazet 2012-05-04 17:23 ` Dave Taht 0 siblings, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2012-05-04 15:56 UTC (permalink / raw) To: Jim Gettys; +Cc: codel On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote: > > I suspect there are fully integer implementations kicking around... > It's just that x86 finally got fast enough FP that doing one multiply in > floating point beat the alternative. Anyway we dont need to compute sqrt() at all. This can be done easily when we do the q->count++, if we hold in q->sqrt_count the estimated value of sqrt(count) Ie replace : q->count++; by: q->count++; n = q->sqrt_count + 1; if (n * n <= q->count) q->sqrt_count = n; A multiply is an acceptable cost. Of course, you need to reset sqrt_count to 1 when count is reset to 1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 15:56 ` Eric Dumazet @ 2012-05-04 17:23 ` Dave Taht 2012-05-04 17:43 ` Dave Taht 2012-05-04 17:47 ` Eric Dumazet 0 siblings, 2 replies; 18+ messages in thread From: Dave Taht @ 2012-05-04 17:23 UTC (permalink / raw) To: Eric Dumazet; +Cc: codel On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote: > >> >> I suspect there are fully integer implementations kicking around... >> It's just that x86 finally got fast enough FP that doing one multiply in >> floating point beat the alternative. > > Anyway we dont need to compute sqrt() at all. > > This can be done easily when we do the q->count++, if we hold in > q->sqrt_count the estimated value of sqrt(count) > > Ie replace : > > q->count++; > > by: > > q->count++; > n = q->sqrt_count + 1; > if (n * n <= q->count) > q->sqrt_count = n; > > A multiply is an acceptable cost. > > > Of course, you need to reset sqrt_count to 1 when count is reset to 1 Except that the cumulative drop probability is very different when you apply a series of times of integer vs floating point sqrt calculations into the mix. I can do up a graph, and may (it helps to visualize things), however if you look at the first four in the series, you get the net effect of a 5th and nearly a 6th term by the time you get to the 4th expansion of the series done with a fp sqrt vs integer sqrt. 10000000+ 7071067+ 5773502+5000000 27844569 10000000+10000000+10000000+5000000 35000000 so with fp you end up with a much tighter and more responsive control loop. In looking over the (lack of) compiler support for fp in the kernel, it seems simplest to load up a table from userspace for the interval/sqrt(count) calculation. Not that I netlink and I are friendly at the moment. And I would have liked to been able to to fiddle with target (for wireless) on the fly as a function of (active_stations), with interval slaved to minstrel's idea of everything. /me goes to look at gcc's implementation > > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 17:23 ` Dave Taht @ 2012-05-04 17:43 ` Dave Taht 2012-05-04 17:44 ` Dave Taht 2012-05-04 17:47 ` Eric Dumazet 1 sibling, 1 reply; 18+ messages in thread From: Dave Taht @ 2012-05-04 17:43 UTC (permalink / raw) To: Eric Dumazet; +Cc: codel On Fri, May 4, 2012 at 10:23 AM, Dave Taht <dave.taht@gmail.com> wrote: > On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: >> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote: >> >>> >>> I suspect there are fully integer implementations kicking around... >>> It's just that x86 finally got fast enough FP that doing one multiply in >>> floating point beat the alternative. >> >> Anyway we dont need to compute sqrt() at all. >> >> This can be done easily when we do the q->count++, if we hold in >> q->sqrt_count the estimated value of sqrt(count) >> >> Ie replace : >> >> q->count++; >> >> by: >> >> q->count++; >> n = q->sqrt_count + 1; >> if (n * n <= q->count) >> q->sqrt_count = n; >> >> A multiply is an acceptable cost. >> >> >> Of course, you need to reset sqrt_count to 1 when count is reset to 1 > > Except that the cumulative drop probability is very different when you > apply a series of times of integer vs floating point sqrt calculations > into the mix. > > I can do up a graph, and may (it helps to visualize things), however > if you look at the first four in the series, you get the net > effect of a 5th and nearly a 6th term by the time you get to the 4th > expansion of the series done with a fp sqrt vs integer sqrt. > > 10000000+ 7071067+ 5773502+5000000 > 27844569 > > 10000000+10000000+10000000+5000000 > 35000000 > > so with fp you end up with a much tighter and more responsive control loop. To be mildly more clear, by the time integer sqrt codel would have dropped 4 packets, fp codel would have nearly dropped 6. 27844569+4472135+4472135 36788839 That's assuming a 100ms target, which by observation over the past year is not what a typical tcp stream is like anymore, it's closer to 50ms, particularly for elephants and cdns, less in many cases. > > In looking over the (lack of) compiler support for fp in the kernel, > it seems simplest to load up a table from userspace for the > interval/sqrt(count) calculation. > > Not that I netlink and I are friendly at the moment. > > And I would have liked to been able to to fiddle with target (for > wireless) on the fly as a function of (active_stations), with interval > slaved to minstrel's idea of everything. > > /me goes to look at gcc's implementation >> >> > > > > -- > Dave Täht > SKYPE: davetaht > US Tel: 1-239-829-5608 > http://www.bufferbloat.net -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 17:43 ` Dave Taht @ 2012-05-04 17:44 ` Dave Taht 2012-05-04 17:46 ` Dave Taht 2012-05-04 23:42 ` Kathleen Nichols 0 siblings, 2 replies; 18+ messages in thread From: Dave Taht @ 2012-05-04 17:44 UTC (permalink / raw) To: Eric Dumazet; +Cc: codel On Fri, May 4, 2012 at 10:43 AM, Dave Taht <dave.taht@gmail.com> wrote: > On Fri, May 4, 2012 at 10:23 AM, Dave Taht <dave.taht@gmail.com> wrote: >> On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: >>> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote: >>> >>>> >>>> I suspect there are fully integer implementations kicking around... >>>> It's just that x86 finally got fast enough FP that doing one multiply in >>>> floating point beat the alternative. >>> >>> Anyway we dont need to compute sqrt() at all. >>> >>> This can be done easily when we do the q->count++, if we hold in >>> q->sqrt_count the estimated value of sqrt(count) >>> >>> Ie replace : >>> >>> q->count++; >>> >>> by: >>> >>> q->count++; >>> n = q->sqrt_count + 1; >>> if (n * n <= q->count) >>> q->sqrt_count = n; >>> >>> A multiply is an acceptable cost. >>> >>> >>> Of course, you need to reset sqrt_count to 1 when count is reset to 1 >> >> Except that the cumulative drop probability is very different when you >> apply a series of times of integer vs floating point sqrt calculations >> into the mix. >> >> I can do up a graph, and may (it helps to visualize things), however >> if you look at the first four in the series, you get the net >> effect of a 5th and nearly a 6th term by the time you get to the 4th >> expansion of the series done with a fp sqrt vs integer sqrt. >> >> 10000000+ 7071067+ 5773502+5000000 >> 27844569 >> >> 10000000+10000000+10000000+5000000 >> 35000000 >> >> so with fp you end up with a much tighter and more responsive control loop. > > To be mildly more clear, by the time integer sqrt codel would have > dropped 4 packets, fp codel would have nearly dropped 6. > > 27844569+4472135+4472135 > 36788839 > > That's assuming a 100ms target, which by observation over the ^^^^^^^^^^^^^^^^^^^^interval past > year is not what a typical tcp stream is like anymore, it's closer to > 50ms, particularly for elephants and cdns, less in many cases. > > >> >> In looking over the (lack of) compiler support for fp in the kernel, >> it seems simplest to load up a table from userspace for the >> interval/sqrt(count) calculation. >> >> Not that I netlink and I are friendly at the moment. >> >> And I would have liked to been able to to fiddle with target (for >> wireless) on the fly as a function of (active_stations), with interval >> slaved to minstrel's idea of everything. >> >> /me goes to look at gcc's implementation >>> >>> >> >> >> >> -- >> Dave Täht >> SKYPE: davetaht >> US Tel: 1-239-829-5608 >> http://www.bufferbloat.net > > > > -- > Dave Täht > SKYPE: davetaht > US Tel: 1-239-829-5608 > http://www.bufferbloat.net -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 17:44 ` Dave Taht @ 2012-05-04 17:46 ` Dave Taht 2012-05-04 23:42 ` Kathleen Nichols 1 sibling, 0 replies; 18+ messages in thread From: Dave Taht @ 2012-05-04 17:46 UTC (permalink / raw) To: Eric Dumazet; +Cc: codel On Fri, May 4, 2012 at 10:44 AM, Dave Taht <dave.taht@gmail.com> wrote: > On Fri, May 4, 2012 at 10:43 AM, Dave Taht <dave.taht@gmail.com> wrote: >> On Fri, May 4, 2012 at 10:23 AM, Dave Taht <dave.taht@gmail.com> wrote: >>> On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: >>>> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote: >>>> >>>>> >>>>> I suspect there are fully integer implementations kicking around... >>>>> It's just that x86 finally got fast enough FP that doing one multiply in >>>>> floating point beat the alternative. >>>> >>>> Anyway we dont need to compute sqrt() at all. >>>> >>>> This can be done easily when we do the q->count++, if we hold in >>>> q->sqrt_count the estimated value of sqrt(count) >>>> >>>> Ie replace : >>>> >>>> q->count++; >>>> >>>> by: >>>> >>>> q->count++; >>>> n = q->sqrt_count + 1; >>>> if (n * n <= q->count) >>>> q->sqrt_count = n; >>>> >>>> A multiply is an acceptable cost. >>>> >>>> >>>> Of course, you need to reset sqrt_count to 1 when count is reset to 1 >>> >>> Except that the cumulative drop probability is very different when you >>> apply a series of times of integer vs floating point sqrt calculations >>> into the mix. >>> >>> I can do up a graph, and may (it helps to visualize things), however >>> if you look at the first four in the series, you get the net >>> effect of a 5th and nearly a 6th term by the time you get to the 4th >>> expansion of the series done with a fp sqrt vs integer sqrt. >>> >>> 10000000+ 7071067+ 5773502+5000000 >>> 27844569 >>> >>> 10000000+10000000+10000000+5000000 >>> 35000000 >>> >>> so with fp you end up with a much tighter and more responsive control loop. >> >> To be mildly more clear, by the time integer sqrt codel would have >> dropped 4 packets, fp codel would have nearly dropped 6. >> >> 27844569+4472135+4472135 >> 36788839 And I goofed on the copy/paste math. (all the more reason for me to code this up and graph it) 27844569+4472135+4082482 36399186 >> >> That's assuming a 100ms target, which by observation over the > ^^^^^^^^^^^^^^^^^^^^interval > > past >> year is not what a typical tcp stream is like anymore, it's closer to >> 50ms, particularly for elephants and cdns, less in many cases. >> >> >>> >>> In looking over the (lack of) compiler support for fp in the kernel, >>> it seems simplest to load up a table from userspace for the >>> interval/sqrt(count) calculation. >>> >>> Not that I netlink and I are friendly at the moment. >>> >>> And I would have liked to been able to to fiddle with target (for >>> wireless) on the fly as a function of (active_stations), with interval >>> slaved to minstrel's idea of everything. >>> >>> /me goes to look at gcc's implementation >>>> >>>> >>> >>> >>> >>> -- >>> Dave Täht >>> SKYPE: davetaht >>> US Tel: 1-239-829-5608 >>> http://www.bufferbloat.net >> >> >> >> -- >> Dave Täht >> SKYPE: davetaht >> US Tel: 1-239-829-5608 >> http://www.bufferbloat.net > > > > -- > Dave Täht > SKYPE: davetaht > US Tel: 1-239-829-5608 > http://www.bufferbloat.net -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 17:44 ` Dave Taht 2012-05-04 17:46 ` Dave Taht @ 2012-05-04 23:42 ` Kathleen Nichols 2012-05-05 0:01 ` Dave Taht 1 sibling, 1 reply; 18+ messages in thread From: Kathleen Nichols @ 2012-05-04 23:42 UTC (permalink / raw) To: codel Take a deep breath. On 5/4/12 10:44 AM, Dave Taht wrote: >> >> That's assuming a 100ms target, which by observation over the > ^^^^^^^^^^^^^^^^^^^^interval > > past >> year is not what a typical tcp stream is like anymore, it's closer to >> 50ms, particularly for elephants and cdns, less in many cases. >> But that really isn't going to make much difference. Interval is *loosely* related to RTT. >>> >>> And I would have liked to been able to to fiddle with target (for >>> wireless) on the fly as a function of (active_stations), with interval >>> slaved to minstrel's idea of everything. Might be good to know what problem you are exploring before you fiddle. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 23:42 ` Kathleen Nichols @ 2012-05-05 0:01 ` Dave Taht 0 siblings, 0 replies; 18+ messages in thread From: Dave Taht @ 2012-05-05 0:01 UTC (permalink / raw) To: Kathleen Nichols; +Cc: codel On Fri, May 4, 2012 at 4:42 PM, Kathleen Nichols <nichols@pollere.com> wrote: > > Take a deep breath. ok. > > On 5/4/12 10:44 AM, Dave Taht wrote: > >>> >>> That's assuming a 100ms target, which by observation over the >> ^^^^^^^^^^^^^^^^^^^^interval >> >> past >>> year is not what a typical tcp stream is like anymore, it's closer to >>> 50ms, particularly for elephants and cdns, less in many cases. >>> > > But that really isn't going to make much difference. Interval is *loosely* > related to RTT. yes. But I have got some nice results with codel + qfq, between crashes.... > >>>> >>>> And I would have liked to been able to to fiddle with target (for >>>> wireless) on the fly as a function of (active_stations), with interval >>>> slaved to minstrel's idea of everything. > > Might be good to know what problem you are exploring before you > fiddle. Well, we went into the special issues wireless has in the meeting. and at the same time, I'm very pleased with what I've been able to see of codel so far. And am trying to keep it simple... just head buzzing with ideas for later. > > _______________________________________________ > Codel mailing list > Codel@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/codel -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 17:23 ` Dave Taht 2012-05-04 17:43 ` Dave Taht @ 2012-05-04 17:47 ` Eric Dumazet 2012-05-04 18:26 ` Eric Dumazet 1 sibling, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2012-05-04 17:47 UTC (permalink / raw) To: Dave Taht; +Cc: codel On Fri, 2012-05-04 at 10:23 -0700, Dave Taht wrote: > In looking over the (lack of) compiler support for fp in the kernel, > it seems simplest to load up a table from userspace for the > interval/sqrt(count) calculation. Well, you could have a small table (not from userspace, thats really not needed at all) for the 16 first sqrt values. More over, for the first sqrt values you can use reciprocal divide so that we dont have a divide anymore (see include/linux/reciprocal_div.h), and you can have a very precise sqrt this way, not an integral one. With count >= 16, the error you mention becomes small. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 17:47 ` Eric Dumazet @ 2012-05-04 18:26 ` Eric Dumazet 2012-05-04 18:39 ` Dave Taht 0 siblings, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2012-05-04 18:26 UTC (permalink / raw) To: Dave Taht; +Cc: codel On Fri, 2012-05-04 at 19:47 +0200, Eric Dumazet wrote: > On Fri, 2012-05-04 at 10:23 -0700, Dave Taht wrote: > > > In looking over the (lack of) compiler support for fp in the kernel, > > it seems simplest to load up a table from userspace for the > > interval/sqrt(count) calculation. > > Well, you could have a small table (not from userspace, thats really not > needed at all) for the 16 first sqrt values. > > More over, for the first sqrt values you can use reciprocal divide so > that we dont have a divide anymore (see include/linux/reciprocal_div.h), > and you can have a very precise sqrt this way, not an integral one. > > With count >= 16, the error you mention becomes small. proof of concept You can see the integer approximation, using only a multiply is pretty good compared to a float computation, # ./try 1 val=16777216 sqrt=4096 rec=4294967295 interval/sqrt(1)=100000000 integer approx :99999999 2 val=33554432 sqrt=5792 rec=3037327360 interval/sqrt(2)=70710678 integer approx :70718288 3 val=50331648 sqrt=7094 rec=2479869952 interval/sqrt(3)=57735026 integer approx :57738971 4 val=67108864 sqrt=8192 rec=2147483648 interval/sqrt(4)=50000000 integer approx :50000000 5 val=83886080 sqrt=9158 rec=1920966656 interval/sqrt(5)=44721359 integer approx :44725990 6 val=100663296 sqrt=10033 rec=1753436160 interval/sqrt(6)=40824829 integer approx :40825366 7 val=117440512 sqrt=10836 rec=1623494656 interval/sqrt(7)=37796447 integer approx :37799930 8 val=134217728 sqrt=11585 rec=1518534656 interval/sqrt(8)=35355339 integer approx :35356140 9 val=150994944 sqrt=12288 rec=1431658496 interval/sqrt(9)=33333333 integer approx :33333396 10 val=167772160 sqrt=12952 rec=1358262272 interval/sqrt(10)=31622776 integer approx :31624507 11 val=184549376 sqrt=13584 rec=1295069184 interval/sqrt(11)=30151134 integer approx :30153179 12 val=201326592 sqrt=14188 rec=1239937024 interval/sqrt(12)=28867513 integer approx :28869533 13 val=218103808 sqrt=14768 rec=1191239680 interval/sqrt(13)=27735009 integer approx :27735710 14 val=234881024 sqrt=15325 rec=1147940864 interval/sqrt(14)=26726124 integer approx :26727581 15 val=251658240 sqrt=15863 rec=1109008384 interval/sqrt(15)=25819888 integer approx :25821113 cat try.c #include <stdio.h> #include <math.h> typedef unsigned int u32; typedef unsigned long long u64; static inline u32 reciprocal_divide(u32 A, u32 R) { return (u32)(((u64)A * R) >> 32); } unsigned long int_sqrt(unsigned long x) { unsigned long op, res, one; op = x; res = 0; one = 1UL << (32 - 2); while (one > op) one >>= 2; while (one != 0) { if (op >= res + one) { op = op - (res + one); res = res + 2 * one; } res /= 2; one /= 4; } return res; } u32 reciprocal_value(u32 k) { u64 val = (1LL << 32) + (k - 1); val = val/k; return (u32)val; } int main() { unsigned long l, val, sq; unsigned interval = 100000000; u32 rec; for (l = 1 ; l < 16; l++) { val = (l << 24); sq = int_sqrt(val) ; rec = reciprocal_value(sq) << 12; if (!rec) rec = ~0U; printf("%lu val=%lu sqrt=%lu rec=%u\n", l, val, sq, rec); printf("interval/sqrt(%u)=%u integer approx :%u\n", l, (u32)(interval/sqrt(l)), reciprocal_divide(interval, rec)); } } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 18:26 ` Eric Dumazet @ 2012-05-04 18:39 ` Dave Taht 2012-05-04 18:50 ` Eric Dumazet 0 siblings, 1 reply; 18+ messages in thread From: Dave Taht @ 2012-05-04 18:39 UTC (permalink / raw) To: Eric Dumazet; +Cc: codel Heh. I'd only gotten as far as #include <stdio.h> #include <math.h> main() { before your mail arrived. On Fri, May 4, 2012 at 11:26 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Fri, 2012-05-04 at 19:47 +0200, Eric Dumazet wrote: >> On Fri, 2012-05-04 at 10:23 -0700, Dave Taht wrote: >> >> > In looking over the (lack of) compiler support for fp in the kernel, >> > it seems simplest to load up a table from userspace for the >> > interval/sqrt(count) calculation. >> >> Well, you could have a small table (not from userspace, thats really not >> needed at all) for the 16 first sqrt values. >> >> More over, for the first sqrt values you can use reciprocal divide so >> that we dont have a divide anymore (see include/linux/reciprocal_div.h), >> and you can have a very precise sqrt this way, not an integral one. >> >> With count >= 16, the error you mention becomes small. > > proof of concept > > You can see the integer approximation, using only a multiply is pretty > good compared to a float computation, > > > # ./try > 1 val=16777216 sqrt=4096 rec=4294967295 > interval/sqrt(1)=100000000 integer approx :99999999 > 2 val=33554432 sqrt=5792 rec=3037327360 > interval/sqrt(2)=70710678 integer approx :70718288 > 3 val=50331648 sqrt=7094 rec=2479869952 > interval/sqrt(3)=57735026 integer approx :57738971 > 4 val=67108864 sqrt=8192 rec=2147483648 > interval/sqrt(4)=50000000 integer approx :50000000 > 5 val=83886080 sqrt=9158 rec=1920966656 > interval/sqrt(5)=44721359 integer approx :44725990 > 6 val=100663296 sqrt=10033 rec=1753436160 > interval/sqrt(6)=40824829 integer approx :40825366 > 7 val=117440512 sqrt=10836 rec=1623494656 > interval/sqrt(7)=37796447 integer approx :37799930 > 8 val=134217728 sqrt=11585 rec=1518534656 > interval/sqrt(8)=35355339 integer approx :35356140 > 9 val=150994944 sqrt=12288 rec=1431658496 > interval/sqrt(9)=33333333 integer approx :33333396 > 10 val=167772160 sqrt=12952 rec=1358262272 > interval/sqrt(10)=31622776 integer approx :31624507 > 11 val=184549376 sqrt=13584 rec=1295069184 > interval/sqrt(11)=30151134 integer approx :30153179 > 12 val=201326592 sqrt=14188 rec=1239937024 > interval/sqrt(12)=28867513 integer approx :28869533 > 13 val=218103808 sqrt=14768 rec=1191239680 > interval/sqrt(13)=27735009 integer approx :27735710 > 14 val=234881024 sqrt=15325 rec=1147940864 > interval/sqrt(14)=26726124 integer approx :26727581 > 15 val=251658240 sqrt=15863 rec=1109008384 > interval/sqrt(15)=25819888 integer approx :25821113 > > cat try.c > > #include <stdio.h> > #include <math.h> > > typedef unsigned int u32; > typedef unsigned long long u64; > > static inline u32 reciprocal_divide(u32 A, u32 R) > { > return (u32)(((u64)A * R) >> 32); > } > unsigned long int_sqrt(unsigned long x) > { > unsigned long op, res, one; > > op = x; > res = 0; > > one = 1UL << (32 - 2); > while (one > op) > one >>= 2; > > while (one != 0) { > if (op >= res + one) { > op = op - (res + one); > res = res + 2 * one; > } > res /= 2; > one /= 4; > } > return res; > } > > u32 reciprocal_value(u32 k) > { > u64 val = (1LL << 32) + (k - 1); > > val = val/k; > return (u32)val; > } > > int main() > { > unsigned long l, val, sq; > unsigned interval = 100000000; > u32 rec; > > for (l = 1 ; l < 16; l++) { > val = (l << 24); > sq = int_sqrt(val) ; > rec = reciprocal_value(sq) << 12; > if (!rec) > rec = ~0U; > printf("%lu val=%lu sqrt=%lu rec=%u\n", l, val, sq, rec); > printf("interval/sqrt(%u)=%u integer approx :%u\n", l, > (u32)(interval/sqrt(l)), > reciprocal_divide(interval, rec)); > } > } > > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 18:39 ` Dave Taht @ 2012-05-04 18:50 ` Eric Dumazet 2012-05-04 19:04 ` Dave Taht 0 siblings, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2012-05-04 18:50 UTC (permalink / raw) To: Dave Taht; +Cc: codel On Fri, 2012-05-04 at 11:39 -0700, Dave Taht wrote: > Heh. I'd only gotten as far as > > #include <stdio.h> > #include <math.h> > main() { > > > > > > before your mail arrived. Yep, so maybe we can use a precomputed table of 25 values (100 bytes), using the reciprocal trick to get precise integer approximation. (interval/sqrt(count)). So that control_law() has no multiply or divide to do, just a lookup in the array. Then for count values >= 25, either we can afford an error less than 20% using the q->sqrt_count maintained in // with q->count, either we compute the reciprocal each time we change q->count. (cost : one divide + int_sqrt() call ) plus one multiply in control_law() Not sure we need to be ultra precise. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 18:50 ` Eric Dumazet @ 2012-05-04 19:04 ` Dave Taht 2012-05-04 19:16 ` Eric Dumazet 0 siblings, 1 reply; 18+ messages in thread From: Dave Taht @ 2012-05-04 19:04 UTC (permalink / raw) To: Eric Dumazet; +Cc: codel On Fri, May 4, 2012 at 11:50 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Fri, 2012-05-04 at 11:39 -0700, Dave Taht wrote: >> Heh. I'd only gotten as far as >> >> #include <stdio.h> >> #include <math.h> >> main() { >> >> >> >> >> >> before your mail arrived. > > Yep, so maybe we can use a precomputed table of 25 values (100 bytes), > using the reciprocal trick to get precise integer approximation. > > (interval/sqrt(count)). > > So that control_law() has no multiply or divide to do, just a lookup in > the array. > > Then for count values >= 25, either we can afford an error less than 20% > using the q->sqrt_count maintained in // with q->count, either we > compute the reciprocal each time we change q->count. We end up with cumulative error and we are trying to match in inverse tcp's behavior on the flip side, over a variety of RTTs and bandwidths ranging from nearly 0 to 10GigE.... table[sch->len] in size doesn't bother me much. Except that I'd hoped to get away with ~ infinite queue length and not bother with that. More noise presumably helps with randomization down to some point to where we fall below kernel time granularity. > (cost : one divide + int_sqrt() call ) > plus one multiply in control_law() > > Not sure we need to be ultra precise. I'd like to toss these concepts back over the wall to kathie to simulate. I think seeing the oscillation in queue length with just int_sqrt vs fp sqrt would be fun too, but I'm weird that way. Late this evening I'll have time to incorporate the progress so far today (thx for the cleanups and help!!!) and I'll try to get another patch out late tonight. Unless you beat me to it. About the only thing left bugging me is the netlink interface. -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 19:04 ` Dave Taht @ 2012-05-04 19:16 ` Eric Dumazet 2012-05-05 8:14 ` Eric Dumazet 0 siblings, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2012-05-04 19:16 UTC (permalink / raw) To: Dave Taht; +Cc: codel On Fri, 2012-05-04 at 12:04 -0700, Dave Taht wrote: > Late this evening I'll have time to incorporate the progress so far > today (thx for the cleanups and help!!!) and I'll try to get another > patch out late tonight. Unless you beat me to it. > I am not planning to send a patch in a near future, its 9h15pm here in France, and TGIF. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Codel] fp sqrt vis int sqrt? 2012-05-04 19:16 ` Eric Dumazet @ 2012-05-05 8:14 ` Eric Dumazet 0 siblings, 0 replies; 18+ messages in thread From: Eric Dumazet @ 2012-05-05 8:14 UTC (permalink / raw) To: Dave Taht; +Cc: codel On Fri, 2012-05-04 at 21:16 +0200, Eric Dumazet wrote: > On Fri, 2012-05-04 at 12:04 -0700, Dave Taht wrote: > > > Late this evening I'll have time to incorporate the progress so far > > today (thx for the cleanups and help!!!) and I'll try to get another > > patch out late tonight. Unless you beat me to it. > > > > I am not planning to send a patch in a near future, its 9h15pm here in > France, and TGIF. > > Oh well, this is obvious, just use the following and be done : /* return interval/sqrt(x) with good precision */ u32 int_sqrt_div(u32 _interval, unsigned long x) { u64 interval = _interval; /* scale for maximum precision */ while (x < (1UL << (BITS_PER_LONG - 2))) { x <<= 2; interval <<= 1; } do_div(interval, int_sqrt(x)); return (u32)interval; } Doing the scaling before calling int_sqrt() makes int_sqrt() faster, since we avoid the initial scaling on 'one' one = 1UL << (BITS_PER_LONG - 2); while (one > op) one >>= 2; Always sleep a bit, its good. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2012-05-05 8:14 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-05-04 15:11 [Codel] fp sqrt vis int sqrt? Dave Taht 2012-05-04 15:14 ` Jim Gettys 2012-05-04 15:19 ` Dave Taht 2012-05-04 15:26 ` Jim Gettys 2012-05-04 15:56 ` Eric Dumazet 2012-05-04 17:23 ` Dave Taht 2012-05-04 17:43 ` Dave Taht 2012-05-04 17:44 ` Dave Taht 2012-05-04 17:46 ` Dave Taht 2012-05-04 23:42 ` Kathleen Nichols 2012-05-05 0:01 ` Dave Taht 2012-05-04 17:47 ` Eric Dumazet 2012-05-04 18:26 ` Eric Dumazet 2012-05-04 18:39 ` Dave Taht 2012-05-04 18:50 ` Eric Dumazet 2012-05-04 19:04 ` Dave Taht 2012-05-04 19:16 ` Eric Dumazet 2012-05-05 8:14 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox