[Codel] fp sqrt vis int sqrt?

CoDel AQM discussions
 help / color / mirror / Atom feed

* [Codel] fp sqrt vis int sqrt?
@ 2012-05-04 15:11 Dave Taht
  2012-05-04 15:14 ` Jim Gettys
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2012-05-04 15:11 UTC (permalink / raw)
  To: codel

The linux kernel has no floating point in it, so I'd substituted the
internal int_sqrt as a substitute,
just to get something to work.

static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t)
{
        return ktime_add_ns(t, q->interval / int_sqrt(q->count));
}

Often ns2 models use floating point. Having not seen the model I don't
know that for sure.

The series for an integer sqrt is far more 'chunky' than a fp one.

int sqrt 1,2,3,4 = 1 1 1 2
fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2

and gets even more chunky as you get larger values, eg, sqrt(36
through 48) = 6, sqrt(49) = 7

So we could precalculate the interval/sqrt(count) using floating point
in the control law calculation,
early, during qdisc setup, thus neatly avoiding both the divide and
sqrt in the control law path.

Or we could do fixed point.

Kathie: In the sim, in various simulations, what is the dynamic range
of count? And is it floating point sqrt?

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 15:11 [Codel] fp sqrt vis int sqrt? Dave Taht
@ 2012-05-04 15:14 ` Jim Gettys
  2012-05-04 15:19   ` Dave Taht
  0 siblings, 1 reply; 18+ messages in thread
From: Jim Gettys @ 2012-05-04 15:14 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On 05/04/2012 11:11 AM, Dave Taht wrote:
> The linux kernel has no floating point in it, so I'd substituted the
> internal int_sqrt as a substitute,
> just to get something to work.
>
> static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t)
> {
>         return ktime_add_ns(t, q->interval / int_sqrt(q->count));
> }
>
> Often ns2 models use floating point. Having not seen the model I don't
> know that for sure.
>
> The series for an integer sqrt is far more 'chunky' than a fp one.
>
> int sqrt 1,2,3,4 = 1 1 1 2
> fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2
>
> and gets even more chunky as you get larger values, eg, sqrt(36
> through 48) = 6, sqrt(49) = 7
>
> So we could precalculate the interval/sqrt(count) using floating point
> in the control law calculation,
> early, during qdisc setup, thus neatly avoiding both the divide and
> sqrt in the control law path.
>
> Or we could do fixed point.
>
> Kathie: In the sim, in various simulations, what is the dynamic range
> of count? And is it floating point sqrt?
>
The Wikipedia article on fast inverse sqrt is enlightening on computing
this really fast.

See:

http://en.wikipedia.org/wiki/Fast_inverse_square_root

This has little to do with whether an integer or floating point value is
better algorithmically.
                            - Jim

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 15:14 ` Jim Gettys
@ 2012-05-04 15:19   ` Dave Taht
  2012-05-04 15:26     ` Jim Gettys
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2012-05-04 15:19 UTC (permalink / raw)
  To: Jim Gettys; +Cc: codel

On Fri, May 4, 2012 at 8:14 AM, Jim Gettys <jg@freedesktop.org> wrote:
> On 05/04/2012 11:11 AM, Dave Taht wrote:
>> The linux kernel has no floating point in it, so I'd substituted the
>> internal int_sqrt as a substitute,
>> just to get something to work.
>>
>> static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t)
>> {
>>         return ktime_add_ns(t, q->interval / int_sqrt(q->count));
>> }
>>
>> Often ns2 models use floating point. Having not seen the model I don't
>> know that for sure.
>>
>> The series for an integer sqrt is far more 'chunky' than a fp one.
>>
>> int sqrt 1,2,3,4 = 1 1 1 2
>> fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2
>>
>> and gets even more chunky as you get larger values, eg, sqrt(36
>> through 48) = 6, sqrt(49) = 7
>>
>> So we could precalculate the interval/sqrt(count) using floating point
>> in the control law calculation,
>> early, during qdisc setup, thus neatly avoiding both the divide and
>> sqrt in the control law path.
>>
>> Or we could do fixed point.
>>
>> Kathie: In the sim, in various simulations, what is the dynamic range
>> of count? And is it floating point sqrt?
>>
> The Wikipedia article on fast inverse sqrt is enlightening on computing
> this really fast.
>
> See:
>
> http://en.wikipedia.org/wiki/Fast_inverse_square_root
>
> This has little to do with whether an integer or floating point value is
> better algorithmically.

Neat trick, that.

I'm delighted the wikipedia article kept the fully commented original
code around!

That said, you can't use FP at all without special handling in the
kernel. It's certainly a terrible idea on a hot path...

>                            - Jim



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 15:19   ` Dave Taht
@ 2012-05-04 15:26     ` Jim Gettys
  2012-05-04 15:56       ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Jim Gettys @ 2012-05-04 15:26 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On 05/04/2012 11:19 AM, Dave Taht wrote:
> On Fri, May 4, 2012 at 8:14 AM, Jim Gettys <jg@freedesktop.org> wrote:
>> On 05/04/2012 11:11 AM, Dave Taht wrote:
>>> The linux kernel has no floating point in it, so I'd substituted the
>>> internal int_sqrt as a substitute,
>>> just to get something to work.
>>>
>>> static inline ktime_t control_law(const struct codel_sched_data *q, ktime_t t)
>>> {
>>>         return ktime_add_ns(t, q->interval / int_sqrt(q->count));
>>> }
>>>
>>> Often ns2 models use floating point. Having not seen the model I don't
>>> know that for sure.
>>>
>>> The series for an integer sqrt is far more 'chunky' than a fp one.
>>>
>>> int sqrt 1,2,3,4 = 1 1 1 2
>>> fp sqrt 1 2 3 4 = 1 1.4.1 1.73 2
>>>
>>> and gets even more chunky as you get larger values, eg, sqrt(36
>>> through 48) = 6, sqrt(49) = 7
>>>
>>> So we could precalculate the interval/sqrt(count) using floating point
>>> in the control law calculation,
>>> early, during qdisc setup, thus neatly avoiding both the divide and
>>> sqrt in the control law path.
>>>
>>> Or we could do fixed point.
>>>
>>> Kathie: In the sim, in various simulations, what is the dynamic range
>>> of count? And is it floating point sqrt?
>>>
>> The Wikipedia article on fast inverse sqrt is enlightening on computing
>> this really fast.
>>
>> See:
>>
>> http://en.wikipedia.org/wiki/Fast_inverse_square_root
>>
>> This has little to do with whether an integer or floating point value is
>> better algorithmically.
> Neat trick, that.
>
> I'm delighted the wikipedia article kept the fully commented original
> code around!

I don't think that is the "original", which predates quake by a lot
according to the article.
>
> That said, you can't use FP at all without special handling in the
> kernel. It's certainly a terrible idea on a hot path...

I suspect there are fully integer implementations kicking around... 
It's just that x86 finally got fast enough FP that doing one multiply in
floating point beat the alternative.
                    - Jim


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 15:26     ` Jim Gettys
@ 2012-05-04 15:56       ` Eric Dumazet
  2012-05-04 17:23         ` Dave Taht
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-05-04 15:56 UTC (permalink / raw)
  To: Jim Gettys; +Cc: codel

On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote:

> 
> I suspect there are fully integer implementations kicking around... 
> It's just that x86 finally got fast enough FP that doing one multiply in
> floating point beat the alternative.

Anyway we dont need to compute sqrt() at all.

This can be done easily when we do the q->count++, if we hold in
q->sqrt_count the estimated value of sqrt(count)

Ie replace :

q->count++;

by:

q->count++;
n = q->sqrt_count + 1;
if (n * n <= q->count)
	q->sqrt_count = n;

A multiply is an acceptable cost.


Of course, you need to reset sqrt_count to 1 when count is reset to 1




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 15:56       ` Eric Dumazet
@ 2012-05-04 17:23         ` Dave Taht
  2012-05-04 17:43           ` Dave Taht
  2012-05-04 17:47           ` Eric Dumazet
  0 siblings, 2 replies; 18+ messages in thread
From: Dave Taht @ 2012-05-04 17:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: codel

On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote:
>
>>
>> I suspect there are fully integer implementations kicking around...
>> It's just that x86 finally got fast enough FP that doing one multiply in
>> floating point beat the alternative.
>
> Anyway we dont need to compute sqrt() at all.
>
> This can be done easily when we do the q->count++, if we hold in
> q->sqrt_count the estimated value of sqrt(count)
>
> Ie replace :
>
> q->count++;
>
> by:
>
> q->count++;
> n = q->sqrt_count + 1;
> if (n * n <= q->count)
>        q->sqrt_count = n;
>
> A multiply is an acceptable cost.
>
>
> Of course, you need to reset sqrt_count to 1 when count is reset to 1

Except that the cumulative drop probability is very different when you
apply a series of times of integer vs floating point sqrt calculations
into the mix.

I can do up a graph, and may (it helps to visualize things), however
if you look at the first four in the series, you get the net
effect of a 5th and nearly a 6th term by the time you get to the 4th
expansion of the series done with a fp sqrt vs integer sqrt.

10000000+  7071067+  5773502+5000000
27844569

10000000+10000000+10000000+5000000
35000000

so with fp you end up with a much tighter and more responsive control loop.

In looking over the (lack of) compiler support for fp in the kernel,
it seems simplest to load up a table from userspace for the
interval/sqrt(count) calculation.

Not that I netlink and I are friendly at the moment.

And I would have liked to been able to to fiddle with target (for
wireless) on the fly as a function of (active_stations), with interval
slaved to minstrel's idea of everything.

/me goes to look at gcc's implementation
>
>

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 17:23         ` Dave Taht
@ 2012-05-04 17:43           ` Dave Taht
  2012-05-04 17:44             ` Dave Taht
  2012-05-04 17:47           ` Eric Dumazet
  1 sibling, 1 reply; 18+ messages in thread
From: Dave Taht @ 2012-05-04 17:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: codel

On Fri, May 4, 2012 at 10:23 AM, Dave Taht <dave.taht@gmail.com> wrote:
> On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote:
>>
>>>
>>> I suspect there are fully integer implementations kicking around...
>>> It's just that x86 finally got fast enough FP that doing one multiply in
>>> floating point beat the alternative.
>>
>> Anyway we dont need to compute sqrt() at all.
>>
>> This can be done easily when we do the q->count++, if we hold in
>> q->sqrt_count the estimated value of sqrt(count)
>>
>> Ie replace :
>>
>> q->count++;
>>
>> by:
>>
>> q->count++;
>> n = q->sqrt_count + 1;
>> if (n * n <= q->count)
>>        q->sqrt_count = n;
>>
>> A multiply is an acceptable cost.
>>
>>
>> Of course, you need to reset sqrt_count to 1 when count is reset to 1
>
> Except that the cumulative drop probability is very different when you
> apply a series of times of integer vs floating point sqrt calculations
> into the mix.
>
> I can do up a graph, and may (it helps to visualize things), however
> if you look at the first four in the series, you get the net
> effect of a 5th and nearly a 6th term by the time you get to the 4th
> expansion of the series done with a fp sqrt vs integer sqrt.
>
> 10000000+  7071067+  5773502+5000000
> 27844569
>
> 10000000+10000000+10000000+5000000
> 35000000
>
> so with fp you end up with a much tighter and more responsive control loop.

To be mildly more clear, by the time integer sqrt codel would have
dropped 4 packets, fp codel would have nearly dropped 6.

27844569+4472135+4472135
36788839

That's assuming a 100ms target, which by observation over the past
year is not what a typical tcp stream is like anymore, it's closer to
50ms, particularly for elephants and cdns, less in many cases.


>
> In looking over the (lack of) compiler support for fp in the kernel,
> it seems simplest to load up a table from userspace for the
> interval/sqrt(count) calculation.
>
> Not that I netlink and I are friendly at the moment.
>
> And I would have liked to been able to to fiddle with target (for
> wireless) on the fly as a function of (active_stations), with interval
> slaved to minstrel's idea of everything.
>
> /me goes to look at gcc's implementation
>>
>>
>
>
>
> --
> Dave Täht
> SKYPE: davetaht
> US Tel: 1-239-829-5608
> http://www.bufferbloat.net



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 17:43           ` Dave Taht
@ 2012-05-04 17:44             ` Dave Taht
  2012-05-04 17:46               ` Dave Taht
  2012-05-04 23:42               ` Kathleen Nichols
  0 siblings, 2 replies; 18+ messages in thread
From: Dave Taht @ 2012-05-04 17:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: codel

On Fri, May 4, 2012 at 10:43 AM, Dave Taht <dave.taht@gmail.com> wrote:
> On Fri, May 4, 2012 at 10:23 AM, Dave Taht <dave.taht@gmail.com> wrote:
>> On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote:
>>>
>>>>
>>>> I suspect there are fully integer implementations kicking around...
>>>> It's just that x86 finally got fast enough FP that doing one multiply in
>>>> floating point beat the alternative.
>>>
>>> Anyway we dont need to compute sqrt() at all.
>>>
>>> This can be done easily when we do the q->count++, if we hold in
>>> q->sqrt_count the estimated value of sqrt(count)
>>>
>>> Ie replace :
>>>
>>> q->count++;
>>>
>>> by:
>>>
>>> q->count++;
>>> n = q->sqrt_count + 1;
>>> if (n * n <= q->count)
>>>        q->sqrt_count = n;
>>>
>>> A multiply is an acceptable cost.
>>>
>>>
>>> Of course, you need to reset sqrt_count to 1 when count is reset to 1
>>
>> Except that the cumulative drop probability is very different when you
>> apply a series of times of integer vs floating point sqrt calculations
>> into the mix.
>>
>> I can do up a graph, and may (it helps to visualize things), however
>> if you look at the first four in the series, you get the net
>> effect of a 5th and nearly a 6th term by the time you get to the 4th
>> expansion of the series done with a fp sqrt vs integer sqrt.
>>
>> 10000000+  7071067+  5773502+5000000
>> 27844569
>>
>> 10000000+10000000+10000000+5000000
>> 35000000
>>
>> so with fp you end up with a much tighter and more responsive control loop.
>
> To be mildly more clear, by the time integer sqrt codel would have
> dropped 4 packets, fp codel would have nearly dropped 6.
>
> 27844569+4472135+4472135
> 36788839
>
> That's assuming a 100ms target, which by observation over the
^^^^^^^^^^^^^^^^^^^^interval

past
> year is not what a typical tcp stream is like anymore, it's closer to
> 50ms, particularly for elephants and cdns, less in many cases.
>
>
>>
>> In looking over the (lack of) compiler support for fp in the kernel,
>> it seems simplest to load up a table from userspace for the
>> interval/sqrt(count) calculation.
>>
>> Not that I netlink and I are friendly at the moment.
>>
>> And I would have liked to been able to to fiddle with target (for
>> wireless) on the fly as a function of (active_stations), with interval
>> slaved to minstrel's idea of everything.
>>
>> /me goes to look at gcc's implementation
>>>
>>>
>>
>>
>>
>> --
>> Dave Täht
>> SKYPE: davetaht
>> US Tel: 1-239-829-5608
>> http://www.bufferbloat.net
>
>
>
> --
> Dave Täht
> SKYPE: davetaht
> US Tel: 1-239-829-5608
> http://www.bufferbloat.net



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 17:44             ` Dave Taht
@ 2012-05-04 17:46               ` Dave Taht
  2012-05-04 23:42               ` Kathleen Nichols
  1 sibling, 0 replies; 18+ messages in thread
From: Dave Taht @ 2012-05-04 17:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: codel

On Fri, May 4, 2012 at 10:44 AM, Dave Taht <dave.taht@gmail.com> wrote:
> On Fri, May 4, 2012 at 10:43 AM, Dave Taht <dave.taht@gmail.com> wrote:
>> On Fri, May 4, 2012 at 10:23 AM, Dave Taht <dave.taht@gmail.com> wrote:
>>> On Fri, May 4, 2012 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>> On Fri, 2012-05-04 at 11:26 -0400, Jim Gettys wrote:
>>>>
>>>>>
>>>>> I suspect there are fully integer implementations kicking around...
>>>>> It's just that x86 finally got fast enough FP that doing one multiply in
>>>>> floating point beat the alternative.
>>>>
>>>> Anyway we dont need to compute sqrt() at all.
>>>>
>>>> This can be done easily when we do the q->count++, if we hold in
>>>> q->sqrt_count the estimated value of sqrt(count)
>>>>
>>>> Ie replace :
>>>>
>>>> q->count++;
>>>>
>>>> by:
>>>>
>>>> q->count++;
>>>> n = q->sqrt_count + 1;
>>>> if (n * n <= q->count)
>>>>        q->sqrt_count = n;
>>>>
>>>> A multiply is an acceptable cost.
>>>>
>>>>
>>>> Of course, you need to reset sqrt_count to 1 when count is reset to 1
>>>
>>> Except that the cumulative drop probability is very different when you
>>> apply a series of times of integer vs floating point sqrt calculations
>>> into the mix.
>>>
>>> I can do up a graph, and may (it helps to visualize things), however
>>> if you look at the first four in the series, you get the net
>>> effect of a 5th and nearly a 6th term by the time you get to the 4th
>>> expansion of the series done with a fp sqrt vs integer sqrt.
>>>
>>> 10000000+  7071067+  5773502+5000000
>>> 27844569
>>>
>>> 10000000+10000000+10000000+5000000
>>> 35000000
>>>
>>> so with fp you end up with a much tighter and more responsive control loop.
>>
>> To be mildly more clear, by the time integer sqrt codel would have
>> dropped 4 packets, fp codel would have nearly dropped 6.
>>
>> 27844569+4472135+4472135
>> 36788839

And I goofed on the copy/paste math. (all the more reason for me to
code this up and graph it)

27844569+4472135+4082482
36399186


>>
>> That's assuming a 100ms target, which by observation over the
> ^^^^^^^^^^^^^^^^^^^^interval
>
> past
>> year is not what a typical tcp stream is like anymore, it's closer to
>> 50ms, particularly for elephants and cdns, less in many cases.
>>
>>
>>>
>>> In looking over the (lack of) compiler support for fp in the kernel,
>>> it seems simplest to load up a table from userspace for the
>>> interval/sqrt(count) calculation.
>>>
>>> Not that I netlink and I are friendly at the moment.
>>>
>>> And I would have liked to been able to to fiddle with target (for
>>> wireless) on the fly as a function of (active_stations), with interval
>>> slaved to minstrel's idea of everything.
>>>
>>> /me goes to look at gcc's implementation
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Dave Täht
>>> SKYPE: davetaht
>>> US Tel: 1-239-829-5608
>>> http://www.bufferbloat.net
>>
>>
>>
>> --
>> Dave Täht
>> SKYPE: davetaht
>> US Tel: 1-239-829-5608
>> http://www.bufferbloat.net
>
>
>
> --
> Dave Täht
> SKYPE: davetaht
> US Tel: 1-239-829-5608
> http://www.bufferbloat.net



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 17:23         ` Dave Taht
  2012-05-04 17:43           ` Dave Taht
@ 2012-05-04 17:47           ` Eric Dumazet
  2012-05-04 18:26             ` Eric Dumazet
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-05-04 17:47 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On Fri, 2012-05-04 at 10:23 -0700, Dave Taht wrote:

> In looking over the (lack of) compiler support for fp in the kernel,
> it seems simplest to load up a table from userspace for the
> interval/sqrt(count) calculation.

Well, you could have a small table (not from userspace, thats really not
needed at all) for the 16 first sqrt values.

More over, for the first sqrt values you can use reciprocal divide so
that we dont have a divide anymore (see include/linux/reciprocal_div.h),
and you can have a very precise sqrt this way, not an integral one.

With count >= 16, the error you mention becomes small.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 17:47           ` Eric Dumazet
@ 2012-05-04 18:26             ` Eric Dumazet
  2012-05-04 18:39               ` Dave Taht
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-05-04 18:26 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On Fri, 2012-05-04 at 19:47 +0200, Eric Dumazet wrote:
> On Fri, 2012-05-04 at 10:23 -0700, Dave Taht wrote:
> 
> > In looking over the (lack of) compiler support for fp in the kernel,
> > it seems simplest to load up a table from userspace for the
> > interval/sqrt(count) calculation.
> 
> Well, you could have a small table (not from userspace, thats really not
> needed at all) for the 16 first sqrt values.
> 
> More over, for the first sqrt values you can use reciprocal divide so
> that we dont have a divide anymore (see include/linux/reciprocal_div.h),
> and you can have a very precise sqrt this way, not an integral one.
> 
> With count >= 16, the error you mention becomes small.

proof of concept 

You can see the integer approximation, using only a multiply is pretty
good compared to a float computation,


# ./try
1 val=16777216 sqrt=4096 rec=4294967295
interval/sqrt(1)=100000000  integer approx :99999999
2 val=33554432 sqrt=5792 rec=3037327360
interval/sqrt(2)=70710678  integer approx :70718288
3 val=50331648 sqrt=7094 rec=2479869952
interval/sqrt(3)=57735026  integer approx :57738971
4 val=67108864 sqrt=8192 rec=2147483648
interval/sqrt(4)=50000000  integer approx :50000000
5 val=83886080 sqrt=9158 rec=1920966656
interval/sqrt(5)=44721359  integer approx :44725990
6 val=100663296 sqrt=10033 rec=1753436160
interval/sqrt(6)=40824829  integer approx :40825366
7 val=117440512 sqrt=10836 rec=1623494656
interval/sqrt(7)=37796447  integer approx :37799930
8 val=134217728 sqrt=11585 rec=1518534656
interval/sqrt(8)=35355339  integer approx :35356140
9 val=150994944 sqrt=12288 rec=1431658496
interval/sqrt(9)=33333333  integer approx :33333396
10 val=167772160 sqrt=12952 rec=1358262272
interval/sqrt(10)=31622776  integer approx :31624507
11 val=184549376 sqrt=13584 rec=1295069184
interval/sqrt(11)=30151134  integer approx :30153179
12 val=201326592 sqrt=14188 rec=1239937024
interval/sqrt(12)=28867513  integer approx :28869533
13 val=218103808 sqrt=14768 rec=1191239680
interval/sqrt(13)=27735009  integer approx :27735710
14 val=234881024 sqrt=15325 rec=1147940864
interval/sqrt(14)=26726124  integer approx :26727581
15 val=251658240 sqrt=15863 rec=1109008384
interval/sqrt(15)=25819888  integer approx :25821113

cat try.c

#include <stdio.h>
#include <math.h>

typedef unsigned int u32;
typedef unsigned long long u64;

static inline u32 reciprocal_divide(u32 A, u32 R)
{
	return (u32)(((u64)A * R) >> 32);
}
unsigned long int_sqrt(unsigned long x)
{
	unsigned long op, res, one;

	op = x;
	res = 0;

	one = 1UL << (32 - 2);
	while (one > op)
		one >>= 2;

	while (one != 0) {
		if (op >= res + one) {
			op = op - (res + one);
			res = res +  2 * one;
		}
		res /= 2;
		one /= 4;
	}
	return res;
}

u32 reciprocal_value(u32 k)
{
	u64 val = (1LL << 32) + (k - 1);

	val = val/k;
	return (u32)val;
}

int main()
{
	unsigned long l, val, sq;
	unsigned interval = 100000000;
	u32 rec;

	for (l = 1 ; l < 16; l++) {
		val = (l << 24);
		sq = int_sqrt(val) ;
		rec = reciprocal_value(sq) << 12;
		if (!rec)
			rec = ~0U;
		printf("%lu val=%lu sqrt=%lu rec=%u\n", l, val, sq, rec);
		printf("interval/sqrt(%u)=%u  integer approx :%u\n", l, 
			(u32)(interval/sqrt(l)), 
			reciprocal_divide(interval, rec));
	}
}



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 18:26             ` Eric Dumazet
@ 2012-05-04 18:39               ` Dave Taht
  2012-05-04 18:50                 ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2012-05-04 18:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: codel

Heh. I'd only gotten as far as

#include <stdio.h>
#include <math.h>
main() {





before your mail arrived.

On Fri, May 4, 2012 at 11:26 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-05-04 at 19:47 +0200, Eric Dumazet wrote:
>> On Fri, 2012-05-04 at 10:23 -0700, Dave Taht wrote:
>>
>> > In looking over the (lack of) compiler support for fp in the kernel,
>> > it seems simplest to load up a table from userspace for the
>> > interval/sqrt(count) calculation.
>>
>> Well, you could have a small table (not from userspace, thats really not
>> needed at all) for the 16 first sqrt values.
>>
>> More over, for the first sqrt values you can use reciprocal divide so
>> that we dont have a divide anymore (see include/linux/reciprocal_div.h),
>> and you can have a very precise sqrt this way, not an integral one.
>>
>> With count >= 16, the error you mention becomes small.
>
> proof of concept
>
> You can see the integer approximation, using only a multiply is pretty
> good compared to a float computation,
>
>
> # ./try
> 1 val=16777216 sqrt=4096 rec=4294967295
> interval/sqrt(1)=100000000  integer approx :99999999
> 2 val=33554432 sqrt=5792 rec=3037327360
> interval/sqrt(2)=70710678  integer approx :70718288
> 3 val=50331648 sqrt=7094 rec=2479869952
> interval/sqrt(3)=57735026  integer approx :57738971
> 4 val=67108864 sqrt=8192 rec=2147483648
> interval/sqrt(4)=50000000  integer approx :50000000
> 5 val=83886080 sqrt=9158 rec=1920966656
> interval/sqrt(5)=44721359  integer approx :44725990
> 6 val=100663296 sqrt=10033 rec=1753436160
> interval/sqrt(6)=40824829  integer approx :40825366
> 7 val=117440512 sqrt=10836 rec=1623494656
> interval/sqrt(7)=37796447  integer approx :37799930
> 8 val=134217728 sqrt=11585 rec=1518534656
> interval/sqrt(8)=35355339  integer approx :35356140
> 9 val=150994944 sqrt=12288 rec=1431658496
> interval/sqrt(9)=33333333  integer approx :33333396
> 10 val=167772160 sqrt=12952 rec=1358262272
> interval/sqrt(10)=31622776  integer approx :31624507
> 11 val=184549376 sqrt=13584 rec=1295069184
> interval/sqrt(11)=30151134  integer approx :30153179
> 12 val=201326592 sqrt=14188 rec=1239937024
> interval/sqrt(12)=28867513  integer approx :28869533
> 13 val=218103808 sqrt=14768 rec=1191239680
> interval/sqrt(13)=27735009  integer approx :27735710
> 14 val=234881024 sqrt=15325 rec=1147940864
> interval/sqrt(14)=26726124  integer approx :26727581
> 15 val=251658240 sqrt=15863 rec=1109008384
> interval/sqrt(15)=25819888  integer approx :25821113
>
> cat try.c
>
> #include <stdio.h>
> #include <math.h>
>
> typedef unsigned int u32;
> typedef unsigned long long u64;
>
> static inline u32 reciprocal_divide(u32 A, u32 R)
> {
>        return (u32)(((u64)A * R) >> 32);
> }
> unsigned long int_sqrt(unsigned long x)
> {
>        unsigned long op, res, one;
>
>        op = x;
>        res = 0;
>
>        one = 1UL << (32 - 2);
>        while (one > op)
>                one >>= 2;
>
>        while (one != 0) {
>                if (op >= res + one) {
>                        op = op - (res + one);
>                        res = res +  2 * one;
>                }
>                res /= 2;
>                one /= 4;
>        }
>        return res;
> }
>
> u32 reciprocal_value(u32 k)
> {
>        u64 val = (1LL << 32) + (k - 1);
>
>        val = val/k;
>        return (u32)val;
> }
>
> int main()
> {
>        unsigned long l, val, sq;
>        unsigned interval = 100000000;
>        u32 rec;
>
>        for (l = 1 ; l < 16; l++) {
>                val = (l << 24);
>                sq = int_sqrt(val) ;
>                rec = reciprocal_value(sq) << 12;
>                if (!rec)
>                        rec = ~0U;
>                printf("%lu val=%lu sqrt=%lu rec=%u\n", l, val, sq, rec);
>                printf("interval/sqrt(%u)=%u  integer approx :%u\n", l,
>                        (u32)(interval/sqrt(l)),
>                        reciprocal_divide(interval, rec));
>        }
> }
>
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 18:39               ` Dave Taht
@ 2012-05-04 18:50                 ` Eric Dumazet
  2012-05-04 19:04                   ` Dave Taht
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-05-04 18:50 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On Fri, 2012-05-04 at 11:39 -0700, Dave Taht wrote:
> Heh. I'd only gotten as far as
> 
> #include <stdio.h>
> #include <math.h>
> main() {
> 
> 
> 
> 
> 
> before your mail arrived.

Yep, so maybe we can use a precomputed table of 25 values (100 bytes),
using the reciprocal trick to get precise integer approximation.

(interval/sqrt(count)). 

So that control_law() has no multiply or divide to do, just a lookup in
the array.

Then for count values >= 25, either we can afford an error less than 20%
using the q->sqrt_count maintained in // with q->count, either we
compute the reciprocal each time we change q->count. 

(cost : one divide + int_sqrt() call )
plus one multiply in control_law()

Not sure we need to be ultra precise.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 18:50                 ` Eric Dumazet
@ 2012-05-04 19:04                   ` Dave Taht
  2012-05-04 19:16                     ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2012-05-04 19:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: codel

On Fri, May 4, 2012 at 11:50 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-05-04 at 11:39 -0700, Dave Taht wrote:
>> Heh. I'd only gotten as far as
>>
>> #include <stdio.h>
>> #include <math.h>
>> main() {
>>
>>
>>
>>
>>
>> before your mail arrived.
>
> Yep, so maybe we can use a precomputed table of 25 values (100 bytes),
> using the reciprocal trick to get precise integer approximation.
>
> (interval/sqrt(count)).
>
> So that control_law() has no multiply or divide to do, just a lookup in
> the array.
>
> Then for count values >= 25, either we can afford an error less than 20%
> using the q->sqrt_count maintained in // with q->count, either we
> compute the reciprocal each time we change q->count.

We end up with cumulative error and we are trying to match in inverse
tcp's behavior on the flip side, over a variety of RTTs and bandwidths
ranging from nearly 0 to 10GigE....

table[sch->len] in size doesn't bother me much. Except that I'd hoped
to get away with ~ infinite queue length and not bother with that.

More noise presumably helps with randomization down to some point to
where we fall below kernel time granularity.

> (cost : one divide + int_sqrt() call )
> plus one multiply in control_law()
>
> Not sure we need to be ultra precise.

I'd like to toss these concepts back over the wall to kathie to simulate.

I think seeing the oscillation in queue length with just int_sqrt vs
fp sqrt would be fun too, but I'm weird that way.

Late this evening I'll have time to incorporate the progress so far
today (thx for the cleanups and help!!!) and I'll try to get another
patch out late tonight. Unless you beat me to it.

About the only thing left bugging me is the netlink interface.

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 19:04                   ` Dave Taht
@ 2012-05-04 19:16                     ` Eric Dumazet
  2012-05-05  8:14                       ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2012-05-04 19:16 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On Fri, 2012-05-04 at 12:04 -0700, Dave Taht wrote:

> Late this evening I'll have time to incorporate the progress so far
> today (thx for the cleanups and help!!!) and I'll try to get another
> patch out late tonight. Unless you beat me to it.
> 

I am not planning to send a patch in a near future, its 9h15pm here in
France, and TGIF.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 17:44             ` Dave Taht
  2012-05-04 17:46               ` Dave Taht
@ 2012-05-04 23:42               ` Kathleen Nichols
  2012-05-05  0:01                 ` Dave Taht
  1 sibling, 1 reply; 18+ messages in thread
From: Kathleen Nichols @ 2012-05-04 23:42 UTC (permalink / raw)
  To: codel


Take a deep breath.

On 5/4/12 10:44 AM, Dave Taht wrote:

>>
>> That's assuming a 100ms target, which by observation over the
> ^^^^^^^^^^^^^^^^^^^^interval
> 
> past
>> year is not what a typical tcp stream is like anymore, it's closer to
>> 50ms, particularly for elephants and cdns, less in many cases.
>>

But that really isn't going to make much difference. Interval is *loosely*
related to RTT.

>>>
>>> And I would have liked to been able to to fiddle with target (for
>>> wireless) on the fly as a function of (active_stations), with interval
>>> slaved to minstrel's idea of everything.

Might be good to know what problem you are exploring before you
fiddle.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 23:42               ` Kathleen Nichols
@ 2012-05-05  0:01                 ` Dave Taht
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Taht @ 2012-05-05  0:01 UTC (permalink / raw)
  To: Kathleen Nichols; +Cc: codel

On Fri, May 4, 2012 at 4:42 PM, Kathleen Nichols <nichols@pollere.com> wrote:
>
> Take a deep breath.

ok.

>
> On 5/4/12 10:44 AM, Dave Taht wrote:
>
>>>
>>> That's assuming a 100ms target, which by observation over the
>> ^^^^^^^^^^^^^^^^^^^^interval
>>
>> past
>>> year is not what a typical tcp stream is like anymore, it's closer to
>>> 50ms, particularly for elephants and cdns, less in many cases.
>>>
>
> But that really isn't going to make much difference. Interval is *loosely*
> related to RTT.

yes. But I have got some nice results with codel + qfq, between crashes....

>
>>>>
>>>> And I would have liked to been able to to fiddle with target (for
>>>> wireless) on the fly as a function of (active_stations), with interval
>>>> slaved to minstrel's idea of everything.
>
> Might be good to know what problem you are exploring before you
> fiddle.

Well, we went into the special issues wireless has in the meeting.

and at the same time, I'm very pleased with what I've been able to see
of codel so far.

And am trying to keep it simple... just head buzzing with ideas for later.

>
> _______________________________________________
> Codel mailing list
> Codel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/codel



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Codel] fp sqrt vis int sqrt?
  2012-05-04 19:16                     ` Eric Dumazet
@ 2012-05-05  8:14                       ` Eric Dumazet
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Dumazet @ 2012-05-05  8:14 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel

On Fri, 2012-05-04 at 21:16 +0200, Eric Dumazet wrote:
> On Fri, 2012-05-04 at 12:04 -0700, Dave Taht wrote:
> 
> > Late this evening I'll have time to incorporate the progress so far
> > today (thx for the cleanups and help!!!) and I'll try to get another
> > patch out late tonight. Unless you beat me to it.
> > 
> 
> I am not planning to send a patch in a near future, its 9h15pm here in
> France, and TGIF.
> 
> 

Oh well, this is obvious, just use the following and be done :

/* return interval/sqrt(x) with good precision
 */
u32 int_sqrt_div(u32 _interval, unsigned long x)
{
	u64 interval = _interval;

	/* scale for maximum precision */
	while (x < (1UL << (BITS_PER_LONG - 2))) {
		x <<= 2;
		interval <<= 1;
	}
	do_div(interval, int_sqrt(x));
	return (u32)interval;
}

Doing the scaling before calling int_sqrt() makes int_sqrt() faster,
since we avoid the initial scaling on 'one'

	one = 1UL << (BITS_PER_LONG - 2);
	while (one > op)
		one >>= 2;

Always sleep a bit, its good.




^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-05-05  8:14 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-04 15:11 [Codel] fp sqrt vis int sqrt? Dave Taht
2012-05-04 15:14 ` Jim Gettys
2012-05-04 15:19   ` Dave Taht
2012-05-04 15:26     ` Jim Gettys
2012-05-04 15:56       ` Eric Dumazet
2012-05-04 17:23         ` Dave Taht
2012-05-04 17:43           ` Dave Taht
2012-05-04 17:44             ` Dave Taht
2012-05-04 17:46               ` Dave Taht
2012-05-04 23:42               ` Kathleen Nichols
2012-05-05  0:01                 ` Dave Taht
2012-05-04 17:47           ` Eric Dumazet
2012-05-04 18:26             ` Eric Dumazet
2012-05-04 18:39               ` Dave Taht
2012-05-04 18:50                 ` Eric Dumazet
2012-05-04 19:04                   ` Dave Taht
2012-05-04 19:16                     ` Eric Dumazet
2012-05-05  8:14                       ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox