Cake - FQ_codel the next generation
* [Cake] second system syndrome
@ 2015-12-06 14:53 Dave Taht
  2015-12-06 16:08 ` Sebastian Moeller
  2015-12-06 18:21 ` Kevin Darbyshire-Bryant
  0 siblings, 2 replies; 24+ messages in thread
From: Dave Taht @ 2015-12-06 14:53 UTC
  To: cake

[-- Attachment #1: Type: text/plain, Size: 4367 bytes --]

I find myself torn by 3 things.

1) The number of huge wins in fixing wifi far outweigh what we have
thus far achieved, or not achieved, in cake.

2) Science - Cake is like wet paint! There are knobs to fiddle, endless
tests to run, new ideas to try... measurements to take! papers to
write!

3) Engineering - I just want it to be *done*. It's been too long. It
was demonstrably faster than htb + fq_codel on weak hardware last
June, and it handled GRO peeling; those were the two biggest "bugs"
I saw in sqm.

In wearing these 3 hats, I would

3A) like to drop cake, personally, from the things I need to care about.
3B) But, can't, because the profusion of features needs to be fully evaluated.
In this test series: http://snapon.cs.kau.se/~d/bcake_tests/ neither
cake nor bcake was "better" than the existing codel in any measurable
way, and in most cases both were worse. bcake did mildly better at a
short (10ms) RTT... which was interesting.

If you want to take apart this batch with "flent", looking for
enlightenment, also, please go ahead.

Were I to short circuit the science here, I'd rip out the sqrt cache
and fold mainline codel back into cake. This would have the added
benefit of moving us back to 32-bit land for various values (tho
"now" becomes a bit trickier) and hopefully improving cpu efficiency
a bit further (but this has to be done carefully unless your head is
good at 32 bit overflow math).
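
A note on the 32-bit overflow math mentioned above: once "now" lives in a 32-bit counter it wraps, so timestamps have to be compared by signed difference rather than with a plain `<`. Below is a minimal sketch of that idiom in C. The names `time32_t`, `time32_after` and `time32_elapsed` are made up for illustration and are not taken from cake or codel, and the comparison assumes the two timestamps are less than 2^31 ticks apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit timestamp in arbitrary ticks; it wraps at 2^32. */
typedef uint32_t time32_t;

/* True if a is later than b, assuming they are less than 2^31 ticks
 * apart.  The unsigned subtraction wraps modulo 2^32; the cast to a
 * signed type (two's complement assumed) then recovers the direction. */
bool time32_after(time32_t a, time32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Ticks elapsed from 'then' to 'now'; still correct across one wrap. */
uint32_t time32_elapsed(time32_t now, time32_t then)
{
	return now - then;
}
```

With these helpers, `time32_after(5, 0xFFFFFFF0)` is true even though 5 is numerically smaller, because the counter wrapped in between.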

Next up, a series testing the fq portions...

If someone (else) would like to fork cake again and do the two things
above, I'd appreciate it.

3C) Most of the new statistics are pretty useless IMHO. Interesting,
but in the end I mostly care about drops and marks only.

3D) Don't have a use for the rate estimator either, and so far the
dual queue idea has not materialized. I understand how it might be
done now - using the 8 way set associative thing per DEST hash, but I
don't really see the benefit of that vs just using a DEST hash in the
first place.
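
For readers unfamiliar with the "8 way set associative thing": the general technique is to let the hash choose a small set of queues and then search only within that set, so that two flows landing in the same set need not share a queue. The sketch below is a generic illustration in C under assumed sizes (128 sets of 8 ways); `flow_table` and `flow_lookup` are hypothetical names and this is not cake's actual code. A plain hash corresponds to the degenerate case of one way per set.

```c
#include <stdint.h>

#define SET_WAYS 8	/* ways per set */
#define NUM_SETS 128	/* assumed: 128 * 8 = 1024 queues in total */

struct flow_table {
	uint32_t tag[NUM_SETS * SET_WAYS];	/* full hash that owns each queue */
	uint8_t  used[NUM_SETS * SET_WAYS];	/* nonzero once a queue is claimed */
};

/* Map a flow hash to a queue index.  The hash selects a set; within it
 * we first look for a queue already tagged with this hash, then for a
 * free queue, and only then accept a collision on the set's first way. */
uint32_t flow_lookup(struct flow_table *t, uint32_t hash)
{
	uint32_t base = (hash % NUM_SETS) * SET_WAYS;
	uint32_t i;

	for (i = 0; i < SET_WAYS; i++)
		if (t->used[base + i] && t->tag[base + i] == hash)
			return base + i;

	for (i = 0; i < SET_WAYS; i++) {
		if (!t->used[base + i]) {
			t->used[base + i] = 1;
			t->tag[base + i] = hash;
			return base + i;
		}
	}

	return base;	/* set is full: this flow shares a queue */
}
```

In this sketch a collision only happens once nine distinct hashes map to the same set, whereas a direct hash collides as soon as two do.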

3E) Want cake to run as fast as possible on cheap hardware and be a
demonstrable win over htb + fq_codel - and upstream it and be done
with it.

3F) At the moment I'm favoring peeling at the current quantum rather
than anything more elaborate.

3G) really want the thing to work correctly down to 64k and up to at
least a gbit.
which needs testing... but probably after we pick a codel....

2A) As a science vehicle, there are many other things we could be
trying in cake, and I happen to like the idea of using the (currently
sqrt) cache for, say, trying a faster curve at startup - or, as in the
ns2 code - a harder curve at say count + 128 or even earlier, as the
speed up in drops gets pretty tiny even at count + 16. (see attached)

(it doesn't make much sense to calculate the sqrt at run time - you
can just calculate the constants elsewhere and plug them in, btw.
attached is a teeny proggie that does that and also explores a harder
initial curve (you'd jump count up to where it matched the curve when
you reverted to the invsqrt calculation) - and no, I haven't tried
plugging this in either... DANGER! Wet Paint!)
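
One way to read "calculate the constants elsewhere and plug them in" is a const table of 1/sqrt(count) in fixed point, indexed by count. The sketch below assumes Q16.16 fixed point and a 16-entry table; the table name, its size and `next_drop_spacing` are illustrative choices, not what cake or mainline codel does. The entries are round(65536 / sqrt(count)).

```c
#include <stdint.h>

#define INVSQRT_CACHE_SIZE 16	/* hypothetical cache size */

/* round(65536 / sqrt(count)) in Q16.16 fixed point, computed offline.
 * Index 0 is not a valid count and simply mirrors index 1. */
static const uint32_t invsqrt_q16[INVSQRT_CACHE_SIZE] = {
	65536, 65536, 46341, 37837, 32768, 29309, 26755, 24770,
	23170, 21845, 20724, 19760, 18919, 18176, 17515, 16921,
};

/* codel-style control law: the spacing to the next drop is
 * interval / sqrt(count).  Counts beyond the table are clamped to the
 * last entry in this sketch. */
uint32_t next_drop_spacing(uint32_t interval, uint32_t count)
{
	if (count >= INVSQRT_CACHE_SIZE)
		count = INVSQRT_CACHE_SIZE - 1;
	return (uint32_t)(((uint64_t)interval * invsqrt_q16[count]) >> 16);
}
```

For example, with a 100000 us interval, count 4 gives a spacing of 50000 us, since 1/sqrt(4) is exactly 0.5.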

I also like keeping all the core values 64 bits, from a science perspective.

There are also things like reducing the number of flows, and
exercising the 8 way associative cache more - to say 256, 128, or even
32? Or relative to the bandwidth... or target setting...

and I do keep wishing we could at the very least resolve the target >
mtu issue. std codel turns off with a single mtu outstanding. That
arguably should have been all that was needed...
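
For reference, the check being discussed can be sketched as below: the queue counts as "ok" either when the sojourn time is under target or when no more than one MTU of data is queued. This is a simplified paraphrase of the codel logic, not cake's code, and the struct and function names are hypothetical. One reading of the issue above is that at low rates a single MTU takes longer than target to send, so only the backlog clause can end the dropping state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the state codel consults on dequeue. */
struct queue_state {
	uint32_t sojourn_us;	/* time the head packet spent queued */
	uint32_t target_us;	/* acceptable standing delay */
	uint32_t backlog_bytes;	/* bytes still queued */
	uint32_t mtu_bytes;	/* one MTU (or largest packet seen) */
};

/* True when codel should not start dropping, or should stop. */
bool queue_is_ok(const struct queue_state *q)
{
	return q->sojourn_us < q->target_us ||
	       q->backlog_bytes <= q->mtu_bytes;
}

/* Microseconds needed to send 'bytes' at 'rate_bps' bits per second.
 * rate_bps must be nonzero. */
uint32_t serialisation_us(uint32_t bytes, uint64_t rate_bps)
{
	return (uint32_t)((uint64_t)bytes * 8u * 1000000u / rate_bps);
}
```

As a worked number, 1500 bytes at 1 Mbit/s takes 12000 us, well above a 5000 us target.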

and then there's ecn...

1A) Boy do we have major problems with wifi, and major gains to be had
1B) All the new platforms have bugs everywhere, including in the
ethernet drivers

0)

So I guess it does come down to - what are the "musts" for cake before
it goes upstream? How much more work is required, by everybody, on
every topic, til that happens? Can we just fork off what is known to
work reasonably well, and let the rest evolve separately in a cake2?
(cleaning up the api in the process?) Is it still "cake" if we do
that?

Because, damn it, 2016 is going to be the year of WiFi.


Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi

[-- Attachment #2: invsqrt.c --]
[-- Type: text/x-csrc, Size: 318 bytes --]

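#include <stdio.h>
#include <math.h>

/* Print the codel 1/sqrt(i) curve next to the same curve scaled by 2/3
 * (the "harder" curve), each with its running sum. Link with -lm. */
int main(void)
{
	double cur, inv, sum1, sum2;
	int i;

	inv = cur = sum1 = sum2 = 0.0;
	for (i = 1; i < 16; i++) {
		/* standard inverse square root and its running sum */
		sum1 += inv = 1.0 / sqrt(i);
		printf("i=%d inv=%f sum1=%f\n", i, inv, sum1);
		/* the same curve scaled by 2/3 and its running sum */
		sum2 += cur = 2.0 * (1.0 / sqrt(i)) / 3.0;
		printf("i=%d cur=%f sum2=%f\n", i, cur, sum2);
	}
	return 0;
}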



* Re: [Cake] second system syndrome
  2015-12-06 14:53 [Cake] second system syndrome Dave Taht
@ 2015-12-06 16:08 ` Sebastian Moeller
  2015-12-07 12:24   ` Kevin Darbyshire-Bryant
  2015-12-06 18:21 ` Kevin Darbyshire-Bryant
  1 sibling, 1 reply; 24+ messages in thread
From: Sebastian Moeller @ 2015-12-06 16:08 UTC
  To: Dave Täht; +Cc: cake

[-- Attachment #1: Type: text/plain, Size: 7335 bytes --]

Hi Dave,

Since I am not really involved in cake development, make of my comments what you will…

Even though I added comments below, IMHO the way to proceed is to discuss the statistics to pass back to tc, define a set we agree to stick to (at least as a base set, potentially copied from the best of fq_codel and HTB), and then ask the kernel and iproute2 folks to merge what we have. Changes that improve performance will most likely still be possible in the future, even once cake is upstream…


On Dec 6, 2015, at 15:53 , Dave Taht <dave.taht@gmail.com> wrote:

> I find myself torn by 3 things.
> 
> 1) The number of huge wins in fixing wifi far outweigh what we have
> thus far achieved, or not achieved, in cake.

	As we say at home, “a sparrow in the hand is better than a pigeon on the roof” (“der Spatz in der Hand ist besser als die Taube auf dem Dach”); so given that cake is almost baked and wifi needs a lot more than simple go-faster stripes, maybe finishing cake while getting wifi improved is achievable?

> 
> 2) Science - Cake is like wet paint! There are knobs to fiddle, endless
> tests to run, new ideas to try... measurements to take! papers to
> write!
> 
> 3) Engineering - I just want it to be *done*. It's been too long. It
> was demonstrably faster than htb + fq_codel on weak hardware last
> june, and handled GRO peeling, which were the two biggest "bugs" in
> sqm I viewed we had.

	Two questions:
1) was it really faster for long enough tests (and has anybody accidentally looked at cpu temperatures once cake starts throttling)?
2) Bugs in sqm? I thought that cake’s raison d’être was not to improve sqm-scripts’ performance, but to make it simple for my mom to set up decent latency-conserving internet access. So performance gains are sugar on top, but lack thereof is not necessarily a merge stopper?

> 
> In wearing these 3 hats, I would
> 
> 3A) like to drop cake, personally, from something I needed to care about.
> 3B) But, can't, because the profusion of features need to be fully evaluated.
> In this test series: http://snapon.cs.kau.se/~d/bcake_tests/ neither
> cake or bcake were "better" than the existing codel in any measurable
> way, and in most cases, worse. bcake did mildly better at a short
> (10ms) RTT... which was interesting.

	But since codel/fq_codel are hard to set up, especially in combination with a shaper, rough performance parity with HTB+fq_codel might be sufficient justification for a merge.

> 
> If you want to take apart this batch with "flent", looking for
> enlightenment, also, please go ahead.
> 
> Were I to short circuit the science here, I'd rip out the sqrt cache
> and fold back in mainline codel into cake. This would also have the
> added benefit of also moving us back to 32bitland for various values
> (tho "now" becomes a bit trickier) and hopefully improving cpu
> efficiency a bit further (but this has to get done carefully unless
> your head is good at 32 bit overflow math)
> 
> Next up, a series testing the fq portions...
> 
> If someone (else) would like to fork cake again and do the two things
> above, I'd appreciate it.
> 
> 3C) Most of the new statistics are pretty useless IMHO. Interesting,
> but in the end I mostly care about drops and marks only.

	I do care about packet size (and max packet size). The kernel’s complicated rules about when and which overhead to add or not to add are so under-documented that one needs a way to figure out what information reaches the qdiscs/shapers; otherwise meaningful per-packet overhead accounting is not going to work. Max_packet size I see as the only way to check whether meta-packets hit the qdiscs or not. Even if cake never peels or always peels, this is informative in my opinion.

> 
> 3D) Don't have a use for the rate estimator either, and so far the
> dual queue idea has not materialized. I understand how it might be
> done now - using the 8 way set associative thing per DEST hash, but I
> don't really see the benefit of that vs just using a DEST hash in the
> first place.
> 
> 3E) Want cake to run as fast as possible on cheap hardware and be a
> demonstrable win over htb + fq_codel - and upstream it and be done
> with it.

	Being able to set up a decent shaper/codel combination in one line of tc is already a win (but I repeat myself)

> 
> 3F) At the moment I'm favoring peeling at the current quantum rather
> than anything more elaborate.

	Why quantum, why not simply at MTU boundaries? I seem to recall that aggregates already carry information on how many MTU segments they consist of, which could be re-used?

> 
> 3G) really want the thing to work correctly down to 64k and up to at
> least a gbit.
> which needs testing... but probably after we pick a codel....
> 
> 2A) As a science vehicle, there are many other things we could be
> trying in cake, and I happen to like the idea of the (currently sqrt)
> cache in for example, trying a faster curve at startup - or, as in the
> ns2 code - a harder curve at say count + 128 or even earlier, as the
> speed up in drops gets pretty tiny even at count + 16. (see attached)
> 
> (it doesn't make much sense to calculate the sqrt at run time - you
> can just calculate the constants elsewhere and plug them in, btw.
> attached is a teeny proggie that does that and also explores a harder
> initial curve (you'd jump count up to where it matched the curve when
> you reverted to the invsqrt calculation) - and no, I haven't tried
> plugging this in either... DANGER! Wet Paint!
> 
> I also like keeping all the core values 64 bits, from a science perspective.
> 
> There are also things like reducing the number of flows, and
> exercising the 8 way associative cache more - to say 256, 128, or even
> 32? Or relative to the bandwidth... or target setting...
> 
> and I do keep wishing we could at the very least resolve the target >
> mtu issue. std codel turns off with a single mtu outstanding. That
> arguably should have been all that was needed...
> 
> and then there's ecn...
> 
> 1A) Boy do we have major problems with wifi, and major gains to be had
> 1B) All the new platforms have bugs everywhere, including in the
> ethernet drivers
> 
> 0)
> 
> So I guess it does come down to - what are the "musts" for cake before
> it goes upstream?

	Get the feature set defined (potentially strip contentious features for the time being and merge them piecewise into the kernel proper) as well as a statistics set so the communication with tc is future proof enough for the near future. Then try to get it merged...

> How much more work is required, by everybody, on
> every topic, til that happens? Can we just fork off what is known to
> work reasonably well, and let the rest evolve separately in a cake2?

	I was under the impression, that you and Toke are currently measuring the performance costs of the additional features so decisions which features to include could be made based on their cost?


> (cleaning up the api in the process?) Is it still "cake" if we do
> that?

	In the end we all know “the cake is a lie” so ;)

Best Regards
	Sebastian

> 
> Because, damn it, 2016 is going to be the year of WiFi.
> 
> 
> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> https://www.gofundme.com/savewifi

[-- Attachment #3: Type: text/plain, Size: 146 bytes --]

> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake



* Re: [Cake] second system syndrome
  2015-12-06 14:53 [Cake] second system syndrome Dave Taht
  2015-12-06 16:08 ` Sebastian Moeller
@ 2015-12-06 18:21 ` Kevin Darbyshire-Bryant
  1 sibling, 0 replies; 24+ messages in thread
From: Kevin Darbyshire-Bryant @ 2015-12-06 18:21 UTC
  To: cake

[-- Attachment #1: Type: text/plain, Size: 5844 bytes --]

Comments:

My feeling is that we've been here (or close to here) before, and every
time there has been a 'just one more thing/feature' Columbo moment which
puts it all on hold again.  The last time was 'dual flow isolation'. 
Without wishing to stir too much that pot again I do think the 'dual
flow isolation', if I understand the intention correctly*, is a feature
that 'consumers' would find desirable.

I'm wondering what the hold up is and whether I can help.   I personally
pledge £200 to the cake project.  I know it's not much in terms of
hours/rate etc but please take it as a sign of how much I personally
want cake to move forward and realise my own limitations in doing so.


One feature/benefit that hasn't been measured yet is 'simplicity'.  cake
offers a good shaper, fair queueing, dscp washing, overhead/framing
calculation/compensation all in one pretty damn easy to configure package. 

*trying to ensure fairness between hosts, not just between queues.  I
think the main aim/thought is having a 'bittorrent' host isolated and
low priority from everything else.






* Re: [Cake] second system syndrome
  2015-12-06 16:08 ` Sebastian Moeller
@ 2015-12-07 12:24   ` Kevin Darbyshire-Bryant
  2015-12-20 12:47     ` Dave Taht
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Darbyshire-Bryant @ 2015-12-07 12:24 UTC
  To: cake

[-- Attachment #1: Type: text/plain, Size: 10742 bytes --]

Here are some further thoughts, intermingled with Sebastian's comments
as well.  This is more of a brain dump/thinking out loud so is in
complete disorder and blunt.


On 06/12/15 16:08, Sebastian Moeller wrote:
> Hi Dave,
>
> Since I am not really involved in cake development, make of my comments what you will…
>
> Even though I added comments below, IMHO the way to proceed is to discuss the statistics to pass back to tc, define a set we agree to stick to (at least as a base set, potentially copied from the best of fq_codel and HTB), and then ask the kernel and iproute2 folks to merge what we have. Changes that improve performance will most likely still be possible in the future, even once cake is upstream…
I'm going to get shouted at for this, but I don't care.  There is one
more feature (ok 2 more features) I'd like to see.

1) Extend DSCP washing to include a 'washed DSCP value' per tin.  It's a
simple extension and allows washing from the default 'best effort' to
something else we can choose.  I've written about 25% of this; what I'm
struggling with is a reasonable way of passing 8 bytes (the DSCP code
for each tin) from userspace to kernel land.   The tc interface would be
simple (and horrible) in the sense that it'll be a hex string of the
required dscp codes for each tin.  Unless someone wishes to write a
better interface of course.  There will of course be an additional cpu
cost - picking up the 'washed to' value from memory.

2) dual flow isolation.
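
The hex-string interface described in item 1 could be parsed on the userspace side roughly as follows. This is only a sketch of the idea: the function name `parse_wash_dscp`, the fixed count of 8 tins, and the 16-digit format are assumptions drawn from the "8 bytes" mentioned above, not a description of any existing tc interface.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_TINS 8	/* assumed: one DSCP byte per tin */

/* Value of one hex digit, or -1 if c is not a hex digit (including NUL). */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Parse exactly 2 * NUM_TINS hex digits into one DSCP codepoint per tin.
 * Returns 0 on success, -1 on a bad digit, wrong length, or a value
 * above 63 (DSCP is a 6-bit field).  On failure dscp[] may be partly
 * written. */
int parse_wash_dscp(const char *s, uint8_t dscp[NUM_TINS])
{
	size_t i;

	for (i = 0; i < NUM_TINS; i++) {
		int hi = hex_nibble(s[2 * i]);
		/* only read the second digit if the first one was valid */
		int lo = hi < 0 ? -1 : hex_nibble(s[2 * i + 1]);
		int v;

		if (lo < 0)
			return -1;
		v = hi * 16 + lo;
		if (v > 63)
			return -1;
		dscp[i] = (uint8_t)v;
	}
	return s[2 * NUM_TINS] == '\0' ? 0 : -1;
}
```

Rejecting values above 63 in userspace keeps the kernel side down to a plain 8-byte copy, which is the part described above as the sticking point.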

>
>
> On Dec 6, 2015, at 15:53 , Dave Taht <dave.taht@gmail.com> wrote:
>
>> I find myself torn by 3 things.
>>
>> 1) The number of huge wins in fixing wifi far outweigh what we have
>> thus far achieved, or not achieved, in cake.
The distractions of crappy wifi and then the FCC débâcle haven't helped
the focus on Cake one bit.

Personally speaking, there's a lack of clear project 'lead' - I persist
in my assertion that I'm not a coder, certainly not a confident one. 
I've been reluctant to push commits to the repo with ideas/lunacy quite
frankly because I don't want to piss either you or Jonathan off by
messing with what I perceive to be 'his' code.  On the other hand, some
things have happened (and stuck!) because I just blew a raspberry in the
general direction, said 'stuff it' and pushed - anything that went into
a feature branch got ignored, anything in 'master' got at least compiled
and possibly even reviewed.   We should make better use of git &
(mis)feature branches.   But there's need for a 'Linus' here.  And a lot
more collaboration.


> 	As we say at home, “a sparrow in the hand is better than a pigeon on the roof” (“der Spatz in der Hand ist besser als die Taube auf dem Dach”); so given that cake is almost baked and wifi needs a lot more than simple go-faster stripes, maybe finishing cake while getting wifi improved is achievable?
>
>> 2) Science - Cake is like wet paint! There are knobs to fiddle, endless
>> tests to run, new ideas to try... measurements to take! papers to
>> write!
>>
>> 3) Engineering - I just want it to be *done*. It's been too long. It
>> was demonstrably faster than htb + fq_codel on weak hardware last
>> june, and handled GRO peeling, which were the two biggest "bugs" in
>> sqm I viewed we had.
> 	Two questions:
> 1) was it really faster for long enough tests (and has anybody accidentally looked at cpu temperatures once cake starts throttling)?
> 2) Bugs in sqm? I thought that cake’s raison d’être was not to improve sqm-scripts’ performance, but to make it simple for my mom to set up decent latency-conserving internet access. So performance gains are sugar on top, but lack thereof is not necessarily a merge stopper?
1) I suspect but don't *know* that the glitchy performance issue was a
manifestation of the incorrectly sized buffer bug (now well & truly
squished).  There were other odd symptoms associated with that bug too:
apparent changes in ECN marking, now behaving better.  I've run 40/10
rrul tests (wired) for 10 minutes and not seen any evidence of nasties. 
WiFi on the other hand......yes, let's leave that.

2) 'Apparent simplicity' is the phrase used on the bufferbloat cake page
and I like it!  Yes cake is 'complicated', but it's a darn sight easier
than a myriad of iptables rules & other carp that I don't understand.  I
also wonder if the focus on cpu usage is losing sight of offsets by not
having all those rules around.
>
>> In wearing these 3 hats, I would
>>
>> 3A) like to drop cake, personally, from something I needed to care about.
>> 3B) But, can't, because the profusion of features need to be fully evaluated.
>> In this test series: http://snapon.cs.kau.se/~d/bcake_tests/ neither
>> cake or bcake were "better" than the existing codel in any measurable
>> way, and in most cases, worse. bcake did mildly better at a short
>> (10ms) RTT... which was interesting.
> 	But since codel/fq_codel are hard to set up, especially in combination with a shaper, rough performance parity with HTB+fq_codel might be sufficient justification for a merge.
Agreed
>
>> If you want to take apart this batch with "flent", looking for
>> enlightenment, also, please go ahead.
>>
>> Were I to short circuit the science here, I'd rip out the sqrt cache
>> and fold back in mainline codel into cake. This would also have the
>> added benefit of also moving us back to 32bitland for various values
>> (tho "now" becomes a bit trickier) and hopefully improving cpu
>> efficiency a bit further (but this has to get done carefully unless
>> your head is good at 32 bit overflow math)
>>
>> Next up, a series testing the fq portions...
>>
>> If someone (else) would like to fork cake again and do the two things
>> above, I'd appreciate it.
Feature branch.  Beyond my cut'n'paste C I'm afraid.
>>
>> 3C) Most of the new statistics are pretty useless IMHO. Interesting,
>> but in the end I mostly care about drops and marks only.
> 	I do care about packet size (and max packet size). The kernel’s complicated rules about when and which overhead to add or not to add are so under-documented that one needs a way to figure out what information reaches the qdiscs/shapers; otherwise meaningful per-packet overhead accounting is not going to work. Max_packet size I see as the only way to check whether meta-packets hit the qdiscs or not. Even if cake never peels or always peels, this is informative in my opinion.
I put 'last packet' there simply because it was available nearby in the
code :-)  I agree with Sebastian that max_packet has been extremely
useful in *KNOWING* what overheads the kernel is passing into the
qdisc.  Compromise:  Drop 'last packet', maintain 'max_packet'.
>
>> 3D) Don't have a use for the rate estimator either, and so far the
>> dual queue idea has not materialized. I understand how it might be
>> done now - using the 8 way set associative thing per DEST hash, but I
>> don't really see the benefit of that vs just using a DEST hash in the
>> first place.
>>
>> 3E) Want cake to run as fast as possible on cheap hardware and be a
>> demonstrable win over htb + fq_codel - and upstream it and be done
>> with it.
> 	Being able to set up a decent shaper/codel combination in one line of tc is already a win (but I repeat myself)
You do.  And I agree for a 2nd (or is it 3rd) time.
>
>> 3F) At the moment I'm favoring peeling at the current quantum rather
>> than anything more elaborate.
> 	Why quantum, why not simply at MTU boundaries? I seem to recall that aggregates already carry information on how many MTU segments they consist of, which could be re-used?
>
>> 3G) really want the thing to work correctly down to 64k and up to at
>> least a gbit.
>> which needs testing... but probably after we pick a codel....
>>
>> 2A) As a science vehicle, there are many other things we could be
>> trying in cake, and I happen to like the idea of the (currently sqrt)
>> cache in for example, trying a faster curve at startup - or, as in the
>> ns2 code - a harder curve at say count + 128 or even earlier, as the
>> speed up in drops gets pretty tiny even at count + 16. (see attached)
>>
>> (it doesn't make much sense to calculate the sqrt at run time - you
>> can just calculate the constants elsewhere and plug them in, btw.
>> attached is a teeny proggie that does that and also explores a harder
>> initial curve (you'd jump count up to where it matched the curve when
>> you reverted to the invsqrt calculation) - and no, I haven't tried
>> plugging this in either... DANGER! Wet Paint!
>>
>> I also like keeping all the core values 64 bits, from a science perspective.
>>
>> There are also things like reducing the number of flows, and
>> exercising the 8 way associative cache more - to say 256, 128, or even
>> 32? Or relative to the bandwidth... or target setting...
>>
>> and I do keep wishing we could at the very least resolve the target >
>> mtu issue. std codel turns off with a single mtu outstanding. That
>> arguably should have been all that was needed...
>>
>> and then there's ecn...
>>
>> 1A) Boy do we have major problems with wifi, and major gains to be had
>> 1B) All the new platforms have bugs everywhere, including in the
>> ethernet drivers
>>
>> 0)
>>
>> So I guess it does come down to - what are the "musts" for cake before
>> it goes upstream?
> 	Get the feature set defined (potentially strip contentious features for the time being and merge them piecewise into the kernel proper) as well as a statistics set so the communication with tc is future proof enough for the near future. Then try to get it merged...
>
>> How much more work is required, by everybody, on
>> every topic, til that happens? Can we just fork off what is known to
>> work reasonably well, and let the rest evolve separately in a cake2?
> 	I was under the impression, that you and Toke are currently measuring the performance costs of the additional features so decisions which features to include could be made based on their cost?

It is difficult for the rest of us to help with the measuring if we
don't know how & what you're measuring.  Judging from recent comments
I'd say some code profiling is going on too.  I've no idea how to do
that...let alone on my router...but if there were some instruction I'm
willing to try on a non X86_64 platform.

I'd also like to see a lot more algorithm documentation.  I can read
(some) code but there are quite a few places in cake that are opaque to me:

1) The DRR soft shaper algorithm, how it relates to 'now'
2) Interaction with bandwidth thresholds
3) Difference between tin_quantum_band & tin_quantum_prio.

The code even contains the comment 'this is the priority soft-shaper magic'!

Kevin




* Re: [Cake] second system syndrome
  2015-12-07 12:24   ` Kevin Darbyshire-Bryant
@ 2015-12-20 12:47     ` Dave Taht
  2015-12-20 12:52       ` Dave Taht
  2015-12-20 13:51       ` moeller0
  0 siblings, 2 replies; 24+ messages in thread
From: Dave Taht @ 2015-12-20 12:47 UTC
  To: Kevin Darbyshire-Bryant, Jonathan Morton; +Cc: cake

Jonathan, please comment on the proposals I laid out in:

https://lists.bufferbloat.net/pipermail/cake/2015-December/000861.html



On Mon, Dec 7, 2015 at 1:24 PM, Kevin Darbyshire-Bryant
>>> I find myself torn by 3 things.
>>>
>>> 1) The number of huge wins in fixing wifi far outweigh what we have
>>> thus far achieved, or not achieved, in cake.
> The distractions of crappy wifi and then the FCC débâcle haven't helped
> the focus on Cake one bit.

Wifi is not a distraction for me. Crappy wifi was *why* I got involved
in the whole bufferbloat project. I was otherwise quite happily
retired.

https://www.facebook.com/photo.php?fbid=374680036761&set=t.1483968819&type=3&theater

Everything else I've tackled... was to chase funding, or chase ideas
that were working out on some front or another, or cope with the fears
of people I respected...

For 3 years now, we've had the ability to make a dent in some of
wifi's problems. That's over a billion wifi devices shipped that could
have had a better wifi stack. If I could just have one set of working
wifi devices connecting me to San Juan Del Sur, in heavy rain, I'd
be back at that pool, above, and logout from civilization again.

> Personally speaking, there's a lack of clear project 'lead' -

In my mind, Jonathan is and has always been the project lead for cake.
I was very happy when he signed up to do it, and went off in april to
try to finally pull together enough resources to tackle wifi.

One thing that went wrong with the cake project along the way was not
being able to incrementally test each new idea.

new rule: Thou Shalt NOT make changes to codel without being able to
test at a wide variety of RTTs.

And not being able to run perf on any architecture hurt too.

I got grumpy when I started seeing featuritis (and yes, I contributed
to this too, guilt also mine) and algorithmic changes without any
realistic testing being done, and stepped in to try and fix these
things.

perf's fixed.

I emphatically want to bail on me even thinking about cake. It is not
my job to get it upstream, it is not my job to make it better. I think
I have shown conclusively (by doing bcake) that a multitude of
features and code bloat don't actually accomplish anything.

There are still no realistic tests of very low bandwidth behavior...
(which I've asked toke to poke into)

>I persist
> in my assertion that I'm not a coder, certainly not a confident one.
> I've been reluctant to push commits to the repo with ideas/lunacy quite
> frankly because I don't want to piss either you or Jonathan off by
> messing with what I perceive to be 'his' code.

I never feel that way about things. It's just "code". It is either
good or it isn't. Lots of things are worth trying. Most of which
aren't worth keeping. Always happy to see new ideas.

> On the other hand, some
> things have happened (and stuck!) because I just blew a raspberry in the
> general direction, said 'stuff it' and pushed - anything that went into
> a feature branch got ignored, anything in 'master' got at least compiled
> and possibly even reviewed.   We should make better use of git &
> (mis)feature branches.   But there's need for a 'Linus' here.  And a lot
> more collaboration.

Well, a dumazet would be more effective than a linus.

The architectural choice here is mostly *what to take away*, which is
nearly everything new in it, and get it upstream for further comments.

Which are the choices I'd like Jonathan to make, or at least comment
on, which I laid out in my original mail. *not my job*.

After the GRO code is proven I'd rip out sebastian's treasured last
packet size stat, too. I might rip out all the dscp models in favor of
the 3 tier sqm one. I'd go back to 32 bits for codel.

My cup overfloweth. My job at the moment is to move the bloat related
sites elsewhere before isc turns the power off next week.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-20 12:47     ` Dave Taht
@ 2015-12-20 12:52       ` Dave Taht
  2015-12-21  9:02         ` moeller0
  2015-12-20 13:51       ` moeller0
  1 sibling, 1 reply; 24+ messages in thread
From: Dave Taht @ 2015-12-20 12:52 UTC (permalink / raw)
  To: Kevin Darbyshire-Bryant, Jonathan Morton; +Cc: cake

the 200/20mbit tests I ran yesterday.

http://snapon.cs.kau.se/~d/ptests/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-20 12:47     ` Dave Taht
  2015-12-20 12:52       ` Dave Taht
@ 2015-12-20 13:51       ` moeller0
  1 sibling, 0 replies; 24+ messages in thread
From: moeller0 @ 2015-12-20 13:51 UTC (permalink / raw)
  To: Dave Täht; +Cc: cake

> [...]
> After the GRO code is proven I'd rip out sebastian's treasured last
> packet size stat, too.

	Sad ;).  But didn’t your numbers comparing bcake versus cake show that all of the additional statistics cost nothing, at least on the arm platform? I know that my focus on per-packet overhead looks a bit peculiar, but since there was a niche at the time, I occupied it ;) and now I am going to defend my turf.
Could I argue here with an appeal to authority then, and make cake report the same stats as fq_codel does (I assume Eric had good reasons for each of the stats; it is purely accidental that fq_codel reports maxpacket ;) ):

qdisc fq_codel 130: dev ifb4pppoe-ge00 parent 1:13 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn 
 Sent 1326934 bytes 974 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 1538 drop_overlimit 0 new_flow_count 37 ecn_mark 0
  new_flows_len 1 old_flows_len 7 
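
For anyone who wants to compare these counters across interfaces, here is a minimal sketch that pulls the fq_codel-specific values out of a pasted statistics dump like the one above. The function name and the list of counters are illustrative assumptions taken from this one sample, not from any tc parsing library.

```python
import re

# Counter names exactly as they appear in the fq_codel statistics above.
COUNTERS = ("maxpacket", "drop_overlimit", "new_flow_count",
            "ecn_mark", "new_flows_len", "old_flows_len")

def parse_fq_codel_counters(text):
    """Return {counter_name: int} for every known counter found in text."""
    pattern = r"\b(" + "|".join(COUNTERS) + r")\s+(\d+)"
    return {name: int(value) for name, value in re.findall(pattern, text)}

sample = """\
qdisc fq_codel 130: dev ifb4pppoe-ge00 parent 1:13 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 1326934 bytes 974 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 1538 drop_overlimit 0 new_flow_count 37 ecn_mark 0
  new_flows_len 1 old_flows_len 7
"""

counters = parse_fq_codel_counters(sample)
print(counters["maxpacket"])  # 1538
```

Matching on known names rather than on indentation keeps the generic `overlimits` counter on the "Sent" line from being mistaken for `drop_overlimit`.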


> I might rip out all the dscp models in favor of
> the 3 tier sqm one.

	Why 3? I actually want 4 in sqm (basically a thin network control layer (#1) on top of an elevated priority (#2) on top of best-effort (#3) and background (#4)). Why? I need a place to stuff a few things like PPP control packets into with minimal delay.

> I'd go back to 32 bits for codel.

	Why? Due to the cost on 32 bit platforms or just on principle?

> 
> My cup overfloweth. My job at the moment is to move the bloat related
> sites elsewhere before isc turns the power off next week.

	You rock; it is incredible that you basically carried most of the infrastructure and administration on top of pushing the de-bloat effort. All I can offer is simply to keep helping people startup with sqm-scripts and tinker a bit with the sqm-scripts machinery (basically “prototyping” stuff along until it catches Toke’s attention and he fixes things for good ;) ) My hope for cake always was, to make helping people get started easier, any additional better performance is sugar on top.

Best Regards
	Sebastian


> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-20 12:52       ` Dave Taht
@ 2015-12-21  9:02         ` moeller0
  2015-12-21 10:40           ` Dave Taht
  0 siblings, 1 reply; 24+ messages in thread
From: moeller0 @ 2015-12-21  9:02 UTC (permalink / raw)
  To: Dave Täht; +Cc: Kevin Darbyshire-Bryant, Jonathan Morton, cake

I had a quick look over these, both htb+fq_codel egress and bcake egress (both without perf) seem “contaminated” by a periodic process with a period of 50/8 = 6.25 seconds. Is this one of the cyclic probes measuring cpu load or so?
	BTW are you using simplest.qos or simple.qos for the htb+fq_codel test (or something unrelated to sqm)? I ask because we have a shipload of costly iptables/tc filter stuff only happening in simple.qos (while rural_be will not use any DSCPs besides 0 the filters should still cost a bit of CPU). I do not seem to be able to see any additional meta information from the flent files, probably PEBCAK on my side...

Best Regards
	Sebastian


> On Dec 20, 2015, at 13:52 , Dave Taht <dave.taht@gmail.com> wrote:
> 
> the 200/20mbit tests I ran yesterday.
> 
> http://snapon.cs.kau.se/~d/ptests/


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21  9:02         ` moeller0
@ 2015-12-21 10:40           ` Dave Taht
  2015-12-21 11:10             ` moeller0
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Taht @ 2015-12-21 10:40 UTC (permalink / raw)
  To: moeller0; +Cc: Kevin Darbyshire-Bryant, Jonathan Morton, cake

On Mon, Dec 21, 2015 at 10:02 AM, moeller0 <moeller0@gmx.de> wrote:
> I had a quick look over these, both htb+fq_codel egress and bcake egress (both without perf) seem “contaminated" by a periodic process with a period of 50/8 = 6.25 seconds. Is this one of the cyclic probes measuring cpu load or so?
>         BTW are you using simplest.qos or simple.qos for the htb+fq_codel test (or something unrelated to sqm)? I ask because we have a shipload of costly iptables/tc filter stuff only happening in simple.qos (while rural_be will not use any DSCPs besides 0 the filters should still cost a bit CPU). I do not seem to be able to see any additional meta information from the flent files, probably PEBCAK n my side...

simple.qos.

see also: https://github.com/dtaht/sch_cake/commit/a66ee4fa355a62633b34fd05834075ea294e3b79

Did not switch cake over to it...

I still do not see any reason for precedence or diffserv8 to exist,
and can barely cope with the idea of diffserv4.

>
> Best Regards
>         Sebastian
>
>
>> On Dec 20, 2015, at 13:52 , Dave Taht <dave.taht@gmail.com> wrote:
>>
>> the 200/20mbit tests I ran yesterday.
>>
>> http://snapon.cs.kau.se/~d/ptests/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21 10:40           ` Dave Taht
@ 2015-12-21 11:10             ` moeller0
  2015-12-21 12:00               ` Dave Taht
  0 siblings, 1 reply; 24+ messages in thread
From: moeller0 @ 2015-12-21 11:10 UTC (permalink / raw)
  To: Dave Täht; +Cc: Kevin Darbyshire-Bryant, Jonathan Morton, cake

Hi Dave,


> On Dec 21, 2015, at 11:40 , Dave Taht <dave.taht@gmail.com> wrote:
> 
> On Mon, Dec 21, 2015 at 10:02 AM, moeller0 <moeller0@gmx.de> wrote:
>> I had a quick look over these, both htb+fq_codel egress and bcake egress (both without perf) seem “contaminated" by a periodic process with a period of 50/8 = 6.25 seconds. Is this one of the cyclic probes measuring cpu load or so?
>>        BTW are you using simplest.qos or simple.qos for the htb+fq_codel test (or something unrelated to sqm)? I ask because we have a shipload of costly iptables/tc filter stuff only happening in simple.qos (while rural_be will not use any DSCPs besides 0 the filters should still cost a bit CPU). I do not seem to be able to see any additional meta information from the flent files, probably PEBCAK n my side...
> 
> simple.qos.

	Excellent, thanks for the information.

> 
> see also: https://github.com/dtaht/sch_cake/commit/a66ee4fa355a62633b34fd05834075ea294e3b79

	Don’t get me wrong I am always duly impressed by making things more efficient, but without actually looking at the code I would believe this is only called if a new diffserv regime is initialized? And if the initialization would take "a minute" I could not care less (for the same reason I am not sure why "target = interval >> 4” is such a big deal computationally wise; I understand its charm in getting rid of target as an explicit variable though).
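
As a rough illustration of the "target = interval >> 4" shortcut mentioned above: a right shift by 4 is an integer divide by 16, so no separate target variable has to be stored. This is only a sketch; the microsecond unit and the function name are assumptions, not taken from the cake source.

```python
def codel_target_from_interval(interval_us):
    """Derive target as 1/16 of interval using a shift (integer microseconds)."""
    return interval_us >> 4

interval_us = 100_000  # 100 ms, the interval shown in the fq_codel output earlier in the thread
target_us = codel_target_from_interval(interval_us)

print(target_us)                       # 6250, i.e. 6.25 ms
print(target_us == interval_us // 16)  # True
```

Note that 6.25 ms is not the same as the 5.0 ms target that fq_codel reports alongside a 100 ms interval, so the shortcut changes the ratio slightly as well as removing a variable.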

> 
> Did not switch cake over to it...
> 
> I still do not see any reason for precedence or diffserv8 to exist,
> and can barely cope with the idea of diffserv4.

	This is all beyond my "pay-grade”/area of expertise, but… 

	You convinced me long ago, that next to best-effort, it would be nice to have a way of saying packets are more or less important, so this is 3 tiers right there and exactly what we have in simple.qos. But it turns out packets, like animals I might add, are not all equal (and the simple “Four legs good, two legs bad” dichotomy is not sufficient either ;) ); some are just special and need special care: in the case of PPPoE links, all PPP administrative packets need* a priority above all else, as do the yet-to-be-coded ICMP latency-under-load probes. And since 3+1 equals 4, I will go and build a new DSCP3plus1.qos script (once I get around to it ;) ) (<Note to self>, since VoIP is the best candidate for elevated priority, make sure to scale this tier for an integer number of VoIP flows (which clock in at around 100Kbps each) and make this tier symmetric, do not scale ingress and egress differently in the same ratio as the link asymmetry; also tier 4 does not need to be exposed to client machines at all, this is an affair between the router and its uplink</Note to self>)
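
One possible reading of the note-to-self above, as a minimal sketch: size the elevated-priority tier from the slower direction, round it down to a whole number of ~100 Kbps VoIP flows, and use the same figure for ingress and egress. The 25% share and the function name are hypothetical; nothing like this exists in sqm-scripts as far as this thread shows.

```python
VOIP_FLOW_KBPS = 100  # approximate per-flow rate quoted in the note above

def voip_tier_kbps(ingress_kbps, egress_kbps, share_percent=25):
    """Return (flows, tier_kbps); the same tier rate is meant for both directions.

    share_percent is a hypothetical cap on how much of the slower
    direction the tier may claim.
    """
    slower = min(ingress_kbps, egress_kbps)
    budget = slower * share_percent // 100
    flows = budget // VOIP_FLOW_KBPS  # whole VoIP flows only
    return flows, flows * VOIP_FLOW_KBPS

print(voip_tier_kbps(200_000, 20_000))  # (50, 5000)
print(voip_tier_kbps(16_000, 1_000))    # (2, 200)
```

Sizing from the slower direction keeps the tier symmetric instead of scaling it with the link asymmetry.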

Diffserv8 and precedence I have no real opinion on, except that in my personal pet theory of DSCP best practices there are only three bits available anyway. (I lifted this idea from, or better "got inspired by", one of the RFCs cited on one of the bufferbloat lists, so I do not claim originality at all.) So a scheme with 8 tiers would be “complete”, not that this justifies additional complexity…

Best Regards
	Sebastian

*: actually "work best with", not "will not work without", highest priority


> 
>> 
>> Best Regards
>>        Sebastian
>> 
>> 
>>> On Dec 20, 2015, at 13:52 , Dave Taht <dave.taht@gmail.com> wrote:
>>> 
>>> the 200/20mbit tests I ran yesterday.
>>> 
>>> http://snapon.cs.kau.se/~d/ptests/
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
>> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21 11:10             ` moeller0
@ 2015-12-21 12:00               ` Dave Taht
  2015-12-21 13:05                 ` moeller0
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Taht @ 2015-12-21 12:00 UTC (permalink / raw)
  To: moeller0; +Cc: Kevin Darbyshire-Bryant, Jonathan Morton, cake

in terms of the pppoE stuff needing elevated priority, this is code
that doesn't exist already and is already handled by the fast/slow
queue stuff in fq_codel?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21 12:00               ` Dave Taht
@ 2015-12-21 13:05                 ` moeller0
  2015-12-21 15:36                   ` Jonathan Morton
  0 siblings, 1 reply; 24+ messages in thread
From: moeller0 @ 2015-12-21 13:05 UTC (permalink / raw)
  To: Dave Täht; +Cc: Kevin Darbyshire-Bryant, Jonathan Morton, cake

Hi Dave,

> On Dec 21, 2015, at 13:00 , Dave Taht <dave.taht@gmail.com> wrote:
> 
> in terms of the pppoE stuff needing elevated priority, this is code
> that doesn't exist already and is already handled by the fast/slow
> queue stuff in fq_codel?

So I am still working on and off on my pet idea of shaping on a physical interface (say ge00 in cerowrt parlance) instead of pppoe-ge00. In that case we actually see all PPPoE packets (like PADI, PADO and friends; see https://en.wikipedia.org/wiki/Point-to-point_protocol_over_Ethernet#PPPoE_Discovery_.28PPPoED.29) and since all other traffic relies on the status of the “PPP tunnel” these deserve the highest priority. Since they are relatively rare, current fq_codel+htb sort of works okay (well, I try steering everybody to set up sqm on the pppoe interface to actually avoid stepping into this mess), but the shaper will have no information about the maintenance packets at all and hence will slightly undershape the link (undershape as in not shaping sufficiently). As far as I can tell the PPPoE daemon uses LCP echoes to test the link state. These are only sent once per second (on cerowrt), but miss/drop 5 of them in a row and the PPP connection is torn down and reestablished, which takes a while, potentially long enough for all active flows to time out. Not nice… Especially since it is possible to drive sqm into drop-tail behavior by simply flooding it: do this for a few seconds and watch PPPoE disconnect...
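
To make the failure mode above concrete, here is a minimal sketch of the "5 missed echoes in a row" rule. It is a toy model, not pppd code; the function name is made up, and the default of 5 is simply the figure quoted above (pppd exposes comparable knobs as `lcp-echo-interval` and `lcp-echo-failure`, if memory serves).

```python
def ppp_link_survives(echo_replies, max_consecutive_failures=5):
    """Return False as soon as enough LCP echoes in a row go unanswered.

    echo_replies is an iterable of booleans, one per echo sent
    (True = reply received, False = echo or reply was dropped).
    """
    missed = 0
    for got_reply in echo_replies:
        missed = 0 if got_reply else missed + 1
        if missed >= max_consecutive_failures:
            return False  # the daemon would tear the session down here
    return True

# Four losses followed by a reply reset the counter; five in a row do not.
print(ppp_link_survives([False] * 4 + [True] + [False] * 4))  # True
print(ppp_link_survives([True] + [False] * 5))                # False
```

With one echo per second, a flood that forces drop-tail behaviour for five seconds is enough to hit the second case.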
	
The challenge of shaping on PPPoE instead, is that the kernel accounts different amounts of overhead for ingress and egress there. In case you wonder, fq_codel’s maxpacket statistics helped me figure that out by simply looking at the maxpacket sizes for pppoe-ge00 (maxpacket 1516) and ifb4pppoe-ge00 (maxpacket 1538), which for all I know is just a heuristic and not real proof. The numbers themselves are easily explained given that I have an overhead of 24 bytes specified: 1516-24 = 1492, which is the PPPoE payload (the 8 byte PPP + PPPoE header lives inside the classical ethernet MTU), and 1538-24 = 1514 (where the kernel helpfully added part of the ethernet overhead: 6 (dest MAC) + 6 (src MAC) + 2 (ether type), but forgot all about the other ethernet details worth 24 more bytes on the wire).
	All of which makes me wish the kernel would leave the overhead handling completely to user space, because it clearly is doing something complicated (at least given the scarcity of information besides RTF source code). I guess I can fix^Wwork-around this by teaching sqm to handle ingress and egress overhead independently.
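
The arithmetic above, spelled out as a small sketch. The constants are the numbers quoted in this mail; the helper at the end is a hypothetical illustration of "independent ingress and egress overhead", not existing sqm-scripts behaviour, and it inherits the caveat that maxpacket is only a heuristic.

```python
CONFIGURED_OVERHEAD = 24     # bytes of overhead specified in sqm
ETHERNET_HEADER = 6 + 6 + 2  # dest MAC + src MAC + ether type
PPPOE_PPP_HEADER = 8         # PPP + PPPoE header inside the ethernet MTU

egress_maxpacket = 1516      # observed on pppoe-ge00
ingress_maxpacket = 1538     # observed on ifb4pppoe-ge00

egress_seen = egress_maxpacket - CONFIGURED_OVERHEAD    # 1492
ingress_seen = ingress_maxpacket - CONFIGURED_OVERHEAD  # 1514
extra_on_ingress = ingress_seen - egress_seen           # 22

print(egress_seen, ingress_seen, extra_on_ingress)
# The 22 byte gap happens to equal the ethernet header plus the PPPoE/PPP header.
print(extra_on_ingress == ETHERNET_HEADER + PPPOE_PPP_HEADER)  # True

def ingress_overhead_for(egress_overhead):
    """Hypothetical: overhead to configure on ingress so that both
    directions account the same size for the same packet."""
    return egress_overhead - extra_on_ingress

print(ingress_overhead_for(CONFIGURED_OVERHEAD))  # 2
```

Whether 24 bytes is the right egress figure for a given link is a separate question; the sketch only shows that the two directions need different values.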

Best Regards
	Sebastian

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21 13:05                 ` moeller0
@ 2015-12-21 15:36                   ` Jonathan Morton
  2015-12-21 18:19                     ` moeller0
  2015-12-22 22:30                     ` Kevin Darbyshire-Bryant
  0 siblings, 2 replies; 24+ messages in thread
From: Jonathan Morton @ 2015-12-21 15:36 UTC (permalink / raw)
  To: moeller0; +Cc: Dave Täht, Kevin Darbyshire-Bryant, cake

Addressing the subject line of this thread directly:

I intended, from the start, for Cake to integrate the best available algorithms for shaping, AQM, flow isolation, and priority queuing.  This has, of necessity, resulted in Cake becoming larger and more complex than most other qdiscs.  I do agree that it’s important to avoid *needless* complexity, for performance reasons if none other.

Shaping, AQM, flow isolation and priority queuing are Cake’s core feature set.  I’m not going to remove or cripple any of those functions in the name of elegance - that would be counter-productive.  So features that directly contribute to those functions - such as the packet-size compensation which the shaper relies upon, and the GSO peeling which the packet-size compensator relies upon - are also clearly within scope, even though they might complicate the configuration interface.

The “dual flow isolation”, which after a great deal of thought I’ve decided to rename “triple flow isolation” (for reasons that will become clear later), is within scope because it improves the flow-isolation feature.  If and when I can get my head around all the little changes that keep (potentially) breaking everything behind my back, I’ll be able to actually implement it some day.

It could reasonably be argued that some logic should be moved out of kernel space and into userspace, and/or vice versa.  However, we need concrete justifications for doing so, and a certain level of self-consistency.  We also need *really* strong justification before making optimisations which make future adjustments more difficult.

One such justification might be to more clearly separate the user-visible link characteristics (such as bandwidth, inherent latency and encapsulation) from the algorithm implementation parameters required to make best use of it (such as interval, target, etc).  These are parameters that are calculated at configuration time, so they do not need to be especially fast.

The most difficult function to define and scope has been the priority queue, due in no small part to the hilariously weak specification that is Diffserv.  I have tried to carve a fresh path of clarity here, interpreting the existing specifications into a coherent implementation in the hope that it’ll actually get used as such in future.

In that context, the “squash” and “wash” features truly baffle me.  I’d prefer to see them both gone.  Either you want to *use* Diffserv, in which case downstream networks might also benefit from the same markings, or you want to *ignore* it, in which case you might as well leave the existing marks alone, or you want to do something more sophisticated that Cake’s core feature set doesn’t support.

In short, Cake is *not* the right place to change the DSCP field.  A separate qdisc could be written to do that job with minimal overhead and a dedicated configuration interface, and be inserted either before or after Cake as required.  Or we could wait for pre-ingress-qdisc firewall rules to become available.

The AQM layer is also apparently a source of controversy.  Codel is really, *really* good at dealing with single, well-behaved TCP flows in the congestion-avoidance phase.  It’s also really, *really* bad at dealing with unresponsive UDP flows, and somewhat mediocre at dealing with TCP in slow-start or multiple TCP flows in the same queue.  This is a problem that we need to address, one way or another - whether by modifying Codel or finding some way to switch between two or more different AQM schemes based on traffic characteristics.  In the meantime, we have to experiment.

I am happy to see optimisations go in, as long as they’re clearly beneficial, don’t change the logic, and don't preclude changes to areas that are not entirely settled.  For anything else, please make it a separate branch with a pull request, so that I don’t have to be distracted by reverting stuff that doesn’t work.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21 15:36                   ` Jonathan Morton
@ 2015-12-21 18:19                     ` moeller0
  2015-12-21 20:36                       ` Jonathan Morton
  2015-12-22 22:30                     ` Kevin Darbyshire-Bryant
  1 sibling, 1 reply; 24+ messages in thread
From: moeller0 @ 2015-12-21 18:19 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Dave Täht, Kevin Darbyshire-Bryant, cake

Hi Jonathan,


> On Dec 21, 2015, at 16:36 , Jonathan Morton <chromatix99@gmail.com> wrote:
[...]
> The “dual flow isolation”, which after a great deal of thought I’ve decided to rename “triple flow isolation” (for reasons that will become clear later), is within scope because it improves the flow-isolation feature.  If and when I can get my head around all the little changes that keep (potentially) breaking everything behind my back, I’ll be able to actually implement it some day.

	From supporting users on the openwrt forums I know that an easy way to compartmentalize BitTorrent traffic would find lots of users. And if in an initial implementation a host with too many flows suffers increasing delay under load, that might not be too bad, given that BitTorrent can use this added delay to throttle itself back. It is not about perfect here, just about good enough.

[...]
> 
> The most difficult function to define and scope has been the priority queue, due in no small part to the hilariously weak specification that is Diffserv.  I have tried to carve a fresh path of clarity here, interpreting the existing specifications into a coherent implementation in the hope that it’ll actually get used as such in future.

	I would argue differently, instead of trying to make lemonade why not collect the few actual constraints (probably 0=BE, 1=BK and XX=EF should be sufficient to account for all actual markings encountered in a typical home) and come up with something simple that will just work? Simplicity wise I believe the classic precedence 3-bit patterns are simple enough, but they fail the CS1=BK test...
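
A minimal sketch of the point about precedence failing the CS1=BK test, assuming "1=BK" above means CS1. The DSCP values are the standard code points (CS0 = 0, CS1 = 8, EF = 46); the three-tier function is a hypothetical illustration, not what cake or sqm-scripts implement.

```python
CS0, CS1, EF = 0, 8, 46  # standard DSCP code points

def precedence(dscp):
    """Classic IP precedence: the top three bits of the 6-bit DSCP."""
    return dscp >> 3

def three_tier(dscp):
    """Hypothetical minimal mapping: CS1 is background, EF is priority."""
    if dscp == CS1:
        return "background"
    if dscp == EF:
        return "priority"
    return "best-effort"

# Pure precedence ordering ranks CS1 above best-effort ...
print(precedence(CS0), precedence(CS1), precedence(EF))  # 0 1 5
# ... while the intended treatment puts it below.
print(three_tier(CS0), three_tier(CS1), three_tier(EF))
```

So a scheduler that simply sorts by the three precedence bits would give background traffic a boost over best-effort, which is the opposite of what CS1=BK asks for.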

> 
> In that context, the “squash” and “wash” features truly baffle me.  I’d prefer to see them both gone.  Either you want to *use* Diffserv, in which case downstream networks might also benefit from the same markings, or you want to *ignore* it, in which case you might as well leave the existing marks alone, or you want to do something more sophisticated that Cake’s core feature set doesn’t support.

	A home router is a typical DSCP-domain, so clearing internally valid marks on network egress seems quite prudent, no?  Also it seems not a bad idea to use DSCP at home to push BitTorrent into BK without giving an ISP a convenient marker to drop packets, but since I rarely use BitTorrent I am making this argument up...

> 
> In short, Cake is *not* the right place to change the DSCP field.  A separate qdisc could be written to do that job with minimal overhead and a dedicated configuration interface, and be inserted either before or after Cake as required.  Or we could wait for pre-ingress-qdisc firewall rules to become available.

	Both of these are decidedly not-easy, and as far as I am concerned cake’s reason to be is to make commonly wanted but due to the involved complexity rarely configured home-network setups easy. So if cake would offer this we can make non experts use it, which I would count as a win...

> 
> The AQM layer is also apparently a source of controversy.  Codel is really, *really* good at dealing with single, well-behaved TCP flows in the congestion-avoidance phase.  It’s also really, *really* bad at dealing with unresponsive UDP flows, and somewhat mediocre at dealing with TCP in slow-start or multiple TCP flows in the same queue.  This is a problem that we need to address, one way or another - whether by modifying Codel or finding some way to switch between two or more different AQM schemes based on traffic characteristics.  In the meantime, we have to experiment.

	The beauty of fq, as I understand, is, given enough flows, the misbehaving flows will mainly help themselves to a healthy portion of delay before the dropper ramps up, no?

Best Regards
	Sebastian

> 
> I am happy to see optimisations go in, as long as they’re clearly beneficial, don’t change the logic, and don't preclude changes to areas that are not entirely settled.  For anything else, please make it a separate branch with a pull request, so that I don’t have to be distracted by reverting stuff that doesn’t work.
> 
> - Jonathan Morton
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Cake] second system syndrome
  2015-12-21 18:19                     ` moeller0
@ 2015-12-21 20:36                       ` Jonathan Morton
  2015-12-21 21:19                         ` moeller0
       [not found]                         ` <8737uukf7z.fsf@toke.dk>
  0 siblings, 2 replies; 24+ messages in thread
From: Jonathan Morton @ 2015-12-21 20:36 UTC (permalink / raw)
  To: moeller0; +Cc: Dave Täht, Kevin Darbyshire-Bryant, cake

>> The “dual flow isolation”, which after a great deal of thought I’ve decided to rename “triple flow isolation” (for reasons that will become clear later), is within scope because it improves the flow-isolation feature.  If and when I can get my head around all the little changes that keep (potentially) breaking everything behind my back, I’ll be able to actually implement it some day.
> 
> 	From supporting users on the openwrt forums I know that an easy way to compartmentalize BitTorrent traffic would find lots of users. And if in an initial implementation a host with too many flows suffers increasing delay under load, that might not be too bad, given that BitTorrent can use this added delay to throttle itself back. It is not about perfect here, just about good enough.

True, and that’s why I’ve been trying to keep flow-isolation work at the front of my queue, at least as far as Cake is concerned.  But I don’t want to be taking steps backwards.  Since I’ve got a theoretical handle on what appears to be the *correct* solution, I don’t want to spend any time working on anything that’s obviously wrong.

>> The most difficult function to define and scope has been the priority queue, due in no small part to the hilariously weak specification that is Diffserv.  I have tried to carve a fresh path of clarity here, interpreting the existing specifications into a coherent implementation in the hope that it’ll actually get used as such in future.
> 
> 	I would argue differently, instead of trying to make lemonade why not collect the few actual constraints (probably 0=BE, 1=BK and XX=EF should be sufficient to account for all actual markings encountered in a typical home) and come up with something simple that will just work? Simplicity wise I believe the classic precedence 3-bit patterns are simple enough, but they fail the CS1=BK test…

I might try to revisit the DSCP-to-tin allocations later on.  There are several competing specifications of such allocations in the wild, and I might do better to align Cake with one of the more promising ones.  I’m also fairly keen to have a distinct “network control” class right at the top of the stack, which might result in five tins rather than four.

>> In that context, the “squash” and “wash” features truly baffle me.  I’d prefer to see them both gone.  Either you want to *use* Diffserv, in which case downstream networks might also benefit from the same markings, or you want to *ignore* it, in which case you might as well leave the existing marks alone, or you want to do something more sophisticated that Cake’s core feature set doesn’t support.
> 
> 	A home router is a typical DSCP-domain, so clearing internally valid marks on network egress seems quite prudent, no?

No.  That would be true only if you used “local use” or “experimental” DSCPs (or some other mapping incompatible with the published specifications) in your network.  Cake doesn’t understand those uses, so you should be doing your DSCP re-marking before traffic reaches Cake, if at all.

The DSCPs that Cake understands are those in the RFCs, which can be presumed to be widely understood (if not always widely used).  In particular, they’re consistent with other Cake users and the most obvious interpretation of the PHB specs.

> Also it seems not a bad idea to use DSCP at home to push BitTorrent into BK without giving an ISP a convenient marker to drop packets, but since I rarely use BitTorrent I am making this argument up…

If your ISP drops traffic based solely on the DSCP, you have bigger problems.

If your ISP drops BK traffic preferentially during congested periods, however, then they’re more-or-less doing the right thing.  Assuming they’re not permanently congested, that is.

>> In short, Cake is *not* the right place to change the DSCP field.  A separate qdisc could be written to do that job with minimal overhead and a dedicated configuration interface, and be inserted either before or after Cake as required.  Or we could wait for pre-ingress-qdisc firewall rules to become available.
> 
> 	Both of these are decidedly not-easy, and as far as I am concerned cake’s reason to be is to make commonly wanted but due to the involved complexity rarely configured home-network setups easy. So if cake would offer this we can make non experts use it, which I would count as a win…

In my book, non-experts should definitely *not* be using the “wash” feature.

Early re-marking of DSCP would however be useful on ingress, to deal with the plethora of networks out there which either fail to set appropriate DSCPs at origin, or re-mark them inappropriately en route.  In particular, it would be useful for ingress classification of BitTorrent traffic, since the user has a better chance of identifying this traffic by port number than the ISP does.

>> The AQM layer is also apparently a source of controversy.  Codel is really, *really* good at dealing with single, well-behaved TCP flows in the congestion-avoidance phase.  It’s also really, *really* bad at dealing with unresponsive UDP flows, and somewhat mediocre at dealing with TCP in slow-start or multiple TCP flows in the same queue.  This is a problem that we need to address, one way or another - whether by modifying Codel or finding some way to switch between two or more different AQM schemes based on traffic characteristics.  In the meantime, we have to experiment.
> 
> 	The beauty of fq, as I understand, is, given enough flows, the misbehaving flows will mainly help themselves to a healthy portion of delay before the dropper ramps up, no?

Yes, that’s true to the extent that I’m putting improved flow isolation above tuning AQM in my todo list.  Good flow isolation means that, in theory, AQM only needs to act to provide courtesy signals to responsive flows - which is exactly what Codel is good at.

However, if AQM can also act to keep unresponsive flows under control, that relieves the overflow handling logic from having to do that job full-time.  Although I did optimise that logic somewhat some time ago, it isn’t as slick or efficient as AQM could be in this role.  The difference between theory and practice...

 - Jonathan Morton



* Re: [Cake] second system syndrome
  2015-12-21 20:36                       ` Jonathan Morton
@ 2015-12-21 21:19                         ` moeller0
       [not found]                         ` <8737uukf7z.fsf@toke.dk>
  1 sibling, 0 replies; 24+ messages in thread
From: moeller0 @ 2015-12-21 21:19 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Dave Täht, Kevin Darbyshire-Bryant, cake

Hi Jonathan,

> On Dec 21, 2015, at 21:36 , Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>>> The “dual flow isolation”, which after a great deal of thought I’ve decided to rename “triple flow isolation” (for reasons that will become clear later), is within scope because it improves the flow-isolation feature.  If and when I can get my head around all the little changes that keep (potentially) breaking everything behind my back, I’ll be able to actually implement it some day.
>> 
>> 	From supporting users on the openwrt forums I know that an easy way to compartmentalize bittorrent traffic would find lots of users. And if in an initial implementation a host with too many flows suffers increasing delay under load, that might not be too bad, given that bitterness can use this added delay to throttle itself back. It is not about perfect here, just about good enough.
> 
> True, and that’s why I’ve been trying to keep flow-isolation work at the front of my queue, at least as far as Cake is concerned.  But I don’t want to be taking steps backwards.  Since I’ve got a theoretical handle on what appears to be the *correct* solution, I don’t want to spend any time working on anything that’s obviously wrong.

	As long as we are not missing out on “good enough” for a perfect solution, I am fine with waiting, except then I vote for getting cake upstream ASAP.

> 
>>> The most difficult function to define and scope has been the priority queue, due in no small part to the hilariously weak specification that is Diffserv.  I have tried to carve a fresh path of clarity here, interpreting the existing specifications into a coherent implementation in the hope that it’ll actually get used as such in future.
>> 
>> 	I would argue differently: instead of trying to make lemonade, why not collect the few actual constraints (probably 0=BE, 1=BK and XX=EF should be sufficient to account for all actual marking encountered in a typical home) and come up with something simple that will just work? Simplicity wise I believe the classic precedence 3-bit patterns are simple enough, but they fail the CS1=BK test…
> 
> I might try to revisit the DSCP-to-tin allocations later on.  There are several competing specifications of such allocations in the wild, and I might do better to align Cake with one of the more promising ones.  I’m also fairly keen to have a distinct “network control” class right at the top of the stack, which might result in five tins rather than four.

	Network control is tricky, because even in the home you need to control access to this tier strictly. That said, I want to play with a 4th tier in simple.qos for basically the same reason (specifically to keep PPP-LCP packets on highest priority); but in sqm, since we use tc filters/iptables to do the filtering, it is rather easy to only accept these marks from the router itself and I know the required bandwidth pretty well (less than 1 kbps) so this should work okay.
> 
>>> In that context, the “squash” and “wash” features truly baffle me.  I’d prefer to see them both gone.  Either you want to *use* Diffserv, in which case downstream networks might also benefit from the same markings, or you want to *ignore* it, in which case you might as well leave the existing marks alone, or you want to do something more sophisticated that Cake’s core feature set doesn’t support.
>> 
>> 	A home router is a typical DSCP-domain, so clearing internally valid marks on network egress seems quite prudent, no?
> 
> No.  That would be true only if you used “local use” or “experimental” DSCPs (or some other mapping incompatible with the published specifications) in your network.  Cake doesn’t understand those uses, so you should be doing your DSCP re-marking before traffic reaches Cake, if at all.

	All the published specs are hardly worth the bits they use up, as many ISPs simply re-map everything to zero… Really, local use or experimental are just an attempt to make the published schemes look more respectable ;)

> 
> The DSCPs that Cake understands are those in the RFCs, which can be presumed to be widely understood (if not always widely used).  In particular, they’re consistent with other Cake users and the most obvious interpretation of the PHB specs.

	I disagree: there is no real consensus in the field, AND there is the recommendation not to trust any unsolicited ingress DSCP markings (on networks that honor them), so there is zero benefit in trying to second-guess which scheme a specific ISP will honor (i.e. not re-map to zero). The sorry state of affairs is that, if anything, DSCP marks leaving a home net are an information leak... Really, to repeat myself and play devil’s advocate, we only need to honor CS0 and EF, and I bet almost nobody using DSCP in the home will notice by behavior alone ;) Or let me try again: all of the PHB descriptions come with their own agenda, and none of these agendas align well with what I believe to be a typical home network setting. Since the native scope of DSCP is a single network, all we need is to offer a scheme that works well in the network we are aiming for. Or, put differently again, none of the competing proposals so far has shown any “fitness” in the real world, so why select any of those to base a simple scheme on?
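
	To make the "simple scheme" concrete, here is a minimal sketch of what I have in mind. It only assumes the standard codepoint values (CS0 = 0, CS1 = 8, EF = 46); the class names and the classify() helper are hypothetical and are not cake's actual DSCP-to-tin table.

```python
# Hypothetical three-class scheme for a home network.
# Only the codepoint values are standard; everything else is illustrative.
CS0 = 0    # default, best effort
CS1 = 8    # commonly treated as "background"
EF = 46    # expedited forwarding

BEST_EFFORT = "best_effort"
BACKGROUND = "background"
PRIORITY = "priority"


def classify(dscp: int) -> str:
    """Map a 6-bit DSCP value to one of three classes."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    if dscp == CS1:
        return BACKGROUND
    if dscp == EF:
        return PRIORITY
    # CS0 and every codepoint we do not recognise are plain best effort.
    return BEST_EFFORT
```

	Anything not explicitly listed falls back to best effort, which is the whole point: unknown or untrusted marks cannot buy themselves priority.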

> 
>> Also it seems not a bad idea to use DSCP at home to push bitterness in the BK without giving an ISP a convenient marker to drop packets, but since I rarely use bitterness I am making this argument up…
> 
> If your ISP drops traffic based solely on the DSCP, you have bigger problems.
> 
> If your ISP drops BK traffic preferentially during congested periods, however, then they’re more-or-less doing the right thing.  Assuming they’re not permanently congested, that is.

	Arguably, but do I really want my bit-torrent (I just noticed autocorrect turning bit-torrent into bitterness...) to be dropped while my neighbor’s CS0-marked bit-torrent goes through? This is hypothetical; I rarely use torrents at all, so I really do not know.

> 
>>> In short, Cake is *not* the right place to change the DSCP field.  A separate qdisc could be written to do that job with minimal overhead and a dedicated configuration interface, and be inserted either before or after Cake as required.  Or we could wait for pre-ingress-qdisc firewall rules to become available.
>> 
>> 	Both of these are decidedly not-easy, and as far as I am concerned cake’s reason to be is to make commonly wanted but due to the involved complexity rarely configured home-network setups easy. So if cake would offer this we can make non experts use it, which I would count as a win…
> 
> In my book, non-experts should definitely *not* be using the “wash” feature.

	Because you believe in the idea that our DSCP marks might have use later on. But, for example, my ISP zeros out the TOS bits on IPv4 while keeping them on IPv6; this is a big mess… Let me explain: wash, as far as I can tell, allows the use of DSCP on egress without leaking unwanted information toward the upstream network, and hence is a good candidate for a default setting in my book. (Yes, I realize I have no leverage here, but that never stopped me from voicing an opinion ;) )
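
	As a rough sketch of what I mean by washing: the IPv4 TOS / IPv6 traffic class byte carries the DSCP in its upper six bits and ECN in its lower two, so clearing the mark is a single mask. I am assuming here that the ECN bits should survive; the helper names are made up for illustration and this is not cake's code.

```python
ECN_MASK = 0x03  # lower two bits of the TOS / traffic class byte


def split_tos(tos: int) -> tuple[int, int]:
    """Return (dscp, ecn) for an 8-bit TOS / traffic class value."""
    return tos >> 2, tos & ECN_MASK


def wash_tos(tos: int) -> int:
    """Zero the DSCP while keeping the ECN bits untouched."""
    return tos & ECN_MASK
```

	The classification would still be done on the original value; only the byte that leaves the interface gets washed.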

> 
> Early re-marking of DSCP would however be useful on ingress, to deal with the plethora of networks out there which either fail to set appropriate DSCPs at origin, or re-mark them inappropriately en route.  In particular, it would be useful for ingress classification of BitTorrent traffic, since the user has a better chance of identifying this traffic by port number than the ISP does.

	You are onto a great idea here, DSCP re-mapping on network entry and potentially zeroing on network egress.

> 
>>> The AQM layer is also apparently a source of controversy.  Codel is really, *really* good at dealing with single, well-behaved TCP flows in the congestion-avoidance phase.  It’s also really, *really* bad at dealing with unresponsive UDP flows, and somewhat mediocre at dealing with TCP in slow-start or multiple TCP flows in the same queue.  This is a problem that we need to address, one way or another - whether by modifying Codel or finding some way to switch between two or more different AQM schemes based on traffic characteristics.  In the meantime, we have to experiment.
>> 
>> 	The beauty of fq, as I understand, is, given enough flows, the misbehaving flows will mainly help themselves to a healthy portion of delay before the dropper ramps up, no?
> 
> Yes, that’s true to the extent that I’m putting improved flow isolation above tuning AQM in my todo list.  Good flow isolation means that, in theory, AQM only needs to act to provide courtesy signals to responsive flows - which is exactly what Codel is good at.
> 
> However, if AQM can also act to keep unresponsive flows under control, that relieves the overflow handling logic from having to do that job full-time.  Although I did optimise that logic somewhat some time ago, it isn’t as slick or efficient as AQM could be in this role.  The difference between theory and practice…

	Thanks for your thoughts
	Sebastian

> 
> - Jonathan Morton
> 



* Re: [Cake] second system syndrome
       [not found]                         ` <8737uukf7z.fsf@toke.dk>
@ 2015-12-22 15:34                           ` Jonathan Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Jonathan Morton @ 2015-12-22 15:34 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: moeller0, Kevin Darbyshire-Bryant, cake


> On 22 Dec, 2015, at 11:25, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> 
> Jonathan Morton <chromatix99@gmail.com> writes:
> 
>> However, if AQM can also act to keep unresponsive flows under control,
>> that relieves the overflow handling logic from having to do that job
>> full-time. Although I did optimise that logic somewhat some time ago,
>> it isn’t as slick or efficient as AQM could be in this role. The
>> difference between theory and practice...
> 
> I don't think relying on CoDel to curb unresponsive flows is the right
> thing to do.

I didn’t say Codel in this context; I said AQM.

However, I haven’t yet put a lot of deep thought into how we might switch between Codel and some other scheme.

 - Jonathan Morton



* Re: [Cake] second system syndrome
  2015-12-21 15:36                   ` Jonathan Morton
  2015-12-21 18:19                     ` moeller0
@ 2015-12-22 22:30                     ` Kevin Darbyshire-Bryant
  2015-12-23 11:43                       ` Dave Taht
  1 sibling, 1 reply; 24+ messages in thread
From: Kevin Darbyshire-Bryant @ 2015-12-22 22:30 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: moeller0, Dave Täht, cake




On 21/12/15 15:36, Jonathan Morton wrote:
> In that context, the “squash” and “wash” features truly baffle me.  I’d prefer to see them both gone.  Either you want to *use* Diffserv, in which case downstream networks might also benefit from the same markings, or you want to *ignore* it, in which case you might as well leave the existing marks alone, or you want to do something more sophisticated that Cake’s core feature set doesn’t support.
>
> In short, Cake is *not* the right place to change the DSCP field.  A separate qdisc could be written to do that job with minimal overhead and a dedicated configuration interface, and be inserted either before or after Cake as required.  Or we could wait for pre-ingress-qdisc firewall rules to become available.
Feel free to remove.  If squash shouldn't be there, then neither should
wash.  It was an idea to split out the use of diffserv markings vs the
clearing of them that the original "squash" option implemented into
"diffserv/squash/wash".  Bad ideas shouldn't be included so get rid of
it :-)

Kevin



* Re: [Cake] second system syndrome
  2015-12-22 22:30                     ` Kevin Darbyshire-Bryant
@ 2015-12-23 11:43                       ` Dave Taht
  2015-12-23 12:14                         ` Kevin Darbyshire-Bryant
  2015-12-23 12:27                         ` Jonathan Morton
  0 siblings, 2 replies; 24+ messages in thread
From: Dave Taht @ 2015-12-23 11:43 UTC (permalink / raw)
  To: Kevin Darbyshire-Bryant; +Cc: Jonathan Morton, moeller0, cake

squashing and washing are both well within the realm of ietf best
practices. It is highly desirable to have this level of simple control
on cake for inbound - for example, 90+% of all comcast traffic, no
matter how originally marked, comes in marked as background, and
should be changed to best effort on the gateway. Doing so with an
iptables rule is inefficient and it is difficult (along with the rest
of sqm-scripts) to hook into other people's iptables rules.

On outbound - frankly I was surprised remarking dns, ntp, and mosh as
AF42 worked as well as it did, there was only one place where it
failed (which resulted in the removal of AF42 marking to mosh
mainline) - MIT. There ARE a few indicators like background and IMM
that often survive end to end elsewhere.

more extensive remarking, as kevin also wants to do, seems complex.
Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi


On Tue, Dec 22, 2015 at 11:30 PM, Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
>
>
> On 21/12/15 15:36, Jonathan Morton wrote:
>> In that context, the “squash” and “wash” features truly baffle me.  I’d prefer to see them both gone.  Either you want to *use* Diffserv, in which case downstream networks might also benefit from the same markings, or you want to *ignore* it, in which case you might as well leave the existing marks alone, or you want to do something more sophisticated that Cake’s core feature set doesn’t support.
>>
>> In short, Cake is *not* the right place to change the DSCP field.  A separate qdisc could be written to do that job with minimal overhead and a dedicated configuration interface, and be inserted either before or after Cake as required.  Or we could wait for pre-ingress-qdisc firewall rules to become available.
> Feel free to remove.  If squash shouldn't be there, then neither should
> wash.  It was an idea to split out the use of diffserv markings vs the
> clearing of them that the original "squash" option implemented into
> 'diffserv/squash/wash".  Bad ideas shouldn't be included so get rid of
> it :-)
>
> Kevin
>


* Re: [Cake] second system syndrome
  2015-12-23 11:43                       ` Dave Taht
@ 2015-12-23 12:14                         ` Kevin Darbyshire-Bryant
  2015-12-23 12:27                         ` Jonathan Morton
  1 sibling, 0 replies; 24+ messages in thread
From: Kevin Darbyshire-Bryant @ 2015-12-23 12:14 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jonathan Morton, moeller0, cake




On 23/12/15 11:43, Dave Taht wrote:
> squashing and washing are both well within the realm of ietf best
> practices. It is highly desirable to have this level of simple control
> on cake for inbound - for example, 90+ of all comcast traffic, no
> matter how originally marked, comes in marked as background, and
> should be changed to best effort on the gateway. Doing so with an
> iptables rule is inefficient and it is difficult (along with the rest
> of sqm-scripts) to hook into other people's iptables rules.
>
> On outbound - frankly I was surprised remarking dns, ntp, and mosh as
> AF42 worked as well as it did, there was only one place where it
> failed (which resulted in the removal of AF42 marking to mosh
> mainline) - MIT. There ARE a few indicators like background and IMM
> that often survive end to end elsewhere.
>
> more extensive remarking, as kevin also wants to do, seems complex.
The idea was a 'simple' extension of washing, namely to be able to
specify the departing dscp for each tin, relying on the incoming tin
classification performed by cake - it wasn't ever going to be a full per
tin '64 dscp to 64 dscp' mapping table - 'easy' to implement, totally
insane to configure and much better handled by port/protocol awareness.
The original drive was the thought that it might be useful to be able to
specify a different default marking for 'squash/wash' than the
hard-coded default of '0'.
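
Roughly, the per-tin idea would look like the sketch below. The tin names, the table and the remark_tos() helper are all hypothetical, purely for illustration; the only assumptions are that the DSCP sits in the upper six bits of the TOS byte and that ECN should be left alone.

```python
ECN_MASK = 0x03  # lower two bits of the TOS / traffic class byte


def remark_tos(tos: int, tin: str, egress_dscp: dict[str, int],
               default: int = 0) -> int:
    """Rewrite the DSCP according to the tin the packet was classified into.

    Tins without an entry get `default`, which with default=0 behaves
    like a plain wash.
    """
    dscp = egress_dscp.get(tin, default)
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return (dscp << 2) | (tos & ECN_MASK)


# Example table: one departing DSCP per tin, everything else zeroed.
example_table = {"background": 8, "voice": 46}
```

That is one value per tin rather than a 64-entry map, which is about as much configuration as I would ever want to expose.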

If 'wash' is a silly idea, then it's a silly idea, end of story - (my
'c' has improved as a result of coding various little bits so it's a win
for me either way)   No matter, there are branches & matching pull
requests for removal of 'wash & squash' to be found in cake & tc github
repos - merge or delete I've no care :-)

Kevin



* Re: [Cake] second system syndrome
  2015-12-23 11:43                       ` Dave Taht
  2015-12-23 12:14                         ` Kevin Darbyshire-Bryant
@ 2015-12-23 12:27                         ` Jonathan Morton
  2015-12-23 12:41                           ` Dave Taht
  1 sibling, 1 reply; 24+ messages in thread
From: Jonathan Morton @ 2015-12-23 12:27 UTC (permalink / raw)
  To: Dave Taht; +Cc: Kevin Darbyshire-Bryant, moeller0, cake


> On 23 Dec, 2015, at 13:43, Dave Taht <dave.taht@gmail.com> wrote:
> 
> squashing and washing are both well within the realm of ietf best
> practices. It is highly desirable to have this level of simple control
> on cake for inbound - for example, 90+ of all comcast traffic, no
> matter how originally marked, comes in marked as background, and
> should be changed to best effort on the gateway. Doing so with an
> iptables rule is inefficient and it is difficult (along with the rest
> of sqm-scripts) to hook into other people's iptables rules.

My argument is simple: if you care enough about Diffserv to notice Comcast meddling with it, you almost certainly want to do something *much* more sophisticated than just “wash” or “squash”.

The one exception I can think of is where the untrusted DSCP interacts badly with some congested internal-network equipment - such as a long wifi link - that can’t be reconfigured to ignore untrusted DSCPs.  So I can see why you specifically are interested in the “squash" feature, but pretty much nobody else will be.  “Wash” is just plain wrong.

Given that this entire thread is about concern over scope creep, changing the DSCP is my line in the sand.  Doing it properly is out of scope, so doing it *at all* is *also* out of scope.

What “wash” and “squash” have done, though, is to show that changing the DSCP in a qdisc can be done.  Hence, it should be straightforward to implement a qdisc, which can be stacked with Cake or any other Diffserv-aware qdisc, that covers the common use-cases properly and without the major overhead of firewall rules (which, in any case, apply too late to be useful on ingress).  I have some ideas, which I intend to detail separately.

As a compromise, I’ll leave “squash” and “wash” in place (since they’re already there) until a better solution is available.

 - Jonathan Morton



* Re: [Cake] second system syndrome
  2015-12-23 12:27                         ` Jonathan Morton
@ 2015-12-23 12:41                           ` Dave Taht
  2015-12-23 13:06                             ` Jonathan Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Taht @ 2015-12-23 12:41 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Kevin Darbyshire-Bryant, moeller0, cake

On Wed, Dec 23, 2015 at 1:27 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 23 Dec, 2015, at 13:43, Dave Taht <dave.taht@gmail.com> wrote:
>>
>> squashing and washing are both well within the realm of ietf best
>> practices. It is highly desirable to have this level of simple control
>> on cake for inbound - for example, 90+ of all comcast traffic, no
>> matter how originally marked, comes in marked as background, and
>> should be changed to best effort on the gateway. Doing so with an
>> iptables rule is inefficient and it is difficult (along with the rest
>> of sqm-scripts) to hook into other people's iptables rules.
>
> My argument is simple: if you care enough about Diffserv to notice Comcast meddling with it, you almost certainly want to do something *much* more sophisticated than just “wash” or “squash”.

Preserving nearly all traffic as mis-marked as background does bad
things to 802.11e.

> The one exception I can think of is where the untrusted DSCP interacts badly with some congested internal-network equipment - such as a long wifi link - that can’t be reconfigured to ignore untrusted DSCPs.  So I can see why you specifically are interested in the “squash" feature, but pretty much nobody else will be.  “Wash” is just plain wrong.
>
> Given that this entire thread is about concern over scope creep, changing the DSCP is my line in the sand.  Doing it properly is out of scope, so doing it *at all* is *also* out of scope.
>
> What “wash” and “squash” have done, though, is to show that changing the DSCP in a qdisc can be done.  Hence, it should be straightforward to implement a qdisc, which can be stacked with Cake or any other Diffserv-aware qdisc, that covers the common use-cases properly and without the major overhead of firewall rules (which, in any case, apply too late to be useful on ingress).  I have some ideas, which I intend to detail separately.
>
> As a compromise, I’ll leave “squash” and “wash” in place (since they’re already there) until a better solution is available.

Great. Let's move on.

Are you actually testing your codel changes at longer RTTs?

>  - Jonathan Morton
>


* Re: [Cake] second system syndrome
  2015-12-23 12:41                           ` Dave Taht
@ 2015-12-23 13:06                             ` Jonathan Morton
  2015-12-23 14:58                               ` Dave Taht
  0 siblings, 1 reply; 24+ messages in thread
From: Jonathan Morton @ 2015-12-23 13:06 UTC (permalink / raw)
  To: Dave Taht; +Cc: Kevin Darbyshire-Bryant, moeller0, cake


> On 23 Dec, 2015, at 14:41, Dave Taht <dave.taht@gmail.com> wrote:
> 
> Are you actually testing your codel changes at longer RTTs?

The latest set has, so far, only been tested on an ordinary Internet link.  It produced an immediately noticeable improvement on ingress there, implying that it’s controlling the upstream queue better.

I’m reasonably confident, however, that it’ll stand up to more varied tests as well.  Fundamentally, it returns to the standard Codel trigger mechanism under most circumstances, since that’s been shown to work well.

It also adds a new wrinkle that only appears when the queue is growing very quickly, to trigger the signalling early when it’s abundantly clear that it *will* inevitably trigger later.  I think this will particularly help cope with TCP slow-start or, on a long-RTT path, with RTT-independent TCPs like CUBIC.
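
For reference, the standard Codel trigger mentioned above can be sketched roughly as follows. This is only the textbook mechanism with the usual default constants (5 ms target, 100 ms interval), written as a simplified illustration with made-up names; it is neither Cake's code nor the early-trigger change described here.

```python
import math

TARGET = 0.005    # seconds of acceptable standing queue delay
INTERVAL = 0.100  # seconds; roughly a worst-case expected RTT


class CodelTrigger:
    """Decide when queue delay has been too high for too long."""

    def __init__(self) -> None:
        self.first_above_time = None  # deadline, set when delay first exceeds TARGET

    def should_signal(self, sojourn: float, now: float) -> bool:
        if sojourn < TARGET:
            # Delay dipped below target: forget the pending deadline.
            self.first_above_time = None
            return False
        if self.first_above_time is None:
            # First packet above target: start the clock, do not signal yet.
            self.first_above_time = now + INTERVAL
            return False
        # Signal only once delay has stayed above target for a full interval.
        return now >= self.first_above_time


def next_signal_time(last: float, count: int) -> float:
    """Control law: successive signals get closer as count grows."""
    return last + INTERVAL / math.sqrt(count)
```

The point of the interval wait is that a short burst never triggers anything, which is exactly why a very fast-growing queue can get quite deep before the first signal goes out.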

Feel free to run your own tests.  I need to sleep.

 - Jonathan Morton


* Re: [Cake] second system syndrome
  2015-12-23 13:06                             ` Jonathan Morton
@ 2015-12-23 14:58                               ` Dave Taht
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Taht @ 2015-12-23 14:58 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Kevin Darbyshire-Bryant, moeller0, cake

On Wed, Dec 23, 2015 at 2:06 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 23 Dec, 2015, at 14:41, Dave Taht <dave.taht@gmail.com> wrote:
>>
>> Are you actually testing your codel changes at longer RTTs?
>
> The latest set has, so far, only been tested on an ordinary Internet link.  It produced an immediately noticeable improvement on ingress there, implying that it’s controlling the upstream queue better.
>
> I’m reasonably confident, however, that it’ll stand up to more varied tests as well.  Fundamentally, it returns to the standard Codel trigger mechanism under most circumstances, since that’s been shown to work well.
>
> It also adds a new wrinkle that only appears when the queue is growing very quickly, to trigger the signalling early when it’s abundantly clear that it *will* inevitably trigger later.  I think this will particularly help cope with TCP slow-start or, on a long-RTT path, with RTT-independent TCPs like CUBIC.
>
> Feel free to run your own tests.  I need to sleep.

The testbed is closed for the holidays, and unless toke logs in, lab
testing will not resume until jan 12th. I might, today, do a bit on
your latest code, but I am mostly trying to finish the server moves
before taking off for the holidays myself.

It does strike me that you are perhaps overly concerned about slow
start behavior in general.

The load spikes exhibited by many of the "rrul" derived tests in the
flent suite are artifacts of starting too many flows at almost exactly
the same time. In a normal congested scenario we would have a
saturated link, with stabilized codel values for a set of flows, and
short flows coming and going on a regular basis.

We have a few tests that do something saner, like the tcp_2up_delay test,
as well as the web tests, which take some setup to have running, as
does the rrul_voip test.

Some of the lab results are colored by using an older version of
netperf which does not restart the udp flows after a loss for 250ms -
the voip tests are a better indicator of what loss is like for
isochronous flows, and honestly I wish we used isochronous rather than
ping based tests for all of the rrul stuff.

Certainly there are issues with doing the massive overload tests like
the 50down one, notably with admission control (some tests simply
can't start with that much congestion), and with ecn (ecn clogs up the
pipe way worse and makes it even harder to start a new flow as the
various aqm algorithms scale up to a higher "drop" rate than desirable
 - ecn has mass, I've always said)

I put out the list of existing flent servers earlier (which are a bit
underconfigured), in the hope that more would do measurements rather
than go by feel - and also of use are the new queue depth things which
toke and I put into flent over the last month or so.

>  - Jonathan Morton
