* [Ecn-sane] tsvwg preso for sce is up
From: Dave Taht @ 2019-07-30 16:38 UTC
To: ECN-Sane
SCE: https://www.youtube.com/watch?v=FDK88vdE5r0&t=1h15m
A couple notes:
At 1:25:20 - Mirja had asked what the SCE marking threshold was, not
the Codel parameters (I think). My read is that she wanted to know the
sce_threshold?
At 1:27, I'd really love the Flent files to be able to zoom in on that
data - it's unclear how low the ping overhead is. A 100Mbit result
using native ethernet and BQL at the bottleneck, instead of Cake with a
rate limit, might be interesting. Gbit also... (I have 10gbit in my lab)
At 1:27:49 Gorry said "This looks like FQ", and no, as Jonathan pointed
out it's the real convergence of two SCE-AQMed flows at 50Mbit and 80ms
RTT, which takes 45 seconds. That brought to mind: what is your
intuition? What would you expect for convergence using FQ? And what is
it, actually?
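For my own intuition, here's the back-of-envelope I keep in my head (my
numbers and assumptions, nothing from the slides):

    # Back-of-envelope for two flows sharing 50Mbit at 80ms RTT.
    # My assumptions: Reno-ish congestion avoidance (+1 MSS per RTT),
    # 1500-byte packets, slow start roughly doubling per RTT from IW10.
    from math import ceil, log2

    rate_fair = 25e6            # bits/s, half of a 50Mbit link
    rtt = 0.080                 # seconds
    mss = 1500 * 8              # bits

    fair_cwnd = rate_fair * rtt / mss
    print(f"fair-share cwnd ~ {fair_cwnd:.0f} packets")          # ~167

    # Single shared queue: the late flow adds ~1 packet per RTT, and the
    # two flows only equalize over several sawtooth cycles.
    print(f"one linear ramp to fair share ~ {fair_cwnd * rtt:.0f} s")

    # FQ: the new flow gets its own queue immediately and only needs to
    # slow-start up to its own fair share.
    ramp_rtts = ceil(log2(fair_cwnd / 10))
    print(f"slow-start ramp ~ {ramp_rtts} RTTs ~ {ramp_rtts * rtt * 1000:.0f} ms")

(The second number is the kind of thing I'd expect FQ to make possible,
but I'd love to see it actually measured.)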
At 1:31:18 - one thing long since vanished from the L4S debate is that
Codel achieves a ~5ms queue depth, where PIE only gets to 16ms. The need
for "ultra-low-latency" is smaller when you already get that kind of
result from your AQM in most cases.
I've always felt that PIE could be improved - the principal flaws being
the rate estimator and the update interval. fq-pie on BSD borrows
Codel's rate estimator.
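For anyone following along, the core of pie is roughly a PI controller
on queue delay, updated on a fixed timer - something like this sketch
(treat the constants as illustrative placeholders rather than the exact
RFC 8033 / DOCSIS-PIE defaults):

    # Sketch of a PIE-style PI controller on queue delay.  Constants are
    # placeholders for illustration.  The point is the structure: the drop
    # probability gets nudged by the error vs. the target *and* by the
    # recent trend, once per (fairly coarse) update interval.
    def pie_update(p, qdelay, qdelay_old, target=0.015, alpha=0.125, beta=1.25):
        p += alpha * (qdelay - target) + beta * (qdelay - qdelay_old)
        return min(max(p, 0.0), 1.0)

    p, old = 0.0, 0.0
    for qdelay in (0.005, 0.020, 0.040, 0.030, 0.015):   # seconds, made up
        p = pie_update(p, qdelay, old)
        old = qdelay
        print(f"qdelay {qdelay*1000:4.0f} ms -> drop probability {p:.3f}")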
But I digress.
Flent has a default sample rate of 200ms, which means that it can miss
some details. You can sample instead at rates as low as 20ms, although
this is murder on your local CPU and can heisenbug the tests. It's
generally a good idea to sample at double the rate you care about
(Nyquist theorem), so a 40ms sample rate here would have shown more
detail.
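Quick sanity check on those numbers (just arithmetic, nothing
flent-specific):

    # Nyquist-ish rule of thumb: to resolve dynamics on a time scale T you
    # want a sample period of at most T/2.
    for step_ms in (200, 40, 20):
        print(f"{step_ms:3d} ms step -> resolves behaviour slower than ~{2*step_ms} ms")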
If you really, really want more detail than that, packet captures are
a way to go. Got any?
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
* Re: [Ecn-sane] tsvwg preso for sce is up
From: Jonathan Morton @ 2019-07-31 0:42 UTC
To: Dave Taht; +Cc: ECN-Sane
> On 30 Jul, 2019, at 5:38 pm, Dave Taht <dave.taht@gmail.com> wrote:
>
> At 1:25:20 - Mirja had asked what the SCE marking threshold was, not
> the Codel parameters (I think). My read is that she wanted to know the
> sce_threshold?
We used Cake, not fq_codel, so there is no sce_threshold as such, rather the ramp function carefully illustrated on two of the slides. I'm pretty sure she was asking about the Codel parameters, which were the defaults, and she seemed satisfied with that answer.
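If it helps to be concrete, the shape I mean is just a linear ramp of
SCE marking probability between a low and a high sojourn-time threshold
- the thresholds below are invented for illustration, not the values
Cake actually uses:

    # Toy ramp: marking probability rises linearly from 0 to 1 between two
    # sojourn-time thresholds.  The 1 ms / 5 ms numbers are made up.
    def sce_mark_prob(sojourn_ms, lo_ms=1.0, hi_ms=5.0):
        if sojourn_ms <= lo_ms:
            return 0.0
        if sojourn_ms >= hi_ms:
            return 1.0
        return (sojourn_ms - lo_ms) / (hi_ms - lo_ms)

    for s in (0.5, 2.0, 3.0, 5.0, 8.0):
        print(f"sojourn {s:3.1f} ms -> SCE mark probability {sce_mark_prob(s):.2f}")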
> At 1:27:49 Gorry said "This looks like FQ", and no, as Jonathan pointed
> out it's the real convergence of two SCE-AQMed flows at 50Mbit and 80ms
> RTT, which takes 45 seconds. That brought to mind: what is your
> intuition? What would you expect for convergence using FQ? And what is
> it, actually?
FQ converges a whole lot quicker than that - basically as fast as the second flow can ramp up, which you can eyeball by looking at the first flow. It also converges more precisely, and the ping flow would show a lower and more consistent reading. Gorry's reaction is one of unfamiliarity with Flent plots showing FQ'd paths.
> At 1:31:18 - one thing long since vanished from the L4S debate is that
> Codel achieves a ~5ms queue depth, where PIE only gets to 16ms. The need
> for "ultra-low-latency" is smaller when you already get that kind of
> result from your AQM in most cases.
This is true, although PIE is specifically adjusted by default to accommodate a 30ms MAC grant delay on standard DOCSIS, which means about 15ms average is the best it *can* aim for without killing throughput on TCP. I assume that PI2 is instead adjusted for the 1ms MAC grant delay of Low Latency DOCSIS.
When asked, the L4S team admitted they weren't familiar with Codel at all - and by inference, had done no testing with it. We subsequently made the point that Codel is probably the most widely deployed AQM today, being part of the default qdisc on both Linux and OSX, and also available on BSD. They have made no visible effort to ensure compatibility.
> Flent has a default sample rate of 200ms, which means that it can miss
> some details. You can sample instead at rates as low as 20ms, although
> this is murder on your local CPU and can heisenbug the tests. It's
> generally a good idea to sample at double the rate you care about
> (Nyquist theorem), so a 40ms sample rate here would have shown more detail.
>
> If you really, really want more detail than that, packet captures are
> a way to go. Got any?
We do use packet captures for debugging purposes, including exploring the detail of the cwnd evolution. The graphs on the slides were produced for illustration more than anything else, though it's possible to infer much from them as-is. I'll ask Pete if increasing the sample rate works on the hardware we use.
- Jonathan Morton
* Re: [Ecn-sane] tsvwg preso for sce is up
From: Dave Taht @ 2019-07-31 2:03 UTC
To: Jonathan Morton; +Cc: ECN-Sane
On Tue, Jul 30, 2019 at 5:42 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> > On 30 Jul, 2019, at 5:38 pm, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > At 1:25:20 - Mirja had asked what the SCE marking threshold was, not
> > the Codel parameters (I think). My read is that she wanted to know the
> > sce_threshold?
>
> We used Cake, not fq_codel, so there is no sce_threshold as such, rather the ramp function carefully illustrated on two of the slides. I'm pretty sure she was asking about the Codel parameters, which were the defaults, and she seemed satisfied with that answer.
No, I think she gave up. But whatever.
> > At 1:27:49 Gorry said "This looks like FQ", and no, as Jonathan pointed
> > out it's the real convergence of two SCE-AQMed flows at 50Mbit and 80ms
> > RTT, which takes 45 seconds. That brought to mind: what is your
> > intuition? What would you expect for convergence using FQ? And what is
> > it, actually?
>
> FQ converges a whole lot quicker than that - basically as fast as the second flow can ramp up, which you can eyeball by looking at the first flow. It also converges more precisely, and the ping flow would show a lower and more consistent reading. Gorry's reaction is one of unfamiliarity with Flent plots showing FQ'd paths.
Given that part of the debate is about FQ, and that this 80ms RTT,
50Mbit example would show compelling differences between fq, fq_codel,
codel, pie, and dualpi with and without SCE, under Reno and Cubic,
building a story around it to convey more intuition might be useful.
I'm really certain that 98% of the audience does not grok FQ as deeply
as we do. And I do wish we'd get more folks grokking Flent.
Also, Bob is perpetually making a point about applications needing to
briefly exceed their fair share. I'd like to make a point about how
quickly an application can grab more share when it appears, given the
shorter RTTs possible on the link. Quadratic response times mean a
lot...
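To put rough numbers on "quadratic" (assuming Reno-style additive
increase and 1500-byte packets - my arithmetic, not a measurement):

    # Additive increase adds ~1 MSS per RTT, so throughput grows at about
    # MSS/RTT^2 and the time to grab an extra X bit/s scales with RTT^2.
    mss_bits = 1500 * 8
    extra = 10e6                      # try to grab an extra 10 Mbit/s
    for rtt in (0.002, 0.010, 0.020, 0.080):
        t = extra * rtt**2 / mss_bits
        print(f"RTT {rtt*1000:4.0f} ms: ~{t:7.3f} s to gain 10 Mbit/s")

That's a 64x difference between a 10ms path and an 80ms path.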
Another RTT to pursue more deeply is 10ms, which is about what I get
from my Comcast link to my cloud in Fremont. It is a "best case CDN"
latency number. ~2-3ms is what I used to get from Sonic fiber (GPON)
to the same site; 20ms is a typical DSL figure.
A while back I gave a quiz to a class that should otherwise understand
TCP to some extent. Nobody scored higher than 40%. Even Toke only got
an 85%, with a couple of "creative" answers that I let go because I was
otherwise so depressed.
Anybody on this thread that wants to take that quiz, send me your
answers privately... it's attached. I'm pretty sure you'll laugh at
several of the questions... perhaps turning it into an online quiz
would help us with some of the really basic miscomprehensions we keep
running into.
> > At 1:31:18 - one thing long since vanished from the L4S debate is that
> > Codel achieves a ~5ms queue depth, where PIE only gets to 16ms. The need
> > for "ultra-low-latency" is smaller when you already get that kind of
> > result from your AQM in most cases.
>
> This is true, although PIE is specifically adjusted by default to accommodate a 30ms MAC grant delay on standard DOCSIS, which means about 15ms average is the best it *can* aim for without killing throughput on TCP. I assume that PI2 is instead adjusted for the 1ms MAC grant delay of Low Latency DOCSIS.
I *think* normal DOCSIS is 6ms on the up and 2ms on the down, not 30.
There's also an overlapping request/grant function, called
CFsomethingorother, that masks that under load. It's been 7 years since
I dumped that spec out of my head, though. I really don't want to
re-read the L4S ECO that went by in December... I only have room in my
head for 802.11ax and my diesel engine repair manual.
Similarly, my DOCSIS tests tend to be "right on the money" at 5-6ms,
particularly with a compensation script for badmodems.com issues.
>
> When asked, the L4S team admitted they weren't familiar with Codel at all - and by inference, had done no testing with it.
Sigh. All these years... SQM is so easy to set up with anything,
including pie...
> We subsequently made the point that Codel is probably the most widely deployed AQM today, being part of the default qdisc on both Linux and OSX, and also available on BSD. They have made no visible effort to ensure compatibility.
Convincing an entire market to take a 3x latency hit on normal traffic
so some other traffic can gain priority seems like an uphill climb.
I'd like to define terms better. "Ultra-low-latency" seems to mean
sub-1ms latency, which FQ achieves on 90+% of flows.
As for "low latency"? 5ms vs 15ms of extra delay seems like quite a lot
of overhead on a 10ms path.
PIE: middling-good latency?
Codel: amazing latency?
Compare these to the typical BDP, or against pacing.
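Roughly the comparison I have in mind (my numbers; 50Mbit assumed just
for the BDP column):

    # Standing queue as a fraction of the path RTT, with the BDP at
    # 50 Mbit/s for scale.  5ms vs 15ms on a 10ms path is +50% vs +150%.
    for rtt_ms in (10, 20, 80):
        bdp_kB = 50e6 * (rtt_ms / 1000) / 8 / 1000
        for q_ms in (1, 5, 15):
            print(f"RTT {rtt_ms:2d} ms, queue {q_ms:2d} ms: "
                  f"+{100 * q_ms / rtt_ms:4.0f}% latency  (BDP ~{bdp_kB:.0f} kB)")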
> > Flent has a default sample rate of 200ms, which means that it can miss
> > some details. You can sample instead at rates as low as 20ms, although
> > this is murder on your local CPU and can heisenbug the tests. It's
> > generally a good idea to sample at double the rate you care about
> > (Nyquist theorem), so a 40ms sample rate here would have shown more detail.
> >
> > If you really, really want more detail than that, packet captures are
> > a way to go. Got any?
>
> We do use packet captures for debugging purposes, including exploring the detail of the cwnd evolution. The graphs on the slides were produced for illustration more than anything else, though it's possible to infer much from them as-is. I'll ask Pete if increasing the sample rate works on the hardware we use.
Groovy. There is also a C program in the misc dir that grabs buffer
lengths, plus a patch to tc that is needed to poll it that fast.
>
> - Jonathan Morton
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
[-- Attachment: bufferbloattcpquiz (1).pdf (application/pdf, 32887 bytes) --]
* Re: [Ecn-sane] tsvwg preso for sce is up
From: Sebastian Moeller @ 2019-07-31 7:39 UTC
To: Dave Täht; +Cc: Jonathan Morton, ECN-Sane
Hi Dave,
> On Jul 31, 2019, at 04:03, Dave Taht <dave.taht@gmail.com> wrote:
> [...]
> Also, Bob is perpetually making a point about applications needing to
> briefly exceed their fair share. I'd like to make a point about how
> quickly an application can grab more share when it appears, given the
> shorter RTTs possible on the link. Quadratic response times mean a
> lot...
The obvious problem with Bob's rationale is that his idealized behavior
immediately turns against the user: if it is "everything else" in Bob's
example that the user actually would like to have priority, then his
lack of equal-bandwidth enforcement will actually be harmful.
Personally, I believe that it is better to make equal-share the default
and let users override that on an if-needed basis, instead of having
Bob's anything-goes world (which unfortunately is the current reality).
Equal share is easy to understand and to predict, and we have ample
proof by now that FQ (sqm-scripts in OpenWrt) for the home network is
an improvement over the whatever-goes mode of the default internet
experience.
In other words, Bob seemingly puts too much trust in the benevolence
of all endpoints. This reminds me yet again of the discussions I had
decades ago about the differences/advantages between cooperative and
preemptive multitasking: theoretically, cooperative multitasking will
be superior to the preemptive kind, but it assumes both benevolent
processes and effectively perfect information, not only about a
process's own resource demands but also about all other running (and
yet-to-be-started) processes. I end this tangent by noting that
preemptive multitasking more or less won that battle, with only niche
use of cooperative multitasking surviving.
I believe the same rationale to be applicable to AQMs: since not all
endpoints are benevolent (see the arms race of CDNs over the initial
window size), we are better off treating all of them as "hostile" and
enforcing a sane default policy, FQ being one of those sane policies
that has already proved its worth. I wonder whether it is worth
discussing that point openly with Bob, though, as it is partly a matter
of taste and subjective risk-aversion.
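To illustrate why I call equal-share easy to understand and to predict,
here is a toy deficit-round-robin sketch (only the principle; this is
not how cake or fq_codel are actually implemented):

    # Toy deficit round robin: every flow gets the same quantum per round,
    # no matter how aggressively it sends.  Principle only.
    from collections import deque

    def drr(queues, quantum=1500, rounds=3):
        deficits = [0] * len(queues)
        sent = [0] * len(queues)
        for _ in range(rounds):
            for i, q in enumerate(queues):
                deficits[i] += quantum
                while q and q[0] <= deficits[i]:
                    pkt = q.popleft()
                    deficits[i] -= pkt
                    sent[i] += pkt
                if not q:
                    deficits[i] = 0   # idle queues do not bank credit
        return sent

    flows = [deque([1500] * 20),      # greedy bulk sender
             deque([100] * 5)]        # sparse flow
    print(drr(flows))                 # -> [4500, 500]: bulk flow capped at its share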
> [...]
> Convincing an entire market to take a 3x latency hit on normal traffic
> so some other traffic can gain priority seems like an uphill climb.
So far I fail to see L4S working for any use-case other than a
"highway" to a close DC, as I simply do not believe that the required
synchronization of remote senders to below-millisecond accuracy will
generally work over the internet (L4S basically assumes close-to-zero
RTT variation and the ability of OSs to actually dispatch packets with
<1ms accuracy). The initial paper only demonstrated that this accuracy
is achievable, but failed to demonstrate how robustly and reliably it
can be achieved. The fact that ACK-clocking is supposed to act as the
synchronisation mechanism makes me wonder what these guys were smoking.
As far as I know, NTP only claims several-milliseconds accuracy for its
synchronisation over the internet, so I wonder what special sauce the
L4S team brought to the table to improve on the status quo by ~an order
of magnitude?
>
> I'd like to define terms better. "Ultra-low-latency" seems to mean
> sub-1ms latency, which FQ achieves on 90+% of flows.
"Ultra-low-latency" is perfect marketing-speak with very little substance...
Best Regards
Sebastian