General list for discussing Bufferbloat
* [Bloat] my backlogged comments on the ECT(1) interim call
@ 2020-04-27 19:24 Dave Taht
  2020-04-28  8:53 ` Luca Muscariello
From: Dave Taht @ 2020-04-27 19:24 UTC (permalink / raw)
  To: tsvwg IETF list; +Cc: bloat

It looks like the majority of what I say below is not related to the
fate of the "bit". The push to take the bit was strong with this one,
and with me... can't we deploy more of what we already have in the
places where it matters?

...

so: A) PLEA: After 10 years now of working on bufferbloat, on real
end-user and wifi traffic and real networks....

I would like folk here to stop benchmarking two flows that run for a long time
and in one direction only... and thus exclusively in tcp congestion
avoidance mode.

Please. just. stop. Real traffic looks nothing like that. The internet
looks nothing like that.
The netops folk I know just roll their eyes at benchmarks like this
that prove nothing, and tell me to go to RIPE meetings instead.
When y'all talk about "not looking foolish for not mandating ecn now",
you've already lost that audience with benchmarks like these.

Sure, set up a background flow (or flows) like that, but then hit the
result with a mix of far more normal traffic. Please? Networks are never
used unidirectionally, and congestion in both directions is frequent.
To illustrate that problem...

I have a really robust benchmark that we have used throughout the bufferbloat
project that I would like everyone to run in their environments: the flent
"rrul" test. Everybody on both sides has big enough testbeds set up that a few
hours spent on doing that - and please add in asymmetric networks especially -
and perusing the results ought to be enlightening to everyone as to the kind
of problems real people have, on real networks.

Can the L4S and SCE folk run the rrul test some day soon? Please?
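
For anyone who has not run it before, a minimal invocation looks something
like the sketch below. The server name, test length and title are
placeholders; it assumes flent is installed on the client and a netperf
"netserver" instance is reachable on the far end:

    # 60-second RRUL run against a netserver endpoint, saving a summary plot
    flent rrul -p all_scaled -l 60 \
          -H netperf.example.com \
          -t "cmts-200-10-baseline" \
          -o rrul-baseline.png

    # browse all the plots from the saved data files afterwards
    flent --gui rrul-*.flent.gz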

I rather liked this benchmark that tested another traffic mix,

( https://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf )

although it had many flaws (like not doing DNS lookups). I wish it
could be dusted off and used to compare this newfangled ECN-enabled stuff
with the kind of results you can get merely with packet loss and RTT
awareness. It would be so great to be able to directly compare all these
new algorithms against this benchmark.

Adding in a non-ECN'd, UDP-based routing protocol on a heavily
oversubscribed 100Mbit link is also enlightening.

I'd rather like to see that benchmark improved for a more modern
home traffic mix, where it is projected there may be 30 devices on
the network, on average, in a few years.

If there is any one thing y'all can do to reduce my blood pressure and
keep me engaged here whilst you
debate the end of the internet as I understand it, it would be to run
the rrul test as part of all your benchmarks.

thank you.

B) Stuart Cheshire regaled us with several anecdotes - one concerning
his problems with Comcast's 1Gbit/35Mbit service being unusable, under
load, for videoconferencing. This is true. The overbuffering at the CMTSes
still has to be seen to be believed, at all rates. At lower rates
it's possible to shape this with another device (which is what
the entire SQM deployment does in self-defense, and why cake has a
specific DOCSIS ingress mode), but it is CPU-intensive
and presently requires x86 hardware to do well at rates above 500Mbits.
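
To illustrate what that inbound shaping looks like in practice, here is a
minimal SQM-style sketch. The interface name and the 34Mbit/900Mbit figures
are placeholders for a 1Gbit/35Mbit DOCSIS service, not a recommendation:

    # outbound: shape to just under the DOCSIS uplink rate
    tc qdisc replace dev eth0 root cake bandwidth 34mbit docsis ack-filter

    # inbound: redirect ingress traffic through an IFB device and shape it there
    ip link add ifb0 type ifb
    ip link set ifb0 up
    tc qdisc replace dev eth0 handle ffff: ingress
    tc filter add dev eth0 parent ffff: protocol all matchall \
          action mirred egress redirect dev ifb0
    tc qdisc replace dev ifb0 root cake bandwidth 900mbit besteffort docsis ingress

Doing that per-packet work for nearly a gigabit of downstream traffic is
exactly the part that gets CPU-intensive.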

So I wish the CMTS makers (Arris and Cisco) were in this room. Are they?

(Stuart, if you'd like a box that can make your Comcast link pleasurable
under all workloads, whenever you get back to Los Gatos, I've got a few
lying around. I was so happy to get a few IETFers this past week to apply
what's off the shelf for end users today. :)

C) I am glad Bob said that L4S is finally looking at asymmetric
networks, and starting to tackle ACK-filtering and AccECN issues
there.

But... I would have *started there*. Asymmetric access is the predominant
form of all edge technologies.

I would love to see flent rrul test results for 1Gbit/35Mbit, 100/10, and
200/10 services in particular (from SCE also!). "Lifeline" service (11/2)
would be good to have results on. It would be especially good to have
baseline comparison data from the measured, current deployment of the
CMTSes at these rates: start with no queue management in play, then PIE
on the uplink, then fq_codel on the uplink, and then this ECN stuff,
and so on.
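
A sketch of that uplink sweep in tc terms, for (say) a 10Mbit uplink on
eth0 - both placeholders - with each step being a fresh rrul run:

    # step 0: baseline - leave the modem/CMTS buffers exactly as they are

    # step 1: PIE on the uplink, under a shaper set just below the provisioned rate
    tc qdisc add dev eth0 root handle 1: htb default 1
    tc class add dev eth0 parent 1: classid 1:1 htb rate 9500kbit
    tc qdisc add dev eth0 parent 1:1 pie

    # step 2: delete the root qdisc, repeat the shaper, and swap in fq_codel:
    #   tc qdisc add dev eth0 parent 1:1 fq_codel
    # step 3: re-run the same matrix with ECN negotiated on the endpoints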

D) The two CPE makers in the room have dismissed both FQ and SCE as
being too difficult to implement. They did say that dualpi was
actually implemented in software, not hardware.

I would certainly like them to benchmark what they plan to offer in L4S
against what is already available in the EdgeRouter X, as one low-end
example among thousands.

I also have to note that, at higher speeds, all the buffering moves into
the wifi and the results are currently ugly. I imagine
they are exploring how to fix their wifi stacks also? I wish more folk
were using RvR (rate vs. range) + latency benchmarks like this one:

http://flent-newark.bufferbloat.net/~d/Airtime%20based%20queue%20limit%20for%20FQ_CoDel%20in%20wireless%20interface.pdf

Same goes for the LTE folk.

E) Andrew McGregor mentioned how great it would be for a closeted musician to
be able to play in real time with someone across town. That has been my goal
for nearly 30 years now!! And although I rather enjoyed his participation in
my last talk on the subject (
https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/
), conflating a need for ECN and L4S signalling for low-latency audio
applications with what I actually said in that talk kind of hurt. I achieved
"my 2ms fiber-based guitarist to fiber-based drummer dream" 4+ years
back with fq_codel and diffserv: no ECN required, no changes to the specs,
no mandating that packets be undroppable. I would like to rip the Opus
codec out of that mix one day.

F) I agree with Jana that changing the definition of RFC3168 to suit
the RED algorithm often present in network switches today (which is not
PI or anything fancy), so that it suits DCTCP, works. But you should say
"configuring RED to have L4S marking style" and document that.

Sometimes I try to point out that many switches have a form of DRR in them,
and it's helpful to use that in conjunction with whatever diffserv
markings you trust in your network.

To this day I wish someone would publish how much DCTCP-style
signalling they use on a DC network relative to their other traffic.

To this day I keep hoping that someone will publish a suitable
set of RED parameters for a wide variety of switches and routers -
for the most common switches and ethernet chips - for correct DCTCP usage.
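
On Linux, the usual approximation of DCTCP-style step marking is to collapse
sch_red's min and max thresholds to (nearly) the same value and mark with
probability 1. This is only a sketch - the interface, the 10Gbit figure, and
the ~100KB threshold are placeholders, and real switches need their vendors'
equivalent knobs documented:

    # ECN step-marking at a fixed ~100KB queue depth, roughly what DCTCP expects
    tc qdisc replace dev eth0 root red \
        limit 400000 min 100000 max 100001 avpkt 1500 \
        burst 67 bandwidth 10gbit probability 1.0 ecn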

Mellanox's example:
( https://community.mellanox.com/s/article/howto-configure-ecn-on-mellanox-ethernet-switches--spectrum-x
) is not DCTCP-specific.

Many switches have a form of DRR in them, and it's helpful to use that
in conjunction with whatever diffserv markings you trust in your network
and, as per the above example, to segregate two RED queues that way. From
what I see above, there is no way to differentiate ECT(0) from ECT(1) in
that switch. (?)

I do keep trying to point out the size of the end-user ECN-enabled
deployment, starting with the data I have from free.fr. Are we
building a network for AIs or for people?

G) Jana also made a point about 2 queues "being enough" (I might be
mis-remembering the exact point). Mellanox's ethernet chips at 10Gig expose
64 hardware queues, and some new Intel hardware exposes 2000+. How do these
queues work relative to these algorithms?

We have generally found hardware mq to be far less of a benefit than the
manufacturers think, especially as regards lower latency or reduced CPU
usage (as cache crossing is a bear). There is a lot of software work left
to be done in this area; however, the queues are needed to match up with
CPUs (and tenants).

Until sch_pie gained timestamping support recently, the rate estimator
did not work correctly in a hardware mq environment. I haven't looked over
dualpi in this respect.
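
For reference, the usual way to put an AQM on each hardware queue is to hang
one instance per queue under sch_mq. A sketch (eth0 and the choice of
fq_codel are placeholders):

    # mq exposes one class per hardware tx queue
    tc qdisc replace dev eth0 root handle 100: mq

    # attach fq_codel (or pie, cake, ...) to every tx queue
    NQ=$(ls -d /sys/class/net/eth0/queues/tx-* | wc -l)
    for i in $(seq 1 "$NQ"); do
        tc qdisc replace dev eth0 parent 100:$(printf '%x' "$i") fq_codel
    done

    tc -s qdisc show dev eth0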





-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729


* Re: [Bloat] my backlogged comments on the ECT(1) interim call
  2020-04-27 19:24 [Bloat] my backlogged comments on the ECT(1) interim call Dave Taht
@ 2020-04-28  8:53 ` Luca Muscariello
  2020-04-28 17:12   ` Holland, Jake
From: Luca Muscariello @ 2020-04-28  8:53 UTC (permalink / raw)
  To: Dave Taht; +Cc: tsvwg IETF list, bloat


Hi Dave and list members,

It was difficult to follow the discussion at the meeting yesterday, and
hard to tell who said what in the first place.

There were a lot of non-technical comments along the lines of "this
solution is better than that one, in my opinion". "Better" has often been
used as when evaluating the taste of ice cream: white chocolate vs. dark
chocolate. This took a significant amount of time at the meeting. I haven't
learned much from that kind of discussion and I do not think it helped to
make much progress.

If people could re-make their points on the list, it would help the debate.

Another point that a few people raised is that we have to make a decision
as fast as possible. I dismiss that argument entirely. Trading off latency
against the resilience of the Internet goes entirely against the design
principles of the Internet architecture itself. Risk analysis is something
that we should keep in mind when deploying any experiment, and it should
be a substantial part of it.

Someone claimed that online meeting traffic is elastic. This is not true,
and I tried to clarify this. These applications (WebEx/Zoom) are low-rate:
a typical maximum upstream rate is 2Mbps, and it is not elastic. These
applications often have a stand-alone app that does not use the browser
WebRTC stack (the stand-alone app typically works better).

A client sends upstream one or two video qualities unless the video camera
is switched off. In the presence of losses, FEC is used, but the traffic is
still not elastic.

Someone claimed (at yesterday's meeting) that fairness is not an issue ("who
cares", I heard!). Well, fairness can constitute a differentiation advantage
between two companies that are commercializing online meeting products.
Unless we at the IETF accept "law-of-the-jungle" behaviours from Internet
application developers, we should be careful about making such claims.
Any opportunity to cheat that brings a business advantage WILL be used.

/Luca

TL;DR
To Dave: you asked several times what Cisco does about latency reduction in
network equipment. I tend to be very shy when replying to these questions,
as this is not vendor-neutral. If the chairs think this is not appropriate
for the list, please say so and I'll reply privately only.

What I write below can be found in Cisco product data sheets and is not a
trade secret. There are very good blog posts explaining the details.
Not surprisingly, Cisco implements the state of the art on this topic, and
it is totally feasible to do the right thing in both software and hardware.

Cisco implements AFD (one queue + a flow table), accompanied by a priority
queue for flows that have a certain profile in rate and size. The concept is
well known and well studied in the literature. AFD is safe and can serve a
complex traffic mix well when accompanied by a priority queue. This prio
queue should not be confused with a strict-priority queue (e.g. EF in
diffserv). There are subtleties related to the DOCSIS shared medium which
would take too long to describe here.

This is available in Cisco CMTSes for the DOCSIS segment. Bottlenecked
traffic does not negatively impact non-bottlenecked traffic such as an
online meeting like the WebEx call we had yesterday. It is safe from a
network-neutrality point of view, and no applications get hurt.

Cisco implements AFD+prio also in some DC switches such as the Nexus 9k.
There is a blog post written by Tom Edsall online that explains pretty well
how that works. This includes mechanisms such as pFabric to approximate SRPT
(shortest remaining processing time) and minimize flow completion time for
many DC workloads. The mix of the two brings FCT minimization AND latency
minimization. This is silicon and scales at any speed.
For those who are not familiar with these concepts, please search for the
research work of Balaji Prabhakar and Rong Pan at Stanford.

Wi-Fi: Cisco does airtime fairness in Aironet, and I think in the Meraki
series too. The concept is similar to what is described above, but there are
several queues, one per STA. Packets are enqueued into the access (category)
queue at dequeue time from the airtime packet scheduler.



* Re: [Bloat] my backlogged comments on the ECT(1) interim call
  2020-04-28  8:53 ` Luca Muscariello
@ 2020-04-28 17:12   ` Holland, Jake
  2020-04-28 19:04     ` Luca Muscariello
From: Holland, Jake @ 2020-04-28 17:12 UTC (permalink / raw)
  To: Luca Muscariello, Dave Taht; +Cc: tsvwg IETF list, bloat


Hi Luca,

To your point about the discussion being difficult to follow: I tried to capture the intent of everyone who commented while taking notes:
https://etherpad.ietf.org:9009/p/notes-ietf-interim-2020-tsvwg-03

I think this was intended to take the place of everyone needing to re-send the same points to the list, but of course some of the most crucial points could probably use fleshing out with on-list follow-up.

It got a bit rough in places because I was disconnected a few times and had to cut over to a local text file, and I may have failed to correctly understand or summarize some of the comments, so there’s a chance I might have missed something, but I did my best to capture them all.

I encourage people to review comments and check whether they came out more or less correct, and to offer formatting and cleanup suggestions if there’s a good way to make it easier to follow.

I had timestamps at the beginning of each main point of discussion, with the intent that after the video is published it would be easier to go back and check precisely what was said. It looks like someone has been making cleanup edits that removed the first half of those so far, but my local text file still has most of those and I can go back and re-insert them if it seems useful.

@Luca: during your comments in particular I think there might have been a disruption: I had a “first comment missed, please check video” placeholder, and I may have misunderstood the part about video elasticity. My interpretation at the time was that Stuart was claiming video was elastic in that it would adjust downward to avoid overflowing a loaded link, and that you were claiming it was not elastic in that it would not exceed a maximum rate, which I summarized as perhaps a semantic disagreement. If you’d like to help clean that up, it might be useful.

From this message, it sounds like the key point you were making was that it also will not go below a certain rate, and perhaps that quality can stay relatively good in spite of high network loss?

Best regards,
Jake



* Re: [Bloat] my backlogged comments on the ECT(1) interim call
  2020-04-28 17:12   ` Holland, Jake
@ 2020-04-28 19:04     ` Luca Muscariello
       [not found]       ` <bad22a6b-698d-c85f-b829-6b5391833a1e@erg.abdn.ac.uk>
  2020-04-29  8:44       ` Rodney W. Grimes
From: Luca Muscariello @ 2020-04-28 19:04 UTC (permalink / raw)
  To: Holland, Jake; +Cc: Dave Taht, bloat, tsvwg IETF list


Hi Jake,

Thanks for the notes. Very useful.
The other issue with the meeting was that the virtual mic-queue control
channel was the WebEx Meetings chat, which does not exist in WebEx Teams.
So I had to switch to Meetings and lost some pieces of the discussion.

Yes, there might be a terminology difference. "Elastic" traffic is usually
used in the sense of bandwidth sharing, not just to describe variable bit
rates.

The point is that there are incentives to cheat in L4S.

There is a priority queue that my application can enter simply by setting
ECT(1) on its packets. Applications such as online meetings have a
relatively low and highly paced rate.

This traffic conforms to the DualQ L queue but is unresponsive to
congestion notifications.

This is especially true for FEC streams, which could be used to improve
media quality in the presence of losses (e.g. Wi-Fi) or increased jitter.


That was one more point on why using ECT(1) as the input assumes either
trust or a blacklist after a sender has been caught.

In both cases, ECT(1) as the input is DoSable.



On Tue, Apr 28, 2020 at 7:12 PM Holland, Jake <jholland@akamai.com> wrote:

> Hi Luca,
>
>
>
> To your point about the discussion being difficult to follow: I tried to
> capture the intent of everyone who commented while taking notes:
>
> https://etherpad.ietf.org:9009/p/notes-ietf-interim-2020-tsvwg-03
>
>
>
> I think this was intended to take the place of a need for everyone to
> re-send the same points to the list, but of course some of the most crucial
> points could probably use fleshing out with on-list follow up.
>
>
>
> It got a bit rough in places because I was disconnected a few times and
> had to cut over to a local text file, and I may have failed to correctly
> understand or summarize some of the comments, so there’s chances I might
> have missed something, but I did my best to capture them all.
>
>
>
> I encourage people to review comments and check whether they came out more
> or less correct, and to offer formatting and cleanup suggestions if there’s
> a good way to make it easier to follow.
>
>
>
> I had timestamps at the beginning of each main point of discussion, with
> the intent that after the video is published it would be easier to go back
> and check precisely what was said. It looks like someone has been making
> cleanup edits that removed the first half of those so far, but my local
> text file still has most of those and I can go back and re-insert them if
> it seems useful.
>
>
>
> @Luca: during your comments in particular I think there might have been a
> disruption--I had a “first comment missed, please check video” placeholder
> and I may have misunderstood the part about video elasticity, but my
> interpretation at the time was that Stuart was claiming that video was
> elastic in that it would adjust downward to avoid overflowing a loaded
> link, and I thought you were claiming that it was not elastic in that it
> would not exceed a maximum rate, which I summarized as perhaps a semantic
> disagreement, but if you’d like to help clean that up, it might be useful.
>
>
>
> From this message, it sounds like the key point you were making was that
> it also will not go below a certain rate, and perhaps that quality can stay
> relatively good in spite of high network loss?
>
>
>
> Best regards,
>
> Jake
>
>
>
> *From: *Luca Muscariello <muscariello@ieee.org>
> *Date: *Tuesday, April 28, 2020 at 1:54 AM
> *To: *Dave Taht <dave.taht@gmail.com>
> *Cc: *tsvwg IETF list <tsvwg@ietf.org>, bloat <bloat@lists.bufferbloat.net
> >
> *Subject: *Re: [Bloat] my backlogged comments on the ECT(1) interim call
>
>
>
> Hi Dave and list members,
>
>
>
> It was difficult to follow the discussion at the meeting yesterday.
>
> Who  said what in the first place.
>
>
>
> There have been a lot of non-technical comments such as: this solution
>
> is better than another in my opinion. "better" has often been used
>
> as when evaluating the taste of an ice cream: White chocolate vs black
> chocolate.
>
> This has taken a significant amount of time at the meeting. I haven't
> learned
>
> much from that kind of discussion and I do not think that helped to make
>
> much progress.
>
>
>
> If people can re-make their points in the list it would help the debate.
>
>
>
> Another point that a few raised is that we have to make a decision as fast
> as possible.
>
> I dismissed entirely that argument. Trading off latency with resilience of
> the Internet
>
> is entirely against the design principle of the Internet architecture
> itself.
>
> Risk analysis is something that we should keep in mind even when
> deploying any experiment
>
> and should be a substantial part of it.
>
>
>
> Someone claimed that on-line meeting traffic is elastic. This is not true,
> I tried to
>
> clarify this. These applications (WebEx/Zoom) are low rate, a typical
> maximum upstream
>
> rate is 2Mbps and is not elastic. These applications have often a
> stand-alone app
>
> that is not using the browser WebRTC stack (the standalone app typically
> works better).
>
>
>
> A client sends upstream one or two video qualities unless the video camera
> is switched off.
>
> In presence of losses, FEC is used but it is still non elastic.
>
> Someone claimed (at yesterday's meeting) that fairness is not an issue
> (who cares, I heard!)
>
> Well, fairness can constitute a differentiation advantage between two
> companies that are
>
> commercializing on-line meetings products. Unless at the IETF we accept
>
> "law-of-the-jungle" behaviours from Internet applications developers, we
> should be careful
>
> about making such claims.
>
> Any opportunity to cheat, that brings a business advantage WILL be used.
>
>
>
> /Luca
>
>
>
> TL;DR
>
> To Dave: you asked several times what  Cisco does on latency reduction in
>
> network equipment. I tend to be very shy when replying on these questions
>
> as this is not vendor neutral. If chairs think this is not appropriate for
>
> the list, please say it and I'll reply privately only.
>
>
>
> What I write below can be found in Cisco products data sheets and is not
>
> trade secret. There are very good blog posts explaining details.
>
> Not surprisingly Cisco implements the state of the art on the topic
>
> and it is totally feasible to do-the-right-thing in software and hardware.
>
>
>
> Cisco implements AFD (one queue + a flow table) accompanied by a priority
> queue for
>
> flows that have a certain profile in rate and size. The concept is well
> known and well
>
> studied in the literature. AFD is safe and can well serve a complex
> traffic mix when
>
> accompanied by a priority queue. This prio-queue should not be confused
> with a strict
>
> priority queue (e.g. EF in diffserv). There are subtleties related to the
> DOCSIS
>
> shared medium which would be too long to describe here.
>
>
>
> This is available in Cisco CMTS for the DOCSIS segment. Bottleneck traffic
>
> does not negatively impact non-bottlenecked-traffic such as an on-line
> meeting like
>
> the WebEx call we had yesterday. It is safe from a network neutrality
> point-of-view
>
> and no applications get hurt.
>
>
>
> Cisco implements AFD+prio also for some DC switches such as the Nexus 9k.
> There
>
> is a blog post written by Tom Edsal online that explains pretty well how
> that works.
>
> This includes mechanisms such as p-fabric to approximate SRPT (shortest
> remaining processing time)
>
> and minimize flow completion time for many DC workloads. The mix of the two
>
> brings FCT minimization AND latency minimization. This is silicon and
> scales at any speed.
>
> For those who are not familiar with these concepts, please search the
> research work of Balaji
>
> Prabhakar and Ron Pang at Stanford.
>
>
>
> Wi-Fi: Cisco does airtime fairness in Aironet but I think in the Meraki
> series too.
>
> The concept is similar to what described above but there are several
> queues, one per STA.
>
> Packets are enqueued in the access (category) queue at dequeue time from
> the air-time
>
> packet scheduler.
>
>
>
> On Mon, Apr 27, 2020 at 9:24 PM Dave Taht <dave.taht@gmail.com> wrote:
>
> It looks like the majority of what I say below is not related to the
> fate of the "bit". The push to take the bit was
> strong with this one, and me... can't we deploy more of what we
> already got in places where it matters?
>
> ...
>
> so: A) PLEA: From 10 years now, of me working on bufferbloat, working
> on real end-user and wifi traffic and real networks....
>
> I would like folk here to stop benchmarking two flows that run for a long
> time
> and in one direction only... and thus exclusively in tcp congestion
> avoidance mode.
>
> Please. just. stop. Real traffic looks nothing like that. The internet
> looks nothing like that.
> The netops folk I know just roll their eyes up at benchmarks like this
> that prove nothing and tell me to go to ripe meetings instead.
> When y'all talk about "not looking foolish for not mandating ecn now",
> you've already lost that audience with benchmarks like these.
>
> Sure, setup a background flow(s)  like that, but then hit the result
> with a mix of
> far more normal traffic? Please? networks are never used unidirectionally
> and both directions congesting is frequent. To illustrate that problem...
>
> I have a really robust benchmark that we have used throughout the
> bufferbloat
> project that I would like everyone to run in their environments, the flent
> "rrul" test. Everybody on both sides has big enough testbeds setup that a
> few
> hours spent on doing that - and please add in asymmetric networks
> especially -
> and perusing the results ought to be enlightening to everyone as to the
> kind
> of problems real people have, on real networks.
>
> Can the L4S and SCE folk run the rrul test some day soon? Please?
>
> I rather liked this benchmark that tested another traffic mix,
>
> (
> https://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cablelabs.com_wp-2Dcontent_uploads_2014_06_DOCSIS-2DAQM-5FMay2014.pdf&d=DwMFaQ&c=96ZbZZcaMF4w0F4jpN6LZg&r=bqnFROivDo_4iF8Z3R4DyNWKbbMeXr0LOgLnElT1Ook&m=j5nEJ3W8fRmqjnBSWapTVKj6dNbpegl4kSeynebCQT4&s=DrB4ENWjWbVu9SqtIh7lXKJj96fwm6TqESC6E8_IdnY&e=>
> )
>
> although it had many flaws (like not doing dns lookups), I wish it
> could be dusted off and used to compare this
> new fangled ecn enabled stuff with the kind of results you can merely get
> with packet loss and rtt awareness. It would be so great to be able
> to directly compare all these new algorithms against this benchmark.
>
> Adding in a non ecn'd udp based routing protocol on heavily
> oversubscribed 100mbit link is also enlightening.
>
> I'd rather like to see that benchmark improved for a more modernized
> home traffic mix
> where it is projected there may be 30 devices on the network on average,
> in a few years.
>
> If there is any one thing y'all can do to reduce my blood pressure and
> keep me engaged here whilst you
> debate the end of the internet as I understand it, it would be to run
> the rrul test as part of all your benchmarks.
>
> thank you.
>
> B) Stuart Cheshire regaled us with several anecdotes - one concerning
> his problems
> with comcast's 1Gbit/35mbit service being unusable, under load, for
> videoconferencing. This is true. The overbuffering at the CMTSes
> still, has to be seen to be believed, at all rates. At lower rates
> it's possible to shape this, with another device (which is what
> the entire SQM deployment does in self defense and why cake has a
> specific docsis ingress mode), but it is cpu intensive
> and requires x86 hardware to do well at rates above 500Mbits, presently.
>
> So I wish CMTS makers (Arris and Cisco) were in this room. are they?
>
> (Stuart, if you'd like a box that can make your comcast link pleasurable
> under all workloads, whenever you get back to los gatos, I've got a few
> lying around. Was so happy to get a few ietfers this past week to apply
> what's off the shelf for end users today. :)
>
> C) I am glad bob said the L4S is finally looking at asymmetric
> networks, and starting to tackle ack-filtering and accecn issues
> there.
>
> But... I would have *started there*. Asymmetric access is the predominate
> form
> of all edge technologies.
>
> I would love to see flent rrul test results for 1gig/35mbit, 100/10, 200/10
> services, in particular. (from SCE also!). "lifeline" service (11/2)
> would be good
> to have results on. It would be especially good to have baseline
> comparison data from the measured, current deployment
> of the CMTSes at these rates, to start with, with no queue management in
> play, then pie on the uplink, then fq_codel on the uplink, and then
> this ecn stuff, and so on.
>
> D) The two CPE makers in the room have dismissed both fq and sce as
> being too difficult to implement. They did say that dualpi was
> actually implemented in software, not hardware.
>
> I would certainly like them to benchmark what they plan to offer in L4S
> vs what is already available in the edgerouter X, as one low end
> example among thousands.
>
> I also have to note, at higher speeds, all the buffering moves into
> the wifi and the results are currently ugly. I imagine
> they are exploring how to fix their wifi stacks also? I wish more folk
> were using RVR + latency benchmarks like this one:
>
>
> http://flent-newark.bufferbloat.net/~d/Airtime%20based%20queue%20limit%20for%20FQ_CoDel%20in%20wireless%20interface.pdf
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__flent-2Dnewark.bufferbloat.net_-7Ed_Airtime-2520based-2520queue-2520limit-2520for-2520FQ-5FCoDel-2520in-2520wireless-2520interface.pdf&d=DwMFaQ&c=96ZbZZcaMF4w0F4jpN6LZg&r=bqnFROivDo_4iF8Z3R4DyNWKbbMeXr0LOgLnElT1Ook&m=j5nEJ3W8fRmqjnBSWapTVKj6dNbpegl4kSeynebCQT4&s=UEzrGb3xL5zElDhYxB7wHpux1_SLFHGUcEkgTNMOe2Q&e=>
>
> Same goes for the LTE folk.
>
> E) Andrew mcgregor mentioned how great it would be for a closeted musician
> to
> be able to play in real time with someone across town. that has been my
> goal
> for nearly 30 years now!! And although I rather enjoyed his participation
> in
> my last talk on the subject (
>
> https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__blog.apnic.net_2020_01_22_bufferbloat-2Dmay-2Dbe-2Dsolved-2Dbut-2Dits-2Dnot-2Dover-2Dyet_&d=DwMFaQ&c=96ZbZZcaMF4w0F4jpN6LZg&r=bqnFROivDo_4iF8Z3R4DyNWKbbMeXr0LOgLnElT1Ook&m=j5nEJ3W8fRmqjnBSWapTVKj6dNbpegl4kSeynebCQT4&s=BSDbzxnB7k7krFmkHv9id0BeDC6Vh39LgPNxyHUIg34&e=>
> ) conflating
> a need for ecn and l4s signalling for low latency audio applications
> with what I actually said in that talk, kind of hurt. I achieved
> "my 2ms fiber based guitarist to fiber based drummer dream" 4+ years
> back with fq_codel and diffserv, no ecn required,
> no changes to the specs, no mandating packets be undroppable" and
> would like to rip the opus codec out of that mix one day.
>
> F) I agree with jana that changing the definition of RFC3168 to suit
> the RED algorithm (which is not pi or anything fancy) often present in
> network switches,
> today to suit dctcp, works. But you should say "configuring red to
> have l4s marking style" and document that.
>
> Sometimes I try to point out many switches have a form of DRR in them,
> and it's helpful to use that in conjunction with whatever diffserv
> markings you trust in your network.
>
> To this day I wish someone would publish how much they use DCTCP style
> signalling on a dc network relative to their other traffic.
>
> To this day I keep hoping that someone will publish a suitable
> set of RED parameters for a wide variety of switches and routers -
> for the most common switches and ethernet chips, for correct DCTCP usage.
>
> Mellonox's example:
> (
> https://community.mellanox.com/s/article/howto-configure-ecn-on-mellanox-ethernet-switches--spectrum-x
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__community.mellanox.com_s_article_howto-2Dconfigure-2Decn-2Don-2Dmellanox-2Dethernet-2Dswitches-2D-2Dspectrum-2Dx&d=DwMFaQ&c=96ZbZZcaMF4w0F4jpN6LZg&r=bqnFROivDo_4iF8Z3R4DyNWKbbMeXr0LOgLnElT1Ook&m=j5nEJ3W8fRmqjnBSWapTVKj6dNbpegl4kSeynebCQT4&s=nEIW1DhRXOHu3F5tMwpyO5rQUBMfCZx3Hs4wVvkVFIQ&e=>
> ) is not dctcp specific.
>
> many switches have a form of DRR in them, and it's helpful to use that
> in conjunction with whatever diffserv markings you trust in your
> network,
> and, as per the above example, segregate two red queues that way. From
> what I see
> above there is no way to differentiate ECT(0) from ECT(1) in that switch.
> (?)
>
> I do keep trying to point out the size of the end user ecn enabled
> deployment, starting with the data I have from free.fr
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__free.fr&d=DwMFaQ&c=96ZbZZcaMF4w0F4jpN6LZg&r=bqnFROivDo_4iF8Z3R4DyNWKbbMeXr0LOgLnElT1Ook&m=j5nEJ3W8fRmqjnBSWapTVKj6dNbpegl4kSeynebCQT4&s=7gswGhl21lejSnIiu3yyUTPZEArHqQG6hD64BoW2Zco&e=>.
> Are we
> building a network for AIs or people?
>
> G) Jana also made a point about 2 queues "being enough" (I might be
> mis-remembering the exact point). Mellonoxes ethernet chips at 10Gig expose
> 64 hardware queues, some new intel hardware exposes 2000+. How do these
> queues work relative to these algorithms?
>
> We have generally found hw mq to be far less of a benefit than the
> manufacturers think, especially as regard to
> lower latency or reduced cpu usage (as cache crossing is a bear).
> There is a lot of software work in this area left to be done, however
> they are needed to match queues to cpus (and tenants)
>
> Until sch_pie gained timestamping support recently, the rate estimator
> did not work correctly in a hw mq environment. Haven't looked over
> dualpi in this respect.
>
>
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
>
>

[-- Attachment #2: Type: text/html, Size: 31352 bytes --]


* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
       [not found]       ` <bad22a6b-698d-c85f-b829-6b5391833a1e@erg.abdn.ac.uk>
@ 2020-04-28 19:38         ` Luca Muscariello
  2020-04-28 19:43           ` Black, David
  2020-04-28 20:33         ` Sebastian Moeller
  1 sibling, 1 reply; 11+ messages in thread
From: Luca Muscariello @ 2020-04-28 19:38 UTC (permalink / raw)
  To: Gorry Fairhurst; +Cc: Holland, Jake, tsvwg IETF list, bloat

[-- Attachment #1: Type: text/plain, Size: 20793 bytes --]

The link is that the L queue starves the other queue, and indeed the
envisaged queue protection mechanism is supposed to react to that
behavior by black-listing the misbehaving sender.
This would be the third coupled component in L4S (the senders, the AQM
and the policer), which is currently non-mandatory.

The level of starvation is a parameter in dualQ, as the service ratio
between the two queues has to be set by the AQM owner. How to set this
ratio is yet another knob that is unclear how to set optimally under a
general traffic mix that includes unresponsive traffic.

I have already raised this issue in the past.
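
To make that knob concrete, here is a minimal sketch (plain weighted round
robin, NOT the actual dualQ scheduler, whose coupling and conditional
priority are more subtle; all names and numbers are mine) of what a fixed
L:C service ratio means once an unresponsive ECT(1) sender keeps the L
queue permanently backlogged:

from collections import deque

def simulate(ratio_l_to_c=4, backlog=10_000):
    # Both queues stay backlogged: the L-queue sender ignores marks,
    # the classic sender is a long-running flow.
    l_q = deque(range(backlog))
    c_q = deque(range(backlog))
    served = {"L": 0, "C": 0}
    while l_q and c_q:
        for _ in range(ratio_l_to_c):   # serve `ratio` L packets...
            if l_q:
                l_q.popleft()
                served["L"] += 1
        c_q.popleft()                   # ...then one classic packet
        served["C"] += 1
    total = served["L"] + served["C"]
    return served["L"] / total, served["C"] / total

print(simulate(4))   # -> (0.8, 0.2): the classic queue is pinned to
                     #    1/(ratio+1) of the link, however politely
                     #    its traffic behaves

Whatever the ratio is set to, that residual share is all the protection the
classic queue gets, which is exactly the starvation knob I am referring to.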



On Tue, Apr 28, 2020 at 9:26 PM Gorry Fairhurst <gorry@erg.abdn.ac.uk>
wrote:

> This seems all interesting, but isn't this true of any network technology.
> If I use a UDP app with my own style of CC I can just take all the capacity
> if I want.
>
> A solution could be to apply a circuit-breaker/policer in the network to
> perform admission control, but I don't see the link to L4S. Have I missed
> something ?
>
> Gorry
> On 28/04/2020 20:04, Luca Muscariello wrote:
>
> Hi Jake,
>
> Thanks for the notes. Very useful.
> The other issue with the meeting was that the virtual mic queue control
> channel was the WebEx Meeting chat that does not exist in WebEx Teams. So,
> I had to switch to Meetings and lost some pieces of the discussion.
>
> Yes there might be a terminology difference. Elastic traffic is usually
> used in the sense of bandwidth sharing not just to define variable bit
> rates.
>
> The point is that there are incentives to cheat in L4S.
>
> There is a priority queue that my application can enter by providing as
> input ECT(1).
> Applications such as on-line meetings will have a relatively low and
> highly paced rate.
>
> This traffic is conformant to dualQ L queue but is unresponsive to
> congestion notifications.
>
> This is especially true for FEC streams which could be used to ameliorate
> the media quality in presence of losses (e.g. Wi-Fi)
> or increased jitter.
>
>
> That was one more point on why using ECT(1) as input assumes trust or a
> black list after being caught.
>
> In both cases the ECT(1) as input is DoSable.
>
>
>
> On Tue, Apr 28, 2020 at 7:12 PM Holland, Jake <jholland@akamai.com> wrote:
>
>> Hi Luca,
>>
>>
>>
>> To your point about the discussion being difficult to follow: I tried to
>> capture the intent of everyone who commented while taking notes:
>>
>> https://etherpad.ietf.org:9009/p/notes-ietf-interim-2020-tsvwg-03
>>
>>
>>
>> I think this was intended to take the place of a need for everyone to
>> re-send the same points to the list, but of course some of the most crucial
>> points could probably use fleshing out with on-list follow up.
>>
>>
>>
>> It got a bit rough in places because I was disconnected a few times and
>> had to cut over to a local text file, and I may have failed to correctly
>> understand or summarize some of the comments, so there’s a chance I might
>> have missed something, but I did my best to capture them all.
>>
>>
>>
>> I encourage people to review comments and check whether they came out
>> more or less correct, and to offer formatting and cleanup suggestions if
>> there’s a good way to make it easier to follow.
>>
>>
>>
>> I had timestamps at the beginning of each main point of discussion, with
>> the intent that after the video is published it would be easier to go back
>> and check precisely what was said. It looks like someone has been making
>> cleanup edits that removed the first half of those so far, but my local
>> text file still has most of those and I can go back and re-insert them if
>> it seems useful.
>>
>>
>>
>> @Luca: during your comments in particular I think there might have been a
>> disruption--I had a “first comment missed, please check video” placeholder
>> and I may have misunderstood the part about video elasticity, but my
>> interpretation at the time was that Stuart was claiming that video was
>> elastic in that it would adjust downward to avoid overflowing a loaded
>> link, and I thought you were claiming that it was not elastic in that it
>> would not exceed a maximum rate, which I summarized as perhaps a semantic
>> disagreement, but if you’d like to help clean that up, it might be useful.
>>
>>
>>
>> From this message, it sounds like the key point you were making was that
>> it also will not go below a certain rate, and perhaps that quality can stay
>> relatively good in spite of high network loss?
>>
>>
>>
>> Best regards,
>>
>> Jake
>>
>>
>>
>> *From: *Luca Muscariello <muscariello@ieee.org>
>> *Date: *Tuesday, April 28, 2020 at 1:54 AM
>> *To: *Dave Taht <dave.taht@gmail.com>
>> *Cc: *tsvwg IETF list <tsvwg@ietf.org>, bloat <
>> bloat@lists.bufferbloat.net>
>> *Subject: *Re: [Bloat] my backlogged comments on the ECT(1) interim call
>>
>>
>>
>> Hi Dave and list members,
>>
>>
>>
>> It was difficult to follow the discussion at the meeting yesterday.
>>
>> Who  said what in the first place.
>>
>>
>>
>> There have been a lot of non-technical comments such as: this solution
>>
>> is better than another in my opinion. "better" has often been used
>>
>> as when evaluating the taste of an ice cream: White chocolate vs black
>> chocolate.
>>
>> This has taken a significant amount of time at the meeting. I haven't
>> learned
>>
>> much from that kind of discussion and I do not think that helped to make
>>
>> much progress.
>>
>>
>>
>> If people can re-make their points in the list it would help the debate.
>>
>>
>>
>> Another point that a few raised is that we have to make a decision as
>> fast as possible.
>>
>> I dismissed entirely that argument. Trading off latency with resilience
>> of the Internet
>>
>> is entirely against the design principle of the Internet architecture
>> itself.
>>
>> Risk analysis is something that we should keep in mind even when
>> deploying any experiment
>>
>> and should be a substantial part of it.
>>
>>
>>
>> Someone claimed that on-line meeting traffic is elastic. This is not
>> true, I tried to
>>
>> clarify this. These applications (WebEx/Zoom) are low rate; a typical
>> maximum upstream
>>
>> rate is 2Mbps, and it is not elastic. These applications often have a
>> stand-alone app
>>
>> that is not using the browser WebRTC stack (the standalone app typically
>> works better).
>>
>>
>>
>> A client sends upstream one or two video qualities unless the video
>> camera is switched off.
>>
>> In presence of losses, FEC is used but it is still non elastic.
>>
>> Someone claimed (at yesterday's meeting) that fairness is not an issue
>> (who cares, I heard!)
>>
>> Well, fairness can constitute a differentiation advantage between two
>> companies that are
>>
>> commercializing on-line meetings products. Unless at the IETF we accept
>>
>> "law-of-the-jungle" behaviours from Internet applications developers, we
>> should be careful
>>
>> about making such claims.
>>
>> Any opportunity to cheat, that brings a business advantage WILL be used.
>>
>>
>>
>> /Luca
>>
>>
>>
>> TL;DR
>>
>> To Dave: you asked several times what  Cisco does on latency reduction in
>>
>> network equipment. I tend to be very shy when replying on these questions
>>
>> as this is not vendor neutral. If chairs think this is not appropriate for
>>
>> the list, please say it and I'll reply privately only.
>>
>>
>>
>> What I write below can be found in Cisco products data sheets and is not
>>
>> trade secret. There are very good blog posts explaining details.
>>
>> Not surprisingly Cisco implements the state of the art on the topic
>>
>> and it is totally feasible to do-the-right-thing in software and hardware.
>>
>>
>>
>> Cisco implements AFD (one queue + a flow table) accompanied by a priority
>> queue for
>>
>> flows that have a certain profile in rate and size. The concept is well
>> known and well
>>
>> studied in the literature. AFD is safe and can well serve a complex
>> traffic mix when
>>
>> accompanied by a priority queue. This prio-queue should not be confused
>> with a strict
>>
>> priority queue (e.g. EF in diffserv). There are subtleties related to the
>> DOCSIS
>>
>> shared medium which would be too long to describe here.
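>>
>> For list members who have never met AFD: the textbook relation (nothing
>> Cisco specific, and certainly not their implementation) is simply this:
>>
>> def afd_drop_probability(flow_rate_bps, fair_share_bps):
>>     # flows under the fair share are untouched; flows over it are
>>     # dropped (or marked) just enough to pull them back down to it
>>     if flow_rate_bps <= fair_share_bps:
>>         return 0.0
>>     return 1.0 - fair_share_bps / flow_rate_bps
>>
>> # e.g. a flow estimated at 40 Mbit/s against a 10 Mbit/s fair share
>> # sees ~75% of its arrivals dropped or marked at the single queue
>> print(afd_drop_probability(40e6, 10e6))
>>
>> The per-flow rate estimates come from a small sampled flow table, not
>> from per-flow queues, which is why one FIFO plus a flow table is enough.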
>>
>>
>>
>> This is available in Cisco CMTS for the DOCSIS segment. Bottleneck traffic
>>
>> does not negatively impact non-bottlenecked-traffic such as an on-line
>> meeting like
>>
>> the WebEx call we had yesterday. It is safe from a network neutrality
>> point-of-view
>>
>> and no applications get hurt.
>>
>>
>>
>> Cisco implements AFD+prio also for some DC switches such as the Nexus 9k.
>> There
>>
>> is a blog post written by Tom Edsall online that explains pretty well how
>> that works.
>>
>> This includes mechanisms such as p-fabric to approximate SRPT (shortest
>> remaining processing time)
>>
>> and minimize flow completion time for many DC workloads. The mix of the
>> two
>>
>> brings FCT minimization AND latency minimization. This is silicon and
>> scales at any speed.
>>
>> For those who are not familiar with these concepts, please search the
>> research work of Balaji
>>
>> Prabhakar and Rong Pan at Stanford.
>>
>>
>>
>> Wi-Fi: Cisco does airtime fairness in Aironet but I think in the Meraki
>> series too.
>>
>> The concept is similar to what is described above, but there are several
>> queues, one per STA.
>>
>> Packets are enqueued in the access (category) queue at dequeue time from
>> the air-time
>>
>> packet scheduler.
>>
>>
>>
>> On Mon, Apr 27, 2020 at 9:24 PM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> It looks like the majority of what I say below is not related to the
>> fate of the "bit". The push to take the bit was
>> strong with this one, and me... can't we deploy more of what we
>> already got in places where it matters?
>>
>> ....
>>
>> so: A) PLEA: From 10 years now, of me working on bufferbloat, working
>> on real end-user and wifi traffic and real networks....
>>
>> I would like folk here to stop benchmarking two flows that run for a long
>> time
>> and in one direction only... and thus exclusively in tcp congestion
>> avoidance mode.
>>
>> Please. just. stop. Real traffic looks nothing like that. The internet
>> looks nothing like that.
>> The netops folk I know just roll their eyes up at benchmarks like this
>> that prove nothing and tell me to go to ripe meetings instead.
>> When y'all talk about "not looking foolish for not mandating ecn now",
>> you've already lost that audience with benchmarks like these.
>>
>> Sure, setup a background flow(s)  like that, but then hit the result
>> with a mix of
>> far more normal traffic? Please? networks are never used unidirectionally
>> and both directions congesting is frequent. To illustrate that problem...
>>
>> I have a really robust benchmark that we have used throughout the
>> bufferbloat
>> project that I would like everyone to run in their environments, the flent
>> "rrul" test. Everybody on both sides has big enough testbeds setup that a
>> few
>> hours spent on doing that - and please add in asymmetric networks
>> especially -
>> and perusing the results ought to be enlightening to everyone as to the
>> kind
>> of problems real people have, on real networks.
>>
>> Can the L4S and SCE folk run the rrul test some day soon? Please?
>>
>> I rather liked this benchmark that tested another traffic mix,
>>
>> (
>> https://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf
>> )
>>
>> although it had many flaws (like not doing dns lookups), I wish it
>> could be dusted off and used to compare this
>> new fangled ecn enabled stuff with the kind of results you can merely get
>> with packet loss and rtt awareness. It would be so great to be able
>> to directly compare all these new algorithms against this benchmark.
>>
>> Adding in a non ecn'd udp based routing protocol on heavily
>> oversubscribed 100mbit link is also enlightening.
>>
>> I'd rather like to see that benchmark improved for a more modernized
>> home traffic mix
>> where it is projected there may be 30 devices on the network on average,
>> in a few years.
>>
>> If there is any one thing y'all can do to reduce my blood pressure and
>> keep me engaged here whilst you
>> debate the end of the internet as I understand it, it would be to run
>> the rrul test as part of all your benchmarks.
>>
>> thank you.
>>
>> B) Stuart Cheshire regaled us with several anecdotes - one concerning
>> his problems
>> with comcast's 1Gbit/35mbit service being unusable, under load, for
>> videoconferencing. This is true. The overbuffering at the CMTSes
>> still, has to be seen to be believed, at all rates. At lower rates
>> it's possible to shape this, with another device (which is what
>> the entire SQM deployment does in self defense and why cake has a
>> specific docsis ingress mode), but it is cpu intensive
>> and requires x86 hardware to do well at rates above 500Mbits, presently.
>>
>> So I wish CMTS makers (Arris and Cisco) were in this room. are they?
>>
>> (Stuart, if you'd like a box that can make your comcast link pleasurable
>> under all workloads, whenever you get back to los gatos, I've got a few
>> lying around. Was so happy to get a few ietfers this past week to apply
>> what's off the shelf for end users today. :)
>>
>> C) I am glad bob said the L4S is finally looking at asymmetric
>> networks, and starting to tackle ack-filtering and accecn issues
>> there.
>>
>> But... I would have *started there*. Asymmetric access is the predominate
>> form
>> of all edge technologies.
>>
>> I would love to see flent rrul test results for 1gig/35mbit, 100/10,
>> 200/10
>> services, in particular. (from SCE also!). "lifeline" service (11/2)
>> would be good
>> to have results on. It would be especially good to have baseline
>> comparison data from the measured, current deployment
>> of the CMTSes at these rates, to start with, with no queue management in
>> play, then pie on the uplink, then fq_codel on the uplink, and then
>> this ecn stuff, and so on.
>>
>> D) The two CPE makers in the room have dismissed both fq and sce as
>> being too difficult to implement. They did say that dualpi was
>> actually implemented in software, not hardware.
>>
>> I would certainly like them to benchmark what they plan to offer in L4S
>> vs what is already available in the edgerouter X, as one low end
>> example among thousands.
>>
>> I also have to note, at higher speeds, all the buffering moves into
>> the wifi and the results are currently ugly. I imagine
>> they are exploring how to fix their wifi stacks also? I wish more folk
>> were using RVR + latency benchmarks like this one:
>>
>>
>> http://flent-newark.bufferbloat.net/~d/Airtime%20based%20queue%20limit%20for%20FQ_CoDel%20in%20wireless%20interface.pdf
>>
>> Same goes for the LTE folk.
>>
>> E) Andrew mcgregor mentioned how great it would be for a closeted
>> musician to
>> be able to play in real time with someone across town. that has been my
>> goal
>> for nearly 30 years now!! And although I rather enjoyed his participation
>> in
>> my last talk on the subject (
>>
>> https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/
>> ) conflating
>> a need for ecn and l4s signalling for low latency audio applications
>> with what I actually said in that talk, kind of hurt. I achieved
>> "my 2ms fiber based guitarist to fiber based drummer dream" 4+ years
>> back with fq_codel and diffserv, no ecn required,
>> no changes to the specs, no mandating packets be undroppable" and
>> would like to rip the opus codec out of that mix one day.
>>
>> F) I agree with jana that changing the definition of RFC3168 to suit
>> the RED algorithm (which is not pi or anything fancy) often present in
>> network switches,
>> today to suit dctcp, works. But you should say "configuring red to
>> have l4s marking style" and document that.
>>
>> Sometimes I try to point out many switches have a form of DRR in them,
>> and it's helpful to use that in conjunction with whatever diffserv
>> markings you trust in your network.
>>
>> To this day I wish someone would publish how much they use DCTCP style
>> signalling on a dc network relative to their other traffic.
>>
>> To this day I keep hoping that someone will publish a suitable
>> set of RED parameters for a wide variety of switches and routers -
>> for the most common switches and ethernet chips, for correct DCTCP usage.
>>
>> Mellanox's example:
>> (
>> https://community.mellanox.com/s/article/howto-configure-ecn-on-mellanox-ethernet-switches--spectrum-x
>> ) is not dctcp specific.
>>
>> many switches have a form of DRR in them, and it's helpful to use that
>> in conjunction with whatever diffserv markings you trust in your
>> network,
>> and, as per the above example, segregate two red queues that way. From
>> what I see
>> above there is no way to differentiate ECT(0) from ECT(1) in that switch.
>> (?)
>>
>> I do keep trying to point out the size of the end user ecn enabled
>> deployment, starting with the data I have from free.fr. Are we
>> building a network for AIs or people?
>>
>> G) Jana also made a point about 2 queues "being enough" (I might be
>> mis-remembering the exact point). Mellanox's ethernet chips at 10Gig
>> expose
>> 64 hardware queues, some new intel hardware exposes 2000+. How do these
>> queues work relative to these algorithms?
>>
>> We have generally found hw mq to be far less of a benefit than the
>> manufacturers think, especially as regard to
>> lower latency or reduced cpu usage (as cache crossing is a bear).
>> There is a lot of software work in this area left to be done, however
>> they are needed to match queues to cpus (and tenants)
>>
>> Until sch_pie gained timestamping support recently, the rate estimator
>> did not work correctly in a hw mq environment. Haven't looked over
>> dualpi in this respect.
>>
>>
>>
>>
>>
>> --
>> Make Music, Not War
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-435-0729
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
>>
>> --
> G. Fairhurst, School of Engineering
>
>

[-- Attachment #2: Type: text/html, Size: 48200 bytes --]


* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
  2020-04-28 19:38         ` [Bloat] [tsvwg] " Luca Muscariello
@ 2020-04-28 19:43           ` Black, David
  2020-04-28 19:59             ` Jonathan Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Black, David @ 2020-04-28 19:43 UTC (permalink / raw)
  To: Luca Muscariello, Gorry Fairhurst; +Cc: tsvwg IETF list, bloat, Black, David

[-- Attachment #1: Type: text/plain, Size: 19987 bytes --]

And I also noted this at the end of the meeting:  “queue protection that might apply the disincentive”

That would send cheaters to the L4S conventional queue along with all the other queue-building traffic.

Thanks, --David

From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Luca Muscariello
Sent: Tuesday, April 28, 2020 3:39 PM
To: Gorry Fairhurst
Cc: tsvwg IETF list; bloat
Subject: Re: [tsvwg] [Bloat] my backlogged comments on the ECT(1) interim call


The link is that the L queue starves the other queue, and indeed the envisaged queue protection mechanism
is supposed to react to that behavior by black-listing the misbehaving sender.
This would be the third coupled component in L4S (the senders, the AQM and the policer). Which is currently non-mandatory.

The level of starvation is a parameter in dualQ as the service ratio between the two queues has to be set
by the AQM owner. How to set this ratio is yet another knob that is
under a general traffic mix,  that includes unresponsive traffic.

I have already raised this issue in the past.



On Tue, Apr 28, 2020 at 9:26 PM Gorry Fairhurst <gorry@erg.abdn.ac.uk<mailto:gorry@erg.abdn.ac.uk>> wrote:

This seems all interesting, but isn't this true of any network technology. If I use a UDP app with my own style of CC I can just take all the capacity if I want.

A solution could be to apply a circuit-breaker/policer in the network to perform admission control, but I don't see the link to L4S. Have I missed something ?

Gorry
On 28/04/2020 20:04, Luca Muscariello wrote:
Hi Jake,

Thanks for the notes. Very useful.
The other issue with the meeting was that the virtual mic queue control channel was the WebEx Meeting chat that does not exist in WebEx Teams. So, I had to switch to Meetings and lost some pieces of the discussion.

Yes there might be a terminology difference. Elastic traffic is usually used in the sense of bandwidth sharing not just to define variable bit rates.

The point is that there are incentives to cheat in L4S.

There is a priority queue that my application can enter by providing as input ECT(1).
Applications such as on-line meetings will have a relatively low and highly paced rate.

This traffic is conformant to dualQ L queue but is unresponsive to congestion notifications.

This is especially true for FEC streams which could be used to ameliorate the media quality in presence of losses(e.g. Wi-Fi)
or increased jitter.


That was one more point on why using ECT(1) as input assumes trust or a black list after being caught.

In both cases the ECT(1) as input is DoSable.



On Tue, Apr 28, 2020 at 7:12 PM Holland, Jake <jholland@akamai.com<mailto:jholland@akamai.com>> wrote:
Hi Luca,

To your point about the discussion being difficult to follow: I tried to capture the intent of everyone who commented while taking notes:
https://etherpad.ietf.org:9009/p/notes-ietf-interim-2020-tsvwg-03

I think this was intended to take the place of a need for everyone to re-send the same points to the list, but of course some of the most crucial points could probably use fleshing out with on-list follow up.

It got a bit rough in places because I was disconnected a few times and had to cut over to a local text file, and I may have failed to correctly understand or summarize some of the comments, so there’s chances I might have missed something, but I did my best to capture them all.

I encourage people to review comments and check whether they came out more or less correct, and to offer formatting and cleanup suggestions if there’s a good way to make it easier to follow.

I had timestamps at the beginning of each main point of discussion, with the intent that after the video is published it would be easier to go back and check precisely what was said. It looks like someone has been making cleanup edits that removed the first half of those so far, but my local text file still has most of those and I can go back and re-insert them if it seems useful.

@Luca: during your comments in particular I think there might have been a disruption--I had a “first comment missed, please check video” placeholder and I may have misunderstood the part about video elasticity, but my interpretation at the time was that Stuart was claiming that video was elastic in that it would adjust downward to avoid overflowing a loaded link, and I thought you were claiming that it was not elastic in that it would not exceed a maximum rate, which I summarized as perhaps a semantic disagreement, but if you’d like to help clean that up, it might be useful.

From this message, it sounds like the key point you were making was that it also will not go below a certain rate, and perhaps that quality can stay relatively good in spite of high network loss?

Best regards,
Jake

From: Luca Muscariello <muscariello@ieee.org<mailto:muscariello@ieee.org>>
Date: Tuesday, April 28, 2020 at 1:54 AM
To: Dave Taht <dave.taht@gmail.com<mailto:dave.taht@gmail.com>>
Cc: tsvwg IETF list <tsvwg@ietf.org<mailto:tsvwg@ietf.org>>, bloat <bloat@lists.bufferbloat.net<mailto:bloat@lists.bufferbloat.net>>
Subject: Re: [Bloat] my backlogged comments on the ECT(1) interim call

Hi Dave and list members,

It was difficult to follow the discussion at the meeting yesterday.
Who  said what in the first place.

There have been a lot of non-technical comments such as: this solution
is better than another in my opinion. "better" has often been used
as when evaluating the taste of an ice cream: White chocolate vs black chocolate.
This has taken a significant amount of time at the meeting. I haven't learned
much from that kind of discussion and I do not think that helped to make
much progress.

If people can re-make their points in the list it would help the debate.

Another point that a few raised is that we have to make a decision as fast as possible.
I dismissed entirely that argument. Trading off latency with resilience of the Internet
is entirely against the design principle of the Internet architecture itself.
Risk analysis is something that we should keep in mind even when deploying any experiment
and should be a substantial part of it.

Someone claimed that on-line meeting traffic is elastic. This is not true, I tried to
clarify this. These applications (WebEx/Zoom) are low rate, a typical maximum upstream
rate is 2Mbps and is not elastic. These applications have often a stand-alone app
that is not using the browser WebRTC stack (the standalone app typically works better).

A client sends upstream one or two video qualities unless the video camera is switched off.
In presence of losses, FEC is used but it is still non elastic.
Someone claimed (at yesterday's meeting) that fairness is not an issue (who cares, I heard!)
Well, fairness can constitute a differentiation advantage between two companies that are
commercializing on-line meetings products. Unless at the IETF we accept
"law-of-the-jungle" behaviours from Internet applications developers, we should be careful
about making such claims.
Any opportunity to cheat, that brings a business advantage WILL be used.

/Luca

TL;DR
To Dave: you asked several times what  Cisco does on latency reduction in
network equipment. I tend to be very shy when replying on these questions
as this is not vendor neutral. If chairs think this is not appropriate for
the list, please say it and I'll reply privately only.

What I write below can be found in Cisco products data sheets and is not
trade secret. There are very good blog posts explaining details.
Not surprisingly Cisco implements the state of the art on the topic
and it is totally feasible to do-the-right-thing in software and hardware.

Cisco implements AFD (one queue + a flow table) accompanied by a priority queue for
flows that have a certain profile in rate and size. The concept is well known and well
studied in the literature. AFD is safe and can well serve a complex traffic mix when
accompanied by a priority queue. This prio-queue should not be confused with a strict
priority queue (e.g. EF in diffserv). There are subtleties related to the DOCSIS
shared medium which would be too long to describe here.

This is available in Cisco CMTS for the DOCSIS segment. Bottleneck traffic
does not negatively impact non-bottlenecked-traffic such as an on-line meeting like
the WebEx call we had yesterday. It is safe from a network neutrality point-of-view
and no applications get hurt.

Cisco implements AFD+prio also for some DC switches such as the Nexus 9k. There
is a blog post written by Tom Edsall online that explains pretty well how that works.
This includes mechanisms such as p-fabric to approximate SRPT (shortest remaining processing time)
and minimize flow completion time for many DC workloads. The mix of the two
brings FCT minimization AND latency minimization. This is silicon and scales at any speed.
For those who are not familiar with these concepts, please search the research work of Balaji
Prabhakar and Rong Pan at Stanford.

Wi-Fi: Cisco does airtime fairness in Aironet but I think in the Meraki series too.
The concept is similar to what described above but there are several queues, one per STA.
Packets are enqueued in the access (category) queue at dequeue time from the air-time
packet scheduler.

On Mon, Apr 27, 2020 at 9:24 PM Dave Taht <dave.taht@gmail.com<mailto:dave.taht@gmail.com>> wrote:
It looks like the majority of what I say below is not related to the
fate of the "bit". The push to take the bit was
strong with this one, and me... can't we deploy more of what we
already got in places where it matters?

....

so: A) PLEA: From 10 years now, of me working on bufferbloat, working
on real end-user and wifi traffic and real networks....

I would like folk here to stop benchmarking two flows that run for a long time
and in one direction only... and thus exclusively in tcp congestion
avoidance mode.

Please. just. stop. Real traffic looks nothing like that. The internet
looks nothing like that.
The netops folk I know just roll their eyes up at benchmarks like this
that prove nothing and tell me to go to ripe meetings instead.
When y'all talk about "not looking foolish for not mandating ecn now",
you've already lost that audience with benchmarks like these.

Sure, setup a background flow(s)  like that, but then hit the result
with a mix of
far more normal traffic? Please? networks are never used unidirectionally
and both directions congesting is frequent. To illustrate that problem...

I have a really robust benchmark that we have used throughout the bufferbloat
project that I would like everyone to run in their environments, the flent
"rrul" test. Everybody on both sides has big enough testbeds setup that a few
hours spent on doing that - and please add in asymmetric networks especially -
and perusing the results ought to be enlightening to everyone as to the kind
of problems real people have, on real networks.

Can the L4S and SCE folk run the rrul test some day soon? Please?

I rather liked this benchmark that tested another traffic mix,

( https://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf )

although it had many flaws (like not doing dns lookups), I wish it
could be dusted off and used to compare this
new fangled ecn enabled stuff with the kind of results you can merely get
with packet loss and rtt awareness. It would be so great to be able
to directly compare all these new algorithms against this benchmark.

Adding in a non ecn'd udp based routing protocol on heavily
oversubscribed 100mbit link is also enlightening.

I'd rather like to see that benchmark improved for a more modernized
home traffic mix
where it is projected there may be 30 devices on the network on average,
in a few years.

If there is any one thing y'all can do to reduce my blood pressure and
keep me engaged here whilst you
debate the end of the internet as I understand it, it would be to run
the rrul test as part of all your benchmarks.

thank you.

B) Stuart Cheshire regaled us with several anecdotes - one concerning
his problems
with comcast's 1Gbit/35mbit service being unusable, under load, for
videoconferencing. This is true. The overbuffering at the CMTSes
still, has to be seen to be believed, at all rates. At lower rates
it's possible to shape this, with another device (which is what
the entire SQM deployment does in self defense and why cake has a
specific docsis ingress mode), but it is cpu intensive
and requires x86 hardware to do well at rates above 500Mbits, presently.

So I wish CMTS makers (Arris and Cisco) were in this room. are they?

(Stuart, if you'd like a box that can make your comcast link pleasurable
under all workloads, whenever you get back to los gatos, I've got a few
lying around. Was so happy to get a few ietfers this past week to apply
what's off the shelf for end users today. :)

C) I am glad bob said the L4S is finally looking at asymmetric
networks, and starting to tackle ack-filtering and accecn issues
there.

But... I would have *started there*. Asymmetric access is the predominate form
of all edge technologies.

I would love to see flent rrul test results for 1gig/35mbit, 100/10, 200/10
services, in particular. (from SCE also!). "lifeline" service (11/2)
would be good
to have results on. It would be especially good to have baseline
comparison data from the measured, current deployment
of the CMTSes at these rates, to start with, with no queue management in
play, then pie on the uplink, then fq_codel on the uplink, and then
this ecn stuff, and so on.

D) The two CPE makers in the room have dismissed both fq and sce as
being too difficult to implement. They did say that dualpi was
actually implemented in software, not hardware.

I would certainly like them to benchmark what they plan to offer in L4S
vs what is already available in the edgerouter X, as one low end
example among thousands.

I also have to note, at higher speeds, all the buffering moves into
the wifi and the results are currently ugly. I imagine
they are exploring how to fix their wifi stacks also? I wish more folk
were using RVR + latency benchmarks like this one:

http://flent-newark.bufferbloat.net/~d/Airtime%20based%20queue%20limit%20for%20FQ_CoDel%20in%20wireless%20interface.pdf

Same goes for the LTE folk.

E) Andrew mcgregor mentioned how great it would be for a closeted musician to
be able to play in real time with someone across town. that has been my goal
for nearly 30 years now!! And although I rather enjoyed his participation in
my last talk on the subject (
https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/
) conflating
a need for ecn and l4s signalling for low latency audio applications
with what I actually said in that talk, kind of hurt. I achieved
"my 2ms fiber based guitarist to fiber based drummer dream" 4+ years
back with fq_codel and diffserv, no ecn required,
no changes to the specs, no mandating packets be undroppable" and
would like to rip the opus codec out of that mix one day.

F) I agree with jana that changing the definition of RFC3168 to suit
the RED algorithm (which is not pi or anything fancy) often present in
network switches,
today to suit dctcp, works. But you should say "configuring red to
have l4s marking style" and document that.

Sometimes I try to point out many switches have a form of DRR in them,
and it's helpful to use that in conjunction with whatever diffserv
markings you trust in your network.

To this day I wish someone would publish how much they use DCTCP style
signalling on a dc network relative to their other traffic.

To this day I keep hoping that someone will publish a suitable
set of RED parameters for a wide variety of switches and routers -
for the most common switches and ethernet chips, for correct DCTCP usage.

Mellanox's example:
( https://community.mellanox.com/s/article/howto-configure-ecn-on-mellanox-ethernet-switches--spectrum-x
) is not dctcp specific.

many switches have a form of DRR in them, and it's helpful to use that
in conjunction with whatever diffserv markings you trust in your
network,
and, as per the above example, segregate two red queues that way. From
what I see
above there is no way to differentiate ECT(0) from ECT(1) in that switch. (?)

I do keep trying to point out the size of the end user ecn enabled
deployment, starting with the data I have from free.fr. Are we
building a network for AIs or people?

G) Jana also made a point about 2 queues "being enough" (I might be
mis-remembering the exact point). Mellanox's ethernet chips at 10Gig expose
64 hardware queues, some new intel hardware exposes 2000+. How do these
queues work relative to these algorithms?

We have generally found hw mq to be far less of a benefit than the
manufacturers think, especially as regard to
lower latency or reduced cpu usage (as cache crossing is a bear).
There is a lot of software work in this area left to be done, however
they are needed to match queues to cpus (and tenants)

Until sch_pie gained timestamping support recently, the rate estimator
did not work correctly in a hw mq environment. Haven't looked over
dualpi in this respect.





--
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net<mailto:Bloat@lists.bufferbloat.net>
https://lists.bufferbloat.net/listinfo/bloat

--

G. Fairhurst, School of Engineering

[-- Attachment #2: Type: text/html, Size: 40391 bytes --]


* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
  2020-04-28 19:43           ` Black, David
@ 2020-04-28 19:59             ` Jonathan Morton
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Morton @ 2020-04-28 19:59 UTC (permalink / raw)
  To: Black, David; +Cc: Luca Muscariello, Gorry Fairhurst, tsvwg IETF list, bloat

> On 28 Apr, 2020, at 10:43 pm, Black, David <David.Black@dell.com> wrote:
> 
> And I also noted this at the end of the meeting:  “queue protection that might apply the disincentive”
>  
> That would send cheaters to the L4S conventional queue along with all the other queue-building traffic.

Alas, we have not yet seen an integrated implementation of the queue protection mechanism, so we have not been able to test its effectiveness.  I think it is part of the extra evidence that would be needed before a decision could be taken in favour of using ECT(1) as an input.
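
For anyone who has not waded through the drafts, the mechanism amounts, as far
as I can tell, to roughly the following cartoon (my paraphrase in python, not
the draft pseudocode; every constant is invented):

from collections import defaultdict

SCORE_THRESHOLD = 30_000   # "congestion volume" in bytes before sanction
AGING_RATE = 100_000       # bytes of score forgiven per second

scores = defaultdict(float)
last_seen = defaultdict(float)

def classify(flow_id, pkt_len, mark_prob, now):
    """Return 'L' or 'Classic' for this packet of flow_id."""
    # Age out past sins so a well-behaved flow recovers over time.
    idle = now - last_seen[flow_id]
    scores[flow_id] = max(0.0, scores[flow_id] - AGING_RATE * idle)
    last_seen[flow_id] = now
    # Charge the flow in proportion to the congestion it causes: bytes
    # sent, weighted by the current L-queue marking probability.
    scores[flow_id] += pkt_len * mark_prob
    return "Classic" if scores[flow_id] > SCORE_THRESHOLD else "L"

Whether something along these lines actually protects the classic queue under
the traffic mixes Luca and Sebastian describe is precisely the evidence I would
want to see before treating it as a safeguard.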

I would also note in this context that mere volume of data, or length of development, are not marks that should be taken in favour of a proposal.  The relevance, quality, thoroughness and results of data collection must be carefully evaluated, and it could easily be argued that a lengthy development cycle that still has not produced reliable results should be retired, to avoid throwing good money after bad.  The fact that we were able to find serious problems with the (only?) reference implementation of L4S using a relatively small, but independently selected test suite does not lend confidence in its maturity.

Reputable engineers know that it is necessary to establish a robust design first.  Only then can a robust implementation be hoped for.  It is the basic design decision, over the semantics of each ECN codepoint, that we were trying to discuss yesterday.  I'm not certain that everyone in the room understood that.

 - Jonathan Morton



* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
       [not found]       ` <bad22a6b-698d-c85f-b829-6b5391833a1e@erg.abdn.ac.uk>
  2020-04-28 19:38         ` [Bloat] [tsvwg] " Luca Muscariello
@ 2020-04-28 20:33         ` Sebastian Moeller
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Moeller @ 2020-04-28 20:33 UTC (permalink / raw)
  To: Gorry Fairhurst; +Cc: Luca Muscariello, Holland, Jake, tsvwg IETF list, bloat



> On Apr 28, 2020, at 21:26, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
> 
> This seems all interesting, but isn't this true of any network technology. If I use a UDP app with my own style of CC I can just take all the capacity if I want.
> 
> A solution could be to apply a circuit-breaker/policer in the network to perform admission control, but I don't see the link to L4S. Have I missed something ?

	Maybe that the purposefully shallow LL-queue will make it especially easy to drive it into overload-reactive pure dropping mode? The designed-in (lack of) burst tolerance* really amounts to not expecting bursts in the LL-queue at all. Disruption will be possible with surprisingly small bursts directed at the LL-queue (and due to ECT(1) without admission control this will be a target even script kiddies cannot miss), which will make the circuit-breaker/policer interventions a bit of a gamble (if you throttle all traffic you basically compromise low-latency, low-loss and throughput, but if you try to isolate the offenders, you are in FQ territory and that is verboten for L4S).
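
To put a number on "surprisingly small" (a back-of-envelope python sketch, assuming the oft-quoted ~1 ms delay target for the L queue; scale to taste):

link_rate_bps = 100e6      # a 100 Mbit/s access link
target_delay_s = 1e-3      # ~1 ms L-queue delay target
mtu = 1500
queue_bytes = link_rate_bps / 8 * target_delay_s
print(queue_bytes, queue_bytes / mtu)
# -> 12500.0 bytes, i.e. roughly 8 full-size packets: a burst of a few
#    dozen ECT(1)-marked packets is already several times the L queue's
#    whole delay budget.
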
But we have discussed the almost naive threat modelling of the L4S RFCs before. To which the response was that a user can always implement queue protection and (D)DOS is possible even today, so L4S is not making the situation worse (which is a) a low bar to clear, and b) more than can be said about RTT-dependence).


Best Regards
	Sebastian



*) The dual queue coupled AQM removed PIE's additional burst tolerance mode; I do not claim that that in itself is a problem, or that that mode would buy much more resilience, but it demonstrates a rather haphazard approach to engineering IMHO. But see Pete's recent message about how L4S copes with bursty traffic.


> 
> Gorry
> 
> On 28/04/2020 20:04, Luca Muscariello wrote:
>> Hi Jake,
>> 
>> Thanks for the notes. Very useful.
>> The other issue with the meeting was that the virtual mic queue control channel was the WebEx Meeting chat that does not exist in WebEx Teams. So, I had to switch to Meetings and lost some pieces of the discussion. 
>> 
>> Yes there might be a terminology difference. Elastic traffic is usually used in the sense of bandwidth sharing not just to define variable bit rates.
>> 
>> The point is that there are incentives to cheat in L4S.
>> 
>> There is a priority queue that my application can enter by providing as input ECT(1). 
>> Applications such as on-line meetings will have a relatively low and highly paced rate.
>> 
>> This traffic is conformant to dualQ L queue but is unresponsive to congestion notifications.  
>> 
>> This is especially true for FEC streams which could be used to ameliorate the media quality in presence of losses(e.g. Wi-Fi)
>> or increased jitter.
>> 
>> 
>> That was one more point on why using ECT(1) as input assumes trust or a black list after being caught.
>> 
>> In both cases the ECT(1) as input is DoSable.
>> 
>> 
>> 
>> On Tue, Apr 28, 2020 at 7:12 PM Holland, Jake <jholland@akamai.com> wrote:
>> Hi Luca,
>> 
>>  
>> To your point about the discussion being difficult to follow: I tried to capture the intent of everyone who commented while taking notes:
>> 
>> https://etherpad.ietf.org:9009/p/notes-ietf-interim-2020-tsvwg-03
>> 
>>  
>> I think this was intended to take the place of a need for everyone to re-send the same points to the list, but of course some of the most crucial points could probably use fleshing out with on-list follow up.
>> 
>>  
>> It got a bit rough in places because I was disconnected a few times and had to cut over to a local text file, and I may have failed to correctly understand or summarize some of the comments, so there’s chances I might have missed something, but I did my best to capture them all.
>> 
>>  
>> I encourage people to review comments and check whether they came out more or less correct, and to offer formatting and cleanup suggestions if there’s a good way to make it easier to follow.
>> 
>>  
>> I had timestamps at the beginning of each main point of discussion, with the intent that after the video is published it would be easier to go back and check precisely what was said. It looks like someone has been making cleanup edits that removed the first half of those so far, but my local text file still has most of those and I can go back and re-insert them if it seems useful.
>> 
>>  
>> @Luca: during your comments in particular I think there might have been a disruption--I had a “first comment missed, please check video” placeholder and I may have misunderstood the part about video elasticity, but my interpretation at the time was that Stuart was claiming that video was elastic in that it would adjust downward to avoid overflowing a loaded link, and I thought you were claiming that it was not elastic in that it would not exceed a maximum rate, which I summarized as perhaps a semantic disagreement, but if you’d like to help clean that up, it might be useful.
>> 
>>  
>> From this message, it sounds like the key point you were making was that it also will not go below a certain rate, and perhaps that quality can stay relatively good in spite of high network loss?
>> 
>>  
>> Best regards,
>> 
>> Jake
>> 
>>  
>> From: Luca Muscariello <muscariello@ieee.org>
>> Date: Tuesday, April 28, 2020 at 1:54 AM
>> To: Dave Taht <dave.taht@gmail.com>
>> Cc: tsvwg IETF list <tsvwg@ietf.org>, bloat <bloat@lists.bufferbloat.net>
>> Subject: Re: [Bloat] my backlogged comments on the ECT(1) interim call
>> 
>>  
>> Hi Dave and list members,
>> 
>>  
>> It was difficult to follow the discussion at the meeting yesterday.
>> 
>> Who  said what in the first place.
>> 
>>  
>> There have been a lot of non-technical comments such as: this solution
>> 
>> is better than another in my opinion. "better" has often been used
>> 
>> as when evaluating the taste of an ice cream: White chocolate vs black chocolate.
>> 
>> This has taken a significant amount of time at the meeting. I haven't learned
>> 
>> much from that kind of discussion and I do not think that helped to make 
>> 
>> much progress.
>> 
>>  
>> If people can re-make their points in the list it would help the debate.
>> 
>>  
>> Another point that a few raised is that we have to make a decision as fast as possible.
>> 
>> I dismissed entirely that argument. Trading off latency with resilience of the Internet
>> 
>> is entirely against the design principle of the Internet architecture itself.
>> 
>> Risk analysis is something that we should keep in mind even when deploying any experiment
>> 
>> and should be a substantial part of it. 
>> 
>>  
>> Someone claimed that on-line meeting traffic is elastic. This is not true, I tried to
>> 
>> clarify this. These applications (WebEx/Zoom) are low rate, a typical maximum upstream
>> 
>> rate is 2Mbps and is not elastic. These applications have often a stand-alone app
>> 
>> that is not using the browser WebRTC stack (the standalone app typically works better).
>> 
>>  
>> A client sends upstream one or two video qualities unless the video camera is switched off. 
>> 
>> In presence of losses, FEC is used but it is still non elastic.
>> 
>> Someone claimed (at yesterday's meeting) that fairness is not an issue (who cares, I heard!)
>> 
>> Well, fairness can constitute a differentiation advantage between two companies that are 
>> 
>> commercializing on-line meetings products. Unless at the IETF we accept 
>> 
>> "law-of-the-jungle" behaviours from Internet applications developers, we should be careful
>> 
>> about making such claims.
>> 
>> Any opportunity to cheat, that brings a business advantage WILL be used.
>> 
>>  
>> /Luca
>> 
>>  
>> TL;DR
>> 
>> To Dave: you asked several times what  Cisco does on latency reduction in
>> 
>> network equipment. I tend to be very shy when replying on these questions
>> 
>> as this is not vendor neutral. If chairs think this is not appropriate for
>> 
>> the list, please say it and I'll reply privately only.
>> 
>>  
>> What I write below can be found in Cisco products data sheets and is not
>> 
>> trade secret. There are very good blog posts explaining details.
>> 
>> Not surprisingly Cisco implements the state of the art on the topic
>> 
>> and it is totally feasible to do-the-right-thing in software and hardware.
>> 
>>  
>> Cisco implements AFD (one queue + a flow table) accompanied by a priority queue for 
>> 
>> flows that have a certain profile in rate and size. The concept is well known and well
>> 
>> studied in the literature. AFD is safe and can well serve a complex traffic mix when 
>> 
>> accompanied by a priority queue. This prio-queue should not be confused with a strict
>> 
>> priority queue (e.g. EF in diffserv). There are subtleties related to the DOCSIS
>> 
>> shared medium which would be too long to describe here.
>> 
>>  
>> This is available in Cisco CMTS for the DOCSIS segment. Bottleneck traffic
>> 
>> does not negatively impact non-bottlenecked-traffic such as an on-line meeting like
>> 
>> the WebEx call we had yesterday. It is safe from a network neutrality point-of-view
>> 
>> and no applications get hurt. 
>> 
>>  
>> Cisco implements AFD+prio also for some DC switches such as the Nexus 9k. There
>> 
>> is a blog post written by Tom Edsall online that explains pretty well how that works.
>> 
>> This includes mechanisms such as p-fabric to approximate SRPT (shortest remaining processing time)
>> 
>> and minimize flow completion time for many DC workloads. The mix of the two
>> 
>> brings FCT minimization AND latency minimization. This is silicon and scales at any speed.
>> 
>> For those who are not familiar with these concepts, please search the research work of Balaji 
>> 
>> Prabhakar and Rong Pan at Stanford.
>> 
>>  
>> Wi-Fi: Cisco does airtime fairness in Aironet but I think in the Meraki series too.
>> 
>> The concept is similar to what described above but there are several queues, one per STA.
>> 
>> Packets are enqueued in the access (category) queue at dequeue time from the air-time
>> 
>> packet scheduler. 
>> 
>>  
>> On Mon, Apr 27, 2020 at 9:24 PM Dave Taht <dave.taht@gmail.com> wrote:
>> 
>> It looks like the majority of what I say below is not related to the
>> fate of the "bit". The push to take the bit was
>> strong with this one, and me... can't we deploy more of what we
>> already got in places where it matters?
>> 
>> ....
>> 
>> so: A) PLEA: From 10 years now, of me working on bufferbloat, working
>> on real end-user and wifi traffic and real networks....
>> 
>> I would like folk here to stop benchmarking two flows that run for a long time
>> and in one direction only... and thus exclusively in tcp congestion
>> avoidance mode.
>> 
>> Please. just. stop. Real traffic looks nothing like that. The internet
>> looks nothing like that.
>> The netops folk I know just roll their eyes up at benchmarks like this
>> that prove nothing and tell me to go to ripe meetings instead.
>> When y'all talk about "not looking foolish for not mandating ecn now",
>> you've already lost that audience with benchmarks like these.
>> 
>> Sure, setup a background flow(s)  like that, but then hit the result
>> with a mix of
>> far more normal traffic? Please? networks are never used unidirectionally
>> and both directions congesting is frequent. To illustrate that problem...
>> 
>> I have a really robust benchmark that we have used throughout the bufferbloat
>> project that I would like everyone to run in their environments, the flent
>> "rrul" test. Everybody on both sides has big enough testbeds setup that a few
>> hours spent on doing that - and please add in asymmetric networks especially -
>> and perusing the results ought to be enlightening to everyone as to the kind
>> of problems real people have, on real networks.
>> 
>> Can the L4S and SCE folk run the rrul test some day soon? Please?
>> 
>> I rather liked this benchmark that tested another traffic mix,
>> 
>> ( https://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf )
>> 
>> although it had many flaws (like not doing dns lookups), I wish it
>> could be dusted off and used to compare this
>> new fangled ecn enabled stuff with the kind of results you can merely get
>> with packet loss and rtt awareness. It would be so great to be able
>> to directly compare all these new algorithms against this benchmark.
>> 
>> Adding in a non ecn'd udp based routing protocol on heavily
>> oversubscribed 100mbit link is also enlightening.
>> 
>> I'd rather like to see that benchmark improved for a more modernized
>> home traffic mix
>> where it is projected there may be 30 devices on the network on average,
>> in a few years.
>> 
>> If there is any one thing y'all can do to reduce my blood pressure and
>> keep me engaged here whilst you
>> debate the end of the internet as I understand it, it would be to run
>> the rrul test as part of all your benchmarks.
>> 
>> thank you.
>> 
>> B) Stuart Cheshire regaled us with several anecdotes - one concerning
>> his problems
>> with comcast's 1Gbit/35mbit service being unusable, under load, for
>> videoconferencing. This is true. The overbuffering at the CMTSes
>> still, has to be seen to be believed, at all rates. At lower rates
>> it's possible to shape this, with another device (which is what
>> the entire SQM deployment does in self defense and why cake has a
>> specific docsis ingress mode), but it is cpu intensive
>> and requires x86 hardware to do well at rates above 500Mbits, presently.
>> 
>> So I wish CMTS makers (Arris and Cisco) were in this room. are they?
>> 
>> (Stuart, if you'd like a box that can make your comcast link pleasurable
>> under all workloads, whenever you get back to los gatos, I've got a few
>> lying around. Was so happy to get a few ietfers this past week to apply
>> what's off the shelf for end users today. :)
>> 
>> C) I am glad bob said the L4S is finally looking at asymmetric
>> networks, and starting to tackle ack-filtering and accecn issues
>> there.
>> 
>> But... I would have *started there*. Asymmetric access is the predominate form
>> of all edge technologies.
>> 
>> I would love to see flent rrul test results for 1gig/35mbit, 100/10, 200/10
>> services, in particular. (from SCE also!). "lifeline" service (11/2)
>> would be good
>> to have results on. It would be especially good to have baseline
>> comparison data from the measured, current deployment
>> of the CMTSes at these rates, to start with, with no queue management in
>> play, then pie on the uplink, then fq_codel on the uplink, and then
>> this ecn stuff, and so on.
>> 
>> D) The two CPE makers in the room have dismissed both fq and sce as
>> being too difficult to implement. They did say that dualpi was
>> actually implemented in software, not hardware.
>> 
>> I would certainly like them to benchmark what they plan to offer in L4S
>> vs what is already available in the edgerouter X, as one low end
>> example among thousands.
>> 
>> I also have to note, at higher speeds, all the buffering moves into
>> the wifi and the results are currently ugly. I imagine
>> they are exploring how to fix their wifi stacks also? I wish more folk
>> were using RVR + latency benchmarks like this one:
>> 
>> http://flent-newark.bufferbloat.net/~d/Airtime%20based%20queue%20limit%20for%20FQ_CoDel%20in%20wireless%20interface.pdf
>> 
>> Same goes for the LTE folk.
>> 
>> E) Andrew mcgregor mentioned how great it would be for a closeted musician to
>> be able to play in real time with someone across town. that has been my goal
>> for nearly 30 years now!! And although I rather enjoyed his participation in
>> my last talk on the subject (
>> https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/
>> ) conflating
>> a need for ecn and l4s signalling for low latency audio applications
>> with what I actually said in that talk, kind of hurt. I achieved
>> "my 2ms fiber based guitarist to fiber based drummer dream" 4+ years
>> back with fq_codel and diffserv, no ecn required,
>> no changes to the specs, no mandating packets be undroppable" and
>> would like to rip the opus codec out of that mix one day.
>> 
>> F) I agree with jana that changing the definition of RFC3168 to suit
>> the RED algorithm (which is not pi or anything fancy) often present in
>> network switches,
>> today to suit dctcp, works. But you should say "configuring red to
>> have l4s marking style" and document that.
>> 
>> Sometimes I try to point out many switches have a form of DRR in them,
>> and it's helpful to use that in conjunction with whatever diffserv
>> markings you trust in your network.
>> 
>> To this day I wish someone would publish how much they use DCTCP style
>> signalling on a dc network relative to their other traffic.
>> 
>> To this day I keep hoping that someone will publish a suitable
>> set of RED parameters for a wide variety of switches and routers -
>> for the most common switches and ethernet chips, for correct DCTCP usage.
>> 
>> Mellonox's example:
>> ( https://community.mellanox.com/s/article/howto-configure-ecn-on-mellanox-ethernet-switches--spectrum-x
>> ) is not dctcp specific.
>> 
>> many switches have a form of DRR in them, and it's helpful to use that
>> in conjunction with whatever diffserv markings you trust in your
>> network,
>> and, as per the above example, segregate two red queues that way. From
>> what I see
>> above there is no way to differentiate ECT(0) from ECT(1) in that switch. (?)
>> 
>> I do keep trying to point out the size of the end user ecn enabled
>> deployment, starting with the data I have from free.fr. Are we
>> building a network for AIs or people?
>> 
>> G) Jana also made a point about 2 queues "being enough" (I might be
>> mis-remembering the exact point). Mellonoxes ethernet chips at 10Gig expose
>> 64 hardware queues, some new intel hardware exposes 2000+. How do these
>> queues work relative to these algorithms?
>> 
>> We have generally found hw mq to be far less of a benefit than the
>> manufacturers think, especially as regard to
>> lower latency or reduced cpu usage (as cache crossing is a bear).
>> There is a lot of software work in this area left to be done, however
>> they are needed to match queues to cpus (and tenants)
>> 
>> Until sch_pie gained timestamping support recently, the rate estimator
>> did not work correctly in a hw mq environment. Haven't looked over
>> dualpi in this respect.
>> 
>> 
>> 
>> 
>> 
>> -- 
>> Make Music, Not War
>> 
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-435-0729
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
>> 
> -- 
> G. Fairhurst, School of Engineering
> 



* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
  2020-04-28 19:04     ` Luca Muscariello
       [not found]       ` <bad22a6b-698d-c85f-b829-6b5391833a1e@erg.abdn.ac.uk>
@ 2020-04-29  8:44       ` Rodney W. Grimes
  2020-04-29  9:25         ` Luca Muscariello
  1 sibling, 1 reply; 11+ messages in thread
From: Rodney W. Grimes @ 2020-04-29  8:44 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: Holland, Jake, tsvwg IETF list, bloat

Hello Luca, tsvwg'ers,

	I believe that there is some confusion about how video
conference streams, and video *streams* in general, differ from other
forms of traffic.  I believe some of that confusion comes not only
from the FEC that many of them use but also from the terms
"elastic", "greedy" and "capacity seeking."

	Though video streams *do* adapt to network conditions, they
do so in fixed consumption steps; this is the elastic nature of a
video stream.  They do not continually seek to find full bandwidth;
that is a greedy or capacity-seeking flow, which video streams are *not*.

	There is a difference between watching a video and downloading
a video on the internet.

	The above are *rough* statements, as the details are much more
involved, with things like the traffic burst for the next frame chunk
and other techniques that have come onto the market.  I would love to
hear from an expert on the current true nature, but there were
certainly some mis-statements about video conference streams during
the meeting.
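
	To make the distinction concrete, here is a toy sketch in
Python; the bitrate ladder, link rate, and step sizes are invented for
illustration and are not any particular product's algorithm:

    # Step-wise ("elastic") video adaptation vs. a capacity-seeking flow.
    LADDER_KBPS = [300, 750, 1500, 2500]   # fixed encoding steps (assumed)

    def video_rate(measured_kbps: float) -> int:
        """Pick the highest ladder step that fits; never probe beyond it."""
        fitting = [r for r in LADDER_KBPS if r <= measured_kbps]
        return fitting[-1] if fitting else LADDER_KBPS[0]

    def capacity_seeker(rate_kbps: float, congested: bool) -> float:
        """AIMD-style flow: keep growing until loss/marks, then back off."""
        return rate_kbps / 2 if congested else rate_kbps + 100

    # On a 10 Mbit/s link the video sender settles at 2500 kbit/s and
    # stays there; the capacity seeker keeps climbing until it congests.
    print(video_rate(10_000))              # -> 2500
    r = 1000.0
    for _ in range(100):
        r = capacity_seeker(r, congested=r > 10_000)
    print(round(r))                        # sawtooths around the link rate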

Regards,
Rod

> Hi Jake,
> 
> Thanks for the notes. Very useful.
> The other issue with the meeting was that the virtual mic queue control
> channel was the WebEx Meeting chat that does not exist in WebEx Teams. So,
> I had to switch to Meetings and lost some pieces of the discussion.
> 
> Yes there might be a terminology difference. Elastic traffic is usually
> used in the sense of bandwidth sharing not just to define variable bit
> rates.
> 
> The point is that there are incentives to cheat in L4S.
> 
> There is a priority queue that my application can enter by providing as
> input ECT(1).
> Applications such as on-line meetings will have a relatively low and highly
> paced rate.
> 
> This traffic is conformant to dualQ L queue but is unresponsive to
> congestion notifications.
> 
> This is especially true for FEC streams which could be used to ameliorate
> the media quality in presence of losses(e.g. Wi-Fi)
> or increased jitter.
> 
> 
> That was one more point on why using ECT(1) as input assumes trust or a
> black list after being caught.
> 
> In both cases the ECT(1) as input is DoSable.

-- 
Rod Grimes                                                 rgrimes@freebsd.org


* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
  2020-04-29  8:44       ` Rodney W. Grimes
@ 2020-04-29  9:25         ` Luca Muscariello
  2020-04-29  9:46           ` Jonathan Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Luca Muscariello @ 2020-04-29  9:25 UTC (permalink / raw)
  To: Rodney W. Grimes; +Cc: Holland, Jake, tsvwg IETF list, bloat


Hello Rodney,

Yes, I agree. Elasticity may mean different things.

BTW, I hope I made the point about the incentives to cheat, and the risks
that unresponsive traffic poses to L4S when ECT(1) is used as a trusted
input.

My comments are mostly related to my experience with Cisco WebEx, as a
Cisco employee working on that.
I do care about these applications! But I also care about all the other
on-line meeting apps.
I'm sure we all do lately. These apps may have incentives to cheat. Other
apps too.
I know for sure Cisco WebEx does not cheat.

Incentives to cheat will force everyone to cheat. I believe we do not want
that.

So I was talking about real-time video (media in general, including
content sharing) transported over RTP/UDP, not DASH video transported
over HTTP/TCP.

Typically the latter does not use FEC, while the former carries FEC
streams over defined RTP payload types.
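
As a concrete illustration of how low the bar is, here is a minimal
sketch in Python. It assumes a Linux host where the ECN bits set via
IP_TOS are carried on outgoing UDP datagrams; the address and port are
placeholders. Any RTP-over-UDP sender can mark its packets ECT(1) with
one socket option and, under a classifier that trusts ECT(1), land in
the low-latency queue whether or not it responds to congestion:

    import socket

    ECT_1 = 0x01  # ECN field value for ECT(1), the low two bits of TOS

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Set the TOS byte (including the ECN field) for this socket.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, ECT_1)
    # From here on, every datagram goes out carrying ECT(1).
    sock.sendto(b"rtp-ish media payload", ("192.0.2.1", 5004))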

Luca





* Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call
  2020-04-29  9:25         ` Luca Muscariello
@ 2020-04-29  9:46           ` Jonathan Morton
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Morton @ 2020-04-29  9:46 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: Rodney W. Grimes, tsvwg IETF list, bloat

> On 29 Apr, 2020, at 12:25 pm, Luca Muscariello <muscariello@ieee.org> wrote:
> 
> BTW, I hope I made the point about incentives to cheat, and the risks
> for unresponsive traffic for L4S when using ECT(1) as a trusted input.

One scenario that I think hasn't been highlighted yet is the case of a transport which implements 1/p congestion control through CE but marks itself as a "classic" transport.  We don't even have to imagine such a thing; it already exists as DCTCP, so it is trivial for a bad (or merely ignorant) actor to implement.

Such a flow would squeeze out other traffic that correctly responds to CE with MD, and would not be "caught" by queue protection logic designed to protect the latency of the LL queue (as that has no effect on traffic in the classic queue).  It would only be corralled by an AQM which can act to isolate the effects of one flow on others; in this case AF would suffice, but FQ would also work.
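
To make the squeeze-out concrete, here is a toy per-RTT model in Python; the marking fraction, gain, and starting windows are invented for illustration, with one shared queue and no flow isolation, an idealised classic MD response, and an idealised DCTCP-style 1/p response:

    # Two flows share a bottleneck whose AQM CE-marks a fixed fraction of
    # packets whenever the link is saturated. "classic" halves on any mark
    # seen in an RTT; "onep" (1/p, DCTCP-style) cuts by a smoothed fraction.
    CAPACITY = 1000     # packets the link carries per RTT
    MARK_P = 0.05       # fraction CE-marked when the link is saturated
    G = 1.0 / 16        # DCTCP-style alpha gain

    def simulate(rtts=5000):
        classic, onep, alpha = 100.0, 100.0, 0.0
        for _ in range(rtts):
            marked = MARK_P if classic + onep > CAPACITY else 0.0
            # Classic: one multiplicative decrease per marked RTT, else +1.
            classic = max(classic / 2, 1.0) if marked else classic + 1
            # 1/p: smooth the marked fraction, cut in proportion to it.
            alpha = (1 - G) * alpha + G * marked
            onep = max(onep * (1 - alpha / 2), 1.0) if marked else onep + 1
        return classic, onep

    classic, onep = simulate()
    print(f"classic ~{classic:.0f} pkts/RTT, 1/p-style ~{onep:.0f} pkts/RTT")

Run long enough, the classic flow is pinned to a small sawtooth while the 1/p flow holds most of the capacity; that is exactly the behaviour that AF or FQ at the bottleneck would contain.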

This hazard already exists today.  However, the L4S proposal "legitimises" the use of 1/p congestion control using CE, and the subtlety that marking such traffic with a specific classifier is required for effective congestion control is likely to be lost on people focused entirely on their own throughput, as much of the Internet still is.

Using ECT(1) as an output from the network avoids this new hazard, by making it clear that 1/p CC behaviour is only acceptable on signals that unambiguously originate from an AQM which expects and can handle it.  The SCE proposal also inserts AF or FQ protection at these nodes, which serves as a prophylactic against the likes of DCTCP being used inappropriately on the Internet.

 - Jonathan Morton


Thread overview: 11+ messages
2020-04-27 19:24 [Bloat] my backlogged comments on the ECT(1) interim call Dave Taht
2020-04-28  8:53 ` Luca Muscariello
2020-04-28 17:12   ` Holland, Jake
2020-04-28 19:04     ` Luca Muscariello
     [not found]       ` <bad22a6b-698d-c85f-b829-6b5391833a1e@erg.abdn.ac.uk>
2020-04-28 19:38         ` [Bloat] [tsvwg] " Luca Muscariello
2020-04-28 19:43           ` Black, David
2020-04-28 19:59             ` Jonathan Morton
2020-04-28 20:33         ` Sebastian Moeller
2020-04-29  8:44       ` Rodney W. Grimes
2020-04-29  9:25         ` Luca Muscariello
2020-04-29  9:46           ` Jonathan Morton
