* [Rpm] Alternate definitions of "working condition" - unnecessary? @ 2021-10-06 19:11 Rich Brown 2021-10-06 20:36 ` Jonathan Foulkes 2021-10-06 21:22 ` Dave Taht 0 siblings, 2 replies; 19+ messages in thread From: Rich Brown @ 2021-10-06 19:11 UTC (permalink / raw) To: rpm A portion of yesterday's RPM call encouraged people to come up with new definitions of "working conditions". This feels like a red herring. We already have two worst-case definitions - with implementations - of tools that "stuff up" a network. Flent and Apple's RPM Tool drive a network into worst-case behavior for long (> 60 seconds) and medium (~20 seconds) terms. What new information would another "working conditions" test expose that doesn't already come from Flent/RPM Tool? Thanks. Rich ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-06 19:11 [Rpm] Alternate definitions of "working condition" - unnecessary? Rich Brown @ 2021-10-06 20:36 ` Jonathan Foulkes 2021-10-07 16:40 ` Toke Høiland-Jørgensen 2021-10-07 21:39 ` Rich Brown 2021-10-06 21:22 ` Dave Taht 1 sibling, 2 replies; 19+ messages in thread From: Jonathan Foulkes @ 2021-10-06 20:36 UTC (permalink / raw) To: Rich Brown, rpm Let me add another tool, as it’s the one I use to ensure I measure full line capacity during our tuning tests: Netperf. Netperf supports both TCP and UDP streams, and since one usually needs many streams, you can mix those in any combination of proportions to generate load. Note: I manage the load on my Netperf servers in a way that guarantees I can measure up to a gig worth of capacity on any single given test. An often overlooked aspect of ‘load’ is whether the remote can actually meet a given capacity/latency goal. I can tell you, that matters. As I mentioned in one of the IAB breakout sessions, even a CMTS with an AQM on upload can be driven into bloated conditions, but it takes substantially more streams than would be expected to fully load it to the point of bloat. I have such a line, a DOCSIS 3.0 300/25 that I use for testing. After a particularly brutal couple of hours of testing while I tuned some algorithms, the ISP engaged an AQM on upload (automatically or manually, I don't know, but a similar line on the same CMTS does NOT have the AQM) that produces test results that look good, but a capacity limited to 17Mbps, with a ’normal’ (12) number of streams. But when hammered with 30 upload streams, we see the full 25Mbps and some 300ms of bloat (or worse). I’ll also note that whatever load pattern the Waveform test uses seems to generate some bloat within my own MacBook Pro: if I run a concurrent PingPlotter session on the MBP that is also running the Waveform test in a browser, PP logs high (800+ms) pings during the test. I recall checking with PP running on another device and the plots looked like what one would expect with Cake running on the router. So whatever ‘load’ the current device running the test has going on concurrently can skew the tests (at least insofar as determining if the problem is the line vs the device). But for research, I totally agree Flent is a great tool. Just wish it were easier to tweak parameters; maybe I just need to use it more ;-) > What new information would another "working conditions" test expose that doesn't already come from Flent/RPM Tool? Thanks. While I’m happy with what I get from Netperf and Flent, I’m one who would like to see a long-running ( >60sec) test suite that had a real-world set of traffic patterns combining a mix of use cases (VoIP, VideoConf, streaming, DropBox uploads, web surfing, gaming, etc.) and would rank the performance results for each category. To be effective at testing a router, it would ideally be a distributed test run from multiple devices with varying networking stacks. Maybe an array of RPi4’s with VMs running various OS’s? So to Rich’s question, ranking results in each category might show some QoS approaches being more effective at some use cases than others, even if the QoS implementations are reasonably effective at the usual bloat metrics. Even using the same QoS (Cake on OpenWRT), two different sets of settings will both give A’s on the DSLreports test, but have very different results in a mixed load scenario. 
Cheers, Jonathan > On Oct 6, 2021, at 3:11 PM, Rich Brown via Rpm <rpm@lists.bufferbloat.net> wrote: > > A portion of yesterday's RPM call encouraged people to come up with new definitions of "working conditions". > > This feels like a red herring. > > We already have two worst-case definitions - with implementations - of tools that "stuff up" a network. Flent and Apple's RPM Tool drive a network into worst-case behavior for long (> 60 seconds) and medium (~20 seconds) terms. > > What new information would another "working conditions" test expose that doesn't already come from Flent/RPM Tool? Thanks. > > Rich > _______________________________________________ > Rpm mailing list > Rpm@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/rpm ^ permalink raw reply [flat|nested] 19+ messages in thread
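Jonathan's recipe - many concurrent netperf streams, mixed in whatever proportions are needed to load the line - is easy to script. The sketch below is a minimal illustration rather than his actual tooling: it assumes netperf is installed, that a netserver instance is reachable at the placeholder host netperf.example.net, and that fixed 60-second plain TCP streams are an acceptable stand-in for "working conditions".

# Minimal sketch: launch several concurrent netperf streams to load a link.
# Assumptions (not from this thread): netperf is installed, a netserver is
# running at NETSERVER (placeholder hostname), and plain TCP streams suffice.
import subprocess

NETSERVER = "netperf.example.net"   # placeholder; substitute a real netserver host
UPLOAD_STREAMS = 12                 # the 'normal' stream count mentioned above
DOWNLOAD_STREAMS = 4
DURATION = 60                       # seconds, long enough to reach steady state

procs = []
for _ in range(UPLOAD_STREAMS):
    procs.append(subprocess.Popen(
        ["netperf", "-H", NETSERVER, "-t", "TCP_STREAM", "-l", str(DURATION)],
        stdout=subprocess.PIPE, text=True))
for _ in range(DOWNLOAD_STREAMS):
    # TCP_MAERTS pulls data from the remote netserver toward this host
    procs.append(subprocess.Popen(
        ["netperf", "-H", NETSERVER, "-t", "TCP_MAERTS", "-l", str(DURATION)],
        stdout=subprocess.PIPE, text=True))

for p in procs:
    out, _ = p.communicate()
    lines = out.strip().splitlines()
    print(lines[-1] if lines else "(no output)")   # last line holds the throughput summary

Raising UPLOAD_STREAMS from 12 to 30 is how one would reproduce the CMTS observation above, where the extra streams finally expose the full 25Mbps and the ~300ms of bloat.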
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-06 20:36 ` Jonathan Foulkes @ 2021-10-07 16:40 ` Toke Høiland-Jørgensen 2021-10-07 18:49 ` Dave Taht 2021-10-07 21:39 ` Rich Brown 1 sibling, 1 reply; 19+ messages in thread From: Toke Høiland-Jørgensen @ 2021-10-07 16:40 UTC (permalink / raw) To: Jonathan Foulkes, Rich Brown, rpm Jonathan Foulkes via Rpm <rpm@lists.bufferbloat.net> writes: > Let me add another tool, as it’s the one I use to ensure I measure > full line capacity during our tuning tests. Netperf. [...] > But for research, I total agree Flent is great tool. Just wish it was > easier to tweak parameters, maybe I just need to use it more ;-) Fun fact: I originally started working on Flent because I grew tired of manually running 'netperf' tests. The original name was literally 'netperf-wrapper' ;) The original idea was that you'd customise it by writing new test definition files. Of course it has since turned out to be useful to customise things more at runtime, and Flent has grown quite a few features in that direction since. But the original legacy endures, so there are certainly things you can only do by writing test definition files. Anyway, specific feature requests are always welcome! :) -Toke ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-07 16:40 ` Toke Høiland-Jørgensen @ 2021-10-07 18:49 ` Dave Taht 2021-10-08 17:51 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 19+ messages in thread From: Dave Taht @ 2021-10-07 18:49 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Jonathan Foulkes, Rich Brown, Rpm On Thu, Oct 7, 2021 at 9:40 AM Toke Høiland-Jørgensen via Rpm <rpm@lists.bufferbloat.net> wrote: > > Jonathan Foulkes via Rpm <rpm@lists.bufferbloat.net> writes: > > > Let me add another tool, as it’s the one I use to ensure I measure > > full line capacity during our tuning tests. Netperf. > [...] > > But for research, I total agree Flent is great tool. Just wish it was > > easier to tweak parameters, maybe I just need to use it more ;-) > > Fun fact: I originally started working on Flent because I grew tired of > manually running 'netperf' tests. The original name was literally > 'netperf-wrapper' ;) > > The original idea was that you'd customise it by writing new test > definition files. Of course it has since turned out to be useful to > customise things more at runtime, and Flent has grown quite a few > features in that direction since. But the original legacy endures, so > there are certainly things you can only do by writing test definition > files. > > Anyway, specific feature requests are always welcome! :) I learned very painfully recently that using the gplv3 for anything results in a blanket ban for any use whatsoever at many companies. It might as well just be proprietary closed source code. I am not sure where to go from there, I recognise the 8+ years of work into flent make it into a tool that is vastly superior to anything else, but where it would do the most good - inside orgs building and testing new products - is prohibited by lawyers. > -Toke > _______________________________________________ > Rpm mailing list > Rpm@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/rpm -- Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw Dave Täht CEO, TekLibre, LLC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-07 18:49 ` Dave Taht @ 2021-10-08 17:51 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 19+ messages in thread From: Toke Høiland-Jørgensen @ 2021-10-08 17:51 UTC (permalink / raw) To: Dave Taht; +Cc: Jonathan Foulkes, Rich Brown, Rpm Dave Taht <dave.taht@gmail.com> writes: > On Thu, Oct 7, 2021 at 9:40 AM Toke Høiland-Jørgensen via Rpm > <rpm@lists.bufferbloat.net> wrote: >> >> Jonathan Foulkes via Rpm <rpm@lists.bufferbloat.net> writes: >> >> > Let me add another tool, as it’s the one I use to ensure I measure >> > full line capacity during our tuning tests. Netperf. >> [...] >> > But for research, I total agree Flent is great tool. Just wish it was >> > easier to tweak parameters, maybe I just need to use it more ;-) >> >> Fun fact: I originally started working on Flent because I grew tired of >> manually running 'netperf' tests. The original name was literally >> 'netperf-wrapper' ;) >> >> The original idea was that you'd customise it by writing new test >> definition files. Of course it has since turned out to be useful to >> customise things more at runtime, and Flent has grown quite a few >> features in that direction since. But the original legacy endures, so >> there are certainly things you can only do by writing test definition >> files. >> >> Anyway, specific feature requests are always welcome! :) > > I learned very painfully recently that using the gplv3 for anything > results in a blanket ban for any use whatsoever at many companies. > It might as well just be proprietary closed source code. > > I am not sure where to go from there, I recognise the 8+ years of work > into flent make it into a tool that is vastly superior to > anything else, but where it would do the most good - inside orgs > building and testing new products - is prohibited by lawyers. While I'm all for accommodating reasonable requests, I also prefer to fix the root causes of problems. And in this case that's at layer 8: Anyone whose lawyers tell them to not *use* Flent because of the GPLv3 license need to find themselves better lawyers. I did add a clarification of this to the Flent documentation[0], but really it shouldn't be necessary. -Toke [0] https://github.com/tohojo/flent/commit/d1a79eb227b050daf8137de0c8056a7ac85cd68b ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-06 20:36 ` Jonathan Foulkes 2021-10-07 16:40 ` Toke Høiland-Jørgensen @ 2021-10-07 21:39 ` Rich Brown 1 sibling, 0 replies; 19+ messages in thread From: Rich Brown @ 2021-10-07 21:39 UTC (permalink / raw) Cc: rpm Thanks for all these thoughts. I see that there *are* indeed other tests that could be applied. (And we mustn't forget the DSLReports or Waveform tools.) I'm going to switch things up, and argue the plight of the poor engineer at some Wi-Fi chip or router manufacturer. Suppose the RPM Test tool is wildly successful. Customers everywhere are using it and finding that their current routers stink out loud. They complain to their vendors. The marketing department says, "This is terrible!" to their product managers, who all see the light, and walk into the design team to say, "We need to be responsive! Make it so!" What's an engineer to do? a) At the minimum, they should figure out how to get fq_codel/cake/PIE/whatever into the product. That'll make a world of difference. b) But then ... what? Engineers need to optimize against some "standard". - Do we have an obligation to declare some "standard of goodness"? - Is it sufficient to be "good enough"? What does that mean? - 95th percentile latency less than max(5 msec or 2 packet transmission times)? - RPM above 2000? - Something else? Thanks again for indulging in my fantasy (that vendors will ever care about responsiveness...) Rich ^ permalink raw reply [flat|nested] 19+ messages in thread
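Rich's two candidate pass/fail criteria - 95th-percentile loaded latency and an RPM floor - are easy to compute once a tester has a list of round-trip samples taken while the link is saturated. The sketch below is only the arithmetic: the samples are invented, and the 60-seconds-divided-by-mean-RTT conversion is the rough "round trips per minute" idea rather than the RPM tool's exact aggregation across probe types.

# Sketch of the two candidate metrics from the message above.
# rtt_under_load_ms is assumed to come from some loaded-latency measurement
# (e.g. pings logged while Flent or the RPM tool saturates the link).
import statistics

def p95(samples_ms):
    """95th-percentile latency of the samples, in milliseconds."""
    ordered = sorted(samples_ms)
    index = max(0, int(round(0.95 * len(ordered))) - 1)
    return ordered[index]

def rough_rpm(samples_ms):
    """Rough 'round trips per minute': 60 s divided by the mean loaded RTT.
    The real RPM metric aggregates several probe types; this is only the idea."""
    return 60.0 / (statistics.mean(samples_ms) / 1000.0)

rtt_under_load_ms = [12.0, 9.5, 14.2, 30.1, 11.8, 10.4, 13.3, 9.9]  # made-up samples
print(f"p95 latency: {p95(rtt_under_load_ms):.1f} ms")
print(f"rough RPM:   {rough_rpm(rtt_under_load_ms):.0f}")
# The thresholds floated above: p95 <= max(5 ms, 2 packet transmission times),
# and RPM >= 2000, which corresponds to a mean loaded RTT of at most 30 ms.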
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-06 19:11 [Rpm] Alternate definitions of "working condition" - unnecessary? Rich Brown 2021-10-06 20:36 ` Jonathan Foulkes @ 2021-10-06 21:22 ` Dave Taht 2021-10-06 23:18 ` Jonathan Morton 1 sibling, 1 reply; 19+ messages in thread From: Dave Taht @ 2021-10-06 21:22 UTC (permalink / raw) To: Rich Brown; +Cc: Rpm On Wed, Oct 6, 2021 at 12:11 PM Rich Brown via Rpm <rpm@lists.bufferbloat.net> wrote: > > A portion of yesterday's RPM call encouraged people to come up with new definitions of "working conditions". > > This feels like a red herring. > > We already have two worst-case definitions - with implementations - of tools that "stuff up" a network. Flent and Apple's RPM Tool drive a network into worst-case behavior for long (> 60 seconds) and medium (~20 seconds) terms. > > What new information would another "working conditions" test expose that doesn't already come from Flent/RPM Tool? Thanks. The specific case where it seemed needed was in testing wifi. A single-client test on most APs today does tend to blow up the whole link, but doesn't on fq_codel-derived ones (also, Meraki used to use SFQ). Without two or more clients the result can be misleading. We had gone into the case of testing the latest 802.11ax DU standards - where simultaneous transmissions to multiple clients are feasible but really hard to test for (as well as how to go about designing a gang scheduler for the AP) - and I'm also beginning to worry about the chaos with that standard for the ack return path. There are also some cases (cake's per host/per flow fq) where perhaps the test should bind to multiple ipv6 addresses. There are additional cases where, perhaps, the fq component works, and the aqm doesn't. > > Rich > _______________________________________________ > Rpm mailing list > Rpm@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/rpm -- Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw Dave Täht CEO, TekLibre, LLC ^ permalink raw reply [flat|nested] 19+ messages in thread
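On the point about cake's per-host/per-flow fairness: exercising it from a single test box would indeed mean sourcing flows from several addresses. The sketch below shows only the socket mechanics of that idea; the IPv6 addresses and the sink are documentation-prefix placeholders, and a real test would need addresses actually assigned to the local interface plus a bulk sink (netserver, iperf3, etc.) on the far side.

# Sketch: one TCP flow per local IPv6 source address, so a per-host fair
# queue (e.g. cake's dual-srchost/dual-dsthost modes) sees distinct hosts.
# All addresses below are hypothetical and must be replaced before use.
import socket

LOCAL_ADDRS = ["2001:db8::10", "2001:db8::11", "2001:db8::12"]  # placeholder sources
SINK = ("2001:db8::1", 9000)                                    # placeholder bulk sink

flows = []
for addr in LOCAL_ADDRS:
    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    s.bind((addr, 0))       # pin the source address for this flow
    s.connect(SINK)
    flows.append(s)

payload = b"\x00" * 65536
while True:                 # saturate until interrupted; a real tool would use
    for s in flows:         # one thread per flow so a stalled flow can't block the rest
        s.sendall(payload)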
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-06 21:22 ` Dave Taht @ 2021-10-06 23:18 ` Jonathan Morton 2021-10-07 0:11 ` Christoph Paasch 0 siblings, 1 reply; 19+ messages in thread From: Jonathan Morton @ 2021-10-06 23:18 UTC (permalink / raw) To: Dave Taht; +Cc: Rich Brown, Rpm > On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm <rpm@lists.bufferbloat.net> wrote: > > There are additional cases where, perhaps, the fq component works, and the aqm doesn't. Such as Apple's version of FQ-Codel? The source code is public, so we might as well talk about it. There are two deviations I know about in the AQM portion of that. First is that they do the marking and/or dropping at the tail of the queue, not the head. Second is that the marking/dropping frequency is fixed, instead of increasing during a continuous period of congestion as real Codel does. I predict the consequences of these mistakes will differ according to the type of traffic applied: With TCP traffic over an Internet-scale path, the consequences are not serious. The tail-drop means that the response at the end of slow-start will be slower, with a higher peak of intra-flow induced delay, and there is also a small but measurable risk of tail-loss causing a more serious application-level delay. These alone *should* be enough to prompt a fix, if Apple are actually serious about improving application responsiveness. The fixed marking frequency, however, is probably invisible for this traffic. With TCP traffic over a short-RTT path, the effects are more pronounced. The delay excursion at the end of slow-start will be larger in comparison to the baseline RTT, and when the latter is short enough, the fixed congestion signalling frequency means there will be some standing queue that real Codel would get rid of. This standing queue will influence the TCP stack's RTT estimator and thus RTO value, increasing the delay consequent to tail loss. Similar effects to the above can be expected with other reliable stream transports (SCTP, QUIC), though the details may differ. The consequences with non-congestion-controlled traffic could be much more serious. Real Codel will increase its drop frequency continuously when faced with overload, eventually gaining control of the queue depth as long as the load remains finite and reasonably constant. Because Apple's AQM doesn't increase its drop frequency, the queue depth for such a flow will increase continuously until either a delay-sensitive rate selection mechanism is triggered at the sender, or the queue overflows and triggers burst losses. So in the context of this discussion, is it worth generating a type of load that specifically exercises this failure mode? If so, what does it look like? - Jonathan Morton ^ permalink raw reply [flat|nested] 19+ messages in thread
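The difference Jonathan describes between real Codel and a fixed marking frequency is captured by the control law in RFC 8289: the next signal is scheduled interval/sqrt(count) after the previous one. The snippet below is just that arithmetic - no queue is simulated - and only illustrates why the real law keeps tightening its grip on an unresponsive flow while a fixed frequency never does.

# Compare congestion-signal times under Codel's control law vs. a fixed period.
# Pure arithmetic illustration of RFC 8289 section 3.3; nothing is queued.
from math import sqrt

INTERVAL = 0.100   # Codel's default interval, 100 ms

def codel_signal_times(n_signals):
    """Times of successive signals while the queue stays above target:
    each new signal arrives interval/sqrt(count) after the previous one."""
    t, times = 0.0, []
    for count in range(1, n_signals + 1):
        t += INTERVAL / sqrt(count)
        times.append(t)
    return times

def fixed_signal_times(n_signals, period=INTERVAL):
    return [(i + 1) * period for i in range(n_signals)]

for i, (c, f) in enumerate(zip(codel_signal_times(20), fixed_signal_times(20)), 1):
    print(f"signal {i:2d}: codel at {c:6.3f}s, fixed at {f:6.3f}s")
# Codel's signals bunch closer and closer together (a rising drop/mark rate);
# the fixed scheme's rate never rises, so an unresponsive flow can keep the
# queue growing, exactly the failure mode described above.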
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-06 23:18 ` Jonathan Morton @ 2021-10-07 0:11 ` Christoph Paasch 2021-10-07 10:29 ` Jonathan Morton 2021-10-07 10:30 ` [Rpm] Alternate definitions of "working condition" - unnecessary? Sebastian Moeller 0 siblings, 2 replies; 19+ messages in thread From: Christoph Paasch @ 2021-10-07 0:11 UTC (permalink / raw) To: Jonathan Morton; +Cc: Dave Taht, Rpm On 10/07/21 - 02:18, Jonathan Morton via Rpm wrote: > > On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm <rpm@lists.bufferbloat.net> wrote: > > There are additional cases where, perhaps, the fq component works, and the aqm doesn't. > > Such as Apple's version of FQ-Codel? The source code is public, so we might as well talk about it. Let's not just talk about it, but actually read it ;-) > There are two deviations I know about in the AQM portion of that. First is that they do the marking and/or dropping at the tail of the queue, not the head. Second is that the marking/dropping frequency is fixed, instead of increasing during a continuous period of congestion as real Codel does. We don't drop/mark locally generated traffic (which is the use-case we care abhout). We signal flow-control straight back to the TCP-stack at which point the queue is entirely drained before TCP starts transmitting again. So, drop-frequency really doesn't matter because there is no drop. Christoph > > I predict the consequences of these mistakes will differ according to the type of traffic applied: > > With TCP traffic over an Internet-scale path, the consequences are not serious. The tail-drop means that the response at the end of slow-start will be slower, with a higher peak of intra-flow induced delay, and there is also a small but measurable risk of tail-loss causing a more serious application-level delay. These alone *should* be enough to prompt a fix, if Apple are actually serious about improving application responsiveness. The fixed marking frequency, however, is probably invisible for this traffic. > > With TCP traffic over a short-RTT path, the effects are more pronounced. The delay excursion at the end of slow-start will be larger in comparison to the baseline RTT, and when the latter is short enough, the fixed congestion signalling frequency means there will be some standing queue that real Codel would get rid of. This standing queue will influence the TCP stack's RTT estimator and thus RTO value, increasing the delay consequent to tail loss. > > Similar effects to the above can be expected with other reliable stream transports (SCTP, QUIC), though the details may differ. > > The consequences with non-congestion-controlled traffic could be much more serious. Real Codel will increase its drop frequency continuously when faced with overload, eventually gaining control of the queue depth as long as the load remains finite and reasonably constant. Because Apple's AQM doesn't increase its drop frequency, the queue depth for such a flow will increase continuously until either a delay-sensitive rate selection mechanism is triggered at the sender, or the queue overflows and triggers burst losses. > > So in the context of this discussion, is it worth generating a type of load that specifically exercises this failure mode? If so, what does it look like? > > - Jonathan Morton > _______________________________________________ > Rpm mailing list > Rpm@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/rpm ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-07 0:11 ` Christoph Paasch @ 2021-10-07 10:29 ` Jonathan Morton 2021-10-07 15:44 ` [Rpm] apple's fq_"codel" implementation Dave Taht 2021-10-07 10:30 ` [Rpm] Alternate definitions of "working condition" - unnecessary? Sebastian Moeller 1 sibling, 1 reply; 19+ messages in thread From: Jonathan Morton @ 2021-10-07 10:29 UTC (permalink / raw) To: Christoph Paasch; +Cc: Dave Taht, Rpm > On 7 Oct, 2021, at 3:11 am, Christoph Paasch <cpaasch@apple.com> wrote: > >>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm <rpm@lists.bufferbloat.net> wrote: >>> There are additional cases where, perhaps, the fq component works, and the aqm doesn't. >> >> Such as Apple's version of FQ-Codel? The source code is public, so we might as well talk about it. > > Let's not just talk about it, but actually read it ;-) > >> There are two deviations I know about in the AQM portion of that. First is that they do the marking and/or dropping at the tail of the queue, not the head. Second is that the marking/dropping frequency is fixed, instead of increasing during a continuous period of congestion as real Codel does. > > We don't drop/mark locally generated traffic (which is the use-case we care abhout). > We signal flow-control straight back to the TCP-stack at which point the queue > is entirely drained before TCP starts transmitting again. > > So, drop-frequency really doesn't matter because there is no drop. Hmm, that would be more reasonable behaviour for a machine that never has to forward anything - but that is not at all obvious from the source code I found. I think I'll need to run tests to see what actually happens in practice. - Jonathan Morton ^ permalink raw reply [flat|nested] 19+ messages in thread
* [Rpm] apple's fq_"codel" implementation 2021-10-07 10:29 ` Jonathan Morton @ 2021-10-07 15:44 ` Dave Taht 0 siblings, 0 replies; 19+ messages in thread From: Dave Taht @ 2021-10-07 15:44 UTC (permalink / raw) To: Jonathan Morton; +Cc: Christoph Paasch, Rpm On Thu, Oct 7, 2021 at 3:29 AM Jonathan Morton <chromatix99@gmail.com> wrote: > > > On 7 Oct, 2021, at 3:11 am, Christoph Paasch <cpaasch@apple.com> wrote: > > > >>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm <rpm@lists.bufferbloat.net> wrote: > >>> There are additional cases where, perhaps, the fq component works, and the aqm doesn't. > >> > >> Such as Apple's version of FQ-Codel? The source code is public, so we might as well talk about it. > > > > Let's not just talk about it, but actually read it ;-) Since enough people have now actually read the code, and there are two students performing experiments, we can have this conversation. > >> There are two deviations I know about in the AQM portion of that. First is that they do the marking and/or dropping at the tail of the queue, not the head. Second is that the marking/dropping frequency is fixed, instead of increasing during a continuous period of congestion as real Codel does. > > > > We don't drop/mark locally generated traffic (which is the use-case we care abhout). "We", who? :) It's unclear as to what happens in the case of virtualization. It's unclear what happens with UDP flows. It's unclear what happens with tunneled flows (userspace vpn). It's unclear what happens with sockets, rather than the apple APIs. What I observed - exercising sockets (using 16 netperf, 4 irtt, osx as the target) - was a sharp spike in the "drop_overload" statistic, and tcp rsts in the captures, and that inspired me to inspect the code to see what was hit, and to be a mite :deleted: at what I thought were two essential components of the codel aqm not being there. At the time I had WAY more other sources of error in my network setup than I'd cared for, and got pulled into something else before being able to quell my uncertainties here. > > We signal flow-control straight back to the TCP-stack at which point the queue > > is entirely drained before TCP starts transmitting again. This is rather bursty. The 1/count reduction in the drop scheduler (or in this case the "pushback scheduler") should gradually reduce the needed local buffering in the queue to 5ms (or in the case of apple, 10ms), and compensate better for the natural variability of wifi and lte. I'd have to go read the code again to remember what the drop_overlimit behavior was. I had thought that dropping cnt-1 rather than "entirely" made more sense. Anyway there were many, many other variables in play - a queue size of 300, 2000, or more, the presence of offloads, no BQL, testing how usb-c-ethernet worked - > > So, drop-frequency really doesn't matter because there is no drop. It "should" be cutting the cwnd until the queue also is under control. Without doing that, it will just fill up immediately again, with the wrong rtt estimate. > > Hmm, that would be more reasonable behaviour for a machine that never has to forward anything - but that is not at all obvious from the source code I found. I think I'll need to run tests to see what actually happens in practice. Please!!! I did feel it was potentially a big bug, with some easy fixes, needing only more eyeballs and time to diagnose, or at least describe. 
> > - Jonathan Morton -- Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw Dave Täht CEO, TekLibre, LLC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-07 0:11 ` Christoph Paasch 2021-10-07 10:29 ` Jonathan Morton @ 2021-10-07 10:30 ` Sebastian Moeller 2021-10-08 0:33 ` Jonathan Morton 2021-10-08 23:32 ` Christoph Paasch 1 sibling, 2 replies; 19+ messages in thread From: Sebastian Moeller @ 2021-10-07 10:30 UTC (permalink / raw) To: Christoph Paasch; +Cc: Jonathan Morton, Rpm Hi Christoph, > On Oct 7, 2021, at 02:11, Christoph Paasch via Rpm <rpm@lists.bufferbloat.net> wrote: > > On 10/07/21 - 02:18, Jonathan Morton via Rpm wrote: >>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm <rpm@lists.bufferbloat.net> wrote: >>> There are additional cases where, perhaps, the fq component works, and the aqm doesn't. >> >> Such as Apple's version of FQ-Codel? The source code is public, so we might as well talk about it. > > Let's not just talk about it, but actually read it ;-) > >> There are two deviations I know about in the AQM portion of that. First is that they do the marking and/or dropping at the tail of the queue, not the head. Second is that the marking/dropping frequency is fixed, instead of increasing during a continuous period of congestion as real Codel does. > > We don't drop/mark locally generated traffic (which is the use-case we care abhout). In this discussion probably true, but I recall that one reason why sch_fq_codel is a more versatile qdisc compared to sch_fq under Linux is that fq excels for locally generated traffic, while fq_codel is also working well for forwarded traffic. And I use "forwarding" here to encompass things like VMs running on a host, where direct "back-pressure" will not work... > We signal flow-control straight back to the TCP-stack at which point the queue > is entirely drained before TCP starts transmitting again. > > So, drop-frequency really doesn't matter because there is no drop. But is it still codel/fq_codel if it does not implement head drop (as described in https://datatracker.ietf.org/doc/html/rfc8290#section-4.2) and if the control loop (https://datatracker.ietf.org/doc/html/rfc8289#section-3.3) is changed? (I am also wondering how reducing the default number of sub-queues from 1024 to 128 behaves on the background of the birthday paradox). Best Regards Sebastian P.S.: My definition of working conditions entails bidirectionally saturating traffic with responsive and (transiently) under-responsive flows. Something like a few long running TCP transfers to generate "base-load" and a higher number of TCP flows in IW or slow start to add some spice to the whole. In the future, once QUIC actually takes off*, adding more well defined/behaved UDP flows to the mix seems reasonable. My off the cuff test for the effect of IW used to be to start a browser and open a collection of (30-50) tabs getting a nice "thundering herd" of TCP flows starting around the same time. But it seems browser makers got too smart for me and will not do what I want any more but temporally space the different sites in the tabs so that my nice thundering herd is less obnoxious (which IMHO is actually the right thing to do for actual usage, but for testing it sucks). *) Occasionally browsing the NANOG archives makes me wonder how the move from HTTP/TCP to QUIC/UDP is going to play with operators propensity to rate-limit UDP, but that is a different kettle of fish... 
> > > Christoph > >> >> I predict the consequences of these mistakes will differ according to the type of traffic applied: >> >> With TCP traffic over an Internet-scale path, the consequences are not serious. The tail-drop means that the response at the end of slow-start will be slower, with a higher peak of intra-flow induced delay, and there is also a small but measurable risk of tail-loss causing a more serious application-level delay. These alone *should* be enough to prompt a fix, if Apple are actually serious about improving application responsiveness. The fixed marking frequency, however, is probably invisible for this traffic. >> >> With TCP traffic over a short-RTT path, the effects are more pronounced. The delay excursion at the end of slow-start will be larger in comparison to the baseline RTT, and when the latter is short enough, the fixed congestion signalling frequency means there will be some standing queue that real Codel would get rid of. This standing queue will influence the TCP stack's RTT estimator and thus RTO value, increasing the delay consequent to tail loss. >> >> Similar effects to the above can be expected with other reliable stream transports (SCTP, QUIC), though the details may differ. >> >> The consequences with non-congestion-controlled traffic could be much more serious. Real Codel will increase its drop frequency continuously when faced with overload, eventually gaining control of the queue depth as long as the load remains finite and reasonably constant. Because Apple's AQM doesn't increase its drop frequency, the queue depth for such a flow will increase continuously until either a delay-sensitive rate selection mechanism is triggered at the sender, or the queue overflows and triggers burst losses. >> >> So in the context of this discussion, is it worth generating a type of load that specifically exercises this failure mode? If so, what does it look like? >> >> - Jonathan Morton >> _______________________________________________ >> Rpm mailing list >> Rpm@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/rpm > _______________________________________________ > Rpm mailing list > Rpm@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/rpm ^ permalink raw reply [flat|nested] 19+ messages in thread
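Sebastian's "thundering herd" part of the definition - dozens of flows hitting IW/slow start at nearly the same instant - can be reproduced directly now that browsers stagger their tab loads. The sketch below fires a configurable number of simultaneous HTTP fetches; the URLs are placeholders, and the long-running base-load transfers he describes are assumed to be generated separately (netperf, Flent, etc.) while this runs.

# Sketch: emulate a "thundering herd" of ~40 near-simultaneous page loads.
# URLS is a placeholder list; the sustained base-load flows described above
# are assumed to be running separately while this fires.
import concurrent.futures
import time
import urllib.request

URLS = [f"https://example.com/page{i}" for i in range(40)]   # hypothetical targets

def fetch(url):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            resp.read()
        return url, time.monotonic() - start, None
    except Exception as exc:   # just report and move on; this is only a sketch
        return url, time.monotonic() - start, exc

with concurrent.futures.ThreadPoolExecutor(max_workers=len(URLS)) as pool:
    for url, elapsed, err in pool.map(fetch, URLS):
        status = "ok" if err is None else f"failed: {err}"
        print(f"{elapsed:6.3f}s  {url}  {status}")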
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-07 10:30 ` [Rpm] Alternate definitions of "working condition" - unnecessary? Sebastian Moeller @ 2021-10-08 0:33 ` Jonathan Morton 2021-10-08 23:32 ` Christoph Paasch 1 sibling, 0 replies; 19+ messages in thread From: Jonathan Morton @ 2021-10-08 0:33 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Christoph Paasch, Rpm > On 7 Oct, 2021, at 1:30 pm, Sebastian Moeller <moeller0@gmx.de> wrote: > > I am also wondering how reducing the default number of sub-queues from 1024 to 128 behaves on the background of the birthday paradox With 1024 queues, the 50% probability of a collision is expected at sqrt(1024) = 32 flows. With 128, this decreases to about 11 or 12 flows (11*11 = 121; 12*12 = 144). In both cases, onset of high collision probability could be staved off until a larger number of flows by using a collision-avoiding hash function, such as the set-associative hash used in Cake. - Jonathan Morton ^ permalink raw reply [flat|nested] 19+ messages in thread
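For anyone who wants the exact numbers behind the rule of thumb: the sqrt(m) figure lands at roughly a 35-40% collision probability, and the 50% point arrives a little later, around 1.2 * sqrt(m) flows. The snippet below computes both for the 1024- and 128-bucket cases, assuming a hash that spreads flows uniformly; the set-associative hash in Cake that Jonathan mentions is precisely a way to do better than this baseline.

# Probability that at least two of n flows share a hash bucket, assuming a
# uniform hash over m buckets (the plain birthday-problem model).
def collision_probability(n_flows, m_buckets):
    p_none = 1.0
    for i in range(n_flows):
        p_none *= (m_buckets - i) / m_buckets
    return 1.0 - p_none

def flows_for_half(m_buckets):
    """Smallest flow count at which the collision probability reaches 50%."""
    n = 1
    while collision_probability(n, m_buckets) < 0.5:
        n += 1
    return n

for m in (1024, 128):
    n = round(m ** 0.5)   # the sqrt(m) rule of thumb
    print(f"{m:4d} buckets: {n:2d} flows -> "
          f"{collision_probability(n, m):.0%} collision chance; "
          f"50% reached at {flows_for_half(m)} flows")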
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-07 10:30 ` [Rpm] Alternate definitions of "working condition" - unnecessary? Sebastian Moeller 2021-10-08 0:33 ` Jonathan Morton @ 2021-10-08 23:32 ` Christoph Paasch 2021-10-11 7:31 ` Sebastian Moeller 1 sibling, 1 reply; 19+ messages in thread From: Christoph Paasch @ 2021-10-08 23:32 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Jonathan Morton, Rpm On 10/07/21 - 12:30, Sebastian Moeller wrote: > Hi Christoph, > > > On Oct 7, 2021, at 02:11, Christoph Paasch via Rpm > > <rpm@lists.bufferbloat.net> wrote: > > > > On 10/07/21 - 02:18, Jonathan Morton via Rpm wrote: > >>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm > >>> <rpm@lists.bufferbloat.net> wrote: There are additional cases where, > >>> perhaps, the fq component works, and the aqm doesn't. > >> > >> Such as Apple's version of FQ-Codel? The source code is public, so we > >> might as well talk about it. > > > > Let's not just talk about it, but actually read it ;-) > > > >> There are two deviations I know about in the AQM portion of that. > >> First is that they do the marking and/or dropping at the tail of the > >> queue, not the head. Second is that the marking/dropping frequency is > >> fixed, instead of increasing during a continuous period of congestion > >> as real Codel does. > > > > We don't drop/mark locally generated traffic (which is the use-case we > > care abhout). > > In this discussion probably true, but I recall that one reason why > sch_fq_codel is a more versatile qdisc compared to sch_fq under > Linux is that fq excels for locally generated traffic, while > fq_codel is also working well for forwarded traffic. And I use > "forwarding" here to encompass things like VMs running on a host, > where direct "back-pressure" will not work... Our main use-case is iOS. This is by far the most common case and thus there are no VMs or alike. All traffic is generated locally by our TCP implementation. >> > We signal flow-control straight back to the TCP-stack at which point the > > queue is entirely drained before TCP starts transmitting again. > > > > So, drop-frequency really doesn't matter because there is no drop. > > But is it still codel/fq_codel if it does not implement head drop > (as described in > https://datatracker.ietf.org/doc/html/rfc8290#section-4.2) and if > the control loop > (https://datatracker.ietf.org/doc/html/rfc8289#section-3.3) is > changed? (I am also wondering how reducing the default number of > sub-queues from 1024 to 128 behaves on the background of the > birthday paradox). Not sure where the 128 comes from ? And birthday paradox does not apply. The magic happens in inp_calc_flowhash() ;-) Cheers, Christoph > Best Regards Sebastian > > P.S.: My definition of working conditions entails bidirectionally > saturating traffic with responsive and (transiently) under-responsive > flows. Something like a few long running TCP transfers to generate > "base-load" and a higher number of TCP flows in IW or slow start to add > some spice to the whole. In the future, once QUIC actually takes off*, > adding more well defined/behaved UDP flows to the mix seems reasonable. My > off the cuff test for the effect of IW used to be to start a browser and > open a collection of (30-50) tabs getting a nice "thundering herd" of TCP > flows starting around the same time. 
But it seems browser makers got too > smart for me and will not do what I want any more but temporally space the > different sites in the tabs so that my nice thundering herd is less > obnoxious (which IMHO is actually the right thing to do for actual usage, > but for testing it sucks). > > *) Occasionally browsing the NANOG archives makes me wonder how the move > from HTTP/TCP to QUIC/UDP is going to play with operators propensity to > rate-limit UDP, but that is a different kettle of fish... > > > > > > > > Christoph > > > >> > >> I predict the consequences of these mistakes will differ according to > >> the type of traffic applied: > >> > >> With TCP traffic over an Internet-scale path, the consequences are not > >> serious. The tail-drop means that the response at the end of > >> slow-start will be slower, with a higher peak of intra-flow induced > >> delay, and there is also a small but measurable risk of tail-loss > >> causing a more serious application-level delay. These alone *should* > >> be enough to prompt a fix, if Apple are actually serious about > >> improving application responsiveness. The fixed marking frequency, > >> however, is probably invisible for this traffic. > >> > >> With TCP traffic over a short-RTT path, the effects are more > >> pronounced. The delay excursion at the end of slow-start will be > >> larger in comparison to the baseline RTT, and when the latter is short > >> enough, the fixed congestion signalling frequency means there will be > >> some standing queue that real Codel would get rid of. This standing > >> queue will influence the TCP stack's RTT estimator and thus RTO value, > >> increasing the delay consequent to tail loss. > >> > >> Similar effects to the above can be expected with other reliable stream > >> transports (SCTP, QUIC), though the details may differ. > >> > >> The consequences with non-congestion-controlled traffic could be much > >> more serious. Real Codel will increase its drop frequency continuously > >> when faced with overload, eventually gaining control of the queue depth > >> as long as the load remains finite and reasonably constant. Because > >> Apple's AQM doesn't increase its drop frequency, the queue depth for > >> such a flow will increase continuously until either a delay-sensitive > >> rate selection mechanism is triggered at the sender, or the queue > >> overflows and triggers burst losses. > >> > >> So in the context of this discussion, is it worth generating a type of > >> load that specifically exercises this failure mode? If so, what does > >> it look like? > >> > >> - Jonathan Morton _______________________________________________ Rpm > >> mailing list Rpm@lists.bufferbloat.net > >> https://lists.bufferbloat.net/listinfo/rpm > > _______________________________________________ Rpm mailing list > > Rpm@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/rpm > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-08 23:32 ` Christoph Paasch @ 2021-10-11 7:31 ` Sebastian Moeller 2021-10-11 9:01 ` Jonathan Morton 2021-10-11 17:34 ` Christoph Paasch 0 siblings, 2 replies; 19+ messages in thread From: Sebastian Moeller @ 2021-10-11 7:31 UTC (permalink / raw) To: Christoph Paasch; +Cc: Jonathan Morton, Rpm Hi Christoph, > On Oct 9, 2021, at 01:32, Christoph Paasch <cpaasch@apple.com> wrote: > > On 10/07/21 - 12:30, Sebastian Moeller wrote: >> Hi Christoph, >> >>> On Oct 7, 2021, at 02:11, Christoph Paasch via Rpm >>> <rpm@lists.bufferbloat.net> wrote: >>> >>> On 10/07/21 - 02:18, Jonathan Morton via Rpm wrote: >>>>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm >>>>> <rpm@lists.bufferbloat.net> wrote: There are additional cases where, >>>>> perhaps, the fq component works, and the aqm doesn't. >>>> >>>> Such as Apple's version of FQ-Codel? The source code is public, so we >>>> might as well talk about it. >>> >>> Let's not just talk about it, but actually read it ;-) >>> >>>> There are two deviations I know about in the AQM portion of that. >>>> First is that they do the marking and/or dropping at the tail of the >>>> queue, not the head. Second is that the marking/dropping frequency is >>>> fixed, instead of increasing during a continuous period of congestion >>>> as real Codel does. >>> >>> We don't drop/mark locally generated traffic (which is the use-case we >>> care abhout). >> >> In this discussion probably true, but I recall that one reason why >> sch_fq_codel is a more versatile qdisc compared to sch_fq under >> Linux is that fq excels for locally generated traffic, while >> fq_codel is also working well for forwarded traffic. And I use >> "forwarding" here to encompass things like VMs running on a host, >> where direct "back-pressure" will not work... > > Our main use-case is iOS. This is by far the most common case and thus there > are no VMs or alike. All traffic is generated locally by our TCP > implementation. Ah, explains your priorities. 
My only iOS device is 11 years old, and as far as I understand does not support fq_codel at all, so my testing is restricted to my macbooks and was under mojave and catalina: macbook:~ user$ sudo netstat -I en4 -qq en4: [ sched: FQ_CODEL qlength: 0/128 ] [ pkts: 126518 bytes: 60151318 dropped pkts: 0 bytes: 0 ] ===================================================== [ pri: CTL (0) srv_cl: 0x480190 quantum: 600 drr_max: 8 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 16969 bytes: 1144841 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: VO (1) srv_cl: 0x400180 quantum: 600 drr_max: 8 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 0 bytes: 0 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: VI (2) srv_cl: 0x380100 quantum: 3000 drr_max: 6 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 0 bytes: 0 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: RV (3) srv_cl: 0x300110 quantum: 3000 drr_max: 6 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 0 bytes: 0 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: AV (4) srv_cl: 0x280120 quantum: 3000 drr_max: 6 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 0 bytes: 0 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: OAM (5) srv_cl: 0x200020 quantum: 1500 drr_max: 4 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 0 bytes: 0 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: RD (6) srv_cl: 0x180010 quantum: 1500 drr_max: 4 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 78 bytes: 13943 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: BE (7) srv_cl: 0x0 quantum: 1500 drr_max: 4 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 98857 bytes: 56860512 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 
old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: BK (8) srv_cl: 0x100080 quantum: 1500 drr_max: 2 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 10565 bytes: 2126520 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] ===================================================== [ pri: BK_SYS (9) srv_cl: 0x80090 quantum: 1500 drr_max: 2 ] [ queued pkts: 0 bytes: 0 ] [ dequeued pkts: 49 bytes: 5502 ] [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] [ flows total: 0 new: 0 old: 0 ] [ throttle on: 0 off: 0 drop: 0 ] hms-beagle2:~ smoeller$ macbook:~ user$ > >>>> We signal flow-control straight back to the TCP-stack at which point the >>> queue is entirely drained before TCP starts transmitting again. >>> >>> So, drop-frequency really doesn't matter because there is no drop. >> >> But is it still codel/fq_codel if it does not implement head drop >> (as described in >> https://datatracker.ietf.org/doc/html/rfc8290#section-4.2) and if >> the control loop >> (https://datatracker.ietf.org/doc/html/rfc8289#section-3.3) is >> changed? (I am also wondering how reducing the default number of >> sub-queues from 1024 to 128 behaves on the background of the >> birthday paradox). > > Not sure where the 128 comes from ? See above: [ sched: FQ_CODEL qlength: 0/128 ] but I might simply be misinterpreting the number here, because reading this again instead of relaying on memory indicates that 128 is the length of each individual queue and not the number of queues? Or is that the length of the hardware queue sitting below fq_codel? Anyway, any hints how to query/configure the fq_codel instance under macos (I am not fluent in BSDeses)? > And birthday paradox does not apply. The magic happens in inp_calc_flowhash() ;-) Thanks will need to spend time reading and understanding the code obviously. Regards Sebastian > > > Cheers, > Christoph > > >> Best Regards Sebastian >> >> P.S.: My definition of working conditions entails bidirectionally >> saturating traffic with responsive and (transiently) under-responsive >> flows. Something like a few long running TCP transfers to generate >> "base-load" and a higher number of TCP flows in IW or slow start to add >> some spice to the whole. In the future, once QUIC actually takes off*, >> adding more well defined/behaved UDP flows to the mix seems reasonable. My >> off the cuff test for the effect of IW used to be to start a browser and >> open a collection of (30-50) tabs getting a nice "thundering herd" of TCP >> flows starting around the same time. But it seems browser makers got too >> smart for me and will not do what I want any more but temporally space the >> different sites in the tabs so that my nice thundering herd is less >> obnoxious (which IMHO is actually the right thing to do for actual usage, >> but for testing it sucks). >> >> *) Occasionally browsing the NANOG archives makes me wonder how the move >> from HTTP/TCP to QUIC/UDP is going to play with operators propensity to >> rate-limit UDP, but that is a different kettle of fish... 
>> >> >>> >>> >>> Christoph >>> >>>> >>>> I predict the consequences of these mistakes will differ according to >>>> the type of traffic applied: >>>> >>>> With TCP traffic over an Internet-scale path, the consequences are not >>>> serious. The tail-drop means that the response at the end of >>>> slow-start will be slower, with a higher peak of intra-flow induced >>>> delay, and there is also a small but measurable risk of tail-loss >>>> causing a more serious application-level delay. These alone *should* >>>> be enough to prompt a fix, if Apple are actually serious about >>>> improving application responsiveness. The fixed marking frequency, >>>> however, is probably invisible for this traffic. >>>> >>>> With TCP traffic over a short-RTT path, the effects are more >>>> pronounced. The delay excursion at the end of slow-start will be >>>> larger in comparison to the baseline RTT, and when the latter is short >>>> enough, the fixed congestion signalling frequency means there will be >>>> some standing queue that real Codel would get rid of. This standing >>>> queue will influence the TCP stack's RTT estimator and thus RTO value, >>>> increasing the delay consequent to tail loss. >>>> >>>> Similar effects to the above can be expected with other reliable stream >>>> transports (SCTP, QUIC), though the details may differ. >>>> >>>> The consequences with non-congestion-controlled traffic could be much >>>> more serious. Real Codel will increase its drop frequency continuously >>>> when faced with overload, eventually gaining control of the queue depth >>>> as long as the load remains finite and reasonably constant. Because >>>> Apple's AQM doesn't increase its drop frequency, the queue depth for >>>> such a flow will increase continuously until either a delay-sensitive >>>> rate selection mechanism is triggered at the sender, or the queue >>>> overflows and triggers burst losses. >>>> >>>> So in the context of this discussion, is it worth generating a type of >>>> load that specifically exercises this failure mode? If so, what does >>>> it look like? >>>> >>>> - Jonathan Morton _______________________________________________ Rpm >>>> mailing list Rpm@lists.bufferbloat.net >>>> https://lists.bufferbloat.net/listinfo/rpm >>> _______________________________________________ Rpm mailing list >>> Rpm@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/rpm ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-11 7:31 ` Sebastian Moeller @ 2021-10-11 9:01 ` Jonathan Morton 2021-10-11 10:03 ` Sebastian Moeller 2021-10-11 17:34 ` Christoph Paasch 1 sibling, 1 reply; 19+ messages in thread From: Jonathan Morton @ 2021-10-11 9:01 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Christoph Paasch, Rpm > On 11 Oct, 2021, at 10:31 am, Sebastian Moeller <moeller0@gmx.de> wrote: > >>> (I am also wondering how reducing the default number of >>> sub-queues from 1024 to 128 behaves on the background of the >>> birthday paradox). >> >> Not sure where the 128 comes from ? > > See above: > [ sched: FQ_CODEL qlength: 0/128 ] > but I might simply be misinterpreting the number here… Yes, I think so. This probably refers to the maximum number of packets that can be enqueued in total, and has no relation to the number of hash buckets that may or may not be present - though obviously you can't have more occupied queues than there are enqueued packets. - Jonathan Morton ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-11 9:01 ` Jonathan Morton @ 2021-10-11 10:03 ` Sebastian Moeller 0 siblings, 0 replies; 19+ messages in thread From: Sebastian Moeller @ 2021-10-11 10:03 UTC (permalink / raw) To: Jonathan Morton; +Cc: Christoph Paasch, Rpm Hi Jonathan, > On Oct 11, 2021, at 11:01, Jonathan Morton <chromatix99@gmail.com> wrote: > >> On 11 Oct, 2021, at 10:31 am, Sebastian Moeller <moeller0@gmx.de> wrote: >> >>>> (I am also wondering how reducing the default number of >>>> sub-queues from 1024 to 128 behaves on the background of the >>>> birthday paradox). >>> >>> Not sure where the 128 comes from ? >> >> See above: >> [ sched: FQ_CODEL qlength: 0/128 ] >> but I might simply be misinterpreting the number here… > > Yes, I think so. Thanks. > This probably refers to the maximum number of packets that can be enqueued in total, and has no relation to the number of hash buckets that may or may not be present - though obviously you can't have more occupied queues than there are enqueued packets. Do you have a link to the fq_codel source code for macos/iOS that I could use he next time to first do some research, by any chance? Best Regards Sebastian > > - Jonathan Morton ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-11 7:31 ` Sebastian Moeller 2021-10-11 9:01 ` Jonathan Morton @ 2021-10-11 17:34 ` Christoph Paasch 2021-10-12 10:23 ` Sebastian Moeller 1 sibling, 1 reply; 19+ messages in thread From: Christoph Paasch @ 2021-10-11 17:34 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Jonathan Morton, Rpm On 10/11/21 - 09:31, Sebastian Moeller wrote: > > On Oct 9, 2021, at 01:32, Christoph Paasch <cpaasch@apple.com> wrote: > > > > On 10/07/21 - 12:30, Sebastian Moeller wrote: > >> Hi Christoph, > >> > >>> On Oct 7, 2021, at 02:11, Christoph Paasch via Rpm > >>> <rpm@lists.bufferbloat.net> wrote: > >>> > >>> On 10/07/21 - 02:18, Jonathan Morton via Rpm wrote: > >>>>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm > >>>>> <rpm@lists.bufferbloat.net> wrote: There are additional cases where, > >>>>> perhaps, the fq component works, and the aqm doesn't. > >>>> > >>>> Such as Apple's version of FQ-Codel? The source code is public, so we > >>>> might as well talk about it. > >>> > >>> Let's not just talk about it, but actually read it ;-) > >>> > >>>> There are two deviations I know about in the AQM portion of that. > >>>> First is that they do the marking and/or dropping at the tail of the > >>>> queue, not the head. Second is that the marking/dropping frequency is > >>>> fixed, instead of increasing during a continuous period of congestion > >>>> as real Codel does. > >>> > >>> We don't drop/mark locally generated traffic (which is the use-case we > >>> care abhout). > >> > >> In this discussion probably true, but I recall that one reason why > >> sch_fq_codel is a more versatile qdisc compared to sch_fq under > >> Linux is that fq excels for locally generated traffic, while > >> fq_codel is also working well for forwarded traffic. And I use > >> "forwarding" here to encompass things like VMs running on a host, > >> where direct "back-pressure" will not work... > > > > Our main use-case is iOS. This is by far the most common case and thus there > > are no VMs or alike. All traffic is generated locally by our TCP > > implementation. > > Ah, explains your priorities. Yes - we are aware of these issues for forwarding or VM-generated traffic. But the amount of traffic is so much lower compared to the other use-cases that it is not even a drop in the bucket. 
> My only iOS device is 11 years old, and as far as I understand does not support fq_codel at all, so my testing is restricted to my macbooks and was under mojave and catalina: > > macbook:~ user$ sudo netstat -I en4 -qq > en4: > [ sched: FQ_CODEL qlength: 0/128 ] > [ pkts: 126518 bytes: 60151318 dropped pkts: 0 bytes: 0 ] > ===================================================== > [ pri: CTL (0) srv_cl: 0x480190 quantum: 600 drr_max: 8 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 16969 bytes: 1144841 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: VO (1) srv_cl: 0x400180 quantum: 600 drr_max: 8 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 0 bytes: 0 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: VI (2) srv_cl: 0x380100 quantum: 3000 drr_max: 6 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 0 bytes: 0 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: RV (3) srv_cl: 0x300110 quantum: 3000 drr_max: 6 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 0 bytes: 0 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: AV (4) srv_cl: 0x280120 quantum: 3000 drr_max: 6 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 0 bytes: 0 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: OAM (5) srv_cl: 0x200020 quantum: 1500 drr_max: 4 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 0 bytes: 0 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: RD (6) srv_cl: 0x180010 quantum: 1500 drr_max: 4 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 78 bytes: 13943 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: BE (7) srv_cl: 0x0 quantum: 1500 drr_max: 4 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 98857 bytes: 56860512 ] > [ budget: 0 target qdelay: 5.00 msec update 
interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: BK (8) srv_cl: 0x100080 quantum: 1500 drr_max: 2 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 10565 bytes: 2126520 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > ===================================================== > [ pri: BK_SYS (9) srv_cl: 0x80090 quantum: 1500 drr_max: 2 ] > [ queued pkts: 0 bytes: 0 ] > [ dequeued pkts: 49 bytes: 5502 ] > [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] > [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] > [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] > [ flows total: 0 new: 0 old: 0 ] > [ throttle on: 0 off: 0 drop: 0 ] > hms-beagle2:~ smoeller$ > > macbook:~ user$ > > > > > > > > >>>> We signal flow-control straight back to the TCP-stack at which point the > >>> queue is entirely drained before TCP starts transmitting again. > >>> > >>> So, drop-frequency really doesn't matter because there is no drop. > >> > >> But is it still codel/fq_codel if it does not implement head drop > >> (as described in > >> https://datatracker.ietf.org/doc/html/rfc8290#section-4.2) and if > >> the control loop > >> (https://datatracker.ietf.org/doc/html/rfc8289#section-3.3) is > >> changed? (I am also wondering how reducing the default number of > >> sub-queues from 1024 to 128 behaves on the background of the > >> birthday paradox). > > > > Not sure where the 128 comes from ? > > See above: > [ sched: FQ_CODEL qlength: 0/128 ] > but I might simply be misinterpreting the number here, because reading this again instead of relaying on memory indicates that 128 is the length of each individual queue and not the number of queues? Or is that the length of the hardware queue sitting below fq_codel? This 128 can be safely ignored. It has no meaning :) > Anyway, any hints how to query/configure the fq_codel instance under macos (I am not fluent in BSDeses)? For querying, netstat -qq as you already saw. There is not much to configure... Just two sysctls: net.classq.target_qdelay: 0 net.classq.update_interval: 0 0 means that the system's default is in use. Christoph > > And birthday paradox does not apply. The magic happens in inp_calc_flowhash() ;-) > > Thanks will need to spend time reading and understanding the code obviously. > > Regards > Sebastian > > > > > > > Cheers, > > Christoph > > > > > >> Best Regards Sebastian > >> > >> P.S.: My definition of working conditions entails bidirectionally > >> saturating traffic with responsive and (transiently) under-responsive > >> flows. Something like a few long running TCP transfers to generate > >> "base-load" and a higher number of TCP flows in IW or slow start to add > >> some spice to the whole. In the future, once QUIC actually takes off*, > >> adding more well defined/behaved UDP flows to the mix seems reasonable. My > >> off the cuff test for the effect of IW used to be to start a browser and > >> open a collection of (30-50) tabs getting a nice "thundering herd" of TCP > >> flows starting around the same time. 
But it seems browser makers got too > >> smart for me and will not do what I want any more but temporally space the > >> different sites in the tabs so that my nice thundering herd is less > >> obnoxious (which IMHO is actually the right thing to do for actual usage, > >> but for testing it sucks). > >> > >> *) Occasionally browsing the NANOG archives makes me wonder how the move > >> from HTTP/TCP to QUIC/UDP is going to play with operators propensity to > >> rate-limit UDP, but that is a different kettle of fish... > >> > >> > >>> > >>> > >>> Christoph > >>> > >>>> > >>>> I predict the consequences of these mistakes will differ according to > >>>> the type of traffic applied: > >>>> > >>>> With TCP traffic over an Internet-scale path, the consequences are not > >>>> serious. The tail-drop means that the response at the end of > >>>> slow-start will be slower, with a higher peak of intra-flow induced > >>>> delay, and there is also a small but measurable risk of tail-loss > >>>> causing a more serious application-level delay. These alone *should* > >>>> be enough to prompt a fix, if Apple are actually serious about > >>>> improving application responsiveness. The fixed marking frequency, > >>>> however, is probably invisible for this traffic. > >>>> > >>>> With TCP traffic over a short-RTT path, the effects are more > >>>> pronounced. The delay excursion at the end of slow-start will be > >>>> larger in comparison to the baseline RTT, and when the latter is short > >>>> enough, the fixed congestion signalling frequency means there will be > >>>> some standing queue that real Codel would get rid of. This standing > >>>> queue will influence the TCP stack's RTT estimator and thus RTO value, > >>>> increasing the delay consequent to tail loss. > >>>> > >>>> Similar effects to the above can be expected with other reliable stream > >>>> transports (SCTP, QUIC), though the details may differ. > >>>> > >>>> The consequences with non-congestion-controlled traffic could be much > >>>> more serious. Real Codel will increase its drop frequency continuously > >>>> when faced with overload, eventually gaining control of the queue depth > >>>> as long as the load remains finite and reasonably constant. Because > >>>> Apple's AQM doesn't increase its drop frequency, the queue depth for > >>>> such a flow will increase continuously until either a delay-sensitive > >>>> rate selection mechanism is triggered at the sender, or the queue > >>>> overflows and triggers burst losses. > >>>> > >>>> So in the context of this discussion, is it worth generating a type of > >>>> load that specifically exercises this failure mode? If so, what does > >>>> it look like? > >>>> > >>>> - Jonathan Morton _______________________________________________ Rpm > >>>> mailing list Rpm@lists.bufferbloat.net > >>>> https://lists.bufferbloat.net/listinfo/rpm > >>> _______________________________________________ Rpm mailing list > >>> Rpm@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/rpm > ^ permalink raw reply [flat|nested] 19+ messages in thread
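Jonathan's last point about unresponsive traffic lends itself to a small numerical sketch. The following is deliberately simplified and is not Apple's code: it tracks queue length rather than sojourn time, never leaves the dropping state, and the only Codel-like ingredient is RFC 8289's interval/sqrt(count) drop schedule; all rates are invented. It only isolates the escalating-versus-fixed-frequency question: with a fixed drop frequency the backlog from an unresponsive flow grows without bound, while the escalating schedule eventually sheds packets as fast as the overload arrives.

from math import sqrt

# Simplified comparison: an unresponsive 150 pkt/s flow through a 100 pkt/s
# link for 30 s. One AQM drops on a fixed 100 ms schedule; the other follows
# a Codel-style interval/sqrt(count) schedule, so its drop rate escalates.

ARRIVAL = 150.0    # pkt/s, ignores drops (non-congestion-controlled)
SERVICE = 100.0    # pkt/s link rate
INTERVAL = 0.100   # base drop interval, 100 ms
DT = 0.001         # 1 ms simulation step
DURATION = 30.0    # seconds

def run(fixed_frequency: bool) -> float:
    queue, t = 0.0, 0.0
    next_drop, count = INTERVAL, 1
    while t < DURATION:
        queue = max(queue + (ARRIVAL - SERVICE) * DT, 0.0)
        if queue >= 1.0 and t >= next_drop:
            queue -= 1.0                                  # drop (or mark) one packet
            if fixed_frequency:
                next_drop = t + INTERVAL                  # never speeds up
            else:
                count += 1
                next_drop = t + INTERVAL / sqrt(count)    # escalates, Codel-style
        t += DT
    return queue

print("backlog after 30 s, fixed 10 drops/s     :", round(run(True)))   # ~1200 pkts and still climbing
print("backlog after 30 s, escalating drop rate :", round(run(False)))  # stays near zero once drops match the overload

Real Codel is of course driven by packet sojourn time against a 5 ms target and resets its count when the queue empties again; the sketch deliberately leaves all of that out.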
* Re: [Rpm] Alternate definitions of "working condition" - unnecessary? 2021-10-11 17:34 ` Christoph Paasch @ 2021-10-12 10:23 ` Sebastian Moeller 0 siblings, 0 replies; 19+ messages in thread From: Sebastian Moeller @ 2021-10-12 10:23 UTC (permalink / raw) To: Christoph Paasch; +Cc: Jonathan Morton, Rpm Hi Christoph, > On Oct 11, 2021, at 19:34, Christoph Paasch <cpaasch@apple.com> wrote: > > On 10/11/21 - 09:31, Sebastian Moeller wrote: >>> On Oct 9, 2021, at 01:32, Christoph Paasch <cpaasch@apple.com> wrote: >>> >>> On 10/07/21 - 12:30, Sebastian Moeller wrote: >>>> Hi Christoph, >>>> >>>>> On Oct 7, 2021, at 02:11, Christoph Paasch via Rpm >>>>> <rpm@lists.bufferbloat.net> wrote: >>>>> >>>>> On 10/07/21 - 02:18, Jonathan Morton via Rpm wrote: >>>>>>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm >>>>>>> <rpm@lists.bufferbloat.net> wrote: There are additional cases where, >>>>>>> perhaps, the fq component works, and the aqm doesn't. >>>>>> >>>>>> Such as Apple's version of FQ-Codel? The source code is public, so we >>>>>> might as well talk about it. >>>>> >>>>> Let's not just talk about it, but actually read it ;-) >>>>> >>>>>> There are two deviations I know about in the AQM portion of that. >>>>>> First is that they do the marking and/or dropping at the tail of the >>>>>> queue, not the head. Second is that the marking/dropping frequency is >>>>>> fixed, instead of increasing during a continuous period of congestion >>>>>> as real Codel does. >>>>> >>>>> We don't drop/mark locally generated traffic (which is the use-case we >>>>> care about). >>>> >>>> In this discussion probably true, but I recall that one reason why >>>> sch_fq_codel is a more versatile qdisc compared to sch_fq under >>>> Linux is that fq excels for locally generated traffic, while >>>> fq_codel is also working well for forwarded traffic. And I use >>>> "forwarding" here to encompass things like VMs running on a host, >>>> where direct "back-pressure" will not work... >>> >>> Our main use-case is iOS. This is by far the most common case and thus there >>> are no VMs or alike. All traffic is generated locally by our TCP >>> implementation. >> >> Ah, explains your priorities. > > Yes - we are aware of these issues for forwarding or VM-generated traffic. > > But the amount of traffic is so much lower compared to the other use-cases > that it is not even a drop in the bucket. [SM] One last try ;), how does this affect using an iOS device as a mobile AP to get other devices online?
> >> My only iOS device is 11 years old, and as far as I understand does not support fq_codel at all, so my testing is restricted to my macbooks and was under mojave and catalina: >> >> macbook:~ user$ sudo netstat -I en4 -qq >> en4: >> [ sched: FQ_CODEL qlength: 0/128 ] >> [ pkts: 126518 bytes: 60151318 dropped pkts: 0 bytes: 0 ] >> ===================================================== >> [ pri: CTL (0) srv_cl: 0x480190 quantum: 600 drr_max: 8 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 16969 bytes: 1144841 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: VO (1) srv_cl: 0x400180 quantum: 600 drr_max: 8 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 0 bytes: 0 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: VI (2) srv_cl: 0x380100 quantum: 3000 drr_max: 6 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 0 bytes: 0 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: RV (3) srv_cl: 0x300110 quantum: 3000 drr_max: 6 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 0 bytes: 0 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: AV (4) srv_cl: 0x280120 quantum: 3000 drr_max: 6 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 0 bytes: 0 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: OAM (5) srv_cl: 0x200020 quantum: 1500 drr_max: 4 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 0 bytes: 0 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: RD (6) srv_cl: 0x180010 quantum: 1500 drr_max: 4 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 78 bytes: 13943 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: BE (7) srv_cl: 0x0 quantum: 1500 drr_max: 4 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued 
pkts: 98857 bytes: 56860512 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: BK (8) srv_cl: 0x100080 quantum: 1500 drr_max: 2 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 10565 bytes: 2126520 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> ===================================================== >> [ pri: BK_SYS (9) srv_cl: 0x80090 quantum: 1500 drr_max: 2 ] >> [ queued pkts: 0 bytes: 0 ] >> [ dequeued pkts: 49 bytes: 5502 ] >> [ budget: 0 target qdelay: 5.00 msec update interval:100.00 msec ] >> [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ] >> [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ] >> [ flows total: 0 new: 0 old: 0 ] >> [ throttle on: 0 off: 0 drop: 0 ] >> hms-beagle2:~ smoeller$ >> >> macbook:~ user$ >> >> >> >> >> >>> >>>>>> We signal flow-control straight back to the TCP-stack at which point the >>>>> queue is entirely drained before TCP starts transmitting again. >>>>> >>>>> So, drop-frequency really doesn't matter because there is no drop. >>>> >>>> But is it still codel/fq_codel if it does not implement head drop >>>> (as described in >>>> https://datatracker.ietf.org/doc/html/rfc8290#section-4.2) and if >>>> the control loop >>>> (https://datatracker.ietf.org/doc/html/rfc8289#section-3.3) is >>>> changed? (I am also wondering how reducing the default number of >>>> sub-queues from 1024 to 128 behaves on the background of the >>>> birthday paradox). >>> >>> Not sure where the 128 comes from ? >> >> See above: >> [ sched: FQ_CODEL qlength: 0/128 ] >> but I might simply be misinterpreting the number here, because reading this again instead of relaying on memory indicates that 128 is the length of each individual queue and not the number of queues? Or is that the length of the hardware queue sitting below fq_codel? > > This 128 can be safely ignored. It has no meaning :) [SM] Thanks! and excuse my confusion. > >> Anyway, any hints how to query/configure the fq_codel instance under macos (I am not fluent in BSDeses)? > > For querying, netstat -qq as you already saw. > > There is not much to configure... Just two sysctls: > > net.classq.target_qdelay: 0 > net.classq.update_interval: 0 > > 0 means that the system's default is in use. [SM] Okay that is a refreshing uncomplicated UI, a bit hidden, but since the default 5/100 work well, this seems a decent choice. Thanks again Regards Sebastian > > > Christoph > >>> And birthday paradox does not apply. The magic happens in inp_calc_flowhash() ;-) >> >> Thanks will need to spend time reading and understanding the code obviously. >> >> Regards >> Sebastian >> >>> >>> >>> Cheers, >>> Christoph >>> >>> >>>> Best Regards Sebastian >>>> >>>> P.S.: My definition of working conditions entails bidirectionally >>>> saturating traffic with responsive and (transiently) under-responsive >>>> flows. Something like a few long running TCP transfers to generate >>>> "base-load" and a higher number of TCP flows in IW or slow start to add >>>> some spice to the whole. 
In the future, once QUIC actually takes off*, >>>> adding more well defined/behaved UDP flows to the mix seems reasonable. My >>>> off the cuff test for the effect of IW used to be to start a browser and >>>> open a collection of (30-50) tabs getting a nice "thundering herd" of TCP >>>> flows starting around the same time. But it seems browser makers got too >>>> smart for me and will not do what I want any more but temporally space the >>>> different sites in the tabs so that my nice thundering herd is less >>>> obnoxious (which IMHO is actually the right thing to do for actual usage, >>>> but for testing it sucks). >>>> >>>> *) Occasionally browsing the NANOG archives makes me wonder how the move >>>> from HTTP/TCP to QUIC/UDP is going to play with operators propensity to >>>> rate-limit UDP, but that is a different kettle of fish... >>>> >>>> >>>>> >>>>> >>>>> Christoph >>>>> >>>>>> >>>>>> I predict the consequences of these mistakes will differ according to >>>>>> the type of traffic applied: >>>>>> >>>>>> With TCP traffic over an Internet-scale path, the consequences are not >>>>>> serious. The tail-drop means that the response at the end of >>>>>> slow-start will be slower, with a higher peak of intra-flow induced >>>>>> delay, and there is also a small but measurable risk of tail-loss >>>>>> causing a more serious application-level delay. These alone *should* >>>>>> be enough to prompt a fix, if Apple are actually serious about >>>>>> improving application responsiveness. The fixed marking frequency, >>>>>> however, is probably invisible for this traffic. >>>>>> >>>>>> With TCP traffic over a short-RTT path, the effects are more >>>>>> pronounced. The delay excursion at the end of slow-start will be >>>>>> larger in comparison to the baseline RTT, and when the latter is short >>>>>> enough, the fixed congestion signalling frequency means there will be >>>>>> some standing queue that real Codel would get rid of. This standing >>>>>> queue will influence the TCP stack's RTT estimator and thus RTO value, >>>>>> increasing the delay consequent to tail loss. >>>>>> >>>>>> Similar effects to the above can be expected with other reliable stream >>>>>> transports (SCTP, QUIC), though the details may differ. >>>>>> >>>>>> The consequences with non-congestion-controlled traffic could be much >>>>>> more serious. Real Codel will increase its drop frequency continuously >>>>>> when faced with overload, eventually gaining control of the queue depth >>>>>> as long as the load remains finite and reasonably constant. Because >>>>>> Apple's AQM doesn't increase its drop frequency, the queue depth for >>>>>> such a flow will increase continuously until either a delay-sensitive >>>>>> rate selection mechanism is triggered at the sender, or the queue >>>>>> overflows and triggers burst losses. >>>>>> >>>>>> So in the context of this discussion, is it worth generating a type of >>>>>> load that specifically exercises this failure mode? If so, what does >>>>>> it look like? >>>>>> >>>>>> - Jonathan Morton _______________________________________________ Rpm >>>>>> mailing list Rpm@lists.bufferbloat.net >>>>>> https://lists.bufferbloat.net/listinfo/rpm >>>>> _______________________________________________ Rpm mailing list >>>>> Rpm@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/rpm >> ^ permalink raw reply [flat|nested] 19+ messages in thread
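For anyone wanting to reproduce the "working conditions" recipe in the P.S. above, one possible (unofficial) rendition is sketched below: a few long-running netperf streams in each direction as base load, a burst of short streams halfway through standing in for the thundering herd, and a plain ping as the latency probe. It assumes netperf is installed and a netserver is reachable at the placeholder address netperf.example.net; Flent's rrul-style tests cover similar ground far more rigorously.

#!/usr/bin/env python3
# Sketch of the "working conditions" recipe above: bidirectional base load
# plus a late burst of short flows, with latency logged throughout.
# Assumes netperf/netserver; HOST is a placeholder, adjust counts and flags to taste.
import subprocess, time

HOST = "netperf.example.net"   # placeholder: point this at your own netserver
BASE_FLOWS = 4                 # long-running flows per direction ("base load")
HERD_FLOWS = 30                # short flows released at once ("thundering herd")
DURATION = 60                  # seconds of base load

def netperf(test: str, secs: int) -> subprocess.Popen:
    # TCP_STREAM pushes data to HOST, TCP_MAERTS pulls it back; -P 0 quiets the banners
    return subprocess.Popen(
        ["netperf", "-H", HOST, "-t", test, "-l", str(secs), "-P", "0"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

ping_log = open("ping.log", "w")
ping = subprocess.Popen(["ping", "-i", "0.2", HOST], stdout=ping_log)

procs = []
for _ in range(BASE_FLOWS):                      # saturate both directions
    procs.append(netperf("TCP_STREAM", DURATION))
    procs.append(netperf("TCP_MAERTS", DURATION))

time.sleep(DURATION / 2)                         # let the base load settle first
procs += [netperf("TCP_STREAM", 5) for _ in range(HERD_FLOWS)]  # short flows, much of their life in slow start

for p in procs:
    p.wait()
ping.terminate()
ping_log.close()
print("done; latency under load is in ping.log")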
end of thread, other threads:[~2021-10-12 10:23 UTC | newest]

Thread overview: 19+ messages
2021-10-06 19:11 [Rpm] Alternate definitions of "working condition" - unnecessary? Rich Brown
2021-10-06 20:36 ` Jonathan Foulkes
2021-10-07 16:40 ` Toke Høiland-Jørgensen
2021-10-07 18:49 ` Dave Taht
2021-10-08 17:51 ` Toke Høiland-Jørgensen
2021-10-07 21:39 ` Rich Brown
2021-10-06 21:22 ` Dave Taht
2021-10-06 23:18 ` Jonathan Morton
2021-10-07 0:11 ` Christoph Paasch
2021-10-07 10:29 ` Jonathan Morton
2021-10-07 15:44 ` [Rpm] apple's fq_"codel" implementation Dave Taht
2021-10-07 10:30 ` [Rpm] Alternate definitions of "working condition" - unnecessary? Sebastian Moeller
2021-10-08 0:33 ` Jonathan Morton
2021-10-08 23:32 ` Christoph Paasch
2021-10-11 7:31 ` Sebastian Moeller
2021-10-11 9:01 ` Jonathan Morton
2021-10-11 10:03 ` Sebastian Moeller
2021-10-11 17:34 ` Christoph Paasch
2021-10-12 10:23 ` Sebastian Moeller