[Rpm] apple's fq_"codel" implementation

Dave Taht dave.taht at gmail.com
Thu Oct 7 11:44:30 EDT 2021


On Thu, Oct 7, 2021 at 3:29 AM Jonathan Morton <chromatix99 at gmail.com> wrote:
>
> > On 7 Oct, 2021, at 3:11 am, Christoph Paasch <cpaasch at apple.com> wrote:
> >
> >>> On 7 Oct, 2021, at 12:22 am, Dave Taht via Rpm <rpm at lists.bufferbloat.net> wrote:
> >>> There are additional cases where, perhaps, the fq component works, and the aqm doesn't.
> >>
> >> Such as Apple's version of FQ-Codel?  The source code is public, so we might as well talk about it.
> >
> > Let's not just talk about it, but actually read it ;-)

Since enough people have now actually read the code, and there are two
students performing experiments,
we can have this conversation.

> >> There are two deviations I know about in the AQM portion of that.  First is that they do the marking and/or dropping at the tail of the queue, not the head.  Second is that the marking/dropping frequency is fixed, instead of increasing during a continuous period of congestion as real Codel does.
> >
> > We don't drop/mark locally generated traffic (which is the use-case we care about).

"We", who?  :)

It's unclear as to what happens in the case of virtualization.

It's unclear what happens with UDP flows.

It's unclear what happens with tunneled flows (userspace VPNs).

It's unclear what happens with sockets, rather than the Apple APIs.

What I observed - exercising sockets (16 netperf flows plus 4 irtt
flows, with OSX as the target) - was a sharp spike in the
"drop_overload" statistic and TCP RSTs in the captures. That inspired
me to inspect the code to see what was being hit, and to be a mite
:deleted: at what I thought were two essential components of the
codel AQM not being there.

At the time I had WAY more sources of error in my network setup than
I'd have cared for, and I got pulled into something else before I
could quell my uncertainties here.

> > We signal flow-control straight back to the TCP-stack at which point the queue
> > is entirely drained before TCP starts transmitting again.

This is rather bursty.  The 1/count reduction in the drop scheduler
(or in this case the "pushback scheduler") should gradually reduce
the needed local buffering in the queue to 5ms (or, in Apple's case,
10ms), and better compensate for the natural variability of wifi and
LTE.
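
To make that concrete, here is a minimal sketch of the control law in
question - plain C written for this thread, not Apple's or Linux's
actual code, and with real codel's first_above_time grace logic left
out.  Once the standing queue exceeds the target, drops (or ECN marks)
get scheduled at interval/sqrt(count), so the signal keeps ramping up
until the queue falls back under target:

/*
 * Toy codel control law (RFC 8289 defaults: 5ms target, 100ms
 * interval; Apple's variant reportedly uses a 10ms target).
 * Simplified: real codel also waits until the sojourn time has stayed
 * above target for a full interval before the first drop.
 */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define CODEL_TARGET_US    5000      /* 5 ms standing-queue target */
#define CODEL_INTERVAL_US  100000    /* 100 ms initial interval */

struct codel_state {
    uint32_t count;          /* drops since entering the dropping state */
    uint64_t drop_next_us;   /* time of the next scheduled drop */
    int      dropping;       /* are we in the dropping state? */
};

/* The next drop time shrinks as 1/sqrt(count): this is the escalation
 * that a fixed marking/dropping frequency gives up. */
static uint64_t control_law(uint64_t now_us, uint32_t count)
{
    return now_us + (uint64_t)(CODEL_INTERVAL_US / sqrt((double)count));
}

/* Called per dequeued packet with its measured sojourn time; returns 1
 * if the packet should be dropped (or ECN-marked). */
static int codel_should_drop(struct codel_state *st, uint64_t now_us,
                             uint64_t sojourn_us)
{
    if (sojourn_us < CODEL_TARGET_US) {
        st->dropping = 0;            /* queue back under control */
        return 0;
    }
    if (!st->dropping) {
        st->dropping = 1;
        st->count = 1;
        st->drop_next_us = control_law(now_us, st->count);
        return 0;
    }
    if (now_us >= st->drop_next_us) {
        st->count++;                 /* shorter gap to the next drop */
        st->drop_next_us = control_law(now_us, st->count);
        return 1;
    }
    return 0;
}

int main(void)
{
    struct codel_state st = {0};

    /* Pretend the queue stays 20ms deep for half a second: the drops
     * come faster and faster instead of at a fixed rate. */
    for (uint64_t now = 0; now < 500000; now += 1000) {
        if (codel_should_drop(&st, now, 20000))
            printf("drop at %6llu us, count=%u\n",
                   (unsigned long long)now, st.count);
    }
    return 0;
}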

I'd have to go read the code again to remember what the drop_overlimit
behavior was. I had thought that dropping cnt-1 rather than "entirely"
made more sense.

Anyway, there were many, many other variables in play - a queue size
of 300, 2000, or more; the presence of offloads; no BQL; testing how
USB-C ethernet worked.

> > So, drop-frequency really doesn't matter because there is no drop.

It "should" be cutting the cwnd until the queue also is under control.
Without doing that, it will just fill up
immediately again, with the wrong rtt estimate.
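
As a toy illustration - not any real stack's code - of why that
matters: a drop or ECN mark cuts cwnd multiplicatively, shrinking the
offered load, whereas pure flow-control backpressure only pauses the
sender and leaves cwnd untouched, ready to refill the queue the moment
it reopens:

#include <stdio.h>

int main(void)
{
    double cwnd = 100.0;       /* segments in flight before the episode */
    const double beta = 0.5;   /* Reno-style multiplicative decrease */

    /* Signalled path: each mark/drop during the episode halves cwnd. */
    double signalled = cwnd;
    for (int marks = 0; marks < 3; marks++)
        signalled *= beta;

    /* Flow-controlled path: the sender is merely paused, and resumes
     * with the same window (and a stale RTT estimate). */
    double flow_controlled = cwnd;

    printf("after the episode: signalled cwnd=%.1f, "
           "flow-controlled cwnd=%.1f\n", signalled, flow_controlled);
    return 0;
}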

>
> Hmm, that would be more reasonable behaviour for a machine that never has to forward anything - but that is not at all obvious from the source code I found.  I think I'll need to run tests to see what actually happens in practice.

Please!!! I did feel it was potentially a big bug, with some easy
fixes, needing only more eyeballs and time
to diagnose, or at least describe.

>
>  - Jonathan Morton



-- 
Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw

Dave Täht CEO, TekLibre, LLC

