From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gettysjim@gmail.com>
Received: from mail-ed1-f43.google.com (mail-ed1-f43.google.com
 [209.85.208.43])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id B93563B29E
 for <bloat@lists.bufferbloat.net>; Mon, 25 Jun 2018 19:54:35 -0400 (EDT)
Received: by mail-ed1-f43.google.com with SMTP id g12-v6so1839165edi.9
 for <bloat@lists.bufferbloat.net>; Mon, 25 Jun 2018 16:54:35 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=aGvH+xea8yrTjufamjS+lx6jQTX0P6zob2i+tYyQgDI=;
 b=O4i72SVfEfmEk4NiGEayeRzqerifg2YcrfLTVeiZdpoTBCCqsWBwOXHDTjv0FY2U+S
 KWKJsdNQU6Rr0F2FIgPP5so8/lXOKE1Xn+CF6yLzm/DbP6H90/ObJd7G32r0/FvmF+z1
 3CNNf41Tq4tB6J9gOO2eD3sr99YFv0v0u6WZIFcs9o3xR4KrKXq7Cg3VYBZ2RSDTkZS0
 8Ljgh7RCFrZozIJZpSj9tD0l47ysnpKETnfhL9AHBLtL4BrEU2yKSBOxXz2tlBEjITnK
 CXCgOn486yqJotr2ARk+3r7OGx5jtszSZqd7j1fZ6ED/Jz4mn6zcKTPKkQremmwYL2dF
 QkIw==
X-Gm-Message-State: APt69E2czPkn5wSO3fCL5eBR0Bbej6esmldXon2gObK9Uwy7KfrEIe7H
 MN2pYHgwrQbMcXikXRJeEwZz0XdJveckSEm3wLtxrw==
X-Google-Smtp-Source: ADUXVKJ5UNrquQcH1Zx5Ie+9yC/ItQPCCnFbi+afj9zD/NUW3F1tPd9PpAIXSoxClxdWwkhCOG5F90QdPvmbX0qYTww=
X-Received: by 2002:a50:a1a7:: with SMTP id
 36-v6mr13101905edk.287.1529970874759; 
 Mon, 25 Jun 2018 16:54:34 -0700 (PDT)
MIME-Version: 1.0
References: <CAA93jw4p0chU_d4jEyqadqCLAMMhLsjf7NX0DjKKiCoPjxBCSA@mail.gmail.com>
 <8736xgsdcp.fsf@toke.dk> <838b212e-7a8c-6139-1306-9e60bfda926b@gmail.com>
 <CAA93jw48fVZUoPrR_sdZ5O0mEh_fQCeKx5k28pRTzbdM+7Rerg@mail.gmail.com>
 <8f80b36b-ef81-eadc-6218-350132f4d56a@pollere.com>
 <CAA93jw7R2+yogxMHJWaS6vMwzeJUOfq9T7MWNg0JQfBcEm=0CQ@mail.gmail.com>
 <9dbb8dc8-bec6-8252-c063-ff0ba5fd7c1a@pollere.com>
 <C03B3E60-FD68-46BC-930F-143879904767@gmail.com>
 <25305.1529678986@localhost> <47EC21F5-94D2-4982-B0BE-FA1FA30E7C88@gmail.com>
 <18224.1529704505@localhost> <87muvjnobj.fsf@toke.dk>
In-Reply-To: <87muvjnobj.fsf@toke.dk>
From: Jim Gettys <jg@freedesktop.org>
Date: Mon, 25 Jun 2018 19:54:18 -0400
Message-ID: <CAGhGL2BAnd7LAncKoK=1rWcTZTqpZo33BKOTFrNmrHzin2B8Vw@mail.gmail.com>
To: =?UTF-8?B?VG9rZSBIw7hpbGFuZC1Kw7hyZ2Vuc2Vu?= <toke@toke.dk>
Cc: Michael Richardson <mcr@sandelman.ca>, bloat <bloat@lists.bufferbloat.net>
Content-Type: multipart/alternative; boundary="00000000000081be88056f801a74"
Subject: Re: [Bloat] lwn.net's tcp small queues vs wifi aggregation solved
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Mon, 25 Jun 2018 23:54:36 -0000

--00000000000081be88056f801a74
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Mon, Jun 25, 2018 at 6:38 AM Toke H=C3=B8iland-J=C3=B8rgensen <toke@toke=
.dk> wrote:

> Michael Richardson <mcr@sandelman.ca> writes:
>
> > Jonathan Morton <chromatix99@gmail.com> wrote:
> >     >>> I would instead frame the problem as "how can we get hardware
> >     >>> to incorporate extra packets, which arrive between the request
> >     >>> and grant phases of the MAC, into the same TXOP?"  Then we no
> >     >>> longer need to think probabilistically, or induce unnecessary
> >     >>> delay in the case that no further packets arrive.
> >     >>
> >     >> I've never looked at the ring/buffer/descriptor structure of
> >     >> the ath9k, but with most ethernet devices, they would just
> >     >> continue reading descriptors until it was empty.   Is there
> >     >> some reason that something similar can not occur?
> >     >>
> >     >> Or is the problem at a higher level?
> >     >> Or is that we don't want to enqueue packets so early, because
> >     >> it's a source of bloat?
> >
> >     > The question is of when the aggregate frame is constructed and
> >     > "frozen", using only the packets in the queue at that instant.
> >     > When the MAC grant occurs, transmission must begin immediately,
> >     > so most hardware prepares the frame in advance of that moment -
> >     > but how far in advance?
> >
> > Oh, I understand now.  The aggregate frame has to be constructed, and
> > it's this frame that is actually in the xmit queue.  I'm guessing that
> > it's in the hardware, because if it was in the driver, then we could
> > perhaps do something?
>
> No, it's in the driver for ath9k. So it would be possible to delay it
> slightly to try to build a larger one. The timing constraints are too
> tight to do it reactively when the request is granted, though; so
> delaying would result in idleness if there are no other flows to queue
> before then...
>
> Even for devices that build aggregates in firmware or hardware (as all
> AC chipsets do), it might be possible to throttle the queues at higher
> levels to try to get better batching. It's just not obvious that there's
> an algorithm that can do this in a way that will "do no harm" for other
> types of traffic, for instance...
>
>

Isn't this sort of delay a natural consequence of a busy channel?

What matters is not conserving txops *all the time*, but conserving them
when the channel is busy and there are no spare txops available.

When you are trying to transmit on a busy channel, the contention time
naturally increases, since you can't get a transmit opportunity
immediately.  More packets arrive while you wait, so you should queue
them into a larger aggregate in that case.

We only care about conserving txops when they are scarce, not when they are
abundant.
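
A rough sketch of that arithmetic, in Python.  The function names, the
1000 packets/s load and the 32-packet cap are illustrative assumptions,
not taken from ath9k or any real driver, and arrivals are assumed to be
evenly spaced:

```python
def ceil_div(a, b):
    """Integer ceiling division without importing math."""
    return -(-a // b)


def packets_per_aggregate(rate_pps, delay_us, max_pkts):
    """Packets sent in one txop: the packet that triggered the channel
    request plus those that arrived during the wait, up to max_pkts."""
    return min(max_pkts, 1 + rate_pps * delay_us // 1_000_000)


def txops_per_second(rate_pps, delay_us, max_pkts):
    """Txops needed per second to carry rate_pps at this access delay."""
    return ceil_div(
        rate_pps, packets_per_aggregate(rate_pps, delay_us, max_pkts)
    )


# Hypothetical load: 1000 packets/s offered, at most 32 per aggregate.
for delay_us in (0, 1_000, 5_000, 20_000, 100_000):
    print(delay_us, packets_per_aggregate(1000, delay_us, 32),
          txops_per_second(1000, delay_us, 32))
```

In this toy model every packet gets its own txop on an idle channel,
while at 20 ms of access delay the same load needs about 48 txops per
second instead of 1000.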

This principle is why a window system as crazy as X11 is competitive: it
naturally becomes more efficient in the face of load.  More and more
requests batch up and are handled together, so the system reaches its
maximum efficiency at full load.

Or am I missing something here?

Jim

--00000000000081be88056f801a74--