From: Joshua Zhao
Date: Fri, 12 Apr 2019 16:03:49 -0700
Subject: Re: [Make-wifi-fast] tx queue stuck for many minutes
To: Toke Høiland-Jørgensen
Cc: make-wifi-fast@lists.bufferbloat.net
In-Reply-To: <87ftqnvvxl.fsf@toke.dk>
References: <878swfshsk.fsf@toke.dk> <87ftqnvvxl.fsf@toke.dk>

That makes sense. A missing TX completion is a plausible suspect, and I'll check on that.
On the other hand, the reason I asked about back-pressure is that when the problem happens, the UDP TX socket appears stuck and doesn't accept any new packets.

~# netstat -tulnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp        0  22400 0.0.0.0:48439           0.0.0.0:*                           2407/audiod-xxxxx


Basically, the "Send-Q" number (22400 in the example above; I didn't save the exact number from when the problem happens) stays very high for a long time, so the sendto() call simply fails.
This is why I wondered whether back-pressure was being applied. Otherwise, shouldn't the UDP socket keep sending, with packets being dropped by the queue scheduler?
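
One way to observe this from the application side is to put the socket in non-blocking mode and check whether sendto() reports a full send buffer rather than some other error. The following is a minimal sketch in Python, assuming the sender can be switched to non-blocking mode; `make_udp_sender` and `try_send` are hypothetical helper names and are not taken from audiod.

```python
import socket


def make_udp_sender():
    """Create a non-blocking UDP socket (illustrative helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    return sock


def try_send(sock, payload, addr):
    """Attempt one datagram send without blocking.

    Returns True if the kernel accepted the datagram and False if the
    send buffer was full (EAGAIN/EWOULDBLOCK). Any other socket error
    propagates to the caller.
    """
    try:
        sock.sendto(payload, addr)
        return True
    except BlockingIOError:
        # The socket's send buffer has no room for this datagram.
        return False
```

If `try_send` keeps returning False while Send-Q stays high, the datagrams are probably still being held somewhere below the socket rather than being dropped. That reading rests on the assumption that Send-Q counts data the lower layers have accepted but not yet freed.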

Thanks,
Joshua


On Fri, Apr 12, 2019 at 12:57 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
Joshua Zhao <swzhao@gmail.com> writes:

> Hi,
> Thanks for the reply! I've also emailed the ath10k and linux-wireless
> lists and am waiting to hear back with suggestions.
> In the meantime, can you educate me on how the AQM queue interacts with
> the wifi driver? Is it that the driver pulls from the queue from time to
> time, instead of the AQM pushing to the network interface? How often does
> the driver pull, or what triggers it?

Generally two paths:

1. Packet comes in from upper netdev -> mac80211 queues the packet to tx ->
   driver is notified through wake_tx_queue() op, driver initiates
   transmission scheduling and pulls from TXQ

and

2. Driver gets notification from hardware (mostly TX completion) ->
   driver initiates TX scheduling and pulls from TXQ

There are some more cases that are variants of the above (e.g., wakeup
from powersave etc). My guess is that in your case it is one of the
cases in the second category that goes wrong...
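
The two paths above, and why a lost TX completion can stall the queue, can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: `ToyDriver` and `hw_slots` are hypothetical, the method names merely echo the terms used above, and nothing here mirrors the real mac80211 or ath10k signatures.

```python
from collections import deque


class ToyDriver:
    """Toy model of a driver that pulls from a TXQ (not kernel code)."""

    def __init__(self, hw_slots):
        self.txq = deque()           # stands in for the mac80211 TXQ
        self.free_slots = hw_slots   # frames the hardware can hold at once
        self.sent = []               # packets handed to the hardware

    def _schedule(self):
        # Pull from the TXQ only while the hardware has room.
        while self.free_slots > 0 and self.txq:
            self.sent.append(self.txq.popleft())
            self.free_slots -= 1

    def wake_tx_queue(self, packet):
        # Path 1: a packet was queued and the driver is notified.
        self.txq.append(packet)
        self._schedule()

    def tx_completion(self):
        # Path 2: the hardware reports a finished frame, freeing a slot.
        self.free_slots += 1
        self._schedule()


drv = ToyDriver(hw_slots=2)
for pkt in range(5):
    drv.wake_tx_queue(pkt)
stalled = list(drv.txq)   # packets waiting for a completion
drv.tx_completion()       # one completion lets one more packet out
```

In this model, once both hardware slots are in use, further packets only accumulate in the TXQ. If `tx_completion()` is never called, nothing more is pulled, which is the kind of stall the second category would produce.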

> I hope I can verify that if you can point me to the code to check
> :) And, for the queue itself, how long is it supposed to take to drop
> packets and clean up?

Well, when the hardware is reset, or the station is disassociated, the
queue will be flushed. Other than that, there's no separate "cleanup"
per se; rather, the two mechanisms outlined above should ensure that
packets keep flowing towards the station at the other end.

> It seems that when it's full, it signals back-pressure to the socket
> instead of simply dropping the packets from the head or the tail of
> the queue?

No, it doesn't generally do much back-pressure. Rather, when it fills
up, it will drop packets from the head of the longest flow to clear
space (see fq_tin_enqueue()). The limit is pretty high, though - 8192
packets or 16 Mbytes of memory...
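
The drop policy described above can be sketched as a toy flow queue. This is a simplified Python illustration, not the code in fq_tin_enqueue(): `ToyFq` is a hypothetical name, "longest" is measured here by packet count, and only a packet limit is modeled (no memory limit).

```python
from collections import deque


class ToyFq:
    """Toy flow queue: on overflow, drop from the head of the longest flow."""

    def __init__(self, limit):
        self.limit = limit    # maximum packets across all flows
        self.flows = {}       # flow id -> deque of queued packets
        self.backlog = 0      # total packets currently queued
        self.dropped = []     # packets discarded to make room

    def enqueue(self, flow_id, packet):
        # Always accept the new packet first.
        self.flows.setdefault(flow_id, deque()).append(packet)
        self.backlog += 1
        # Then clear space by dropping the oldest packet of the longest flow.
        while self.backlog > self.limit:
            longest = max(self.flows.values(), key=len)
            self.dropped.append(longest.popleft())
            self.backlog -= 1


fq = ToyFq(limit=3)
for flow_id, pkt in [("a", "a1"), ("a", "a2"), ("b", "b1"), ("a", "a3")]:
    fq.enqueue(flow_id, pkt)
```

Note that `enqueue` never refuses a packet in this model. The sender sees no back-pressure from the queue itself, and the cost of overflow falls on the flow with the most packets queued.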

-Toke