Development issues regarding the cerowrt test router project
 help / color / mirror / Atom feed
* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
@ 2014-04-15 19:00 Dave Taht
  2014-04-16 13:11 ` Felix Fietkau
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Taht @ 2014-04-15 19:00 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: cerowrt, cerowrt-devel

Thx felix!

Given that there seems to be a potential race in the code
review I did at:

http://www.bufferbloat.net/issues/442#note-22

another thought is to make the increment and decrement of

txq->pending_frame atomic, or to do a flush before the unlock

What tree is this patch against?

On Tue, Apr 15, 2014 at 11:46 AM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2014-04-15 06:06, Dave Taht wrote:
>> regrettably I am too wiped to look this over further right now, but the patchset
>> seems very promising.
>>
>> I will review on a fresh brain in the morning. Other eyeballs desired
>> - this will have to get patched on top of 3.14 and then backported to
>> the 3.10 backport....
> The patch is a rather crude workaround which unfortunately will not
> help with narrowing down the cause. Also, doing a chip reset because a
> software queue is stuck is overkill.
>
> Please test if this patch helps. The tid->paused flag is no longer
> necessary since my rework of the tx path.
> ---
> --- a/drivers/net/wireless/ath/ath9k/ath9k.h
> +++ b/drivers/net/wireless/ath/ath9k/ath9k.h
> @@ -254,7 +254,6 @@ struct ath_atx_tid {
>
>         s8 bar_index;
>         bool sched;
> -       bool paused;
>         bool active;
>  };
>
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -107,9 +107,6 @@ static void ath_tx_queue_tid(struct ath_
>  {
>         struct ath_atx_ac *ac = tid->ac;
>
> -       if (tid->paused)
> -               return;
> -
>         if (tid->sched)
>                 return;
>
> @@ -1407,7 +1404,6 @@ int ath_tx_aggr_start(struct ath_softc *
>         ath_tx_tid_change_state(sc, txtid);
>
>         txtid->active = true;
> -       txtid->paused = true;
>         *ssn = txtid->seq_start = txtid->seq_next;
>         txtid->bar_index = -1;
>
> @@ -1427,7 +1423,6 @@ void ath_tx_aggr_stop(struct ath_softc *
>
>         ath_txq_lock(sc, txq);
>         txtid->active = false;
> -       txtid->paused = false;
>         ath_tx_flush_tid(sc, txtid);
>         ath_tx_tid_change_state(sc, txtid);
>         ath_txq_unlock_complete(sc, txq);
> @@ -1487,7 +1482,7 @@ void ath_tx_aggr_wakeup(struct ath_softc
>                 ath_txq_lock(sc, txq);
>                 ac->clear_ps_filter = true;
>
> -               if (!tid->paused && ath_tid_has_buffered(tid)) {
> +               if (ath_tid_has_buffered(tid)) {
>                         ath_tx_queue_tid(txq, tid);
>                         ath_txq_schedule(sc, txq);
>                 }
> @@ -1510,7 +1505,6 @@ void ath_tx_aggr_resume(struct ath_softc
>         ath_txq_lock(sc, txq);
>
>         tid->baw_size = IEEE80211_MIN_AMPDU_BUF << sta->ht_cap.ampdu_factor;
> -       tid->paused = false;
>
>         if (ath_tid_has_buffered(tid)) {
>                 ath_tx_queue_tid(txq, tid);
> @@ -1544,8 +1538,6 @@ void ath9k_release_buffered_frames(struc
>                         continue;
>
>                 tid = ATH_AN_2_TID(an, i);
> -               if (tid->paused)
> -                       continue;
>
>                 ath_txq_lock(sc, tid->ac->txq);
>                 while (nframes > 0) {
> @@ -1844,9 +1836,6 @@ void ath_txq_schedule(struct ath_softc *
>                         list_del(&tid->list);
>                         tid->sched = false;
>
> -                       if (tid->paused)
> -                               continue;
> -
>                         if (ath_tx_sched_aggr(sc, txq, tid, &stop))
>                                 sent = true;
>
> @@ -2698,7 +2687,6 @@ void ath_tx_node_init(struct ath_softc *
>                 tid->baw_size  = WME_MAX_BA;
>                 tid->baw_head  = tid->baw_tail = 0;
>                 tid->sched     = false;
> -               tid->paused    = false;
>                 tid->active        = false;
>                 __skb_queue_head_init(&tid->buf_q);
>                 __skb_queue_head_init(&tid->retry_q);
>



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-15 19:00 [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang Dave Taht
@ 2014-04-16 13:11 ` Felix Fietkau
  2014-04-16 15:34   ` Dave Taht
  0 siblings, 1 reply; 8+ messages in thread
From: Felix Fietkau @ 2014-04-16 13:11 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt, cerowrt-devel

On 2014-04-15 21:00, Dave Taht wrote:
> Thx felix!
> 
> Given that there seems to be a potential race in the code
> review I did at:
> 
> http://www.bufferbloat.net/issues/442#note-22
> 
> another thought is to make the increment and decrement of
> 
> txq->pending_frame atomic, or to do a flush before the unlock
I'm not convinced that there's a race that involves txq->pending_frames.
There is no need to make the increment/decrement atomic, because that
variable is already protected by the txq lock.

> What tree is this patch against?
mac80211 from OpenWrt trunk.

- Felix

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-16 13:11 ` Felix Fietkau
@ 2014-04-16 15:34   ` Dave Taht
  2014-04-16 16:55     ` Felix Fietkau
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Taht @ 2014-04-16 15:34 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: cerowrt, cerowrt-devel

On Wed, Apr 16, 2014 at 6:11 AM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2014-04-15 21:00, Dave Taht wrote:
>> Thx felix!
>>
>> Given that there seems to be a potential race in the code
>> review I did at:
>>
>> http://www.bufferbloat.net/issues/442#note-22
>>
>> another thought is to make the increment and decrement of
>>
>> txq->pending_frame atomic, or to do a flush before the unlock
> I'm not convinced that there's a race that involves txq->pending_frames.
> There is no need to make the increment/decrement atomic, because that
> variable is already protected by the txq lock.

It and "stopped" are briefly unprotected along that code path.

>
>> What tree is this patch against?
> mac80211 from OpenWrt trunk.

Thx, will try your patch today.

> - Felix



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-16 15:34   ` Dave Taht
@ 2014-04-16 16:55     ` Felix Fietkau
  2014-04-16 17:00       ` Dave Taht
  0 siblings, 1 reply; 8+ messages in thread
From: Felix Fietkau @ 2014-04-16 16:55 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt, cerowrt-devel

On 2014-04-16 17:34, Dave Taht wrote:
> On Wed, Apr 16, 2014 at 6:11 AM, Felix Fietkau <nbd@openwrt.org> wrote:
>> On 2014-04-15 21:00, Dave Taht wrote:
>>> Thx felix!
>>>
>>> Given that there seems to be a potential race in the code
>>> review I did at:
>>>
>>> http://www.bufferbloat.net/issues/442#note-22
>>>
>>> another thought is to make the increment and decrement of
>>>
>>> txq->pending_frame atomic, or to do a flush before the unlock
>> I'm not convinced that there's a race that involves txq->pending_frames.
>> There is no need to make the increment/decrement atomic, because that
>> variable is already protected by the txq lock.
> 
> It and "stopped" are briefly unprotected along that code path.
Where?

- Felix

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-16 16:55     ` Felix Fietkau
@ 2014-04-16 17:00       ` Dave Taht
  2014-04-16 17:25         ` Robert Bradley
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Taht @ 2014-04-16 17:00 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: cerowrt, cerowrt-devel

should I have said "de-protected"? in

linux-3.14/drivers/net/wireless/ath/ath9k/xmit.c

        ath_txq_lock(sc, txq);
        if (txq == sc->tx.txq_map[q] &&
            ++txq->pending_frames > sc->tx.txq_max_pending[q] &&
            !txq->stopped) {
                ieee80211_stop_queue(sc->hw, q);
                txq->stopped = true;
        }

        if (txctl->an && ieee80211_is_data_present(hdr->frame_control))
                tid = ath_get_skb_tid(sc, txctl->an, skb);

        if (info->flags & IEEE80211_TX_CTL_PS_RESPONSE) {
                ath_txq_unlock(sc, txq);
                txq = sc->tx.uapsdq;
^^^^^^
                ath_txq_lock(sc, txq);
        } else if (txctl->an &&



On Wed, Apr 16, 2014 at 9:55 AM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2014-04-16 17:34, Dave Taht wrote:
>> On Wed, Apr 16, 2014 at 6:11 AM, Felix Fietkau <nbd@openwrt.org> wrote:
>>> On 2014-04-15 21:00, Dave Taht wrote:
>>>> Thx felix!
>>>>
>>>> Given that there seems to be a potential race in the code
>>>> review I did at:
>>>>
>>>> http://www.bufferbloat.net/issues/442#note-22
>>>>
>>>> another thought is to make the increment and decrement of
>>>>
>>>> txq->pending_frame atomic, or to do a flush before the unlock
>>> I'm not convinced that there's a race that involves txq->pending_frames.
>>> There is no need to make the increment/decrement atomic, because that
>>> variable is already protected by the txq lock.
>>
>> It and "stopped" are briefly unprotected along that code path.
> Where?
>
> - Felix



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-16 17:00       ` Dave Taht
@ 2014-04-16 17:25         ` Robert Bradley
  2014-04-16 17:38           ` Dave Taht
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Bradley @ 2014-04-16 17:25 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 529 bytes --]

On 16/04/2014 18:00, Dave Taht wrote:
> should I have said "de-protected"? in
>
> linux-3.14/drivers/net/wireless/ath/ath9k/xmit.c
<snip>
>         if (info->flags & IEEE80211_TX_CTL_PS_RESPONSE) {
>                 ath_txq_unlock(sc, txq);
>                 txq = sc->tx.uapsdq;
> ^^^^^^
>                 ath_txq_lock(sc, txq);
>         } else if (txctl->an &&
>

Isn't the point here that you're potentially switching txq and so need
to unlock the old one and then lock the new one?

-- 
Robert Bradley



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 899 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-16 17:25         ` Robert Bradley
@ 2014-04-16 17:38           ` Dave Taht
  2014-04-16 21:05             ` Felix Fietkau
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Taht @ 2014-04-16 17:38 UTC (permalink / raw)
  To: Robert Bradley, Felix Fietkau; +Cc: cerowrt-devel

On Wed, Apr 16, 2014 at 10:25 AM, Robert Bradley
<robert.bradley1@gmail.com> wrote:
> On 16/04/2014 18:00, Dave Taht wrote:
>> should I have said "de-protected"? in
>>
>> linux-3.14/drivers/net/wireless/ath/ath9k/xmit.c
> <snip>
>>         if (info->flags & IEEE80211_TX_CTL_PS_RESPONSE) {
>>                 ath_txq_unlock(sc, txq);
>>                 txq = sc->tx.uapsdq;
>> ^^^^^^
>>                 ath_txq_lock(sc, txq);
>>         } else if (txctl->an &&
>>
>
> Isn't the point here that you're potentially switching txq and so need
> to unlock the old one and then lock the new one?

But it's happening after incrementing the old one and potentially stopping it...

My take on this bit of code is that it needs to happen before the increment
and potental stop, and it should validate the value of tx.uapsdq.

but I am by no means the expert here! Felix is da man....

>
> --
> Robert Bradley
>
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang
  2014-04-16 17:38           ` Dave Taht
@ 2014-04-16 21:05             ` Felix Fietkau
  0 siblings, 0 replies; 8+ messages in thread
From: Felix Fietkau @ 2014-04-16 21:05 UTC (permalink / raw)
  To: Dave Taht, Robert Bradley; +Cc: cerowrt-devel

On 2014-04-16 19:38, Dave Taht wrote:
> On Wed, Apr 16, 2014 at 10:25 AM, Robert Bradley
> <robert.bradley1@gmail.com> wrote:
>> On 16/04/2014 18:00, Dave Taht wrote:
>>> should I have said "de-protected"? in
>>>
>>> linux-3.14/drivers/net/wireless/ath/ath9k/xmit.c
>> <snip>
>>>         if (info->flags & IEEE80211_TX_CTL_PS_RESPONSE) {
>>>                 ath_txq_unlock(sc, txq);
>>>                 txq = sc->tx.uapsdq;
>>> ^^^^^^
>>>                 ath_txq_lock(sc, txq);
>>>         } else if (txctl->an &&
>>>
>>
>> Isn't the point here that you're potentially switching txq and so need
>> to unlock the old one and then lock the new one?
> 
> But it's happening after incrementing the old one and potentially stopping it...
> 
> My take on this bit of code is that it needs to happen before the increment
> and potental stop, and it should validate the value of tx.uapsdq.
> 
> but I am by no means the expert here! Felix is da man....
The txq switching handling wrt. pending_frames is correct. Since the
uapsd queue is used for powersave responses (both from the drv_tx
codepath and internal buffering), it is much better to credit these
packets to the queue that similar frames are normally meant to be
transmitted on. This is the txq before the switch.

To make that work, ath_txq_skb_done has an explicit check for the uapsd
queue.

- Felix

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-04-16 21:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-15 19:00 [Cerowrt-devel] Fwd: [bug #442] ath9k queue hang Dave Taht
2014-04-16 13:11 ` Felix Fietkau
2014-04-16 15:34   ` Dave Taht
2014-04-16 16:55     ` Felix Fietkau
2014-04-16 17:00       ` Dave Taht
2014-04-16 17:25         ` Robert Bradley
2014-04-16 17:38           ` Dave Taht
2014-04-16 21:05             ` Felix Fietkau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox