From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-x22b.google.com (mail-wi0-x22b.google.com [IPv6:2a00:1450:400c:c05::22b]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by huchra.bufferbloat.net (Postfix) with ESMTPS id 7CC2121F0D3; Mon, 14 Apr 2014 16:08:02 -0700 (PDT) Received: by mail-wi0-f171.google.com with SMTP id q5so4809710wiv.16 for ; Mon, 14 Apr 2014 16:08:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=YrbtY/LUnty61OaMW2uP3gnMGD3FAEhdNY4qyKCm/Fc=; b=wsfG/UG5ghHUq4i3A0Rg/UX7aSfHK4NTDLvx9PR26A8z91JBTLXslU4TNt18kZIiMz oZdGkmbzTxnzcs9s9YC/WmPnObkUXQ0YhQvpDxXm3nfE0nXqIdPzL5SLvYeWrWK+Eqzj phHdwWkZNyjhNdsPJl8/lYitfAnFNebkmsCRH4p8ono4h2AqicaYlqoEEhaTy1XgmyY0 2ZJWPrRIFoWGyPtuTkODGtJB6Ze+xXe/NmvCnBXDXwUAOEVa0Yr1U0NstKabyvsW19Dl gLGtCQ1LLMeKbq8ZzBTCadE9TBeQ/aDJzAI27KvthEeuEPkWf0PBHioYYwXRAahaF8Jv 9xiw== MIME-Version: 1.0 X-Received: by 10.194.171.167 with SMTP id av7mr29051636wjc.32.1397516880360; Mon, 14 Apr 2014 16:08:00 -0700 (PDT) Received: by 10.216.177.10 with HTTP; Mon, 14 Apr 2014 16:08:00 -0700 (PDT) Date: Mon, 14 Apr 2014 16:08:00 -0700 Message-ID: From: Dave Taht To: cerowrt@lists.bufferbloat.net, "cerowrt-devel@lists.bufferbloat.net" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: [Cerowrt-devel] [bug #442] tearing apart ath_tx_start in search of the hang X-BeenThere: cerowrt-devel@lists.bufferbloat.net X-Mailman-Version: 2.1.13 Precedence: list List-Id: Development issues regarding the cerowrt test router project List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Apr 2014 23:08:03 -0000 So what we are seeing here is some sort of problem with accounting for pending_frames on a given queue. And since we've had a reminder of how useful code review can be - and I'd rather like to understand this logic anyway - // my comments in // /* Upon failure caller should free skb */ int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb, struct ath_tx_control *txctl) { struct ieee80211_hdr *hdr; struct ieee80211_tx_info *info =3D IEEE80211_SKB_CB(skb); struct ieee80211_sta *sta =3D txctl->sta; struct ieee80211_vif *vif =3D info->control.vif; struct ath_softc *sc =3D hw->priv; struct ath_txq *txq =3D txctl->txq; struct ath_atx_tid *tid =3D NULL; struct ath_buf *bf; int q; int ret; ret =3D ath_tx_prepare(hw, skb, txctl); if (ret) return ret; hdr =3D (struct ieee80211_hdr *) skb->data; /* * At this point, the vif, hw_key and sta pointers in the tx contro= l * info are no longer valid (overwritten by the ath_frame_info data= . */ // I haven't looked at what skb_get_queue_mapping can return yet q =3D skb_get_queue_mapping(skb); ath_txq_lock(sc, txq); if (txq =3D=3D sc->tx.txq_map[q] && ++txq->pending_frames > sc->tx.txq_max_pending[q] && !txq->stopped) { ieee80211_stop_queue(sc->hw, q); txq->stopped =3D true; // is there a difference between stopped and sleeping? } // So if the queue is not mapped properly we don't increment pending // frames. Also we are dependent on C processing the if left to right, // which is a good assumption, but it leaves the ++txq as a side effect if (txctl->an && ieee80211_is_data_present(hdr->frame_control)) tid =3D ath_get_skb_tid(sc, txctl->an, skb); if (info->flags & IEEE80211_TX_CTL_PS_RESPONSE) { ath_txq_unlock(sc, txq); txq =3D sc->tx.uapsdq; // So here we have a bit of state that changes after we've got some pending // state above that's been changed. I imagine this lock could stay unlocked // for a while and lead to races elsewhere. // haven't a clue what tx.uapsdq is ath_txq_lock(sc, txq); } else if (txctl->an && ieee80211_is_data_present(hdr->frame_control)) { WARN_ON(tid->ac->txq !=3D txctl->txq); if (info->flags & IEEE80211_TX_CTL_CLEAR_PS_FILT) tid->ac->clear_ps_filter =3D true; /* * Add this frame to software queue for scheduling later * for aggregation. */ TX_STAT_INC(txq->axq_qnum, a_queued_sw); __skb_queue_tail(&tid->buf_q, skb); if (!txctl->an->sleeping) ath_tx_queue_tid(txq, tid); // so if we're not sleeping, queue it up // and regardless if we're sleeping or not, schedule it ath_txq_schedule(sc, txq); goto out; } // So if data is not present OR txctl->an is invalid OR IEEE80211_TX_CTL_PS_RESPONSE is set in flags /// we fall through to here. bf =3D ath_tx_setup_buffer(sc, txq, tid, skb); // if we fell through to here, tid can be null unless data was present if (!bf) { ath_txq_skb_done(sc, txq, skb); if (txctl->paprd) dev_kfree_skb_any(skb); else ieee80211_free_txskb(sc->hw, skb); goto out; } // Well, I note that we incremented the frames earlier in some cases // should they be decremented above? bf->bf_state.bfs_paprd =3D txctl->paprd; if (txctl->paprd) bf->bf_state.bfs_paprd_timestamp =3D jiffies; ath_set_rates(vif, sta, bf); ath_tx_send_normal(sc, txq, tid, skb); // Not clear as to why you set_rates here, and I assume tx_send_normal // sends a non-aggregate out: ath_txq_unlock(sc, txq); return 0; } --=20 Dave T=C3=A4ht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_= indecent.article