* [Cerowrt-devel] wifi airtime fairness patches could use eyeballs and testing
@ 2016-08-10 11:28 Dave Taht
2016-08-10 13:06 ` [Cerowrt-devel] [Make-wifi-fast] " Noah Causin
0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2016-08-10 11:28 UTC (permalink / raw)
To: make-wifi-fast, cerowrt-devel, Felix Fietkau, Simon Wunderlich
[-- Attachment #1: Type: text/plain, Size: 1308 bytes --]
Simon's team successfully[1] tested the attached patches on top of
felix's lede staging tree on monday against 30 stations. The initial
results were lovely, at the rates he tested at.
If anyone out there is daring enough to try building these for the
wndr3800, wndr3700v2 and nanostation, (or anything else with ath9k in
it!) it would be very good to put these through as many other
scenarios as possible, notably adhoc and wds need to get looked at -
but go forth! blow things up anyway you can! measure! Flent has a
rtt_fair_var test if you can get multiple stations going...
Virtual beer to you, if you put your build up somewhere.
The staging tree:
https://git.lede-project.org/?p=lede/nbd/staging.git;a=summary
---------- Forwarded message ----------
From: Toke Høiland-Jørgensen <toke@toke.dk>
Date: Mon, Aug 8, 2016 at 12:49 PM
Subject: Re: u alive today? patch
To: Dave Taht <dave.taht@gmail.com>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Dave Taht <dave.taht@gmail.com> writes:
> I am hanging with simon today, have lab setup, was hoping to do some patching...
These are for LEDE; just dump them in package/kernel/mac80211/patches
-Toke
--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
[-- Attachment #2: 338-ath9k-Measure-per-station-airtime.patch --]
[-- Type: text/x-diff, Size: 9745 bytes --]
From 9ef266357c98faece1b4b400cf83fd9a5bbfcafc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= <toke@toke.dk>
Date: Wed, 22 Jun 2016 13:48:14 +0200
Subject: [PATCH 2/3] ath9k: Measure per-station airtime
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This adds hooks to measure the airtime used by stations for TX and RX.
The cumulative time is accounted in a stats structure and exported in
debugfs.
Uses the ts->duration + retry-chain information to account for time
spent transmitting to a station. The RX airtime is measured as the
duration from first to last frame in an aggregate, using the rs_tstamp
fields.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
drivers/net/wireless/ath/ath9k/ath9k.h | 4 +++
drivers/net/wireless/ath/ath9k/debug.h | 13 +++++++
drivers/net/wireless/ath/ath9k/debug_sta.c | 45 +++++++++++++++++++++++
drivers/net/wireless/ath/ath9k/recv.c | 58 ++++++++++++++++++++++++++++++
drivers/net/wireless/ath/ath9k/xmit.c | 46 ++++++++++++++++++++++--
5 files changed, 164 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 083d96b..5bcdc8d 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -258,9 +258,11 @@ struct ath_node {
bool sleeping;
bool no_ps_filter;
+ u32 airtime_rx_start;
#ifdef CPTCFG_ATH9K_STATION_STATISTICS
struct ath_rx_rate_stats rx_rate_stats;
+ struct ath_airtime_stats airtime_stats;
#endif
u8 key_idx[4];
@@ -561,6 +563,8 @@ void ath_txq_schedule_all(struct ath_softc *sc);
int ath_tx_init(struct ath_softc *sc, int nbufs);
int ath_txq_update(struct ath_softc *sc, int qnum,
struct ath9k_tx_queue_info *q);
+u32 ath_pkt_duration(struct ath_softc *sc, u8 rix, int pktlen,
+ int width, int half_gi, bool shortPreamble);
void ath_update_max_aggr_framelen(struct ath_softc *sc, int queue, int txop);
void ath_assign_seq(struct ath_common *common, struct sk_buff *skb);
int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb,
diff --git a/drivers/net/wireless/ath/ath9k/debug.h b/drivers/net/wireless/ath/ath9k/debug.h
index a078cdd..8659a96 100644
--- a/drivers/net/wireless/ath/ath9k/debug.h
+++ b/drivers/net/wireless/ath/ath9k/debug.h
@@ -221,6 +221,11 @@ struct ath_rx_rate_stats {
} cck_stats[4];
};
+struct ath_airtime_stats {
+ u32 rx_airtime;
+ u32 tx_airtime;
+};
+
#define ANT_MAIN 0
#define ANT_ALT 1
@@ -314,12 +319,20 @@ ath9k_debug_sync_cause(struct ath_softc *sc, u32 sync_cause)
void ath_debug_rate_stats(struct ath_softc *sc,
struct ath_rx_status *rs,
struct sk_buff *skb);
+void ath_debug_airtime(struct ath_softc *sc,
+ struct ath_node *an,
+ u32 rx, u32 tx);
#else
static inline void ath_debug_rate_stats(struct ath_softc *sc,
struct ath_rx_status *rs,
struct sk_buff *skb)
{
}
+static inline void ath_debug_airtime(struct ath_softc *sc,
+ struct ath_node *an,
+ u32 rx, u32 tx)
+{
+}
#endif /* CPTCFG_ATH9K_STATION_STATISTICS */
#endif /* DEBUG_H */
diff --git a/drivers/net/wireless/ath/ath9k/debug_sta.c b/drivers/net/wireless/ath/ath9k/debug_sta.c
index df22e7c..d0d211e 100644
--- a/drivers/net/wireless/ath/ath9k/debug_sta.c
+++ b/drivers/net/wireless/ath/ath9k/debug_sta.c
@@ -242,6 +242,50 @@ static const struct file_operations fops_node_recv = {
.llseek = default_llseek,
};
+void ath_debug_airtime(struct ath_softc *sc,
+ struct ath_node *an,
+ u32 rx,
+ u32 tx)
+{
+ struct ath_airtime_stats *astats = &an->airtime_stats;
+
+ astats->rx_airtime += rx;
+ astats->tx_airtime += tx;
+}
+
+static ssize_t read_airtime(struct file *file, char __user *user_buf,
+ size_t count, loff_t *ppos)
+{
+ struct ath_node *an = file->private_data;
+ struct ath_airtime_stats *astats;
+ u32 len = 0, size = 128;
+ char *buf;
+ size_t retval;
+
+ buf = kzalloc(size, GFP_KERNEL);
+ if (buf == NULL)
+ return -ENOMEM;
+
+ astats = &an->airtime_stats;
+
+ len += scnprintf(buf + len, size - len, "RX: %u us\n", astats->rx_airtime);
+ len += scnprintf(buf + len, size - len, "TX: %u us\n", astats->tx_airtime);
+
+ retval = simple_read_from_buffer(user_buf, count, ppos, buf, len);
+ kfree(buf);
+
+ return retval;
+}
+
+
+static const struct file_operations fops_airtime = {
+ .read = read_airtime,
+ .open = simple_open,
+ .owner = THIS_MODULE,
+ .llseek = default_llseek,
+};
+
+
void ath9k_sta_add_debugfs(struct ieee80211_hw *hw,
struct ieee80211_vif *vif,
struct ieee80211_sta *sta,
@@ -251,4 +295,5 @@ void ath9k_sta_add_debugfs(struct ieee80211_hw *hw,
debugfs_create_file("node_aggr", S_IRUGO, dir, an, &fops_node_aggr);
debugfs_create_file("node_recv", S_IRUGO, dir, an, &fops_node_recv);
+ debugfs_create_file("airtime", S_IRUGO, dir, an, &fops_airtime);
}
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 32160fc..87c3c0e 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -991,6 +991,63 @@ static void ath9k_apply_ampdu_details(struct ath_softc *sc,
}
}
+static void ath_rx_count_airtime(struct ath_softc *sc,
+ struct ath_rx_status *rs,
+ struct sk_buff *skb)
+{
+ struct ath_node *an;
+ struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
+ struct ath_hw *ah = sc->sc_ah;
+ struct ath_common *common = ath9k_hw_common(ah);
+ struct ieee80211_sta *sta;
+ struct ieee80211_rx_status *rxs;
+ const struct ieee80211_rate *rate;
+ bool is_sgi, is_40, is_sp;
+ int phy;
+ u32 airtime = 0;
+
+ if (!ieee80211_is_data(hdr->frame_control))
+ return;
+
+ rcu_read_lock();
+
+ sta = ieee80211_find_sta_by_ifaddr(sc->hw, hdr->addr2, NULL);
+ if (!sta)
+ goto exit;
+ an = (struct ath_node *) sta->drv_priv;
+
+ if (rs->rs_isaggr && rs->rs_firstaggr) {
+ an->airtime_rx_start = rs->rs_tstamp;
+ } else if (rs->rs_isaggr && !rs->rs_moreaggr && an->airtime_rx_start) {
+ airtime = rs->rs_tstamp - an->airtime_rx_start;
+ } else if (!rs->rs_isaggr) {
+ an->airtime_rx_start = 0;
+
+ rxs = IEEE80211_SKB_RXCB(skb);
+
+ is_sgi = !!(rxs->flag & RX_FLAG_SHORT_GI);
+ is_40 = !!(rxs->flag & RX_FLAG_40MHZ);
+ is_sp = !!(rxs->flag & RX_FLAG_SHORTPRE);
+
+ if (!!(rxs->flag & RX_FLAG_HT)) {
+ /* MCS rates */
+
+ airtime += ath_pkt_duration(sc, rxs->rate_idx, rs->rs_datalen,
+ is_40, is_sgi, is_sp);
+ } else {
+
+ phy = IS_CCK_RATE(rs->rs_rate) ? WLAN_RC_PHY_CCK : WLAN_RC_PHY_OFDM;
+ rate = &common->sbands[rxs->band].bitrates[rxs->rate_idx];
+ airtime += ath9k_hw_computetxtime(ah, phy, rate->bitrate * 100,
+ rs->rs_datalen, rxs->rate_idx, is_sp);
+ }
+ }
+
+ ath_debug_airtime(sc, an, airtime, 0);
+exit:
+ rcu_read_unlock();
+}
+
int ath_rx_tasklet(struct ath_softc *sc, int flush, bool hp)
{
struct ath_rxbuf *bf;
@@ -1137,6 +1194,7 @@ int ath_rx_tasklet(struct ath_softc *sc, int flush, bool hp)
ath9k_antenna_check(sc, &rs);
ath9k_apply_ampdu_details(sc, &rs, rxs);
ath_debug_rate_stats(sc, &rs, skb);
+ ath_rx_count_airtime(sc, &rs, skb);
hdr = (struct ieee80211_hdr *)skb->data;
if (ieee80211_is_ack(hdr->frame_control))
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index b508a49..a0de873 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -676,6 +676,46 @@ static bool bf_is_ampdu_not_probing(struct ath_buf *bf)
return bf_isampdu(bf) && !(info->flags & IEEE80211_TX_CTL_RATE_CTRL_PROBE);
}
+static void ath_tx_count_airtime(struct ath_softc *sc,
+ struct ath_buf *bf,
+ struct ath_tx_status *ts)
+{
+ struct ath_node *an;
+ struct sk_buff *skb;
+ struct ieee80211_hdr *hdr;
+ struct ieee80211_hw *hw = sc->hw;
+ struct ieee80211_tx_rate rates[4];
+ struct ieee80211_sta *sta;
+ int i;
+ u32 airtime = 0;
+
+ skb = bf->bf_mpdu;
+ if(!skb)
+ return;
+
+ hdr = (struct ieee80211_hdr *)skb->data;
+ memcpy(rates, bf->rates, sizeof(rates));
+
+ rcu_read_lock();
+
+ sta = ieee80211_find_sta_by_ifaddr(hw, hdr->addr1, hdr->addr2);
+ if(!sta)
+ goto exit;
+
+
+ an = (struct ath_node *) sta->drv_priv;
+
+ airtime += ts->duration * (ts->ts_longretry + 1);
+
+ for(i=0; i < ts->ts_rateindex; i++)
+ airtime += ath9k_hw_get_duration(sc->sc_ah, bf->bf_desc, i) * rates[i].count;
+
+ ath_debug_airtime(sc, an, 0, airtime);
+
+exit:
+ rcu_read_unlock();
+}
+
static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
struct ath_tx_status *ts, struct ath_buf *bf,
struct list_head *bf_head)
@@ -693,6 +733,7 @@ static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
ts->duration = ath9k_hw_get_duration(sc->sc_ah, bf->bf_desc,
ts->ts_rateindex);
+ ath_tx_count_airtime(sc, bf, ts);
hdr = (struct ieee80211_hdr *) bf->bf_mpdu->data;
sta = ieee80211_find_sta_by_ifaddr(hw, hdr->addr1, hdr->addr2);
@@ -1042,8 +1084,8 @@ finish:
* width - 0 for 20 MHz, 1 for 40 MHz
* half_gi - to use 4us v/s 3.6 us for symbol time
*/
-static u32 ath_pkt_duration(struct ath_softc *sc, u8 rix, int pktlen,
- int width, int half_gi, bool shortPreamble)
+u32 ath_pkt_duration(struct ath_softc *sc, u8 rix, int pktlen,
+ int width, int half_gi, bool shortPreamble)
{
u32 nbits, nsymbits, duration, nsymbols;
int streams;
--
2.9.0
[-- Attachment #3: 340-mac80211-Use-more-standard-codel-and-fq_codel-defaul.patch --]
[-- Type: text/x-diff, Size: 1918 bytes --]
From 80473865e0123b197dbb676dcede26a219d5d769 Mon Sep 17 00:00:00 2001
From: dave taht <dave@taht.net>
Date: Wed, 3 Aug 2016 08:20:22 -0700
Subject: [PATCH] mac80211: Use more standard codel and fq_codel defaults
Having the quantum parameter at a single packet saves 5+ runs through the
inner loop of fq_codel, and makes the fast/slow concept work for all
flows.
The codel target should be thought of as "are you willing to tolerate
more than X delay for more than interval Y ms", and the target should
be thought of as more of a percentage of the interval, where 5% is
the point of maximum "power".
Additionally:
It is better to reduce the maximum size of the txop when there are
many stations queueing up, and to try to hold the delay low, but
this patch doesn't do that.
---
include/net/fq_impl.h | 2 +-
net/mac80211/tx.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/fq_impl.h b/include/net/fq_impl.h
index 163f3ed..9afb1b6 100644
--- a/include/net/fq_impl.h
+++ b/include/net/fq_impl.h
@@ -249,7 +249,7 @@ static int fq_init(struct fq *fq, int flows_cnt)
spin_lock_init(&fq->lock);
fq->flows_cnt = max_t(u32, flows_cnt, 1);
fq->perturbation = prandom_u32();
- fq->quantum = 300;
+ fq->quantum = 1514;
fq->limit = 8192;
fq->flows = kcalloc(fq->flows_cnt, sizeof(fq->flows[0]), GFP_KERNEL);
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 91461c4..eb8b1c9 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1443,7 +1443,7 @@ int ieee80211_txq_setup_flows(struct ieee80211_local *local)
codel_params_init(&local->cparams);
codel_stats_init(&local->cstats);
local->cparams.interval = MS2TIME(100);
- local->cparams.target = MS2TIME(20);
+ local->cparams.target = MS2TIME(5);
local->cparams.ecn = true;
local->cvars = kcalloc(fq->flows_cnt, sizeof(local->cvars[0]),
--
2.7.4
[-- Attachment #4: 339-ath9k-Add-a-per-station-airtime-deficit-scheduler.patch --]
[-- Type: application/octet-stream, Size: 13103 bytes --]
From c9d67d4f41905557e4cae45b820f3996cd614bc2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= <toke@toke.dk>
Date: Wed, 25 May 2016 13:11:17 +0200
Subject: [PATCH 3/3] ath9k: Add a per-station airtime deficit scheduler
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This modifies the logic in ath_txq_schedule to account airtime consumed
by each station and uses a deficit-based scheduler derived from FQ-CoDel
to try to enforce airtime fairness. A debugfs entry controls whether TX
airtime, RX airtime or both is accounted to the deficit on which the
scheduler makes decisions.
Uses the ts->duration + retry-chain information to account for time
spent transmitting to a station. The RX airtime is measured as the
duration from first to last frame in an aggregate, using the rs_tstamp
fields.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
drivers/net/wireless/ath/ath9k/ath9k.h | 16 ++++-
drivers/net/wireless/ath/ath9k/channel.c | 12 ++--
drivers/net/wireless/ath/ath9k/debug.c | 3 +
drivers/net/wireless/ath/ath9k/debug.h | 1 +
drivers/net/wireless/ath/ath9k/debug_sta.c | 2 +
drivers/net/wireless/ath/ath9k/init.c | 1 +
drivers/net/wireless/ath/ath9k/main.c | 6 +-
drivers/net/wireless/ath/ath9k/recv.c | 2 +
drivers/net/wireless/ath/ath9k/xmit.c | 99 +++++++++++++++++-------------
9 files changed, 92 insertions(+), 50 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 5bcdc8d..b75ab49 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -112,6 +112,8 @@ int ath_descdma_setup(struct ath_softc *sc, struct ath_descdma *dd,
#define ATH_TXFIFO_DEPTH 8
#define ATH_TX_ERROR 0x01
+#define ATH_AIRTIME_QUANTUM 300 /* usec */
+
/* Stop tx traffic 1ms before the GO goes away */
#define ATH_P2P_PS_STOP_TIME 1000
@@ -260,6 +262,7 @@ struct ath_node {
bool sleeping;
bool no_ps_filter;
+ s64 airtime_deficit;
u32 airtime_rx_start;
#ifdef CONFIG_ATH9K_STATION_STATISTICS
@@ -321,10 +324,15 @@ struct ath_rx {
/* Channel Context */
/*******************/
+struct ath_acq {
+ struct list_head acq_new;
+ struct list_head acq_old;
+};
+
struct ath_chanctx {
struct cfg80211_chan_def chandef;
struct list_head vifs;
- struct list_head acq[IEEE80211_NUM_ACS];
+ struct ath_acq acq[IEEE80211_NUM_ACS];
int hw_queue_base;
/* do not dereference, use for comparison only */
@@ -951,6 +959,10 @@ void ath_ant_comb_scan(struct ath_softc *sc, struct ath_rx_status *rs);
#define ATH9K_NUM_CHANCTX 2 /* supports 2 operating channels */
+#define AIRTIME_USE_TX BIT(0)
+#define AIRTIME_USE_RX BIT(1)
+#define AIRTIME_ACTIVE(flags) (!!(flags & (AIRTIME_USE_TX|AIRTIME_USE_RX)))
+
struct ath_softc {
struct ieee80211_hw *hw;
struct device *dev;
@@ -993,6 +1005,8 @@ struct ath_softc {
short nbcnvifs;
unsigned long ps_usecount;
+ u16 airtime_flags; /* AIRTIME_* */
+
struct ath_rx rx;
struct ath_tx tx;
struct ath_beacon beacon;
diff --git a/drivers/net/wireless/ath/ath9k/channel.c b/drivers/net/wireless/ath/ath9k/channel.c
index 929dd70..24e5575 100644
--- a/drivers/net/wireless/ath/ath9k/channel.c
+++ b/drivers/net/wireless/ath/ath9k/channel.c
@@ -118,8 +118,10 @@ void ath_chanctx_init(struct ath_softc *sc)
INIT_LIST_HEAD(&ctx->vifs);
ctx->txpower = ATH_TXPOWER_MAX;
ctx->flush_timeout = HZ / 5; /* 200ms */
- for (j = 0; j < ARRAY_SIZE(ctx->acq); j++)
- INIT_LIST_HEAD(&ctx->acq[j]);
+ for (j = 0; j < ARRAY_SIZE(ctx->acq); j++) {
+ INIT_LIST_HEAD(&ctx->acq[j].acq_new);
+ INIT_LIST_HEAD(&ctx->acq[j].acq_old);
+ }
}
}
@@ -1345,8 +1347,10 @@ void ath9k_offchannel_init(struct ath_softc *sc)
ctx->txpower = ATH_TXPOWER_MAX;
cfg80211_chandef_create(&ctx->chandef, chan, NL80211_CHAN_HT20);
- for (i = 0; i < ARRAY_SIZE(ctx->acq); i++)
- INIT_LIST_HEAD(&ctx->acq[i]);
+ for (i = 0; i < ARRAY_SIZE(ctx->acq); i++) {
+ INIT_LIST_HEAD(&ctx->acq[i].acq_new);
+ INIT_LIST_HEAD(&ctx->acq[i].acq_old);
+ }
sc->offchannel.chan.offchannel = true;
}
diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c
index 89a94dd..43930c3 100644
--- a/drivers/net/wireless/ath/ath9k/debug.c
+++ b/drivers/net/wireless/ath/ath9k/debug.c
@@ -1399,5 +1399,8 @@ int ath9k_init_debug(struct ath_hw *ah)
debugfs_create_file("tpc", S_IRUSR | S_IWUSR,
sc->debug.debugfs_phy, sc, &fops_tpc);
+ debugfs_create_u16("airtime_flags", S_IRUSR | S_IWUSR,
+ sc->debug.debugfs_phy, &sc->airtime_flags);
+
return 0;
}
diff --git a/drivers/net/wireless/ath/ath9k/debug.h b/drivers/net/wireless/ath/ath9k/debug.h
index 8659a96..cc6e0ea 100644
--- a/drivers/net/wireless/ath/ath9k/debug.h
+++ b/drivers/net/wireless/ath/ath9k/debug.h
@@ -224,6 +224,7 @@ struct ath_rx_rate_stats {
struct ath_airtime_stats {
u32 rx_airtime;
u32 tx_airtime;
+ u32 queued_new;
};
#define ANT_MAIN 0
diff --git a/drivers/net/wireless/ath/ath9k/debug_sta.c b/drivers/net/wireless/ath/ath9k/debug_sta.c
index d0d211e..769c41b 100644
--- a/drivers/net/wireless/ath/ath9k/debug_sta.c
+++ b/drivers/net/wireless/ath/ath9k/debug_sta.c
@@ -270,6 +270,8 @@ static ssize_t read_airtime(struct file *file, char __user *user_buf,
len += scnprintf(buf + len, size - len, "RX: %u us\n", astats->rx_airtime);
len += scnprintf(buf + len, size - len, "TX: %u us\n", astats->tx_airtime);
+ len += scnprintf(buf + len, size - len, "Deficit: %lld us\n", an->airtime_deficit);
+ len += scnprintf(buf + len, size - len, "Queued as new: %u times\n", astats->queued_new);
retval = simple_read_from_buffer(user_buf, count, ppos, buf, len);
kfree(buf);
diff --git a/drivers/net/wireless/ath/ath9k/init.c b/drivers/net/wireless/ath/ath9k/init.c
index 19989e3..cfd796f 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -559,6 +559,7 @@ static int ath9k_init_softc(u16 devid, struct ath_softc *sc,
/* Will be cleared in ath9k_start() */
set_bit(ATH_OP_INVALID, &common->op_flags);
+ sc->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
sc->sc_ah = ah;
sc->dfs_detector = dfs_pattern_detector_init(common, NL80211_DFS_UNSET);
diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 6ab56e5..e13068b 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -70,10 +70,10 @@ static bool ath9k_has_pending_frames(struct ath_softc *sc, struct ath_txq *txq,
goto out;
if (txq->mac80211_qnum >= 0) {
- struct list_head *list;
+ struct ath_acq *acq;
- list = &sc->cur_chan->acq[txq->mac80211_qnum];
- if (!list_empty(list))
+ acq = &sc->cur_chan->acq[txq->mac80211_qnum];
+ if (!list_empty(&acq->acq_new) || !list_empty(&acq->acq_old))
pending = true;
}
out:
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 87c3c0e..a48667f 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -1043,6 +1043,8 @@ static void ath_rx_count_airtime(struct ath_softc *sc,
}
}
+ if (!!(sc->airtime_flags & AIRTIME_USE_RX))
+ an->airtime_deficit -= airtime;
ath_debug_airtime(sc, an, airtime, 0);
exit:
rcu_read_unlock();
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index a0de873..badfe73 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -108,16 +108,26 @@ void ath_txq_unlock_complete(struct ath_softc *sc, struct ath_txq *txq)
static void ath_tx_queue_tid(struct ath_softc *sc, struct ath_txq *txq,
struct ath_atx_tid *tid)
{
- struct list_head *list;
struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
struct ath_chanctx *ctx = avp->chanctx;
+ struct ath_acq *acq;
+ struct list_head *tid_list;
- if (!ctx)
+ if (!ctx || !list_empty(&tid->list))
return;
- list = &ctx->acq[TID_TO_WME_AC(tid->tidno)];
- if (list_empty(&tid->list))
- list_add_tail(&tid->list, list);
+
+ acq = &ctx->acq[TID_TO_WME_AC(tid->tidno)];
+ if (AIRTIME_ACTIVE(sc->airtime_flags)) {
+ tid_list = &acq->acq_new;
+#ifdef CPTCFG_ATH9K_STATION_STATISTICS
+ tid->an->airtime_stats.queued_new++;
+#endif
+ } else {
+ tid_list = &acq->acq_old;
+ }
+
+ list_add_tail(&tid->list, tid_list);
}
void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue)
@@ -710,6 +718,8 @@ static void ath_tx_count_airtime(struct ath_softc *sc,
for(i=0; i < ts->ts_rateindex; i++)
airtime += ath9k_hw_get_duration(sc->sc_ah, bf->bf_desc, i) * rates[i].count;
+ if (!!(sc->airtime_flags & AIRTIME_USE_TX))
+ an->airtime_deficit -= airtime;
ath_debug_airtime(sc, an, 0, airtime);
exit:
@@ -747,6 +757,7 @@ static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
} else
ath_tx_complete_aggr(sc, txq, bf, bf_head, ts, txok);
+
if (!flush)
ath_txq_schedule(sc, txq);
}
@@ -1483,7 +1494,7 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
}
static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
- struct ath_atx_tid *tid, bool *stop)
+ struct ath_atx_tid *tid)
{
struct ath_buf *bf;
struct ieee80211_tx_info *tx_info;
@@ -1505,7 +1516,6 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
if ((aggr && txq->axq_ampdu_depth >= ATH_AGGR_MIN_QDEPTH) ||
(!aggr && txq->axq_depth >= ATH_NON_AGGR_MIN_QDEPTH)) {
__skb_queue_tail(&tid->retry_q, bf->bf_mpdu);
- *stop = true;
return false;
}
@@ -1947,9 +1957,10 @@ void ath_tx_cleanupq(struct ath_softc *sc, struct ath_txq *txq)
void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
{
struct ath_common *common = ath9k_hw_common(sc->sc_ah);
- struct ath_atx_tid *tid, *last_tid;
+ struct ath_atx_tid *tid;
struct list_head *tid_list;
- bool sent = false;
+ struct ath_acq *acq;
+ bool active = AIRTIME_ACTIVE(sc->airtime_flags);
if (txq->mac80211_qnum < 0)
return;
@@ -1958,48 +1969,50 @@ void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
return;
spin_lock_bh(&sc->chan_lock);
- tid_list = &sc->cur_chan->acq[txq->mac80211_qnum];
+ acq = &sc->cur_chan->acq[txq->mac80211_qnum];
- if (list_empty(tid_list)) {
- spin_unlock_bh(&sc->chan_lock);
- return;
- }
+ if (sc->cur_chan->stopped)
+ goto out;
rcu_read_lock();
+begin:
+ tid_list = &acq->acq_new;
+ if (list_empty(tid_list)) {
+ tid_list = &acq->acq_old;
+ if (list_empty(tid_list))
+ goto out;
+ }
+ tid = list_first_entry(tid_list, struct ath_atx_tid, list);
- last_tid = list_entry(tid_list->prev, struct ath_atx_tid, list);
- while (!list_empty(tid_list)) {
- bool stop = false;
-
- if (sc->cur_chan->stopped)
- break;
-
- tid = list_first_entry(tid_list, struct ath_atx_tid, list);
- list_del_init(&tid->list);
-
- if (ath_tx_sched_aggr(sc, txq, tid, &stop))
- sent = true;
-
- /*
- * add tid to round-robin queue if more frames
- * are pending for the tid
- */
- if (ath_tid_has_buffered(tid))
- ath_tx_queue_tid(sc, txq, tid);
+ if (active && tid->an->airtime_deficit <= 0) {
+ tid->an->airtime_deficit += ATH_AIRTIME_QUANTUM;
+ list_move_tail(&tid->list, &acq->acq_old);
+ goto begin;
+ }
- if (stop)
- break;
+ if (!ath_tid_has_buffered(tid)) {
+ if ((tid_list == &acq->acq_new) && !list_empty(&acq->acq_old))
+ list_move_tail(&tid->list, &acq->acq_old);
+ else
+ list_del_init(&tid->list);
+ goto begin;
+ }
- if (tid == last_tid) {
- if (!sent)
- break;
- sent = false;
- last_tid = list_entry(tid_list->prev,
- struct ath_atx_tid, list);
- }
+ /* If a station succeeds in queueing something, immediately restart the
+ * loop. This makes sure the queues are shuffled if the station now has
+ * no more packets queued, and also ensures we keep the hardware queues
+ * full.
+ *
+ * If we dequeued from a new queue, shuffle the queues, to prevent it
+ * from hogging too much airtime. */
+ if(ath_tx_sched_aggr(sc, txq, tid)) {
+ if (!active || ((tid_list == &acq->acq_new) && !list_empty(&acq->acq_old)))
+ list_move_tail(&tid->list, &acq->acq_old);
+ goto begin;
}
+out:
rcu_read_unlock();
spin_unlock_bh(&sc->chan_lock);
}
@@ -2838,6 +2851,8 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an)
struct ath_atx_tid *tid;
int tidno, acno;
+ an->airtime_deficit = ATH_AIRTIME_QUANTUM;
+
for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
tid = ATH_AN_2_TID(an, tidno);
tid->an = an;
--
2.9.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 11:28 [Cerowrt-devel] wifi airtime fairness patches could use eyeballs and testing Dave Taht
@ 2016-08-10 13:06 ` Noah Causin
2016-08-10 13:11 ` Dave Taht
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Noah Causin @ 2016-08-10 13:06 UTC (permalink / raw)
To: Dave Taht, make-wifi-fast, cerowrt-devel, Felix Fietkau,
Simon Wunderlich
[-- Attachment #1: Type: text/plain, Size: 1865 bytes --]
Hi,
I am aware that a performance regression with using fq in mac80211 with
multiple tcp streams has been reported, and this patch in lede disables it.
https://github.com/lede-project/source/commit/4952469ff9278288d766b28247a17694b1c4faaa
Has that been resolved?
Also, would it work if I removed the 220-fq_disable_hack.patch from
mac80211 in lede and applied those patches for airtime fairness,
Thank you,
Noah Causin
On 8/10/2016 7:28 AM, Dave Taht wrote:
> Simon's team successfully[1] tested the attached patches on top of
> felix's lede staging tree on monday against 30 stations. The initial
> results were lovely, at the rates he tested at.
>
> If anyone out there is daring enough to try building these for the
> wndr3800, wndr3700v2 and nanostation, (or anything else with ath9k in
> it!) it would be very good to put these through as many other
> scenarios as possible, notably adhoc and wds need to get looked at -
> but go forth! blow things up anyway you can! measure! Flent has a
> rtt_fair_var test if you can get multiple stations going...
>
> Virtual beer to you, if you put your build up somewhere.
>
> The staging tree:
>
> https://git.lede-project.org/?p=lede/nbd/staging.git;a=summary
>
> ---------- Forwarded message ----------
> From: Toke Høiland-Jørgensen <toke@toke.dk>
> Date: Mon, Aug 8, 2016 at 12:49 PM
> Subject: Re: u alive today? patch
> To: Dave Taht <dave.taht@gmail.com>
> Cc: Simon Wunderlich <sw@simonwunderlich.de>
>
>
> Dave Taht <dave.taht@gmail.com> writes:
>
>> I am hanging with simon today, have lab setup, was hoping to do some patching...
> These are for LEDE; just dump them in package/kernel/mac80211/patches
>
> -Toke
>
>
>
>
>
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast
[-- Attachment #2: Type: text/html, Size: 3227 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 13:06 ` [Cerowrt-devel] [Make-wifi-fast] " Noah Causin
@ 2016-08-10 13:11 ` Dave Taht
2016-08-10 15:04 ` Toke Høiland-Jørgensen
2016-08-14 22:21 ` Dave Taht
2 siblings, 0 replies; 18+ messages in thread
From: Dave Taht @ 2016-08-10 13:11 UTC (permalink / raw)
To: Noah Causin
Cc: Felix Fietkau, Simon Wunderlich, cerowrt-devel, make-wifi-fast
[-- Attachment #1: Type: text/plain, Size: 2083 bytes --]
On Aug 10, 2016 3:06 PM, "Noah Causin" <n0manletter@gmail.com> wrote:
>
> Hi,
>
> I am aware that a performance regression with using fq in mac80211 with
multiple tcp streams has been reported, and this patch in lede disables it.
>
>
https://github.com/lede-project/source/commit/4952469ff9278288d766b28247a17694b1c4faaa
>
> Has that been resolved?
I don't know! Please nuke it in your build.
Some evidence pointed at an iperf threading bug.
>
> Also, would it work if I removed the 220-fq_disable_hack.patch from
mac80211 in lede and applied those patches for airtime fairness,
Yes.
Gofer it!
> Thank you,
>
> Noah Causin
>
> On 8/10/2016 7:28 AM, Dave Taht wrote:
>>
>> Simon's team successfully[1] tested the attached patches on top of
felix's lede staging tree on monday against 30 stations. The initial
results were lovely, at the rates he tested at. If anyone out there is
daring enough to try building these for the wndr3800, wndr3700v2 and
nanostation, (or anything else with ath9k in it!) it would be very good to
put these through as many other scenarios as possible, notably adhoc and
wds need to get looked at - but go forth! blow things up anyway you can!
measure! Flent has a rtt_fair_var test if you can get multiple stations
going... Virtual beer to you, if you put your build up somewhere. The
staging tree: https://git.lede-project.org/?p=lede/nbd/staging.git;a=summary
---------- Forwarded message ---------- From: Toke Høiland-Jørgensen <
toke@toke.dk> Date: Mon, Aug 8, 2016 at 12:49 PM Subject: Re: u alive
today? patch To: Dave Taht <dave.taht@gmail.com> Cc: Simon Wunderlich <
sw@simonwunderlich.de> Dave Taht <dave.taht@gmail.com> writes:
>>>
>>> I am hanging with simon today, have lab setup, was hoping to do some
patching...
>>
>> These are for LEDE; just dump them in package/kernel/mac80211/patches
-Toke
>>
>>
>>
>> _______________________________________________ Make-wifi-fast mailing
list Make-wifi-fast@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/make-wifi-fast
>
>
[-- Attachment #2: Type: text/html, Size: 2950 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 13:06 ` [Cerowrt-devel] [Make-wifi-fast] " Noah Causin
2016-08-10 13:11 ` Dave Taht
@ 2016-08-10 15:04 ` Toke Høiland-Jørgensen
2016-08-10 19:35 ` Dave Taht
2016-08-14 22:21 ` Dave Taht
2 siblings, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-10 15:04 UTC (permalink / raw)
To: Noah Causin
Cc: Dave Taht, make-wifi-fast, cerowrt-devel, Felix Fietkau,
Simon Wunderlich
Noah Causin <n0manletter@gmail.com> writes:
> Hi,
>
> I am aware that a performance regression with using fq in mac80211 with multiple tcp streams has been reported, and this patch in lede disables it.
>
> https://github.com/lede-project/source/commit/4952469ff9278288d766b28247a17694b1c4faaa
>
> Has that been resolved?
Just did a test of this and can reproduce with a wndr3800 as the access
point. I see about a 25% reduction in aggregate throughput with two
streams as opposed to one.
Packet captures and time sequence graphs are here:
https://kau.toke.dk/experiments/softq-debug/ - the onestream* and
twostream* files.
-Toke
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 15:04 ` Toke Høiland-Jørgensen
@ 2016-08-10 19:35 ` Dave Taht
2016-08-10 20:06 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2016-08-10 19:35 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: Noah Causin, make-wifi-fast, cerowrt-devel, Felix Fietkau,
Simon Wunderlich
Wow, that *is* weird. It is good to see the tcp window changing on
this set of data (it wasn't before), and CWRs, but... hmmm... SCIENCE.
Enabling ecn on both sides will rule out some potential bugs.
Doubling the quantum again (3028) will go easier on the other side's
tcp stack. (unless 3028 is the wrong number at this layer)
Are you thinking timestamps are being scribbled on?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 19:35 ` Dave Taht
@ 2016-08-10 20:06 ` Toke Høiland-Jørgensen
2016-08-10 21:50 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-10 20:06 UTC (permalink / raw)
To: Dave Taht
Cc: Noah Causin, make-wifi-fast, cerowrt-devel, Felix Fietkau,
Simon Wunderlich
On 10 August 2016 21:35:40 CEST, Dave Taht <dave.taht@gmail.com> wrote:
>Wow, that *is* weird. It is good to see the tcp window changing on
>this set of data (it wasn't before), and CWRs, but... hmmm... SCIENCE.
>
>Enabling ecn on both sides will rule out some potential bugs.
Yeah, couldn't get ecn to work on the host I was using as the other endpoint on that test. Will try with another box that's not on quite as ancient a kernel. Was also planning to disable codel (by setting a very high target) to try to narrow down the problem.
>Doubling the quantum again (3028) will go easier on the other side's
>tcp stack. (unless 3028 is the wrong number at this layer)
The endpoints have plenty of CPU.
>Are you thinking timestamps are being scribbled on?
Not necessarily; that was related to something else...
-Toke
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 20:06 ` Toke Høiland-Jørgensen
@ 2016-08-10 21:50 ` Toke Høiland-Jørgensen
2016-08-10 21:58 ` Dave Taht
0 siblings, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-10 21:50 UTC (permalink / raw)
To: Dave Taht; +Cc: make-wifi-fast, cerowrt-devel, Felix Fietkau
Toke Høiland-Jørgensen <toke@toke.dk> writes:
> On 10 August 2016 21:35:40 CEST, Dave Taht <dave.taht@gmail.com> wrote:
>>Wow, that *is* weird. It is good to see the tcp window changing on
>>this set of data (it wasn't before), and CWRs, but... hmmm... SCIENCE.
>>
>>Enabling ecn on both sides will rule out some potential bugs.
>
> Yeah, couldn't get ecn to work on the host I was using as the other endpoint on
> that test. Will try with another box that's not on quite as ancient a kernel.
> Was also planning to disable codel (by setting a very high target) to try to
> narrow down the problem.
OK, digging some more on this:
I am seeing *no* drops by CoDel, and no backlog in the mac80211 softq
layer either (or at most one or two packets when polling with a 1 sec
interval). This is with one as well as with two flows.
On my x86 testbed I see backlog building and packets getting dropped by
CoDel - and can't reproduce the performance hit. So I'm wondering what
the difference is. I can think of:
- Another bottleneck somewhere in the system limiting things on the
wndr3800.
- A locking issue in the FQ mechanism preventing packets from being
queued properly.
- Something related to the wndr only having a single CPU.
-Toke
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 21:50 ` Toke Høiland-Jørgensen
@ 2016-08-10 21:58 ` Dave Taht
2016-08-10 22:05 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2016-08-10 21:58 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: make-wifi-fast, cerowrt-devel, Felix Fietkau
On Wed, Aug 10, 2016 at 11:50 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>
>> On 10 August 2016 21:35:40 CEST, Dave Taht <dave.taht@gmail.com> wrote:
>>>Wow, that *is* weird. It is good to see the tcp window changing on
>>>this set of data (it wasn't before), and CWRs, but... hmmm... SCIENCE.
>>>
>>>Enabling ecn on both sides will rule out some potential bugs.
>>
>> Yeah, couldn't get ecn to work on the host I was using as the other endpoint on
>> that test. Will try with another box that's not on quite as ancient a kernel.
>> Was also planning to disable codel (by setting a very high target) to try to
>> narrow down the problem.
>
> OK, digging some more on this:
>
> I am seeing *no* drops by CoDel, and no backlog in the mac80211 softq
> layer either (or at most one or two packets when polling with a 1 sec
> interval). This is with one as well as with two flows.
But there are tons of drops evident from the captures.
> On my x86 testbed I see backlog building and packets getting dropped by
> CoDel - and can't reproduce the performance hit. So I'm wondering what
> the difference is. I can think of:
>
> - Another bottleneck somewhere in the system limiting things on the
> wndr3800.
>
> - A locking issue in the FQ mechanism preventing packets from being
> queued properly.
>
> - Something related to the wndr only having a single CPU.
>
> -Toke
--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 21:58 ` Dave Taht
@ 2016-08-10 22:05 ` Toke Høiland-Jørgensen
2016-08-10 22:07 ` Toke Høiland-Jørgensen
2016-08-10 22:18 ` Jonathan Morton
0 siblings, 2 replies; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-10 22:05 UTC (permalink / raw)
To: Dave Taht; +Cc: make-wifi-fast, cerowrt-devel, Felix Fietkau
Dave Taht <dave.taht@gmail.com> writes:
> On Wed, Aug 10, 2016 at 11:50 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>>
>>> On 10 August 2016 21:35:40 CEST, Dave Taht <dave.taht@gmail.com> wrote:
>>>>Wow, that *is* weird. It is good to see the tcp window changing on
>>>>this set of data (it wasn't before), and CWRs, but... hmmm... SCIENCE.
>>>>
>>>>Enabling ecn on both sides will rule out some potential bugs.
>>>
>>> Yeah, couldn't get ecn to work on the host I was using as the other endpoint on
>>> that test. Will try with another box that's not on quite as ancient a kernel.
>>> Was also planning to disable codel (by setting a very high target) to try to
>>> narrow down the problem.
>>
>> OK, digging some more on this:
>>
>> I am seeing *no* drops by CoDel, and no backlog in the mac80211 softq
>> layer either (or at most one or two packets when polling with a 1 sec
>> interval). This is with one as well as with two flows.
>
> But there are tons of drops evident from the captures.
Well the CoDel drop counter stays completely flat. There are cwnd
reductions, yes, but are they drops? My thought was that they were
retransmissions caused by OOO packets?
-Toke
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 22:05 ` Toke Høiland-Jørgensen
@ 2016-08-10 22:07 ` Toke Høiland-Jørgensen
2016-08-10 22:18 ` Jonathan Morton
1 sibling, 0 replies; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-10 22:07 UTC (permalink / raw)
To: Dave Taht; +Cc: make-wifi-fast, cerowrt-devel, Felix Fietkau
[-- Attachment #1: Type: text/plain, Size: 1450 bytes --]
Toke Høiland-Jørgensen <toke@toke.dk> writes:
> Dave Taht <dave.taht@gmail.com> writes:
>
>> On Wed, Aug 10, 2016 at 11:50 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>>>
>>>> On 10 August 2016 21:35:40 CEST, Dave Taht <dave.taht@gmail.com> wrote:
>>>>>Wow, that *is* weird. It is good to see the tcp window changing on
>>>>>this set of data (it wasn't before), and CWRs, but... hmmm... SCIENCE.
>>>>>
>>>>>Enabling ecn on both sides will rule out some potential bugs.
>>>>
>>>> Yeah, couldn't get ecn to work on the host I was using as the other endpoint on
>>>> that test. Will try with another box that's not on quite as ancient a kernel.
>>>> Was also planning to disable codel (by setting a very high target) to try to
>>>> narrow down the problem.
>>>
>>> OK, digging some more on this:
>>>
>>> I am seeing *no* drops by CoDel, and no backlog in the mac80211 softq
>>> layer either (or at most one or two packets when polling with a 1 sec
>>> interval). This is with one as well as with two flows.
>>
>> But there are tons of drops evident from the captures.
>
> Well the CoDel drop counter stays completely flat. There are cwnd
> reductions, yes, but are they drops? My thought was that they were
> retransmissions caused by OOO packets?
Here's the patch to expose the CoDel drop counter per flow, BTW. Can be
dropped into a LEDE build.
-Toke
[-- Attachment #2: 341-mac80211-Keep-CoDel-stats-per-txq-export-them-in-deb.patch --]
[-- Type: text/x-diff, Size: 4640 bytes --]
From d5a40556aa16fe0f452fd2ee975d7adce6f54837 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= <toke@toke.dk>
Date: Wed, 20 Jul 2016 16:06:49 +0200
Subject: [PATCH] mac80211: Keep CoDel stats per txq, export them in debugfs.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently the 'aqm' stats in mac80211 only keeps overlimit drop stats,
not CoDel stats. This moves the CoDel stats into the txqi structure and
adds the drop and mark counts to the debug output.
Cc: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
net/mac80211/debugfs.c | 12 ++++++++----
net/mac80211/ieee80211_i.h | 2 +-
net/mac80211/tx.c | 4 ++--
3 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index 27e6fb9..69bf2e5 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -137,18 +137,20 @@ static int aqm_open(struct inode *inode, struct file *file)
len += scnprintf(info->buf + len,
info->size - len,
"* vif\n"
- "ifname addr ac backlog-bytes backlog-packets flows overlimit collisions tx-bytes tx-packets\n");
+ "ifname addr ac backlog-bytes backlog-packets flows drops marks overlimit collisions tx-bytes tx-packets\n");
list_for_each_entry_rcu(sdata, &local->interfaces, list) {
txqi = to_txq_info(sdata->vif.txq);
len += scnprintf(info->buf + len, info->size - len,
- "%s %pM %u %u %u %u %u %u %u %u\n",
+ "%s %pM %u %u %u %u %u %u %u %u %u %u\n",
sdata->name,
sdata->vif.addr,
txqi->txq.ac,
txqi->tin.backlog_bytes,
txqi->tin.backlog_packets,
txqi->tin.flows,
+ txqi->cstats.drop_count,
+ txqi->cstats.ecn_mark,
txqi->tin.overlimit,
txqi->tin.collisions,
txqi->tin.tx_bytes,
@@ -158,14 +160,14 @@ static int aqm_open(struct inode *inode, struct file *file)
len += scnprintf(info->buf + len,
info->size - len,
"* sta\n"
- "ifname addr tid ac backlog-bytes backlog-packets flows overlimit collisions tx-bytes tx-packets\n");
+ "ifname addr tid ac backlog-bytes backlog-packets flows drops marks overlimit collisions tx-bytes tx-packets\n");
list_for_each_entry_rcu(sta, &local->sta_list, list) {
sdata = sta->sdata;
for (i = 0; i < ARRAY_SIZE(sta->sta.txq); i++) {
txqi = to_txq_info(sta->sta.txq[i]);
len += scnprintf(info->buf + len, info->size - len,
- "%s %pM %d %d %u %u %u %u %u %u %u\n",
+ "%s %pM %d %d %u %u %u %u %u %u %u %u %u\n",
sdata->name,
sta->sta.addr,
txqi->txq.tid,
@@ -173,6 +175,8 @@ static int aqm_open(struct inode *inode, struct file *file)
txqi->tin.backlog_bytes,
txqi->tin.backlog_packets,
txqi->tin.flows,
+ txqi->cstats.drop_count,
+ txqi->cstats.ecn_mark,
txqi->tin.overlimit,
txqi->tin.collisions,
txqi->tin.tx_bytes,
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index c9f8c80..9f11b13 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -812,6 +812,7 @@ struct txq_info {
struct fq_tin tin;
struct fq_flow def_flow;
struct codel_vars def_cvars;
+ struct codel_stats cstats;
unsigned long flags;
/* keep last! */
@@ -1106,7 +1107,6 @@ struct ieee80211_local {
struct fq fq;
struct codel_vars *cvars;
struct codel_params cparams;
- struct codel_stats cstats;
const struct ieee80211_ops *ops;
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 682011e..201167d 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1339,7 +1339,7 @@ static struct sk_buff *fq_tin_dequeue_func(struct fq *fq,
local = container_of(fq, struct ieee80211_local, fq);
txqi = container_of(tin, struct txq_info, tin);
cparams = &local->cparams;
- cstats = &local->cstats;
+ cstats = &txqi->cstats;
if (flow == &txqi->def_flow)
cvars = &txqi->def_cvars;
@@ -1399,6 +1399,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
fq_tin_init(&txqi->tin);
fq_flow_init(&txqi->def_flow);
codel_vars_init(&txqi->def_cvars);
+ codel_stats_init(&txqi->cstats);
txqi->txq.vif = &sdata->vif;
@@ -1437,7 +1438,6 @@ int ieee80211_txq_setup_flows(struct ieee80211_local *local)
return ret;
codel_params_init(&local->cparams);
- codel_stats_init(&local->cstats);
local->cparams.interval = MS2TIME(100);
local->cparams.target = MS2TIME(20);
local->cparams.ecn = true;
--
2.9.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 22:05 ` Toke Høiland-Jørgensen
2016-08-10 22:07 ` Toke Høiland-Jørgensen
@ 2016-08-10 22:18 ` Jonathan Morton
2016-08-10 22:45 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2016-08-10 22:18 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: Dave Taht, make-wifi-fast, cerowrt-devel, Felix Fietkau
> On 11 Aug, 2016, at 01:05, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> There are cwnd reductions, yes, but are they drops? My thought was that they were retransmissions caused by OOO packets?
You should be able to find the retransmitted packets, in that case. But FQ shouldn’t be reordering packets within the same flow.
- Jonathan Morton
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 22:18 ` Jonathan Morton
@ 2016-08-10 22:45 ` Toke Høiland-Jørgensen
2016-08-11 6:43 ` Dave Taht
0 siblings, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-10 22:45 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Dave Taht, make-wifi-fast, cerowrt-devel, Felix Fietkau
Jonathan Morton <chromatix99@gmail.com> writes:
>> On 11 Aug, 2016, at 01:05, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> There are cwnd reductions, yes, but are they drops? My thought was
>> that they were retransmissions caused by OOO packets?
>
> You should be able to find the retransmitted packets, in that case.
> But FQ shouldn’t be reordering packets within the same flow.
No, but the WiFi retransmission logic might. Specifically, the ath9k
will put packets in a retry buffer that takes precedence over new
packets when building aggregates. But since it has one or two aggregates
built already by the time it does this, reordering can occur that way.
Now, digging in to the packet traces (looking at the single flow case,
since that is easier to follow), there's a ~320k segment of the sequence
space that is about 1.5MB and 200 ms late (sequence numbers 10044929
through 10370177 show up after sequence number 11617665 at around the
12k packet mark in the trace). This is nine full aggregates being
reordered.
I'm not really sure the reordering mechanism described above can account
for this, unless there are some pretty serious hardware buffering going
on that we are not aware of. Does anyone else have any ideas?
-Toke
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 22:45 ` Toke Høiland-Jørgensen
@ 2016-08-11 6:43 ` Dave Taht
2016-08-11 9:16 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2016-08-11 6:43 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: Jonathan Morton, make-wifi-fast, cerowrt-devel, Felix Fietkau
On Thu, Aug 11, 2016 at 12:45 AM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> Jonathan Morton <chromatix99@gmail.com> writes:
>
>>> On 11 Aug, 2016, at 01:05, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>>
>>> There are cwnd reductions, yes, but are they drops? My thought was
>>> that they were retransmissions caused by OOO packets?
>>
>> You should be able to find the retransmitted packets, in that case.
>> But FQ shouldn’t be reordering packets within the same flow.
>
> No, but the WiFi retransmission logic might. Specifically, the ath9k
> will put packets in a retry buffer that takes precedence over new
> packets when building aggregates. But since it has one or two aggregates
> built already by the time it does this, reordering can occur that way.
>
> Now, digging in to the packet traces (looking at the single flow case,
> since that is easier to follow), there's a ~320k segment of the sequence
> space that is about 1.5MB and 200 ms late (sequence numbers 10044929
> through 10370177 show up after sequence number 11617665 at around the
> 12k packet mark in the trace). This is nine full aggregates being
> reordered.
>
> I'm not really sure the reordering mechanism described above can account
> for this, unless there are some pretty serious hardware buffering going
> on that we are not aware of. Does anyone else have any ideas?
From a bug finding perspective, disable fq as per the original patch,
and try again, look for reordering.
The question I have is does this also happen on x86? A thought is that
we're interacting with a mips specific bug in header parsing...
>
> -Toke
--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-11 6:43 ` Dave Taht
@ 2016-08-11 9:16 ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2016-08-11 9:16 UTC (permalink / raw)
To: Dave Taht; +Cc: Jonathan Morton, make-wifi-fast, cerowrt-devel, Felix Fietkau
On 11 August 2016 08:43:46 CEST, Dave Taht <dave.taht@gmail.com> wrote:
>On Thu, Aug 11, 2016 at 12:45 AM, Toke Høiland-Jørgensen <toke@toke.dk>
>wrote:
>> Jonathan Morton <chromatix99@gmail.com> writes:
>>
>>>> On 11 Aug, 2016, at 01:05, Toke Høiland-Jørgensen <toke@toke.dk>
>wrote:
>>>>
>>>> There are cwnd reductions, yes, but are they drops? My thought was
>>>> that they were retransmissions caused by OOO packets?
>>>
>>> You should be able to find the retransmitted packets, in that case.
>>> But FQ shouldn’t be reordering packets within the same flow.
>>
>> No, but the WiFi retransmission logic might. Specifically, the ath9k
>> will put packets in a retry buffer that takes precedence over new
>> packets when building aggregates. But since it has one or two
>aggregates
>> built already by the time it does this, reordering can occur that
>way.
>>
>> Now, digging in to the packet traces (looking at the single flow
>case,
>> since that is easier to follow), there's a ~320k segment of the
>sequence
>> space that is about 1.5MB and 200 ms late (sequence numbers 10044929
>> through 10370177 show up after sequence number 11617665 at around the
>> 12k packet mark in the trace). This is nine full aggregates being
>> reordered.
>>
>> I'm not really sure the reordering mechanism described above can
>account
>> for this, unless there are some pretty serious hardware buffering
>going
>> on that we are not aware of. Does anyone else have any ideas?
>
>From a bug finding perspective, disable fq as per the original patch,
>and try again, look for reordering.
Yes, though this was on a single flow test. Having thought it over some more, it could also be that the aggregates were simply lost entirely and retransmitted (the capture was taken at the receiver). Would make more sense from a timing perspective; but then we still need to explain why eight consecutive aggregates were lost.
I'll do some more tests and take captures at both ends.
>The question I have is does this also happen on x86? A thought is that
>we're interacting with a mips specific bug in header parsing...
Don't think so, at least not in the same pattern. But the physical environments are also fairly different...
-Toke
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-10 13:06 ` [Cerowrt-devel] [Make-wifi-fast] " Noah Causin
2016-08-10 13:11 ` Dave Taht
2016-08-10 15:04 ` Toke Høiland-Jørgensen
@ 2016-08-14 22:21 ` Dave Taht
2016-08-14 22:49 ` Aaron Wood
2 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2016-08-14 22:21 UTC (permalink / raw)
To: Noah Causin
Cc: make-wifi-fast, cerowrt-devel, Felix Fietkau, Simon Wunderlich
Toke and I have now done a great deal of testing and exploration of
the issue with leaving "fq" enabled on the ath9k (running with and
without felix's patch).
The only thing we've found that "moves the bar" is the presence or
absence of crypto. With crypto + fq on, on the wndr 3800, we see a
significant performance variation of 32-80mbit as the number of flows
goes from 12 to 1.
without crypto, with or with fq enabled, we are able to reliably get
120Mbits on our test setup.
The patterns of the degradation are also often variable, we will have
long periods where things are quite stable, and other 10-20 sec
periods where throughput drops or stops almost entirely.
It would be nice to get confirmation that the fq is working without
crypto enabled.
Ideas as to why crypto chops off so much throughput in the first place
(from 120mbit to 80), and as to what else to poke into with crypto on,
would be nice.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-14 22:21 ` Dave Taht
@ 2016-08-14 22:49 ` Aaron Wood
2016-08-14 23:02 ` Dave Taht
0 siblings, 1 reply; 18+ messages in thread
From: Aaron Wood @ 2016-08-14 22:49 UTC (permalink / raw)
To: Dave Taht
Cc: Noah Causin, make-wifi-fast, Simon Wunderlich, cerowrt-devel,
Felix Fietkau
[-- Attachment #1: Type: text/plain, Size: 478 bytes --]
On Sun, Aug 14, 2016 at 3:21 PM, Dave Taht <dave.taht@gmail.com> wrote:
>
> Ideas as to why crypto chops off so much throughput in the first place
> (from 120mbit to 80), and as to what else to poke into with crypto on,
> would be nice.
Crypto as in WPA2? If 1 client is fine, but multiple clients is not, my
guess is that changing the AES key in the crypto accelerator is
slow/difficult, and so what you're seeing is the lost time as it switches
security contexts.
-Aaron
[-- Attachment #2: Type: text/html, Size: 877 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-14 22:49 ` Aaron Wood
@ 2016-08-14 23:02 ` Dave Taht
2016-08-14 23:04 ` Aaron Wood
0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2016-08-14 23:02 UTC (permalink / raw)
To: Aaron Wood
Cc: Noah Causin, make-wifi-fast, Simon Wunderlich, cerowrt-devel,
Felix Fietkau
On Mon, Aug 15, 2016 at 12:49 AM, Aaron Wood <woody77@gmail.com> wrote:
> On Sun, Aug 14, 2016 at 3:21 PM, Dave Taht <dave.taht@gmail.com> wrote:
>>
>>
>> Ideas as to why crypto chops off so much throughput in the first place
>> (from 120mbit to 80), and as to what else to poke into with crypto on,
>> would be nice.
>
>
> Crypto as in WPA2? If 1 client is fine, but multiple clients is not, my
> guess is that changing the AES key in the crypto accelerator is
> slow/difficult, and so what you're seeing is the lost time as it switches
> security contexts.
1 client suffers degradation with more than one flow in flight, being
fq'd, and crypted with wpa2.
I'll have plots in the morning.
> -Aaron
--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cerowrt-devel] [Make-wifi-fast] wifi airtime fairness patches could use eyeballs and testing
2016-08-14 23:02 ` Dave Taht
@ 2016-08-14 23:04 ` Aaron Wood
0 siblings, 0 replies; 18+ messages in thread
From: Aaron Wood @ 2016-08-14 23:04 UTC (permalink / raw)
To: Dave Taht
Cc: Noah Causin, make-wifi-fast, Simon Wunderlich, cerowrt-devel,
Felix Fietkau
[-- Attachment #1: Type: text/plain, Size: 962 bytes --]
On Sun, Aug 14, 2016 at 4:02 PM, Dave Taht <dave.taht@gmail.com> wrote:
> On Mon, Aug 15, 2016 at 12:49 AM, Aaron Wood <woody77@gmail.com> wrote:
> > On Sun, Aug 14, 2016 at 3:21 PM, Dave Taht <dave.taht@gmail.com> wrote:
> >>
> >>
> >> Ideas as to why crypto chops off so much throughput in the first place
> >> (from 120mbit to 80), and as to what else to poke into with crypto on,
> >> would be nice.
> >
> >
> > Crypto as in WPA2? If 1 client is fine, but multiple clients is not, my
> > guess is that changing the AES key in the crypto accelerator is
> > slow/difficult, and so what you're seeing is the lost time as it switches
> > security contexts.
>
> 1 client suffers degradation with more than one flow in flight, being
> fq'd, and crypted with wpa2.
>
> I'll have plots in the morning.
>
So it's not, that, then. Unless it's doing something dumb with the keys
when it switches flows (since that's now an interrupt back up to the
driver?)
-Aaron
[-- Attachment #2: Type: text/html, Size: 1543 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2016-08-14 23:04 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-10 11:28 [Cerowrt-devel] wifi airtime fairness patches could use eyeballs and testing Dave Taht
2016-08-10 13:06 ` [Cerowrt-devel] [Make-wifi-fast] " Noah Causin
2016-08-10 13:11 ` Dave Taht
2016-08-10 15:04 ` Toke Høiland-Jørgensen
2016-08-10 19:35 ` Dave Taht
2016-08-10 20:06 ` Toke Høiland-Jørgensen
2016-08-10 21:50 ` Toke Høiland-Jørgensen
2016-08-10 21:58 ` Dave Taht
2016-08-10 22:05 ` Toke Høiland-Jørgensen
2016-08-10 22:07 ` Toke Høiland-Jørgensen
2016-08-10 22:18 ` Jonathan Morton
2016-08-10 22:45 ` Toke Høiland-Jørgensen
2016-08-11 6:43 ` Dave Taht
2016-08-11 9:16 ` Toke Høiland-Jørgensen
2016-08-14 22:21 ` Dave Taht
2016-08-14 22:49 ` Aaron Wood
2016-08-14 23:02 ` Dave Taht
2016-08-14 23:04 ` Aaron Wood
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox