So I've finally gotten around to trying out the Python bandit algorithm implementations from https://www.math.univ-toulouse.fr/~agarivie/Telecom/bandits/

I have added a very simple WiFi arm to the code base: basically a Bernoulli arm that succeeds with a certain probability and, on success, gives a payoff scaled by the base rate. It can be instantiated from the rate_stats_csv output from Minstrel in the kernel.

Based on the data from a simple test run in my own testbed, I was able to get three of the algorithms to produce something meaningful; see the attached graph. The best of them performs roughly comparably to Minstrel (I think; the numbers are not quite straightforward to compare).

I have not been able to get the Thompson and BayesUCB algorithms to work with this scenario yet: they require a posterior distribution to sample from, and the included implementation doesn't handle the varying payoffs of the arms. However, perhaps sticking with the KL-UCB algorithm is better anyway; it's the same one that "optimal rate sampling" modifies, though I haven't quite grokked how they modify it yet.

In any case, I do believe it is possible to extend this simulation into something we can use to guide, say, a dynamic implementation (i.e., one where the probabilities change over the duration of the test run), as well as to evaluate the effects of collapsing arms or defining them differently. It would probably be good to have a better source for the actual probabilities of each rate, though.

So yeah, this is a bit of a brain dump of where I'm at, but I'll be away for the next couple of weeks, so I thought it better to get it out there. I've appended a few rough sketches below my sig to make parts of the above concrete.

Code here: https://kau.toke.dk/git/pybandits/

-Toke
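
P.S. The promised sketches. All of this is untested as written, and the names won't match what's in the pybandits repo exactly; it's just meant to show the shape of the thing.

First, the WiFi arm idea: a Bernoulli arm where a success pays off the rate's nominal bitrate, plus a loader for Minstrel's rate_stats_csv output. The column names in the loader are guesses; adjust to the actual CSV layout:

import csv
import random

class WiFiArm:
    """One arm per WiFi rate: a transmission succeeds with probability
    success_prob, and a success pays off the nominal bitrate (0 on failure)."""

    def __init__(self, success_prob, rate_mbps):
        self.success_prob = success_prob
        self.rate_mbps = rate_mbps
        # Expected payoff; handy for computing regret against the best arm.
        self.expectation = success_prob * rate_mbps

    def draw(self):
        return self.rate_mbps if random.random() < self.success_prob else 0.0

def arms_from_rate_stats(filename):
    """Build one arm per rate from a Minstrel rate_stats_csv dump."""
    arms = []
    with open(filename) as f:
        for row in csv.DictReader(f):
            prob = float(row['prob']) / 100.0  # success probability (percent, assumed)
            rate = float(row['rate'])          # nominal bitrate in Mbps (assumed)
            arms.append(WiFiArm(prob, rate))
    return arms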
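On the Thompson problem: since the payoff on success is fixed per arm, one way around the varying payoffs might be to keep the Beta posterior over the success probability alone and rank arms by sampled probability times payoff. I haven't tried wiring this into the repo's framework, so take it as a direction, not a fix:

import random

def thompson_pick(arms, successes, failures):
    """Sample a success probability from each arm's Beta(1+s, 1+f)
    posterior and pick the arm maximizing sampled_prob * payoff."""
    best, best_val = 0, -1.0
    for i, arm in enumerate(arms):
        theta = random.betavariate(1 + successes[i], 1 + failures[i])
        val = theta * arm.rate_mbps
        if val > best_val:
            best, best_val = i, val
    return best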
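For reference, here is the textbook KL-UCB index for Bernoulli-type rewards, which is roughly what I understand the repo's implementation to be computing; rewards are normalized to [0, 1] by dividing by the fastest rate:

import math

def kl_bernoulli(p, q, eps=1e-15):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def klucb_index(mean, pulls, t, precision=1e-6):
    """The largest q in [mean, 1] with pulls * kl(mean, q) <= log(t),
    found by bisection (kl(mean, .) is increasing on [mean, 1])."""
    threshold = math.log(t) / pulls
    lo, hi = mean, 1.0
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) > threshold:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def klucb_run(arms, horizon):
    """Play KL-UCB against the WiFiArm list from the sketch above."""
    max_rate = max(a.rate_mbps for a in arms)
    n = len(arms)
    pulls, means, total = [0] * n, [0.0] * n, 0.0
    for i in range(n):  # pull each arm once to initialize
        r = arms[i].draw() / max_rate
        pulls[i], means[i] = 1, r
        total += r
    for t in range(n + 1, horizon + 1):
        i = max(range(n), key=lambda a: klucb_index(means[a], pulls[a], t))
        r = arms[i].draw() / max_rate
        pulls[i] += 1
        means[i] += (r - means[i]) / pulls[i]
        total += r
    return total * max_rate  # cumulative payoff, back in rate units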
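Finally, the "dynamic" extension I have in mind is little more than letting the success probability follow a schedule of breakpoints over the test run, e.g.:

class DynamicWiFiArm(WiFiArm):
    """WiFiArm whose success probability follows a schedule of
    (time, prob) breakpoints over the run. Note that the static
    'expectation' attribute stops being meaningful here."""

    def __init__(self, schedule, rate_mbps):
        super().__init__(schedule[0][1], rate_mbps)
        self.schedule = schedule  # sorted by time, e.g. [(0, 0.9), (5000, 0.4)]
        self.t = 0

    def draw(self):
        self.t += 1
        # Use the probability from the latest breakpoint we have passed.
        for time, prob in reversed(self.schedule):
            if self.t >= time:
                self.success_prob = prob
                break
        return super().draw()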