[Make-wifi-fast] Instrumented ATH9K for Crashes?

Frank Horowitz frank at horow.net
Thu Feb 16 22:18:45 EST 2017

Hi All,

TL;DR: I’ve been seeing reliable crashes from the ATH9K driver in net-next kernels for weeks, but have been unable to capture a crash log.

In an attempt at having a reliably/regularly updatable router running the ATF and BBR codes, I’ve assembled an Atom-based Zotac mini-ITX board with two different ATH9K-based radios. I’ve installed Ubuntu 16.10, with a kernel compiled from Dave Miller’s net-next tree (currently running 4.10-rc7). The radios are set up using two different hostapd.conf files (one for the 2.4GHz radio, and one for the 5GHz radio). The motherboard has an onboard RTL8169 ethernet port, and I’ve got a 4-port Intel ethernet card also in the mix. The RTL8169 is my WAN port, fed by a DSL modem (running LEDE), and all but one of the other network ports are part of a LAN bridge. The last port is ultimately meant to feed a DMZ, but there’s nothing on it at the moment.

When the radios are not connected to the bridge, everything has run stably for days. When the radios are connected to the bridge, but have no clients, the system ran stably for about 24 hours before I stopped the test.

When a radio is connected to the bridge and has a client, the system reliably crashes within an hour or two.

I’ve tried to get netconsole logs from another Linux box on my bridged LAN, but that’s a Heisenbug because I can’t get the ATH9Ks to play well with netconsole over the bridge. I think this is due to the lack of polling in the ATH9K driver, but would be delighted to find out that it’s something configurable for those radios. Bottom line, I’ve had no luck in snagging a log from the crashes via netconsole. I’ve also tried looking at the systemd logs, but nothing made it to the log database before the crash.
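
On the receiving side, netconsole output is plain UDP datagrams, so the listener on the external box can be very small. What follows is a minimal sketch, not something taken from the setup described here: the function names (format_line, capture, open_listener) are made up for illustration, and port 6666 is assumed only because it is netconsole's default target port. It flushes after every datagram so that whatever arrives before the crash is not left sitting in a buffer.

```python
import socket
import time


def format_line(payload: bytes, sender: str, now: float) -> str:
    """Render one netconsole datagram as a timestamped (UTC) log line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    # Kernel messages should be ASCII, but a crashing box may send garbage.
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n")
    return f"{stamp} {sender} {text}\n"


def capture(sock, out, limit=None) -> int:
    """Copy datagrams from sock to out, flushing after each one.

    limit=None runs until interrupted; an integer stops after that many
    datagrams. Returns the number of datagrams written.
    """
    count = 0
    while limit is None or count < limit:
        payload, (host, _port) = sock.recvfrom(65535)
        out.write(format_line(payload, host, time.time()))
        out.flush()  # the sender may die at any moment; keep nothing buffered
        count += 1
    return count


def open_listener(bind_addr: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Open a UDP socket on the (assumed) netconsole target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock
```

Something like capture(open_listener(), open("ath9k-crash.log", "a")) would then run until interrupted. This only helps if the router can actually send netconsole traffic over an interface that keeps working up to the crash, which is the part that has not worked so far over the bridge.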

I could reconfigure my network such that the unbridged DMZ is feeding my external linux box.

Before I try that, I thought I’d ask Toke and the list for advice about any configs for the ATH9K driver that might help with A) capturing a crash log, and/or B) debugging the drivers.

Hopefully, by the time this bites someone else in 4.11 kernels, we’ll have been able to squish this bug. (Just to be explicit, I’m volunteering to be a testbed. Don’t tell my wife! ;-) )

TIA for any hints on how best to proceed.

Frank Horowitz
frank at horow.net
