From: Dave Taht
To: ath10k, codel@lists.bufferbloat.net, make-wifi-fast@lists.bufferbloat.net
Date: Sat, 30 Apr 2016 20:41:38 -0700
Subject: [Codel] fq_codel_drop vs a udp flood

There were a few things on this thread that went by while I wasn't on
the ath10k list
(https://www.mail-archive.com/ath10k@lists.infradead.org/msg04461.html).

First up, the udp flood...

>>> From: ath10k on behalf of Roman Yeryomin
>>> Sent: Friday, April 8, 2016 8:14 PM
>>> To: ath10k@lists.infradead.org
>>> Subject: ath10k performance, master branch from 20160407
>>>
>>> Hello!
>>>
>>> I saw that performance patches were committed, so I decided to give
>>> them a try (using a 4.1 kernel and backports).
>>> The results are quite disappointing: TCP download (client pov) dropped
>>> from 750Mbps to ~550Mbps, and UDP shows completely weird behaviour: if
>>> generating 900Mbps it gives 30Mbps max, if generating 300Mbps it gives
>>> 250Mbps; before (with the latest official backports release from
>>> January) I was able to get 900Mbps.
>>> Hardware is basically ap152 + qca988x 3x3.
>>> When running perf top I see that fq_codel_drop eats a lot of cpu.
>>> Here is the output when running an iperf3 UDP test:
>>>
>>>   45.78%  [kernel]  [k] fq_codel_drop
>>>    3.05%  [kernel]  [k] ag71xx_poll
>>>    2.18%  [kernel]  [k] skb_release_data
>>>    2.01%  [kernel]  [k] r4k_dma_cache_inv

The udp flood behavior is not "weird". The test is wrong: it is filling
the local queue far faster than the link can drain it. The size of the
local queue has exceeded anything rational, the gentle tcp-friendly
methods have failed, we're out of configured queue space, and as a
last-ditch move, fq_codel_drop is attempting to reduce the backlog via
brute force.

Approaches:

0) Fix the test.

The udp flood test should seek an operating point roughly equal to the
bandwidth of the link, where there is near-zero queuing delay and
nearly 100% utilization. There are several well known methods for an
endpoint to seek that equilibrium - filling the pipe and not the queue
- and the ones outlined in http://ee.lbl.gov/papers/congavoid.pdf are a
good starting point for further research. :)

Now, a unicast flood test is useful for figuring out how many packets
can fit in a link (both large and small), for tweaking the cpu (or for
running a box out of memory). However, I have seen a lot of udp flood
tests that are constructed badly. Measuring the time to *send* X
packets, without counting what is still sitting in the queue, is one
example. Which iperf3 options were these, exactly? Was it run locally,
or from a test client connected via ethernet (i.e. at local cpu speeds
rather than at the network ingress speed)?

A simple test of your test: if your udp flood tool reports a better
result with a 10000 packet local queue than with a 1000 packet one,
it's broken.

A "good" udp flood test merely counts the number of *received* packets
and bytes over some (set of) intervals, gradually ramping up until it
sees no further improvement (a minimal receive-side counter of that
sort is sketched after the list of approaches below). A better one
might also shock the system and try to measure the rate controller or
the aggregator, AND count and graph packet loss over time, etc. And
then there are side effects like running out of cpu on an artificial
test.

Still, in the real world udp floods exist, and we can rip some of the
cpu cost out of fq_codel_drop, which looks through 1024 queues in the
mainline version and 4096 in this one. That's *expensive*.

1) fq_codel_drop should probably bump up the codel count on every drop,
to give the main portion of the algorithm a higher drop frequency,
faster. That won't hurt, but it won't help much in the face of a large
disparity between input and output rates sustained over a fairly long
time. A smaller disparity (like gigE feeding 800mbit wifi) will
naturally have the main part of the algorithm kick in sooner.

2) fq_codel_drop can simply taildrop. That would cut the cpu cost by
quite a lot and make the udp flood test easier to "pass". But it does
little in the real world to actually shoot at the offending flow, and a
serious flood will end up hurting flows that are behaving sanely. I
favor this option as it is cheap and more or less what happened in the
pre-fq_codel world. Coupling it with 1) above doesn't quite work as
well as you might want, either, but it might help.

3) Steering: store the size of, and a pointer to, the biggest flow of
all the flows, and drop from the head of that. Or, for friendlier
behavior, store the top 3 and circulate between them. This incurs an
ongoing cpu cost on every enqueue/dequeue of a packet (a rough sketch
of that bookkeeping follows this list).

4) Make it more per-station airtime fair: find the station with the
biggest backlog, and have a smaller number of fq_codel queues per
station. For most purposes, honestly, 64 queues per station sounds like
plenty at the moment.
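To make the bookkeeping in 3) concrete, here is a rough, untested
userspace sketch - NOT the kernel's fq_codel_drop, and every name in it
is made up for illustration - of tracking the fattest flow incrementally
on enqueue, so the overload drop path becomes O(1) instead of a scan
over 1024 (or 4096) queues:

/*
 * Rough userspace sketch only - not kernel code.  struct flow,
 * note_enqueue, note_dequeue and pick_drop_victim are all invented
 * names for illustration.
 */
#include <stddef.h>
#include <stdint.h>

#define NFLOWS 1024                /* same order as mainline fq_codel */

struct flow {
    uint32_t backlog;              /* bytes currently queued in this flow */
    uint32_t codel_count;          /* codel's per-flow drop-frequency state */
    /* the per-flow packet list itself would live here */
};

static struct flow flows[NFLOWS];
static struct flow *fattest;       /* cached pointer to the biggest backlog */

/* O(1) per enqueue: one compare against the cached fattest flow. */
static void note_enqueue(struct flow *f, uint32_t len)
{
    f->backlog += len;
    if (!fattest || f->backlog > fattest->backlog)
        fattest = f;
}

/*
 * On dequeue the cached pointer is allowed to go slightly stale rather
 * than rescanning: if the fattest flow shrinks it is still fat enough
 * to be a reasonable drop target, and any flow that overtakes it will
 * reclaim the pointer on its next enqueue.
 */
static void note_dequeue(struct flow *f, uint32_t len)
{
    f->backlog -= len;
}

/* Called only when the whole qdisc is out of configured queue space. */
static struct flow *pick_drop_victim(void)
{
    if (fattest)
        fattest->codel_count++;    /* option 1): let codel react faster too */
    return fattest;
}

The "top 3" variant above would just keep a tiny array here instead of
a single pointer and rotate between its entries.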
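And on 0), for the "count what actually arrived" style of test,
something as dumb as the following would do. Again this is just an
untested standalone sketch, not a patch to iperf3 or any other existing
tool; the port number and the one second interval are arbitrary choices.

/*
 * Minimal receive-side counter: report only what actually arrived, per
 * interval, and leave all ramp-up logic to the sender.  Standalone
 * sketch; port 5201 and the 1s reporting interval are arbitrary.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    char buf[65536];
    uint64_t pkts = 0, bytes = 0;
    time_t start = time(NULL);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5201);

    if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("socket/bind");
        return 1;
    }

    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n < 0)
            continue;
        pkts++;
        bytes += (uint64_t)n;

        time_t now = time(NULL);
        if (now - start >= 1) {
            printf("%llu pkts, %.1f Mbit/s actually received\n",
                   (unsigned long long)pkts,
                   bytes * 8.0 / 1e6 / (double)(now - start));
            pkts = bytes = 0;
            start = now;
        }
    }
}

A sender that ramps its offered load until this number stops improving,
rather than timing its own send loop, is the behavior described above.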
...

I am painfully aware we have a long way to go to get this right, but
http://blog.cerowrt.org/post/rtt_fair_on_wifi/ is the endgame for
normal traffic...

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org