From: Eric Dumazet
To: Toke Høiland-Jørgensen
Cc: make-wifi-fast@lists.bufferbloat.net, linux-wireless@vger.kernel.org, Felix Fietkau
Date: Tue, 16 Aug 2016 13:47:24 -0700
Subject: Re: [Make-wifi-fast] On the ath9k performance regression with FQ and crypto
Message-ID: <1471380444.4943.17.camel@edumazet-glaptop3.roam.corp.google.com>
In-Reply-To: <87pop85tvr.fsf@toke.dk>
References: <87pop85tvr.fsf@toke.dk>

Do you have tcpdumps of 1) a sample with crypto and 2) a sample without crypto?

Looks like some TCP Small Queues interaction with skb->truesize, if GSO is
involved, or encapsulation adding overhead.

On Tue, 2016-08-16 at 22:41 +0200, Toke Høiland-Jørgensen wrote:
> So Dave and I have been spending the last couple of days trying to
> narrow down why there's a performance regression in some cases on ath9k
> with the softq-FQ patches. Felix first noticed this regression, and LEDE
> currently carries a patch [1] to disable the FQ portion of the softq
> patches to avoid it.
>
> While we have been able to narrow it down a little bit, no solution has
> been forthcoming, so this is an attempt to describe the bug in the hope
> that someone else will have an idea about what could be causing it.
>
> What we're seeing is the following (when the access point is running
> ath9k with the softq patches):
>
> When running two or more flows to a station, their combined throughput
> will be roughly 20-30% lower than the throughput of a single flow to the
> same station. This happens:
>
> - for both TCP and UDP traffic.
> - independent of the base rate (i.e. signal quality).
> - but only with crypto enabled (WPA2 CCMP in this case).
>
> However, the regression completely disappears if either of the
> following is true:
>
> - no crypto is enabled.
> - the FQ part of mac80211 is disabled (as in [1]).
>
> We have been able to reproduce this behaviour on two different ath9k
> hardware chips and two different architectures.
>
> The cause of the regression seems to be that the aggregates are smaller
> when there are two flows than when there is only one. Adding debug
> statements to the aggregate forming code indicates that this is because
> no more packets are available when the aggregates are built (i.e.
> ieee80211_tx_dequeue() returns NULL).
>
> We have not been able to determine why the queues run empty when this
> combination of circumstances arises. Since we easily get upwards of 120
> Mbps of TCP throughput without crypto but with full FQ, it's clearly not
> the hashing overhead in itself that does it (and the hashing also
> happens with just one flow, so the overhead is still there). And the
> crypto itself should be offloaded to hardware (shouldn't it? we do see a
> marked drop in overall throughput from just enabling crypto), so how
> would the queueing (say, mixing of packets from different flows)
> influence that?
>
> Does anyone have any ideas? We are stumped...
>
> -Toke
>
> [1] https://git.lede-project.org/?p=lede/nbd/staging.git;a=blob;f=package/kernel/mac80211/patches/220-fq_disable_hack.patch;h=7f420beea56335d5043de6fd71b5febae3e9bd79;hb=HEAD
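
To make the TSQ hypothesis from the reply at the top of this message a bit more concrete, here is a minimal Python sketch of how a per-socket byte budget that is charged in units of skb->truesize could limit how many packets sit below TCP at once. It is a simplified model based on my understanding of the kernel's TSQ check, not the kernel code itself. The function names, the 128 KB cap, the pacing rate, and the two truesize values are all illustrative assumptions, not numbers measured in this thread. It only models the TCP case.

```python
# Simplified model of a TCP Small Queues style byte budget.
# ASSUMPTIONS (not from the thread): the 128 KB cap, the pacing rate,
# and both truesize values below are hypothetical illustration values.

def tsq_limit_bytes(truesize, pacing_rate_bytes_per_s, limit_output_bytes=131072):
    """Approximate per-socket budget of bytes allowed below TCP.

    Modeled as: at least two skbs, or roughly 1 ms worth of data at the
    pacing rate (rate >> 10), capped by a sysctl-like upper bound.
    """
    limit = max(2 * truesize, pacing_rate_bytes_per_s >> 10)
    return min(limit, limit_output_bytes)


def skbs_in_flight(truesize, pacing_rate_bytes_per_s, limit_output_bytes=131072):
    """Count skbs that can be handed down before the budget is exceeded.

    Each skb is charged its full truesize (payload plus headers plus
    allocation overhead), so a larger truesize means fewer skbs queued.
    """
    limit = tsq_limit_bytes(truesize, pacing_rate_bytes_per_s, limit_output_bytes)
    queued_bytes = 0
    count = 0
    # Transmission is throttled once the queued bytes exceed the limit.
    while queued_bytes <= limit:
        queued_bytes += truesize
        count += 1
    return count


if __name__ == "__main__":
    rate = 12_500_000  # hypothetical pacing rate: 100 Mbit/s in bytes/s
    for label, truesize in (("smaller truesize", 2304), ("larger truesize", 4096)):
        print(label, truesize, "->", skbs_in_flight(truesize, rate), "skbs queued")
```

If something like this is in play, anything that inflates truesize per packet (extra headroom or encapsulation overhead, for example) would reduce the number of packets available to the aggregation code, which is why comparing tcpdumps with and without crypto would be a useful first check.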