From: "Jason A. Donenfeld"
Date: Sun, 2 Oct 2016 04:25:59 +0200
To: Toke Høiland-Jørgensen
Cc: Dave Taht, cake@lists.bufferbloat.net, make-wifi-fast@lists.bufferbloat.net, WireGuard mailing list
Subject: Re: [Cake] WireGuard Queuing, Bufferbloat, Performance, Latency, and related issues

Hey Toke,

On Sun, Oct 2, 2016 at 1:40 AM, Toke Høiland-Jørgensen wrote:
> I assumed that there probably was, but was not sure where. Thanks for
> clearing this up. I'll take a step back and try to describe this on the
> conceptual level:

Conceptual overview: exactly what I needed, thanks.

> First, the reason we want better queueing is to deal with the case where
> wireguard is a bottleneck. If it's not a bottleneck, no queue will
> build, so there's no queueing latency to reduce. In this case the
> bottleneck happens when we're waiting for the crypto step to finish,
> i.e. when your queue (2) starts to fill. So we want to limit the time
> packets spend in the total queueing system, i.e. from when they enter
> through xmit() until they are released by send_off_bundle().
>
> The simple way to do this is to just apply CoDel to the whole system.
> I.e. timestamp packets when they appear in xmit() and apply CoDel before
> sending them off in send_off_bundle().
> You'd have to make sure the
> timestamp is preserved through the encryption step (keep it in the cb,
> for instance), but other than that it's only a couple of handfuls of LOC
> you need to add to achieve this.

Thank you! Excellent. Okay, that clears things up for me quite a bit. So if
my understanding is correct, the goal of a smart queue is to realize when
the queue is being filled and apply artificial delay to the dequeuing
process, so that TCP's rate control algorithm won't suffer from initial
bursts. So, the whole idea is not to do anything fancy with the padata
situation, but instead record when the packet is submitted to the crypto
engine and when it comes back out of the crypto engine, feed this to CoDel,
and let CoDel apply whatever delays are necessary. (I've put a rough sketch
of what I have in mind in the P.S. below.) This should make going back and
reading the wireless code a bit clearer for me.

> However, the FQ part of FQ-CoDel gives substantial benefits on top of
> plain CoDel, namely fairness between (5-tuple) flows, and lower latency
> to sparse flows (because they get priority). Since you have a strict
> FIFO encryption step in the middle (and re-order sensitivity after
> encryption is completed), it's difficult to apply FQ across all of
> wireguard. However, it can be applied to the peer queues, as I
> mentioned. This would have the benefit that when a peer is backlogged
> and waiting for space on the crypto queue, packets arriving to it will
> get the benefit of the FQ when transmission resumes.

Do you mean that:

  a. I should add another mechanism to the part where I remove items from
     the peer queue (queue (1)) and put them in the padata queue (queue
     (2)), so that it fairly chooses packets based on the inner IP header?

or

  b. When queueing up any packets into the padata queue (queue (2)), I
     should somehow apply fairness in deciding which peer's packets get
     moved from that peer's queue (1) into the global queue (2)?

> BTW, when reading the code (which is very readable!)

Glad you liked it! If anything turns out to be unreadable or unclear, do
let me know. This sort of thing is somewhat important to me.

> 1. When the crypto queue is full, you stick all the unencrypted packets
>    on the peer queues. What mechanism resumes the transmission of those
>    packets, other than a new packet arriving to the same peer? And what
>    happens if several peers are backlogged at the same time? Will the
>    order in which they resume transmission depend on which peer happens
>    to get a new packet once the crypto queue has space?

No mechanism. :( Right now nothing happens until the peer has another
packet to send, which is far from ideal. Due to some other interesting
aspects of WireGuard that aren't directly relevant here, if the peer
receives data but hasn't sent any data for roughly 10 seconds, the queue
will start up again. This obviously isn't a satisfactory solution, though.

What would you suggest? That I try again after 100ms? Or after some magic
CoDel-defined timeout? If I set a one-off timer for 100ms after the
rejection, and several peers all get rejected at the same time, I suppose
I'll have something of a "thundering herd" situation when the timer fires,
though I'm not sure this would necessarily be a bad thing.

> 2. There are several places where you walk the queue with a manual
>    for-loop. Is there any reason why you're not using the
>    skb_queue_walk_* macros? In particular, in some places you are
>    storing the pointer in the beginning of a loop in case it goes away;
>    that seems to be what skb_queue_walk_safe() is meant for?
skb_queues are circularly linked. This requires an extra 16 bytes for the
head element. Rather than waste that in the cb, I simply use a list of skbs
that terminates with a NULL. This makes traversal a bit different, and
simpler. (There's a small side-by-side of the two walk styles in the
P.P.S. below.)

Thanks for the guidance here. Very much appreciated.

Regards,
Jason
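
P.S. To check that I've understood the "CoDel over the whole system" idea
concretely, here's a minimal sketch of the kind of thing I'm imagining. The
cb struct, the field names, and the fixed 5 ms cutoff are all made up for
illustration (real CoDel adapts its drop decisions over time rather than
using a hard sojourn threshold), and the two helpers just mark where xmit()
and send_off_bundle() would hook in:

#include <linux/ktime.h>
#include <linux/skbuff.h>

/* Hypothetical per-packet state kept in skb->cb while the packet is in
 * flight through the device; the name and layout are made up for this
 * sketch. */
struct wg_codel_cb {
        u64 enqueue_time_ns;
};
#define WG_CODEL_CB(skb) ((struct wg_codel_cb *)(skb)->cb)

/* Called from xmit(): stamp the packet as it enters the queueing system. */
static void wg_stamp_entry(struct sk_buff *skb)
{
        WG_CODEL_CB(skb)->enqueue_time_ns = ktime_get_ns();
}

/* Called from send_off_bundle(): has this packet spent too long queued
 * (peer queue plus crypto queue)?  A fixed 5 ms target stands in for the
 * real CoDel control law, which adapts over time. */
static bool wg_sojourn_too_long(struct sk_buff *skb)
{
        u64 sojourn_ns = ktime_get_ns() - WG_CODEL_CB(skb)->enqueue_time_ns;

        return sojourn_ns > 5 * NSEC_PER_MSEC;
}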
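
P.P.S. On the queue-walking question, here's roughly how I think of the two
styles side by side: the sk_buff_head based walk that skb_queue_walk_safe()
expects, versus the bare NULL-terminated chain I'm using. The function names
and loop bodies are placeholders:

#include <linux/skbuff.h>

/* With a struct sk_buff_head, the safe-walk macro keeps a lookahead
 * pointer so the current skb can be unlinked inside the loop. */
static void walk_with_queue_head(struct sk_buff_head *queue)
{
        struct sk_buff *skb, *tmp;

        skb_queue_walk_safe(queue, skb, tmp) {
                /* skb may be __skb_unlink()'d here without breaking the walk */
        }
}

/* With a bare NULL-terminated chain (just skb->next, no head element),
 * the same safety comes from saving the next pointer by hand, which is
 * what the manual for-loops do. */
static void walk_null_terminated(struct sk_buff *first)
{
        struct sk_buff *skb, *next;

        for (skb = first; skb; skb = next) {
                next = skb->next;
                /* skb may be freed or re-linked here */
        }
}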