Date: Mon, 6 Jun 2016 12:25:05 -0700 (PDT)
From: David Lang
To: Dave Taht
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] faster scheduling, maybe

On Mon, 6 Jun 2016, Dave Taht wrote:

> On Mon, Jun 6, 2016 at 11:48 AM, David Lang wrote:
>> On Mon, 6 Jun 2016, Dave Taht wrote:
>>
>>> http://info.iet.unipi.it/~luigi/papers/20160511-mysched-preprint.pdf
>>
>> I don't think so.
>>
>> They don't even try for fairness between flows; they are just looking
>> at fairness between different VMs. They tell a VM that it has complete
>> access to the NIC for a time, then give another VM complete access to
>> the NIC. At best they put each VM's traffic into a different hardware
>> queue in the NIC.
>>
>> This avoids all AQM decisions on the part of the host OS, because the
>> packets never get to the host OS.
>>
>> The speed improvement comes from bypassing the host OS and having the
>> VMs deliver packets directly to the NIC. This speeds things up, but at
>> the cost of any coordination across VMs. Each VM can run fq_codel, but
>> it's much coarser timeslicing between VMs.
>
> Well, the principal things bugging me are:
>
> * we have multi-core on nearly all the new routers.
> * nearly all the ethernet devices themselves support hardware multiqueue.
> * we take 6 locks on the qdiscs.
> * rx and tx ring cleanup are often combined in existing drivers in a
>   single thread.

These are valid concerns, but this paper just arbitrates between multiple
VMs accessing one hardware NIC. If you don't have multiple VMs in play,
their approach has nothing to work with.

> The chance to rework the mac80211 layer on make-wifi-fast (where
> manufacturers are also busy adding hardware mq) gives us a chance to
> rethink how we process access to these queues.

Watching the discussion, I see a few things:

1. If we can figure out exactly how much data the system is going to
handle, we can fill each queue with what we want in the next aggregate to
that destination.

2. With multiple queues/cores, when we have data from different sources,
we can route it to different cores and split the work of sorting the data
into different queues between the cores (each core working on a different
subset of queues, so not having to lock against the other cores); see the
sketch after this list.

2a. Balancing traffic across the cores/queue sets (or at least the output
from each of them) would be tricky, but that's where thinking like this
paper's could possibly help.

3. As we are seeing in the mac80211 work, qdiscs operate way too early in
the process, so we need to eliminate them for wifi and do the queueing
closer to the hardware. On a network where packets to different
destinations can be freely mixed, qdiscs can continue to operate.
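To make point 2 concrete, here is a minimal sketch of the per-core queue
selection I have in mind, in kernel-style C. pick_queue() and
QUEUES_PER_CORE are made-up names for illustration, and this assumes the
NIC exposes at least num_online_cpus() * QUEUES_PER_CORE tx queues:

#include <linux/skbuff.h>
#include <linux/smp.h>

#define QUEUES_PER_CORE 4	/* hypothetical partitioning */

/* Map a flow onto one of the current core's private queues, so that
 * enqueue on this core never contends with enqueue on another core.
 * The flow hash keeps all packets of one flow in one queue, avoiding
 * reordering. */
static u16 pick_queue(struct sk_buff *skb)
{
	unsigned int cpu = smp_processor_id();
	u32 hash = skb_get_hash(skb);	/* stable per-flow hash */

	return cpu * QUEUES_PER_CORE + (hash % QUEUES_PER_CORE);
}

The hard part from 2a is then entirely on the dequeue side: the hardware
(or a scheduler very close to it) has to drain these per-core queue sets
fairly.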
For drivers where the rx and tx ring cleanup is combined, are you talking
about ones that already do BQL, or are these drivers that need BQL added
as well? It may be that splitting the two paths when BQL is added is the
right thing to do.

David Lang
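P.S. For concreteness, a rough sketch of the two standard BQL hooks I'm
referring to, assuming a driver whose tx submit and tx-ring cleanup have
already been split into separate paths (my_xmit/my_tx_clean are made-up
names; the netdev_tx_*_queue calls are the real kernel API):

#include <linux/netdevice.h>

static netdev_tx_t my_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq =
		netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

	/* ... post skb to the hardware tx ring here ... */

	netdev_tx_sent_queue(txq, skb->len);	/* BQL: bytes now in flight */
	return NETDEV_TX_OK;
}

/* tx-ring cleanup path, typically run from NAPI poll */
static void my_tx_clean(struct net_device *dev, unsigned int qidx,
			unsigned int pkts, unsigned int bytes)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, qidx);

	/* BQL grows/shrinks the queue limit from this completion feedback,
	 * which is what keeps the ring from buffering more than needed. */
	netdev_tx_completed_queue(txq, pkts, bytes);
}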