Subject: Re: [Cake] Using cake to shape 1000’s of users.
From: Dan Siemon
To: Jonathan Morton
Cc: Toke Høiland-Jørgensen, Dave Taht, Cake List
Date: Mon, 06 Aug 2018 21:46:48 -0400

On Fri, 2018-07-27 at 21:58 +0300, Jonathan Morton wrote:
> > We have some deployments with multiple access technologies (e.g.
> > DOCSIS, DSL and wireless) behind the same box, so per-customer
> > overhead would be useful.
> 
> The design I presently have in mind would allow setting the overhead
> per speed tier. Thus you could have "10Mbit DOCSIS" and "10Mbit
> ADSL" as separate tiers. That is, there are likely to be far fewer
> unique overhead settings than customers.

The number of unique overhead configurations will be much smaller
than the number of subscribers, but from a provisioning standpoint
this information belongs with the subscriber. Having to map a
subscriber into an 'overhead config' rather than just setting those
values along with the rate would be inconvenient.

> > I am interested in the diffserv class ideas in Cake but need to
> > do some experimentation to see if that helps in this environment.
> > It would be interesting to be able to target sub-classes (not
> > sure what the proper Cake terminology is) based on an eBPF
> > classifier.
> 
> The trick with Diffserv is classifying the traffic correctly, given
> that most traffic in the wild is not properly marked, and some
> actively tries to hide its nature. As an ISP, applying your own
> rules could be seen as questionable according to some
> interpretations of Net Neutrality. With Cake that's not an issue by
> default, since it relies entirely on the DSCP on the packet itself,
> but it does mean that a lot of traffic which *should* probably be
> classified as something other than best-effort is not.
> 
> Cake does cope well with unclassified latency-sensitive traffic,
> due to DRR++, so it's not actually necessary to specifically
> identify such applications. What it has trouble with, in common
> with all other flow-isolating qdiscs, is applications which open
> many parallel connections for bulk transfer (e.g. BitTorrent, which
> is used by many software-update mechanisms as well as
> file-sharing). Classifying this traffic as Bulk (CS1 DSCP) gets it
> out of the way of other applications while still permitting it to
> use available capacity.
> 
> I have some small hope that wider deployment of sensible, basic
> Diffserv at bottlenecks will encourage applications to start using
> it as intended, thereby solving a decades-long chicken-and-egg
> problem. Presently Cake has such an implementation by default, but
> it is not yet widely deployed enough to have any visible impact.
> 
> This is something we should probably discuss in more detail later.
> 
> > Re accounting, we currently count per-IP via eBPF, not via qdisc
> > counts.
> 
> Fine, that's a feature I'm happy to leave out.

One problem I haven't solved yet is counting bytes and packets per
IP after the qdisc. This is required when multiple IPs are sent to
the same qdisc but per-IP stats are still needed. It would be great
if there were a post-qdisc hook for eBPF programs, something like
clsact but running after the other qdiscs on Tx instead of before.
It would be even better if the skb could be flagged with an ID of
some kind, so that the outbound eBPF program wouldn't need to parse
the headers again and could instead just do a map lookup + update.
I started looking at XDP for this, but at the time it was all
receive-side; I'm not sure if that's still the case.

> So in summary, the logical flow of a packet should be:
> 
> 1: Map dst or src IP to subscriber (eBPF).
> 2: Map subscriber to speed/overhead tier (eBPF).
> 3: (optional) Classify Diffserv (???).
> 4: Enqueue per flow, handle global queue overflow (rare!) by
> dropping from head of longest queue (like Cake).

Since the encapsulations used in the ISP world are pretty diverse,
being able to generate the flow hash from the eBPF program is also
required. Today we do this by setting the skb->hash field and
letting the outbound FQ-CoDel classify based on that. A rough
sketch of both eBPF pieces (the per-IP counters and setting
skb->hash) follows.
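For concreteness, here is a minimal, untested sketch of that kind of
clsact egress program: it bumps per-IP byte/packet counters in a map
and then sets skb->hash. The map size, the IPv4-only parsing, and
the trivial address-based hash are all illustrative; real code would
parse the relevant encapsulations and hash the inner headers. Note
this still runs *before* the qdisc, so it doesn't solve the
post-qdisc counting problem above.

/* Sketch only: per-IP accounting + flow hash at clsact egress.
 * Uses current libbpf conventions.
 * Build: clang -O2 -g -target bpf -c acct.c -o acct.o
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct acct_val {
	__u64 bytes;
	__u64 packets;
};

struct {
	__uint(type, BPF_MAP_TYPE_LRU_HASH);
	__uint(max_entries, 65536);        /* illustrative size */
	__type(key, __u32);                /* IPv4 address */
	__type(value, struct acct_val);
} acct_map SEC(".maps");

SEC("tc")
int count_and_hash(struct __sk_buff *skb)
{
	void *data = (void *)(long)skb->data;
	void *data_end = (void *)(long)skb->data_end;

	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end)
		return TC_ACT_OK;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return TC_ACT_OK;          /* IPv4 only in this sketch */

	struct iphdr *ip = (void *)(eth + 1);
	if ((void *)(ip + 1) > data_end)
		return TC_ACT_OK;

	/* On egress toward the subscriber, the subscriber is the
	 * destination. */
	__u32 key = ip->daddr;
	struct acct_val *val = bpf_map_lookup_elem(&acct_map, &key);
	if (val) {
		__sync_fetch_and_add(&val->bytes, skb->len);
		__sync_fetch_and_add(&val->packets, 1);
	} else {
		struct acct_val init = { .bytes = skb->len, .packets = 1 };
		bpf_map_update_elem(&acct_map, &key, &init, BPF_ANY);
	}

	/* Placeholder hash; a real program would hash the inner
	 * 5-tuple of the encapsulated flow. FQ-CoDel classifies on
	 * skb->hash once it is set. */
	bpf_set_hash(skb, ip->saddr ^ ip->daddr);

	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";

Attach with something like:

  tc qdisc add dev eth0 clsact
  tc filter add dev eth0 egress bpf direct-action obj acct.o sec tc

and read the counters from user space, e.g. with bpftool map dump.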
> --- enqueue/dequeue split ---
> 5: Wait for shaper on earliest-scheduled subscriber link with
> waiting traffic (borrow sch_fq's rbtree?).
> 6: Wait for shaper on aggregate backhaul link (congestion can
> happen here too).

Yes, multiple layers of bandwidth limiting are required, two at a
minimum: one for the subscriber's plan rate and another for the
nearest access device. Conceptually, is it possible to build
something akin to the HTB qdisc that uses DRR++ and retains the
flexible hierarchy? I don't have a good sense of the performance
impact of that type of setup versus a single, more featureful
qdisc. (A sketch of the two-layer arrangement is in the P.S. below.)

> 7: Choose one of subscriber's queues, apply AQM and deliver a
> packet (mostly like Cake does).
> 
> If that seems reasonable, we can treat it as a baseline spec for
> others' input.
> 
>  - Jonathan Morton
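P.S. For reference, a minimal sketch of the two shaping layers using
stock HTB with fq_codel leaves; the device name, rates, and
addresses are made up:

  # Root HTB; class 1:1 caps the aggregate backhaul link.
  tc qdisc add dev eth0 root handle 1: htb
  tc class add dev eth0 parent 1: classid 1:1 htb rate 950mbit

  # One child class per subscriber plan rate, fq_codel on each leaf.
  tc class add dev eth0 parent 1:1 classid 1:10 htb rate 10mbit ceil 10mbit
  tc class add dev eth0 parent 1:1 classid 1:11 htb rate 25mbit ceil 25mbit
  tc qdisc add dev eth0 parent 1:10 fq_codel
  tc qdisc add dev eth0 parent 1:11 fq_codel

  # Steer each subscriber's traffic into their class (by dst IP here).
  tc filter add dev eth0 parent 1: protocol ip u32 \
      match ip dst 192.0.2.10/32 flowid 1:10
  tc filter add dev eth0 parent 1: protocol ip u32 \
      match ip dst 192.0.2.20/32 flowid 1:11

This is the classic approach rather than the DRR++-based hierarchy
discussed above; it shows the two rate-limiting layers, not the
proposed scheduler.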