From: Eduardo Simonetti
To: cake@lists.bufferbloat.net
Date: Thu, 2 Aug 2018 17:03:45 -0400
Subject: Re: [Cake] Cake Digest, Vol 41, Issue 3
Message-Id: <37515660-0B10-4B03-9D20-883EA650C7F1@gmail.com>

Sent from my iPhone

> On Aug 1, 2018, at 3:48 PM, cake-request@lists.bufferbloat.net wrote:
>
> Send Cake mailing list submissions to
>    cake@lists.bufferbloat.net
>
> To subscribe or unsubscribe via the World Wide Web, visit
>    https://lists.bufferbloat.net/listinfo/cake
> or, via email, send a message with subject or body 'help' to
>    cake-request@lists.bufferbloat.net
>
> You can reach the person managing the list at
>    cake-owner@lists.bufferbloat.net
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Cake digest..."
>
>
> Today's Topics:
>
>   1. passing args to bpf programs (Dave Taht)
>   2. Re: passing args to bpf programs (Stephen Hemminger)
>   3. Re: passing args to bpf programs (Dave Taht)
>   4. Re: passing args to bpf programs (Jonathan Morton)
>   5. Re: passing args to bpf programs (Dave Taht)
>   6. Re: passing args to bpf programs (Dave Taht)
>   7. codel in ebpf? (Dave Taht)
>   8. fq_codel on netronome's NICs?
>      (Dave Taht)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 1 Aug 2018 09:22:41 -0700
> From: Dave Taht
> To: Cake List
> Subject: [Cake] passing args to bpf programs
> Content-Type: text/plain; charset="UTF-8"
>
> this really isn't the right list for this... but I wanted to build on
> the ack_filter bpf code I had, to create impairments, like dropping
> acks every X packets, or randomly, or when a specific pattern is seen
> (like timestamps or sack). This was sort of the reverse complement to
> getting the cake ack-filter right, now that I know everything that can
> go wrong...
>
> I see I can return ACT_SHOT, so I can drop packets.
>
> But what I can't quite figure out is how to pass args to a tc ebpf
> program. Do I have to pass those via a file descriptor? A map
> generated elsewhere? what? Sure as heck don't want to compile one
> program per opt....
>
> Simplest args would be:
>
> max 16 - drop every 16th ack packet
> random 24 - drop randomly between 0 24
> match only certain flags
>
> followed by more gnarly ones like:
>
> miscalculate if I have a payload or not
> drop sack
> mangle timestamps
>
> -- 
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 1 Aug 2018 09:35:22 -0700
> From: Stephen Hemminger
> To: Dave Taht
> Cc: Cake List
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID: <20180801093522.22c1f043@xeon-e3>
> Content-Type: text/plain; charset=US-ASCII
>
> On Wed, 1 Aug 2018 09:22:41 -0700
> Dave Taht wrote:
>
>> this really isn't the right list for this... but I wanted to build on
>> the ack_filter bpf code I had, to create impairments, like dropping
>> acks every X packets, or randomly, or when a specific pattern is seen
>> (like timestamps or sack).
>> This was sort of the reverse complement to
>> getting the cake ack-filter right, now that I know everything that can
>> go wrong...
>>
>> I see I can return ACT_SHOT, so I can drop packets.
>>
>> But what I can't quite figure out is how to pass args to a tc ebpf
>> program. Do I have to pass those via a file descriptor? A map
>> generated elsewhere? what? Sure as heck don't want to compile one
>> program per opt....
>>
>> Simplest args would be:
>>
>> max 16 - drop every 16th ack packet
>> random 24 - drop randomly between 0 24
>> match only certain flags
>>
>> followed by more gnarly ones like:
>>
>> miscalculate if I have a payload or not
>> drop sack
>> mangle timestamps
>>
>
> With Xnetem, I ended up creating a map of config options.
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 1 Aug 2018 09:36:32 -0700
> From: Dave Taht
> To: Cake List
> Subject: Re: [Cake] passing args to bpf programs
> Content-Type: text/plain; charset="UTF-8"
>
> A somewhat related goal would be to apply the codel algorithm via bpf.
> We'd take advantage of hardware multiqueue for the fq part, ensure a
> good timestamp always existed on all ingress ports, check it on egress.
>
> The one major loop in codel we could unroll to be a fixed unroll (and
> just give up), and we're done there.
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 1 Aug 2018 19:42:02 +0300
> From: Jonathan Morton
> To: Dave Taht
> Cc: Cake List
> Subject: Re: [Cake] passing args to bpf programs
> Content-Type: text/plain; charset=us-ascii
>
>> On 1 Aug, 2018, at 7:36 pm, Dave Taht wrote:
>>
>> The one major loop in codel we could unroll to be a fixed unroll (and
>> just give up), and we're done there.
>
> The COBALT version only has a loop in the recovery phase, and that
> mainly to handle long pauses immediately following heavy congestion.
> The idle and marking phases do not loop.
>
> - Jonathan Morton
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 1 Aug 2018 09:54:02 -0700
> From: Dave Taht
> To: Jonathan Morton
> Cc: Cake List
> Subject: Re: [Cake] passing args to bpf programs
> Content-Type: text/plain; charset="UTF-8"
>
> the other thing I noticed while fiddling with bql and cake unshaped is
> that bql, too, had gained the ability to limit rates at mbit
> granularity, when I wasn't looking. I am not sure if additional
> hardware support is required, but:
>
> https://patchwork.ozlabs.org/patch/449002/
>
>
>> On Wed, Aug 1, 2018 at 9:42 AM Jonathan Morton wrote:
>>
>>> On 1 Aug, 2018, at 7:36 pm, Dave Taht wrote:
>>>
>>> The one major loop in codel we could unroll to be a fixed unroll (and
>>> just give up), and we're done there.
>>
>> The COBALT version only has a loop in the recovery phase, and that
>> mainly to handle long pauses immediately following heavy congestion.
>> The idle and marking phases do not loop.
>>
>> - Jonathan Morton
>>
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 1 Aug 2018 10:25:52 -0700
> From: Dave Taht
> To: Jonathan Morton
> Cc: Cake List
> Subject: Re: [Cake] passing args to bpf programs
> Content-Type: text/plain; charset="UTF-8"
>
> I wonder if ebpf has opcode space for an invsqrt?
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 1 Aug 2018 12:20:46 -0700
> From: Dave Taht
> To: Jonathan Morton
> Cc: Cake List
> Subject: [Cake] codel in ebpf?
> Content-Type: text/plain; charset="UTF-8"
>
>> On Wed, Aug 1, 2018 at 10:25 AM Dave Taht wrote:
>>
>> I wonder if ebpf has opcode space for an invsqrt?
>
> bpf_ktime_get_ns() exists...
>
> one thing that I don't know if bpf can do is read/write the
> skb->tstamp field.
> The plan would be to rigorously write it (if not
> supplied by hw) on all ingress ports and check it on all egress ports.
>
> That said, every time I've tried to do something in ebpf I hit a
> limitation I'd not thunk of yet. For example, where can you attach the
> egress filter?
>
> My thought would be to use a bfifo -> bpf -> bql, but from what little I
> understand, it's bpf -> bfifo -> bql
>
>
> ------------------------------
>
> Message: 8
> Date: Wed, 1 Aug 2018 12:48:58 -0700
> From: Dave Taht
> To: cerowrt-devel@lists.bufferbloat.net, Cake List,
>    codel@lists.bufferbloat.net
> Cc: Jakub Kicinski
> Subject: [Cake] fq_codel on netronome's NICs?
> Content-Type: text/plain; charset="UTF-8"
>
> Being kind of inspired by all the tricks
> https://homes.cs.washington.edu/~arvind/papers/afq.pdf used on the
> cavium, I went looking for other smart nics to play with.
> https://open-nfp.org/resources/ looked interesting so I pinged them...
>
> from netronome:
>
> "I think it would be feasible to implement fq_codel on the NFP.
>
> The hardware schedulers do not support fq_codel, so the schedulers
> would have to be implemented in one of the NFP firmware languages
> (e.g. micro-C or micro-code); the NFP hardware rings could be used for
> the queueing mechanism. Practically, this may be one way of making it
> work:
>
> The main worker threads could calculate the flow hash in order to
> select which ring should be used, and then issue the packet to a
> re-ordering thread.
> I believe the re-ordering thread can push the packets to the internal
> NFP rings instead of the wire.
> The scheduler thread could then make the scheduling decision, pop the
> packet from the corresponding ring, then send the packet to the
> hardware packet schedulers (or drop the packet if performing a
> head-drop), and also check the timestamp for the CoDel portion of the
> algorithm.
> The hardware packet schedulers should then transmit the packet.
>
>
> In terms of handling any rate-mismatch on the outgoing interface, you
> could have another thread monitor the NFP hardware packet scheduler
> queue levels. The scheduler thread can then throttle the packet rate
> being sent to the hardware packet schedulers (unless of course it is
> okay to tail-drop at the hardware packet scheduler queues).
>
> Finally, if the outgoing interface is not the natural point of
> congestion/rate mis-match (e.g. if the outgoing Ethernet interface is
> attached to a cable/DSL modem), the NFP hardware does have some
> support for rate-limiting the outgoing interface (e.g. limiting a 10
> Gigabit Ethernet interface down to 600 Mbps outbound), so as to move
> the congestion/rate mis-match point to the NFP, so that fq_codel can
> take effect in terms of handling the buffer bloat."
>
>
> ------------------------------
>
> End of Cake Digest, Vol 41, Issue 3
> ***********************************
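
On the "passing args to bpf programs" question in the digest: the answer
given there was "a map of config options". Below is a rough sketch of that
idea, written as a plain Python model and not as BPF C. A dict stands in
for the BPF map, and the per-packet function re-reads it on every call, so
user space can change options without compiling one program per opt. All
names here (OPT_MAX, OPT_RANDOM, AckImpairment) are hypothetical, and
reading "random 24" as "drop each ack with a 1-in-24 chance" is my
assumption, since the original wording is ambiguous. The verdict values
match TC_ACT_OK and TC_ACT_SHOT from linux/pkt_cls.h.

```python
import random

# tc action verdicts, same numeric values as in linux/pkt_cls.h
TC_ACT_OK = 0    # let the packet through
TC_ACT_SHOT = 2  # drop the packet

# Hypothetical option slots. In a real tc eBPF program these would be
# keys into a BPF map that user space fills in; here a dict plays that role.
OPT_MAX = 0      # drop every Nth ack (0 or missing = disabled)
OPT_RANDOM = 1   # ASSUMPTION: drop each ack with probability 1/N


class AckImpairment:
    """Model of a per-packet classifier that takes its args from a map."""

    def __init__(self, config, rng=None):
        self.config = config              # shared with "user space"
        self.rng = rng or random.Random()
        self.ack_count = 0                # per-program state, like a map slot

    def classify(self, is_ack):
        """Return a tc verdict for one packet."""
        if not is_ack:
            return TC_ACT_OK
        self.ack_count += 1

        # Options are looked up per packet, so edits take effect at once.
        every = self.config.get(OPT_MAX, 0)
        if every and self.ack_count % every == 0:
            return TC_ACT_SHOT

        chance = self.config.get(OPT_RANDOM, 0)
        if chance and self.rng.randrange(chance) == 0:
            return TC_ACT_SHOT

        return TC_ACT_OK
```

With config = {OPT_MAX: 16} the model drops every 16th ack; writing
config[OPT_MAX] = 4 afterwards changes the behaviour of the same running
instance, which is the property the map approach is meant to give. The
gnarlier options (drop sack, mangle timestamps) would need real packet
parsing and are not modelled here.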