[Cake] Cake Digest, Vol 41, Issue 3

Eduardo Simonetti eduardo.simonetti at gmail.com
Thu Aug 2 17:03:45 EDT 2018




> On Aug 1, 2018, at 3:48 PM, cake-request at lists.bufferbloat.net wrote:
> 
> Today's Topics:
> 
>   1. passing args to bpf programs (Dave Taht)
>   2. Re: passing args to bpf programs (Stephen Hemminger)
>   3. Re: passing args to bpf programs (Dave Taht)
>   4. Re: passing args to bpf programs (Jonathan Morton)
>   5. Re: passing args to bpf programs (Dave Taht)
>   6. Re: passing args to bpf programs (Dave Taht)
>   7. codel in ebpf? (Dave Taht)
>   8. fq_codel on netronome's NICs? (Dave Taht)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 1 Aug 2018 09:22:41 -0700
> From: Dave Taht <dave.taht at gmail.com>
> To: Cake List <cake at lists.bufferbloat.net>
> Subject: [Cake] passing args to bpf programs
> Message-ID:
>    <CAA93jw4YyAfgyFX-6_HTMaCdhfsWVt=V3eQ5uUzH78WuunVLRw at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> this really isn't the right list for this... but I wanted to build on
> the ack_filter bpf code I had, to create impairments, like dropping
> acks every X packets, or randomly, or when a specific pattern is seen
> (like timestamps or sack). This was sort of the reverse complement to
> getting the cake ack-filter right, now that I know everything that can
> go wrong...
> 
> I see I can return TC_ACT_SHOT, so I can drop packets.
> 
> But what I can't quite figure out is how to pass args to a tc ebpf
> program. Do I have to pass those via a file descriptor? A map
> generated elsewhere? what? Sure as heck don't want to compile one
> program per opt....
> 
> Simplest args would be:
> 
> max 16 - drop every 16th ack packet
> random 24 - drop randomly between 0 and 24
> match only certain flags
> 
> followed by more gnarly ones like:
> 
> miscalculate if I have a payload or not
> drop sack
> mangle timestamps
> 
> -- 
> 
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Wed, 1 Aug 2018 09:35:22 -0700
> From: Stephen Hemminger <stephen at networkplumber.org>
> To: Dave Taht <dave.taht at gmail.com>
> Cc: Cake List <cake at lists.bufferbloat.net>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID: <20180801093522.22c1f043 at xeon-e3>
> Content-Type: text/plain; charset=US-ASCII
> 
> On Wed, 1 Aug 2018 09:22:41 -0700
> Dave Taht <dave.taht at gmail.com> wrote:
> 
>> this really isn't the right list for this... but I wanted to build on
>> the ack_filter bpf code I had, to create impairments, like dropping
>> acks every X packets, or randomly, or when a specific pattern is seen
>> (like timestamps or sack). This was sort of the reverse complement to
>> getting the cake ack-filter right, now that I know everything that can
>> go wrong...
>> 
>> I see I can return TC_ACT_SHOT, so I can drop packets.
>> 
>> But what I can't quite figure out is how to pass args to a tc ebpf
>> program. Do I have to pass those via a file descriptor? A map
>> generated elsewhere? what? Sure as heck don't want to compile one
>> program per opt....
>> 
>> Simplest args would be:
>> 
>> max 16 - drop every 16th ack packet
>> random 24 - drop randomly between 0 and 24
>> match only certain flags
>> 
>> followed by more gnarly ones like:
>> 
>> miscalculate if I have a payload or not
>> drop sack
>> mangle timestamps
>> 
> 
> With Xnetem, I ended up creating a map of config options.
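
One way to read the "map of config options" suggestion is to keep the knobs in a single record that userspace writes once and the filter reads per packet. The sketch below is plain userspace C so that it runs without kernel headers. In an actual tc program the record would presumably be the value of a one-entry BPF array map fetched with bpf_map_lookup_elem(), the random number would come from bpf_get_prandom_u32(), and a "drop" verdict would be returned as TC_ACT_SHOT. The struct, field names, and mode values are hypothetical, and reading "random 24" as "drop one ACK in 24 on average" is an assumption about what was meant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical impairment modes, mirroring the "max" and "random" args. */
enum ack_mode { ACK_PASS = 0, ACK_EVERY_NTH = 1, ACK_RANDOM = 2 };

/* Hypothetical config record; userspace would write this once. */
struct ack_cfg {
    uint32_t mode;   /* one of enum ack_mode */
    uint32_t param;  /* N: every Nth ACK, or one in N on average */
};

/* Per-filter state; in BPF this would have to live in a map as well. */
struct ack_state {
    uint32_t acks_seen;
};

/* Returns true if this pure ACK should be dropped.
 * rnd is a caller-supplied random number. */
static bool ack_should_drop(const struct ack_cfg *cfg, struct ack_state *st,
                            uint32_t rnd)
{
    if (cfg->param == 0)
        return false;              /* avoid modulo by zero; treat as pass */

    switch (cfg->mode) {
    case ACK_EVERY_NTH:
        st->acks_seen++;
        return st->acks_seen % cfg->param == 0;
    case ACK_RANDOM:
        return rnd % cfg->param == 0;
    default:
        return false;
    }
}
```

With mode ACK_EVERY_NTH and param 16, a run of 64 ACKs yields four drops. Changing the impairment then means rewriting the record, not recompiling the program.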
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Wed, 1 Aug 2018 09:36:32 -0700
> From: Dave Taht <dave.taht at gmail.com>
> To: Cake List <cake at lists.bufferbloat.net>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID:
>    <CAA93jw6yZ3=Nrt6uRVp=c94TfXWt4q5bwaUM=eq1K4dONGgopQ at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> A somewhat related goal would be to apply the codel algorithm via bpf.
> We'd take advantage of hardware
> multiqueue for the fq part, ensure a good timestamp always existed on
> all ingress ports, check it on egress.
> 
> The one major loop in codel we could unroll to a fixed depth (and
> just give up beyond that), and we're done there.
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Wed, 1 Aug 2018 19:42:02 +0300
> From: Jonathan Morton <chromatix99 at gmail.com>
> To: Dave Taht <dave.taht at gmail.com>
> Cc: Cake List <cake at lists.bufferbloat.net>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID: <FD8357DC-5238-42AB-A8B5-9ADD4E4D6CC7 at gmail.com>
> Content-Type: text/plain;    charset=us-ascii
> 
>> On 1 Aug, 2018, at 7:36 pm, Dave Taht <dave.taht at gmail.com> wrote:
>> 
>> The one major loop in codel we could unroll to a fixed depth (and
>> just give up beyond that), and we're done there.
> 
> The COBALT version only has a loop in the recovery phase, and that is mainly to handle long pauses immediately following heavy congestion. The idle and marking phases do not loop.
> 
> - Jonathan Morton
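
The "fixed unroll and give up" idea can be sketched as a catch-up loop with a hard cap on iterations, which matters because the BPF verifier of this era rejected unbounded loops. This is a userspace illustration, not the kernel's codel or COBALT code. MAX_DROPS_PER_DEQUEUE, the struct, and the function names are hypothetical, and the floor integer square root is a coarse stand-in for a proper reciprocal square root. The sketch also assumes the queue is already in the dropping state and that every head packet is still above target, which is the worst case for the loop.

```c
#include <stdint.h>

#define MAX_DROPS_PER_DEQUEUE 4   /* hypothetical unroll depth */

struct codel_state {
    uint64_t drop_next_ns;  /* time of the next scheduled drop */
    uint32_t count;         /* drops since entering the dropping state */
};

/* Floor integer square root; the trip count is fixed at 16. */
static uint32_t isqrt32(uint32_t x)
{
    uint32_t res = 0, bit = 1u << 30;
    for (int i = 0; i < 16; i++) {
        if (x >= res + bit) {
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* Next drop time: t + interval / sqrt(count), in integer arithmetic. */
static uint64_t control_law(uint64_t t_ns, uint64_t interval_ns, uint32_t count)
{
    uint32_t root = isqrt32(count);
    if (root == 0)
        root = 1;                 /* guard against division by zero */
    return t_ns + interval_ns / root;
}

/* Returns how many packets to drop on this dequeue, never more than
 * MAX_DROPS_PER_DEQUEUE; any remaining backlog waits for the next call. */
static uint32_t codel_drops_due(struct codel_state *st, uint64_t now_ns,
                                uint64_t interval_ns)
{
    uint32_t drops = 0;
    for (int i = 0; i < MAX_DROPS_PER_DEQUEUE; i++) {
        if (now_ns < st->drop_next_ns)
            break;
        st->count++;
        drops++;
        st->drop_next_ns = control_law(st->drop_next_ns, interval_ns, st->count);
    }
    return drops;
}
```

After a long pause the function stops at the cap and leaves drop_next_ns in the past, so the next dequeue resumes the catch-up instead of looping here.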
> 
> 
> 
> ------------------------------
> 
> Message: 5
> Date: Wed, 1 Aug 2018 09:54:02 -0700
> From: Dave Taht <dave.taht at gmail.com>
> To: Jonathan Morton <chromatix99 at gmail.com>
> Cc: Cake List <cake at lists.bufferbloat.net>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID:
>    <CAA93jw5M3VeL0Q3NeDg7YphrcTp7e=zrUF1H5YH4LHrLaWEC3w at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> the other thing I noticed while fiddling with bql and cake unshaped is
> that bql, too, had gained the ability to limit rates at mbit
> granularity, when I wasn't looking. I am not sure if additional
> hardware support is required, but:
> 
> https://patchwork.ozlabs.org/patch/449002/
> 
> 
> 
> 
> -- 
> 
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
> 
> 
> ------------------------------
> 
> Message: 6
> Date: Wed, 1 Aug 2018 10:25:52 -0700
> From: Dave Taht <dave.taht at gmail.com>
> To: Jonathan Morton <chromatix99 at gmail.com>
> Cc: Cake List <cake at lists.bufferbloat.net>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID:
>    <CAA93jw4_jmGGEUUKfYLbDyLf14vp5hfcotMhLtVhwTF98MnKrQ at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> I wonder if ebpf has opcode space for an invsqrt?
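
As far as I know the eBPF instruction set has neither floating point nor a square-root opcode, but a reciprocal square root can be refined with multiplies, shifts, and a subtraction, which eBPF does have. The sketch below applies one Newton-Raphson step, x' = x * (3 - count * x^2) / 2, to a Q0.32 fixed-point estimate of 1/sqrt(count). The kernel's codel takes a similar approach, but this is a standalone illustration with a hypothetical function name, not a copy of that code.

```c
#include <stdint.h>

/* One Newton-Raphson step toward 1/sqrt(count).
 * invsqrt is the current estimate in Q0.32 (0xFFFFFFFF is just under 1.0).
 * Precondition: count * invsqrt^2 < 3, otherwise the subtraction wraps. */
static uint32_t invsqrt_newton_step(uint32_t invsqrt, uint32_t count)
{
    /* x^2 in Q0.32 */
    uint32_t invsqrt2 = (uint32_t)(((uint64_t)invsqrt * invsqrt) >> 32);
    /* 3 - count * x^2 in Q32.32 */
    uint64_t val = (3ULL << 32) - (uint64_t)count * invsqrt2;

    val >>= 2;                              /* make room for the multiply */
    val = (val * invsqrt) >> (32 - 2 + 1);  /* x * (3 - count * x^2) / 2 */
    return (uint32_t)val;
}
```

Starting from 0xFFFFFFFF at count 1 and running a few steps each time count is incremented keeps the estimate close to the true value. A single step per increment is cheaper but noticeably less accurate at small counts.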
> 
> 
> ------------------------------
> 
> Message: 7
> Date: Wed, 1 Aug 2018 12:20:46 -0700
> From: Dave Taht <dave.taht at gmail.com>
> To: Jonathan Morton <chromatix99 at gmail.com>
> Cc: Cake List <cake at lists.bufferbloat.net>
> Subject: [Cake] codel in ebpf?
> Message-ID:
>    <CAA93jw5RvZSj3JBE9pS=sE29vSzNP9zonQzLBrE07XA3QAq7EQ at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
>> On Wed, Aug 1, 2018 at 10:25 AM Dave Taht <dave.taht at gmail.com> wrote:
>> 
>> I wonder if ebpf has opcode space for an invsqrt?
> 
> bpf_ktime_get_ns() exists...
> 
> one thing that I don't know if bpf can do is read/write the
> skb->tstamp field. The plan would be to rigorously write it (if not
> supplied by hw) on all ingress ports and check it on all egress ports.
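
The stamp-on-ingress, check-on-egress plan reduces to a sojourn-time comparison. This is a minimal userspace sketch with hypothetical function names. It assumes both timestamps come from the same clock, which may not hold when the ingress stamp is supplied by hardware, and it uses 5 ms, CoDel's usual default target.

```c
#include <stdbool.h>
#include <stdint.h>

#define CODEL_TARGET_NS (5ULL * 1000 * 1000)   /* 5 ms */

/* Time the packet spent between ingress stamping and egress.
 * Clamps to zero if the clocks disagree and the stamp is in the future. */
static uint64_t sojourn_ns(uint64_t ingress_ts_ns, uint64_t now_ns)
{
    return now_ns >= ingress_ts_ns ? now_ns - ingress_ts_ns : 0;
}

/* True if the packet waited longer than the CoDel target. */
static bool above_target(uint64_t ingress_ts_ns, uint64_t now_ns)
{
    return sojourn_ns(ingress_ts_ns, now_ns) > CODEL_TARGET_NS;
}
```

In a BPF program now_ns would presumably come from bpf_ktime_get_ns(); whether the ingress stamp can be read from and written to skb->tstamp is the open question raised above.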
> 
> That said, every time I've tried to do something in ebpf I hit a
> limitation I'd not thunk of yet. For example, where can you attach the
> egress filter?
> 
> My thought would be to use a bfifo -> bpf -> bql, but from what little I
> understand, it's bpf -> bfifo -> bql
> 
> -- 
> 
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
> 
> 
> ------------------------------
> 
> Message: 8
> Date: Wed, 1 Aug 2018 12:48:58 -0700
> From: Dave Taht <dave.taht at gmail.com>
> To: cerowrt-devel at lists.bufferbloat.net,  Cake List
>    <cake at lists.bufferbloat.net>, codel at lists.bufferbloat.net
> Cc: Jakub Kicinski <jakub.kicinski at netronome.com>
> Subject: [Cake] fq_codel on netronome's NICs?
> Message-ID:
>    <CAA93jw6L6F19RaeMnYz5YXL7q_3vqoipZR-0uqurqjsfsEfwFg at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> Being kind of inspired by all the tricks
> https://homes.cs.washington.edu/~arvind/papers/afq.pdf used on the
> cavium, I went looking for other smart nics to play with.
> https://open-nfp.org/resources/ looked interesting so I pinged them...
> 
> from netronome:
> 
> "I think it would be feasible to implement fq_codel on the NFP.
> 
> The hardware schedulers do not support fq_codel, so the schedulers
> would have to be implemented in one of the NFP firmware languages
> (e.g. micro-C or micro-code); the NFP hardware rings could be used for
> the queueing mechanism.  Practically, this may be one way of making it
> work:
> 
> The main worker threads could calculate the flow hash in order to
> select which ring should be used, and then issue the packet to a
> re-ordering thread.
> I believe the re-ordering thread can push the packets to the internal
> NFP rings instead of the wire.
> The scheduler thread could then make the scheduling decision, pop the
> packet from the corresponding ring, then send the packet to the
> hardware packet schedulers (or drop the packet if performing a
> head-drop), and also check the timestamp for the CoDel portion of the
> algorithm.
> The hardware packet schedulers should then transmit the packet.
> 
> 
> In terms of handling any rate-mismatch on the outgoing interface, you
> could have another thread monitor the NFP hardware packet scheduler
> queue levels.  The scheduler thread can then throttle the packet rate
> being sent to the hardware packet schedulers (unless of course it is
> okay to tail-drop at the hardware packet scheduler queues).
> 
> Finally, if the outgoing interface is not the natural point of
> congestion/rate mis-match (e.g. if the outgoing Ethernet interface is
> attached to a cable/DSL modem), the NFP hardware does have some
> support for rate-limiting the outgoing interface (e.g. limiting a 10
> Gigabit Ethernet interface down to 600 Mbps outbound), so as to move
> the congestion/rate mis-match point to the NFP, so that fq_codel can
> take effect in terms of handling the buffer bloat."
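
The first step in the quoted proposal, hashing a flow to pick a ring, can be sketched as follows. FNV-1a is used here only as a simple stand-in; it is not claimed to be what fq_codel or the NFP firmware uses. NUM_RINGS, the flow_key layout, and the function names are hypothetical; 1024 mirrors fq_codel's default flow count, not any NFP limit.

```c
#include <stdint.h>

#define NUM_RINGS 1024u   /* hypothetical ring count */

/* Hypothetical 5-tuple key. */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* Fold the low nbytes of v into an FNV-1a hash, least significant byte first. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;               /* FNV 32-bit prime */
    }
    return h;
}

/* Map a flow to a ring index in [0, NUM_RINGS). Fields are hashed one by
 * one so struct padding never enters the hash. */
static uint32_t flow_to_ring(const struct flow_key *k)
{
    uint32_t h = 2166136261u;         /* FNV 32-bit offset basis */
    h = fnv1a_mix(h, k->saddr, 4);
    h = fnv1a_mix(h, k->daddr, 4);
    h = fnv1a_mix(h, k->sport, 2);
    h = fnv1a_mix(h, k->dport, 2);
    h = fnv1a_mix(h, k->proto, 1);
    return h % NUM_RINGS;
}
```

Every packet of a flow lands on the same ring, so per-flow ordering is preserved while distinct flows spread across rings. Unrelated flows can still collide on one ring.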
> 
> -- 
> 
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
> 
> 
> ------------------------------
> 
> End of Cake Digest, Vol 41, Issue 3
> ***********************************

