From: Dave Taht
Date: Wed, 8 Aug 2018 08:13:07 -0700
To: Jonathan Morton
Cc: Cake List
Subject: Re: [Cake] ebpf policing?

On Wed, Aug 8, 2018 at 6:04 AM Jonathan Morton wrote:
>
> > On 7 Aug, 2018, at 3:12 am, Dave Taht wrote:
> >
> >> Writing a modern policer in ebpf is feasible. it's got nsec
> >> timestamping, counters, threads, lots of potential parallelism. Loops
> >> that have to be carefully bound. The worst possible config api. A nice
> >> statistics export system.
> >>
> >> Classic policers use token buckets, but anyone up for codel,
> >> time_per_byte, framing awareness and deficits, and a little aqm?
> >
> > :crickets: :)
> >
> > I started taking a stab at writing a straight tc filter for this,
> > trying to learn enough about how the tc filter subsystem works
> > today - like, can you scribble on a packet? Can you keep local state?
>
> I actually thought about it myself, but while it's undoubtedly
> *possible* to write a policer in eBPF, I don't see an overwhelmingly
> good reason to actually do so. It still makes the most sense to do it
> in C, for maximum performance, until the semantics are proved useful
> enough to bake in hardware.

for relative ease of coding, too. straight c is way easier than ebpf
c. I merely proved to myself that you could translate it to ebpf if
needed, and it is the rx side of linux that struggles to reach even
1/10th the pps of the tx side, so pushing stuff like this into an
offload engine might be long term worthwhile - ISPs also have to deal
with packet floods and ddos attacks....
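
To make the time_per_byte idea from the quoted text concrete, here is a minimal userspace sketch in C of a policer that keeps time instead of tokens. It is a GCRA-style virtual clock, which behaves like a token bucket, and it is not code from this thread: there is no codel or aqm in it, and every name (struct policer, policer_init, policer_admit, the overhead field standing in for framing awareness) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policer state; none of these names come from the thread. */
struct policer {
    uint64_t ns_per_byte; /* time one byte occupies at the policed rate */
    uint64_t burst_ns;    /* how far ahead of schedule traffic may run */
    uint32_t overhead;    /* per-packet framing bytes added to the length */
    uint64_t tat_ns;      /* earliest time the link is "free" again */
};

static void policer_init(struct policer *p, uint64_t rate_bytes_per_sec,
                         uint64_t burst_bytes, uint32_t overhead)
{
    /* Assumption: plain integer ns per byte. This is coarse at high
     * rates; a real implementation would likely use fixed point. */
    assert(rate_bytes_per_sec > 0 && rate_bytes_per_sec <= 1000000000ULL);
    p->ns_per_byte = 1000000000ULL / rate_bytes_per_sec;
    p->burst_ns = burst_bytes * p->ns_per_byte;
    p->overhead = overhead;
    p->tat_ns = 0;
}

/* Returns true if the packet conforms (pass), false if it should drop. */
static bool policer_admit(struct policer *p, uint64_t now_ns, uint32_t len)
{
    uint64_t cost = ((uint64_t)len + p->overhead) * p->ns_per_byte;
    /* An idle period earns no credit beyond "now". */
    uint64_t start = p->tat_ns > now_ns ? p->tat_ns : now_ns;

    if (start - now_ns > p->burst_ns)
        return false; /* too far ahead of schedule; state is unchanged */

    p->tat_ns = start + cost;
    return true;
}
```

With a rate of 1,000,000 bytes/s and a 1500-byte burst, two back-to-back 1000-byte packets at t=0 pass and a third at t=0 is dropped; one more passes again at t=1 ms. The state is a single timestamp per policer, which is the kind of thing that would have to live in a map or similar if this were translated to ebpf.
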

> There does still seem to be an awful lot of boilerplate in a
> netfilter action, though. Makes it harder to tease out what is
> actually going on.

Lord gawd this got complicated in the last decade.

> > What's this rcu thing do?
>
> Read-Copy-Update is a scheme for efficient, lock-free concurrent
> access, where most of the accesses are reads while changes are
> comparatively rare. Basically there's a pointer to the real data, and
> the pointer can be switched atomically after a complete new set of
> data is constructed, then the old data can be deleted once all the
> read-only references to it are released. It's a natural fit for
> configuration data, but would be a bad choice for realtime
> statistics - better to use atomic_add et al for that.

well, it is used by the distributed bstats code to collect
incrementing counters and later merge them after the rcu grace period.
So for stats ok. for state, not ok.

> It strikes me that the filter environment may differ from the qdisc
> environment in one crucial matter: concurrency. A qdisc is always
> called with a lock held, so concurrency with respect to itself is not
> a factor, but maximum throughput is limited. If that is *not* true of
> a filter action, then greater throughput should be feasible but the
> programming techniques required will be more subtle. Does anyone know
> for certain which it is?

many older filters do take a lock per packet, notably act_police. The
modernized act_skbedit doesn't.

> - Jonathan Morton
>

-- 
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
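
The pointer-switch scheme described in the RCU paragraph above can be sketched in userspace C11 as follows. This only illustrates the pattern and is not kernel code: in the kernel the corresponding primitives are rcu_dereference(), rcu_assign_pointer() and synchronize_rcu(), and this sketch deliberately leaves the grace period to the caller. The names policer_cfg, cfg_read, cfg_update, count_drop and drops_total are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical read-mostly configuration block. */
struct policer_cfg {
    uint64_t rate_bytes_per_sec;
    uint64_t burst_bytes;
};

static _Atomic(struct policer_cfg *) live_cfg; /* starts out NULL */
static atomic_uint_fast64_t drops;             /* hot-path statistic */

/* Reader: one acquire load, then read-only use of that snapshot. */
static const struct policer_cfg *cfg_read(void)
{
    return atomic_load_explicit(&live_cfg, memory_order_acquire);
}

/* Writer: build a complete new copy, then publish it with one atomic
 * exchange. The previous pointer is handed back through old_out; the
 * caller may free it only once no reader can still hold it. Detecting
 * that point (the grace period) is NOT implemented here.
 * Returns 0 on success, -1 if allocation fails. */
static int cfg_update(uint64_t rate, uint64_t burst,
                      struct policer_cfg **old_out)
{
    struct policer_cfg *fresh;

    assert(old_out != NULL);
    fresh = malloc(sizeof(*fresh));
    if (fresh == NULL)
        return -1;
    fresh->rate_bytes_per_sec = rate;
    fresh->burst_bytes = burst;
    *old_out = atomic_exchange_explicit(&live_cfg, fresh,
                                        memory_order_acq_rel);
    return 0;
}

/* Statistics use a plain atomic add rather than copy-and-swap. */
static void count_drop(void)
{
    atomic_fetch_add_explicit(&drops, 1, memory_order_relaxed);
}

static uint64_t drops_total(void)
{
    return atomic_load_explicit(&drops, memory_order_relaxed);
}
```

The split mirrors the point made in the thread: the rarely changing configuration goes behind the swapped pointer, while a counter that changes on every packet is updated in place with an atomic add. Per-CPU counters merged later, as mentioned for bstats, would be a further refinement that this sketch does not attempt.
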