* [Bloat] Fwd: Re: Unable to create htb tc classes more than 64K
       [not found] ` <9cbefe10-b172-ae2a-0ac7-d972468eb7a2@gmail.com>
@ 2019-08-26  7:35 ` Toke Høiland-Jørgensen
  2019-08-27 20:04   ` [Bloat] [Cake] " Stephen Hemminger
       [not found]   ` <CAA93jw6TWUmqsvBDT4tFPgwjGxAmm_S5bUibj16nwp1F=AwyRA@mail.gmail.com>
  1 sibling, 1 reply; 4+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-08-26  7:35 UTC
To: bloat, cake

[-- Attachment #1: Type: text/plain, Size: 214 bytes --]

Turns out that with the "earliest departure time" support in sched_fq,
it is now possible to write a shaper in eBPF, thus avoiding the global
qdisc lock in sched_htb. This is pretty cool, if you ask me! :)

-Toke

[-- Attachment #2: Type: message/rfc822, Size: 11786 bytes --]

From: Eric Dumazet <eric.dumazet@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>, Akshat Kakkar <akshat.1984@gmail.com>
Cc: Anton Danilov <littlesmilingcloud@gmail.com>, NetFilter <netfilter-devel@vger.kernel.org>, lartc <lartc@vger.kernel.org>, netdev <netdev@vger.kernel.org>
Subject: Re: Unable to create htb tc classes more than 64K
Date: Mon, 26 Aug 2019 08:32:48 +0200
Message-ID: <9cbefe10-b172-ae2a-0ac7-d972468eb7a2@gmail.com>

On 8/25/19 7:52 PM, Cong Wang wrote:
> On Wed, Aug 21, 2019 at 11:00 PM Akshat Kakkar <akshat.1984@gmail.com> wrote:
>>
>> On Thu, Aug 22, 2019 at 3:37 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> I am using ipset + iptables to classify and not filters. Besides, if
>>>> tc is allowing me to define qdisc -> classes -> qdisc -> classes
>>>> (1,2,3 ...) sort of structure (ie like the one shown in ascii tree)
>>>> then how can those lowest child classes be actually used or consumed?
>>>
>>> Just install tc filters on the lower level too.
>>
>> If I understand correctly, you are saying,
>> instead of :
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000001 fw flowid 1:10
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000002 fw flowid 1:20
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000003 fw flowid 2:10
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000004 fw flowid 2:20
>>
>>
>> I should do this: (i.e. changing parent to just immediate qdisc)
>> tc filter add dev eno2 parent 1: protocol ip prio 1 handle 0x00000001 fw flowid 1:10
>> tc filter add dev eno2 parent 1: protocol ip prio 1 handle 0x00000002 fw flowid 1:20
>> tc filter add dev eno2 parent 2: protocol ip prio 1 handle 0x00000003 fw flowid 2:10
>> tc filter add dev eno2 parent 2: protocol ip prio 1 handle 0x00000004 fw flowid 2:20
>
>
> Yes, this is what I meant.
>
>
>>
>> I tried this previously. But there is no change in the result.
>> Behaviour is exactly the same, i.e. I am still getting 100Mbps and not
>> 100kbps or 300kbps
>>
>> Besides, as I mentioned previously I am using ipset + skbprio and not
>> filters stuff. Filters I used just to test.
>>
>> ipset -N foo hash:ip,mark skbinfo
>>
>> ipset -A foo 10.10.10.10, 0x00000001 skbprio 1:10
>> ipset -A foo 10.10.10.20, 0x00000002 skbprio 1:20
>> ipset -A foo 10.10.10.30, 0x00000003 skbprio 2:10
>> ipset -A foo 10.10.10.40, 0x00000004 skbprio 2:20
>>
>> iptables -A POSTROUTING -j SET --map-set foo dst,dst --map-prio
>
> Hmm..
>
> I am not familiar with ipset, but it seems to save the skbprio into
> skb->priority, so it doesn't need TC filter to classify it again.
>
> I guess your packets might go to the direct queue of HTB, which
> bypasses the token bucket. Can you dump the stats and check?

With more than 64K 'classes' I suggest to use a single FQ qdisc [1], and
an eBPF program using EDT model (Earliest Departure Time)

The BPF program would perform the classification, then find a data structure
based on the 'class', and then update/maintain class virtual times and skb->tstamp

	TBF = bpf_map_lookup_elem(&map, &classid);

	uint64_t now = bpf_ktime_get_ns();
	uint64_t time_to_send = max(TBF->time_to_send, now);

	time_to_send += (u64)qdisc_pkt_len(skb) * NSEC_PER_SEC / TBF->rate;
	if (time_to_send > TBF->max_horizon) {
		return TC_ACT_SHOT;
	}
	TBF->time_to_send = time_to_send;
	skb->tstamp = max(time_to_send, skb->tstamp);
	if (time_to_send - now > TBF->ecn_horizon)
		bpf_skb_ecn_set_ce(skb);
	return TC_ACT_OK;

tools/testing/selftests/bpf/progs/test_tc_edt.c shows something similar.

[1] MQ + FQ if the device is multi-queues.

Note that this setup scales very well on SMP, since we no longer are forced
to use a single HTB hierarchy (protected by a single spinlock)
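
The pacing arithmetic in the message above can be tried outside the kernel. What follows is a minimal user-space sketch in plain C, not an eBPF program and not the code from test_tc_edt.c. The names `edt_class`, `edt_schedule`, and the `EDT_*` verdicts are hypothetical, the rate is assumed to be in bytes per second, and `max_horizon_ns` is assumed to be a delay relative to `now_ns` (the pseudocode compares it against the absolute time, which is left as written above).

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-class state, modeled on the TBF fields in the message. */
struct edt_class {
    uint64_t rate_bps;       /* assumed unit: bytes per second */
    uint64_t time_to_send;   /* ns: virtual time up to which the class is busy */
    uint64_t max_horizon_ns; /* assumption: maximum queueing delay, relative to now */
    uint64_t ecn_horizon_ns; /* delay above which the packet would be CE-marked */
};

/* Stand-ins for TC_ACT_OK, TC_ACT_OK plus CE mark, and TC_ACT_SHOT. */
enum edt_verdict { EDT_SEND, EDT_SEND_CE, EDT_DROP };

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/*
 * Compute the departure time of one packet of pkt_len bytes.
 * *tstamp_ns plays the role of skb->tstamp: it is only ever moved later.
 */
static enum edt_verdict edt_schedule(struct edt_class *c, uint64_t now_ns,
                                     uint32_t pkt_len, uint64_t *tstamp_ns)
{
    assert(c->rate_bps > 0);

    /* Start from the later of "now" and the class virtual time. */
    uint64_t t = max_u64(c->time_to_send, now_ns);

    /* Charge the packet's transmission time at the class rate. */
    t += (uint64_t)pkt_len * NSEC_PER_SEC / c->rate_bps;

    /* Too far in the future: drop and leave the class state untouched. */
    if (t - now_ns > c->max_horizon_ns)
        return EDT_DROP;

    c->time_to_send = t;
    *tstamp_ns = max_u64(t, *tstamp_ns);

    return (t - now_ns > c->ecn_horizon_ns) ? EDT_SEND_CE : EDT_SEND;
}
```

With a rate of 1,000,000 bytes per second, each 1000-byte packet advances the class virtual time by 1 ms, so a burst arriving at the same instant is spread out, then CE-marked, then dropped as the backlog grows past the two horizons. In the setup described in the message, FQ would be the component that holds each packet until its timestamp.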
* Re: [Bloat] [Cake] Fwd: Re: Unable to create htb tc classes more than 64K
  2019-08-26  7:35 ` [Bloat] Fwd: Re: Unable to create htb tc classes more than 64K Toke Høiland-Jørgensen
@ 2019-08-27 20:04   ` Stephen Hemminger
  2019-08-28  8:34     ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen Hemminger @ 2019-08-27 20:04 UTC
To: Toke Høiland-Jørgensen; +Cc: bloat, cake

On Mon, 26 Aug 2019 09:35:14 +0200
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> Turns out that with the "earliest departure time" support in sched_fq,
> it is now possible to write a shaper in eBPF, thus avoiding the global
> qdisc lock in sched_htb. This is pretty cool, if you ask me! :)
>
> -Toke
>

Thanks, I may use this to revisit doing netem in eBPF (xnetem).
Not having this feature was a show stopper at the time.
* Re: [Bloat] [Cake] Fwd: Re: Unable to create htb tc classes more than 64K
  2019-08-27 20:04   ` [Bloat] [Cake] " Stephen Hemminger
@ 2019-08-28  8:34     ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2019-08-28  8:34 UTC
To: bloat

On 8/27/19 10:04 PM, Stephen Hemminger wrote:
> On Mon, 26 Aug 2019 09:35:14 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> Turns out that with the "earliest departure time" support in sched_fq,
>> it is now possible to write a shaper in eBPF, thus avoiding the global
>> qdisc lock in sched_htb. This is pretty cool, if you ask me! :)
>>
>> -Toke
>>
>
> Thanks, I may use this to revisit doing netem in eBPF (xnetem).
> Not having this feature was a show stopper at the time.

Note that TCP stack got support for arbitrary per-socket delays.

Very useful to build a complex network emulator with thousands of TCP flows
with very different rtt.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a842fe1425cb20f457abd3f8ef98b468f83ca98b
[parent not found: <CAA93jw6TWUmqsvBDT4tFPgwjGxAmm_S5bUibj16nwp1F=AwyRA@mail.gmail.com>]
[parent not found: <48a3284b-e8ba-f169-6a2d-9611f8538f07@gmail.com>]
* Re: [Bloat] Unable to create htb tc classes more than 64K
       [not found]   ` <48a3284b-e8ba-f169-6a2d-9611f8538f07@gmail.com>
@ 2019-08-27 21:41     ` Dave Taht
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Taht @ 2019-08-27 21:41 UTC
To: Eric Dumazet
Cc: Cong Wang, Akshat Kakkar, Anton Danilov, NetFilter, lartc, netdev, bloat

On Tue, Aug 27, 2019 at 2:09 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 8/27/19 10:53 PM, Dave Taht wrote:
> >
> > Although this is very cool, I think in this case the OP is being
> > a router, not server?
>
> This mechanism is generic. EDT has not been designed for servers only.
>
> One HTB class (with one associated qdisc per leaf) per rate limiter
> does not scale, and consumes a _lot_ more memory.
>
> We have abandoned HTB at Google for these reasons.
>
> Nice thing with EDT is that you can stack arbitrary number of rate limiters,
> and still keep a single queue (in FQ or another layer downstream)

There's a lot of nice things about EDT! I'd followed along on the
theory, timerwheels, virtual clocks, etc, and went seeking ethernet hw
that could do it (directly) on the low end and came up empty - and
doing anything with the concept required a complete rethink on
everything we were already doing in wifi/fq_codel/cake ;(, and after we
shipped cake in 4.19, I bought a sailboat, and logged out for a while.

The biggest problem bufferbloat.net has left is more efficient inbound
shaping/policing on cheap hw. I don't suppose you've solved that
already? :puppy dog eyes:

Next year's version of openwrt we can maybe try to do something
coherent with EDT.

>

--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
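
The remark about stacking rate limiters while keeping a single queue can be illustrated with a small user-space sketch in plain C. This is one possible reading, not code from the thread: each limiter keeps only its own virtual clock, and the packet carries the latest departure time any of them computed, in the same spirit as the `skb->tstamp = max(time_to_send, skb->tstamp)` step in Eric's earlier pseudocode. The names `edt_limiter` and `edt_stack` are hypothetical, and rates are assumed to be in bytes per second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical limiter: a rate and a virtual clock, with no queue of its own. */
struct edt_limiter {
    uint64_t rate_bps; /* assumed unit: bytes per second */
    uint64_t next_ns;  /* ns: time up to which this limiter is busy */
};

/*
 * Run one packet through every limiter in the chain (for example a per-user
 * limiter followed by a per-group limiter) and return the single departure
 * time that a downstream queue such as FQ would honor.
 */
static uint64_t edt_stack(struct edt_limiter **chain, size_t n,
                          uint64_t now_ns, uint32_t pkt_len)
{
    uint64_t departure = now_ns;

    for (size_t i = 0; i < n; i++) {
        struct edt_limiter *l = chain[i];

        assert(l->rate_bps > 0);

        /* Each limiter advances its own clock independently. */
        uint64_t start = l->next_ns > now_ns ? l->next_ns : now_ns;
        uint64_t t = start + (uint64_t)pkt_len * NSEC_PER_SEC / l->rate_bps;

        l->next_ns = t;

        /* The slowest limiter determines when the packet may leave. */
        if (t > departure)
            departure = t;
    }
    return departure;
}
```

Adding another limiter costs one more clock update per packet rather than another qdisc and queue, which fits the memory argument made against one HTB class per rate limiter.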