* [Bloat] Fwd: Re: Unable to create htb tc classes more than 64K
From: Toke Høiland-Jørgensen @ 2019-08-26 7:35 UTC
To: bloat, cake
[-- Attachment #1: Type: text/plain, Size: 214 bytes --]
Turns out that with the "earliest departure time" support in sched_fq,
it is now possible to write a shaper in eBPF, thus avoiding the global
qdisc lock in sched_htb. This is pretty cool, if you ask me! :)
-Toke
[-- Attachment #2: Type: message/rfc822, Size: 11786 bytes --]
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>, Akshat Kakkar <akshat.1984@gmail.com>
Cc: Anton Danilov <littlesmilingcloud@gmail.com>, NetFilter <netfilter-devel@vger.kernel.org>, lartc <lartc@vger.kernel.org>, netdev <netdev@vger.kernel.org>
Subject: Re: Unable to create htb tc classes more than 64K
Date: Mon, 26 Aug 2019 08:32:48 +0200
Message-ID: <9cbefe10-b172-ae2a-0ac7-d972468eb7a2@gmail.com>
On 8/25/19 7:52 PM, Cong Wang wrote:
> On Wed, Aug 21, 2019 at 11:00 PM Akshat Kakkar <akshat.1984@gmail.com> wrote:
>>
>> On Thu, Aug 22, 2019 at 3:37 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> I am using ipset + iptables to classify, not filters. Besides, if
>>>> tc allows me to define a qdisc -> classes -> qdisc -> classes
>>>> (1,2,3 ...) sort of structure (i.e. like the one shown in the ASCII tree),
>>>> then how can those lowest child classes actually be used or consumed?
>>>
>>> Just install tc filters on the lower level too.
>>
>> If I understand correctly, you are saying,
>> instead of :
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000001 fw flowid 1:10
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000002 fw flowid 1:20
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000003 fw flowid 2:10
>> tc filter add dev eno2 parent 100: protocol ip prio 1 handle 0x00000004 fw flowid 2:20
>>
>>
>> I should do this: (i.e. changing parent to just immediate qdisc)
>> tc filter add dev eno2 parent 1: protocol ip prio 1 handle 0x00000001 fw flowid 1:10
>> tc filter add dev eno2 parent 1: protocol ip prio 1 handle 0x00000002 fw flowid 1:20
>> tc filter add dev eno2 parent 2: protocol ip prio 1 handle 0x00000003 fw flowid 2:10
>> tc filter add dev eno2 parent 2: protocol ip prio 1 handle 0x00000004 fw flowid 2:20
>
>
> Yes, this is what I meant.
>
>
>>
>> I tried this previously, but there is no change in the result.
>> The behaviour is exactly the same, i.e. I am still getting 100Mbps and not
>> 100kbps or 300kbps.
>>
>> Besides, as I mentioned previously I am using ipset + skbprio and not
>> filters stuff. Filters I used just to test.
>>
>> ipset -N foo hash:ip,mark skbinfo
>>
>> ipset -A foo 10.10.10.10,0x00000001 skbprio 1:10
>> ipset -A foo 10.10.10.20,0x00000002 skbprio 1:20
>> ipset -A foo 10.10.10.30,0x00000003 skbprio 2:10
>> ipset -A foo 10.10.10.40,0x00000004 skbprio 2:20
>>
>> iptables -A POSTROUTING -j SET --map-set foo dst,dst --map-prio
>
> Hmm..
>
> I am not familiar with ipset, but it seems to save the skbprio into
> skb->priority, so it doesn't need TC filter to classify it again.
>
> I guess your packets might go to the direct queue of HTB, which
> bypasses the token bucket. Can you dump the stats and check?

With more than 64K 'classes', I suggest using a single FQ qdisc [1] and
an eBPF program using the EDT (Earliest Departure Time) model.

The BPF program would perform the classification, look up a per-class data
structure based on the 'class', and then update the class virtual time and
skb->tstamp:
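
    /* Per-packet logic; TBF is the per-class state stored in a BPF map. */
    TBF = bpf_map_lookup_elem(&map, &classid);
    if (!TBF)               /* the verifier requires a NULL check */
        return TC_ACT_OK;

    uint64_t now = bpf_ktime_get_ns();
    uint64_t time_to_send = max(TBF->time_to_send, now);

    /* Advance the class virtual time by this packet's transmit time. */
    time_to_send += (uint64_t)skb->len * NSEC_PER_SEC / TBF->rate;
    if (time_to_send - now > TBF->max_horizon)
        return TC_ACT_SHOT; /* too far in the future: drop */

    TBF->time_to_send = time_to_send;
    skb->tstamp = max(time_to_send, skb->tstamp);
    if (time_to_send - now > TBF->ecn_horizon)
        bpf_skb_ecn_set_ce(skb);
    return TC_ACT_OK;
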
tools/testing/selftests/bpf/progs/test_tc_edt.c shows something similar.

[1] MQ + FQ if the device is multi-queue.

Note that this setup scales very well on SMP, since we are no longer forced
to use a single HTB hierarchy (protected by a single spinlock).
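
To make the arithmetic above easier to follow, here is a minimal userspace sketch of the same per-class virtual-time update. It is not BPF code and not taken from the thread: the struct layout, the function name edt_schedule() and the use of a relative horizon are assumptions made for illustration, and ECN marking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-class state, mirroring the TBF fields in the pseudocode. */
struct edt_class {
    uint64_t rate;         /* bytes per second */
    uint64_t time_to_send; /* class virtual time, in ns */
    uint64_t max_horizon;  /* longest tolerated wait, in ns */
};

/*
 * Charge one packet of len bytes to the class at time now (ns).
 * Returns true and stores the departure time in *departure, or returns
 * false (and leaves the class untouched) if the packet would have to
 * wait longer than max_horizon and should be dropped.
 */
static bool edt_schedule(struct edt_class *c, uint64_t now, uint32_t len,
                         uint64_t *departure)
{
    assert(c->rate > 0);

    /* An idle class restarts from the current time. */
    uint64_t t = c->time_to_send > now ? c->time_to_send : now;

    /* Advance by the time this packet occupies at the class rate. */
    t += (uint64_t)len * NSEC_PER_SEC / c->rate;
    if (t - now > c->max_horizon)
        return false;

    c->time_to_send = t;
    *departure = t;
    return true;
}
```

With rate = 12500 bytes/s (100 kbit/s) and 1250-byte packets, each packet pushes the class virtual time forward by 100 ms, so a burst arriving at the same instant is spread out at the configured rate until the horizon is exceeded.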
* Re: [Bloat] [Cake] Fwd: Re: Unable to create htb tc classes more than 64K
From: Stephen Hemminger @ 2019-08-27 20:04 UTC
To: Toke Høiland-Jørgensen; +Cc: bloat, cake
On Mon, 26 Aug 2019 09:35:14 +0200
Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> Turns out that with the "earliest departure time" support in sched_fq,
> it is now possible to write a shaper in eBPF, thus avoiding the global
> qdisc lock in sched_htb. This is pretty cool, if you ask me! :)
>
> -Toke
>
Thanks, I may use this to revisit doing netem in eBPF (xnetem).
Not having this feature was a show stopper at the time.
* Re: [Bloat] Unable to create htb tc classes more than 64K
From: Dave Taht @ 2019-08-27 21:41 UTC
To: Eric Dumazet
Cc: Cong Wang, Akshat Kakkar, Anton Danilov, NetFilter, lartc, netdev, bloat
On Tue, Aug 27, 2019 at 2:09 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 8/27/19 10:53 PM, Dave Taht wrote:
> >
> > Although this is very cool, I think in this case the OP is being
> > a router, not server?
>
> This mechanism is generic. EDT has not been designed for servers only.
>
> One HTB class (with one associated qdisc per leaf) per rate limiter
> does not scale, and consumes a _lot_ more memory.
>
> We have abandoned HTB at Google for these reasons.
>
> Nice thing with EDT is that you can stack arbitrary number of rate limiters,
> and still keep a single queue (in FQ or another layer downstream)
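
One way to picture the stacking described in the quoted text: each rate limiter keeps its own virtual time, and the packet leaves at the latest of the times they compute, which is what the max() against skb->tstamp achieves in the earlier pseudocode. The following userspace sketch assumes that reading; struct limiter and edt_stack() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical limiter state: rate in bytes/s, virtual time in ns. */
struct limiter {
    uint64_t rate;
    uint64_t next;
};

/*
 * Charge a packet of len bytes to every limiter in the chain and return
 * the single departure time (ns) that satisfies all of them.
 */
static uint64_t edt_stack(struct limiter *lim, size_t n, uint64_t now,
                          uint32_t len)
{
    uint64_t tstamp = now;

    for (size_t i = 0; i < n; i++) {
        assert(lim[i].rate > 0);
        uint64_t t = lim[i].next > now ? lim[i].next : now;
        t += (uint64_t)len * NSEC_PER_SEC / lim[i].rate;
        lim[i].next = t;
        if (t > tstamp)
            tstamp = t; /* the slowest limiter decides */
    }
    return tstamp;
}
```

Only one timestamp per packet comes out of the chain, so a single downstream queue can order packets by it regardless of how many limiters were applied.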

There's a lot of nice things about EDT! I'd followed along on the
theory, timerwheels, virtual clocks, etc, and went
seeking ethernet hw that could do it (directly) on the low end and
came up empty - and doing anything with the concept required a
complete rethink on everything we were already doing in
wifi/fq_codel/cake ;(, and after we shipped cake in 4.19, I bought a
sailboat, and logged out for a while.

The biggest problem bufferbloat.net has left is more efficient inbound
shaping/policing on cheap hw.

I don't suppose you've solved that already? :puppy dog eyes:

Next year's version of openwrt we can maybe try to do something
coherent with EDT.
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
* Re: [Bloat] [Cake] Fwd: Re: Unable to create htb tc classes more than 64K
From: Eric Dumazet @ 2019-08-28 8:34 UTC
To: bloat
On 8/27/19 10:04 PM, Stephen Hemminger wrote:
> On Mon, 26 Aug 2019 09:35:14 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> Turns out that with the "earliest departure time" support in sched_fq,
>> it is now possible to write a shaper in eBPF, thus avoiding the global
>> qdisc lock in sched_htb. This is pretty cool, if you ask me! :)
>>
>> -Toke
>>
>
> Thanks, I may use this to revisit doing netem in eBPF (xnetem).
> Not having this feature was a show stopper at the time.

Note that the TCP stack got support for arbitrary per-socket delays.
This is very useful for building a complex network emulator with thousands
of TCP flows with very different RTTs.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a842fe1425cb20f457abd3f8ef98b468f83ca98b
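
For readers who want to try the per-socket delay, a minimal sketch follows. It assumes the linked commit is the one that added the TCP_TX_DELAY socket option (delay in microseconds, option number 37 in linux/tcp.h); the fallback define and the helper name set_tx_delay() are additions for illustration, and the call only succeeds on kernels that carry the feature.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_TX_DELAY
#define TCP_TX_DELAY 37 /* assumed value from linux/tcp.h; older libc headers lack it */
#endif

/*
 * Ask the kernel to delay this TCP socket's outgoing packets by delay_us
 * microseconds. Returns 0 on success, -1 on error (errno is set by
 * setsockopt, e.g. when the kernel does not support the option).
 */
static int set_tx_delay(int fd, int delay_us)
{
    assert(delay_us >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_TX_DELAY,
                      &delay_us, sizeof(delay_us));
}
```

An emulator could call this once per flow with a different value, for example set_tx_delay(fd, 10000) for a flow that should see an extra 10 ms in one direction.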