From: Dave Taht
Date: Fri, 24 Nov 2017 13:28:53 -0800
To: Cake List
Subject: [Cake] Fwd: testers wanted for the "cake" qdisc on mainstream platforms

Jesper, unlike the rest of us, is regularly testing against 10Gig+ hw,
and did a brief test run against cake.

---------- Forwarded message ----------
From: Jesper Dangaard Brouer
Date: Thu, Nov 23, 2017 at 11:51 PM
Subject: Re: testers wanted for the "cake" qdisc on mainstream platforms
To: Dave Taht
Cc: brouer@redhat.com

On Thu, 23 Nov 2017 12:46:25 -0800 Dave Taht wrote:

> you are the only person I know with 10GigE+ hardware.
>
> I'm kind of dying to know if ack filtering has any effect there, and
> how close we get to line rates these days, unshaped.

With nf_conntrack unloaded, Cake can do 10G with large 1514-byte
packets, ~820 Kpps. With small packets it (obviously) cannot. Details
below.

Quick eval on Broadwell
=======================

Host Broadwell: CPU E5-1650 v4 @ 3.60GHz
NICs: Intel ixgbe

Run pktgen on ingress ixgbe1, and forward packets egress ixgbe2.
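The "nf_conntrack unloaded" precondition matters because connection
tracking adds a per-packet lookup on the forwarding path. A hedged
sketch of taking it out of the picture before a run like this (module
names vary per kernel config; extra dependent modules may need removing
first):

```shell
# Remove connection tracking modules before the forwarding benchmark.
# This is a sketch: the exact set of dependent modules differs between
# kernel configs, so modprobe -r may need additional names.
sudo modprobe -r nf_conntrack_netlink nf_conntrack_ipv4 nf_conntrack 2>/dev/null

# Verify nothing conntrack-related remains loaded
lsmod | grep nf_conntrack || echo "conntrack unloaded"
```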
Notice: 10G wirespeed PPS varies greatly with packet size:
 - Smallest 64-byte packets (+ interframe gap + MAC preamble = 84 bytes)
 - @64 bytes   = 14.88 Mpps
 - @1538 bytes = 813 Kpps

pktgen01: single flow small packets
-----------------------------------

Install qdisc::

 sudo $TC qdisc add dev ixgbe2 root cake bandwidth 10000Mbit ack-filter

Pktgen script, small 64-byte pkts::

 ./pktgen_sample03_burst_single_flow.sh -vi ixgbe2 \
   -m 00:1b:21:bb:9a:84 -d 10.10.10.1 -t3

Quick nstat::

 $ nstat > /dev/null && sleep 1 && nstat
 #kernel
 IpInReceives                    1584842            0.0
 IpForwDatagrams                 1584842            0.0
 IpOutRequests                   1584842            0.0
 IpExtInOctets                   72902318           0.0
 IpExtOutOctets                  139465304          0.0
 IpExtInNoECTPkts                1584832            0.0

ethtool_stats::

 $ ethtool_stats.pl --dev ixgbe1 --dev ixgbe2 --sec 3
 [...]
 Show adapter(s) (ixgbe1 ixgbe2) statistics (ONLY that changed!)
 Ethtool(ixgbe1 ) stat:    9695260 (    9,695,260) <= fdir_miss /sec
 Ethtool(ixgbe1 ) stat:   94619772 (   94,619,772) <= rx_bytes /sec
 Ethtool(ixgbe1 ) stat:  952572687 (  952,572,687) <= rx_bytes_nic /sec
 Ethtool(ixgbe1 ) stat:    5245486 (    5,245,486) <= rx_missed_errors /sec
 Ethtool(ixgbe1 ) stat:    8061456 (    8,061,456) <= rx_no_dma_resources /sec
 Ethtool(ixgbe1 ) stat:    1576996 (    1,576,996) <= rx_packets /sec
 Ethtool(ixgbe1 ) stat:    9638460 (    9,638,460) <= rx_pkts_nic /sec
 Ethtool(ixgbe1 ) stat:   94619772 (   94,619,772) <= rx_queue_4_bytes /sec
 Ethtool(ixgbe1 ) stat:    1576996 (    1,576,996) <= rx_queue_4_packets /sec
 Ethtool(ixgbe2 ) stat:   91526011 (   91,526,011) <= tx_bytes /sec
 Ethtool(ixgbe2 ) stat:  100993190 (  100,993,190) <= tx_bytes_nic /sec
 Ethtool(ixgbe2 ) stat:    1578035 (    1,578,035) <= tx_packets /sec
 Ethtool(ixgbe2 ) stat:    1578019 (    1,578,019) <= tx_pkts_nic /sec
 Ethtool(ixgbe2 ) stat:   91526011 (   91,526,011) <= tx_queue_4_bytes /sec
 Ethtool(ixgbe2 ) stat:    1578035 (    1,578,035) <= tx_queue_4_packets /sec

Perf report::

 Samples: 72K of event 'cycles:ppp', Event count (approx.): 66265541749
   Overhead  CPU  Shared Object      Symbol
 - 10.60%  004  [kernel.vmlinux]  [k] _raw_spin_lock
    - _raw_spin_lock
       + 10.04% sch_direct_xmit
 +  6.87%  004  [kernel.vmlinux]  [k] flow_hash_from_keys
 +  6.26%  004  [kernel.vmlinux]  [k] cake_dequeue
 +  6.12%  004  [kernel.vmlinux]  [k] fib_table_lookup
 +  5.32%  004  [kernel.vmlinux]  [k] ixgbe_poll
 +  4.62%  004  [kernel.vmlinux]  [k] ixgbe_xmit_frame_ring
 +  3.17%  004  [kernel.vmlinux]  [k] __skb_flow_dissect
 +  2.99%  004  [kernel.vmlinux]  [k] cake_enqueue
 +  2.89%  004  [kernel.vmlinux]  [k] ip_route_input_rcu
 +  2.68%  004  [kernel.vmlinux]  [k] ktime_get
 +  2.47%  004  [kernel.vmlinux]  [k] ip_rcv
 +  2.31%  004  [kernel.vmlinux]  [k] ip_finish_output2
 +  2.23%  004  [kernel.vmlinux]  [k] __netif_receive_skb_core
 +  2.14%  004  [kernel.vmlinux]  [k] cake_hash
 +  2.04%  004  [kernel.vmlinux]  [k] __dev_queue_xmit
 +  2.00%  004  [kernel.vmlinux]  [k] inet_gro_receive
 +  1.87%  004  [kernel.vmlinux]  [k] udp_v4_early_demux
 +  1.79%  004  [kernel.vmlinux]  [k] dev_gro_receive
 +  1.59%  004  [kernel.vmlinux]  [k] __qdisc_run
 +  1.51%  004  [kernel.vmlinux]  [k] ip_rcv_finish
 +  1.48%  004  [kernel.vmlinux]  [k] __build_skb
 +  1.41%  004  [kernel.vmlinux]  [k] ip_forward
 +  1.21%  004  [kernel.vmlinux]  [k] dev_hard_start_xmit
 +  1.13%  004  [kernel.vmlinux]  [k] sch_direct_xmit
 +  1.08%  004  [kernel.vmlinux]  [k] __local_bh_enable_ip
 +  1.05%  004  [kernel.vmlinux]  [k] build_skb
 +  0.91%  004  [kernel.vmlinux]  [k] fib_validate_source
 +  0.87%  004  [kernel.vmlinux]  [k] ip_finish_output
 +  0.87%  004  [kernel.vmlinux]  [k] read_tsc
 +  0.83%  004  [kernel.vmlinux]  [k] kmem_cache_alloc
 +  0.81%  004  [kernel.vmlinux]  [k] ___slab_alloc
 +  0.72%  004  [kernel.vmlinux]  [k] kmem_cache_free_bulk
 +  0.67%  004  [kernel.vmlinux]  [k] cake_advance_shaper

In the single flow test the overhead in _raw_spin_lock is FAKE.
It is actually caused by the NIC tailptr/doorbell on TX.
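The wirespeed figures quoted above follow directly from frame size plus
the fixed 20 bytes of preamble/SFD and interframe gap that every
Ethernet frame occupies on the wire; a quick shell sanity check:

```shell
# Line-rate pps = link bit rate / (bits per frame on the wire).
# Each frame carries 20 extra bytes on the wire:
# 8 bytes preamble+SFD and 12 bytes interframe gap.
bits_per_sec=10000000000

# 64-byte frame (incl. FCS) -> 84 bytes on the wire
echo $(( bits_per_sec / ((64 + 20) * 8) ))     # ~14.88 Mpps

# 1514-byte packet + 4-byte FCS = 1518, -> 1538 bytes on the wire
echo $(( bits_per_sec / ((1518 + 20) * 8) ))   # ~813 Kpps
```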
pktgen02: 8 x flows small packets
---------------------------------

Pktgen 8x flows::

 ./pktgen_sample05_flow_per_thread.sh -vi ixgbe2 \
   -m 00:1b:21:bb:9a:84 -d 10.10.10.1 -t8

Quick nstat 8x flows::

 $ nstat > /dev/null && sleep 1 && nstat
 #kernel
 IpInReceives                    1163458            0.0
 IpForwDatagrams                 1163458            0.0
 IpOutRequests                   1163458            0.0
 IpExtInOctets                   53519160           0.0
 IpExtOutOctets                  102384480          0.0
 IpExtInNoECTPkts                1163461            0.0

ethtool_stats 8x flows::

 Show adapter(s) (ixgbe1 ixgbe2) statistics (ONLY that changed!)
 Ethtool(ixgbe1 ) stat:      92911 (       92,911) <= alloc_rx_page /sec
 Ethtool(ixgbe1 ) stat:    9502884 (    9,502,884) <= fdir_miss /sec
 Ethtool(ixgbe1 ) stat:   66311206 (   66,311,206) <= rx_bytes /sec
 Ethtool(ixgbe1 ) stat:  956178211 (  956,178,211) <= rx_bytes_nic /sec
 Ethtool(ixgbe1 ) stat:    5440134 (    5,440,134) <= rx_missed_errors /sec
 Ethtool(ixgbe1 ) stat:    8394967 (    8,394,967) <= rx_no_dma_resources /sec
 Ethtool(ixgbe1 ) stat:    1105187 (    1,105,187) <= rx_packets /sec
 Ethtool(ixgbe1 ) stat:    9500151 (    9,500,151) <= rx_pkts_nic /sec
 Ethtool(ixgbe1 ) stat:   11149586 (   11,149,586) <= rx_queue_0_bytes /sec
 Ethtool(ixgbe1 ) stat:     185826 (      185,826) <= rx_queue_0_packets /sec
 Ethtool(ixgbe1 ) stat:   10937155 (   10,937,155) <= rx_queue_1_bytes /sec
 Ethtool(ixgbe1 ) stat:     182286 (      182,286) <= rx_queue_1_packets /sec
 Ethtool(ixgbe1 ) stat:   11119239 (   11,119,239) <= rx_queue_2_bytes /sec
 Ethtool(ixgbe1 ) stat:     185321 (      185,321) <= rx_queue_2_packets /sec
 Ethtool(ixgbe1 ) stat:   10950508 (   10,950,508) <= rx_queue_3_bytes /sec
 Ethtool(ixgbe1 ) stat:     182508 (      182,508) <= rx_queue_3_packets /sec
 Ethtool(ixgbe1 ) stat:   10950508 (   10,950,508) <= rx_queue_4_bytes /sec
 Ethtool(ixgbe1 ) stat:     182508 (      182,508) <= rx_queue_4_packets /sec
 Ethtool(ixgbe1 ) stat:   11204211 (   11,204,211) <= rx_queue_5_bytes /sec
 Ethtool(ixgbe1 ) stat:     186737 (      186,737) <= rx_queue_5_packets /sec
 Ethtool(ixgbe2 ) stat:   62216109 (   62,216,109) <= tx_bytes /sec
 Ethtool(ixgbe2 ) stat:   68653701 (   68,653,701) <= tx_bytes_nic /sec
 Ethtool(ixgbe2 ) stat:    1072692 (    1,072,692) <= tx_packets /sec
 Ethtool(ixgbe2 ) stat:    1072714 (    1,072,714) <= tx_pkts_nic /sec
 Ethtool(ixgbe2 ) stat:    8838570 (    8,838,570) <= tx_queue_0_bytes /sec
 Ethtool(ixgbe2 ) stat:     152389 (      152,389) <= tx_queue_0_packets /sec
 Ethtool(ixgbe2 ) stat:   10584083 (   10,584,083) <= tx_queue_1_bytes /sec
 Ethtool(ixgbe2 ) stat:     182484 (      182,484) <= tx_queue_1_packets /sec
 Ethtool(ixgbe2 ) stat:   10743376 (   10,743,376) <= tx_queue_2_bytes /sec
 Ethtool(ixgbe2 ) stat:     185231 (      185,231) <= tx_queue_2_packets /sec
 Ethtool(ixgbe2 ) stat:   10599318 (   10,599,318) <= tx_queue_3_bytes /sec
 Ethtool(ixgbe2 ) stat:     182747 (      182,747) <= tx_queue_3_packets /sec
 Ethtool(ixgbe2 ) stat:   10596994 (   10,596,994) <= tx_queue_4_bytes /sec
 Ethtool(ixgbe2 ) stat:     182707 (      182,707) <= tx_queue_4_packets /sec
 Ethtool(ixgbe2 ) stat:   10853767 (   10,853,767) <= tx_queue_5_bytes /sec
 Ethtool(ixgbe2 ) stat:     187134 (      187,134) <= tx_queue_5_packets /sec

Perf report::

 Samples: 733K of event 'cycles:ppp', Event count (approx.): 691576256580
   Overhead  CPU  Shared Object      Symbol
 - 12.32%  005  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
    - queued_spin_lock_slowpath
       + 11.68% __dev_queue_xmit
       +  0.62% sch_direct_xmit
 + 12.30%  004  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 + 12.19%  003  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 + 12.03%  000  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 + 11.98%  001  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 + 11.95%  002  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 +  0.78%  002  [kernel.vmlinux]  [k] _raw_spin_lock
 +  0.76%  000  [kernel.vmlinux]  [k] _raw_spin_lock
 +  0.74%  001  [kernel.vmlinux]  [k] _raw_spin_lock
 +  0.70%  003  [kernel.vmlinux]  [k] _raw_spin_lock
 +  0.65%  005  [kernel.vmlinux]  [k] _raw_spin_lock
 +  0.64%  001  [kernel.vmlinux]  [k] cake_dequeue
 +  0.63%  004  [kernel.vmlinux]  [k] _raw_spin_lock
 +  0.53%  003  [kernel.vmlinux]  [k] cake_dequeue
 +  0.50%  000  [kernel.vmlinux]  [k] cake_dequeue
    0.49%  002  [kernel.vmlinux]  [k] cake_dequeue
    0.48%  005  [kernel.vmlinux]  [k] cake_dequeue
    0.40%  004  [kernel.vmlinux]  [k] cake_dequeue
    0.32%  002  [kernel.vmlinux]  [k] cake_enqueue
    0.32%  004  [kernel.vmlinux]  [k] cake_enqueue
    0.31%  001  [kernel.vmlinux]  [k] cake_enqueue
    0.30%  000  [kernel.vmlinux]  [k] cake_enqueue
    0.29%  005  [kernel.vmlinux]  [k] cake_enqueue
    0.28%  000  [kernel.vmlinux]  [k] __build_skb
    0.27%  003  [kernel.vmlinux]  [k] cake_enqueue
    0.27%  004  [kernel.vmlinux]  [k] __build_skb
    0.25%  003  [kernel.vmlinux]  [k] __build_skb
    0.25%  002  [kernel.vmlinux]  [k] __build_skb
    0.24%  005  [kernel.vmlinux]  [k] __build_skb
    0.22%  002  [kernel.vmlinux]  [k] __qdisc_run
    0.22%  000  [kernel.vmlinux]  [k] __qdisc_run
    0.21%  004  [kernel.vmlinux]  [k] cake_hash
    0.20%  001  [kernel.vmlinux]  [k] __build_skb
    0.19%  005  [kernel.vmlinux]  [k] cake_hash
    0.19%  002  [kernel.vmlinux]  [k] cake_hash
    0.19%  004  [kernel.vmlinux]  [k] ixgbe_poll
    0.18%  003  [kernel.vmlinux]  [k] cake_hash
    0.18%  005  [kernel.vmlinux]  [k] kmem_cache_alloc
    0.17%  001  [kernel.vmlinux]  [k] ixgbe_xmit_frame_ring
    0.17%  001  [kernel.vmlinux]  [k] cake_hash
    0.17%  000  [kernel.vmlinux]  [k] cake_hash
    0.17%  004  [kernel.vmlinux]  [k] __qdisc_run
    0.17%  002  [kernel.vmlinux]  [k] kmem_cache_alloc

In the multi-flow test the overhead/congestion on
queued_spin_lock_slowpath is real, and is simply due to cake being a
single-queue qdisc.

pktgen03: 8 x flows large packets
---------------------------------

Pktgen 8x flows large packets::

 ./pktgen_sample05_flow_per_thread.sh -vi ixgbe2 \
   -m 00:1b:21:bb:9a:84 -d 10.10.10.1 -t8 -s 1514

TX speed measured on generator: 821,361 <= tx_packets /sec

Quick nstat 8x flows large packets::

 $ nstat > /dev/null && sleep 1 && nstat
 #kernel
 IpInReceives                    824925             0.0
 IpForwDatagrams                 824925             0.0
 IpOutRequests                   824925             0.0
 IpExtInOctets                   1224145664         0.0
 IpExtOutOctets                  2448288360         0.0
 IpExtInNoECTPkts                824895             0.0

Good result, as we are at 10G wirespeed with large packets.
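The queued_spin_lock_slowpath cost in the multi-flow tests is the
classic single-queue-qdisc pattern: every CPU serializes on the one
root qdisc lock. A hedged sketch of one way to spread that contention,
attaching a cake instance per hardware TX queue under mq (device name
and queue count are assumptions; this is not something tested in the
run above, and flow isolation then only applies within each queue):

```shell
# Replace the single root cake with mq, then hang one cake per TX queue,
# so each CPU contends only on its own queue's qdisc lock.
DEV=ixgbe2
NQUEUES=6    # assumed number of hardware TX queues in use

tc qdisc replace dev $DEV root handle 1: mq
for i in $(seq 1 $NQUEUES); do
    # unlimited (no shaper) avoids having to split a bandwidth
    # budget across the per-queue cake instances
    tc qdisc replace dev $DEV parent 1:$i cake unlimited ack-filter
done
```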
ethtool_stats 8x flows large packets::

 Show adapter(s) (ixgbe1 ixgbe2) statistics (ONLY that changed!)
 Ethtool(ixgbe1 ) stat:     246451 (      246,451) <= alloc_rx_page /sec
 Ethtool(ixgbe1 ) stat:     820293 (      820,293) <= fdir_miss /sec
 Ethtool(ixgbe1 ) stat: 1226074253 (1,226,074,253) <= rx_bytes /sec
 Ethtool(ixgbe1 ) stat: 1232080619 (1,232,080,619) <= rx_bytes_nic /sec
 Ethtool(ixgbe1 ) stat:       1806 (        1,806) <= rx_no_dma_resources /sec
 Ethtool(ixgbe1 ) stat:     818474 (      818,474) <= rx_packets /sec
 Ethtool(ixgbe1 ) stat:     820286 (      820,286) <= rx_pkts_nic /sec
 Ethtool(ixgbe1 ) stat:  153599624 (  153,599,624) <= rx_queue_0_bytes /sec
 Ethtool(ixgbe1 ) stat:     102536 (      102,536) <= rx_queue_0_packets /sec
 Ethtool(ixgbe1 ) stat:  153600115 (  153,600,115) <= rx_queue_1_bytes /sec
 Ethtool(ixgbe1 ) stat:     102537 (      102,537) <= rx_queue_1_packets /sec
 Ethtool(ixgbe1 ) stat:  306006517 (  306,006,517) <= rx_queue_2_bytes /sec
 Ethtool(ixgbe1 ) stat:     204277 (      204,277) <= rx_queue_2_packets /sec
 Ethtool(ixgbe1 ) stat:  153601585 (  153,601,585) <= rx_queue_3_bytes /sec
 Ethtool(ixgbe1 ) stat:     102538 (      102,538) <= rx_queue_3_packets /sec
 Ethtool(ixgbe1 ) stat:  153603546 (  153,603,546) <= rx_queue_4_bytes /sec
 Ethtool(ixgbe1 ) stat:     102539 (      102,539) <= rx_queue_4_packets /sec
 Ethtool(ixgbe1 ) stat:  305662865 (  305,662,865) <= rx_queue_5_bytes /sec
 Ethtool(ixgbe1 ) stat:     204047 (      204,047) <= rx_queue_5_packets /sec
 Ethtool(ixgbe2 ) stat: 1223143065 (1,223,143,065) <= tx_bytes /sec
 Ethtool(ixgbe2 ) stat: 1226420442 (1,226,420,442) <= tx_bytes_nic /sec
 Ethtool(ixgbe2 ) stat:     816517 (      816,517) <= tx_packets /sec
 Ethtool(ixgbe2 ) stat:     816525 (      816,525) <= tx_pkts_nic /sec
 Ethtool(ixgbe2 ) stat:  152763623 (  152,763,623) <= tx_queue_0_bytes /sec
 Ethtool(ixgbe2 ) stat:     101978 (      101,978) <= tx_queue_0_packets /sec
 Ethtool(ixgbe2 ) stat:  152922976 (  152,922,976) <= tx_queue_1_bytes /sec
 Ethtool(ixgbe2 ) stat:     102085 (      102,085) <= tx_queue_1_packets /sec
 Ethtool(ixgbe2 ) stat:  305999422 (  305,999,422) <= tx_queue_2_bytes /sec
 Ethtool(ixgbe2 ) stat:     204272 (      204,272) <= tx_queue_2_packets /sec
 Ethtool(ixgbe2 ) stat:  152759210 (  152,759,210) <= tx_queue_3_bytes /sec
 Ethtool(ixgbe2 ) stat:     101975 (      101,975) <= tx_queue_3_packets /sec
 Ethtool(ixgbe2 ) stat:  152875415 (  152,875,415) <= tx_queue_4_bytes /sec
 Ethtool(ixgbe2 ) stat:     102053 (      102,053) <= tx_queue_4_packets /sec
 Ethtool(ixgbe2 ) stat:  305822417 (  305,822,417) <= tx_queue_5_bytes /sec
 Ethtool(ixgbe2 ) stat:     204154 (      204,154) <= tx_queue_5_packets /sec

Perf report 8x flows large packets::

 Samples: 136K of event 'cycles:ppp', Event count (approx.): 104319432585
   Overhead  CPU  Shared Object      Symbol
 + 23.06%  005  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
    - queued_spin_lock_slowpath
       + 20.49% __dev_queue_xmit
       +  2.41% sch_direct_xmit
 + 22.16%  002  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 +  3.09%  005  [kernel.vmlinux]  [k] cake_dequeue
 +  3.02%  005  [kernel.vmlinux]  [k] _raw_spin_lock
 +  2.77%  002  [kernel.vmlinux]  [k] cake_dequeue
 +  2.76%  002  [kernel.vmlinux]  [k] _raw_spin_lock
 +  2.48%  001  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 +  2.03%  000  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 +  1.29%  003  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 +  1.20%  002  [kernel.vmlinux]  [k] __build_skb
 +  1.12%  005  [kernel.vmlinux]  [k] __build_skb
 +  1.08%  005  [kernel.vmlinux]  [k] ixgbe_xmit_frame_ring
 +  0.91%  002  [kernel.vmlinux]  [k] ixgbe_xmit_frame_ring
 +  0.91%  002  [kernel.vmlinux]  [k] cake_enqueue
 +  0.85%  005  [kernel.vmlinux]  [k] ixgbe_poll
 +  0.84%  005  [kernel.vmlinux]  [k] cake_enqueue
 +  0.82%  002  [kernel.vmlinux]  [k] ixgbe_poll
 +  0.70%  004  [kernel.vmlinux]  [k] queued_spin_lock_slowpath
 +  0.68%  005  [kernel.vmlinux]  [k] cake_dequeue_one
 +  0.59%  002  [kernel.vmlinux]  [k] cake_dequeue_one
 +  0.53%  002  [kernel.vmlinux]  [k] cake_hash
 +  0.52%  002  [kernel.vmlinux]  [k] __dev_queue_xmit
 +  0.51%  005  [kernel.vmlinux]  [k] prandom_u32_state
 +  0.50%  005  [kernel.vmlinux]  [k] __qdisc_run
 +  0.50%  002  [kernel.vmlinux]  [k] netif_skb_features
    0.50%  002  [kernel.vmlinux]  [k] prandom_u32_state
    0.50%  002  [kernel.vmlinux]  [k] flow_hash_from_keys
    0.49%  005  [kernel.vmlinux]  [k] __dev_queue_xmit
    0.49%  002  [kernel.vmlinux]  [k] __qdisc_run
    0.48%  005  [kernel.vmlinux]  [k] cake_hash
    0.47%  005  [kernel.vmlinux]  [k] flow_hash_from_keys

> ---------- Forwarded message ----------
> From: Dave Taht
> Date: Thu, Nov 23, 2017 at 9:33 AM
> Subject: testers wanted for the "cake" qdisc on mainstream platforms
> To: "cerowrt-devel@lists.bufferbloat.net", bloat, Cake List
>
> It is my hope to get the cake qdisc into the Linux kernel in the next
> release cycle. We could definitely use more testers! The version we
> have here will compile against almost any kernel on any platform,
> dating back as far as 3.10, and has been integrated into the
> sqm-scripts and shipped in lede for ages...
>
> But what I'm hoping for is for people to try it against their current
> linux kernels on x86, arm, ppc64, arm64, etc hardware, and document
> the kernel version, linux distribution, and any anomalies.
>
> To build it you need to have installed your kernel headers, and:
>
> git clone -b cobalt https://github.com/dtaht/sch_cake.git
> git clone https://github.com/dtaht/iproute2-cake-next.git
> cd sch_cake; make; sudo make install; cd ..
> cd iproute2-cake-next; make;
> # don't do a make install, instead something like export TC=`pwd`/tc/tc
>
> $TC qdisc add dev your_device root cake bandwidth XXMbit ack-filter
>
> And then pound it flat with whatever traffic types you like. In particular,
> getting some videoconferencing and flooding results would be great.
>
> There are a TON of features documented on the man page; several
> (ack-filter, wash) are not, as I write.
>
> Please comment via the cake@lists.bufferbloat.net mailing list. THX!
>
> NOTE: flent has gained a lot of new features of late, including
> support for the new Go-based irtt tool, which can measure one-way
> delay (and which is also pretty nifty at the command line)
>
> flent: https://github.com/tohojo/
> irtt: https://github.com/peteheist/irtt
>
> From the pending commit message:
>
> Add Common Applications Kept Enhanced (sch_cake) qdisc
>
> sch_cake is intended to squeeze the most bandwidth and lowest
> latency out of even the slowest ISP links and routers, while
> presenting an API simple enough that even an ISP can configure it.
>
> Example of use on an ISP uplink:
>
> tc qdisc add dev eth0 cake bandwidth 20Mbit nat docsis ack-filter
>
> Cake can also be used in unlimited mode to drive packets at the
> speed of the underlying link.
>
> Cake is filled with:
>
> * A hybrid Codel/Blue AQM algorithm, "Cobalt", tied to an FQ_Codel
>   derived Flow Queuing system, which autoconfigures based on the bandwidth.
> * A unique "triple-isolate" mode (the default) which balances per-flow
>   and per-host flow FQ even through NAT.
> * An integral deficit-based shaper with extensive dsl and docsis support
>   that can also be used in unlimited mode.
> * 8-way set associative queuing to reduce flow collisions to a minimum.
> * A reasonable interpretation of various diffserv latency/loss tradeoffs.
> * Support for washing diffserv for entering and exiting traffic.
> * Perfect support for interacting with Docsis 3.0 shapers.
> * Extensive support for DSL framing types.
> * (New) Support for ack filtering.
>   - 20% better throughput at a 16x1 down/up ratio on the rrul test.
> * Extensive statistics for measuring loss, ecn markings, and latency variation.
>
> There are some features still considered experimental, notably the
> ingress_autorate bandwidth estimator and cobalt itself.
>
> sch_cake replaces a combination of iptables, tc filter, htb and fq_codel in
> the sqm-scripts, with sane defaults and vastly easier configuration.
>
> Cake's principal author is Jonathan Morton, with contributions from
> Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebastian Moeller,
> Ryan Mounce, Dean Scarff, Guido Sarducci, Nils Andreas Svee, Dave
> Täht, and Loganaden Velvindron.
>
> --
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619