* [Codel] cake3 vs sqm+fq_codel at 115/12 mbit (basically comcast's blast service)
@ 2015-04-02 18:05 Dave Taht
2015-04-02 19:03 ` [Codel] [Cerowrt-devel] " Jonathan Morton
0 siblings, 1 reply; 2+ messages in thread
From: Dave Taht @ 2015-04-02 18:05 UTC (permalink / raw)
To: cerowrt-devel, codel
This is with a special build of OpenWrt (not CeroWrt) on the TP-Link
Archer C7 v2. It rips out the unaligned access hacks and is compiled
for the mips74k processor in that box.
Even with hostapd running like crazy for no good reason, we do
fq/aqm/ecn perfectly with cake3 at the 115/12 Mbit rate now common from
Comcast, with about 5% CPU left over, whereas the sqm+fq_codel version
runs out of CPU and falls apart, as you will see in the attached graphs.
For the longest time we were aiming for a piece of affordable hardware
that could do 300 Mbit download shaping, with no luck. On this low end
(this box is 89 dollars on Newegg), maybe this is enough to get
restarted with, while we wait for other stuff to stabilize.
The 115 Mbit service from Comcast exhibits about 230ms worth of latency
under load on downloads without this shaping in place, and 5-25ms with it.
:) The uplink, well, I have data for it somewhere, but it isn't
pretty... and it is totally fixed by cake3 here also.
(I still have to benchmark IPv6, but I want to share some joy, however
briefly, first.)
That test build is at:
http://snapon.lab.bufferbloat.net/~cero3/archerc7v2/ar71xx/
DO NOT install this on any hardware that is not mips74k (e.g. don't try
the wndr3800). Do feel free to try anything in the above list that is
mips74k.
I would like to try an octeon build with cake3, to see if 115mbit can
be achieved there, too, but I think more performance analysis and
optimization is needed first.
Anyway, cake3 outputs a ton more statistics:
root@OpenWrt:/# tc -s qdisc show dev eth1
qdisc cake3 8005: root refcnt 2 bandwidth 12Mbit diffserv4 flows
 Sent 437523173 bytes 1386559 pkt (dropped 4317, overlimits 1852389 requeues 0)
 backlog 0b 0p requeues 0
              Class 0     Class 1     Class 2     Class 3
rate           12Mbit   11250Kbit       9Mbit       3Mbit
target          5.0ms       5.0ms       5.0ms       6.1ms
interval      105.0ms     105.0ms     105.0ms     106.1ms
Pk delay        5.5ms       301us       295us       196us
Av delay        1.2ms        16us        32us        10us
Sp delay          2us         1us         2us         2us
pkts           215134     1048937       10377      116428
way inds            4           0           0           0
way miss         4466         143           6          13
way cols            0           0           0           0
bytes       160066252   257893148     1903096    24102496
drops            4310           3           0           4
marks            8037       52634           0       11451
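
[Editorial sketch, not part of the original message.] One way to read the per-class counters is to cross-check them against the totals line. The numbers below are copied from the output above; the variable and function names are illustrative only. In this particular sample the per-class "pkts" add up to sent plus dropped packets, and the per-class "drops" add up to the total dropped count. Whether that holds in general for cake3 is an assumption, not something the thread states.

```python
# Counters copied from the tc output above, in class order 0..3.
pkts = [215134, 1048937, 10377, 116428]
drops = [4310, 3, 0, 4]
marks = [8037, 52634, 0, 11451]

# Totals from the "Sent ... (dropped ...)" line.
sent_pkts = 1386559
dropped_total = 4317


def mark_fraction(cls):
    """Fraction of a class's packets that were ECN-marked."""
    return marks[cls] / pkts[cls]


def drop_fraction(cls):
    """Fraction of a class's packets that were dropped."""
    return drops[cls] / pkts[cls]


if __name__ == "__main__":
    print("per-class pkts sum :", sum(pkts))
    print("sent + dropped     :", sent_pkts + dropped_total)
    for cls in range(4):
        print(f"class {cls}: marked {mark_fraction(cls):.2%}, "
              f"dropped {drop_fraction(cls):.2%}")
```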
[-- Attachment #2: fq_codel_sqm_archer.png --]
[-- Type: image/png, Size: 96956 bytes --]
[-- Attachment #3: sqm_cake3_archer.png --]
[-- Type: image/png, Size: 85009 bytes --]
* Re: [Codel] [Cerowrt-devel] cake3 vs sqm+fq_codel at 115/12 mbit (basically comcast's blast service)
2015-04-02 18:05 [Codel] cake3 vs sqm+fq_codel at 115/12 mbit (basically comcast's blast service) Dave Taht
@ 2015-04-02 19:03 ` Jonathan Morton
0 siblings, 0 replies; 2+ messages in thread
From: Jonathan Morton @ 2015-04-02 19:03 UTC (permalink / raw)
To: Dave Taht; +Cc: codel, cerowrt-devel
Awesome.
Oddly enough, cake3 actually gets slightly less throughput than
htb+fq_codel on the Pentium-MMX. However, that's with the simplest possible
htb configuration (since I'm manually typing it in), and no firewall rules
or NAT going on (just a bridge between two Ethernet ports).
A couple of notes on the statistics that are now reported:
The rate for each class is now a threshold rather than a limit. The class
is permitted to use more than that bandwidth (up to the global limit), but
will yield to lower priority classes in that condition. This is consistent
with both user expectations and standard PHB specs, and means that traffic
benefits from high priority markings only if it's appropriately sparse.
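
[Editorial sketch, not part of the original message.] The threshold idea can be illustrated with the rates from Dave's output. Those rates happen to equal 1, 15/16, 3/4 and 1/4 of the 12 Mbit global rate; the fractions here are inferred from that one sample, not taken from the cake3 source, and the function names are hypothetical. The sketch only shows the "threshold, not limit" rule: a class keeps its priority while it stays at or under its threshold, and is never capped below the global rate.

```python
from fractions import Fraction

GLOBAL_RATE_KBIT = 12000  # "bandwidth 12Mbit" in the tc output

# Assumption: fractions inferred from the reported per-class rates.
THRESHOLD_FRACTION = {
    0: Fraction(1, 1),
    1: Fraction(15, 16),
    2: Fraction(3, 4),
    3: Fraction(1, 4),
}


def threshold_kbit(cls, global_rate_kbit=GLOBAL_RATE_KBIT):
    """Per-class threshold in Kbit/s under the inferred fractions."""
    return int(global_rate_kbit * THRESHOLD_FRACTION[cls])


def keeps_priority(cls, measured_kbit, global_rate_kbit=GLOBAL_RATE_KBIT):
    """True while the class is at or below its threshold.

    Above the threshold the class may still send (up to the global
    rate) but no longer gets preferential treatment.
    """
    return measured_kbit <= threshold_kbit(cls, global_rate_kbit)


if __name__ == "__main__":
    for cls in range(4):
        print(f"class {cls}: threshold {threshold_kbit(cls)} Kbit")
```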
On that note, I expect roughly the following uses of each class:
0 - background bulk traffic, CS1 marked, e.g. BitTorrent. Use as many
parallel connections as you like, without worrying about ordinary traffic.
1 - best effort, the great majority of ordinary traffic - web pages,
software updates, whatever. If in doubt, leave it here (default CS0 lands
here).
2 - elevated priority, bandwidth sensitive traffic, such as streaming video
or a vlan.
3 - low volume, latency sensitive traffic such as VoIP, online games, NTP,
etc. EF traffic lands here.
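
[Editorial sketch, not part of the original message.] The list above names three codepoints explicitly: CS1 goes to class 0, CS0 (the default) to class 1, and EF to class 3. A minimal classifier covering only those stated cases might look like the following. The DSCP values are the standard ones (CS0 = 0, CS1 = 8, EF = 46); the function names are hypothetical, and the codepoints that select class 2 are not given in the message, so the sketch deliberately leaves them in class 1.

```python
# Standard DSCP values.
CS0 = 0
CS1 = 8
EF = 46


def dscp_from_tos(tos):
    """Extract the 6-bit DSCP from the 8-bit TOS / traffic class byte."""
    return (tos >> 2) & 0x3F


def class_for_dscp(dscp):
    """Map a DSCP to a class number using only what the message states.

    Class 2 is never returned here because the message does not say
    which codepoints select it; everything unlisted is best effort.
    """
    if dscp == CS1:
        return 0  # background bulk
    if dscp == EF:
        return 3  # low volume, latency sensitive
    return 1      # best effort, including CS0


if __name__ == "__main__":
    for tos in (0x00, 0x20, 0xB8):
        dscp = dscp_from_tos(tos)
        print(f"TOS 0x{tos:02x} -> DSCP {dscp} -> class {class_for_dscp(dscp)}")
```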
A minor frustration for me here - firewall rules on ingress are processed
only after the traffic has already passed through ifb. This means I can't
custom mark my inbound traffic.
Three delay statistics are now reported, all of which are based on EWMAs of
packet sojourn times at dequeue. Pk is biased heavily to high delays (so
should usually report on fat flows), Sp to low delays (so should capture
sparse flows), and Av keeps a true average. The concept of a biased EWMA is
borrowed from ReplayGain and the whole "loudness war" problem that it aims
to solve; some broadcast studios (including the BBC) use audio meters which
work this way.
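
[Editorial sketch, not part of the original message.] A biased EWMA can be written as an ordinary EWMA whose weight depends on whether the new sample is above or below the current estimate. The weights below (1/4, 1/16, 1/256) and the class name are assumptions chosen for illustration; the message does not give the values cake3 uses.

```python
def biased_ewma(avg, sample, up_weight, down_weight):
    """One EWMA step with separate weights for rising and falling samples."""
    weight = up_weight if sample > avg else down_weight
    return avg + weight * (sample - avg)


class DelayStats:
    """Tracks Pk, Av and Sp estimates of packet sojourn time."""

    def __init__(self, fast=1 / 4, mid=1 / 16, slow=1 / 256):
        self.fast, self.mid, self.slow = fast, mid, slow
        self.pk = self.av = self.sp = None

    def update(self, sojourn):
        if self.av is None:
            # Seed all three estimates with the first sample.
            self.pk = self.av = self.sp = sojourn
            return
        # Pk: rises quickly, decays slowly, so it follows high delays.
        self.pk = biased_ewma(self.pk, sojourn, self.fast, self.slow)
        # Av: same weight in both directions, a plain EWMA.
        self.av = biased_ewma(self.av, sojourn, self.mid, self.mid)
        # Sp: falls quickly, rises slowly, so it follows low delays.
        self.sp = biased_ewma(self.sp, sojourn, self.slow, self.fast)


if __name__ == "__main__":
    stats = DelayStats()
    # Alternate a sparse-flow-like delay with a fat-flow-like delay (ms).
    for i in range(200):
        stats.update(1.0 if i % 2 == 0 else 10.0)
    print(f"Pk {stats.pk:.2f}  Av {stats.av:.2f}  Sp {stats.sp:.2f}")
```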
The new set-associative hash function also generates extra statistics. The
same 1024 queues are now divided into 128 sets of 8 "ways", and a tag on
each queue tracks which flow is presently using it. This allows hash
collisions to be resolved in most cases, with limited worst case overhead,
greatly improving flow isolation under severely stressed conditions. (It's
difficult to provoke this on a home network, but offices may well
appreciate this feature.)
The "way miss" counter is incremented whenever an empty queue's tag is
changed to assign it to a new flow, signalling a departure from the fast
path for that packet. Expect to see a small percentage of these with normal
traffic.
The "way indirect hit" counter tracks the situations where a hash collision
would have occurred with a plain hash function, but was resolved by the set
associativity. This is also a departure from the fast path.
The "way collision" counter indicates when even set associative hashing is
insufficient - there are more than 8 distinct flows attempting to occupy
queues in the same set. In such a case, the search for an empty queue is
terminated and the packet is placed in the queue matching the plain hash.
NB: so far this code path is completely untested to my knowledge!
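
[Editorial sketch, not part of the original message.] The lookup described in the last four paragraphs can be modelled as below. This is one reading of the description, not the cake3 code: the class and attribute names are hypothetical, the flow hash itself is used as the tag, and "empty" is modelled as a zero packet backlog. The counters are named after the rows in the tc output.

```python
QUEUES = 1024
WAYS = 8
SETS = QUEUES // WAYS  # 128 sets of 8 ways


class SetAssocQueues:
    def __init__(self):
        self.tags = [None] * QUEUES   # flow hash currently owning each queue
        self.backlog = [0] * QUEUES   # queued packets; 0 means empty
        self.way_inds = 0             # indirect hits
        self.way_miss = 0             # empty queue retagged for a new flow
        self.way_cols = 0             # set full, fell back to the plain hash

    def enqueue(self, flow_hash):
        """Pick a queue for one packet of the given flow; return its index."""
        direct = flow_hash % QUEUES   # queue a plain hash would choose
        first = direct % WAYS         # way of that queue within its set
        base = direct - first         # first queue index of the set

        # 1. Look for a queue already tagged with this flow. Finding it
        #    in the direct way is the fast path; elsewhere is an indirect hit.
        for i in range(WAYS):
            q = base + (first + i) % WAYS
            if self.tags[q] == flow_hash:
                if i:
                    self.way_inds += 1
                self.backlog[q] += 1
                return q

        # 2. Otherwise claim an empty queue in the set and retag it.
        for i in range(WAYS):
            q = base + (first + i) % WAYS
            if self.backlog[q] == 0:
                self.tags[q] = flow_hash
                self.way_miss += 1
                self.backlog[q] += 1
                return q

        # 3. All ways are busy with other flows: share the plain-hash queue.
        self.way_cols += 1
        self.backlog[direct] += 1
        return direct


if __name__ == "__main__":
    qs = SetAssocQueues()
    # Nine flows whose plain hash all land on queue 0 of set 0.
    for k in range(1, 10):
        print("flow", k, "-> queue", qs.enqueue(k * QUEUES))
    print("inds", qs.way_inds, "miss", qs.way_miss, "cols", qs.way_cols)
```

In this model the ninth flow cannot find a free way and shares queue 0 with the first flow, which is the untested fallback path mentioned above.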
- Jonathan Morton