From: Sebastian Moeller
To: Dave Täht
Cc: cake@lists.bufferbloat.net, cerowrt-devel
Date: Fri, 29 May 2015 12:42:00 +0200
Subject: Re: [Cake] policer question

Hi Dave, hi list,

On May 27, 2015, at 19:12, Dave Taht wrote:

> On Tue, May 26, 2015 at 3:49 AM, Sebastian Moeller wrote:
>> Hi Dave,
>>
>> I just stumbled over your last edit of "wondershaper needs to go the way of the dodo"; especially the following caught my attention (lines 303 - 311):
>>
>> ## The ingress policer doesn't work against ipv6, so if you have mixed traffic
>> ## you are not matching all of it, and the policer fails entirely
>> ## A correct, modern line for this would be:
>> ## tc filter add dev ${DEV} parent ffff: protocol all match u32 0 0 \
>> ## police rate ${DOWNLINK}kbit burst 100k drop flowid :1
>> ##
>> ## Even if it did work, the police burst size is too small for higher speed
>> ## connections and what I suggest above for a burst size needs to be
>> ## a calculated figure. (that one works ok at 100mbit)
>>
>> I think we should implement a policer setting in SQM (if only for testing), so I wonder how to set the burst size. I think we can basically run the policer only X times per second, so we should allow something along the lines of:
>>
>> bandwidth [bit/s] / X [1/s] = max bits per batch [bit]
>>
>> Since in the end we can only work in bursts/batches, we need to figure out what worst-case batches to expect.
>> Now it would be sweet if we could get a handle on X, but what about just using the following approximation?
>>
>> How often does the policer run per second, worst case? Taking your 100k burst at 100 Mbit/s as the reference:
>>
>> 100 [kB] * 1000 * 8 = 800000 [bit]
>> (100*1000^2 [bit/s]) / (100*1000*8 [bit]) = 125 [1/s], or 8 milliseconds per invocation
>>
>> So your example seems to show that if we can run 125 times per second, we will be able to drain enough packets that we do not drop excessively many. This bursting will increase the latency under load for sure, but I guess not by more than 8 ms on average?
>> Now, I guess one issue is that this is not simply dependent on either data size or packet count; probably we are limited both in how many packets per second we can process and in how many bytes. So what about:
>>
>> burst [kB] = (bandwidth [bit/s] / 125 [1/s]) / (1000*8)
>>
>> This is probably too simplistic, but better than nothing.
>>
>> I would appreciate any hints on how to improve this; thanks in advance. Now all I need to do is hook this up with sqm-scripts and then go test the hell out of it ;)
>>
>> Best Regards
>> Sebastian
>
> Well, I am pretty sure policing as currently understood is generally
> not a win compared to inbound shaping with aqm, particularly in the
> concatenated queues case which was the one I wanted to address (90/100
> rate differential).

	ACK; what I want is something that will allow me to test my 100/40 Mbps link without reducing the aggregate throughput to ~70 Mbps, especially as in the bi-directional saturation case the downloads get an edge over the already smaller uploads (not diligently tested, so this might be observer bias). You fully convinced me that a dumb policer is just that, dumb; but since I want to test one anyway, I thought we could include it in sqm-scripts, as that sort of is our test vehicle (with the goal that cake will take over later and make normal things easy and complicated things at least possible ;) ).

	So I really am looking for a solution to the burst issue. Do you think my reasoning of treating burst as the maximum induced delay one is willing to accept is sound? (This probably needs the same treatment as codel's target, so that at really low speeds one has to accept more delay to account for the longer transmission times.)
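	Concretely, something along these lines is what I would try for the sqm-scripts side. This is only an untested sketch: the 125/s figure is just the assumption from my arithmetic above, the 10 kB floor is a pure guess, and the variable names (DEV, DOWNLINK in kbit/s) simply mirror the wondershaper excerpt:

    # burst [kB] = (rate [bit/s] / 125 [1/s]) / (1000*8); with DOWNLINK in
    # kbit/s this conveniently reduces to DOWNLINK / 1000
    BURST_KB=$(( DOWNLINK / 1000 ))
    # at low rates this gets tiny; like codel's target it probably needs a
    # floor, and the 10 kB used here is only a placeholder
    [ ${BURST_KB} -lt 10 ] && BURST_KB=10

    tc qdisc add dev ${DEV} handle ffff: ingress
    tc filter add dev ${DEV} parent ffff: protocol all prio 10 u32 \
        match u32 0 0 \
        police rate ${DOWNLINK}kbit burst ${BURST_KB}k drop flowid :1

	At DOWNLINK=100000 (100 Mbit/s) this reproduces your burst 100k, and it scales the burst linearly with the configured rate, which is the "calculated figure" behaviour you asked for.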
> policers have generally been "pitched" as a means of customer
> bandwidth control (with CIR and other "features"), and do seem to be
> highly used…

	Yep, my ISP uses a (bi-directional, IIRC) policer at the BRAS stage to avoid making the VDSL link the true bottleneck. I hope they at least use this to make sure VoIP packets get through with a higher probability (since they switched their telephony product to all-IP).

> What I wanted to do was come up with a kinder, gentler policer that
> was effective but less damaging to non-tcp-like flow types, and the
> whole concept of a burst parameter just doesn't work with shorter RTTs
> in particular when used with ewma.

	Where does the ewma come in? And at what RTTs does this break badly? If 5 ms is a problem but 20 ms+ already works decently, that would probably be good enough for a typical (non-fiber) WAN link, no?

> The initial burst characteristics
> we see today are very different from the slow speeds and small initial
> windows (2) of yesteryear, and we tend to see a bunch of flows in slow
> start all at the same time. The signal sent is too late to dissipate the
> original burst, and yet the policer signal is a brick wall, so once it
> kicks in bad things happen to all flows.

	This is the case for fq_police, I guess ;), so that the policer would start hitting the largest flows first (as these will help the situation most if they halve their rate; but where to store all the required state?).

> So, for example, I came up with a simple mod to the existing policer
> code, to "shred" inbound with a fq-like idea. The shred.patch and some
> flent data are here:
>
> http://snapon.lab.bufferbloat.net/~cero3/bobbie/

	So how well does this work in real life?

> But: new data points galore hit me at the same time.
>
> Recently I boosted my signal strength on my cable modems in the
> biggest testbed, and switched to a new one. The old one, which had latched
> up at 110mbit down prior (and really horrible download bufferbloat),
> started giving me 172mbit service. This morning I measured that at about
> 142mbit service. THAT difference in performance ended up pretty
> dramatic: I went from dslreports peaking at seconds on inbound to mere
> 100s of ms on an unshaped modem.

	Any theory as to why? I had thought that the induced latency should scale inversely with the available/provided bandwidth, as the same buffers will drain in proportionally less time. But I had assumed this to be a basically linear relationship, whereas you observed at least one (base-10) order of magnitude; any hypothesis?
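	Just to spell out my naive expectation (the 40 MB buffer below is a number I made up purely for illustration; I have no idea what the modem/CMTS really buffers): a fixed drop-tail buffer of B bytes drains in B*8/rate seconds, so going from 110 to 172 Mbit/s should only buy a factor of about 1.6, still seconds either way rather than 100s of ms:

    # purely illustrative drain-time arithmetic for an assumed 40 MB buffer
    awk 'BEGIN { B = 40e6 * 8;
                 printf "110 Mbit/s: %.1f s\n172 Mbit/s: %.1f s\n", B/110e6, B/172e6 }'
    # -> 110 Mbit/s: 2.9 s
    # -> 172 Mbit/s: 1.9 s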
> http://www.dslreports.com/speedtest/560968
>
> dslreports changed their cable test to be 16 down and 12 up, also (from 16/6).

	I hope this crystallizes further, making it easier to get comparable numbers between runs without needing to manually set the number of flows. This would make post-hoc comparisons between runs from different sources more convenient, but I digress.

> (I have also made so many other changes to the test driving box - for
> example I reduced tcp_limit_output_bytes to 4k and started using the
> sch_fq qdisc on my test driver box - and certainly it is my hope that
> the cable isps sat up and took notice and deployed some fixes in the
> past few weeks)
>
> So here is me just fixing outbound on this test:
>
> http://www.dslreports.com/speedtest/560989
>
> so there are WAY too many variables in play again.
>
> and trying to fix inbound (and failing)
>
> http://www.dslreports.com/speedtest/561097
>
> (and see the dataset)
>
> I am still seeing 30ms of induced latency on the rrul test but it is
> so far from horrible that I think I am still dreaming.

	I know, I know, our goal is minimum induced latency, but 30 ms certainly is acceptable (for the WAN side; in my LAN I would be more unhappy ;) ).

> and i have a whole bunch of variables to tediously recheck.

	Fun, fun, fun all around…

Best Regards
	Sebastian

> --
> Dave Täht
> Open Networking needs **Open Source Hardware**
>
> https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67