From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <moeller0@gmx.de>
Received: from mout.gmx.net (mout.gmx.net [212.227.15.19])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "mout.gmx.net",
	Issuer "TeleSec ServerPass DE-1" (verified OK))
	by huchra.bufferbloat.net (Postfix) with ESMTPS id 3FA4521F3D5
	for <cake@lists.bufferbloat.net>; Thu, 14 May 2015 03:58:34 -0700 (PDT)
Received: from hms-beagle-5.lan ([134.2.89.70]) by mail.gmx.com (mrgmx003)
	with ESMTPSA (Nemesis) id 0MIuSH-1YvJgk17f7-002VlF;
	Thu, 14 May 2015 12:58:31 +0200
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <CF5D7897-07DD-4041-884B-EB3B8440A4BC@gmail.com>
Date: Thu, 14 May 2015 12:58:29 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <FD941860-7D3E-4530-9E3C-2460F8E04A6E@gmx.de>
References: <554F64E1.6000609@gmail.com> <554F9594.60808@gmail.com>
	<E30C4EBD-0094-4BDA-80D1-70EE9E954631@gmx.de>
	<50DB1E31-61AE-4298-B80F-8C6F7487C99B@gmail.com>
	<AF973A4B-E500-43FD-9E2D-36BD27C70AE3@gmail.com>
	<002A5BFC-5511-4995-8785-370251F24083@gmx.de>
	<CF5D7897-07DD-4041-884B-EB3B8440A4BC@gmail.com>
To: Jonathan Morton <chromatix99@gmail.com>
X-Mailer: Apple Mail (2.1878.6)
X-Provags-ID: V03:K0:uNYzhPkd1Yyd4Pe5FvJ7GzSGGSgWonzqgZeFQh7AZqW7YznDO+k
	bRWARW1Ft2Hio6oW64bc02nF/3NvGHfkpFwYWn5P0VBgLqQ5wpyYxluNFY7/fwE87cLkMw1
	kyPNCmmEzGmJJ8wk9DD3OSUYsPv8RnT5imyrGlwi9aFsnLDHlkftjgSXqXpeJR1H98d4lab
	TCeX5pmDgZTDXEFDWFp0w==
X-UI-Out-Filterresults: notjunk:1;
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] openwrt build with latest cake and other qdiscs
X-BeenThere: cake@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Cake - FQ_codel the next generation <cake.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/cake>,
	<mailto:cake-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/cake>
List-Post: <mailto:cake@lists.bufferbloat.net>
List-Help: <mailto:cake-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/cake>,
	<mailto:cake-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Thu, 14 May 2015 10:59:03 -0000

Hi Jonathan,


On May 14, 2015, at 12:24 , Jonathan Morton <chromatix99@gmail.com> =
wrote:

>>> I=92ve just pushed support for an overhead parameter; both cake =
itself and the iproute2 module.  I took the opportunity to put in a =
minor optimisation for the cell-framing compensation as well.
>>=20
>> 	Great, thanks a lot. I have a question though: =
http://lxr.free-electrons.com/ident?i=3Dpsched_l2t_ns basically does the =
same operation, but slightly different:
>>              DIV_ROUND_UP instead of do_div((n+d-1), d)
>> What is the kernel policy here, reuse specialized macros or rather =
code more readable (with slight redundancy)?
>=20
> It looks as though the DIV_ROUND_UP() expands to exactly the same =
code, except that a plain division is used instead of do_div().  The =
latter includes a conversion to multiplying by the inverse on ARM, when =
the divisor is a constant (which it is), since ARM doesn=92t have a =
hardware integer divide.  (AArch64 does.)
>=20
> With that said, I haven=92t closely examined the resulting assembler.

	I just noticed the difference and thought I=92d bring it up, so =
I can understand the code better, that=92s all.
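
	For reference, a minimal sketch of the arithmetic behind the two
rounding idioms, modelled in Python rather than kernel C. The helper
names are made up for illustration; do_div_model only mimics the
calling convention of do_div() (quotient stored in place, remainder
returned) and says nothing about the generated assembler.

```python
def div_round_up(n, d):
    # Same arithmetic as the kernel macro DIV_ROUND_UP(n, d),
    # which expands to ((n) + (d) - 1) / (d).
    return (n + d - 1) // d


def do_div_model(n, d):
    # Hypothetical stand-in for do_div(n, d): the kernel macro
    # divides n in place and returns the remainder; here both
    # values are returned as (quotient, remainder).
    return divmod(n, d)


def round_up_with_do_div(n, d):
    # The do_div((n + d - 1), d) form discussed above.
    quotient, _remainder = do_div_model(n + d - 1, d)
    return quotient


# Both forms agree for non-negative n and positive d.
for n in range(0, 200):
    assert div_round_up(n, 48) == round_up_with_do_div(n, 48)
```

	So the results are identical; any difference is in how the
division itself gets compiled.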

>=20
> I=92m also not going to use psched_l2t_ns(), because I use the =
corrected packet length for other purposes than just time.

	Sure, HTB does its accounting in a weird way, and the different =
rate tables plainly confuse me. I was just referencing this code for =
the do_div vs DIV_ROUND_UP question.

>  It also fails to support negative overheads, which can occasionally =
occur when using IPoA.

	I know, that is why we default to =93stab=94 in sqm scripts=85 =
and as far as I can see Alan tested whether stab works with cake and it =
seems it does. Still it is much better if cake controls both overhead =
and encapsulation, since stab=92s encapsulation handling is not optimal.

>=20
>> It seems clear that cake does fully rely on the supplied overhead, =
unlike htb which will automatically add ethernet overhead and an =
estimate? of the additional header GRO packets drag in, see:
>> http://lxr.free-electrons.com/source/net/core/dev.c#L2744
>=20
> I can=92t figure out the connection between HTB and that code. =20

	Well, this function is called by __dev_xmit_skb (see =
http://lxr.free-electrons.com/source/net/core/dev.c#L2774 ), so it is =
not HTB specific; everyone looking in qdisc_skb_cb(skb)->pkt_len for =
the size seems to get that adjustment. Only the following call to =
qdisc_calculate_pkt_len(skb, q) unfortunately overrides =
qdisc_skb_cb(skb)->pkt_len with skb->len+overhead, but everybody else =
using pkt_len should get this size correction, I believe.


> Also, that appears to be GSO, not GRO.

	My bad, I was using GRO just as a moniker for packet aggregate =
processing in the network stack, without even thinking through the =
details.

>  I=92m not precisely sure what the difference is, but I=92d hazard a =
guess that GSO is outbound, GRO is inbound.

	No idea.

>=20
> Frankly, I hate having to deal with packet aggregates in the core =
network stack. =20

	But that ship has sailed, I fear. At high speeds the network =
stack profits noticeably from not going through the motions for each =
packet sequentially, but basically treating a batch of packets as one =
that the NIC will then segment out, so I have my doubts whether this is =
going away any time soon.

> Device drivers can aggregate if that makes sense for the hardware, but =
I=92d much rather that was kept out of my qdisc.  Peeling is on the =
agenda; that=92ll make sure we are dealing with actual, individual =
packets when we need to. =20

	I agree, that sounds conceptually much cleaner, but peeling is =
going to be costlier than pushing the segmentation to the NIC, so =
bandwidth aficionados will not appreciate unconditional peeling, I would =
guess.

> Certainly when dealing with cell-framing overhead, we *always* need to =
know individual packet sizes.

	Well, that or the sum for an aggregate, as long as the sum takes =
all fancy =93celling=94 into account; all we really need to know is how =
many bits the data expands to on the wire.
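
	A minimal Python sketch of what a cell-aware size looks like,
assuming ATM cells of 53 bytes carrying 48 bytes of payload, and
assuming the supplied overhead already covers all per-packet framing.
The function names and the example packet sizes are hypothetical.

```python
ATM_PAYLOAD = 48  # payload bytes carried per ATM cell
ATM_CELL = 53     # bytes each cell occupies on the wire


def wire_bytes(packet_len, overhead):
    # Hypothetical helper: apply the (possibly negative) per-packet
    # overhead first, then round up to whole cells.
    adjusted = max(packet_len + overhead, 0)
    cells = (adjusted + ATM_PAYLOAD - 1) // ATM_PAYLOAD
    return cells * ATM_CELL


def aggregate_wire_bytes(packet_lens, overhead):
    # Cell-aware sum: frame every packet separately, then add up.
    return sum(wire_bytes(n, overhead) for n in packet_lens)


packets = [50, 50]
# Each 50-byte packet needs two cells of its own.
cell_aware = aggregate_wire_bytes(packets, 0)
# Treating the aggregate as one 100-byte packet needs only three.
naive = wire_bytes(sum(packets), 0)
print(cell_aware, naive)
```

	The two numbers differ, so a sum over an aggregate is only
usable if the padding was applied per packet before summing.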

>=20
>> I actually like that cake does not try to auto-adjust the overhead by =
itself, since the kernel does this automatically for an ethernet link, =
but not say for a PPPoE interface, making it a bit tricky to recommend =
the proper encapsulation to ATM users: =93use 40 if you shape on the =
pppoe-wan interface but 26 if you shape on the wan interface directly=94 =
is a sure way to confuse people.
>=20
> I consider that a user-interface problem, as well as reflecting a =
general problem with PPPoE.  Actually, PPPoE has *never* been =
user-friendly; it outright sucks in all respects.  I can=92t think of a =
single reason to use PPPoE instead of PPPoA.  AFAIK, all Finnish and =
most British DSL ISPs use either PPPoA or bridging; I=92ve only =
personally encountered PPPoE in the US.

	Again, I agree, but in Germany, say, all big ISPs use PPPoE, even =
over fiber, so this is going to stay with us a bit longer. Since ATM is =
going the way of the dodo fast, PPPoA will not be an option for much =
longer, so DHCP would be nice to have. (It is not like the ISP does =
not know which line it services anyway, so the billing and =
identification issue that is often brought up is a bit of a straw man; I =
believe they just stick to it because their billing back-end already =
knows how to handle this.)

>=20
> To help reduce confusion, it would probably be best to offer =
consistent advice on which interface to shape and how much overhead to =
account for there.  I think shaping the traffic that actually goes over =
the link is more correct than shaping the traffic that goes to the modem =
(which might include some management traffic that doesn=92t go on the =
wire).  So you should shape on the PPPoE interface and add the full 40 =
bytes there.

	Well, almost; this depends on whether there is a BRAS throttle =
or not. The pppoe interface does not see or account for the PPPoE =
management packets, which without BRAS throttling will also eat up bits =
on the DSL link. I admit those packets are rare, but still=85 There =
should be no other important traffic to the modem heavy enough to be =
noticeable to the user. That said, I currently shape on pppoe-ge00, and =
it works well enough; I guess the PPPoE traffic simply squeezes into the =
small percentage by which the shaper is reduced from the line/throttled =
rate.


>  Happily, this advice is also safe if the user accidentally selects =
the wrong interface, since 40 bytes is conservative for the Ethernet =
interface.

	As seen from our latency focussed vantage point ;)
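
	The arithmetic behind the 40 versus 26 figures, as a minimal
Python sketch. It assumes the 14-byte gap between the two numbers is
the Ethernet header, which is already counted in the packet length on
the Ethernet interface but not on the PPPoE interface. The function
name, the interface labels and the 1492-byte example are hypothetical.

```python
ETH_HEADER = 40 - 26  # 14 bytes, the gap between the two figures


def accounted_len(ip_len, interface, overhead):
    # Assumption: only the Ethernet ("wan") interface already
    # includes the Ethernet header in the length the shaper sees.
    seen = ip_len + (ETH_HEADER if interface == "wan" else 0)
    return seen + overhead


ip_len = 1492
on_pppoe = accounted_len(ip_len, "pppoe-wan", 40)
on_wan = accounted_len(ip_len, "wan", 26)
wrong_iface = accounted_len(ip_len, "wan", 40)

# Both recommended settings account for the same size, and using
# 40 on the Ethernet interface only ever overestimates it.
assert on_pppoe == on_wan
assert wrong_iface - on_wan == ETH_HEADER
```

	Overestimating costs a little bandwidth but keeps the queue in
the shaper, which is the safe direction for latency.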

>=20
> Anyway, user-interface problems are best solved in userspace.  Cake=92s =
internal implementation is thus kept simple and numerical.  The tc =
module now supports that directly, and more user-friendly support can be =
added either there or in external scripts, or some combination of the =
two.

	Okay, sounds like a good division of labor between the kernel =
and tc ;)


Best Regards
	Sebastian

> - Jonathan Morton
>=20