From: Sebastian Moeller
Date: Thu, 19 Mar 2015 10:58:28 +0100
To: Alan Jenkins
Cc: cerowrt-devel
Subject: Re: [Cerowrt-devel] SQM and PPPoE, more questions than answers...
List-Id: Development issues regarding the cerowrt test router project

Hi Alan,

On Mar 19, 2015, at 10:42, Alan Jenkins wrote:

> On 19/03/15 08:29, Sebastian Moeller wrote:
>> Hi Alan,
>>
>> On Mar 18, 2015, at 23:14, Alan Jenkins wrote:
>>
>>> Hi Seb
>>>
>>> I tested shaping on eth1 vs pppoe-wan, as it applies to ADSL (on Barrier Breaker + sqm-scripts). Maybe this is going back a bit and no longer interesting to read.
But it seemed suspicious and interesting enough that I wanted to test it.
>>>
>>> My conclusion was 1) I should stick with pppoe-wan,
>> Not a bad decision, especially given the recent changes to SQM to make it survive transient pppoe-interface disappearances. Before those changes, the beauty of shaping on the ethernet device was that pppoe could come and go, but SQM stayed active and working. Thanks to your help this problem seems fixed now.
> I'd say your help and my selfish prodding :).
>
>>> 2) the question really means: do you want to disable classification?
>>> 3) I personally want to preserve the upload bandwidth and accept slightly higher latency.
>> My question still is: is the bandwidth sacrifice really necessary, or is this test just showing a corner case in simple.qos that can be fixed? I currently lack the time to tackle this effectively.
> Yep, ok (no complaint).
>
>>> [netperf-wrapper noob puzzle: most of the ping lines vanish part-way through. Maybe I failed it somehow.]
>> This is not your fault; the UDP probes netperf-wrapper uses do not tolerate packet loss: once a packet is lost (I believe), the stream stops. This is not ideal, but it gives a good quick indicator of packet loss for sparse streams ;)
> Heh, thanks.
>
>>> My tests look like simplest.qos gives a lower egress rate, but not as low as eth1 (like 20% vs 40%). So that's also similar.
>>>
>>>>> So the current choice is either to accept a noticeable increase in
>>>>> LULI (but note that some years ago even an average of 20ms most likely
>>>>> was rare in real life) or an equally noticeable decrease in
>>>>> egress bandwidth…
>>>> I guess it is back to the drawing board to figure out how to speed up
>>>> the classification… and then revisit the PPPoE question again…
>>> So maybe the question is actually classification vs. not?
>>>
>>> + IMO slow asymmetric links don't want to lose more upload bandwidth than necessary.
And I'm losing a *lot* in this test.
>>> + As you say, having only 20ms excess would still be a big improvement. We could ignore the bait of 10ms right now.
>>>
>>> vs
>>>
>>> - Lowest latency I've seen testing my link; almost suspicious. Looks close to a 10ms average, when the DSL rate puts a lower bound of 7ms on the average.
>> Curious: what is your link speed?
>
> dsl sync 912k up
> shaped at 850
> fq_codel auto target says => 14.5ms <=
>
> MTU time is
> (1500*8) b / 912 kbps = 0.0132 s
> so if the link is filled with MTU packets, there's a hard 7ms lower bound on the average ICMP ping increase vs. an empty link,
> and the same logic says that on achieving that average, you have >= 7ms jitter

	Ah, I see: a 50% chance of getting the link immediately versus having to wait for a full packet transmit time.

>
> (or 6.5ms, but since my download rate is about 10x better, 6.5 + 0.65 ~= 7).
>
>>> - fq_codel honestly works miracles already. Classification is the knob people had to use previously, if they had enough time to twiddle it.
>>> - On netperf-runner plots the "banding" doesn't look brilliant on slow links anyway.
>> On slow links I always used to add "-s 0.8", with higher numbers the slower the link, to increase the temporal averaging window; this reduces the accuracy of the display for the downlink, but at least allows a better understanding of the uplink. I always wanted to see whether I could teach netperf-wrapper to allow larger averaging windows after the measurement, just for display purposes, but I am a total beginner with Python...
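Alan's back-of-the-envelope numbers above can be reproduced with a short sketch (the 912 kbit/s sync rate and 1500-byte MTU are the figures quoted in this thread; the rest is plain arithmetic):

```python
# Serialization-delay estimate for a slow ADSL uplink, using the
# figures quoted above (912 kbit/s upstream sync, 1500-byte MTU).

def serialization_delay_ms(packet_bytes, rate_bps):
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / rate_bps * 1000

uplink_bps = 912_000   # DSL sync rate, upstream
mtu_bytes = 1500

mtu_time = serialization_delay_ms(mtu_bytes, uplink_bps)
print(f"MTU transmit time: {mtu_time:.1f} ms")          # ~13.2 ms

# A packet arriving at a random moment waits, on average, for half of
# an in-flight MTU packet, hence the ~7 ms lower bound on the average
# induced latency when the link is saturated with MTU-sized packets.
print(f"Average residual wait: {mtu_time / 2:.1f} ms")  # ~6.6 ms
```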
>>
>>>>> P.S.: It turns out, at least on my link, that for shaping on
>>>>> pppoe-ge00 the kernel does not account for any header
>>>>> automatically, so I need to specify a per-packet overhead (PPOH) of
>>>>> 40 bytes (on an ADSL2+ link with ATM linklayer); when shaping on
>>>>> ge00, however (with the kernel still terminating the PPPoE link to
>>>>> my ISP), I only need to specify a PPOH of 26, as the kernel already
>>>>> adds the 14 bytes for the ethernet header…
>> Please disregard this part, I need to implement better tests for this instead of only relying on netperf-wrapper results ;)
> Apart from kernel code, I did wonder how this was tested :).

	Oh, quite roughly… At that time I was only limited by my DSLAM (now I have a lower throttle in the BRAS that is somewhat hard to measure). I realized I could get decent RRUL results with egress shaping at 100% if the encapsulation and per-packet overhead were set correctly. Increasing the per-packet overhead above the theoretical value did not affect latency or bandwidth (it should have affected bandwidth, but the change was too small to measure). Decreasing the per-packet overhead below the correct value noticeably increased the LULI during RRUL runs. The issue is that I did not collect enough runs to be certain about the LULI I measured, even though my current hypothesis is that the kernel does not account for the ethernet header on a pppoe interface… Also, this can partly be tested on the router itself with a bit of tc magic that someone once showed me, demonstrating that the kernel does account for the 14 bytes on ethernet interfaces; I just need to find my notes from that experiment again (I fear they were lost when my btrfs raid5 disintegrated… they call btrfs raid5 experimental for a reason ;) )

Best Regards
	Sebastian

>
> Thanks again
> Alan
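The overhead arithmetic in the P.S. can be sketched as follows. This is only an illustration: the 40-, 26-, and 14-byte figures come from this thread, the ATM cell quantisation is standard AAL5 behaviour, and the exact composition of the 40 bytes is deliberately not spelled out here.

```python
from math import ceil

# Per-packet overhead bookkeeping for the two shaping points discussed
# above. The 40- and 14-byte figures are the ones quoted in the thread.

ETHERNET_HEADER = 14   # bytes the kernel adds by itself on ge00
TOTAL_OVERHEAD = 40    # per-packet overhead on this ADSL2+/ATM link

# Shaping on pppoe-ge00: the kernel accounts for no header at all, so
# the shaper must be told the full per-packet overhead.
overhead_on_pppoe = TOTAL_OVERHEAD

# Shaping on ge00: the kernel already counts the 14-byte ethernet
# header, so only the remainder needs to be specified.
overhead_on_eth = TOTAL_OVERHEAD - ETHERNET_HEADER   # 26

def atm_cells(frame_bytes):
    """ATM carries each frame in 53-byte cells with 48 payload bytes
    each, so the on-wire size is quantised to whole cells (this is
    what tc's "linklayer atm" accounts for)."""
    return ceil(frame_bytes / 48)

# A 1500-byte packet plus 40 bytes of overhead needs 33 cells,
# i.e. 33 * 53 = 1749 bytes on the wire.
print(overhead_on_pppoe, overhead_on_eth,
      atm_cells(1500 + TOTAL_OVERHEAD) * 53)  # 40 26 1749
```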