From: Alan Jenkins
Date: Wed, 18 Mar 2015 22:14:09 +0000
To: Sebastian Moeller, cerowrt-devel
Subject: Re: [Cerowrt-devel] SQM and PPPoE, more questions than answers...
Message-ID: <5509F8B1.6090608@gmail.com>
In-Reply-To: <1895D16A-1B0F-48C7-B4B5-6FC84CA92F43@gmx.de>

Hi Seb

I tested shaping on eth1 vs pppoe-wan, as it applies to ADSL (on
Barrier Breaker + sqm-scripts). Maybe this is going back a bit & no
longer interesting to read. But it seemed suspicious & interesting
enough that I wanted to test it.

My conclusion was 1) I should stick with pppoe-wan, 2) the question
really means "do you want to disable classification?", and 3) I
personally want to preserve the upload bandwidth and accept slightly
higher latency.

On 15/10/14 01:03, Sebastian Moeller wrote:
> Hi All,
>
> some more testing: On Oct 12, 2014, at 01:12 , Sebastian Moeller
> wrote:
>> 1) SQM on ge00 does not show a working egress classification in the
>> RRUL test (no visible “banding”/stratification of the 4 different
>> priority TCP flows), while SQM on pppoe-ge00 does show this
>> stratification.
> Using tc filters u32 filter makes it possible to actually dive into
> PPPoE encapsulated ipv4 and ipv6 packets and perform classification
> on “pass-through” PPPoE packets (as encountered when starting SQM on
> ge00 instead of pppoe-ge00, if the latter actually handles the wan
> connection), so that one is solved (but see below).
>
>>
>> 2) SQM on ge00 shows better latency under load (LUL), the LUL
>> increases for ~2*fq_codel's target so 10ms, while SQM on pppoe-ge00
>> shows a LUL-increase (LULI) roughly twice as large or around 20ms.
>>
>> I have no idea why that is, if anybody has an idea please chime
>> in.

I saw the same, though with a larger difference in egress rate. See
the first three files here:

https://www.dropbox.com/sh/shwz0l7j4syp2ea/AAAxrhDkJ3TTy_Mq5KiFF3u2a?dl=0

[netperf-wrapper noob puzzle: most of the ping lines vanish part-way
through. Maybe I failed it somehow.]

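For anyone following along, here is a minimal Python sketch of what
"diving into" a PPPoE frame involves. It is not the u32 filter setup
from sqm-scripts; it only shows where the IP header sits behind the
6-byte PPPoE header and the 2-byte PPP protocol field on an untagged
Ethernet frame (no VLAN), and how the DSCP would be read from there.
The function name dscp_of_pppoe_frame is made up for illustration.

```python
import struct

ETH_HDR_LEN = 14                  # dst MAC + src MAC + EtherType
PPPOE_HDR_LEN = 6                 # ver/type, code, session id, length
PPP_PROTO_LEN = 2                 # PPP protocol field
ETHERTYPE_PPPOE_SESSION = 0x8864  # PPPoE session stage
PPP_IPV4 = 0x0021
PPP_IPV6 = 0x0057


def dscp_of_pppoe_frame(frame: bytes):
    """Return the DSCP of the IP packet inside a PPPoE session frame.

    Returns None if the frame is too short, is not a PPPoE session
    frame, or carries something other than IPv4/IPv6.
    Assumes an untagged Ethernet frame (hypothetical helper).
    """
    ip = ETH_HDR_LEN + PPPOE_HDR_LEN + PPP_PROTO_LEN  # offset of IP header
    if len(frame) < ip + 2:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_PPPOE_SESSION:
        return None
    (ppp_proto,) = struct.unpack_from("!H", frame, ETH_HDR_LEN + PPPOE_HDR_LEN)
    if ppp_proto == PPP_IPV4:
        # IPv4: second header byte is TOS; DSCP is its top 6 bits.
        return frame[ip + 1] >> 2
    if ppp_proto == PPP_IPV6:
        # IPv6: version(4 bits) | traffic class(8 bits) | flow label(20 bits),
        # so the traffic class straddles the first two bytes.
        tclass = ((frame[ip] & 0x0F) << 4) | (frame[ip + 1] >> 4)
        return tclass >> 2
    return None
```

The point is only that the match has to look 8 bytes further into the
packet than it would on the PPP interface, and has to branch on the
PPP protocol field before it knows which byte holds the DSCP.
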
> Once SQM on ge00 actually dives into the PPPoE packets and
> applies/tests u32 filters the LUL increases to be almost identical to
> pppoe-ge00’s if both ingress and egress classification are active and
> do work. So it looks like the u32 filters I naively set up are quite
> costly. Maybe there is a better way to set these up...

Later you mentioned testing for coupling with egress rate. But you
didn't test coupling with classification!

I switched from simple.qos to simplest.qos, and that achieved the
lower latency on pppoe-wan. So I think your naive u32 filter setup
wasn't the real problem.

I did think ECN wouldn't be applied on eth1, and that this would be
the cause of the latency. But disabling ECN didn't affect it. See
files 3 to 6:

https://www.dropbox.com/sh/shwz0l7j4syp2ea/AAAxrhDkJ3TTy_Mq5KiFF3u2a?dl=0

I also admit surprise at fq_codel working within 20%/10ms on eth1. I
thought it'd really hurt, by breaking the FQ part. Now I guess it
doesn't. I still wonder about ECN marking, though I didn't check
whether my endpoint is using ECN.

>>
>> 3) SQM on pppoe-ge00 has a roughly 20% higher egress rate than SQM on
>> ge00 (with ingress more or less identical between the two). Also 2)
>> and 3) do not seem to be coupled, artificially reducing the egress
>> rate on pppoe-ge00 to yield the same egress rate as seen on ge00
>> does not reduce the LULI to the ge00 typical 10ms, but it stays at
>> 20ms.
>>
>> For this I also have no good hypothesis, any ideas?
>
> With classification fixed the difference in egress rate shrinks to
> ~10% instead of 20, so this partly seems related to the
> classification issue as well.

My tests look like simplest.qos gives a lower egress rate, but not as
low as eth1 (like 20% vs 40%). So that's also similar.

>> So the current choice is either to accept a noticeable increase in
>> LULI (but note some years ago even an average of 20ms most likely
>> was rare in the real life) or an equally noticeable decrease in
>> egress bandwidth…
>
> I guess it is back to the drawing board to figure out how to speed up
> the classification… and then revisit the PPPoE question again…

So maybe the question is actually classification vs. not?

+ IMO slow asymmetric links don't want to lose more upload bandwidth
  than necessary. And I'm losing a *lot* in this test.
+ As you say, having only 20ms excess would still be a big
  improvement. We could ignore the bait of 10ms right now.

vs

- lowest latency I've seen testing my link. almost suspicious. looks
  close to 10ms average, when the dsl rate puts a lower bound of 7ms
  on the average.
- fq_codel honestly works miracles already. classification is the
  knob people had to use previously, who had enough time to twiddle
  it.
- on netperf-wrapper plots the "banding" doesn't look brilliant on
  slow links anyway

> Regards Sebastian
>
>>
>> Best Regards Sebastian
>>
>> P.S.: It turns out, at least on my link, that for shaping on
>> pppoe-ge00 the kernel does not account for any header
>> automatically, so I need to specify a per-packet-overhead (PPOH) of
>> 40 bytes (on an ADSL2+ link with ATM linklayer); when shaping on
>> ge00 however (with the kernel still terminating the PPPoE link to
>> my ISP) I only need to specify a PPOH of 26 as the kernel already
>> adds the 14 bytes for the ethernet header…
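
A rough sketch of the arithmetic behind the P.S., assuming the usual
ATM framing (48 payload bytes carried in each 53-byte cell, last cell
padded) and taking the 40 vs 26 byte figures above at face value. The
function name atm_wire_bytes and the 1492-byte example packet are
mine, for illustration only.

```python
ATM_CELL_PAYLOAD = 48  # payload bytes per ATM cell
ATM_CELL_SIZE = 53     # payload plus 5-byte cell header


def atm_wire_bytes(accounted_len: int, overhead: int) -> int:
    """Bytes on the ATM link for one packet.

    accounted_len: packet length as the shaper sees it on that interface
    overhead:      per-packet overhead configured by the user
    """
    total = accounted_len + overhead
    cells = -(-total // ATM_CELL_PAYLOAD)  # ceiling division: last cell is padded
    return cells * ATM_CELL_SIZE


ip_len = 1492  # hypothetical full-size packet on a PPPoE link

# Shaping on the PPP interface: nothing added by the kernel, overhead 40.
on_ppp = atm_wire_bytes(ip_len, 40)

# Shaping on the Ethernet interface: per the P.S. the kernel already
# counts the 14-byte Ethernet header, so only 26 more are configured.
on_eth = atm_wire_bytes(ip_len + 14, 26)

print(on_ppp, on_eth)  # 1696 1696
```

Both settings describe the same 32 cells on the wire, which is why
they should shape identically if the overhead is the only difference.
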