[Cerowrt-devel] SQM and PPPoE, more questions than answers...

Dave Taht dave.taht at gmail.com
Wed Mar 18 23:11:32 EDT 2015

On Wed, Mar 18, 2015 at 7:43 PM, David Lang <david at lang.hm> wrote:
> On Wed, 18 Mar 2015, Alan Jenkins wrote:
>>> Once SQM on ge00 actually dives into the PPPoE packets and
>>> applies/tests u32 filters the LUL increases to be almost identical to
>>> pppoe-ge00’s if both ingress and egress classification are active and
>>> do work. So it looks like the u32 filters I naively set up are quite
>>> costly. Maybe there is a better way to set these up...
>> Later you mentioned testing for coupling with egress rate.  But you didn't
>> test coupling with classification!
>> I switched from simple.qos to simplest.qos, and that achieved the lower
>> latency on pppoe-wan.  So I think your naive u32 filter setup wasn't the
>> real problem.
>> I did think ECN wouldn't be applied on eth1, and that would be the cause
>> of the latency.  But disabling ECN didn't affect it.  See files 3 to 6:
>> https://www.dropbox.com/sh/shwz0l7j4syp2ea/AAAxrhDkJ3TTy_Mq5KiFF3u2a?dl=0
>> I also admit surprise at fq_codel working within 20%/10ms on eth1.  I
>> thought it'd really hurt, by breaking the FQ part.  Now I guess it doesn't.
>> I still wonder about ECN marking, though I didn't check my endpoint is using
>> ECN.
> ECN should never increase latency, if it has any effect it should improve
> latency because you slow down sending packets when some hop along the path
> is overloaded rather than sending the packets anyway and having them sit in
> a buffer for a while. This doesn't decrease actual throughput either
> (although if you are doing a test that doesn't actually wait for all the
> packets to arrive at the far end, it will look like it decreases throughput)

ECN does, provably, increase latency (and loss) for other, non-ECN-marked flows.

Not by a lot, but it does. In the case of a malignantly mis-marked flow, the
present codel AQM algorithm does pretty bad things to itself and to other
non-ECN-marked packets.

(I have fixes for codel. fq_codel doesn't have this problem; pie somewhat
has it.)
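
A rough way to see why marking costs other flows something in a single shared
queue: a dropped packet takes no link time, while an ECN-marked packet is still
transmitted, so everything queued behind it waits one more serialization time.
The sketch below is a toy FIFO calculation only, not the codel algorithm; the
function name, packet sizes, and link rate are made up for illustration.

```python
def wait_behind_ms(queued_sizes_bytes, rate_bps, signaled_index, use_ecn):
    """Time a newly arriving packet waits for a single FIFO to drain.

    The packet at signaled_index is the one the AQM picks for a congestion
    signal. With ECN it is marked and still transmitted; without ECN it is
    dropped and consumes no link time.
    """
    total_bytes = 0
    for i, size in enumerate(queued_sizes_bytes):
        if i == signaled_index and not use_ecn:
            continue  # dropped: never occupies the link
        total_bytes += size
    return total_bytes * 8 * 1000 / rate_bps


# Hypothetical numbers: four 1500-byte packets queued on a 1 Mbit/s uplink.
queue = [1500, 1500, 1500, 1500]
print(wait_behind_ms(queue, 1_000_000, signaled_index=0, use_ecn=False))
print(wait_behind_ms(queue, 1_000_000, signaled_index=0, use_ecn=True))
```

With these assumed numbers the difference is one packet time (12 ms at
1 Mbit/s), which is only noticeable on slow links. With per-flow queueing the
extra wait stays mostly in the marked flow's own queue.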

>>>> 3) SQM on pppoe-ge00 has a rough 20% higher egress rate than SQM on
>>>> ge00 (with ingress more or less identical between the two). Also 2)
>>>> and 3) do not seem to be coupled, artificially reducing the egress
>>>> rate on pppoe-ge00 to yield the same egress rate as seen on ge00
>>>> does not reduce the LULI to the ge00 typical 10ms, but it stays at
>>>> 20ms.
>>>> For this I also have no good hypothesis, any ideas?
>>> With classification fixed the difference in egress rate shrinks to
>>> ~10% instead of 20, so this partly seems related to the
>>> classification issue as well.

One of the things we really have to get around to doing is more high-rate
testing, and actually measuring how much latency the TCP flows are experiencing.
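
As a minimal sketch of what such a measurement could reduce to, assuming LULI
in the quoted text means the latency-under-load increase and assuming per-flow
RTT samples are already available (for example from a packet capture): the
function name and the sample values below are hypothetical.

```python
from statistics import median


def latency_under_load_increase_ms(idle_rtts_ms, loaded_rtts_ms):
    """Median RTT under load minus median RTT on the idle link, in ms."""
    if not idle_rtts_ms or not loaded_rtts_ms:
        raise ValueError("need at least one sample for each condition")
    return median(loaded_rtts_ms) - median(idle_rtts_ms)


# Hypothetical samples, not measurements from this thread.
idle = [20, 21, 19]
loaded = [30, 45, 28]
print(latency_under_load_increase_ms(idle, loaded))
```

The median is used here so a single outlier does not dominate; a real test
would probably also want percentiles for the loaded case.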

>> My tests look like simplest.qos gives a lower egress rate, but not as low
>> as eth1.  (Like 20% vs 40%).  So that's also similar.
>>>> So the current choice is either to accept a noticeable increase in
>>>> LULI (but note some years ago even an average of 20ms most likely
>>>> was rare in real life) or an equally noticeable decrease in
>>>> egress bandwidth…
>>> I guess it is back to the drawing board to figure out how to speed up
>>> the classification… and then revisit the PPPoE question again…
>> so maybe the question is actually classification vs. not?
>> + IMO slow asymmetric links don't want to lose more upload bandwidth than
>> necessary.  And I'm losing a *lot* in this test.
>> + As you say, having only 20ms excess would still be a big improvement.
>> We could ignore the bait of 10ms right now.
>> vs
>> - lowest latency I've seen testing my link. almost suspicious. looks close
>> to 10ms average, when the dsl rate puts a lower bound of 7ms on the average.
>> - fq_codel honestly works miracles already. classification is the knob
>> people had to use previously, who had enough time to twiddle it.
> That's what most people find when they try it. Classification doesn't result
> in throughput vs latency tradeoffs as much as it gives absolute priority to
> some types of traffic. But unless you are really up against your bandwidth
> limit, this seldom matters in the real world. As long as latency is kept
> low, everything works so you don't need to give VoIP priority over other
> traffic or things like that.


> David Lang
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel

Dave Täht
Let's make wifi fast, less jittery and reliable again!

