[Cerowrt-devel] SQM and PPPoE, more questions than answers...

Thu Mar 19 04:37:41 EDT 2015

Hi David,

On Mar 19, 2015, at 03:43 , David Lang <david at lang.hm> wrote:

> On Wed, 18 Mar 2015, Alan Jenkins wrote:
> 
>>> Once SQM on ge00 actually dives into the PPPoE packets and
>>> applies/tests u32 filters the LUL increases to be almost identical to
>>> pppoe-ge00’s if both ingress and egress classification are active and
>>> do work. So it looks like the u32 filters I naively set up are quite
>>> costly. Maybe there is a better way to set these up...
>> 
>> Later you mentioned testing for coupling with egress rate.  But you didn't test coupling with classification!
>> 
>> I switched from simple.qos to simplest.qos, and that achieved the lower latency on pppoe-wan.  So I think your naive u32 filter setup wasn't the real problem.
>> 
>> I did think ECN wouldn't be applied on eth1, and that would be the cause of the latency.  But disabling ECN didn't affect it.  See files 3 to 6:
>> 
>> https://www.dropbox.com/sh/shwz0l7j4syp2ea/AAAxrhDkJ3TTy_Mq5KiFF3u2a?dl=0
>> 
>> I also admit surprise at fq_codel working within 20%/10ms on eth1.  I thought it'd really hurt, by breaking the FQ part.  Now I guess it doesn't.  I still wonder about ECN marking, though I didn't check my endpoint is using ECN.
> 
> ECN should never increase latency; if it has any effect, it should improve latency, because you slow down sending packets when some hop along the path is overloaded rather than sending the packets anyway and having them sit in a buffer for a while. This doesn't decrease actual throughput either (although if you are doing a test that doesn't actually wait for all the packets to arrive at the far end, it will look like it decreases throughput).
> 
>>>> 3) SQM on pppoe-ge00 has a rough 20% higher egress rate than SQM on
>>>> ge00 (with ingress more or less identical between the two). Also 2)
>>>> and 3) do not seem to be coupled, artificially reducing the egress
>>>> rate on pppoe-ge00 to yield the same egress rate as seen on ge00
>>>> does not reduce the LULI to the ge00 typical 10ms, but it stays at
>>>> 20ms.
>>>> For this I also have no good hypothesis, any ideas?
>>> With classification fixed the difference in egress rate shrinks to
>>> ~10% instead of 20, so this partly seems related to the
>>> classification issue as well.
>> 
>> My tests look like simplest.qos gives a lower egress rate, but not as low as eth1.  (Like 20% vs 40%).  So that's also similar.
>> 
>>>> So the current choice is either to accept a noticeable increase in
>>>> LULI (but note some years ago even an average of 20ms most likely
>>>> was rare in real life) or an equally noticeable decrease in
>>>> egress bandwidth…
>>> I guess it is back to the drawing board to figure out how to speed up
>>> the classification… and then revisit the PPPoE question again…
>> 
>> so maybe the question is actually classification v.s. not?
>> 
>> + IMO slow asymmetric links don't want to lose more upload bandwidth than necessary.  And I'm losing a *lot* in this test.
>> + As you say, having only 20ms excess would still be a big improvement.  We could ignore the bait of 10ms right now.
>> 
>> vs
>> 
>> - lowest latency I've seen testing my link. almost suspicious. looks close to 10ms average, when the dsl rate puts a lower bound of 7ms on the average.
>> - fq_codel honestly works miracles already. classification is the knob people had to use previously, who had enough time to twiddle it.
> 
> That's what most people find when they try it. Classification doesn't result in throughput vs latency tradeoffs as much as it gives absolute priority to some types of traffic. But unless you are really up against your bandwidth limit, this seldom matters in the real world. As long as latency is kept low, everything works so you don't need to give VoIP priority over other traffic or things like that.

	But note, not all traffic is equal ;) Take the example from the mail Alan was quoting: when shaping on an ethernet interface that carries PPPoE traffic, the shaper sees all packets, including the packets PPP uses to establish and maintain the link. I would argue that these actually need guaranteed delivery, as dropping them can take down the PPP link and hence the internet connection. I admit it is rare for home users to actually encounter such drop-averse packets, but they at least justify the use of classification/priorities. Whether VoIP makes the cut really depends on its drop probability on each end link. (I just want to note that commercial VoIP systems at least use precedence and EF markings on their packets, so classification of these a) is easy and b) is actually performed by many ISPs' home router offerings for that ISP's brand of VoIP.)
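
	To make the two cases above concrete, here is a minimal Python sketch of what such a classifier has to look at inside a PPPoE session frame. It is only an illustration of the byte offsets, not the u32 filters or the SQM scripts discussed in this thread. Assumptions: untagged ethernet (no VLAN header), an uncompressed two-byte PPP protocol field, IPv4 only, and PPPoE discovery frames (EtherType 0x8863) are ignored. The names classify() and make_frame() are hypothetical helpers invented for this example.

```python
import struct

ETH_P_PPP_SES = 0x8864  # EtherType of PPPoE session frames
PPP_IPV4 = 0x0021       # PPP protocol number for IPv4
DSCP_EF = 46            # Expedited Forwarding code point

def classify(frame: bytes) -> str:
    """Return 'priority' or 'best-effort' for a raw, untagged ethernet frame."""
    if len(frame) < 22:
        return "best-effort"
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != ETH_P_PPP_SES:
        return "best-effort"
    # 14 bytes ethernet header + 6 bytes PPPoE header, then the PPP protocol field
    ppp_proto = struct.unpack("!H", frame[20:22])[0]
    # 0xc000..0xffff are link-layer control protocols (LCP, PAP, CHAP, ...),
    # i.e. the packets PPP needs to establish and keep the link alive
    if ppp_proto >= 0xC000:
        return "priority"
    # For IPv4 the TOS/DS byte is the second byte of the IP header (offset 23)
    if ppp_proto == PPP_IPV4 and len(frame) >= 24:
        dscp = frame[23] >> 2
        if dscp == DSCP_EF:
            return "priority"
    return "best-effort"

def make_frame(ppp_proto: int, payload: bytes, ethertype: int = ETH_P_PPP_SES) -> bytes:
    """Build a synthetic frame for testing (zeroed MAC addresses, session id 1)."""
    eth = bytes(12) + struct.pack("!H", ethertype)
    pppoe = struct.pack("!BBHH", 0x11, 0x00, 1, len(payload) + 2)
    return eth + pppoe + struct.pack("!H", ppp_proto) + payload

if __name__ == "__main__":
    lcp_echo = make_frame(0xC021, bytes([9, 1, 0, 8, 0, 0, 0, 0]))
    voip = make_frame(PPP_IPV4, bytes([0x45, 0xB8]) + bytes(18))
    bulk = make_frame(PPP_IPV4, bytes([0x45, 0x00]) + bytes(18))
    for name, frame in (("LCP echo", lcp_echo), ("EF-marked", voip), ("unmarked", bulk)):
        print(name, "->", classify(frame))
```

	The point is that both checks are cheap fixed-offset comparisons once the 8 bytes of PPPoE/PPP header are accounted for; whether a real tc filter set doing the same is as cheap is exactly the open question from earlier in the thread.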

Best Regards
	Sebastian

> 
> David Lang