[Bloat] Tuning fq_codel: are there more best practices for slow connections? (<1mbit)

Sebastian Moeller moeller0 at gmx.de
Fri Nov 3 06:31:56 EDT 2017


Hi Yutaka,

thanks for the link. I believe this is quite interesting, but shaping the downstream down to <50% of the sync rate seems a bit drastic. With the link layer adjustment features that Linux has acquired in the meantime, my observation is that going to 80 to 90% of the sync speed should be sufficient for most usage patterns (if you have excessively many concurrent flows this might not be good enough). In that case (many concurrent flows) it might be worth looking into the development version of cake, as it has a mode that allows stricter ingress shaping (it tries to throttle so that the rate of incoming packets matches the defined bandwidth, instead of the usual shaping of the outgoing packet rate).
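
As a rough sketch, such an ingress-shaping setup could look something like the following (the device and IFB names are placeholders, the rate is just the ~10500 kbit/s ballpark discussed further down, and the exact keywords depend on the cake/tc build in use):

    # redirect incoming traffic from the WAN-facing interface to an IFB device
    ip link add ifb0 type ifb && ip link set ifb0 up
    tc qdisc add dev eno1 handle ffff: ingress
    tc filter add dev eno1 parent ffff: protocol all u32 match u32 0 0 \
        action mirred egress redirect dev ifb0
    # 'ingress' asks cake to shape on the incoming (pre-drop) packet rate
    tc qdisc replace dev ifb0 root cake bandwidth 10500kbit ingress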


Best Regards
	Sebastian


> On Nov 3, 2017, at 11:10, Yutaka <intruder_tkyf at yahoo.fr> wrote:
> 
> 
> Hi, Sebastian.
> 
> Thank you for your reply.
> I added the URL while I was reading :)
> 
> On 2017年11月03日 18:53, Sebastian Moeller wrote:
>> Hi Yutaka,
>> 
>> 
>>> On Nov 3, 2017, at 01:31, Yutaka <intruder_tkyf at yahoo.fr> wrote:
>>> 
>>> Hi, Sebastian.
>>> 
>>> 
>>> On 2017年11月03日 05:31, Sebastian Moeller wrote:
>>> 
>>>> Hi Yutaka,
>>>> 
>>>> 
>>>>> On Nov 2, 2017, at 17:58, Y <intruder_tkyf at yahoo.fr> wrote:
>>>>> 
>>>>> Hi, Moeller.
>>>>> 
>>>>> The formula for target is 1643 bytes / 810kbps = 0.015846836 s (≈ 16 ms).
>>>>> 
>>>>> That size includes the ATM linklayer padding.
>>>>> 
>>>>> 16ms plus 4ms is my sense of it :P
>>>>> 
>>>>> My connection is 12mbps/1mbps ADSL PPPoA line.
>>>>> and I set 7Mbps/810kbps for bypass router buffer.
>>>> 	That sounds quite extreme; on the uplink, with the proper link layer adjustments, you should be able to go up to 100% of the sync rate as reported by the modem (unless your ISP has another traffic shaper at a higher level). And going from 12 to 7 is also quite extreme, given that the ATM link layer adjustments will cost you another ~9% of bandwidth. Then again, 12/1 might be the contracted maximal rate; what are the sync rates as reported by your modem?
>>> Link speed is
>>> 11872 kbps download
>>> 832 kbps upload
>> 	Thanks. With proper link layer adjustments I would aim for 11872 * 0.9 = 10684.8 and 832 * 0.995 = 827.84; downstream shaping is a bit approximate (even though there is a feature in cake's development branch that promises to make it less approximate), so I would go to 90 or 85% of the sync bandwidth. As you know, Linux shapers (with the proper overhead specified) shape gross bandwidth, so due to ATM's 48/53 encoding the measurable goodput will be around 9% lower than one might expect:
>> 
>> 10685 * (48/53) * ((1478 - 2 - 20 - 20)/(1478 + 10)) = 9338.8
>> 828 * (48/53) * ((1478 - 2 - 20 - 20)/(1478 + 10)) = 723.7
>> This actually excludes the typical HTTP overhead of your web-based speedtest, but that should be in the noise. I realize what you did with the MTU/MSS ((1478 + 10) / 48 = 31, so for full-sized packets you have no ATM/AAL5 cell padding), clever; I never bothered to go to this level of detail, so respect!
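>> 
>> A quick back-of-the-envelope sketch of the same arithmetic (purely illustrative, using bc):
>> 
>>   SYNC_DOWN=11872 ; SYNC_UP=832        # sync rates reported by the modem, in kbit/s
>>   echo "$SYNC_DOWN * 0.90" | bc -l     # ~10685 kbit/s gross downstream shaper rate
>>   echo "$SYNC_UP * 0.995" | bc -l      # ~828 kbit/s gross upstream shaper rate
>>   # expected TCP/IPv4 goodput after ATM 48/53 expansion and PPPoA/IP/TCP headers
>>   echo "10685 * (48/53) * ((1478 - 2 - 20 - 20)/(1478 + 10))" | bc -l   # ~9339 kbit/s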
>> 
>>> Why did I reduce the download from 12 to 7? Because of this page; please see especially the download rate section.
>> 	Which page?
> I forgot to paste the page URL, sorry.
> http://tldp.org/HOWTO/ADSL-Bandwidth-Management-HOWTO/implementation.html
> 
>>> But I know that a download rate of around 11mbps can work :)
>>> And I will set 11mbps for download.
>> 	As stated above, I would aim for something in the range of 10500 initially and then test.
>> 
>> 
>>>>> I changed Target 27ms Interval 540ms as you say( down delay plus upload delay).
>>>> 	I could be out to lunch, but this large interval seems counter-intuitive. The idea (and please, anybody correct me if I am wrong) is that interval should be long enough for both endpoints to realize a drop/ECN marking; in essence that would be the RTT of a flow (plus a small add-on to allow for some variation). In practice you will need to set one interval for all flows, and empirically 100ms works well, unless most of your flows go to more remote places, in which case setting interval to the real RTT would be better. But an interval of 540ms seems quite extreme (unless you often use connections to hosts with only satellite links). Have you tried something smaller?
>>> I did try something smaller.
>>> I thought that the dropping rate was increasing.
>> 	My mental model for interval is that it is the reaction time you are willing to give a flow's endpoints before you drop more aggressively; if set too high, you might be trading more bandwidth for a higher latency under load (which is a valid trade-off as long as you make it consciously ;) ).
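>> 
>> To make that concrete for a link this slow (the values below are only an illustration: at ~800 kbit/s a full 1643-byte on-the-wire packet already takes roughly 16 ms to serialize, so target should not sit much below that, while interval can stay near a typical path RTT):
>> 
>>   tc qdisc replace dev eno1 parent 1:26 fq_codel limit 300 flows 256 quantum 300 \
>>       target 18ms interval 100ms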
>> 
>> 
>>>>> It works well  , now .
>>>> 	Could you post the output of "tc -d qdisc" and "tc -s qdisc", please, so I have a better idea what your configuration currently is?
>>>> 
>>>> Best Regards
>>>> 	Sebastian
>>> My dirty stat :P
>>> 
>>> [ippan at localhost ~]$ tc -d -s qdisc
>>> qdisc noqueue 0: dev lo root refcnt 2
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc htb 1: dev eno1 root refcnt 2 r2q 10 default 26 direct_packets_stat 0 ver 3.17 direct_qlen 1000
>>>  linklayer atm overhead -4 mpu 64 mtu 1478 tsize 128
>> So you are shaping on an ethernet device (eno1) but trying to adjust for a PPPoA, VC/Mux RFC-2364 link (since the kernel adds 14 bytes for ethernet interfaces, you specify -4 to get the desired IP+10; protocol overhead in bytes: PPP (2), ATM AAL5 SAR (8): total 10). But both MPU and MTU seem wrong to me.
>> For tcstab, the tcMTU parameter really does not need to match the real MTU, but it needs to be larger than the largest packet size you expect to encounter, so we default to 2047 since that is larger than the 48/53-expanded packet size. Together with tsize, tcMTU is used to create the look-up table that the kernel uses to translate from real packet size to estimated on-the-wire packet size; the default 2047, 128 will make a table that increments in units of 16 bytes (as (2047+1)/128 = 16), which correctly deals with the 48-byte quantisation that linklayer atm creates (48 = 3*16). Your values, (1478+1)/128 = 11.5546875, will be somewhat odd. And yes, the tcstab thing is somewhat opaque.
>> Finally, mpu 64 is correct for any ethernet-based transport (or rather any transport that uses full L2 ethernet frames including the frame check sequence), but most ATM links a) do not use the FCS (and hence are not bound to ethernet's 64-byte minimum) and b) your link does not use ethernet framing at all (as you can see from your overhead, which is smaller than the ethernet srcmac, dstmac and ethertype).
>> So I would set tcMPU to 0, tcMTU to 2047 and leave tsize at 128.
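>> 
>> As a sketch, with those values the stab part of the HTB root would look roughly like this (the device and everything after 'htb' are just placeholders matching your current setup):
>> 
>>   tc qdisc replace dev eno1 root handle 1: stab mtu 2047 tsize 128 mpu 0 overhead -4 linklayer atm \
>>       htb default 26
>> 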
>> Or I would give cake a try (it needs to be used in combination with a patched tc utility); cake can do its own overhead accounting, which is way simpler than tcstab (it should also be slightly more efficient and will deal with all possible packet sizes).
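>> 
>> For comparison, a minimal cake invocation for a PPPoA VC/Mux link might look roughly like this (rate and device are placeholders; 'overhead 10 atm' describes the 10 bytes of PPP+AAL5 overhead on top of each IP packet plus the ATM cell framing):
>> 
>>   tc qdisc replace dev eno1 root cake bandwidth 827kbit overhead 10 atm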
>> 
>> 
>>>  Sent 161531280 bytes 138625 pkt (dropped 1078, overlimits 331194 requeues 0)
>>>  backlog 1590b 1p requeues 0
>>> qdisc fq_codel 2: dev eno1 parent 1:2 limit 300p flows 256 quantum 300 target 5.0ms interval 100.0ms
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 260: dev eno1 parent 1:26 limit 300p flows 256 quantum 300 target 36.0ms interval 720.0ms
>>>  Sent 151066695 bytes 99742 pkt (dropped 1078, overlimits 0 requeues 0)
>>>  backlog 1590b 1p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 5997 ecn_mark 0
>>>   new_flows_len 1 old_flows_len 1
>>> qdisc fq_codel 110: dev eno1 parent 1:10 limit 300p flows 256 quantum 300 target 5.0ms interval 100.0ms
>>>  Sent 1451034 bytes 13689 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 106 drop_overlimit 0 new_flow_count 2050 ecn_mark 0
>>>   new_flows_len 1 old_flows_len 7
>>> qdisc fq_codel 120: dev eno1 parent 1:20 limit 300p flows 256 quantum 300 target 36.0ms interval 720.0ms
>>>  Sent 9013551 bytes 25194 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 2004 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc ingress ffff: dev eno1 parent ffff:fff1 ----------------
>>>  Sent 59600088 bytes 149809 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc htb 1: dev ifb0 root refcnt 2 r2q 10 default 26 direct_packets_stat 0 ver 3.17 direct_qlen 32
>>>  linklayer atm overhead -4 mpu 64 mtu 1478 tsize 128
>>>  Sent 71997532 bytes 149750 pkt (dropped 59, overlimits 42426 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc fq_codel 200: dev ifb0 parent 1:20 limit 300p flows 1024 quantum 300 target 27.0ms interval 540.0ms ecn
>>>  Sent 34641860 bytes 27640 pkt (dropped 1, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 1736 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc fq_codel 260: dev ifb0 parent 1:26 limit 300p flows 1024 quantum 300 target 27.0ms interval 540.0ms ecn
>>>  Sent 37355672 bytes 122110 pkt (dropped 58, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 8033 ecn_mark 0
>>>   new_flows_len 1 old_flows_len 2
>>> qdisc noqueue 0: dev virbr0 root refcnt 2
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc pfifo_fast 0: dev virbr0-nic root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> [ippan at localhost ~]$ tc -d -s qdisc
>>> qdisc noqueue 0: dev lo root refcnt 2
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc htb 1: dev eno1 root refcnt 2 r2q 10 default 26 direct_packets_stat 0 ver 3.17 direct_qlen 1000
>>>  linklayer atm overhead -4 mpu 64 mtu 1478 tsize 128
>> 	Same comments apply as above.
>> 
>>>  Sent 168960078 bytes 145643 pkt (dropped 1094, overlimits 344078 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc fq_codel 2: dev eno1 parent 1:2 limit 300p flows 256 quantum 300 target 5.0ms interval 100.0ms
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 260: dev eno1 parent 1:26 limit 300p flows 256 quantum 300 target 36.0ms interval 720.0ms
>>>  Sent 157686660 bytes 104157 pkt (dropped 1094, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 6547 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc fq_codel 110: dev eno1 parent 1:10 limit 300p flows 256 quantum 300 target 5.0ms interval 100.0ms
>>>  Sent 1465132 bytes 13822 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 106 drop_overlimit 0 new_flow_count 2112 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 6
>>> qdisc fq_codel 120: dev eno1 parent 1:20 limit 300p flows 256 quantum 300 target 36.0ms interval 720.0ms
>>>  Sent 9808286 bytes 27664 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 2280 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc ingress ffff: dev eno1 parent ffff:fff1 ----------------
>>>  Sent 62426837 bytes 155632 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc htb 1: dev ifb0 root refcnt 2 r2q 10 default 26 direct_packets_stat 0 ver 3.17 direct_qlen 32
>>>  linklayer atm overhead -4 mpu 64 mtu 1478 tsize 128
>>>  Sent 75349888 bytes 155573 pkt (dropped 59, overlimits 43545 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc fq_codel 200: dev ifb0 parent 1:20 limit 300p flows 1024 quantum 300 target 27.0ms interval 540.0ms ecn
>>>  Sent 37624117 bytes 30196 pkt (dropped 1, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 1967 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc fq_codel 260: dev ifb0 parent 1:26 limit 300p flows 1024 quantum 300 target 27.0ms interval 540.0ms ecn
>>>  Sent 37725771 bytes 125377 pkt (dropped 58, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 1643 drop_overlimit 0 new_flow_count 8613 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 2
>>> qdisc noqueue 0: dev virbr0 root refcnt 2
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc pfifo_fast 0: dev virbr0-nic root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> 
>>> I wrote my script according to this mailing list and the sqm-scripts.
>> Would you be willing to share your script?
>> 
>> Best Regards
>> 	Sebastian
>> 
>> 
>>> Thanks to Sebastian and all
>>> 
>>> Maybe this works without problems.
>>> From now on, I need to think about this more rigorously.
>>> 
>>> Yutaka.
>>>>> Thank you.
>>>>> 
>>>>> Yutaka.
>>>>> 
>>>>> On 2017年11月02日 17:25, Sebastian Moeller wrote:
>>>>>> Hi Y.
>>>>>> 
>>>>>> 
>>>>>>> On Nov 2, 2017, at 07:42, Y <intruder_tkyf at yahoo.fr> wrote:
>>>>>>> 
>>>>>>> hi.
>>>>>>> 
>>>>>>> My connection is 810kbps( <= 1Mbps).
>>>>>>> 
>>>>>>> This is my setting For Fq_codel,
>>>>>>> quantum=300
>>>>>>> 
>>>>>>> target=20ms
>>>>>>> interval=400ms
>>>>>>> 
>>>>>>> MTU=1478 (for PPPoA)
>>>>>>> I cannot compare well, but the latency is around 14ms-40ms.
>>>>>> 	Under full saturation, in theory you would expect the average latency to equal the sum of the upstream target and the downstream target (which in your case would be 20 + ???); in reality I often see something like 1.5 to 2 times the expected value (but I have never inquired any deeper, so that might be a measuring artifact)...
>>>>>> 
>>>>>> Best Regards
>>>>>> 
>>>>>> 
>>>>>>> Yutaka.
>>>>>>> 
>>>>>>> On 2017年11月02日 15:01, cloneman wrote:
>>>>>>>> I'm trying to gather advice for people stuck on older connections. It appears that having dedicated/micromanaged tc classes greatly outperforms the "no knobs" fq_codel approach for connections with slow upload speeds.
>>>>>>>> 
>>>>>>>> When running a single file upload @350kbps, I've observed the competing ICMP traffic quickly begin to drop (fq_codel) or be delayed considerably (under sfq). From what I read, the fq_codel tuning best practices page is not optimized for this scenario (<2.5mbps):
>>>>>>>> (https://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel/)
>>>>>>>> 
>>>>>>>> Of particular concern is that a no-knobs SFQ works better for me than an untuned codel (more delay but much less loss for small flows). People just flipping the fq_codel button on their router at these low speeds could be doing themselves a disservice.
>>>>>>>> 
>>>>>>>> I've toyed with increasing the target and this does solve the excessive drops. I haven't played with limit and quantum all that much.
>>>>>>>> 
>>>>>>>> My go-to solution for this would be different classes, a.k.a. traditional QoS. But wouldn't it be possible to tune fq_codel to punish the large flows 'properly' for this very low bandwidth scenario? Surely <1kb ICMP packets can squeeze through properly without being dropped if there is 350kbps available, provided the competing flow is managed correctly.
>>>>>>>> 
>>>>>>>> I could create a class filter by packet length, thereby moving ICMP/VoIP to its own tc class, but this goes against "no knobs"; it seems like I'm re-inventing the wheel of fair queuing - shouldn't the smallest flows never be delayed/dropped automatically?
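>>>>>>>> 
>>>>>>>> For illustration only: such a length-based filter would be the old wondershaper-style trick of matching on the IP total-length field, e.g. steering packets shorter than 128 bytes into a hypothetical priority class 1:10:
>>>>>>>> 
>>>>>>>>   tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
>>>>>>>>       match u16 0x0000 0xff80 at 2 flowid 1:10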
>>>>>>>> 
>>>>>>>> Lowering Quantum below 1500 is confusing, serving a fractional packet in a time interval?
>>>>>>>> 
>>>>>>>> Is there real value in tuning fq_codel for these connections or should people migrate to something else like nfq_codel?
>>>>>>>> 
> 


