* [Cerowrt-devel] Equivocal results with using 3.10.28-14
@ 2014-02-24 14:36 Rich Brown
2014-02-24 14:56 ` Aaron Wood
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Rich Brown @ 2014-02-24 14:36 UTC (permalink / raw)
To: cerowrt-devel
CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
- I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
- When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
Experimental setup:
I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
I welcome people's thoughts for other tests/adjustments/etc.
Rich Brown
Hanover, NH USA
PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 14:36 [Cerowrt-devel] Equivocal results with using 3.10.28-14 Rich Brown
@ 2014-02-24 14:56 ` Aaron Wood
2014-02-25 13:09 ` Rich Brown
2014-02-24 15:24 ` Fred Stratton
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Aaron Wood @ 2014-02-24 14:56 UTC (permalink / raw)
To: Rich Brown; +Cc: cerowrt-devel
Do you have the latest (head) version of netperf and netperf-wrapper? Some changes were made to both that give better UDP results.
-Aaron
On Mon, Feb 24, 2014 at 3:36 PM, Rich Brown <richb.hanover@gmail.com> wrote:
>
> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it
> has two other effects:
>
> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL
> provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>
> - When I adjust the SQM parameters to get close to those numbers, I get
> increasing levels of packet loss (5-8%) during a concurrent ping test.
>
> So my question to the group is whether this behavior makes sense: that we
> can have low latency while losing ~10% of the link capacity, or that
> getting close to the link capacity should induce large packet loss...
>
> Experimental setup:
>
> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps
> down, 864 kbps up. Theoretically, I should be able to tell SQM to use
> numbers a bit lower than those values, with an ATM plus header overhead
> with default settings.
>
> I have posted the results of my netperf-wrapper trials at
> http://richb-hanover.com - There are a number of RRUL charts, taken with
> different link rates configured, and with different link layers.
>
> I welcome people's thoughts for other tests/adjustments/etc.
>
> Rich Brown
> Hanover, NH USA
>
> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet
> connectivity. I must have screwed up my local configuration - I was doing
> it quickly - so I rolled back to 3.10.28.14.
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 14:36 [Cerowrt-devel] Equivocal results with using 3.10.28-14 Rich Brown
2014-02-24 14:56 ` Aaron Wood
@ 2014-02-24 15:24 ` Fred Stratton
2014-02-24 22:02 ` Sebastian Moeller
2014-02-24 15:51 ` Dave Taht
2014-02-24 21:54 ` Sebastian Moeller
3 siblings, 1 reply; 14+ messages in thread
From: Fred Stratton @ 2014-02-24 15:24 UTC (permalink / raw)
To: Rich Brown, cerowrt-devel
How are you measuring the link speed?
With SQM enabled, I have speedtest.net results far below the values at
which the gateway syncs.
If the gateway syncs at 12000/1000, the speedtest figures are 9500/850.
The performance I obtain with streaming video is very good, tweaking the extra settings in SQM on 3.10.28-16.
I am sure you are aware that you will never achieve the values quoted by
the ISP. How long is your line? Downstream attenuation is a proxy for
this. Are you using ADSL2+, or some other protocol? Does the device even
tell you?
On 24/02/14 14:36, Rich Brown wrote:
> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>
> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>
> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>
> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>
> Experimental setup:
>
> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>
> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>
> I welcome people's thoughts for other tests/adjustments/etc.
>
> Rich Brown
> Hanover, NH USA
>
> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 14:36 [Cerowrt-devel] Equivocal results with using 3.10.28-14 Rich Brown
2014-02-24 14:56 ` Aaron Wood
2014-02-24 15:24 ` Fred Stratton
@ 2014-02-24 15:51 ` Dave Taht
2014-02-24 16:14 ` Dave Taht
2014-02-24 21:54 ` Sebastian Moeller
3 siblings, 1 reply; 14+ messages in thread
From: Dave Taht @ 2014-02-24 15:51 UTC (permalink / raw)
To: Rich Brown; +Cc: cerowrt-devel
On Mon, Feb 24, 2014 at 9:36 AM, Rich Brown <richb.hanover@gmail.com> wrote:
>
> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>
> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
0) Try the tcp_upload, tcp_download or tcp_bidir tests to get
results closer to what your provider claims.
Since your plots are pretty sane, you can get cleaner ones by using
the 'totals' plot type and/or comparing multiple runs to get a CDF:
-p totals or -p icmp (there are a few different plot types; see --list-plots)
-i somerun.json.gz -i somerun2.json.gz
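For example (the filenames here are just placeholders), two saved runs can be
compared on a single totals plot with:
netperf-wrapper -i somerun.json.gz -i somerun2.json.gz -p totals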
1) Is http://richb-hanover.com/wp-content/uploads/2014/02/6854-777-dflt-sqm-disabled1.png
your baseline without SQM?
If so, why compare the provider's stated rate with the measured rate
with/without SQM? The latter are two measures of the truth - one with and
one without a change - versus a provider's claim for link rate that doesn't
account for real packet dynamics.
2) The netperf reporting interval is too high to get good measurements
below a few mbit, so you kind of have to give up on the upload chart at
these rates (the totals chart is clearer).
Note that the TCP ACKs are invisible - you are getting >6 mbit down and
sending back approximately 150 kbit in ACKs, which we can't easily measure.
The overhead in the measurement streams is relative to the RTT as well.
I'd really like to get to a test that emulated TCP and got a fully
correct measurement.
3) Generally, using a larger fq_codel target will give you better upload
throughput and better utilization at these rates; try target 40ms as a start.
We've embedded a version of the calculation in the latest cero build attempts
(but other stuff is broken).
nfq_codel also seems to give a better balance between uploads and downloads
at low rates, again with a larger target.
It looks like overhead 44 is about right, and your first set of charts looks
about right.
>
> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
That shows the pings are now accruing delay.
>
> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
You never had the 10% in the first place.
>
> Experimental setup:
>
> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>
> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>
> I welcome people's thoughts for other tests/adjustments/etc.
>
> Rich Brown
> Hanover, NH USA
>
> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
Manually adjust the target.
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 15:51 ` Dave Taht
@ 2014-02-24 16:14 ` Dave Taht
2014-02-24 16:38 ` Aaron Wood
0 siblings, 1 reply; 14+ messages in thread
From: Dave Taht @ 2014-02-24 16:14 UTC (permalink / raw)
To: Rich Brown; +Cc: cerowrt-devel
On Mon, Feb 24, 2014 at 10:51 AM, Dave Taht <dave.taht@gmail.com> wrote:
> On Mon, Feb 24, 2014 at 9:36 AM, Rich Brown <richb.hanover@gmail.com> wrote:
>>
>> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>>
>> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>
> 0) try the tcp_upload or tcp_download or tcp_bidir tests to get
> results closer to what your provider claims.
>
> since your plots are pretty sane, you can get cleaner ones with using
> the 'totals' plot type
> and/or comparing multiple runs to get a cdf
>
> -p totals or -p icmp (theres a few different ones, --list-plots
>
> -i somerun.json.gz -i somerun2.json.gz
>
>
> 1) http://richb-hanover.com/wp-content/uploads/2014/02/6854-777-dflt-sqm-disabled1.png
>
> is your baseline without SQM?
>
> If so why do you compare the providers stated rate...
>
> with the measured rate with/without SQM?
>
> These are two measures of the truth - one with and without a change.
>
> Vs a providers claim for link rate that doesn't account for real
> packet dynamics.
I awoke mildly grumpy this morning, sorry. The sqm-disabled link above shows
you getting less than a mbit down under the provider's default settings.
So rather than saying you lose 10% of link bandwidth relative to the stated
ISP specification, I prefer to think you are getting 6x more usable bandwidth
by using SQM, and somewhere around 1/25th of the latency, or less.
Making TCP's congestion avoidance work rapidly and avoiding bursty packet
loss leads to more usable bandwidth.
> 2) the netperf reporting interval is too high to get good measurements
> at below a few
> mbit, so you kind of have to give up on the upload chart at these
> rates. (totals chart is
> clearer)
>
> Note that the tcp acks are invisible - you are getting >6mbit down,
> and sending back approximately
> 150kbit in acks which we can't easily measure. The overhead in the
> measurement streams is
> relative to the RTT as well.
>
> I'd really like to get to a test that emulated tcp and got a fully
> correct measurement.
>
> 3) Generally using a larger fq_codel target will give you better
> upload throughput and
> better utiliziation at these rates. try target 40ms as a start. We've
> embedded a version
> of the calculation in the latest cero build attempts (but other stuff is broke)
>
> nfq_codel seems also do to give a better balance between up and
> downloads at low rates,
> also with a larger target.
>
> it looks like overhead 44 is about right and your first set of charts
> about right.
So if you could repeat your first set of tests, changing the target to at
least 40ms on the upload and trying both nfq_codel and fq_codel, you'll be
getting somewhere.
nfq_codel behaves more like SFQ, and is probably closer to what more people
want at these speeds.
>
>
>
>>
>> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>
> Shows the pings are now accruing delay.
>
>>
>> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>
> You never had the 10% in the first place.
>
>>
>> Experimental setup:
>>
>> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>>
>> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>>
>> I welcome people's thoughts for other tests/adjustments/etc.
>>
>> Rich Brown
>> Hanover, NH USA
>>
>> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
>
>
> manually adjust the target.
>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 16:14 ` Dave Taht
@ 2014-02-24 16:38 ` Aaron Wood
2014-02-24 16:47 ` Dave Taht
0 siblings, 1 reply; 14+ messages in thread
From: Aaron Wood @ 2014-02-24 16:38 UTC (permalink / raw)
To: Dave Taht; +Cc: cerowrt-devel
Rich,
One thing that helped considerably on my DSL link (21000/1200) was turning on
the Link Layer Adaptation. With that, and efq_codel, I've been very happy with
the (nearly non-existent) latency. Yeah, I lost a couple percent off the top,
but the behavior is better. Although I was starting from a much better base,
given that Free.fr's DSL modems are already using either fq_codel or RED (I'm
not sure of the specifics, but I think Dave Taht has gotten them from Free.fr
in the past).
Dave, is the 40ms target the buffer latency target, or how often the drop
rates are recomputed?
-Aaron
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 16:38 ` Aaron Wood
@ 2014-02-24 16:47 ` Dave Taht
0 siblings, 0 replies; 14+ messages in thread
From: Dave Taht @ 2014-02-24 16:47 UTC (permalink / raw)
To: Aaron Wood; +Cc: cerowrt-devel
On Mon, Feb 24, 2014 at 11:38 AM, Aaron Wood <woody77@gmail.com> wrote:
> Rich,
>
> One thing that helped considerably on my my DSL link (21000/1200), was
> turning on the Link Layer Adaptation. With that, and efq_codel, I've been
> very happy with the (nearly non-existent) latency. Yeah, I lost a couple
> percent off the top, but the behavior is better. Although I was starting
> from a much better base given that Free.fr's DSL modems already using either
> fq_codel or RED (I'm not sure the specifics, but I think Dave Taht's gotten
> them from Free.fr in the past).
Free deployed fq_codel on the Revolution V6 box 1.5 years back. They found it
necessary to fiddle with the target, and shared with me nearly all the details
of their setup and their formula for target, which I have embedded here:
http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html
Despite testing a lot on your line, we didn't do much better than they did...
> Dave, the 40ms target is the buffer latency target? or how often the drop
> rates are recomputed?
It is the starting point for the drop scheduler to start thinking
about dropping some packets.
> or how often the drop
> rates are recomputed?
That's a RED or PIE -thinking way of looking at the problem,
recomputing a random number across a fixed interval.
the drop rates in codel are computed on a decreasing interval based on
a invsqrt control law that is much smoother than pie or red.
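As a rough illustration of that control law (a conceptual sketch only, not the
kernel code; 100ms is codel's usual default interval):

from math import sqrt

interval_ms = 100.0          # codel's default interval
t_ms = 0.0                   # time of the first drop after entering the dropping state
for count in range(1, 7):
    print("drop %d at ~%.0f ms" % (count, t_ms))
    t_ms += interval_ms / sqrt(count)   # the gap to the next drop shrinks as 1/sqrt(count)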
>
> -Aaron
>
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 14:36 [Cerowrt-devel] Equivocal results with using 3.10.28-14 Rich Brown
` (2 preceding siblings ...)
2014-02-24 15:51 ` Dave Taht
@ 2014-02-24 21:54 ` Sebastian Moeller
2014-02-24 22:40 ` Sebastian Moeller
3 siblings, 1 reply; 14+ messages in thread
From: Sebastian Moeller @ 2014-02-24 21:54 UTC (permalink / raw)
To: Rich Brown; +Cc: cerowrt-devel
Hi Rich,
On Feb 24, 2014, at 15:36 , Rich Brown <richb.hanover@gmail.com> wrote:
>
> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>
> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
Okay, that sounds like a rather large bandwidth sacrifice, but let's work through what we can actually expect to see on your link, to get a better hypothesis to test against.
0) the raw line rates as presented by your modem:
DOWN [kbps]: 7616
UP [kbps]: 864
1) Let's start with the reported sync rates: the sync rates of the modem (that Rich graciously sent me privately) also cover the bytes used for forward error correction, which are not available for ATM payload and therefore reduce the usable sync rate. It looks like K reports the number of data bytes per DMT frame, while R denotes the number of FEC bytes per DMT frame. From my current understanding K is the usable part of the K+R total, so with K(down) = 239 and R(down) = 16 (and K(up) = 28 and R(up) = 0), 16 out of every 255 bytes on the downlink are FEC bytes (and zero on your uplink), so you seem to lose 100*16/(239+16) = 6.27% to forward error correction on your downlink. In other words the usable DSL rates are:
DOWN [kbps]: 7616 * (1-(16/(239+16))) = 7138.13
UP [kbps]: 864 * (1-(0/(28+0))) = 864
2) ATM framing: For the greater group I think it is worth remembering that the ATM cell train the packets get transferred over uses a 48-bytes-of-payload-in-53-byte-cells encapsulation, so even if the ATM encapsulation had no further quirks (it does) you could at best expect 100*48/53 = 90.57% of the sync rate to show up as IP throughput.
So in your case:
downlink: 7616 * (1-(16/(239+16))) * (48/53) = 6464.72
uplink: 864 * (1-(0/(28+0))) * (48/53) = 782.49
3) Per-packet fixed overhead: each packet also drags in some overhead for all the headers (some, like ATM and ethernet headers, sit on top of the MTU; some, like the PPPoE headers or potential VLAN tags, reduce your usable MTU). I assume that with PPPoE on your link the MTU is 1492 (the PPPoE headers are 8 bytes) and you have a total of 40 bytes of overhead, so packets are at most 1492+40 = 1532 bytes on the wire, and this is the reference size: (1 - (1492/1532)) * 100 = 2.61%, i.e. you lose 2.6% just to the overheads. (Since this overhead is fixed it weighs more heavily on small packets: a 64-byte packet only reaches 100*64/(64+40) = 61.5% of the expected rate. This is not specific to DSL - ethernet has fixed headers too - it is just that with most DSL encapsulation schemes the overhead mushrooms.) Let's assume that netperf tries to use maximally full packets for its TCP streams, so we get:
downlink: 7616 * (1-(16/(239+16))) * (48/53) * (1492/1532) = 6295.93
uplink: 864 * (1-(0/(28+0))) * (48/53) * (1492/1532) = 762.06
4) Per-packet variable overhead: now the dark horse comes in ;) - the variable padding caused by each IP packet being sent in a whole number of ATM cells. The worst case is 47 bytes of padding in the last cell (actually the padding gets spread over the last two cells, but the principle remains the same; did I mention quirky in connection with ATM already? ;) ). So for large packets, depending on size, we have an additional 0 to 47 bytes of overhead, roughly 47/1500 = 3%.
For your link with 1492-byte MTU packets (required to make room for the 8-byte PPPoE header) we have (1492+40)/48 = 31.92, so we need 32 ATM cells, resulting in (48*32) - (1492+40) = 4 bytes of padding:
downlink: 7616 * (1-(16/(239+16))) * (48/53) * (1492/1536) = 6279.54
uplink: 864 * (1-(0/(28+0))) * (48/53) * (1492/1536) = 760.08
5) Stuff that netperf does not report: netperf will not see any ACK packets, but we can try to estimate those (if anybody sees a flaw in this reasoning please holler). I assume that typically we send one ACK per two data packets, so first estimate the number of MTU-sized packets we could maximally send per second:
downlink: 6279.54 / (1536*8/1000) = 511.03 packets per second
uplink: 760.08 / (1536*8/1000) = 61.86 packets per second
Now an ACK packet is rather small (40 bytes without TCP timestamps, 52 with), but with overhead and cell padding 40+40 = 80 bytes results in two cells worth 96 bytes of payload (52+40 = 92, so also two cells, just with less padding), so the relevant size of our ACKs is 96 bytes. I do not know about your system, but mine sends one ACK per two data packets (I think), so let's fold this into the calculation by assuming each data packet already carries the ACK data, i.e. each packet is 48 bytes longer:
downlink: 7616 * (1-(16/(239+16))) * (48/53) * (1492/(1536+48)) = 6089.25
uplink: 864 * (1-(0/(28+0))) * (48/53) * (1492/(1536+48)) = 737.04
6) More stuff that does not show up in netperf-wrapper's TCP averages: all the ICMP and UDP packets for the latency probes are not accounted for, yet consume bandwidth as well. The UDP probes in your experiments all stop pretty quickly, if they start at all, so we can ignore those. The ICMP pings come at 5 per second and cost 56 bytes default ping payload plus 8 bytes ICMP header plus 20 bytes IPv4 header plus 40 bytes overhead, so 56+8+20+40 = 124 bytes, resulting in 3 ATM cells or 3*48 = 144 bytes; 144*8*5/1000 = 5.76 kbps, which we can probably ignore here.
Overall it looks like your actual measured results are pretty close to the maximum we can expect, at least in the download direction; looking at the upstream plots it is really not clear what the cumulative rate actually is, but the order of magnitude looks about right. I really wish we all could switch to ethernet or fiber optics soon, so the calculation of the expected maximum would be much easier…
Note: if you shape down to below the rates calculated in 1), use the shaped rates as inputs for the further calculations. Also note that activating the ATM link-layer option in SQM will take care of 2), 3) and 4) independent of whether your link actually suffers from ATM in the first place, so activating these options on a fiber link will cause the same apparent bandwidth waste…
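For anyone who wants to redo this arithmetic for their own line, here is a minimal Python sketch of steps 1) through 4) above (the function name and defaults are just for illustration; the K/R values, the 40-byte per-packet overhead and the 1492-byte MTU are the assumptions used in this post - substitute your own):

import math

def expected_goodput_kbps(sync_kbps, k_bytes, r_bytes, mtu=1492, overhead=40):
    fec = 1 - r_bytes / float(k_bytes + r_bytes)   # 1) share left after FEC bytes
    atm = 48.0 / 53.0                              # 2) ATM 48-in-53 cell tax
    cells = math.ceil((mtu + overhead) / 48.0)     # 4) whole ATM cells per full-size packet
    payload = mtu / (cells * 48.0)                 # 3)+4) MTU share of the padded cell train
    return sync_kbps * fec * atm * payload

print(expected_goodput_kbps(7616, 239, 16))        # downlink: ~6279 kbps
print(expected_goodput_kbps(864, 28, 0))           # uplink:   ~760 kbps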
Best Regards
Sebastian
>
> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>
> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>
> Experimental setup:
>
> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>
> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
From your website:
Note: I don’t know why the upload charts show such fragmentary data.
This is because netperf-wrapper works with a fixed step size (from netperf-wrapper --help: -s STEP_SIZE, --step-size=STEP_SIZE: Measurement data point step size), which works okay for high enough bandwidths; your uplink however is too slow, so "-s 1.0" or even 2.0 would look reasonable (the default is, as far as I remember, 0.1). Unfortunately netperf-wrapper does not seem to allow setting different -s options for up and down...
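For example, keeping whatever host and length options you normally pass and only adding the step size, something like "netperf-wrapper -s 1.0 ... rrul" (the "..." stands for your usual options; -s is the only flag being changed here) should give noticeably less fragmentary upload traces on a link this slow.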
>
> I welcome people's thoughts for other tests/adjustments/etc.
>
> Rich Brown
> Hanover, NH USA
>
> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 15:24 ` Fred Stratton
@ 2014-02-24 22:02 ` Sebastian Moeller
0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Moeller @ 2014-02-24 22:02 UTC (permalink / raw)
To: Fred Stratton; +Cc: cerowrt-devel
Hi Fred,
On Feb 24, 2014, at 16:24 , Fred Stratton <fredstratton@imap.cc> wrote:
> How are you measuring the link speed?
>
> With SQM enabled, I have speedtest.net results far below the values at which the gateway syncs.
>
> IF the gateway syncs at 12000/1000, the speedtest figures are 9500/850
>
> The performance I obtain with streaming video is very good, tweaking the extra settings in SQM on 3.10.28-16
>
> I am sure you are aware that you will never achieve the values quoted by the ISP.
But the current rate given by the modem is a pretty true measurement of the bandwidth between the modem and the DSLAM, independent of the marketing numbers of the ISP ;)
> How long is your line? Downstream attenuation is a proxy for this.
Once the sync is working this does not matter any more. Having seen Rich's line stats, he has a very clean ADSL line with SNR margins (SNRM) of 22 and 11 and almost no errors (not even many FECs).
Best Regards
Sebastian
> Are you using ADSL2+, or some other protocol? Does the device even tell you?
>
> On 24/02/14 14:36, Rich Brown wrote:
>> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>>
>> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>>
>> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>>
>> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>>
>> Experimental setup:
>>
>> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>>
>> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>>
>> I welcome people's thoughts for other tests/adjustments/etc.
>>
>> Rich Brown
>> Hanover, NH USA
>>
>> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 21:54 ` Sebastian Moeller
@ 2014-02-24 22:40 ` Sebastian Moeller
0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Moeller @ 2014-02-24 22:40 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: cerowrt-devel
Hi Rich,
On Feb 24, 2014, at 22:54 , Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Rich,
>
>
> On Feb 24, 2014, at 15:36 , Rich Brown <richb.hanover@gmail.com> wrote:
>
>>
>> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>>
>> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>
> Okay, that sounds like a rather large bandwidth sacrifice, but let's see what we can expect to see on your link, to get a better hypothesis of what we can expect on your link.
>
> 0) the raw line rates as presented by your modem:
> DOWN [kbps]: 7616
> UP [kbps]: 864
>
>
> 1) let's start with the reported sync rates: the sync rates of the modem (that Rich graciously send me privately) also contain bins used for forward error correction (these sub-carriers will not be available for ATM-payload so reduce the useable sync. It looks like K reports the number of data bytes per dmt-frame, while R denotes the number of FEC bytes per dmt-frame. From my current understanding K is the useable part of the K+R total, so with K(down) = 239 and R(down) = 16 (and K(up) = 28 and R(up) = 0) we get:
> but from the numbers you send it looks like for the downlink 16 in 239 byte are FEC bytes (and zero in you uplink) so you seem to loose 100*16/239 = 6.69% for forward error control on your downlink. In other words the useable DSL rate is 7616 * (1-(16/(239+16))) = 7138.13 = 7106.14 kbps
>
> DOWN [kbps]: 7616 * (1-(16/(239+16))) = 7138.13333333
> UP [kbps]: 864 * (1-(0/(28+0))) = 864
>
> 2) ATM framing 1: For the greater group I think it is worth reminding that the ATM cell train that the packets get transferred over uses a 48 payload in 53 byte cells encapsulation so even if the ATM encapsulation would not have more quirks (but it does) you could at best expect 100*48/53 = 90.57% of the sync rate to show up as IP throughput.
> So in your case:
> downlink: 7616* (1-(16/(239+16))) * (48/53) = 6464.7245283
> uplink: 864* (1-(0/(28+0))) * (48/53) = 782.490566038
>
> 3) per packet fixed overhead: each packet also drags in some overhead for all the headers (some like ATM and ethernet headers are on top of the MTU, some like the PPPoE headers or potential VLAN tags reduce your useable MTU). I assume that with your link with PPPoE your MTU is 1492 (the PPPoE headers are 8 byte) and you have a total of 40 bytes overhead, so packets are maximally 1492+40 = 1532 bytes on the wire, so this is the reference for size: (1- (1492/1532)) * 100 = 2.61096605744 % you loose 2.6% just for the overheads (now since this is fixed it will be larger for small packets, say a 64Byte packet ends up with 100*64/(64+40) = 61.5384615385 of the expected rates this is not specific to DSL though you have fixed headers also with ethernet it is just with most DSL encapsulation schemes the overhead just mushrooms… Let's assume that netsurf tries to use maximally full packets for its TCP streams, so we get
> downlink: 7616* (1-(16/(239+16))) * (48/53) * (1492/1532)) = 6295.93276516
> uplink: 864* (1-(0/(28+0))) * (48/53) * (1492/1532)) = 762.060002956
>
> 4) per packet variable overhead: now the black horse comes in ;), the variable padding caused by each IP packet being send in an full integer number of ATM cells, worst case is 47 bytes of padding in the last cell (actually the padding gets spread over the last two packets, but the principle remains the same; did I mention quirky in connection with ATM already ;) ). So for large packets, depending on size we have an additional 0 to 47 bytes of overhead of roughly 47/1500 = 3%.
> For you link with 1492 MTU packets (required to make room for the 8 byte PPPoE header) we have (1492+40)/48 = 31.9166666667, so we need 32 ATM cells, resulting in (1492+40) - (48*32) = 4 bytes of padding
> downlink: 7616* (1-(16/(239+16))) * (48/53) * (1492/1536)) = 6279.53710692
> uplink: 864* (1-(0/(28+0))) * (48/53) * (1492/1536)) = 760.075471698
>
> 5) stuff that netsurf does not report: netsurf will not see any ACK packets; but we can try to estate those (if anybody sees a fleaw in this reasoning please holler). I assume that typically we sent one ACK per two packets, so estimate the number of MTU-sized packets we could maximally send per second:
> downlink: (7616* (1-(16/(239+16))) * (48/53) * (1492/1536))) = 6279.53710692 / (1536*8/1000) = 511.030037998
> uplink: 864* (1-(0/(28+0))) * (48/53) * (1492/1536)) / 1536 = 760.075471698 / (1536*8/1000) = 61.8551002358
What I failed to mention/realize in the initial post is that sending the ACKs for the downstream TCP transfers is going to affect the upstream much more than the other way around, since the upstream has less capacity and the rrul test loads both directions at the same time; so the hidden ACK traffic is going to have a stronger impact on the observed upload than on the observed download speed. So, assuming this works out:
download-induced upload ACK traffic [kbps]: 511 data packets per second / 2 (assume we only ACK every second packet) * 96 (2 ATM cells) * 8/1000 = 196.2
upload-induced download ACK traffic [kbps]: 62 data packets per second / 2 (assume we only ACK every second packet) * 96 (2 ATM cells) * 8/1000 = 23.8
So maxing out your download bandwidth with an ACK every other packet eats 100*196/737 = 26.6% of your uplink bandwidth right there. The A in ADSL really is quite extreme in most cases, as if to persuade people not to actually serve any data… (My pet theory is that this is caused by paid-peering practices, in which you pay for the data you transfer to another network; a low uplink means most ISPs' customers cannot send too much and thus keep peering costs in check ;) )
> Now an ACK packet is rather small (40 bytes without 52 with timestamp?) but with overhead and cell-padding we get 40+40 = 80 results in two cells worth 96 bytes (52+40 = 92, so also two cells just less padding) so the relevant size of our ACKs is 96bytes. I do not know about your stem but mine send one ACK per two data packets (I think) so lets fold this into our calculations by assuming each data packet would contain the ACK data already by simply assuming each packet is 48 bytes longer
> downlink: 7616* (1-(16/(239+16))) * (48/53) * (1492/(1536+48))) = 6089.24810368
> uplink: 864* (1-(0/(28+0))) * (48/53) * (1492/(1536+48))) = 737.042881647
Here is a small mix-up: due to the asymmetry of the ACK traffic in the two directions, the approximation above does not work out. The effects are more like:
down: 6279.54 - 23.8 = 6255.7
up: 760.08 - 196.2 = 563.9
which is obviously also wrong, as it assumes a number of ACKs matching a full bandwidth's worth of data packets, but the order of magnitude should be about right. Upload-induced ACK traffic has a marginal effect on the download, while the reverse is not true; a saturated download has a considerable hidden uplink bandwidth cost.
I assume that different ACK strategies can have lower costs, and I assume that the netperf flows are independent per direction, so the ACKs cannot be piggy-backed onto packets in one of the return flows. (It would be a great test of my ramblings above if netperf could do both, to estimate the cost of the ACK channel.)
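A matching sketch of the hidden ACK cost from step 5) (again just an illustration; 32 ATM cells per full-size data packet, 2 cells per ACK and one ACK per two data packets are the assumptions used above):

def ack_cost_kbps(goodput_kbps, data_cells=32, ack_cells=2, acks_per_data_packet=0.5):
    packets_per_sec = goodput_kbps / (data_cells * 48 * 8 / 1000.0)   # full-size data packets/s
    return packets_per_sec * acks_per_data_packet * ack_cells * 48 * 8 / 1000.0

print(ack_cost_kbps(6279.5))   # the download puts ~196 kbps of ACKs on the uplink
print(ack_cost_kbps(760.1))    # the upload puts ~24 kbps of ACKs on the downlink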
Best Regards
Sebastian
>
>
>
> 6) more stuff that does not show up in netsurf-wrappers TCP averages: all the ICMP and UDP packets for the latency probe are not accounted for, yet consume bandwidth as well. The UDP probes in your experiments all stop pretty quickly, if they state at all so we can ignore those. The ICMP pings come at 5 per second and cost 56 default ping size plus 8 byte ICMP header bytes plus 20 bytes IP4 header, plus overhead 40, so 56+8+20+40 = 124 resulting in 3 ATM cells or 3*48 = 144 bytes 144*8*5/1000 = 5.76 kbps, which we probably can ignore here.
>
> Overall it looks like your actual measured results are pretty close to the maximum we can expect, at least for the download direction; looking at the upstream plots it is really not clear what the cumulative rate actually is, but the order of magnitude looks about right. I really wish we all could switch to ethernet of fiber optics soon, so the calculation of the expected maximum will be much easier…
> Note if you shape down to below the rates calculated in 1) use the shaped rates as inputs for the further calculations, also note that activating the ATM link-layer option in SQM will take care of2) 3) 4) independent of whether your link actually suffers from ATM in the first place, so activation these options on a fiber link will cause the same apparent bandwidth waste…
>
> Best Regards
> Sebastian
>
>
>
>>
>> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>>
>> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>>
>> Experimental setup:
>>
>> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>>
>> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>
> From you website:
> Note: I don’t know why the upload charts show such fragmentary data.
>
> This is because netsurf-wrapper works with a fixed step size (from netperf-wrapper --help: -s STEP_SIZE, --step-size=STEP_SIZE: Measurement data point step size.) which works okay for high enough bandwidths, your uplink however is too slow, so "-s 1.0" or even 2.0 would look reasonable ()the default is as far as I remember 0.1. Unfortunately netperf-wrapper does not seem to allow setting different -s options for up and down...
>
>
>>
>> I welcome people's thoughts for other tests/adjustments/etc.
>>
>> Rich Brown
>> Hanover, NH USA
>>
>> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Sandra, Okko, Joris, & Sebastian Moeller
Telefon: +49 7071 96 49 783, +49 7071 96 49 784, +49 7071 96 49 785
GSM: +49-1577-190 31 41
GSM: +49-1517-00 70 355
Moltkestrasse 6
72072 Tuebingen
Deutschland
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-24 14:56 ` Aaron Wood
@ 2014-02-25 13:09 ` Rich Brown
2014-02-25 13:37 ` Sebastian Moeller
0 siblings, 1 reply; 14+ messages in thread
From: Rich Brown @ 2014-02-25 13:09 UTC (permalink / raw)
To: cerowrt-devel
Thanks everyone for all the good advice. I will summarize my responses to all your notes now, then I'll go away and run more tests.
- Yes, I am using netperf 2.6.0 and netperf-wrapper from Toke's github repo.
- The "sync rate" is the speed with which the DSL modem sends bits to/from my house. I got this by going into the modem's admin interface and poking around. (It turns out that I have a very clean line, high SNR, low attenuation. I'm much less than a km from the central office.) So actual speed should approach this, except...
- Of course, I have to subtract all those overheads that Sebastian described - ATM 48-in-53, which knocks off about 10%; ATM frame overhead, which could add up to 47 bytes of padding to any packet; etc.
- I looked at the target calculation in Dave's Home Gateway best practices. (http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html) Am I correct that it sets the target to five 1500-byte packet transmission time or 5 msec, whichever is greater?
- I was astonished by the calculation of the bandwidth consumed by acks in the reverse direction. In a 7mbps/768kbps setting, I'm going to lose one quarter of the reverse bandwidth? Wow!
- I wasn't entirely clear how to set the target in the SQM GUI. I believe that "target ##msec" is an acceptable format. Is that correct?
- There's also a discussion of setting the target with "auto", but I'm not sure I understand the syntax.
Now to find some time to go back into the measurement lab! I'll report again when I have more data. Thanks again.
Rich
On Feb 24, 2014, at 9:56 AM, Aaron Wood <woody77@gmail.com> wrote:
> Do you have the latest (head) version of netperf and netperf-wrapper? some changes were made to both that give better UDP results.
>
> -Aaron
>
>
> On Mon, Feb 24, 2014 at 3:36 PM, Rich Brown <richb.hanover@gmail.com> wrote:
>
> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>
> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>
> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>
> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>
> Experimental setup:
>
> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>
> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>
> I welcome people's thoughts for other tests/adjustments/etc.
>
> Rich Brown
> Hanover, NH USA
>
> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-25 13:09 ` Rich Brown
@ 2014-02-25 13:37 ` Sebastian Moeller
2014-02-25 15:54 ` Dave Taht
0 siblings, 1 reply; 14+ messages in thread
From: Sebastian Moeller @ 2014-02-25 13:37 UTC (permalink / raw)
To: Rich Brown; +Cc: cerowrt-devel
Hi Rich,
On Feb 25, 2014, at 14:09 , Rich Brown <richb.hanover@gmail.com> wrote:
> Thanks everyone for all the good advice. I will summarize my responses to all your notes now, then I'll go away and run more tests.
>
> - Yes, I am using netperf 2.6.0 and netperf-wrapper from Toke's github repo.
>
> - The "sync rate" is the speed with which the DSL modem sends bits to/from my house. I got this by going into the modem's admin interface and poking around. (It turns out that I have a very clean line, high SNR, low attenuation. I'm much less than a km from the central office.) So actual speed should approach this, except…
I would think of this as the theoretical upper limit ;)
>
> - Of course, I have to subtract all those overheads that Sebastian described - ATM 48-in-53, which knocks off 10%; ATM frame overhead which could add up to 47 bytes padding to any packet, etc.)
>
> - I looked at the target calculation in Dave's Home Gateway best practices. (http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html) Am I correct that it sets the target to five 1500-byte packet transmission time or 5 msec, whichever is greater?
Note, the auto target implementation in ceropackages-3.10 sqm-scripts uses the following:
adapt_target_to_slow_link() {
    CUR_LINK_KBPS=$1
    CUR_EXTENDED_TARGET_US=
    MAX_PAKET_DELAY_IN_US_AT_1KBPS=$(( 1000 * 1000 * 1540 * 8 / 1000 ))
    CUR_EXTENDED_TARGET_US=$(( ${MAX_PAKET_DELAY_IN_US_AT_1KBPS} / ${CUR_LINK_KBPS} ))  # note this truncates the decimals
    # do not change anything for fast links
    [ "$CUR_EXTENDED_TARGET_US" -lt 5000 ] && CUR_EXTENDED_TARGET_US=5000
    case ${QDISC} in
        *codel|pie)
            echo "${CUR_EXTENDED_TARGET_US}"
            ;;
    esac
}
This is modeled after the shell code Dave sent around, and does not exactly match the Free version, because I could not make heads or tails of the Free version. (Happy to discuss changing this in SQM if anybody has a better idea.)
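As a rough worked example with Rich's sync rates from earlier in the thread (in practice the shaped rates would be passed in): for the 864 kbps uplink the function returns 1000*1000*1540*8/1000 / 864 = 14259 us, i.e. roughly one 1540-byte packet's transmission time (~14.3 ms); for the 7616 kbps downlink it computes 1617 us, which is below the 5000 us floor and is therefore clamped to 5 ms.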
>
> - I was astonished by the calculation of the bandwidth consumed by acks in the reverse direction. In a 7mbps/768kbps setting, I'm going to lose one quarter of the reverse bandwidth? Wow!
Well, so was I (the first time I did that calculation), but here is the kicker: with classical non-delayed ACKs this actually doubles, since every data packet gets acknowledged (I assume this puts a lower bound on how asymmetrical a link an ISP can sell ;) ). But I have since figured out that Mac OS X seems to default to one ACK for every 4 packets, so only half the traffic. And note that any truly bi-directional traffic should be able to piggyback many of those ACKs onto normal data packets.
>
> - I wasn't entirely clear how to set the target in the SQM GUI. I believe that "target ##msec" is an acceptable format. Is that correct?
In the new version with the dedicated target field, "40ms" will work, as will "40000us" and "40 ms". In the most recent version "auto" will enable the auto-adjustment of target (it will also extend the interval by new-target minus 5ms) to avoid the situation where target gets larger than interval.
In the older versions you would put "target 40ms" into the egress advanced option string. Note I used double quotes in my examples for clarity; the GUI does not want those...
>
> - There's also a discussion of setting the target with "auto", but I'm not sure I understand the syntax.
Just type in auto. You can check with logread and "tc -d qdisc".
Best Regards
Sebastian
>
> Now to find some time to go back into the measurement lab! I'll report again when I have more data. Thanks again.
>
> Rich
>
>
>
> On Feb 24, 2014, at 9:56 AM, Aaron Wood <woody77@gmail.com> wrote:
>
>> Do you have the latest (head) version of netperf and netperf-wrapper? some changes were made to both that give better UDP results.
>>
>> -Aaron
>>
>>
>> On Mon, Feb 24, 2014 at 3:36 PM, Rich Brown <richb.hanover@gmail.com> wrote:
>>
>> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>>
>> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>>
>> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>>
>> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>>
>> Experimental setup:
>>
>> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>>
>> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>>
>> I welcome people's thoughts for other tests/adjustments/etc.
>>
>> Rich Brown
>> Hanover, NH USA
>>
>> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-25 13:37 ` Sebastian Moeller
@ 2014-02-25 15:54 ` Dave Taht
2014-02-25 16:29 ` Sebastian Moeller
0 siblings, 1 reply; 14+ messages in thread
From: Dave Taht @ 2014-02-25 15:54 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Stuart Cheshire, cerowrt-devel
On Tue, Feb 25, 2014 at 5:37 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Rich,
>
>
> On Feb 25, 2014, at 14:09 , Rich Brown <richb.hanover@gmail.com> wrote:
>
>> Thanks everyone for all the good advice. I will summarize my responses to all your notes now, then I'll go away and run more tests.
>>
>> - Yes, I am using netperf 2.6.0 and netperf-wrapper from Toke's github repo.
>>
>> - The "sync rate" is the speed with which the DSL modem sends bits to/from my house. I got this by going into the modem's admin interface and poking around. (It turns out that I have a very clean line, high SNR, low attenuation. I'm much less than a km from the central office.) So actual speed should approach this, except...
>
> I would think of this as the theoretical upper limit ;)
>
>>
>> - Of course, I have to subtract all those overheads that Sebastian described - ATM 48-in-53, which knocks off 10%; ATM frame overhead which could add up to 47 bytes padding to any packet, etc.)
>>
>> - I looked at the target calculation in Dave's Home Gateway best practices. (http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html) Am I correct that it sets the target to five 1500-byte packet transmission time or 5 msec, whichever is greater?
>
> Note, the auto target implementation in ceropackages-3.10 sqm-scripts uses the following:
>
> adapt_target_to_slow_link() {
> CUR_LINK_KBPS=$1
> CUR_EXTENDED_TARGET_US=
> MAX_PAKET_DELAY_IN_US_AT_1KBPS=$(( 1000 * 1000 *1540 * 8 / 1000 ))
> CUR_EXTENDED_TARGET_US=$(( ${MAX_PAKET_DELAY_IN_US_AT_1KBPS} / ${CUR_LINK_KBPS} )) # note this truncates the decimals
> # do not change anything for fast links
> [ "$CUR_EXTENDED_TARGET_US" -lt 5000 ] && CUR_EXTENDED_TARGET_US=5000
> case ${QDISC} in
> *codel|pie)
> echo "${CUR_EXTENDED_TARGET_US}"
> ;;
> esac
> }
>
> This is modeled after the shell code Dave sent around, and does not exactly match the free version, because I could not make heads and tails out of the free version. (Happy to discuss change this in SQM if anybody has a better idea)
Really the target calculation doesn't matter so much, so long as it's
larger than the time it takes to transmit an MTU-sized packet. This is
somewhat an artifact of htb, which buffers up an extra packet, and of the
CPU scheduler...
It is becoming clearer (with the recent description of the PIE + rate
limiter + MAC compensation stuff in future cable modems) that we can do
much work to improve the rate limiters, and should probably intertwine
them like what the cable folk just did, in order to get the best
performance.
>>
>> - I was astonished by the calculation of the bandwidth consumed by acks in the reverse direction. In a 7mbps/768kbps setting, I'm going to lose one quarter of the reverse bandwidth? Wow!
Yes. TCP's need to send ACKs was actually the main driver for
having any bandwidth at all on the upstream. Back in the 90s, all the cable
providers wanted to provide was sufficient bandwidth for a "buy"
button. But, due to this behavior of TCP, a down/up ratio of somewhere
between 6 and 12 to 1 was what was needed to make TCP work well. (And even
then they tried hard to make it worse; ACK compression is still part of
many cable modems' provisioning.)
IPv6 has much larger ACKs than IPv4...
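To put a rough number on that with the ATM accounting from earlier in the
thread (assuming the same 40 bytes of per-packet link overhead): a bare IPv4
ACK is 20 + 20 = 40 bytes, so 40 + 40 = 80 bytes fits in two ATM cells (96
bytes of payload), while an IPv6 ACK starts at 40 + 20 = 60 bytes, so 60 + 40
= 100 bytes needs three cells (144 bytes) - roughly a 50% increase in per-ACK
cost on an ATM-framed link.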
I did a lot of work on various forms of hybrid network compression
back in the day (early 90s), back when we had a single 10Mbit radio
feeding hundreds of subscribers, and a 14.4 modem sending back acks...
and data. It turned out that a substantial percentage of subscribers
actually wanted to upload stuff... and that we couldn't achieve 10Mbit
in the lab with a 14.4 return... and that you can only slice up 10Mbit
so many ways before you run out of bandwidth, and you run out of
bandwidth long before you can turn a profit...
(and while I remember details of the radio setup and all the crazy
stuff we did to get more data through the modems, I can't remember the
name of the company)
At the time I was happy, sorta, in that we'd proven that future ISPs
HAD to provide some level of upstream bandwidth bigger than a buy
button, and the original e2e internet was going to be at least
somewhat preserved...
I didn't grok at the time that NAT was going to be widely deployed...
I mean, at the time, you sold a connection, asked how big a network the
customer wanted - we defaulted to a /28 - and we were still handing
out class C's to everyone that asked.
> Well, so was I (first time I did that) but here is the kicker with classical non0-delayed ACKs this actually doubles since each data packet gets acknowledged (I assume this puts a lower bound on how asymmetrical a loin an ISP can sell ;) ). But since I figured out that macosx seems to default to 1 Ack for every 4 packets, so only half the traffic. And note any truly bi-directional traffic shouldbe able to piggyback many of those ACKs into normal data packets.
I doubt your '4'. Take a capture for a few hundred ms.
My understanding of how Mac OS works is that after a stream has been
sustained for a while, it switches from one ACK every 2 packets to
"stretch acks" - one every 6 (or so).
There are some interesting bugs associated with stretch acks, and also
with TSO - in one case we observed a full TSO's worth of TCP RSTs being
sent instead of one.
>>
>> - I wasn't entirely clear how to set the target in the SQM GUI. I believe that "target ##msec" is an acceptable format. Is that correct?
>
> In the new version with the dedicated target field, "40ms" will work, as will "40000us" or "40 ms". In the most recent version "auto" will enable auto-adjustment of the target (it will also extend the interval by the new target minus 5ms), to avoid the situation where the target gets larger than the interval.
> In the older versions you would put "target 40ms" into the egress advanced option string. Note I used double quotes in my examples for clarity, the GUI does not want those...
>
>
>>
>> - There's also a discussion of setting the target with "auto", but I'm not sure I understand the syntax.
>
> Just type in auto. You can check with logread and "tc -d qdisc".
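For example, something along these lines (a sketch only; ge00 is
assumed to be CeroWrt's WAN-facing interface, and the grep pattern is
just a guess at the log tag the scripts use):

  tc -d qdisc show dev ge00     # detailed qdisc view; target/interval show up here
  logread | grep -i sqm         # assuming the scripts log under an SQM-ish tag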
>
> Best Regards
> Sebastian
>
>>
>> Now to find some time to go back into the measurement lab! I'll report again when I have more data. Thanks again.
>>
>> Rich
>>
>>
>>
>> On Feb 24, 2014, at 9:56 AM, Aaron Wood <woody77@gmail.com> wrote:
>>
>>> Do you have the latest (head) version of netperf and netperf-wrapper? some changes were made to both that give better UDP results.
>>>
>>> -Aaron
>>>
>>>
>>> On Mon, Feb 24, 2014 at 3:36 PM, Rich Brown <richb.hanover@gmail.com> wrote:
>>>
>>> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>>>
>>> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>>>
>>> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>>>
>>> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>>>
>>> Experimental setup:
>>>
>>> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>>>
>>> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>>>
>>> I welcome people's thoughts for other tests/adjustments/etc.
>>>
>>> Rich Brown
>>> Hanover, NH USA
>>>
>>> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>>
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Cerowrt-devel] Equivocal results with using 3.10.28-14
2014-02-25 15:54 ` Dave Taht
@ 2014-02-25 16:29 ` Sebastian Moeller
0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Moeller @ 2014-02-25 16:29 UTC (permalink / raw)
To: Dave Taht; +Cc: Stuart Cheshire, cerowrt-devel
Hi Dave,
On Feb 25, 2014, at 16:54 , Dave Taht <dave.taht@gmail.com> wrote:
> On Tue, Feb 25, 2014 at 5:37 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> Hi Rich,
>>
>>
>> On Feb 25, 2014, at 14:09 , Rich Brown <richb.hanover@gmail.com> wrote:
>>
>>> Thanks everyone for all the good advice. I will summarize my responses to all your notes now, then I'll go away and run more tests.
>>>
>>> - Yes, I am using netperf 2.6.0 and netperf-wrapper from Toke's github repo.
>>>
>>> - The "sync rate" is the speed with which the DSL modem sends bits to/from my house. I got this by going into the modem's admin interface and poking around. (It turns out that I have a very clean line, high SNR, low attenuation. I'm much less than a km from the central office.) So actual speed should approach this, except...
>>
>> I would think of this as the theoretical upper limit ;)
>>
>>>
>>> - Of course, I have to subtract all those overheads that Sebastian described (ATM 48-in-53, which knocks off roughly 10%; ATM frame overhead, which could add up to 47 bytes of padding to any packet; etc.)
>>>
>>> - I looked at the target calculation in Dave's Home Gateway best practices. (http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html) Am I correct that it sets the target to five 1500-byte packet transmission times or 5 msec, whichever is greater?
>>
>> Note, the auto target implementation in ceropackages-3.10 sqm-scripts uses the following:
>>
>> adapt_target_to_slow_link() {
>>     CUR_LINK_KBPS=$1
>>     CUR_EXTENDED_TARGET_US=
>>     MAX_PAKET_DELAY_IN_US_AT_1KBPS=$(( 1000 * 1000 * 1540 * 8 / 1000 ))
>>     CUR_EXTENDED_TARGET_US=$(( ${MAX_PAKET_DELAY_IN_US_AT_1KBPS} / ${CUR_LINK_KBPS} )) # note this truncates the decimals
>>     # do not change anything for fast links
>>     [ "$CUR_EXTENDED_TARGET_US" -lt 5000 ] && CUR_EXTENDED_TARGET_US=5000
>>     case ${QDISC} in
>>         *codel|pie)
>>             echo "${CUR_EXTENDED_TARGET_US}"
>>             ;;
>>     esac
>> }
>>
>> This is modeled after the shell code Dave sent around, and does not exactly match the free version, because I could not make heads or tails of the free version. (Happy to discuss changing this in SQM if anybody has a better idea)
>
> Really the exact target calculation doesn't matter so much, so long as
> the target comes out larger than the time it takes to transmit one
> MTU-sized packet. This is somewhat an artifact of htb, which buffers
> up an extra packet, and of the cpu scheduler…
Ah, so we just leave the implementation as is? Great!
>
> It is becoming clearer (with the recent description of the pie + rate
> limiter + mac compensation stuff in future cablemodems) that we can do
> much work to improve the rate limiters, and should probably intertwine
> them like what the cable folk just did, in order to get best
> performance.
>
>>>
>>> - I was astonished by the calculation of the bandwidth consumed by acks in the reverse direction. In a 7mbps/768kbps setting, I'm going to lose one quarter of the reverse bandwidth? Wow!
>
> Yes. TCP's sending-ack requirement was actually the main driver for
> having any bandwidth at all on the upstream. Back in the 90s, all that
> cable providers wanted to provide was sufficient bandwidth for a "buy"
> button. But, due to this behavior of TCP, a down/up ratio of somewhere
> between 6 and 12 to 1 was what was needed to make TCP work well. (And
> even then they tried hard to make it worse, and ack compression is
> still part of many cable modems' provisioning.)
Interesting!
>
> IPv6 has much larger acks than ipv4….
Hopefully mostly delayed/stretched?
>
> I did a lot of work on various forms of hybrid network compression
> back in the day (early 90s), back when we had a single 10Mbit radio
> feeding hundreds of subscribers, and a 14.4 modem sending back acks…
I assume one modem per customer?
> and data. It turned out that a substantial percentage of subscribers
> actually wanted to upload stuff... and that we couldn't achieve 10Mbit
> in the lab with a 14.4 return…
So automatic "fair-ish" queueing, no single customer could hog all 10Mbit ;)
> and that you can only slice up 10Mbit
> so many ways before you run out of bandwidth, and you run out of
> bandwidth long before you can turn a profit...
>
> (and while I remember details of the radio setup and all the crazy
> stuff we did to get more data through the modems, I can't remember the
> name of the company)
>
> At the time I was happy, sorta, in that we'd proven that future ISPs
> HAD to provide some level of upstream bandwidth bigger than a buy
> button, and the original e2e internet was going to be at least
> somewhat preserved...
>
> I didn't grok at the time that NAT was going to be widely deployed...
> I mean, at the time, you sold a connection, and asked how big a
> network did you want, we defaulted to a /28, and we were still handing
> out class C's to everyone that asked.
>
>> Well, so was I (first time I did that), but here is the kicker: with classical non-delayed ACKs this actually doubles, since each data packet gets acknowledged (I assume this puts a lower bound on how asymmetrical a link an ISP can sell ;) ). But I have since figured out that macosx seems to default to 1 ACK for every 4 packets, so only half that traffic. And note any truly bi-directional traffic should be able to piggyback many of those ACKs into normal data packets.
>
> I doubt your '4'. Take a capture for a few hundred ms.
Ah, I took this from a description of the current behavior of macosx that I found on the internet; it sounded about right. I guess what I wanted to say is that the ACK imprint on the upload in the given example could range from ~400kbps (1 ACK per packet) down to ~66.7 kbps (1 ACK per 6 packets), but even that is a considerable fraction of the available upload bandwidth…
Silly idea: it would be great if cerowrt could have a real-time display of the number/size of ACK packets (lightly smoothed in time) going in and out of the WAN link; that way it would be quite easy to better estimate the traffic components hidden from RRUL. But that is me playing arm-chair software architect ;) not that I am going to implement that...
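Until something like that exists, a crude approximation can be had from
the router's shell. A sketch, assuming tcpdump is installed (opkg
install tcpdump) and that ge00 is the WAN interface; the capture should
be kept short so it does not itself disturb the measurement:

  IFACE=ge00
  # pcap filter for "pure" ACKs: ACK flag set and zero bytes of TCP payload
  PUREACK='tcp[tcpflags] & tcp-ack != 0 and (ip[2:2] - ((ip[0]&0xf)<<2) - ((tcp[12]&0xf0)>>2)) == 0'
  tcpdump -i $IFACE -c 5000 -w /tmp/wan.pcap tcp 2>/dev/null   # grab the next 5000 TCP packets
  echo "pure ACKs: $(tcpdump -r /tmp/wan.pcap -nn "$PUREACK" 2>/dev/null | wc -l)"
  echo "all TCP:   $(tcpdump -r /tmp/wan.pcap -nn tcp 2>/dev/null | wc -l)"

That only gives packet counts in both directions rather than a live
bandwidth figure, but it is enough to see what ACK ratio the hosts
behind the router are actually using.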
>
> My understanding of how macos works is that after a stream is
> sustained for a while, it switches from one ack every 2 packets into
> "stretch acks" - to one every 6 (or so).
According to http://rolande.wordpress.com/2010/12/30/performance-tuning-the-network-stack-on-mac-osx-10-6/
net.inet.tcp.delayed_ack has the following meaning:
• delayed_ack=0 responds after every packet (OFF)
• delayed_ack=1 always employs delayed ack, 6 packets can get 1 ack
• delayed_ack=2 immediate ack after 2nd packet, 2 packets per ack (Compatibility Mode)
• delayed_ack=3 should auto detect when to employ delayed ack, 4 packets per ack. (DEFAULT)
I have it set to three, which is why I assumed 4 above (just hoping it would not opt for one ACK per packet), and I am still on 10.8.5 (just in case someone in the know happens to read this: what is the true behavior?)
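(For anyone who wants to poke at this on their own Mac, the knob is an
ordinary sysctl; shown here only as a sketch, and the =2 below is just
an example, not a recommendation:

  sysctl net.inet.tcp.delayed_ack              # show the current mode, 3 by default
  sudo sysctl -w net.inet.tcp.delayed_ack=2    # ack every 2nd packet, until the next reboot

and a short packet capture, as Dave suggests above, is the only way to
see what the stack really does with a long-running stream.)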
Best Regards
Sebastian
>
> there are some interesting bugs associated with stretch acks, and also
> TSO - in one case we observed a full TSO of TCP RSTs being sent
> instead of one.
>
>
>
>>>
>>> - I wasn't entirely clear how to set the target in the SQM GUI. I believe that "target ##msec" is an acceptable format. Is that correct?
>>
>> In the new version with the dedicated target field, "40ms" will work, as will "40000us" or "40 ms". In the most recent version "auto" will enable auto-adjustment of the target (it will also extend the interval by the new target minus 5ms), to avoid the situation where the target gets larger than the interval.
>> In the older versions you would put "target 40ms" into the egress advanced option string. Note I used double quotes in my examples for clarity, the GUI does not want those...
>>
>>
>>>
>>> - There's also a discussion of setting the target with "auto", but I'm not sure I understand the syntax.
>>
>> Just type in auto. You can check with logread and "tc -d qdisc".
>>
>> Best Regards
>> Sebastian
>>
>>>
>>> Now to find some time to go back into the measurement lab! I'll report again when I have more data. Thanks again.
>>>
>>> Rich
>>>
>>>
>>>
>>> On Feb 24, 2014, at 9:56 AM, Aaron Wood <woody77@gmail.com> wrote:
>>>
>>>> Do you have the latest (head) version of netperf and netperf-wrapper? some changes were made to both that give better UDP results.
>>>>
>>>> -Aaron
>>>>
>>>>
>>>> On Mon, Feb 24, 2014 at 3:36 PM, Rich Brown <richb.hanover@gmail.com> wrote:
>>>>
>>>> CeroWrt 3.10.28-14 is doing a good job of keeping latency low. But... it has two other effects:
>>>>
>>>> - I don't get the full "7 mbps down, 768 kbps up" as touted by my DSL provider (Fairpoint). In fact, CeroWrt struggles to get above 6.0/0.6 mbps.
>>>>
>>>> - When I adjust the SQM parameters to get close to those numbers, I get increasing levels of packet loss (5-8%) during a concurrent ping test.
>>>>
>>>> So my question to the group is whether this behavior makes sense: that we can have low latency while losing ~10% of the link capacity, or that getting close to the link capacity should induce large packet loss...
>>>>
>>>> Experimental setup:
>>>>
>>>> I'm using a Comtrend 583-U DSL modem, that has a sync rate of 7616 kbps down, 864 kbps up. Theoretically, I should be able to tell SQM to use numbers a bit lower than those values, with an ATM plus header overhead with default settings.
>>>>
>>>> I have posted the results of my netperf-wrapper trials at http://richb-hanover.com - There are a number of RRUL charts, taken with different link rates configured, and with different link layers.
>>>>
>>>> I welcome people's thoughts for other tests/adjustments/etc.
>>>>
>>>> Rich Brown
>>>> Hanover, NH USA
>>>>
>>>> PS I did try the 3.10.28-16, but ran into troubles with wifi and ethernet connectivity. I must have screwed up my local configuration - I was doing it quickly - so I rolled back to 3.10.28.14.
>>>> _______________________________________________
>>>> Cerowrt-devel mailing list
>>>> Cerowrt-devel@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>>>
>>>
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-02-25 16:29 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-24 14:36 [Cerowrt-devel] Equivocal results with using 3.10.28-14 Rich Brown
2014-02-24 14:56 ` Aaron Wood
2014-02-25 13:09 ` Rich Brown
2014-02-25 13:37 ` Sebastian Moeller
2014-02-25 15:54 ` Dave Taht
2014-02-25 16:29 ` Sebastian Moeller
2014-02-24 15:24 ` Fred Stratton
2014-02-24 22:02 ` Sebastian Moeller
2014-02-24 15:51 ` Dave Taht
2014-02-24 16:14 ` Dave Taht
2014-02-24 16:38 ` Aaron Wood
2014-02-24 16:47 ` Dave Taht
2014-02-24 21:54 ` Sebastian Moeller
2014-02-24 22:40 ` Sebastian Moeller