General list for discussing Bufferbloat
* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
@ 2012-01-14 16:35 Jesper Dangaard Brouer
  2012-01-15  9:49 ` Dave Taht
  0 siblings, 1 reply; 29+ messages in thread
From: Jesper Dangaard Brouer @ 2012-01-14 16:35 UTC (permalink / raw)
  To: Dan Siemon, bloat

Hi Dan

It delights me to see that you are using the TC options linklayer ADSL and overhead (from my ADSL optimizer work) in your QoS scripts :)
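
For anyone who hasn't met those options, a minimal illustration (rate
and overhead here are placeholders - pick values that match your own
line):

tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb \
    rate 512kbit ceil 512kbit linklayer adsl overhead 40

With linklayer adsl the kernel sizes each packet by the number of
53-byte ATM cells it will occupy on the wire, so the shaper's idea of
the rate matches the modem's.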

Cheers,
 Jesper Brouer

Dan Siemon <dan@coverfire.com> wrote:

On Sun, 2012-01-08 at 01:40 +0100, Dave Taht wrote:
> On Thu, Jan 5, 2012 at 6:52 PM, Bob Briscoe <bob.briscoe@bt.com> wrote:
> >
> > In a nutshell, bit-rate equality, where each of N active users gets 1/N of
> > the bit-rate, was found to be extremely _unfair_ when the activity of
> > different users is widely different. For example:
> > * 5 light users all active 1% of the time get close to 100% of a shared link
> > whenever they need it.
> > * However, if instead 2 of these users are active 100% of the time, FQ gives
> > the other three light users only 33% of the link whenever they are active.
> > * That's pretty rubbish for a solution that claims to isolate each user from
> > the excesses of others.
> 
> Without AQM or FQ, we have a situation where one stream from one user
> at a site, can eat more than 100% of the bandwidth.
> 
> 1/u would be a substantial improvement!

Indeed I've found this to be the case. I've been using a Linux tc
configuration in both the upstream and downstream which is designed to
protect each host's bandwidth share and within that provide three
traffic classes with flow fairness (script link below). With this
configuration I no longer have to worry about other network traffic
interfering with a decent web experience or VoIP call.

http://git.coverfire.com/?p=linux-qos-scripts.git;a=blob;f=src-3tos.sh;hb=HEAD

-- 
Key ID: 133F6C3E
Key Fingerprint: 72B3 AF04 EFFE 65E4 46FF  7E5B 9297 18BA 133F 6C3E



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-14 16:35 [Bloat] What is fairness, anyway? was: Re: finally... winning on wired! Jesper Dangaard Brouer
@ 2012-01-15  9:49 ` Dave Taht
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Taht @ 2012-01-15  9:49 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: bloat

On Sat, Jan 14, 2012 at 5:35 PM, Jesper Dangaard Brouer <jdb@comx.dk> wrote:
> Hi Dan
>
> It delights me to see that you are using the TC options linklayer ADSL and overhead (from my ADSL optimizer work) in your QoS scripts :)

I saw that too! :)

We all stand on the shoulders of giants.

I also note that the person standing on the topmost giant... is
banging their head against the ceiling... a lot.

--
Dave Täht
SKYPE: davetaht


* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-02-05 17:53               ` Justin McCann
@ 2012-02-05 18:21                 ` Eric Dumazet
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-02-05 18:21 UTC (permalink / raw)
  To: Justin McCann; +Cc: bloat

On Sunday, 5 February 2012 at 12:53 -0500, Justin McCann wrote:

> I was thinking of suggesting this (delaying ACKs to avoid cwnd
> increase), but doesn't that simply increase the RTT and/or RTT
> variance, and basically do the same thing we're trying to avoid? I
> suppose you could delay a bit, and then drop some earlier ACKs in case
> the sender only increases cwnd per ACK (and not by the number of
> segments the ACK covers), but that per-ACK behavior just seems like a
> bug to me.

I don't think it has anything to do with delaying ACKs. ACKs are sent
immediately, as soon as incoming packets are delivered to the TCP stack.

Check my script: there is no filter on ACKs at all.

The ingress AQM delays delivery of packets from big flows, _if_ some
other flows are also present and want their share of the bandwidth.

Eventually SFQ drops packets from those big flows, and the senders
automatically decrease their congestion windows.

I use this kind of ingress AQM on production proxies and it actually
works.
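
(If you want to see it in action with my script's setup:

tc -s qdisc show dev ifb0

The sfq qdisc's backlog and drop counters climb whenever a bulk flow
competes with other traffic.)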





* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-02-05  7:39             ` Eric Dumazet
       [not found]               ` <CAA93jw68yntHkhETQ1a9-Azu7UXEuU9f5fgOsB25hvA240iApg@mail.gmail.com>
@ 2012-02-05 17:53               ` Justin McCann
  2012-02-05 18:21                 ` Eric Dumazet
  1 sibling, 1 reply; 29+ messages in thread
From: Justin McCann @ 2012-02-05 17:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: bloat

On Sun, Feb 5, 2012 at 2:39 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sunday, 5 February 2012 at 02:43 +0200, Jonathan Morton wrote:
>> 2) Implement TCP *receive* window management.  This prevents the TCP
>> algorithm on the sending side from attempting to find the size of the
>> queues in the network.  Search the list archives for "Blackpool" to
>> see my take on this technique in the form of a kernel patch.  More
>> sophisticated algorithms are doubtless possible.
>>
> You can tweak max receiver window to be really small.
>...
> Basically you can delay some flows, so that TCP acks are also delayed.

I was thinking of suggesting this (delaying ACKs to avoid cwnd
increase), but doesn't that simply increase the RTT and/or RTT
variance, and basically do the same thing we're trying to avoid? I
suppose you could delay a bit, and then drop some earlier ACKs in case
the sender only increases cwnd per ACK (and not by the number of
segments the ACK covers), but that per-ACK behavior just seems like a
bug to me.
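
(For what it's worth, Linux of this era makes that per-ACK versus
per-bytes-acked choice tunable; a quick check, assuming a 2.6/3.x
kernel:

# 0 = grow cwnd once per ACK, 1 = grow by bytes acked (RFC 3465 ABC)
sysctl net.ipv4.tcp_abc

so the per-ACK behaviour is the default rather than a bug.)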

      Justin


* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
       [not found]               ` <CAA93jw68yntHkhETQ1a9-Azu7UXEuU9f5fgOsB25hvA240iApg@mail.gmail.com>
@ 2012-02-05 14:24                 ` Dave Taht
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Taht @ 2012-02-05 14:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: bloat


On Sun, Feb 5, 2012 at 2:20 PM, Dave Taht <dave.taht@gmail.com> wrote:

>
>
> On Sun, Feb 5, 2012 at 7:39 AM, Eric Dumazet <eric.dumazet@gmail.com>wrote:
>
>> On Sunday, 5 February 2012 at 02:43 +0200, Jonathan Morton wrote:
>> > On 5 Feb, 2012, at 2:24 am, George B. wrote:
>> >
>> > > I have yet another question to ask:  On a system where the vast
>> > > majority of traffic is receive traffic, what can it really do to
>> > > mitigate congestion?  I send a click, I get a stream.  There doesn't
>> > > seem to be a lot I can do from my side to manage congestion in the
>> > > remote server's transmit side of the link if I am an overall
>> > receiver
>> > > of traffic.
>> > >
>> > > If I am sending a bunch of traffic, sure, I can do a lot with queue
>> > > management and early detection.  But if I am receiving, it pretty much
>> > > just is what it is and I have to play the stream that I am served.
>> >
>> > There are two good things you can do.
>> >
>> > 1) Pressure your ISP to implement managed queueing and ECN at the
>> > head-end device, eg. DSLAM or cell-tower, and preferably at other
>> > vulnerable points in their network too.
>>
>> Yep, but unfortunately many servers (and clients) don't even
>> initiate/accept ECN
>>
>> > 2) Implement TCP *receive* window management.  This prevents the TCP
>> > algorithm on the sending side from attempting to find the size of the
>> > queues in the network.  Search the list archives for "Blackpool" to
>> > see my take on this technique in the form of a kernel patch.  More
>> > sophisticated algorithms are doubtless possible.
>> >
>> You can tweak max receiver window to be really small.
>>
>> # cat /proc/sys/net/ipv4/tcp_rmem
>> 4096    87380   4127616
>> # echo "4096 16384 40000" >/proc/sys/net/ipv4/tcp_rmem
>>
>> A third option: install an AQM on the ingress side.
>>
>> Basically you can delay some flows, so that TCP acks are also delayed.
>>
>> Example of a basic tc script (probably too basic, but effective)
>>
>> ETH=eth0
>> IFB=ifb0
>> LOCALNETS="hard.coded.ip.addresses/netmasks"
>> # Put a limit a bit under real one, to 'own' the queue
>> RATE="rate 7Mbit bandwidth 7Mbit maxburst 80 minburst 40"
>> ALLOT="allot 8000" # Depending on how old is your kernel...
>>
>>

> A subtlety here is that several technologies in use today
> (wireless-n, cable, green ethernet, GRO) are highly 'bursty',
> and I'd regard minburst, maxburst as something that needs to be calculated
> as a respectable fraction of the underlying rate.
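> (Worked example: at 7 Mbit/s and 1500-byte packets, that's roughly
> 580 packets/s, so a maxburst of 80 packets corresponds to about
> 140 ms of line rate - a sizeable burst at that speed.)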
>
>
>
>> modprobe ifb
>> ip link set dev $IFB up
>>
>> tc qdisc add dev $ETH ingress 2>/dev/null
>>
>> tc filter add dev $ETH parent ffff: \
>>   protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress \
>>   redirect dev $IFB
>>
>> tc qdisc del dev $IFB root
>>
>>
>> # Lets say our NIC is 100Mbit
>> tc qdisc add dev $IFB root handle 1: cbq avpkt 1000 \
>>    rate 100Mbit bandwidth 100Mbit
>>
>> tc class add dev $IFB parent 1: classid 1:1 cbq allot 10000 \
>>        mpu 64 rate 100Mbit prio 1 \
>>        bandwidth 100Mbit maxburst 150 avpkt 1500 bounded
>>
>> # Class for traffic coming from Internet : limited to X Mbits
>> tc class add dev $IFB parent 1:1 classid 1:11 \
>>        cbq $ALLOT mpu 64      \
>>        $RATE prio 2 \
>>        avpkt 1400 bounded
>>
>> tc qdisc add dev $IFB parent 1:11 handle 11: sfq
>>
>>
>> # Traffic from machines in our LAN : no limit
>> for privnet in $LOCALNETS
>> do
>>        tc filter add dev $IFB parent 1: protocol ip prio 2 u32 \
>>                match ip src $privnet flowid 1:1
>> done
>>
>> tc filter add dev $IFB parent 1: protocol ip prio 2 u32 \
>>        match ip protocol 0 0x00 flowid 1:11
>>
>>
>>
>
>
>
> --
> Dave Täht
> SKYPE: davetaht
> US Tel: 1-239-829-5608
> FR Tel: 0638645374
> http://www.bufferbloat.net
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-02-05  0:43           ` Jonathan Morton
  2012-02-05  1:57             ` George B.
@ 2012-02-05  7:39             ` Eric Dumazet
       [not found]               ` <CAA93jw68yntHkhETQ1a9-Azu7UXEuU9f5fgOsB25hvA240iApg@mail.gmail.com>
  2012-02-05 17:53               ` Justin McCann
  1 sibling, 2 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-02-05  7:39 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: bloat

On Sunday, 5 February 2012 at 02:43 +0200, Jonathan Morton wrote:
> On 5 Feb, 2012, at 2:24 am, George B. wrote:
> 
> > I have yet another question to ask:  On a system where the vast
> > majority of traffic is receive traffic, what can it really do to
> > mitigate congestion?  I send a click, I get a stream.  There doesn't
> > seem to be a lot I can do from my side to manage congestion in the
> > remote server's transmit side of the link if I am an overall
> receiver
> > of traffic.
> > 
> > If I am sending a bunch of traffic, sure, I can do a lot with queue
> > management and early detection.  But if I am receiving, it pretty much
> > just is what it is and I have to play the stream that I am served.
> 
> There are two good things you can do.
> 
> 1) Pressure your ISP to implement managed queueing and ECN at the
> head-end device, eg. DSLAM or cell-tower, and preferably at other
> vulnerable points in their network too.

Yep, but unfortunately many servers (and clients) don't even
initiate/accept ECN

> 2) Implement TCP *receive* window management.  This prevents the TCP
> algorithm on the sending side from attempting to find the size of the
> queues in the network.  Search the list archives for "Blackpool" to
> see my take on this technique in the form of a kernel patch.  More
> sophisticated algorithms are doubtless possible.
> 
You can tweak max receiver window to be really small.

# cat /proc/sys/net/ipv4/tcp_rmem 
4096	87380	4127616
# echo "4096 16384 40000" >/proc/sys/net/ipv4/tcp_rmem
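(The three values are the minimum, default, and maximum receive buffer
sizes in bytes; capping the maximum caps the largest window the
receiver will ever advertise.)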

A third option: install an AQM on the ingress side.

Basically you can delay some flows, so that TCP acks are also delayed.

Example of a basic tc script (probably too basic, but effective):

ETH=eth0
IFB=ifb0
LOCALNETS="172.16.0.0/12 192.168.0.0/16 10.0.0.0/8"
# Put a limit a bit under real one, to 'own' the queue
RATE="rate 7Mbit bandwidth 7Mbit maxburst 80 minburst 40"
ALLOT="allot 8000" # Depending on how old is your kernel...

modprobe ifb
ip link set dev $IFB up

tc qdisc add dev $ETH ingress 2>/dev/null

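# send everything arriving on $ETH to $IFB, where normal (egress)
# qdiscs can shape the download direction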
tc filter add dev $ETH parent ffff: \
   protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress \
   redirect dev $IFB

tc qdisc del dev $IFB root


# Let's say our NIC is 100Mbit
tc qdisc add dev $IFB root handle 1: cbq avpkt 1000 \
    rate 100Mbit bandwidth 100Mbit

tc class add dev $IFB parent 1: classid 1:1 cbq allot 10000 \
	mpu 64 rate 100Mbit prio 1 \
	bandwidth 100Mbit maxburst 150 avpkt 1500 bounded

# Class for traffic coming from the Internet: limited to X Mbit
tc class add dev $IFB parent 1:1 classid 1:11 \
	cbq $ALLOT mpu 64      \
	$RATE prio 2 \
	avpkt 1400 bounded

tc qdisc add dev $IFB parent 1:11 handle 11: sfq 


# Traffic from machines in our LAN: no limit
for privnet in $LOCALNETS
do
	tc filter add dev $IFB parent 1: protocol ip prio 2 u32 \
		match ip src $privnet flowid 1:1
done

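# catch-all: any remaining IPv4 traffic goes to the rate-limited class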
tc filter add dev $IFB parent 1: protocol ip prio 2 u32 \
	match ip protocol 0 0x00 flowid 1:11





* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-02-05  1:57             ` George B.
@ 2012-02-05  2:05               ` john thompson
  0 siblings, 0 replies; 29+ messages in thread
From: john thompson @ 2012-02-05  2:05 UTC (permalink / raw)
  To: George B.; +Cc: bloat


Some firewalls (like SonicWall Enhanced) can slow down ACKs to
traffic-shape inbound traffic.  It's not perfect, but it's often better
than nothing.

Most business-class ISPs should offer QoS in both directions.  We
certainly do for our T1-or-better customers.
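
(Stock Linux has no easy way to delay ACKs like that; the closest
stand-in is an ingress policer, which slows the sender by dropping
early instead.  A sketch - interface and rate are illustrative:

tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip u32 \
    match ip protocol 6 0xff \
    police rate 7mbit burst 100k drop flowid :1
)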

(sorry, I meant this to be a reply all)

On Sat, Feb 4, 2012 at 8:57 PM, George B. <georgeb@gmail.com> wrote:

> On Sat, Feb 4, 2012 at 4:43 PM, Jonathan Morton <chromatix99@gmail.com>
> wrote:
>
> > There are two good things you can do.
> >
> > 1) Pressure your ISP to implement managed queueing and ECN at the
> head-end device, eg. DSLAM or cell-tower, and preferably at other
> vulnerable points in their network too.
>
> Well, if they have a Cisco network, that might work.  Few other
> network gear vendors actively support ECN.
>
> > 2) Implement TCP *receive* window management.  This prevents the TCP
> algorithm on the sending side from attempting to find the size of the
> queues in the network.  Search the list archives for "Blackpool" to see my
> take on this technique in the form of a kernel patch.  More sophisticated
> algorithms are doubtless possible.
>
> Probably not something I want to use in production.
>
> Thanks, Jonathan.  Now yet another question:
>
> Two different server configurations (these are real life examples, by the
> way):
>
> 1.  eth0 and eth1 bonded as bond0 with VLANs hanging off of it.
> Where should the qdisc go?  On the bond interface?  On the Ethernet
> interfaces?  On the VLAN interfaces?
>
> 2.  eth0 and eth1 have VLAN interfaces attached as eth0.10, eth1.10
> and eth0.20, eth1.20.  Those are bonded into bond interfaces, bond10
> and bond20.  Same question: where is it best to apply the qdisc?
>
> George



-- 
“The world is not comprehensible, but it is embraceable.”



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-02-05  0:43           ` Jonathan Morton
@ 2012-02-05  1:57             ` George B.
  2012-02-05  2:05               ` john thompson
  2012-02-05  7:39             ` Eric Dumazet
  1 sibling, 1 reply; 29+ messages in thread
From: George B. @ 2012-02-05  1:57 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: bloat

On Sat, Feb 4, 2012 at 4:43 PM, Jonathan Morton <chromatix99@gmail.com> wrote:

> There are two good things you can do.
>
> 1) Pressure your ISP to implement managed queueing and ECN at the head-end device, eg. DSLAM or cell-tower, and preferably at other vulnerable points in their network too.

Well, if they have a Cisco network, that might work.  Few other
network gear vendors actively support ECN.

> 2) Implement TCP *receive* window management.  This prevents the TCP algorithm on the sending side from attempting to find the size of the queues in the network.  Search the list archives for "Blackpool" to see my take on this technique in the form of a kernel patch.  More sophisticated algorithms are doubtless possible.

Probably not something I want to use in production.

Thanks, Jonathan.  Now yet another question:

Two different server configurations (these are real life examples, by the way):

1.  eth0 and eth1 bonded as bond0 with VLANs hanging off of it.
Where should the qdisc go?  On the bond interface?  On the Ethernet
interfaces?  On the VLAN interfaces?

2.  eth0 and eth1 have VLAN interfaces attached as eth0.10, eth1.10
and eth0.20, eth1.20.  Those are bonded into bond interfaces, bond10
and bond20.  Same question: where is it best to apply the qdisc?

George


* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-02-05  0:24         ` George B.
@ 2012-02-05  0:43           ` Jonathan Morton
  2012-02-05  1:57             ` George B.
  2012-02-05  7:39             ` Eric Dumazet
  0 siblings, 2 replies; 29+ messages in thread
From: Jonathan Morton @ 2012-02-05  0:43 UTC (permalink / raw)
  To: George B.; +Cc: bloat


On 5 Feb, 2012, at 2:24 am, George B. wrote:

> I have yet another question to ask:  On a system where the vast
> majority of traffic is receive traffic, what can it really do to
> mitigate congestion?  I send a click, I get a stream.  There doesn't
> seem to be a lot I can do from my side to manage congestion in the
> remote server's transmit side of the link if I am an overall receiver
> of traffic.
> 
> If I am sending a bunch of traffic, sure, I can do a lot with queue
> management and early detection.  But if I am receiving, it pretty much
> just is what it is and I have to play the stream that I am served.

There are two good things you can do.

1) Pressure your ISP to implement managed queueing and ECN at the head-end device, e.g. the DSLAM or cell tower, and preferably at other vulnerable points in their network too.

2) Implement TCP *receive* window management.  This prevents the TCP algorithm on the sending side from attempting to find the size of the queues in the network.  Search the list archives for "Blackpool" to see my take on this technique in the form of a kernel patch.  More sophisticated algorithms are doubtless possible.
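
(If you want to experiment without patching a kernel, a crude
approximation is to clamp the window advertised over a route; with
iproute2 - gateway address and value here are illustrative:

ip route change default via 192.168.1.1 dev eth0 window 16384

That caps the receive window this host advertises on that route, which
bounds how much data a sender can keep in flight towards you.)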

 - Jonathan Morton



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-04 15:25       ` [Bloat] What is fairness, anyway? was: " Jim Gettys
  2012-01-04 16:16         ` Dave Taht
  2012-01-04 16:22         ` Eric Dumazet
@ 2012-02-05  0:24         ` George B.
  2012-02-05  0:43           ` Jonathan Morton
  2 siblings, 1 reply; 29+ messages in thread
From: George B. @ 2012-02-05  0:24 UTC (permalink / raw)
  To: Jim Gettys; +Cc: bloat

> As I read this thread, there are three questions that go through my mind:
>    1) since TCP is not "fair", particularly when given flows of
> different RTTs, how do we best deal with this issue?  Do either/both
> SFQ/QFQ deal with this problem, and how do they differ?
>    2) Web browsers are doing "unfair" things at the moment
> (unless/until HTTP/1.1 pipelining and/or SPDY deploys), by opening many
> TCP connections at the same time.  So it's easy for there to be a bunch
> of flows by the same user.  Is "fairness" better a per host property in
> the home environment, or a per TCP flow?  Particularly if we someday
> start diffserv marking traffic, I suspect per host is more "fair", at
> least for unmarked traffic.
>    3) since game manufacturers have noted the diffserv marking in
> PFIFO-FAST, what do these queuing disciplines currently do?
>                    - Jim
>

I have yet another question to ask:  On a system where the vast
majority of traffic is receive traffic, what can it really do to
mitigate congestion?  I send a click, I get a stream.  There doesn't
seem to be a lot I can do from my side to manage congestion in the
remote server's transmit side of the link if I am an overall receiver
of traffic.

If I am sending a bunch of traffic, sure, I can do a lot with queue
management and early detection.  But if I am receiving, it pretty much
just is what it is and I have to play the stream that I am served.

George


* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-13 21:45                   ` Dan Siemon
@ 2012-01-14 15:55                     ` Dave Taht
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Taht @ 2012-01-14 15:55 UTC (permalink / raw)
  To: Dan Siemon; +Cc: bloat

On Fri, Jan 13, 2012 at 10:45 PM, Dan Siemon <dan@coverfire.com> wrote:
> On Sun, 2012-01-08 at 01:40 +0100, Dave Taht wrote:
>> On Thu, Jan 5, 2012 at 6:52 PM, Bob Briscoe <bob.briscoe@bt.com> wrote:
>> >
>> > In a nutshell, bit-rate equality, where each of N active users gets 1/N of
>> > the bit-rate, was found to be extremely _unfair_ when the activity of
>> > different users is widely different. For example:
>> > * 5 light users all active 1% of the time get close to 100% of a shared link
>> > whenever they need it.
>> > * However, if instead 2 of these users are active 100% of the time, FQ gives
>> > the other three light users only 33% of the link whenever they are active.
>> > * That's pretty rubbish for a solution that claims to isolate each user from
>> > the excesses of others.
>>
>> Without AQM or FQ, we have a situation where one stream from one user
>> at a site, can eat more than 100% of the bandwidth.
>>
>> 1/u would be a substantial improvement!
>
> Indeed I've found this to be the case. I've been using a Linux tc
> configuration in both the upstream and downstream which is designed to
> protect each host's bandwidth share and within that provide three
> traffic classes with flow fairness (script link below). With this
> configuration I no longer have to worry about other network traffic
> interfering with a decent web experience or VoIP call.
>
> http://git.coverfire.com/?p=linux-qos-scripts.git;a=blob;f=src-3tos.sh;hb=HEAD

In looking over your script I see there will be several improvements
in Linux 3.3 that will apply. I'd be very interested in A/B results on
this script between the current and the next Linux.

SFQ used to permute its hash and 'scramble' a stream by default every
10 seconds. A 10-second periodicity of TCP weirdness was often visible
in a tcpdump/tcptrace/xplot.org plot when multiple streams were in play.
Now SFQ permutes the hash in the background, not retiring the old hash
until all the packets that mapped to it have been delivered.

SFQ used to add new streams to the tail of the queue; now it adds them
to the head.

There are a zillion more options to SFQ now - deeper buckets, more
flows, etc. - and the hybrid combination SFQRED. The former can lower
or eliminate the need for perturbation (and/or make perturbation very
costly) and handle higher and more diverse workloads; the latter, like,
does it all. I don't even know where to start on describing it.
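
To give a flavour, the 3.3 syntax lets you say things like this (a
sketch; the parameter names are the 3.3 additions, the values are
illustrative rather than a recommendation):

tc qdisc add dev eth0 root sfq limit 3000 flows 512 divisor 16384 \
    headdrop redflowlimit 100000 min 8000 max 60000 probability 0.20 ecn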

Lastly - with the new SFQ or QFQ - I have very rarely had a need
to explicitly increase the priority of packets using the TOS IMM field
on my workloads; however, you are running at rates below what I've been
working with, so that's interesting.

(Basically, the more 'sparse' a flow is, the faster it leaps to the
front of the queues anyway. Interactive ssh flows fit in that category
well.)

So for me the only gateway'd high-priority packets are marked EF
(VoIP) and sometimes DNS... and most of the time, none...

Secondly, I can imagine that decreasing the priority of bulk packets the
way you are doing would help, particularly with torrents.

>
> --
> Key ID: 133F6C3E
> Key Fingerprint: 72B3 AF04 EFFE 65E4 46FF  7E5B 9297 18BA 133F 6C3E
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net


* Re: [Bloat] What is fairness, anyway? was: Re: finally...  winning on wired!
       [not found]                         ` <CAA93jw4KJdYwrAuk7-yHDYCGBh1s6mE47eAYu2_LRfY45-qZ2g@mail.gmail.com>
@ 2012-01-14 11:06                           ` Bob Briscoe
  0 siblings, 0 replies; 29+ messages in thread
From: Bob Briscoe @ 2012-01-14 11:06 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat

Dave,

[There's a lot in your email, but I have to finish day-job stuff (well,
actually that's a misnomer - it's now become a day, night and weekend
job). So here's a very quick response:]

Yes, FQ can (and generally does) improve matters, particularly when the
AQM isn't good*. But FQ prevents all attempts by end-systems to do a
lot better (e.g. uTP). uTP is only the start of what could be done. FQ
aims for isolation between flows, and consequently removes the signals
hosts would need to be able to do a lot better.

The catch-22 is that a network node can't rely on hosts to do a lot
better. The upshot: FQ does a not-so-good job in case hosts don't do
any job at all, but it completely prevents hosts doing a brilliant
job...

...unless FQ starts making exceptions using DPI, and then we start down
an ultimately fruitless cul-de-sac where apps identify themselves as a
protocol that is brilliant, without actually behaving brilliantly, just
to take advantage of the exceptions built into the HG... yada yada.

I need to understand your points about why wireless is different - I
will read closer, think, and respond again when I have more time.


Bob

* (Can you clarify what experiment the plot you linked to is
illustrating - you've not made clear what you're comparing.)

At 07:26 11/01/2012, Dave Taht wrote:
>On Mon, Jan 9, 2012 at 6:38 AM, Bob Briscoe <bob.briscoe@bt.com> wrote:
> > Dave,
>
> > You're conflating removal of standing queues with bandwidth allocation. The
> > former is a problem in HGs and hosts. The latter isn't a problem in HGs and
> > hosts.
>
>I've been trying to understand your point here more fully.
>
>1) removal of standing queues is a problem in HGs and hosts
>
>On the host side, the standing queue problem is something that happens often
>in wireless in particular.
>
>Additionally you ship packets around in aggregates, and those
>aggregates can be delayed, lost, or rescheduled.
>
>FQ reduces the damage done when packets are being bulk shipped in this
>way.
>
>http://www.teklibre.com/~d/bloat/ping_log.ps
>
>(This graph also shows that the uncontrolled
>   device driver queue depth totals about 50ms in this case)
>
>In the benchmarks I've been running against wireless, even on fairly light
>loads, FQ reduces bursty packet loss, tcp resets, and the like. Statistically
>it's difficult to 'see', and I'm trying to come 
>up with better methods to do so
>besides double-blind A/B testing and *most importantly*
>
>trying to convince more people to discard their biases and
>actually try the code.
>
>Or take a look at some packet captures.
>
>As for the AP side, you have both a bandwidth allocation and FQ problem
>with wireless, compounded by the packet aggregation problem.
>
>Still a big problem in either wireless case is a 
>need to expire old packets and
>manage the depth of the queue based on the actual bandwidth between
>two devices actually available at that instant of time. Otherwise you
>get nonsense like 10+ second ping times.
>
>So as for managing the overall length of the standing queues,
>conventional AQM techniques, such as RED, blue, etc... apply but
>as for coping with the bursty nature of wireless in particular (and TSO'd
>streams) FQ helps break up the container-loads into manageable pieces.
>
>2) Bandwidth allocation isn't a problem in HGs and hosts.
>
>On hosts, on wired, it is not a problem. On wireless, see above.
>
>On home gateways, which run uplinks at anywhere between 128KB/sec
>in parts of the world, to 1Mbit in others, & 4Mbit fairly typical on cable,
>it's a huge problem. Regardless of any level of queue management
>as applied above (fq and aqm), the only decent way known to deal with
>'bufferbloat' on bad devices beyond the HG, is to limit your own bandwidth
>to what you've measured as below what the messed up edge provider
>is providing....
>
>and manage it from there across the needs of your site, where various
>AQM and FQ technologies can make a dent in your own problems.
>
>So perhaps I misunderstood your point here?
>
>Certainly the use model of the internet has changed significantly
>and TCP is in dire need of viable complementary protocols such
>as ledbat, etc. I also happen to like hipl, and enjoy but am befuddled
>by the ccn work on-going.
>
>And certainly I look forward to seeing less edge devices misbehaving
>with excessive buffering and lack of AQM.
>
>I'd like in particular, a contractual model - I mean, you are typically buying
>'x bandwidth' as part of your ISP contract -
>made available to correctly, and automatically provision downstream devices.
>
>something as simple as an extension to DHCP would be nice,
>or something like parsable data for 'http://whatsmydarnbandwidth.myisp.com'
>would help.
>
>Having to infer the available bandwidth and amount of buffering
>with tools such as shaperprobe is useful but a poor second to
>a contractual model for a baseline and tighter feedback loops
>for ongoing management.
>
>--
>Dave Täht
>SKYPE: davetaht
>US Tel: 1-239-829-5608
>FR Tel: 0638645374
>http://www.bufferbloat.net

________________________________________________________________
Bob Briscoe,                                BT Innovate & Design 



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-08  0:40                 ` Dave Taht
       [not found]                   ` <CAA93jw7xKwdUeT7wFNoiM8RQp1--==Eazdo0ucc44vz+L1U06g@mail.gmail.com>
@ 2012-01-13 21:45                   ` Dan Siemon
  2012-01-14 15:55                     ` Dave Taht
  1 sibling, 1 reply; 29+ messages in thread
From: Dan Siemon @ 2012-01-13 21:45 UTC (permalink / raw)
  To: Dave Taht, bloat

On Sun, 2012-01-08 at 01:40 +0100, Dave Taht wrote:
> On Thu, Jan 5, 2012 at 6:52 PM, Bob Briscoe <bob.briscoe@bt.com> wrote:
> >
> > In a nutshell, bit-rate equality, where each of N active users gets 1/N of
> > the bit-rate, was found to be extremely _unfair_ when the activity of
> > different users is widely different. For example:
> > * 5 light users all active 1% of the time get close to 100% of a shared link
> > whenever they need it.
> > * However, if instead 2 of these users are active 100% of the time, FQ gives
> > the other three light users only 33% of the link whenever they are active.
> > * That's pretty rubbish for a solution that claims to isolate each user from
> > the excesses of others.
> 
> Without AQM or FQ, we have a situation where one stream from one user
> at a site, can eat more than 100% of the bandwidth.
> 
> 1/u would be a substantial improvement!

Indeed I've found this to be the case. I've been using a Linux tc
configuration in both the upstream and downstream which is designed to
protect each host's bandwidth share and within that provide three
traffic classes with flow fairness (script link below). With this
configuration I no longer have to worry about other network traffic
interfering with a decent web experience or VoIP call.

http://git.coverfire.com/?p=linux-qos-scripts.git;a=blob;f=src-3tos.sh;hb=HEAD
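
The per-host piece boils down to something like this, heavily
simplified (interface, rates and addresses made up; see the script
above for the real thing):

ETH=eth0
tc qdisc add dev $ETH root handle 1: htb
tc class add dev $ETH parent 1: classid 1:1 htb rate 800kbit
# one class per host, equal shares, with flow fairness inside each
for host in 10 20; do
    tc class add dev $ETH parent 1:1 classid 1:$host htb \
        rate 400kbit ceil 800kbit
    tc qdisc add dev $ETH parent 1:$host handle $host: sfq perturb 10
    tc filter add dev $ETH parent 1: protocol ip u32 \
        match ip src 192.168.1.$host flowid 1:$host
done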

-- 
Key ID: 133F6C3E
Key Fingerprint: 72B3 AF04 EFFE 65E4 46FF  7E5B 9297 18BA 133F 6C3E



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-09  5:38                     ` Bob Briscoe
@ 2012-01-11  7:26                       ` Dave Taht
       [not found]                         ` <CAA93jw4KJdYwrAuk7-yHDYCGBh1s6mE47eAYu2_LRfY45-qZ2g@mail.gmail.com>
  0 siblings, 1 reply; 29+ messages in thread
From: Dave Taht @ 2012-01-11  7:26 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: bloat

On Mon, Jan 9, 2012 at 6:38 AM, Bob Briscoe <bob.briscoe@bt.com> wrote:
> Dave,

> You're conflating removal of standing queues with bandwidth allocation. The
> former is a problem in HGs and hosts. The latter isn't a problem in HGs and
> hosts.

I've been trying to understand your point here more fully.

1) removal of standing queues is a problem in HGs and hosts

On the host side, the standing queue problem is something that happens often
in wireless in particular.

Additionally you ship packets around in aggregates, and those
aggregates can be delayed, lost, or rescheduled.

FQ reduces the damage done when packets are being bulk shipped in this
way.

http://www.teklibre.com/~d/bloat/ping_log.ps

(This graph also shows that the uncontrolled
  device driver queue depth totals about 50ms in this case)

In the benchmarks I've been running against wireless, even on fairly light
loads, FQ reduces bursty packet loss, tcp resets, and the like. Statistically
it's difficult to 'see', and I'm trying to come up with better methods to do so
besides double-blind A/B testing and *most importantly*

trying to convince more people to discard their biases and
actually try the code.

Or take a look at some packet captures.

As for the AP side, you have both a bandwidth allocation and FQ problem
with wireless, compounded by the packet aggregation problem.

Still a big problem in either wireless case is a need to expire old packets and
manage the depth of the queue based on the actual bandwidth between
two devices actually available at that instant of time. Otherwise you
get nonsense like 10+ second ping times.

So as for managing the overall length of the standing queues,
conventional AQM techniques, such as RED, blue, etc... apply but
as for coping with the bursty nature of wireless in particular (and TSO'd
streams) FQ helps break up the container-loads into manageable pieces.

2) Bandwidth allocation isn't a problem in HGs and hosts.

On hosts, on wired, it is not a problem. On wireless, see above.

On home gateways, which run uplinks at anywhere from 128KB/sec
in parts of the world to 1Mbit in others, with 4Mbit fairly typical on cable,
it's a huge problem. Regardless of any level of queue management
as applied above (fq and aqm), the only decent way known to deal with
'bufferbloat' on bad devices beyond the HG, is to limit your own bandwidth
to what you've measured as below what the messed up edge provider
is providing....

and manage it from there across the needs of your site, where various
AQM and FQ technologies can make a dent in your own problems.
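
(In its minimal form, on the upstream side, that's just the following -
the rate being whatever you measured, minus a margin; values
illustrative:

tc qdisc add dev eth0 root handle 1: htb default 1
tc class add dev eth0 parent 1: classid 1:1 htb rate 900kbit
tc qdisc add dev eth0 parent 1:1 sfq perturb 10
)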

So perhaps I misunderstood your point here?

Certainly the use model of the internet has changed significantly
and TCP is in dire need of viable complementary protocols such
as ledbat, etc. I also happen to like hipl, and enjoy but am befuddled
by the ccn work on-going.

And certainly I look forward to seeing less edge devices misbehaving
with excessive buffering and lack of AQM.

I'd like in particular, a contractual model - I mean, you are typically buying
'x bandwidth' as part of your ISP contract -
made available to correctly and automatically provision downstream devices.

something as simple as an extension to DHCP would be nice,
or something like parsable data for 'http://whatsmydarnbandwidth.myisp.com'
would help.

Having to infer the available bandwidth and amount of buffering
with tools such as shaperprobe is useful but a poor second to
a contractual model for a baseline and tighter feedback loops
for ongoing management.

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net


* Re: [Bloat] What is fairness, anyway? was: Re: finally...  winning on wired!
       [not found]                   ` <CAA93jw7xKwdUeT7wFNoiM8RQp1--==Eazdo0ucc44vz+L1U06g@mail.gmail.com>
@ 2012-01-09  5:38                     ` Bob Briscoe
  2012-01-11  7:26                       ` Dave Taht
  0 siblings, 1 reply; 29+ messages in thread
From: Bob Briscoe @ 2012-01-09  5:38 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat

Dave,

At 00:40 08/01/2012, Dave Taht wrote:
> > To address buffer bloat, I advise we "do one thing and do it well": bulk
> > AQM.
>
>If you have an algorithm to suggest, I'd gladly look at it.

I haven't been working on any AQM algos.
I'm merely challenging the use of FQ.

>Without AQM or FQ, we have a situation where one stream from one user
>at a site, can eat more than 100% of the bandwidth.
>
>1/u would be a substantial improvement!

1/u is only a substantial improvement if you need to protect against 
some bogeyman that is out to take all your b/w. If this bogeyman is 
solely in your imagination, don't add FQ.

If there were an application that ate 100% of the b/w of a 
home gateway (or a host), then early adopters would uninstall it and it
would never become popular. E.g. the original Joost app.

FQ forces all flows into a straitjacket of equality with each
other. Usually FQ won't completely stop anything working (it could
stop an inelastic app working, though). However, if the apps are trying
to determine their own allocations (e.g. LEDBAT), then FQ
unnecessarily screws that all up.

I'm not saying no-one should write fairness-policing code. I'm saying 
it's not appropriate to bundle fairness code into AQM code, when the 
case for needing it is based on an imagined bogeyman. Otherwise:
* Either you dump your arbitrary assumptions on the world about what 
allocations each flow should get, bundled with your AQM.
* Or you kill the chances of getting your AQM deployed because half 
the world thinks your allocation assumptions suck.


>Secondly, as most devices lack both AQM and FQ these days - despite,
>as one of the papers referenced said, "considering that to be a bug" -
>people are doing DoS attacks on themselves whenever they attempt
>to do something that requires high bandwidth AND do something interactively.

You're conflating removal of standing queues with bandwidth 
allocation. The former is a problem in HGs and hosts. The latter 
isn't a problem in HGs and hosts.


>Thirdly I kind of need to clarify the usage of three terms in this discussion.
>
>To me, a *user* is - mom, dad, son, daughter, and to some extent their ipods,
>ipads, and tivos. The interesting thing about this scenario is that it
>is the device
>you are in front of that you want the best interactive performance from. A
>second interesting thing is that the highest interactive, yet, bulk flow
>- tv streams - have an upper limit on their rate, and also tend to adjust
>fairly well to lower ones.
>
>A *site* is a home, small business, cybercafe, office, etc. As the size of a
>'site' scales up to well beyond what a polyamorous catholic family could
>produce in terms of users, additional techniques do apply.
>
>(while what we're working on should work well on handhelds, desktops,
>many kinds of servers, home routers, etc - I am far from convinced it's
>applicable in the isp->customer side, aside from the FQ portion breaking
>up streams enough for various algorithms to work better.)
>
>Anyway:
>
>I think sometimes 'users' and 'sites' get conflated.
>
>As you touch upon, the policies that a 'site' implements in order to manage
>their internet access do not enter the political arena in any way. Those
>policies are in the interest of family harmony only, and giving dad or
>daughter knobs to turn that actually work is in everyones best interest.
>
>Thirdly, the definition of a "flow", within a site, is more flexible than
>what a provider can see beyond nat. The device on the customers
>site can regulate flows at levels ranging from mere IP address
>(even the mac address) - to ports, to tuples consisting of the time
>of day and dscp information...

Agree with all the above definitions.

All I would add is that in the family scenario, all the users have 
control over the hosts which are already able to control transfer 
rates. So no-one needs or wants to twiddle knobs on home gateways to 
improve family harmony. In the first instance, app developers have 
the interests of family harmony at heart. They don't want to write 
apps that pee off their users. And if there's a chance they will be 
peeved, the app developer can add a knob in the app.


>As a demo to myself I got 1/u to work nearly perfectly a while
>back.

Perfect 1/u != desirable.
Perfect 1/u == over-constrained.


>And I left in the ability to manage multicast (to limit it
>*enough* so it could be *used* in wireless, without damaging the
>rest of the network at the *site*). I LIKE multicast, but the
>development of switches and wireless need it to be managed
>properly.

I'm assuming the problem you mean is unresponsiveness of certain 
multicast streaming apps to congestion. And I'm assuming "managing 
multicast" means giving it some arbitrary proportion of the link 
capacity (irrespective of whether the multicast app works with that 
allocation).

Your assumption is that the multicast app isn't managing itself. But 
what if it is? For instance, my company operates a multicast app that 
manages its bandwidth. It doesn't have equal bandwidth to other apps, 
but it prevents other apps being starved while ensuring it has the 
min b/w it needs. If your code messes with it and forces it to have 
equal b/w to everything else it won't work.

I'm basically quoting the e2e principle at you.

It's fine to signal out to the transport from an AQM, so the 
transport can keep standing queues to a minimum.

It's much more tricky to do bandwidth allocation, fairness etc. If 
you're only doing fairness as a side-project while you do the AQM, 
it's best not to dabble at all.


> >
> > Since 2004, we now understand that fairness has to involve accounting over
> > time. That requires per-user state, which is not something you can do, or
> > that you need to do, within every queue. We should leave fairness to
> > separate code, probably on machines specialising in this at the edge of a
> > provider's network domain - where it knows about users/customers - separate
> > from the AQM code of each specific queue.
>
>You are taking the provider in approach, we're on the device out approach.
>The two approaches meet on the edge gw in the site.

I'm not taking a provider approach. I'm taking an e2e approach. I'm 
not talking as a myopic carrier, I'm talking as a comms architect. 
Yes, my company operates servers, network and HGs. However my company 
recognises that it has to work with app-code and OS on the hosts. And 
with HGs that we don't manage. I'm saying pls don't put arbitrary b/w 
allocation assumptions in your HG code or low down in the OS stack. 
It's not the right place for this.

(BTW, we remotely manage a few million HGs - we used to do a lot more 
smarts in the HGs but we're reducing that now.)


>Firstly, FQ helps on the user's machine, as does AQM. Not so much
>at gigE speeds, as the instantaneous queue length is rarely long enough,
>but a lot, on wireless, where you always have ~4ms worth of queue.

Again, what evidence do you have that FQ is necessary to reduce the delay, 
and AQM alone wouldn't do the job just fine?


>FQ hurts the user herself on bittorrent-like applications, and
>there are multiple ways to manage that. For example, a new way showed
>up in the current tree for putting applications in cgroup containers
>that could set their network 'priority'.

An app is written without knowing what network it will connect to. 
How does it know that it needs to set this priority? Or are you 
saying the containers set their own priority (in which case, we're 
back to the problem of arbitrary assumptions in the network)?

>The standard way is for bittorrent to have a rate limit.

Noooo. The whole point is to be able to use the full capacity when 
no-one else is.

>Another way IS for an AQM or app to be congestion aware much like
>with your proposed Conex stuff.

Don't know what you mean here.

>The way I was considering was playing
>with the TCP-ledbat code to be more datacenter-tcp like...

OK. Sounds interesting.


>Secondly, a combination FQ/AQM is the approach we are taking, more or less.
>
>I started with HTB + QFQ + RED/SFB etc... back in august...
>
>At the moment eric is scaling up SFQ and adding
>a couple variants of RED. The net behavior should be similar to what
>you describe. If you care to try it, it's mostly in linus's kernel at this
>point.
>
>It was possibly unclear to others that we have never thought that
>FQ alone was the answer. In fact we started off saying that
>better AQM was the answer, and then realized that the
>problem had multiple aspects that could be addressed
>independently, and thus started saying FQ + AQM was
>a way forward.

You need to explain this step - this is the nub of the disagreement.


>I would LOVE a 'bulk AQM' solution, and I'll be more
>than glad to start working on solving 2008's problems...
>after we get done with solving 1985's problems...

By bulk I meant all traffic together in a FIFO queue. That was the 
problem Van/Sally started addressing with RED in 1993. And AFAIK, 
that's the problem this bufferbloat list is focused on. I'm not 
introducing anything 2008ish :|

>by applying techniques that were well understood by the late 90s...
>and disremembered by an entire generation of device makers,
>driver writers, OS, network designers, buyers, and benchmarkers.
>
>But first...
>
>1) We needed *any* form of packet scheduler to
>actually work, which we couldn't do until a few months back
>due to overly huge tx rings in all the current ethernet devices.
>
>fixed by BQL
>
>2) We then needed a FQ mechanism to actually behave as designed
>- which both SFQ and QFQ do now - they weren't, for years.

This is the sticking point. I'm saying it is now fairly widely 
accepted that the goal of FQ was not useful, so whether it works or 
not, we don't want it.


>3) We also needed AQMs that worked - Still do. I'm looking forward
>to a successor to RED, and there were some implementation bugs
>in RED that made turning its knobs do nothing that are fixed now
>that need testing,

Agreed (strongly).

>and a couple new twists on it in SFQRED...
>
>4) We needed a way to handle TSO properly, as byte oriented AQMs
>     handled that badly. Sort of fixed now.

Agreed.


>Please feel free to evaluate and critique SFQRED.

If I don't agree with the goal, are you expecting me to critique the 
detailed implementation?

>In my case I plan to continue working with HTB, QFQ
>(which, btw, has multiple interesting ways to engage
>weighting mechanisms) and various sub-qdiscs... *after SFQ stabilizes*.
>
>I kind of view QFQ as a qdisc construction set.

Do you have any knowledge (accessible to the code) of what the 
weights should be?

The $10M question is: What's the argument against not doing FQ?


>We're well aware that these may not be the ultimate answer to
>all the known networking problems in the universe, but what
>we have working in the lab is making a big dent in them.
>
>And now that bugfix #1 above exists...
>
>Let a thousand new AQMs bloom!
>
>If there is anything in what we're up to that will damage the
>internet worse than it is damaged already, let us know soonest.

More *FQ* is worse than what there already is.



Bob


________________________________________________________________
Bob Briscoe,                                BT Innovate & Design 



* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-05 17:52               ` Bob Briscoe
  2012-01-06 17:42                 ` Jim Gettys
  2012-01-06 20:34                 ` Jonathan Morton
@ 2012-01-08  0:40                 ` Dave Taht
       [not found]                   ` <CAA93jw7xKwdUeT7wFNoiM8RQp1--==Eazdo0ucc44vz+L1U06g@mail.gmail.com>
  2012-01-13 21:45                   ` Dan Siemon
  2 siblings, 2 replies; 29+ messages in thread
From: Dave Taht @ 2012-01-08  0:40 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: bloat

On Thu, Jan 5, 2012 at 6:52 PM, Bob Briscoe <bob.briscoe@bt.com> wrote:
> Jim, Justin,
>
> Jumping back one posting in this thread...
>
>
> At 17:36 04/01/2012, Justin McCann wrote:
>>
>> On Wed, Jan 4, 2012 at 11:16 AM, Dave Taht <dave.taht@gmail.com> wrote:
>> >
>> > On Wed, Jan 4, 2012 at 4:25 PM, Jim Gettys <jg@freedesktop.org> wrote:
>> >
>> > 1: the 'slower flows gain priority' question is my gravest concern
>> > (eg, ledbat, bittorrent). It's fixable with per-host FQ.
>>
>> Meaning that you don't want to hand priority to stuff that is intended
>> to stay in the background?
>
>
> The LEDBAT/BitTorrent issue wouldn't be fixed by per-host FQ.
> LEDBAT/uTP tries to yield to other hosts, not just its own host.
>
> In fact, in the early part of the last decade, the whole issue of
> long-running vs interactive flows showed how broken any form of FQ was. This
> was why ISPs moved away from rate equality (whether per-flow, per-host or
> per-customer site) to various per-customer-volume-based approaches (or a mix
> of both).
>
> There seems to be an unspoken assumption among many on this list that rate
> equality must be integrated into each AQM implementation. That's so 2004. It
> seems all the developments in fairness over the last several years have
> passed completely over the heads of many on this list. This page might fill
> in the gaps for those who missed the last few years:
> <http://trac.tools.ietf.org/group/irtf/trac/wiki/CapacitySharingArch>


> To address buffer bloat, I advise we "do one thing and do it well": bulk
> AQM.

If you have an algorithm to suggest, I'd gladly look at it.

>
> In a nutshell, bit-rate equality, where each of N active users gets 1/N of
> the bit-rate, was found to be extremely _unfair_ when the activity of
> different users is widely different. For example:
> * 5 light users all active 1% of the time get close to 100% of a shared link
> whenever they need it.
> * However, if instead 2 of these users are active 100% of the time, FQ gives
> the other three light users only 33% of the link whenever they are active.
> * That's pretty rubbish for a solution that claims to isolate each user from
> the excesses of others.

Without AQM or FQ, we have a situation where one stream from one user
at a site can eat more than 100% of the bandwidth.

1/u would be a substantial improvement!

Secondly, as most devices lack both AQM and FQ these days - despite,
as one of the papers referenced said, "considering that to be a bug" -
people are doing DoS attacks on themselves whenever they attempt
to do something that requires high bandwidth AND do something interactively.

Thirdly I kind of need to clarify the usage of three terms in this discussion.

To me, a *user* is - mom, dad, son, daughter, and to some extent their
iPods, iPads, and TiVos. The interesting thing about this scenario is
that it is the device you are in front of that you want the best
interactive performance from. A second interesting thing is that the
most interactive yet bulk flows - TV streams - have an upper limit on
their rate, and also tend to adjust fairly well to lower ones.

A *site* is a home, small business, cybercafe, office, etc. As the size of a
'site' scales up to well beyond what a polyamorous catholic family could
produce in terms of users, additional techniques do apply.

(While what we're working on should work well on handhelds, desktops,
many kinds of servers, home routers, etc., I am far from convinced it's
applicable on the ISP->customer side, aside from the FQ portion breaking
up streams enough for various algorithms to work better.)

Anyway:

I think sometimes 'users' and 'sites' get conflated.

As you touch upon, the policies that a 'site' implements in order to manage
their internet access do not enter the political arena in any way. Those
policies are in the interest of family harmony only, and giving dad or
daughter knobs to turn that actually work is in everyones best interest.

Thirdly, the definition of a "flow", within a site, is more flexible than
what a provider can see beyond NAT. The device on the customer's
site can regulate flows at levels ranging from mere IP address
(even the MAC address), to ports, to tuples consisting of the time
of day and DSCP information...

As a demo to myself I got 1/u to work nearly perfectly a while
back.

And I left in the ability to manage multicast (to limit it
*enough* so it could be *used* in wireless, without damaging the
rest of the network at the *site*). I LIKE multicast, but the
development of switches and wireless needs it to be managed
properly.
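
(A sketch of that kind of rule - not my exact demo config; class
numbers and rates are illustrative:

# steer all IPv4 multicast into its own capped class
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 1mbit ceil 2mbit
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dst 224.0.0.0/4 flowid 1:30
)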

>
> Since 2004, we now understand that fairness has to involve accounting over
> time. That requires per-user state, which is not something you can do, or
> that you need to do, within every queue. We should leave fairness to
> separate code, probably on machines specialising in this at the edge of a
> provider's network domain - where it knows about users/customers - separate
> from the AQM code of each specific queue.

You are taking the provider-in approach; we're on the device-out approach.
The two approaches meet at the edge gateway of the site.

Firstly, FQ helps on the user's machine, as does AQM. Not so much
at gigE speeds, as the instantaneous queue length is rarely long enough,
but a lot on wireless, where you always have ~4 ms worth of queue.

FQ hurts the user herself on BitTorrent-like applications, and
there are multiple ways to manage that. For example, a new way showed
up in the current tree for putting applications in cgroup containers that could
set their network 'priority'. The standard way is for BitTorrent to have a rate
limit. Another way IS for an AQM or app to be congestion-aware, much like
with your proposed ConEx stuff. The way I was considering was playing
with the TCP-LEDBAT code to be more datacenter-TCP-like...

Secondly, a combination of FQ and AQM is the approach we are taking, more or less.

I started with HTB + QFQ + RED/SFB etc... back in August...

At the moment Eric is scaling up SFQ and adding
a couple of variants of RED. The net behavior should be similar to what
you describe. If you care to try it, it's mostly in Linus's kernel at this
point.

It was possibly unclear to others that we have never thought that
FQ alone was the answer. In fact we started off saying that
better AQM was the answer, and then realized that the
problem had multiple aspects that could be addressed
independently, and thus started saying FQ + AQM was
a way forward.

I would LOVE a 'bulk AQM' solution, and I'll be more
than glad to start working on solving 2008's problems...
after we get done with solving 1985's problems...
by applying techniques that were well understood by the late 90s...
and disremembered by an entire generation of device makers,
driver writers, OS developers, network designers, buyers, and benchmarkers.

But first...

1) We needed *any* form of packet scheduler to
actually work, which we couldn't do until a few months back
due to overly huge TX rings in all the current Ethernet devices.

Fixed by BQL. (BQL's per-queue byte limits show up in sysfs - see the
sketch after this list.)

2) We then needed a FQ mechanism to actually behave as designed
- which both SFQ and QFQ do now - they weren't, for years.

3) We also needed AQMs that worked - still do. I'm looking forward
to a successor to RED, and there were some implementation bugs
in RED that made turning its knobs do nothing, which are fixed now
and need testing, and a couple of new twists on it in SFQRED...

4) We needed a way to handle TSO properly, as byte-oriented AQMs
    handled it badly. Sort of fixed now.

Please feel free to evaluate and critique SFQRED.
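
Something along these lines should exercise it, going by the patch
description - treat every number here as a knob to play with, not a
recommendation:

tc qdisc add dev eth0 root sfq limit 3000 headdrop flows 512 \
    divisor 16384 redflowlimit 100000 min 8000 max 60000 \
    probability 0.20 ecn

(You'll need a tc that knows the new options - the debloat-iproute2
tree, presumably.)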


In my case I plan to continue working with HTB, QFQ
(which, btw, has multiple interesting ways to engage
weighting mechanisms) and various sub-qdiscs... *after SFQ stabilizes*.

I kind of view QFQ as a qdisc construction set.
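
By way of illustration - an untested sketch, with the device, weights,
and classifiers all invented for the example:

tc qdisc add dev eth0 root handle 1: qfq
# two weighted classes; weight sets the share under contention
tc class add dev eth0 parent 1: classid 1:1 qfq weight 10
tc class add dev eth0 parent 1: classid 1:2 qfq weight 1
# each class can carry its own sub-qdisc - mix and match
tc qdisc add dev eth0 parent 1:1 sfq perturb 10
tc qdisc add dev eth0 parent 1:2 red limit 400000 min 30000 max 90000 \
    avpkt 1000 burst 55 ecn
# QFQ has no default class, so filters must place every packet
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip tos 0xb8 0xfc flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 2 u32 \
    match u32 0 0 flowid 1:2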

We're well aware that these may not be the ultimate answer to
all the known networking problems in the universe, but what
we have working in the lab is making a big dent in them.

And now that bugfix #1 above exists...

Let a thousand new AQMs bloom!

If there is anything in what we're up to that will damage the
internet worse than it is damaged already, let us know soonest.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally...   winning on wired!
  2012-01-07 19:42                   ` Bob Briscoe
@ 2012-01-07 22:16                     ` Wesley Eddy
  0 siblings, 0 replies; 29+ messages in thread
From: Wesley Eddy @ 2012-01-07 22:16 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: bloat

+1 to everything that Bob said.  In the last several years, the
thinking on this topic has advanced quite a bit, largely due to
work that Bob, Matt Mathis, and others are doing.  In my opinion,
pure FQ concepts are interesting and may have some application,
but for the general Internet, are "not even wrong".


-- 
Wes Eddy
MTI Systems

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally...   winning on wired!
  2012-01-06 20:34                 ` Jonathan Morton
@ 2012-01-07 19:42                   ` Bob Briscoe
  2012-01-07 22:16                     ` Wesley Eddy
  0 siblings, 1 reply; 29+ messages in thread
From: Bob Briscoe @ 2012-01-07 19:42 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: bloat

Jonathan,

At 20:34 06/01/2012, Jonathan Morton wrote:

>On 5 Jan, 2012, at 7:52 pm, Bob Briscoe wrote:
>
> > The LEDBAT/BitTorrent issue wouldn't be fixed by per-host FQ.
> > LEDBAT/uTP tries to yield to other hosts, not just its own host.
>
>According to the LEDBAT I-D 
>(https://datatracker.ietf.org/doc/draft-ietf-ledbat-congestion/?include_text=1), 
>they expressly considered the effect of AQM and FQ, and considered 
>that even if they defeated the LEDBAT mechanism itself, it didn't 
>matter because they would achieve the LEDBAT *goal*.

Yup, I was involved in the ML wordsmithing of that. You've subtly 
changed the sense by altering the words: The draft says "achieving an 
outcome that is in line with LEDBAT's goals", which is not the same 
as "achieving LEDBAT's goals [in full]".

>That goal is to avoid starving other flows, *not* to ensure that 
>LEDBAT flows would always be starved by others.

But a goal of LEDBAT *is* to temporarily yield to interactive flows. 
FQ trumps LEDBAT and prevents it achieving this goal.

> > In fact, in the early part of the last decade, the whole issue of 
> long-running vs interactive flows showed how broken any form of FQ was.
>
>Wait, WTF?  Isn't the long-running versus interactive problem 
>precisely what FQ *does* solve, by prioritising sparse flows over dense ones?

Nooo. Pure FQ doesn't prioritise; it equalises (instantaneous
bit-rate). Cisco's WFQ does give a very short period of extra
scheduling priority at the start of a flow. But that is a tiny effect
compared to what would be required for true fairness /over time/.

>We do need both per-flow and per-user fairness.

No. One precludes the other - you can't have both. We need per-user 
fairness, which requires very unequal bit-rates per flow. The more 
that per-flow fairness is enforced in each queue, the harder it will 
be to remove it from the Internet to get per-user fairness.

See "Flow Rate Fairness: Dismantling a Religion" 
<http://dl.acm.org/citation.cfm?id=1232926>

For the avoidance of doubt, if anyone thinks "per-user fairness" 
means equal bit-rate per user, that's not what I take it to mean. I 
mean fairness /over time/. Not equality only at each instant, which 
doesn't take account of the huge benefit from a user staying inactive.

>SFQ and QFQ aim for per-flow fairness, as currently 
>implemented.  Providers currently use a variety of mechanisms - some 
>more effective or more morally acceptable than others - to implement 
>per-user fairness.

Per-user fairness is a perfectly innocent goal. The political problem 
has been about who's in control. That doesn't make per-user fairness 
wrong. The ISP deciding what per-user fairness means has been the 
wrong part. As you say, some ISPs have done this broadly in their
customers' interests (more so where there's the discipline of competition).

[Aside: I believe some anti-bloat code being proposed on this list 
uses exactly the same politically unacceptable DPI techniques to 
detect exceptions to FQ (e.g. for LEDBAT). Having to make exceptions
to FQ should ring alarm bells that FQ isn't actually a good starting 
point. But people get confused and think that FQ holds the moral high 
ground, perhaps because it has inappropriately hijacked the word 
'fair' in its name.]


>But there is currently no easy way for the latter to communicate 
>with the former - ECN doesn't count here - if the former is 
>implemented at the CPE, thereby reducing their effectiveness.  Heck, 
>I have to manually configure my "router" (actually a computer) to 
>know what the upload bandwidth of the modem is.
>
>Why doesn't ECN count?  Because the signalled packets come through 
>the wrong channel - flowing past the router and passing through a 
>different queue, facing in the opposite direction.  The queue that 
>needs to see the information, doesn't.  In any case, if ECN were 
>already deployed sufficiently well, the sending host would be 
>backing off appropriately and we wouldn't be talking about the problem here.

We invented ConEx (congestion exposure) precisely to solve this 
problem of the signals being in the wrong direction.
<http://tools.ietf.org/html/draft-ietf-conex-abstract-mech>

Cheers


Bob


>  - Jonathan Morton

________________________________________________________________
Bob Briscoe,                                BT Innovate & Design 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally...  winning on wired!
  2012-01-05 17:52               ` Bob Briscoe
  2012-01-06 17:42                 ` Jim Gettys
@ 2012-01-06 20:34                 ` Jonathan Morton
  2012-01-07 19:42                   ` Bob Briscoe
  2012-01-08  0:40                 ` Dave Taht
  2 siblings, 1 reply; 29+ messages in thread
From: Jonathan Morton @ 2012-01-06 20:34 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: bloat


On 5 Jan, 2012, at 7:52 pm, Bob Briscoe wrote:

>> > 1: the 'slower flows gain priority' question is my gravest concern
>> > (eg, ledbat, bittorrent). It's fixable with per-host FQ.
>> 
>> Meaning that you don't want to hand priority to stuff that is intended
>> to stay in the background?
> 
> The LEDBAT/BitTorrent issue wouldn't be fixed by per-host FQ.
> LEDBAT/uTP tries to yield to other hosts, not just its own host.

According to the LEDBAT I-D (https://datatracker.ietf.org/doc/draft-ietf-ledbat-congestion/?include_text=1), they expressly considered the effect of AQM and FQ, and considered that even if they defeated the LEDBAT mechanism itself, it didn't matter because they would achieve the LEDBAT *goal*.

That goal is to avoid starving other flows, *not* to ensure that LEDBAT flows would always be starved by others.

> In fact, in the early part of the last decade, the whole issue of long-running vs interactive flows showed how broken any form of FQ was.

Wait, WTF?  Isn't the long-running versus interactive problem precisely what FQ *does* solve, by prioritising sparse flows over dense ones?

We do need both per-flow and per-user fairness.  SFQ and QFQ aim for per-flow fairness, as currently implemented.  Providers currently use a variety of mechanisms - some more effective or more morally acceptable than others - to implement per-user fairness.

But there is currently no easy way for the latter to communicate with the former - ECN doesn't count here - if the former is implemented at the CPE, thereby reducing their effectiveness.  Heck, I have to manually configure my "router" (actually a computer) to know what the upload bandwidth of the modem is.

Why doesn't ECN count?  Because the signalled packets come through the wrong channel - flowing past the router and passing through a different queue, facing in the opposite direction.  The queue that needs to see the information, doesn't.  In any case, if ECN were already deployed sufficiently well, the sending host would be backing off appropriately and we wouldn't be talking about the problem here.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-06 17:42                 ` Jim Gettys
  2012-01-06 18:09                   ` Dave Taht
@ 2012-01-06 19:57                   ` Eric Dumazet
  1 sibling, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-01-06 19:57 UTC (permalink / raw)
  To: Jim Gettys; +Cc: bloat

On Friday 06 January 2012 at 12:42 -0500, Jim Gettys wrote:

> That's not how these queuing disciplines work, from what little I
> understand: I'll let Eric and Dave explain exactly what SFQ + QFQ etc.
> are doing.  I think the word "fair" is getting over-used/abused and can
> cause us all to have terminal confusion, since "fair queuing" in the
> research community has a specific meaning.
> 
> Eric, Dave, care to explain exactly what these queuing disciplines are
> doing, and how they differ?


Well, SFQ and QFQ are different, because in SFQ you cannot have the
flexibility of QFQ. Typical use of SFQ is using the rxhash and letting
the thing do its best, while in QFQ you can use different weights per
class and build a very complex setup.

But in the end, it's FQ. With all the pros and cons.

> 
> >
> > Since 2004, we now understand that fairness has to involve accounting
> > over time. That requires per-user state, which is not something you
> > can do, or that you need to do, within every queue. We should leave
> > fairness to separate code, probably on machines specialising in this
> > at the edge of a provider's network domain - where it knows about
> > users/customers - separate from the AQM code of each specific queue.
> 
> 
> In home routers we *are* at the very edge of the network (in fact beyond
> the ISP's domain in the case of cable), and that is the focus of the
> experiments that Eric and Dave have been doing.  And here we can afford
> to keep track of per-user state in those devices.  That some of these
> queuing disciplines have such nice effects on latency of many
> applications is gravy.
> 
> This is the point where we go from multiple *users* to the single
> *customer*, which is what you see from where you sit.
> 
> I think "fair queueing" (whatever fair means) is pretty orthogonal to
> AQM in general: we need AQM to keep the overall buffers from filling
> from elephant flows and running full, which is never good, and causes
> packet loss, which ultimately these queuing disciplines don't handle... 
> And AQM is probably all that is needed upstream of the customer,
> particularly since "fairness" is entirely in the eye of the beholder. 
> And we still need an AQM known to work presented with variable bandwidth
> :-(....
> 
> I think we're in agreement here.  And if not, this is the time to get
> everyone to understand and share terminology.
> 

I fully agree. We know for sure that a big part of the problem is on the
host side, or in the appliance right before it (the typical ADSL box /
wifi AP).

With a _single_ TCP upload, we can have lockout and poor interactive
results, no matter what work Bob is going to do in the provider network.

In our home routers, we serve a small number of 'users', and we have
enough CPU power to do some classification and try to improve things
before packets leave the box and enter the ISP domain.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-06 17:42                 ` Jim Gettys
@ 2012-01-06 18:09                   ` Dave Taht
  2012-01-06 19:57                   ` Eric Dumazet
  1 sibling, 0 replies; 29+ messages in thread
From: Dave Taht @ 2012-01-06 18:09 UTC (permalink / raw)
  To: Jim Gettys; +Cc: bloat

On Fri, Jan 6, 2012 at 6:42 PM, Jim Gettys <jg@freedesktop.org> wrote:
> On 01/05/2012 12:52 PM, Bob Briscoe wrote:

> Eric, Dave, care to explain exactly what these queuing disciplines are
> doing, and how they differ?

Well, they are changing so fast as Eric assembles pieces of the puzzle
that I think doing it in a wiki page makes more sense than email.

The latest piece is Eric's combination of SFQ and RED, which
has some very interesting properties; a good description
is here:

http://patchwork.ozlabs.org/patch/134677/

There are still some more pieces left to come (coping with
time in queue for variable bandwidth), but the results thus far have
been very encouraging.

While I compose my thoughts, kernel bql-15 contains the above patch
(and several dozen more prior - all of which besides the above
are now in linux-3.3)

http://huchra.bufferbloat.net/~d/bql/

and debloat-iproute2 will contain (as soon as I patch it in) stuff to
exercise it
(I think adaptive RED is missing from the iproute2 tree, too)

https://github.com/dtaht/deBloat-iproute2

It already has code to exercise everything else that has gone in over
the past few weeks for QFQ, netem, and SFQ.

Stuff to exercise QFQ is in deBloat.

Trying to write all this up comprehensively is going to take a
while. Let me get some stuff patched and built first...

>
>>
>> Since 2004, we now understand that fairness has to involve accounting
>> over time. That requires per-user state, which is not something you
>> can do, or that you need to do, within every queue. We should leave
>> fairness to separate code, probably on machines specialising in this
>> at the edge of a provider's network domain - where it knows about
>> users/customers - separate from the AQM code of each specific queue.
>
>
> In home routers we *are* at the very edge of the network (in fact beyond
> the ISP's domain in the case of cable), and that is the focus of the
> experiments that Eric and Dave have been doing.  And here we can afford
> to keep track of per-user state in those devices.  That some of these
> queuing disciplines have such nice effects on latency of many
> applications is gravy.
>
> This is the point where we go from multiple *users* to the single
> *customer*, which is what you see from where you sit.
>
> I think "fair queueing" (whatever fair means) is pretty orthogonal to
> AQM in general: we need AQM to keep the overall buffers from filling
> from elephant flows and running full, which is never good, and causes
> packet loss, which ultimately these queuing disciplines don't handle...
> And AQM is probably all that is needed upstream of the customer,
> particularly since "fairness" is entirely in the eye of the beholder.
> And we still need an AQM known to work presented with variable bandwidth
> :-(....
>
> I think we're in agreement here.  And if not, this is the time to get
> everyone to understand and share terminology.
>
>                    - Jim
>
>>
>>
>> Bob
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> ________________________________________________________________
>> Bob Briscoe,                                BT Innovate & Design
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-05 17:52               ` Bob Briscoe
@ 2012-01-06 17:42                 ` Jim Gettys
  2012-01-06 18:09                   ` Dave Taht
  2012-01-06 19:57                   ` Eric Dumazet
  2012-01-06 20:34                 ` Jonathan Morton
  2012-01-08  0:40                 ` Dave Taht
  2 siblings, 2 replies; 29+ messages in thread
From: Jim Gettys @ 2012-01-06 17:42 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: bloat

On 01/05/2012 12:52 PM, Bob Briscoe wrote:
> Jim, Justin,
>
> Jumping back one posting in this thread...
>
> At 17:36 04/01/2012, Justin McCann wrote:
>> On Wed, Jan 4, 2012 at 11:16 AM, Dave Taht <dave.taht@gmail.com> wrote:
>> >
>> > On Wed, Jan 4, 2012 at 4:25 PM, Jim Gettys <jg@freedesktop.org> wrote:
>> >
>> > 1: the 'slower flows gain priority' question is my gravest concern
>> > (eg, ledbat, bittorrent). It's fixable with per-host FQ.
>>
>> Meaning that you don't want to hand priority to stuff that is intended
>> to stay in the background?
>
> The LEDBAT/BitTorrent issue wouldn't be fixed by per-host FQ.
> LEDBAT/uTP tries to yield to other hosts, not just its own host.
>
> In fact, in the early part of the last decade, the whole issue of
> long-running vs interactive flows showed how broken any form of FQ
> was. This was why ISPs moved away from rate equality (whether
> per-flow, per-host or per-customer site) to various
> per-customer-volume-based approaches (or a mix of both).
>
> There seems to be an unspoken assumption among many on this list that
> rate equality must be integrated into each AQM implementation. That's
> so 2004. It seems all the developments in fairness over the last
> several years have passed completely over the heads of many on this
> list. This page might fill in the gaps for those who missed the last
> few years:
> <http://trac.tools.ietf.org/group/irtf/trac/wiki/CapacitySharingArch>
>
> To address buffer bloat, I advise we "do one thing and do it well":
> bulk AQM.
>
> In a nutshell, bit-rate equality, where each of N active users gets
> 1/N of the bit-rate, was found to be extremely _unfair_ when the
> activity of different users is widely different. For example:
> * 5 light users all active 1% of the time get close to 100% of a
> shared link whenever they need it.
> * However, if instead 2 of these users are active 100% of the time, FQ
> gives the other three light users only 33% of the link whenever they
> are active.
> * That's pretty rubbish for a solution that claims to isolate each
> user from the excesses of others.

That's not how these queuing disciplines work, from what little I
understand: I'll let Eric and Dave explain exactly what SFQ + QFQ etc.
are doing.  I think the word "fair" is getting over-used/abused and can
cause us all to have terminal confusion, since "fair queuing" in the
research community has a specific meaning.

Eric, Dave, care to explain exactly what these queuing disciplines are
doing, and how they differ?

>
> Since 2004, we now understand that fairness has to involve accounting
> over time. That requires per-user state, which is not something you
> can do, or that you need to do, within every queue. We should leave
> fairness to separate code, probably on machines specialising in this
> at the edge of a provider's network domain - where it knows about
> users/customers - separate from the AQM code of each specific queue.


In home routers we *are* at the very edge of the network (in fact beyond
the ISP's domain in the case of cable), and that is the focus of the
experiments that Eric and Dave have been doing.  And here we can afford
to keep track of per-user state in those devices.  That some of these
queuing disciplines have such nice effects on latency of many
applications is gravy.

This is the point where we go from multiple *users* to the single
*customer*, which is what you see from where you sit.

I think "fair queueing" (whatever fair means) is pretty orthogonal to
AQM in general: we need AQM to keep the overall buffers from filling
from elephant flows and running full, which is never good, and causes
packet loss, which ultimately these queuing disciplines don't handle... 
And AQM is probably all that is needed upstream of the customer,
particularly since "fairness" is entirely in the eye of the beholder. 
And we still need an AQM known to work when presented with variable
bandwidth :-(....

I think we're in agreement here.  And if not, this is the time to get
everyone to understand and share terminology.

                    - Jim

>
>
> Bob
>
>
>
>
>
>
>
>
>
> ________________________________________________________________
> Bob Briscoe,                                BT Innovate & Design


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally...  winning on wired!
       [not found]             ` <CAFkTFa89mOmbcOV1PWX3my04rK4NsEvyakcQV2j54qa0gzAViQ@mail.gmail.com>
@ 2012-01-05 17:52               ` Bob Briscoe
  2012-01-06 17:42                 ` Jim Gettys
                                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Bob Briscoe @ 2012-01-05 17:52 UTC (permalink / raw)
  To: Jim Gettys, Justin McCann; +Cc: bloat

Jim, Justin,

Jumping back one posting in this thread...

At 17:36 04/01/2012, Justin McCann wrote:
>On Wed, Jan 4, 2012 at 11:16 AM, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > On Wed, Jan 4, 2012 at 4:25 PM, Jim Gettys <jg@freedesktop.org> wrote:
> >
> > 1: the 'slower flows gain priority' question is my gravest concern
> > (eg, ledbat, bittorrent). It's fixable with per-host FQ.
>
>Meaning that you don't want to hand priority to stuff that is intended
>to stay in the background?

The LEDBAT/BitTorrent issue wouldn't be fixed by per-host FQ.
LEDBAT/uTP tries to yield to other hosts, not just its own host.

In fact, in the early part of the last decade, the whole issue of 
long-running vs interactive flows showed how broken any form of FQ 
was. This was why ISPs moved away from rate equality (whether 
per-flow, per-host or per-customer site) to various 
per-customer-volume-based approaches (or a mix of both).

There seems to be an unspoken assumption among many on this list that 
rate equality must be integrated into each AQM implementation. That's 
so 2004. It seems all the developments in fairness over the last 
several years have passed completely over the heads of many on this 
list. This page might fill in the gaps for those who missed the last few years:
<http://trac.tools.ietf.org/group/irtf/trac/wiki/CapacitySharingArch>

To address buffer bloat, I advise we "do one thing and do it well": bulk AQM.

In a nutshell, bit-rate equality, where each of N active users gets 
1/N of the bit-rate, was found to be extremely _unfair_ when the 
activity of different users is widely different. For example:
* 5 light users all active 1% of the time get close to 100% of a 
shared link whenever they need it.
* However, if instead 2 of these users are active 100% of the time, 
FQ gives the other three light users only 33% of the link whenever 
they are active.
* That's pretty rubbish for a solution that claims to isolate each 
user from the excesses of others.

Since 2004, we now understand that fairness has to involve accounting 
over time. That requires per-user state, which is not something you 
can do, or that you need to do, within every queue. We should leave 
fairness to separate code, probably on machines specialising in this 
at the edge of a provider's network domain - where it knows about 
users/customers - separate from the AQM code of each specific queue.


Bob









________________________________________________________________
Bob Briscoe,                                BT Innovate & Design 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-04 17:36           ` Justin McCann
@ 2012-01-04 17:40             ` Eric Dumazet
       [not found]             ` <CAFkTFa89mOmbcOV1PWX3my04rK4NsEvyakcQV2j54qa0gzAViQ@mail.gmail.com>
  1 sibling, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-01-04 17:40 UTC (permalink / raw)
  To: Justin McCann; +Cc: bloat

On Wednesday 04 January 2012 at 12:36 -0500, Justin McCann wrote:

> I must have misinterpreted the 'new flows' discussion on netdev. Are
> these new flows in the sense of SYN/SYN-ACK, or new flows in the sense
> of "don't have any packets in the queue(s) right now"?

New flows in the sense "don't have any packets in the queue(s) right
now"

(i.e. a non-backlogged flow)




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
  2012-01-04 16:16         ` Dave Taht
  2012-01-04 17:23           ` Jim Gettys
@ 2012-01-04 17:36           ` Justin McCann
  2012-01-04 17:40             ` Eric Dumazet
       [not found]             ` <CAFkTFa89mOmbcOV1PWX3my04rK4NsEvyakcQV2j54qa0gzAViQ@mail.gmail.com>
  1 sibling, 2 replies; 29+ messages in thread
From: Justin McCann @ 2012-01-04 17:36 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat

On Wed, Jan 4, 2012 at 11:16 AM, Dave Taht <dave.taht@gmail.com> wrote:
>
> On Wed, Jan 4, 2012 at 4:25 PM, Jim Gettys <jg@freedesktop.org> wrote:
>
> >    1) since TCP is not "fair", particularly when given flows of
> > different RTT's, how do we best deal with this issue?  Do either/both
> > SFQ/QFQ deal with this problem, and how do they differ?
>
> The reason why QFQ was outperforming SFQ here
>
> http://www.teklibre.com/~d/bloat/pfifo_sfq_vs_qfq_linear50.png
>
> was because SFQ was  enqueuing the first packet of a new stream at the
> end of the existing streams.
>
> After eric moved the SFQ enqueue to head, the two started performing
> the same at light workloads.
>
> (keep in mind the comparison already to pfifo_fast - log scale)
>
> http://www.teklibre.com/~d/bloat/pfifo_fast_vs_sfq_qfq_log.png
>
> So, the net effect of either fq mechanism is that slower flows jump
> forward in the queue.
> RTT  is not really relevant. [1]

I must have misinterpreted the 'new flows' discussion on netdev. Are
these new flows in the sense of SYN/SYN-ACK, or new flows in the sense
of "don't have any packets in the queue(s) right now"?


> 1: the 'slower flows gain priority' question is my gravest concern
> (eg, ledbat, bittorrent). It's fixable with per-host FQ.

Meaning that you don't want to hand priority to stuff that is intended
to stay in the background? How does the classification work you've
been doing interact with these latest QFQ tests?

   Justin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re:  finally... winning on wired!
  2012-01-04 16:16         ` Dave Taht
@ 2012-01-04 17:23           ` Jim Gettys
  2012-01-04 17:36           ` Justin McCann
  1 sibling, 0 replies; 29+ messages in thread
From: Jim Gettys @ 2012-01-04 17:23 UTC (permalink / raw)
  To: Dave Taht; +Cc: andrewmcgr, bloat

On 01/04/2012 11:16 AM, Dave Taht wrote:
>
>    3) since game manufacturers have noted the diffserv marking in
> PFIFO-FAST, what do these queuing disciplines currently do?
> They pay no attention to diffserv. It's possible we have a small
> problem here, but I'd wager not.

That's not what I understand from talking to Andrew McGregor: he says
the game companies have figured out that diffserv marking is respected
by PFIFO-FAST, and are marking appropriately (since most current routers
are doing PFIFO-FAST, as it's been the default in Linux for quite a while).

But the flows may be sparse enough it may not matter so much.

I think we need to understand this in more depth...

>
> If their packets are not saturating the link, and/or it's per
> host FQ, they are going to win.
>
> isochronous flows (like voice, even (especially) skype) should respond well to
> this stuff, too. do, actually. skype is nicer now.
>
>
> footnotes:
>
> 1: the 'slower flows gain priority' question is my gravest concern
> (eg, ledbat, bittorrent). It's fixable with per-host FQ.
>
> 2: I note that identifying "a host" is harder than it was in nagle's
> day - with multiple protocols in use a hash against an IP address is
> not accurate, nor, by the time you hit the egress interface, do you
> have a mac address to use.
>
> I did not solve that problem in my home router gw/aqm/fq prototype.
> The approximation was a hash that matched an ip or ipv6 address.
>
> 3: (and conversely, existing elephants slow down). (and when you drop
> packets there is a good question, but you do end up with classic tail
> drop behavior in both QFQ and SFQ on a per flow basis. You can get
> head drop out of QFQ, actually)
>
> 4: especially in variable bandwidth scenarios, the depth of even the
> fairest queue needs to be managed somehow.
>
> But I can live with the nearly two orders of magnitude of basic
> latency improvement we've got now, as well
> as the probability this induces of doing smarter/effective queue
> length management, and keep getting the bugs out of it....
>
>>
>
>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re:  finally... winning on wired!
  2012-01-04 15:25       ` [Bloat] What is fairness, anyway? was: " Jim Gettys
  2012-01-04 16:16         ` Dave Taht
@ 2012-01-04 16:22         ` Eric Dumazet
  2012-02-05  0:24         ` George B.
  2 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-01-04 16:22 UTC (permalink / raw)
  To: Jim Gettys; +Cc: bloat

On Wednesday 04 January 2012 at 10:25 -0500, Jim Gettys wrote:

> As I read this thread, there are three questions that go through my mind:
>     1) since TCP is not "fair", particularly when given flows of
> different RTT's, how do we best deal with this issue?  Do either/both
> SFQ/QFQ deal with this problem, and how do they differ?

The main idea behind "fairness" is that one or a few TCP flows are not
able to clamp the link (because peers are very close, RTT very small).

It's not "globally fair" in the common sense.

QFQ is more complex, since you can classify your packets more precisely
than with SFQ.

SFQ handles each "flow" the same as the others in terms of weight
(bandwidth), and is light, while QFQ permits very fine
control but is also more difficult to set up (and apparently to debug...)

>     2) Web browsers are doing "unfair" things at the moment
> (unless/until HTTP/1.1 pipelining and/or SPDY deploys), by opening many
> TCP connections at the same time.  So it's easy for there to be a bunch
> of flows by the same user.  Is "fairness" better a per host property in
> the home environment, or a per TCP flow?  Particularly if we someday
> start diffserv marking traffic, I suspect per host is more "fair", at
> least for unmarked traffic.

By default, SFQ does a full flow classification (using a hash on the
tuple (src ip, dst ip, src port, dst port, proto...)), but you can
configure it to hash only on dst ip (or any combination).

Then, all 'users' behind a NAT will be considered as a single user.

It's "unfair"...


>     3) since game manufacturers have noted the diffserv marking in
> PFIFO-FAST, what do these queuing disciplines currently do?

SFQ doesn't care, if used as the single qdisc on your device.

If you want/need such classification, a more complex setup is needed, to
separate high-priority traffic from 'other traffic'.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Bloat] What is fairness, anyway? was: Re:  finally... winning on wired!
  2012-01-04 15:25       ` [Bloat] What is fairness, anyway? was: " Jim Gettys
@ 2012-01-04 16:16         ` Dave Taht
  2012-01-04 17:23           ` Jim Gettys
  2012-01-04 17:36           ` Justin McCann
  2012-01-04 16:22         ` Eric Dumazet
  2012-02-05  0:24         ` George B.
  2 siblings, 2 replies; 29+ messages in thread
From: Dave Taht @ 2012-01-04 16:16 UTC (permalink / raw)
  To: Jim Gettys; +Cc: bloat

On Wed, Jan 4, 2012 at 4:25 PM, Jim Gettys <jg@freedesktop.org> wrote:
> On 01/02/2012 04:31 PM, Dave Taht wrote:
>> On Mon, Jan 2, 2012 at 9:07 AM, Dave Taht <dave.taht@gmail.com> wrote:
>> Yes, that patch brings SFQ at light workloads to being
>> indistinguishable from QFQ!
>> http://www.teklibre.com/~d/bloat/sfqnewvsqfq10iperfs.png (if you stare
>> at this image long enough you might see a pattern, but I don't) (I
>> certainly am seeing an afterimage, though)
>>>> A "nolimit" implementation could use a dynamic memory allocator
>>>> scheme, eventually consuming less memory on typical use :)
>> At what point could SFQ be considered a replacement for pfifo_fast? :)
>>
>> I have not managed to crash QFQ yet with your other new patch. I will
>> run it overnight.
>>
>>
>
> As I read this thread, there are three questions that go through my mind:

the thread moved to netdev - more sfq improvements on the way

http://www.spinics.net/lists/netdev/msg184686.html

>    1) since TCP is not "fair", particularly when given flows of
> different RTT's, how do we best deal with this issue?  Do either/both
> SFQ/QFQ deal with this problem, and how do they differ?

The reason why QFQ was outperforming SFQ here

http://www.teklibre.com/~d/bloat/pfifo_sfq_vs_qfq_linear50.png

was because SFQ was enqueuing the first packet of a new stream at the
end of the existing streams.

After Eric moved the SFQ enqueue to the head, the two started performing
the same at light workloads.

(keep in mind the comparison already to pfifo_fast - log scale)

http://www.teklibre.com/~d/bloat/pfifo_fast_vs_sfq_qfq_log.png

So, the net effect of either FQ mechanism is that slower flows jump
forward in the queue.
RTT is not really relevant. [1]

The more sparse a flow is, the higher the likelihood that it will be
prioritized this way.

So we converge towards an optimum for longer RTT flows vs shorter
ones, automagically, or so I think.

But you ask a good question: the testing done thus far has been in
fixed-RTT scenarios, with the exception of the wireless test I ran a
few days back.

http://www.teklibre.com/~d/bloat/qfq_vs_pfifo_fast_wireless_iwl_card_vs_cerowrt.pdf
[4]

>    2) Web browsers are doing "unfair" things at the moment
> (unless/until HTTP/1.1 pipelining and/or SPDY deploys), by opening many
> TCP connections at the same time.  So it's easy for there to be a bunch
> of flows by the same user.

> Is "fairness" better a per host property in
> the home environment, or a per TCP flow?  Particularly if we someday
> start diffserv marking traffic, I suspect per host is more "fair", at
> least for unmarked traffic.

I don't think it's an "or" question. I think it's an "and", and
diffserv is of very limited use.

The proof of concept implementation I did a few months back did:

At the router:

FQ on a per host basis
and then FQ within each queue for each host. [2]
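
Roughly like the following, as an idea sketch - the real prototype
hashed addresses instead of enumerating hosts, and every name and
number here is invented:

tc qdisc add dev eth0 root handle 1: qfq
tc class add dev eth0 parent 1: classid 1:10 qfq    # host A's queue
tc class add dev eth0 parent 1: classid 1:20 qfq    # host B's queue
tc qdisc add dev eth0 parent 1:10 sfq perturb 10    # per-flow FQ inside
tc qdisc add dev eth0 parent 1:20 sfq perturb 10
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dst 172.16.0.10 flowid 1:10
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dst 172.16.0.11 flowid 1:20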

Hosts should DEFINITELY run something like SFQ rather than PFIFO_FAST.
SYNs, DNS lookups, flows in slow start - all benefit. [3]
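
On a host that's a one-liner (interface name being the usual
placeholder):

tc qdisc replace dev eth0 root sfq perturb 10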

and on top of that, since most web flows are short and the uplink slow,
sending 8 ACKs for 8 streams gives you more data faster, in aggregate,
than sending all the ACKs for one stream, then the next, since a short
flow finishes quicker.

and of course the other nice thing about FQ is that if you lose a bunch
of ACKs, the losses are spread across multiple streams rather than
concentrated on one.

Determining the net effect of ALL THAT is up for more testing, but
I've been running BQL + QFQ (and now SFQ) since mid-November,
and the improvement in my web 'experience' is often amazing,
as are things like interactive ssh sessions. I can have a fully loaded
up network - totally saturated - and barely notice.

>    3) since game manufacturers have noted the diffserv marking in
> PFIFO-FAST, what do these queuing disciplines currently do?

They pay no attention to diffserv. It's possible we have a small
problem here, but I'd wager not.

If their packets are not saturating the link, and/or it's per-host
FQ, they are going to win.

Isochronous flows (like voice - even (especially) Skype) should respond
well to this stuff, too. They do, actually. Skype is nicer now.


footnotes:

1: the 'slower flows gain priority' question is my gravest concern
(eg, ledbat, bittorrent). It's fixable with per-host FQ.

2: I note that identifying "a host" is harder than it was in Nagle's
day - with multiple protocols in use, a hash against an IP address is
not accurate, nor, by the time you hit the egress interface, do you
have a MAC address to use.

I did not solve that problem in my home router gw/aqm/fq prototype.
The approximation was a hash that matched an ip or ipv6 address.

3: (and conversely, existing elephants slow down). (and where you drop
packets is a good question, but you do end up with classic tail-drop
behavior in both QFQ and SFQ on a per-flow basis. You can get
head drop out of QFQ, actually)

4: especially in variable bandwidth scenarios, the depth of even the
fairest queue needs to be managed somehow.

But I can live with the nearly two orders of magnitude of basic
latency improvement we've got now, as well as the likelihood this
induces of smarter/more effective queue length management, while we
keep getting the bugs out of it....

>
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Bloat] What is fairness, anyway? was: Re:  finally... winning on wired!
  2012-01-02 21:31     ` Dave Taht
@ 2012-01-04 15:25       ` Jim Gettys
  2012-01-04 16:16         ` Dave Taht
                           ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Jim Gettys @ 2012-01-04 15:25 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat

On 01/02/2012 04:31 PM, Dave Taht wrote:
> On Mon, Jan 2, 2012 at 9:07 AM, Dave Taht <dave.taht@gmail.com> wrote:
> Yes, that patch brings SFQ at light workloads to being
> indistinguishable from QFQ!
> http://www.teklibre.com/~d/bloat/sfqnewvsqfq10iperfs.png (if you stare
> at this image long enough you might see a pattern, but I don't) (I
> certainly am seeing an afterimage, though)
>>> A "nolimit" implementation could use a dynamic memory allocator
>>> scheme, eventually consuming less memory on typical use :)
> At what point could SFQ be considered a replacement for pfifo_fast? :)
>
> I have not managed to crash QFQ yet with your other new patch. I will
> run it overnight.
>
>

As I read this thread, there are three questions that go through my mind:
    1) since TCP is not "fair", particularly when given flows of
different RTT's, how do we best deal with this issue?  Do either/both
SFQ/QFQ deal with this problem, and how do they differ?
    2) Web browsers are doing "unfair" things at the moment
(unless/until HTTP/1.1 pipelining and/or SPDY deploys), by opening many
TCP connections at the same time.  So it's easy for there to be a bunch
of flows by the same user.  Is "fairness" better a per host property in
the home environment, or a per TCP flow?  Particularly if we someday
start diffserv marking traffic, I suspect per host is more "fair", at
least for unmarked traffic.
    3) since game manufacturers have noted the diffserv marking in
PFIFO-FAST, what do these queuing disciplines currently do?
                    - Jim



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2012-02-05 18:21 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-14 16:35 [Bloat] What is fairness, anyway? was: Re: finally... winning on wired! Jesper Dangaard Brouer
2012-01-15  9:49 ` Dave Taht
  -- strict thread matches above, loose matches on Subject: below --
2012-01-02  0:40 [Bloat] " Dave Taht
2012-01-02  5:22 ` Eric Dumazet
2012-01-02  8:07   ` Dave Taht
2012-01-02 21:31     ` Dave Taht
2012-01-04 15:25       ` [Bloat] What is fairness, anyway? was: " Jim Gettys
2012-01-04 16:16         ` Dave Taht
2012-01-04 17:23           ` Jim Gettys
2012-01-04 17:36           ` Justin McCann
2012-01-04 17:40             ` Eric Dumazet
     [not found]             ` <CAFkTFa89mOmbcOV1PWX3my04rK4NsEvyakcQV2j54qa0gzAViQ@mail.gmail.com>
2012-01-05 17:52               ` Bob Briscoe
2012-01-06 17:42                 ` Jim Gettys
2012-01-06 18:09                   ` Dave Taht
2012-01-06 19:57                   ` Eric Dumazet
2012-01-06 20:34                 ` Jonathan Morton
2012-01-07 19:42                   ` Bob Briscoe
2012-01-07 22:16                     ` Wesley Eddy
2012-01-08  0:40                 ` Dave Taht
     [not found]                   ` <CAA93jw7xKwdUeT7wFNoiM8RQp1--==Eazdo0ucc44vz+L1U06g@mail.gmail.com>
2012-01-09  5:38                     ` Bob Briscoe
2012-01-11  7:26                       ` Dave Taht
     [not found]                         ` <CAA93jw4KJdYwrAuk7-yHDYCGBh1s6mE47eAYu2_LRfY45-qZ2g@mail.gmail.com>
2012-01-14 11:06                           ` Bob Briscoe
2012-01-13 21:45                   ` Dan Siemon
2012-01-14 15:55                     ` Dave Taht
2012-01-04 16:22         ` Eric Dumazet
2012-02-05  0:24         ` George B.
2012-02-05  0:43           ` Jonathan Morton
2012-02-05  1:57             ` George B.
2012-02-05  2:05               ` john thompson
2012-02-05  7:39             ` Eric Dumazet
     [not found]               ` <CAA93jw68yntHkhETQ1a9-Azu7UXEuU9f5fgOsB25hvA240iApg@mail.gmail.com>
2012-02-05 14:24                 ` Dave Taht
2012-02-05 17:53               ` Justin McCann
2012-02-05 18:21                 ` Eric Dumazet
