[Cerowrt-devel] Link to SFQRED description?

Dave Taht dave.taht at gmail.com
Fri Mar 16 20:11:18 PDT 2012


We definitely need to work on a good description of that.

commit ddecf0f4db44ef94847a62d6ecf74456b4dcc66f
Author: Eric Dumazet <eric.dumazet at gmail.com>
Date:   Fri Jan 6 06:31:44 2012 +0000

    net_sched: sfq: add optional RED on top of SFQ

    Adds an optional Random Early Detection on each SFQ flow queue.

    Traditional SFQ limits count of packets, while RED permits to also
    control number of bytes per flow, and adds ECN capability as well.

    1) We dont handle the idle time management in this RED implementation,
    since each 'new flow' begins with a null qavg. We really want to address
    backlogged flows.

    2) if headdrop is selected, we try to ecn mark first packet instead of
    currently enqueued packet. This gives faster feedback for tcp flows
    compared to traditional RED [ marking the last packet in queue ]

    Example of use :

    tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 4sec sfq \
        limit 3000 headdrop flows 512 divisor 16384 \
        redflowlimit 100000 min 8000 max 60000 probability 0.20 ecn

    qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop
    flows 512/16384 divisor 16384
     ewma 6 min 8000b max 60000b probability 0.2 ecn
     prob_mark 0 prob_mark_head 4876 prob_drop 6131
     forced_mark 0 forced_mark_head 0 forced_drop 0
     Sent 1175211782 bytes 777537 pkt (dropped 6131, overlimits 11007
    requeues 0)
     rate 99483Kbit 8219pps backlog 689392b 456p requeues 0

    In this test, with 64 netperf TCP_STREAM sessions, 50% using ECN enabled
    flows, we can see number of packets CE marked is smaller than number of
    drops (for non ECN flows)

    If same test is run, without RED, we can check backlog is much bigger.

    qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop
    flows 512/16384 divisor 16384
     Sent 1148683617 bytes 795006 pkt (dropped 0, overlimits 0 requeues 0)
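
To make the min/max/probability parameters above concrete, here is a minimal
sketch of a per-flow RED decision in Python. It is a simplified textbook RED,
not the sch_sfq code: the class name FlowRed, the rng argument and the
returned strings are assumptions made for illustration, and the forced case
simply marks ECN-capable packets and drops the rest. The ewma, min, max and
probability defaults are the ones shown in the tc output above.

```python
import random


class FlowRed:
    """Simplified per-flow RED state (illustrative, not the kernel code)."""

    def __init__(self, qth_min=8000, qth_max=60000, max_p=0.2, wlog=6,
                 rng=random.random):
        self.qth_min = qth_min  # bytes, "min 8000"
        self.qth_max = qth_max  # bytes, "max 60000"
        self.max_p = max_p      # "probability 0.20"
        self.wlog = wlog        # "ewma 6"
        self.rng = rng          # injectable so the sketch can be tested
        self.qavg = 0.0         # each new flow begins with a null qavg

    def on_enqueue(self, backlog_bytes, ecn_capable):
        # EWMA update: qavg += (backlog - qavg) / 2^wlog
        self.qavg += (backlog_bytes - self.qavg) / (1 << self.wlog)
        if self.qavg < self.qth_min:
            return "enqueue"
        congested = "mark" if ecn_capable else "drop"
        if self.qavg >= self.qth_max:
            return congested  # forced mark or drop
        # Between min and max the probability rises linearly up to max_p.
        p = self.max_p * (self.qavg - self.qth_min) / (self.qth_max - self.qth_min)
        return congested if self.rng() < p else "enqueue"
```

With headdrop selected, the commit applies the mark to the first packet of
the flow queue rather than the one being enqueued; the sketch only returns
the decision and leaves that choice to the caller.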


Which builds on this:

commit 18cb809850fb499ad9bf288696a95f4071f73931
Author: Eric Dumazet <eric.dumazet at gmail.com>
Date:   Wed Jan 4 14:18:38 2012 +0000

    net_sched: sfq: extend limits

    SFQ as implemented in Linux is very limited, with at most 127 flows
    and limit of 127 packets. [ So if 127 flows are active, we have one
    packet per flow ]

    This patch brings to SFQ following features to cope with modern needs.

    - Ability to specify a smaller per flow limit of inflight packets.
        (default value being at 127 packets)

    - Ability to have up to 65408 active flows (instead of 127)

    - Ability to have head drops instead of tail drops
      (to drop old packets from a flow)

    Example of use : No more than 20 packets per flow, max 8000 flows, max
    20000 packets in SFQ qdisc, hash table of 65536 slots.

    tc qdisc add ... sfq \
            flows 8000 \
            depth 20 \
            headdrop \
            limit 20000 \
            divisor 65536

    Ram usage :

    2 bytes per hash table entry (instead of previous 1 byte/entry)
    32 bytes per flow on 64bit arches, instead of 384 for QFQ, so much
    better cache hit ratio.

    Signed-off-by: Eric Dumazet <eric.dumazet at gmail.com>
    CC: Dave Taht <dave.taht at gmail.com>
    Signed-off-by: David S. Miller <davem at davemloft.net>
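
As a rough worked example of the "Ram usage" figures, the arithmetic for the
example configuration (8000 flows, 65536 hash slots) can be written out in
Python. This only multiplies the per-entry and per-flow sizes quoted in the
commit; it does not account for the queued packets themselves or any other
qdisc state, and the variable names are made up for the illustration.

```python
HASH_SLOTS = 65536        # "divisor 65536" from the example
MAX_FLOWS = 8000          # "flows 8000" from the example
BYTES_PER_SLOT = 2        # per hash table entry, per the commit
SFQ_BYTES_PER_FLOW = 32   # 64bit arches, per the commit
QFQ_BYTES_PER_FLOW = 384  # comparison figure quoted in the commit

hash_table_bytes = HASH_SLOTS * BYTES_PER_SLOT
sfq_flow_bytes = MAX_FLOWS * SFQ_BYTES_PER_FLOW
qfq_flow_bytes = MAX_FLOWS * QFQ_BYTES_PER_FLOW

print(hash_table_bytes)  # 131072
print(sfq_flow_bytes)    # 256000
print(qfq_flow_bytes)    # 3072000
```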


which builds on this:

commit 02a9098ede0dc7e28c16a03fa7fba86a05219478
Author: Eric Dumazet <eric.dumazet at gmail.com>
Date:   Wed Jan 4 06:23:01 2012 +0000

    net_sched: sfq: always randomize hash perturbation

    SFQ q->perturbation is used in sfq_hash() as an input to Jenkins hash.

    We currently randomize this 32bit value only if a perturbation timer is
    setup.

    Its much better to always initialize it to defeat attackers, or else
    they can predict very well what kind of packets they have to forge to
    hit a particular flow.

    Signed-off-by: Eric Dumazet <eric.dumazet at gmail.com>
    Signed-off-by: David S. Miller <davem at davemloft.net>
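
A small sketch of why the perturbation matters, in Python. The kernel uses a
Jenkins hash over the flow keys; this illustration swaps in a keyed BLAKE2b
from hashlib instead, and the function name flow_bucket and its arguments are
assumptions. The point is only that the bucket depends on a random 32bit
value the sender cannot see.

```python
import hashlib
import secrets


def flow_bucket(src, dst, sport, dport, proto, perturbation, divisor=16384):
    """Map a flow to a hash slot, keyed by a secret 32bit perturbation."""
    key = perturbation.to_bytes(4, "big")
    data = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = hashlib.blake2b(data, key=key, digest_size=4).digest()
    return int.from_bytes(digest, "big") % divisor


# Always pick a random value at setup time, timer or no timer.
perturbation = secrets.randbits(32)
slot = flow_bucket("10.0.0.1", "10.0.0.2", 5001, 80, 6, perturbation)
```

If the perturbation were a known constant, an attacker could compute
flow_bucket() offline and forge packets that all land in one victim's slot.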


which builds on this (which is now pulled):

commit d47a0ac7b66883987275598d6039f902f4410ca9
Author: Eric Dumazet <eric.dumazet at gmail.com>
Date:   Sun Jan 1 18:33:31 2012 +0000

    sch_sfq: dont put new flow at the end of flows

    SFQ enqueue algo puts a new flow _behind_ all pre-existing flows in the
    circular list. In fact this is probably an old SFQ implementation bug.

    100 Mbits = ~8333 full frames per second, or ~8 frames per ms.

    With 50 flows, it means your "new flow" will have to wait 50 packets
    being sent before its own packet. Thats the ~6ms.

    We certainly can change SFQ to give a priority advantage to new flows,
    so that next dequeued packet is taken from a new flow, not an old one.

    Reported-by: Dave Taht <dave.taht at gmail.com>
    Signed-off-by: Eric Dumazet <eric.dumazet at gmail.com>
    Signed-off-by: David S. Miller <davem at davemloft.net>
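
The ~6ms figure in that commit message can be reproduced with a few lines of
Python. The 1500 byte frame size is an assumption (it is what yields the
~8333 frames per second quoted above); the variable names are illustrative.

```python
LINK_BPS = 100_000_000  # 100 Mbits
FRAME_BYTES = 1500      # assumed full frame size
ACTIVE_FLOWS = 50       # one packet sent per pre-existing flow

frames_per_second = LINK_BPS / (FRAME_BYTES * 8)
frames_per_ms = frames_per_second / 1000
wait_ms = ACTIVE_FLOWS / frames_per_ms

print(round(frames_per_second), round(frames_per_ms, 1), round(wait_ms, 1))
# prints: 8333 8.3 6.0
```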


The next one addresses an idea that was discarded in the original SFQ paper
in 1990 as 'too computationally intensive', but whose absence became a bigger
problem as queues got deeper:


http://www.coverfire.com/archives/2009/06/28/linux-sfq-experimentation/

So it is fixed here:

commit 225d9b89c937633dfeec502741a174fe0bab5b9f
Author: Eric Dumazet <eric.dumazet at gmail.com>
Date:   Wed Dec 21 03:30:11 2011 +0000

    sch_sfq: rehash queues in perturb timer

    A known Out Of Order (OOO) problem hurts SFQ when timer changes
    perturbation value, since all new packets delivered to SFQ enqueue might
    end on different slots than previous in-flight packets.

    With round robin delivery, we can thus deliver packets in a different
    order.

    Since SFQ is limited to small amount of in-flight packets, we can rehash
    packets so that this OOO problem is fixed.

    This rehashing is performed only if internal flow classifier is in use.

    We now store in skb->cb[] the "struct flow_keys" so that we dont call
    skb_flow_dissect() again while rehashing.

    Signed-off-by: Eric Dumazet <eric.dumazet at gmail.com>
    Signed-off-by: David S. Miller <davem at davemloft.net>
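
The OOO problem and the rehash fix can be shown with a toy model in Python.
This is only a sketch: bucket() is a stand-in for sfq_hash(), the slots are a
plain dict of deques, and per-flow limits and drops are ignored. All names
here are hypothetical.

```python
from collections import deque


def bucket(flow, perturbation, divisor=8):
    # Toy stand-in for sfq_hash(): any function of (flow keys, perturbation).
    return ((flow * 2654435761) ^ perturbation) % divisor


def enqueue(slots, flow, seq, perturbation):
    slots.setdefault(bucket(flow, perturbation), deque()).append((flow, seq))


def rehash(slots, new_perturbation):
    """Move every queued packet to the slot chosen by the new perturbation."""
    new_slots = {}
    for queue in slots.values():
        for flow, seq in queue:  # head to tail, so per-flow order is kept
            enqueue(new_slots, flow, seq, new_perturbation)
    return new_slots


def drain_round_robin(slots):
    out = []
    while any(slots.values()):
        for queue in slots.values():
            if queue:
                out.append(queue.popleft())
    return out


# Without rehash: packets 0..2 sit in the old slot, packet 3 in a new one.
slots = {}
for seq in range(3):
    enqueue(slots, 1, seq, perturbation=0)
enqueue(slots, 1, 3, perturbation=1)
print([seq for _, seq in drain_round_robin(slots)])  # [0, 3, 1, 2]

# With rehash when the perturbation changes, order is preserved.
slots = {}
for seq in range(3):
    enqueue(slots, 1, seq, perturbation=0)
slots = rehash(slots, new_perturbation=1)
enqueue(slots, 1, 3, perturbation=1)
print([seq for _, seq in drain_round_robin(slots)])  # [0, 1, 2, 3]
```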


On Fri, Mar 16, 2012 at 8:03 PM, Richard Brown
<richard.e.brown at dartware.com> wrote:
> Anyone have a link to a description to the SFQRED queue discipline? I'd like to put it onto the wiki.



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

