[Codel] [PATCH net-next] fq_codel: report congestion notification at enqueue time
Dave Taht
dave.taht at gmail.com
Thu Jun 28 13:51:33 EDT 2012
On Thu, Jun 28, 2012 at 10:07 AM, Eric Dumazet <eric.dumazet at gmail.com> wrote:
> From: Eric Dumazet <edumazet at google.com>
>
> At enqueue time, check sojourn time of packet at head of the queue,
> and return NET_XMIT_CN instead of NET_XMIT_SUCCESS if this sojourn
> time is above codel @target.
>
> This permits local TCP stack to call tcp_enter_cwr() and reduce its cwnd
> without drops (for example if ECN is not enabled for the flow)
>
> Signed-off-by: Eric Dumazet <edumazet at google.com>
> Cc: Dave Taht <dave.taht at bufferbloat.net>
> Cc: Tom Herbert <therbert at google.com>
> Cc: Matt Mathis <mattmathis at google.com>
> Cc: Yuchung Cheng <ycheng at google.com>
> Cc: Nandita Dukkipati <nanditad at google.com>
> Cc: Neal Cardwell <ncardwell at google.com>
> ---
> include/linux/pkt_sched.h | 1 +
> include/net/codel.h | 2 +-
> net/sched/sch_fq_codel.c | 19 +++++++++++++++----
> 3 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 32aef0a..4d409a5 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -714,6 +714,7 @@ struct tc_fq_codel_qd_stats {
> */
>  	__u32 new_flows_len; /* count of flows in new list */
>  	__u32 old_flows_len; /* count of flows in old list */
> +	__u32 congestion_count;
> };
>
> struct tc_fq_codel_cl_stats {
> diff --git a/include/net/codel.h b/include/net/codel.h
> index 550debf..8c7d6a7 100644
> --- a/include/net/codel.h
> +++ b/include/net/codel.h
> @@ -148,7 +148,7 @@ struct codel_vars {
> * struct codel_stats - contains codel shared variables and stats
> * @maxpacket: largest packet we've seen so far
> * @drop_count: temp count of dropped packets in dequeue()
> - * ecn_mark: number of packets we ECN marked instead of dropping
> + * @ecn_mark: number of packets we ECN marked instead of dropping
> */
> struct codel_stats {
> u32 maxpacket;
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 9fc1c62..c0485a0 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -62,6 +62,7 @@ struct fq_codel_sched_data {
>  	struct codel_stats cstats;
>  	u32 drop_overlimit;
>  	u32 new_flow_count;
> +	u32 congestion_count;
>
>  	struct list_head new_flows; /* list of new flows */
>  	struct list_head old_flows; /* list of old flows */
> @@ -196,16 +197,25 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  		flow->deficit = q->quantum;
>  		flow->dropped = 0;
>  	}
> -	if (++sch->q.qlen < sch->limit)
> +	if (++sch->q.qlen < sch->limit) {
> +		codel_time_t hdelay = codel_get_enqueue_time(skb) -
> +				      codel_get_enqueue_time(flow->head);
> +
> +		/* If this flow is congested, tell the caller ! */
> +		if (codel_time_after(hdelay, q->cparams.target)) {
> +			q->congestion_count++;
> +			return NET_XMIT_CN;
> +		}
>  		return NET_XMIT_SUCCESS;
> -
> +	}
>  	q->drop_overlimit++;
>  	/* Return Congestion Notification only if we dropped a packet
>  	 * from this flow.
>  	 */
> -	if (fq_codel_drop(sch) == idx)
> +	if (fq_codel_drop(sch) == idx) {
> +		q->congestion_count++;
>  		return NET_XMIT_CN;
> -
> +	}
>  	/* As we dropped a packet, better let upper stack know this */
>  	qdisc_tree_decrease_qlen(sch, 1);
>  	return NET_XMIT_SUCCESS;
> @@ -467,6 +477,7 @@ static int fq_codel_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
>
>  	st.qdisc_stats.maxpacket = q->cstats.maxpacket;
>  	st.qdisc_stats.drop_overlimit = q->drop_overlimit;
> +	st.qdisc_stats.congestion_count = q->congestion_count;
>  	st.qdisc_stats.ecn_mark = q->cstats.ecn_mark;
>  	st.qdisc_stats.new_flow_count = q->new_flow_count;
>
>
>
Clever idea. One problem is that a link carries other forms of network
traffic, and this punishes a single TCP stream that may not be the
source of the problem in the first place, basically putting it into
double jeopardy.

I am curious how often an enqueue actually drops in the codel/fq_codel
case; the hope was that there would be plenty of headroom under far
more circumstances with this qdisc.

On the dequeue side of codel (and in the network stack generally), I
was thinking that a netlink-level message on a packet drop or
congestion indication, which userspace could register for and see,
would be very useful: particularly for a routing daemon, but also for
statistics collection and perhaps other levels of overall network
control (DCTCP-like).

The existing NET_DROP functionality is hard to use. Your idea is
"in-band"; the netlink message idea would be "out of band" and more
general.
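
For reference, the enqueue-time check in the quoted patch can be
sketched in userspace roughly as below. This is only a minimal sketch
under stated assumptions, not kernel code: every toy_ name is
hypothetical, the return values are local stand-ins for
NET_XMIT_SUCCESS and NET_XMIT_CN, and time is in arbitrary ticks of a
wrapping 32-bit clock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not kernel definitions. */
typedef uint32_t toy_time_t;	/* wrapping 32-bit tick counter */

enum toy_verdict {
	TOY_XMIT_SUCCESS = 0,	/* stand-in for NET_XMIT_SUCCESS */
	TOY_XMIT_CN = 2		/* stand-in for NET_XMIT_CN */
};

/* Wrap-safe "a is later than b": compare via the signed difference.
 * Assumes the usual two's-complement conversion to int32_t.
 */
static int toy_time_after(toy_time_t a, toy_time_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Verdict for a packet enqueued at 'now' onto a flow whose head packet
 * was enqueued at 'head_enqueue_time'. If the flow was empty, the new
 * packet is the head, the delay is 0, and the verdict is SUCCESS.
 */
static enum toy_verdict toy_enqueue_verdict(toy_time_t now,
					    toy_time_t head_enqueue_time,
					    toy_time_t target)
{
	toy_time_t hdelay = now - head_enqueue_time;

	return toy_time_after(hdelay, target) ? TOY_XMIT_CN : TOY_XMIT_SUCCESS;
}

/* A few cases, including a clock wrap between head and new packet. */
static void toy_self_check(void)
{
	assert(toy_enqueue_verdict(1000u, 990u, 5u) == TOY_XMIT_CN);
	assert(toy_enqueue_verdict(1000u, 995u, 5u) == TOY_XMIT_SUCCESS);
	assert(toy_enqueue_verdict(1000u, 1000u, 5u) == TOY_XMIT_SUCCESS);
	assert(toy_enqueue_verdict(0x10u, 0xFFFFFFF0u, 5u) == TOY_XMIT_CN);
}
```

Note that the comparison is strict: a head delay exactly equal to the
target still returns success, and only the flow that the new packet
belongs to is examined.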
--
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-6 is out
with fq_codel!"