CoDel AQM discussions
From: Dave Taht <dave.taht@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Nandita Dukkipati <nanditad@google.com>,
	netdev <netdev@vger.kernel.org>,
	codel@lists.bufferbloat.net, Yuchung Cheng <ycheng@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	David Miller <davem@davemloft.net>,
	Matt Mathis <mattmathis@google.com>
Subject: Re: [Codel] [PATCH net-next] fq_codel: report congestion notification at enqueue time
Date: Thu, 28 Jun 2012 10:51:33 -0700
Message-ID: <CAA93jw7agn2J6Hd7x22KWhENY5jqVjnk6uRr=3LJ5Anw7EgacQ@mail.gmail.com>
In-Reply-To: <1340903237.13187.151.camel@edumazet-glaptop>

On Thu, Jun 28, 2012 at 10:07 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> At enqueue time, check sojourn time of packet at head of the queue,
> and return NET_XMIT_CN instead of NET_XMIT_SUCCESS if this sojourn
> time is above codel @target.
>
> This permits local TCP stack to call tcp_enter_cwr() and reduce its cwnd
> without drops (for example if ECN is not enabled for the flow)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Dave Taht <dave.taht@bufferbloat.net>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Matt Mathis <mattmathis@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Nandita Dukkipati <nanditad@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---
>  include/linux/pkt_sched.h |    1 +
>  include/net/codel.h       |    2 +-
>  net/sched/sch_fq_codel.c  |   19 +++++++++++++++----
>  3 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 32aef0a..4d409a5 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -714,6 +714,7 @@ struct tc_fq_codel_qd_stats {
>                                 */
>        __u32   new_flows_len;  /* count of flows in new list */
>        __u32   old_flows_len;  /* count of flows in old list */
> +       __u32   congestion_count;
>  };
>
>  struct tc_fq_codel_cl_stats {
> diff --git a/include/net/codel.h b/include/net/codel.h
> index 550debf..8c7d6a7 100644
> --- a/include/net/codel.h
> +++ b/include/net/codel.h
> @@ -148,7 +148,7 @@ struct codel_vars {
>  * struct codel_stats - contains codel shared variables and stats
>  * @maxpacket: largest packet we've seen so far
>  * @drop_count:        temp count of dropped packets in dequeue()
> - * ecn_mark:   number of packets we ECN marked instead of dropping
> + * @ecn_mark:  number of packets we ECN marked instead of dropping
>  */
>  struct codel_stats {
>        u32             maxpacket;
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 9fc1c62..c0485a0 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -62,6 +62,7 @@ struct fq_codel_sched_data {
>        struct codel_stats cstats;
>        u32             drop_overlimit;
>        u32             new_flow_count;
> +       u32             congestion_count;
>
>        struct list_head new_flows;     /* list of new flows */
>        struct list_head old_flows;     /* list of old flows */
> @@ -196,16 +197,25 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>                flow->deficit = q->quantum;
>                flow->dropped = 0;
>        }
> -       if (++sch->q.qlen < sch->limit)
> +       if (++sch->q.qlen < sch->limit) {
> +               codel_time_t hdelay = codel_get_enqueue_time(skb) -
> +                                     codel_get_enqueue_time(flow->head);
> +
> +               /* If this flow is congested, tell the caller ! */
> +               if (codel_time_after(hdelay, q->cparams.target)) {
> +                       q->congestion_count++;
> +                       return NET_XMIT_CN;
> +               }
>                return NET_XMIT_SUCCESS;
> -
> +       }
>        q->drop_overlimit++;
>        /* Return Congestion Notification only if we dropped a packet
>         * from this flow.
>         */
> -       if (fq_codel_drop(sch) == idx)
> +       if (fq_codel_drop(sch) == idx) {
> +               q->congestion_count++;
>                return NET_XMIT_CN;
> -
> +       }
>        /* As we dropped a packet, better let upper stack know this */
>        qdisc_tree_decrease_qlen(sch, 1);
>        return NET_XMIT_SUCCESS;
> @@ -467,6 +477,7 @@ static int fq_codel_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
>
>        st.qdisc_stats.maxpacket = q->cstats.maxpacket;
>        st.qdisc_stats.drop_overlimit = q->drop_overlimit;
> +       st.qdisc_stats.congestion_count = q->congestion_count;
>        st.qdisc_stats.ecn_mark = q->cstats.ecn_mark;
>        st.qdisc_stats.new_flow_count = q->new_flow_count;
>
>
>

Clever idea. One problem: a link carries other forms of network
traffic too, so this punishes a single TCP stream that may not be
the source of the problem in the first place, and basically puts it
in double jeopardy.
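
For reference, the enqueue-time check under discussion boils down to
the comparison below. This is a minimal userspace sketch, not the
kernel code: codel_time_t, the XMIT_* values and the helper names are
local stand-ins, and the 4882-tick target assumes one tick is 1024 ns
(roughly 5 ms).

```c
#include <stdint.h>

typedef uint32_t codel_time_t;            /* stand-in for the kernel type */

enum { XMIT_SUCCESS = 0, XMIT_CN = 2 };   /* local stand-ins for NET_XMIT_* */

/* Wraparound-safe "a is later than b" on 32-bit tick counters. */
int time_after(codel_time_t a, codel_time_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * now:           enqueue time of the packet being queued
 * head_enqueued: enqueue time of the packet at the head of the flow
 * target:        codel target delay, in the same tick units
 *
 * Returns XMIT_CN when the head packet has already waited longer
 * than target, XMIT_SUCCESS otherwise.
 */
int enqueue_verdict(codel_time_t now, codel_time_t head_enqueued,
		    codel_time_t target)
{
	codel_time_t hdelay = now - head_enqueued;

	return time_after(hdelay, target) ? XMIT_CN : XMIT_SUCCESS;
}
```

With a target of 4882 ticks, enqueue_verdict(5000, 0, 4882) reports
congestion, while a head delay exactly equal to the target does not.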

I am curious how often an enqueue actually drops in the
codel/fq_codel case; the hope was that there would be plenty of
headroom under far more circumstances with this qdisc.

On the dequeue side of codel (and in the network stack generally), I
have been thinking that a netlink-level message sent on a packet drop
or congestion indication, which userspace could register for and see,
would be very useful: particularly for a routing daemon, but also for
statistics collection and perhaps other levels of overall network
control (DCTCP-like).

The existing NET_DROP functionality is hard to use. Your idea is
"in-band"; the netlink message idea would be "out of band" and more
general.
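
To make the "out of band" shape concrete, here is a rough sketch of
an event record plus a register/publish pair, written as plain
userspace C. Everything in it is hypothetical: struct cn_event, its
fields and the cn_bus_* helpers are illustration only and do not
correspond to any existing kernel or netlink interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record; NOT an existing kernel or netlink ABI. */
enum cn_reason { CN_REASON_DROP, CN_REASON_ECN_MARK, CN_REASON_OVER_TARGET };

struct cn_event {
	uint32_t ifindex;      /* interface the qdisc is attached to */
	uint32_t flow_hash;    /* which flow was affected */
	uint32_t sojourn_us;   /* queueing delay seen at the time */
	enum cn_reason reason;
};

typedef void (*cn_listener_fn)(const struct cn_event *ev, void *ctx);

#define CN_MAX_LISTENERS 8

struct cn_bus {
	cn_listener_fn fn[CN_MAX_LISTENERS];
	void *ctx[CN_MAX_LISTENERS];
	size_t count;
};

/* Register a listener; returns 0 on success, -1 if the table is full. */
int cn_bus_subscribe(struct cn_bus *bus, cn_listener_fn fn, void *ctx)
{
	if (bus->count >= CN_MAX_LISTENERS)
		return -1;
	bus->fn[bus->count] = fn;
	bus->ctx[bus->count] = ctx;
	bus->count++;
	return 0;
}

/* Deliver one event to every listener; returns how many were called. */
size_t cn_bus_publish(const struct cn_bus *bus, const struct cn_event *ev)
{
	size_t i;

	for (i = 0; i < bus->count; i++)
		bus->fn[i](ev, bus->ctx[i]);
	return bus->count;
}

/* Example listener: a statistics collector that only counts events. */
void cn_count_listener(const struct cn_event *ev, void *ctx)
{
	(void)ev;
	(*(unsigned int *)ctx)++;
}
```

The point of the split is that the sender of the affected flow is not
the only party that learns about the event: a routing daemon and a
statistics collector could both subscribe without touching the
enqueue return path.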

-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-6 is out
with fq_codel!"

