From: Benjamin Cronce <bcronce@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: cake@lists.bufferbloat.net, Jonathan Morton <chromatix99@gmail.com>
Subject: Re: [Cake] Possible BUG - parent backlog incorrectly updated in case of NET_XMIT_CN
Date: Sun, 12 Jun 2016 15:40:30 -0500 [thread overview]
Message-ID: <CAJ_ENFHSOK1TVqqt_tbGBU+6GheG-NicmO5kv79nyMmtWqyV0g@mail.gmail.com> (raw)
In-Reply-To: <1465758256.7945.125.camel@edumazet-glaptop3.roam.corp.google.com>
[-- Attachment #1: Type: text/plain, Size: 3909 bytes --]
4GiB back log is several orders of magnitude more than what you want for a
100Gb link. Unless you think this algorithm will be used for 1Zb/s links.
64bit is a bit overkill
On Jun 12, 2016 2:04 PM, "Eric Dumazet" <eric.dumazet@gmail.com> wrote:
> On Sun, 2016-06-12 at 20:59 +0300, Jonathan Morton wrote:
>
> > Not so - unless you are very sure that q->backlog is the same size as
> > quantity.
>
> Obviously this is the case in all linux kernels so far.
>
> If someone plans to change this at one point, then yes, he/she has to
> fix this.
>
> > In an increasingly 64-bit world, that is by no means assured in the
> > future. I don’t like relying on wraparound behaviour without making
> > that assumption explicit, which is precisely what the signed types in
> > C are for.
>
> So why don't you send an official patch, since apparently you plan to
> extend a qdisc backlog beyond 4,294,967,296 bytes ?
>
> I am puzzled you have this idea, but why not, some people thought "640KB
> out to be enough for anybody" ...
> >
> > >> IMHO the NET_XMIT_CN semantics are broken. It might be better to
> > drop
> > >> support for it, since it should rarely be triggered.
> > >
> > > What exact part is broken ?
> > > Semantic looks good to me.
> >
> > It’s broken in that a negative correction may be required in the first
> > place.
>
> But you do realize NET_XMIT_CN was there _before_ Cong Wang added
> backlog tracking in 2ccccf5fb43f ("net_sched: update hierarchical
> backlog too")
>
> The fact that Cong used a 'unsigned' instead of a 'signed' here is a
> very minor 'bug', since generated code is exactly the _same_.
>
> Claiming the 'semantics' are broken, while the reality is that the
> 'implementation has a potential bug _if_ we extend to 64bit' is quite
> different.
>
> > It places additional burden on every producer of the CN signal who
> > isn’t a tail-dropper. I can only assume that the behaviour was
> > designed with only tail-drop in mind - and as we both know, that is
> > not the only option any more.
> >
> Nothing implied that only tail drops could happen.
>
> Where have you seen this in the code exactly ?
>
>
> > It appears to me that there are four things than enqueue() may want to
> > tell its caller:
> >
> > 1: That the packet was enqueued successfully (NET_XMIT_SUCCESS).
> >
> > 2: That the packet was enqueued successfully, but some other relevant
> > packet had to be dropped due to link congestion.
> >
> > 3: That the packet was *not* enqueued, and this was due to link
> > congestion. This is potentially useful for tail-dropping queues,
> > including AQMs.
> >
> > 4: That the packet was *not* enqueued, for some reason other than link
> > congestion; the “error case".
> >
> > Currently NET_XMIT_CN appears to cover case 3, whereas Cake normally
> > wants cases 1 & 2 (and will only signal case 4 if a GSO aggregate
> > couldn’t be segmented).
> >
> Not true. NET_XMIT_CN also covers 2. (not sure what 'relevant' means for
> you)
>
> > For the time being, I have changed my development branch of Cake to
> > always signal case 1 except for the error case.
>
> Feel free to change it, it is your code.
>
> But do not claim NET_XMIT_CN should be removed, just because you missed
> the point (being focused on router workloads I guess)
>
> It can be used by higher level protocol, even if TCP with Small Queues
> no longer can hit it with normal settings.
>
> We have per socket counter of dropped rx packets (sk->sk_drops), we
> could very easily add a counter of drops or congestions at tx.
> This might allow UDP applications to reduce their rate or 'fast'
> retransmit without waiting an extra RTT.
>
>
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
[-- Attachment #2: Type: text/html, Size: 4750 bytes --]
prev parent reply other threads:[~2016-06-12 20:40 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-07 11:20 Kevin Darbyshire-Bryant
2016-06-07 14:50 ` Jonathan Morton
2016-06-07 15:05 ` Kevin Darbyshire-Bryant
2016-06-11 9:11 ` Kevin Darbyshire-Bryant
2016-06-11 16:41 ` Jonathan Morton
2016-06-12 16:31 ` Eric Dumazet
2016-06-12 16:40 ` Dave Taht
2016-06-12 17:48 ` Eric Dumazet
2016-06-12 17:51 ` Jonathan Morton
2016-06-12 18:19 ` Eric Dumazet
2016-06-12 17:59 ` Jonathan Morton
2016-06-12 19:04 ` Eric Dumazet
2016-06-12 20:40 ` Benjamin Cronce [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://lists.bufferbloat.net/postorius/lists/cake.lists.bufferbloat.net/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJ_ENFHSOK1TVqqt_tbGBU+6GheG-NicmO5kv79nyMmtWqyV0g@mail.gmail.com \
--to=bcronce@gmail.com \
--cc=cake@lists.bufferbloat.net \
--cc=chromatix99@gmail.com \
--cc=eric.dumazet@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox