[Cake] Possible BUG - parent backlog incorrectly updated in case of NET_XMIT_CN

Sun Jun 12 16:40:30 EDT 2016

4GiB back log is several orders of magnitude more than what you want for a
100Gb link. Unless you think this algorithm will be used for 1Zb/s links.
64bit is a bit overkill
On Jun 12, 2016 2:04 PM, "Eric Dumazet" <eric.dumazet at gmail.com> wrote:

> On Sun, 2016-06-12 at 20:59 +0300, Jonathan Morton wrote:
>
> > Not so - unless you are very sure that q->backlog is the same size as
> > quantity.
>
> Obviously this is the case in all linux kernels so far.
>
> If someone plans to change this at one point, then yes, he/she has to
> fix this.
>
> >   In an increasingly 64-bit world, that is by no means assured in the
> > future.  I don’t like relying on wraparound behaviour without making
> > that assumption explicit, which is precisely what the signed types in
> > C are for.
>
> So why don't you send an official patch, since apparently you plan to
> extend a qdisc backlog beyond 4,294,967,296 bytes ?
>
> I am puzzled you have this idea, but why not, some people thought "640KB
> out to be enough for anybody" ...
> >
> > >> IMHO the NET_XMIT_CN semantics are broken.  It might be better to
> > drop
> > >> support for it, since it should rarely be triggered.
> > >
> > > What exact part is broken ?
> > > Semantic looks good to me.
> >
> > It’s broken in that a negative correction may be required in the first
> > place.
>
> But you do realize NET_XMIT_CN was there _before_ Cong Wang added
> backlog tracking in 2ccccf5fb43f ("net_sched: update hierarchical
> backlog too")
>
> The fact that Cong used a 'unsigned' instead of a 'signed' here is a
> very minor 'bug', since generated code is exactly the _same_.
>
> Claiming the 'semantics' are broken, while the reality is that the
> 'implementation has a potential bug _if_ we extend to 64bit' is quite
> different.
>
> >   It places additional burden on every producer of the CN signal who
> > isn’t a tail-dropper.  I can only assume that the behaviour was
> > designed with only tail-drop in mind - and as we both know, that is
> > not the only option any more.
> >
> Nothing implied that only tail drops could happen.
>
> Where have you seen this in the code exactly ?
>
>
> > It appears to me that there are four things than enqueue() may want to
> > tell its caller:
> >
> > 1: That the packet was enqueued successfully (NET_XMIT_SUCCESS).
> >
> > 2: That the packet was enqueued successfully, but some other relevant
> > packet had to be dropped due to link congestion.
> >
> > 3: That the packet was *not* enqueued, and this was due to link
> > congestion.  This is potentially useful for tail-dropping queues,
> > including AQMs.
> >
> > 4: That the packet was *not* enqueued, for some reason other than link
> > congestion; the “error case".
> >
> > Currently NET_XMIT_CN appears to cover case 3, whereas Cake normally
> > wants cases 1 & 2 (and will only signal case 4 if a GSO aggregate
> > couldn’t be segmented).
> >
> Not true. NET_XMIT_CN also covers 2. (not sure what 'relevant' means for
> you)
>
> > For the time being, I have changed my development branch of Cake to
> > always signal case 1 except for the error case.
>
> Feel free to change it, it is your code.
>
> But do not claim NET_XMIT_CN should be removed, just because you missed
> the point (being focused on router workloads I guess)
>
> It can be used by higher level protocol, even if TCP with Small Queues
> no longer can hit it with normal settings.
>
> We have per socket counter of dropped rx packets (sk->sk_drops), we
> could very easily add a counter of drops or congestions at tx.
> This might allow UDP applications to reduce their rate or 'fast'
> retransmit without waiting an extra RTT.
>
>
>
> _______________________________________________
> Cake mailing list
> Cake at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.bufferbloat.net/pipermail/cake/attachments/20160612/d71a2768/attachment.html>