[Cake] Possible BUG - parent backlog incorrectly updated in case of NET_XMIT_CN
bcronce at gmail.com
Sun Jun 12 16:40:30 EDT 2016
4GiB back log is several orders of magnitude more than what you want for a
100Gb link. Unless you think this algorithm will be used for 1Zb/s links.
64bit is a bit overkill
On Jun 12, 2016 2:04 PM, "Eric Dumazet" <eric.dumazet at gmail.com> wrote:
> On Sun, 2016-06-12 at 20:59 +0300, Jonathan Morton wrote:
> > Not so - unless you are very sure that q->backlog is the same size as
> > quantity.
> Obviously this is the case in all linux kernels so far.
> If someone plans to change this at one point, then yes, he/she has to
> fix this.
> > In an increasingly 64-bit world, that is by no means assured in the
> > future. I don’t like relying on wraparound behaviour without making
> > that assumption explicit, which is precisely what the signed types in
> > C are for.
> So why don't you send an official patch, since apparently you plan to
> extend a qdisc backlog beyond 4,294,967,296 bytes ?
> I am puzzled you have this idea, but why not, some people thought "640KB
> out to be enough for anybody" ...
> > >> IMHO the NET_XMIT_CN semantics are broken. It might be better to
> > drop
> > >> support for it, since it should rarely be triggered.
> > >
> > > What exact part is broken ?
> > > Semantic looks good to me.
> > It’s broken in that a negative correction may be required in the first
> > place.
> But you do realize NET_XMIT_CN was there _before_ Cong Wang added
> backlog tracking in 2ccccf5fb43f ("net_sched: update hierarchical
> backlog too")
> The fact that Cong used a 'unsigned' instead of a 'signed' here is a
> very minor 'bug', since generated code is exactly the _same_.
> Claiming the 'semantics' are broken, while the reality is that the
> 'implementation has a potential bug _if_ we extend to 64bit' is quite
> > It places additional burden on every producer of the CN signal who
> > isn’t a tail-dropper. I can only assume that the behaviour was
> > designed with only tail-drop in mind - and as we both know, that is
> > not the only option any more.
> Nothing implied that only tail drops could happen.
> Where have you seen this in the code exactly ?
> > It appears to me that there are four things than enqueue() may want to
> > tell its caller:
> > 1: That the packet was enqueued successfully (NET_XMIT_SUCCESS).
> > 2: That the packet was enqueued successfully, but some other relevant
> > packet had to be dropped due to link congestion.
> > 3: That the packet was *not* enqueued, and this was due to link
> > congestion. This is potentially useful for tail-dropping queues,
> > including AQMs.
> > 4: That the packet was *not* enqueued, for some reason other than link
> > congestion; the “error case".
> > Currently NET_XMIT_CN appears to cover case 3, whereas Cake normally
> > wants cases 1 & 2 (and will only signal case 4 if a GSO aggregate
> > couldn’t be segmented).
> Not true. NET_XMIT_CN also covers 2. (not sure what 'relevant' means for
> > For the time being, I have changed my development branch of Cake to
> > always signal case 1 except for the error case.
> Feel free to change it, it is your code.
> But do not claim NET_XMIT_CN should be removed, just because you missed
> the point (being focused on router workloads I guess)
> It can be used by higher level protocol, even if TCP with Small Queues
> no longer can hit it with normal settings.
> We have per socket counter of dropped rx packets (sk->sk_drops), we
> could very easily add a counter of drops or congestions at tx.
> This might allow UDP applications to reduce their rate or 'fast'
> retransmit without waiting an extra RTT.
> Cake mailing list
> Cake at lists.bufferbloat.net
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Cake