From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-io0-x232.google.com (mail-io0-x232.google.com [IPv6:2607:f8b0:4001:c06::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 8C61E3B2B0 for ; Sun, 12 Jun 2016 15:04:19 -0400 (EDT) Received: by mail-io0-x232.google.com with SMTP id d2so17053254iof.0 for ; Sun, 12 Jun 2016 12:04:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:subject:from:to:cc:date:in-reply-to:references :mime-version:content-transfer-encoding; bh=z1OPtU4L/wKcTgxf6KK8EMw49plCZ9zaIxfM4r7+IW8=; b=D6Q1MD7WI+h0LW2ntWOwHIlLSm7XKP2JmUprS+/J9aSr1UMu8iZ4x0p5ozzCm1WeZ4 bpS/9o8vCbXTPcDSyHQPwkE4Mzwb2HG8JTXxeLAp+TxlzBC/k4+YTTBERmjx64EfOgh6 Rnfq04UgWGuneFNuHFG8FCy4jO9y6rvKDWEs8YzOpyXOyCFYzoPowVHAryd+kwNQyEAv 6/06mwXwnIn6DLkrEN9OUWI3dQ+TQ/QIeQEJEG0HIdg0gjhlL0H2cVnEOjKMEbiDlufO 0ww+RuVFQk5QYDPy2AD7iRw4YukYE/SstJ9yXgi6wxqr3e8FK/lQCB4ZVsfNYkgmnJjD s/JQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:mime-version:content-transfer-encoding; bh=z1OPtU4L/wKcTgxf6KK8EMw49plCZ9zaIxfM4r7+IW8=; b=jaliRqp4GSWMLE/pa5XFqrnmJIPKkJVR2YWqO9tH8lWlJjpwOpBRj3pgI97ISw3fBP 8K4TOFvVgXy4xpVBUfX5Blp6bKwv9hrwjkl5xke2oWoCrj+PD4xhEVTC8uRFz7GV7h/4 C05OzzkBMUxjpJhQrlSE1ZfBhD/PVgrk86zenwP4byaHiMHhybUFH4wdGamKnKrd6iDl /0dWml5x/9NQM6jQDad0ZNVQ0w4YOQ3hcg8KH1B5wPZFlcIKTEb8F7U17wsIU/G2cu+T VUTolpsp4eAVB6JWUVrIPoRG83gqXAlpLLNQrDKOkPWLSsumVqDAS/hKj/nMv5Mdffim AK/w== X-Gm-Message-State: ALyK8tLI9p7KZbWNaYKJF48qu2rVi7K8O3GHvnBu35b++qrAUv5vOajGHegKsc8rVAZnGQ== X-Received: by 10.107.171.129 with SMTP id u123mr18110527ioe.111.1465758258938; Sun, 12 Jun 2016 12:04:18 -0700 (PDT) Received: from [172.26.50.149] ([172.26.50.149]) by smtp.googlemail.com with ESMTPSA id p199sm4638582itc.2.2016.06.12.12.04.17 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 12 Jun 2016 12:04:18 -0700 (PDT) Message-ID: <1465758256.7945.125.camel@edumazet-glaptop3.roam.corp.google.com> From: Eric Dumazet To: Jonathan Morton Cc: Kevin Darbyshire-Bryant , cake@lists.bufferbloat.net Date: Sun, 12 Jun 2016 12:04:16 -0700 In-Reply-To: References: <5756ADEC.9070305@darbyshire-bryant.me.uk> <8EE35B4F-5C28-41C7-8795-93A6F606B3A8@gmail.com> <5756E2CA.2020700@darbyshire-bryant.me.uk> <575BD5D0.4060702@darbyshire-bryant.me.uk> <98AFD8C3-2343-4B8B-BFB0-6F877161A039@gmail.com> <1465749085.7945.81.camel@edumazet-glaptop3.roam.corp.google.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: Re: [Cake] Possible BUG - parent backlog incorrectly updated in case of NET_XMIT_CN X-BeenThere: cake@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: Cake - FQ_codel the next generation List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Jun 2016 19:04:19 -0000 On Sun, 2016-06-12 at 20:59 +0300, Jonathan Morton wrote: > Not so - unless you are very sure that q->backlog is the same size as > quantity. Obviously this is the case in all linux kernels so far. If someone plans to change this at one point, then yes, he/she has to fix this. > In an increasingly 64-bit world, that is by no means assured in the > future. I don’t like relying on wraparound behaviour without making > that assumption explicit, which is precisely what the signed types in > C are for. So why don't you send an official patch, since apparently you plan to extend a qdisc backlog beyond 4,294,967,296 bytes ? I am puzzled you have this idea, but why not, some people thought "640KB out to be enough for anybody" ... > > >> IMHO the NET_XMIT_CN semantics are broken. It might be better to > drop > >> support for it, since it should rarely be triggered. > > > > What exact part is broken ? > > Semantic looks good to me. > > It’s broken in that a negative correction may be required in the first > place. But you do realize NET_XMIT_CN was there _before_ Cong Wang added backlog tracking in 2ccccf5fb43f ("net_sched: update hierarchical backlog too") The fact that Cong used a 'unsigned' instead of a 'signed' here is a very minor 'bug', since generated code is exactly the _same_. Claiming the 'semantics' are broken, while the reality is that the 'implementation has a potential bug _if_ we extend to 64bit' is quite different. > It places additional burden on every producer of the CN signal who > isn’t a tail-dropper. I can only assume that the behaviour was > designed with only tail-drop in mind - and as we both know, that is > not the only option any more. > Nothing implied that only tail drops could happen. Where have you seen this in the code exactly ? > It appears to me that there are four things than enqueue() may want to > tell its caller: > > 1: That the packet was enqueued successfully (NET_XMIT_SUCCESS). > > 2: That the packet was enqueued successfully, but some other relevant > packet had to be dropped due to link congestion. > > 3: That the packet was *not* enqueued, and this was due to link > congestion. This is potentially useful for tail-dropping queues, > including AQMs. > > 4: That the packet was *not* enqueued, for some reason other than link > congestion; the “error case". > > Currently NET_XMIT_CN appears to cover case 3, whereas Cake normally > wants cases 1 & 2 (and will only signal case 4 if a GSO aggregate > couldn’t be segmented). > Not true. NET_XMIT_CN also covers 2. (not sure what 'relevant' means for you) > For the time being, I have changed my development branch of Cake to > always signal case 1 except for the error case. Feel free to change it, it is your code. But do not claim NET_XMIT_CN should be removed, just because you missed the point (being focused on router workloads I guess) It can be used by higher level protocol, even if TCP with Small Queues no longer can hit it with normal settings. We have per socket counter of dropped rx packets (sk->sk_drops), we could very easily add a counter of drops or congestions at tx. This might allow UDP applications to reduce their rate or 'fast' retransmit without waiting an extra RTT.