From: Eric Dumazet
To: Jonathan Morton
Cc: bloat, bloat-devel, Juliusz Chroboczek
Date: Fri, 19 Jun 2015 00:10:11 -0700
Subject: Re: [Bloat] using tcp_notsent_lowat in various apps?
Message-ID: <1434697811.31511.9.camel@edumazet-glaptop2.roam.corp.google.com>
In-Reply-To: <6644BD81-1FFC-450C-89FD-91E138B7824A@gmail.com>
References: <87y4jgbejc.wl-jch@pps.univ-paris-diderot.fr> <6644BD81-1FFC-450C-89FD-91E138B7824A@gmail.com>
List-Id: General list for discussing Bufferbloat

On Fri, 2015-06-19 at 07:07 +0300, Jonathan Morton wrote:
> > On 19 Jun, 2015, at 05:47, Juliusz Chroboczek wrote:
> >
> >> I am curious if anyone has tried this new socket option in
> >> appropriate apps,
> >
> > I'm probably confused, but I don't see how this is different from
> > setting SO_SNDBUF. I realise that's lower in the stack, but it should
> > have a similar effect, shouldn't it?
>
> What I understand of it is:
>
> Reducing SO_SNDBUF causes send() to block until all of the data can be
> accommodated in the smaller buffer. But select() will return the
> socket as soon as there is *any* space in that buffer to stuff data
> into.
>
> TCP_NOTSENT_LOWAT causes select() to not return the socket until the
> data in the buffer falls below the mark, which may (and should) be a
> mere fraction of the total buffer size.
>
> It’s a subtle difference, but worth noting. The two options
> effectively apply to completely different system calls.
>
> You could use both in the same program, but generally SO_SNDBUF would
> be set to a higher value than the low water mark. This allows a
> complete chunk of data to be stuffed into the buffer, and the
> application can then spend more time waiting in select() - where it is
> in a better position to make control decisions which are likely to be
> latency sensitive, and it can service other sockets which might be
> draining or filling at a different rate.

SO_SNDBUF needs to be large enough to cope with losses/repairs.

If the flow has no losses, SNDBUF needs to be at least one BDP, i.e. the
bytes in flight:

    cwnd * MSS    (= rate * rtt)

If a packet can be lost once, then SNDBUF needs to be:

    2 * cwnd * MSS

If a packet can be lost twice, then we need:

    3 * cwnd * MSS

... etc ...

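To put numbers on the sizing rule above, here is a minimal sketch. It is not kernel code. It assumes one BDP can be approximated by the bytes in flight, cwnd * MSS, and the function and parameter names are hypothetical.

```python
def sndbuf_bytes(cwnd_packets: int, mss_bytes: int, max_losses: int = 0) -> int:
    """Rough lower bound for SO_SNDBUF following the rule in this mail.

    cwnd_packets: congestion window, in packets (assumed known to the caller)
    mss_bytes:    maximum segment size, in bytes
    max_losses:   how many times a given packet may be lost before it is ACKed
    """
    if cwnd_packets < 0 or mss_bytes < 0 or max_losses < 0:
        raise ValueError("arguments must be non-negative")
    # One window of bytes in flight, taken here as one BDP.
    in_flight = cwnd_packets * mss_bytes
    # Each tolerated loss adds one more window's worth of buffer.
    return (1 + max_losses) * in_flight
```

For example, with cwnd = 10 packets and MSS = 1448 bytes, this gives 14480 bytes with no losses, 28960 bytes if a packet can be lost once, and 43440 bytes if it can be lost twice.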
But really the TCP write queue is logically split into two parts:

[1] Already sent data, waiting for ACK. This one can be arbitrarily big,
    depending on network conditions.

[2] Not sent data.

Part [1] is hard to size, because it depends on losses, which cannot be
predicted.

Part [2] is easy to size, if we have some reasonable way to schedule the
application to provide additional data (write()/send()) when it runs
empty.

SO_SNDBUF sizes the overall TCP write queue ([1] + [2]),

while NOTSENT_LOWAT is able to restrict [2] only, avoiding filling the
write queue when/if no drops are actually seen.
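
For anyone wanting to try both knobs together, a minimal sketch of how an application might set them follows. The helper names and the example sizes are hypothetical. The fallback value 25 for TCP_NOTSENT_LOWAT is the Linux constant from <linux/tcp.h> and is only an assumption on other systems.

```python
import select
import socket

# Python exposes TCP_NOTSENT_LOWAT where the platform defines it.
# Falling back to 25 (the Linux value) is an assumption of this sketch.
TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)


def configure_send_limits(sock, sndbuf: int, notsent_lowat: int) -> None:
    """Bound the whole write queue and, separately, its unsent part."""
    # SO_SNDBUF covers sent-but-unacked plus not-yet-sent data.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    # TCP_NOTSENT_LOWAT only limits the not-yet-sent part.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, notsent_lowat)


def wait_writable(sock, timeout: float) -> bool:
    """Return True once select() reports the socket as writable."""
    _, writable, _ = select.select([], [sock], [], timeout)
    return bool(writable)
```

On a connected TCP socket one would call something like configure_send_limits(sock, 256 * 1024, 16 * 1024), then only call send() after wait_writable() returns True. The idea is that the application wakes up when the unsent part is nearly drained, instead of whenever any buffer space frees up.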