From: Ken Birman
To: 'Dave Taht', Matthias Tafelmeier
CC: Bob Briscoe, "ken@cs.cornell.edu", "bloat@lists.bufferbloat.net"
Subject: Re: [Bloat] DETNET
Date: Wed, 15 Nov 2017 19:45:40 -0000
In-Reply-To: <87fu9f72za.fsf@nemesis.taht.net>
List-Id: General list for discussing Bufferbloat

I'm missing context. Can someone tell me why I'm being cc'ed on these? Topic seems germane to me (Derecho uses RDMA on RoCE) but I'm unclear what this thread is "about"

Ken Birman

-----Original Message-----
From: Dave Taht [mailto:dave@taht.net]
Sent: Wednesday, November 15, 2017 2:32 PM
To: Matthias Tafelmeier
Cc: Bob Briscoe; ken@cs.cornell.edu; bloat@lists.bufferbloat.net
Subject: Re: [Bloat] DETNET

Matthias Tafelmeier writes:

> However, like you, I just sigh when I see the behemoth detnet is building.
>
> Does it? Well, so far the circumference seems justifiable for what
> they want to achieve, at least according to what I can tell from these
> rather still abstract concepts.
>
> The sort of industrial control applications that detnet is targeting
> require far lower queuing delay and jitter than fq_CoDel can give. They
> have thrown around numbers like 250us jitter and 1E-9 to 1E-12 packet
> loss probability.
>
> Nonetheless, it's important to have a debate about where to go to next.
> Personally I don't think fq_CoDel alone has legs to get (that) much better.

The place where bob and I always disconnect is that I care about interflow latencies generally more than queuing latencies and prefer to have strong incentives for non-queue building flows in the first place. This results in solid latencies of 1/flows at your bandwidth. At 100Mbit, a single 1500 byte packet takes 130us to deliver, gbit, 13us, 10Gbit, 1.3us.

So for values of flows of 2, 20, 200, at these bandwidths, we meet this detnet requirement. As for queuing, if you constrain the network diameter, and use ECN, fq_codel can scale down quite a lot, but I agree there is other work in this area that is promising.
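The per-packet figures above can be checked with a few lines of arithmetic. The following is a rough sketch, not anything posted in the thread: the function names are hypothetical, framing overhead is ignored, and it assumes that "1/flows" means a round-robin scheduler where a packet waits behind at most one full-size packet from each competing flow. Under those assumptions the raw serialization time of a 1500 byte packet comes out at about 120us at 100Mbit, close to the rounded figure quoted above.

```python
# Serialization delay of one packet, and the wait behind one full-size
# packet from each of `flows` flows in a round-robin scheduler.
# Assumption: Ethernet framing overhead is ignored.

def serialization_delay_us(packet_bytes: int, link_bps: float) -> float:
    """Time to put one packet on the wire, in microseconds."""
    return packet_bytes * 8 / link_bps * 1e6


def interflow_wait_us(flows: int, packet_bytes: int, link_bps: float) -> float:
    """Wait if one packet from each of `flows` flows is sent first."""
    return flows * serialization_delay_us(packet_bytes, link_bps)


if __name__ == "__main__":
    JITTER_BUDGET_US = 250  # jitter figure quoted in the thread
    cases = [("100Mbit", 100e6, 2), ("1Gbit", 1e9, 20), ("10Gbit", 10e9, 200)]
    for label, bps, flows in cases:
        per_packet = serialization_delay_us(1500, bps)
        wait = interflow_wait_us(flows, 1500, bps)
        ok = wait <= JITTER_BUDGET_US
        print(f"{label}: {per_packet:.1f}us/packet, {flows} flows -> "
              f"{wait:.0f}us (within budget: {ok})")
```

With these simplifications, 2, 20 and 200 flows at the three bandwidths all land just under the 250us jitter figure mentioned earlier in the thread.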
However, underneath this, unless a shaper like htb or cake is used, is additional unavoidable buffering in the device driver, at 1Gbit and higher, managed by BQL. We've successfully used sch_cake to hold things down to a single packet, soft-shaped, at speeds of 15Gbit or so, on high end hardware.

Now, I don't honestly know enough about detnet to say if any part of this discussion actually applies to what they are trying to solve! and I don't plan to look into it until the next ietf meeting.

I've been measuring overlying latencies elsewhere in the Linux kernel at 2-6us with a long tail to about 2ms for years now. There is a lot of passionate work trying to get latencies down for small packets above 10Gbits in the Linux world.. but there, it's locking, and routing - not queueing - that is the dominating factor.

>
> Certainly, all you said is valid - as I stated, I mostly wanted to
> share the digest/the existence of the initiative without judging/reproaching/peaching .
> ..
>
> I prefer the direction that Mohamad Alizadeh's HULL pointed in:
> Less is More: Trading a little Bandwidth for Ultra-Low Latency in the Data
> Center

I have adored all his work. DCTCP, HULL, one other paper... what's he doing now?

> In HULL you have i) a virtual queue that models what the queue would be if
> the link were slightly slower, then marks with ECN based on that. ii) a much
> more well-behaved TCP (HULL uses DCTCP with hardware pacing in the NICs).

I do keep hoping that more folk will look at cake... it's a little crufty right now, but we just added ack filtering to it and are starting up a major set of test runs in december/january.

> I would love to be able to demonstrate that HULL can achieve the same
> extremely low latency and loss targets as detnet, but with a fraction of the
> complexity.
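The virtual queue idea described in the quoted text above can be sketched in a few lines. This is only an illustration of the general technique (a counter drained at a fraction of the link rate, with ECN marking on a threshold), not HULL's actual implementation: the class name, the 0.95 drain fraction and the 3000 byte threshold are all assumptions chosen for the example.

```python
class VirtualQueue:
    """Byte counter that drains slightly slower than the real link and
    asks for an ECN mark when its simulated backlog passes a threshold.
    All default values are illustrative assumptions."""

    def __init__(self, link_bps: float, drain_fraction: float = 0.95,
                 mark_threshold_bytes: int = 3000):
        # The virtual link is slower than the real one by drain_fraction.
        self.drain_bytes_per_s = link_bps * drain_fraction / 8
        self.mark_threshold_bytes = mark_threshold_bytes
        self.backlog_bytes = 0.0
        self.last_time_s = 0.0

    def on_packet(self, now_s: float, packet_bytes: int) -> bool:
        """Account for a packet seen at now_s; return True to ECN-mark it."""
        elapsed = now_s - self.last_time_s
        self.last_time_s = now_s
        # Drain the virtual backlog for the time that has passed.
        drained = elapsed * self.drain_bytes_per_s
        self.backlog_bytes = max(0.0, self.backlog_bytes - drained)
        self.backlog_bytes += packet_bytes
        return self.backlog_bytes > self.mark_threshold_bytes


if __name__ == "__main__":
    vq = VirtualQueue(link_bps=1e9)
    # 1500 byte packets back to back at 1Gbit: one every 12us.
    marks = [vq.on_packet(i * 12e-6, 1500) for i in range(100)]
    print("first marked packet index:", marks.index(True))
```

Because the counter drains more slowly than the real link, a sender running at full line rate starts collecting marks while the real queue is still empty, which is where the bandwidth-for-latency trade comes from. A sender that stays below the virtual rate is never marked.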
>
> Well, if it's already for specific HW, then I'd prefer to see RDMA in
> place right away with getting rid of IRQs and other TCP/IP specific
> rust along the way, at least for DC realms :) Although, this HULL
> might have a spin for it from economics perspective.

It would be good for more people to read the proceedings from the recent netdev conference:

https://lwn.net/Articles/738912/

>
> For public Internet, not just for DCs? You might have seen the work we've
> done (L4S) to get queuing delay over regular public Internet and broadband
> down to about mean 500us; 90%-ile 1ms, by making DCTCP deployable alongside
> existing Internet traffic (unlike HULL, pacing at the source is in Linux,
> not hardware). My personal roadmap for that is to introduce virtual queues
> at some future stage, to get down to the sort of delays that detnet wants,
> but over the public Internet with just FIFOs.

My personal goal is to just apply what we got to incrementally reduce all delays from seconds to milliseconds, across the internet and on every device I can fix. Stuff derived from the sqm-scripts is universally available in third party firmware now, and in many devices.

Also: I'm really really happy with what we've done for wifi so far, I think we can cut peak latencies by another factor of 3, maybe even 5, with what we got coming up next from the make-wifi-fast project.

And that's mostly *driver* work, not abstract queuing theory.