From: Dave Taht
To: Valdis Kletnieks
Cc: "cerowrt-devel@lists.bufferbloat.net"
Date: Sat, 19 Jul 2014 10:31:17 -0700
Subject: Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
In-Reply-To: <55810.1405788087@turing-police.cc.vt.edu>
References: <55810.1405788087@turing-police.cc.vt.edu>
List-Id: Development issues regarding the cerowrt test router project

On Sat, Jul 19, 2014 at 9:41 AM, wrote:
> On Fri, 18 Jul 2014 17:23:24 -0700, Dave Taht said:
>> In particular, I'd *really love* to rip most of the network stack out
>> of the kernel and into userspace. And I really like the idea of
>> writable hardware that can talk to virtual memory from userspace (the
>> zynq can)

I enjoyed the first comment over here:

https://news.ycombinator.com/item?id=8056001

and stumbled across it in my every-morning's google search for
"bufferbloat".

I really hate the fragmentation of the conversation that has happened
since netnews got overwhelmed by spam.

> To misquote Lost Boys, "One thing about living in a microkernel I never could
> stomach, all the damn context switches."

Ha. That movie was filmed in my "hometown" (santa cruz), and all the
local extras in it look like that, all the time. Love that movie. Been
on those railroad tracks.

> Or were you thinking going entirely the opposite direction and offloading to
> ever-smarter network cards, sort of a combo of segment offload and zero-copy,
> on steroids no less?

Offloads are a pita with our current architectures. Certainly I'd like
to make it easier to prototype hardware and distribute loads to
alternate cpus, whether they be centralized or distributed.

Things like netmap and click are interesting, as is all the work on
openswitch and SDN related technologies.

http://info.iet.unipi.it/~luigi/netmap/

Not for the ultimate speed of it... but if you can take something that
is very hard to do and experiment with in the kernel, and move it to
where you can play with it in userspace, having the vm protections
there makes iterating on ideas much easier.

You can prototype stuff in really high level languages (like python)
and prove stuff out, and that's a good thing. Certainly the performance
won't be there, but if you can clearly identify a core performance
enhancing thing, you can move it to C or ASIC later.
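
To illustrate the kind of userspace prototyping described above, here
is a minimal sketch of a packet scheduler written in plain Python. The
choice of deficit round robin, the class name DeficitRoundRobin, the
flow names, and the 1500-byte quantum are all assumptions made for
illustration; none of them come from this thread. Packets are modelled
as bare byte counts, so this only exercises the scheduling logic, not
real I/O.

```python
from collections import deque


class DeficitRoundRobin:
    """Toy deficit round robin scheduler (hypothetical, for illustration).

    Each flow has its own queue of packet sizes in bytes. On every visit
    a flow earns `quantum` bytes of credit and may send packets for as
    long as the packet at the head of its queue fits in that credit.
    """

    def __init__(self, quantum=1500):
        self.quantum = quantum   # bytes of credit added per visit
        self.queues = {}         # flow id -> deque of packet sizes
        self.deficit = {}        # flow id -> unused byte credit
        self.active = deque()    # flows that currently hold packets

    def enqueue(self, flow, size):
        if flow not in self.queues:
            self.queues[flow] = deque()
            self.deficit[flow] = 0
        if not self.queues[flow]:
            # Queue was empty, so the flow is not on the active list yet.
            self.active.append(flow)
        self.queues[flow].append(size)

    def dequeue_round(self):
        """Serve the flow at the head of the active list.

        Returns the list of (flow, size) packets sent during this visit,
        or an empty list when no flow has anything queued.
        """
        if not self.active:
            return []
        flow = self.active.popleft()
        queue = self.queues[flow]
        self.deficit[flow] += self.quantum
        sent = []
        while queue and queue[0] <= self.deficit[flow]:
            size = queue.popleft()
            self.deficit[flow] -= size
            sent.append((flow, size))
        if queue:
            # Still backlogged: keep the leftover credit, go to the back.
            self.active.append(flow)
        else:
            # Idle flows do not get to hoard credit.
            self.deficit[flow] = 0
        return sent
```

Feeding it one flow of large packets and one of small packets shows the
small-packet flow getting served after a single large packet rather
than waiting behind the whole backlog. Once the behaviour looks right,
the same logic is a candidate for a rewrite in C.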

I think, incidentally, HAD micro-kernels been successful, hardware
support for them would have evolved, and it would be a far better, more
reliable computing world.

I note that at the time I was working on things like mach (early 90s),
I didn't feel this way! Moving virtual memory management to userspace
incurred such a substantial overhead as to obviate the advantages.
There was plenty of other stuff that was pretty useful to move to
userspace (plan 9 did it better than mach), too, but it all got lost in
how hard and slow it was at the time to abstract so much out of the
kernel. (There are good ideas in every bad paper.)

I have since, given how finicky and crash-prone kernel programming is
in general, revised my opinion.

One of my biggest regrets about the "evolution" of computer design over
the last 20 years is that most hardware offloads can only be used in
kernel space. That has made those improvements difficult to code and
design for, and often downright useless, as they can't easily be used
on small amounts of data without excessive context switching.

Secondly, it has led to a division of labor where EEs, in love with the
billions of transistors at their disposal, burn time writing things
that userspace apps can't use.

So I'm VERY bugged about that, and was overjoyed at the prospect in the
zynq of being able to "write hardware" and have it talk through a
virtual memory port, so that if you could identify a thing that could
be done better in hardware, you could get at it via vm with no context
switches, which is particularly valuable on a multi-core cpu
architecture.

To me the availability of the virtual memory port on the zynq is the
greatest innovation I've seen in FPGA design in a decade, and it may
one day re-unify the outlook of the EEs and userspace programmers to do
genuinely useful stuff.
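
As a rough sketch of what "getting at hardware via vm" looks like from
userspace, the Python below maps a register window into the process and
then reads and writes 32-bit registers with ordinary memory accesses,
with no system call per access. Everything specific here is an
assumption for illustration: the 4096-byte window, the register offset,
the little-endian layout, and the helper names. A temporary file stands
in for the device so the sketch runs anywhere; on real hardware the
path would be whatever device node the platform exposes for the logic.

```python
import mmap
import os
import struct
import tempfile

REG_SIZE = 4096  # assumed size of the register window (one page)


def open_register_window(path, size=REG_SIZE):
    """Map `size` bytes of `path` read/write into this process."""
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, size, access=mmap.ACCESS_WRITE)
    finally:
        # The mapping remains valid after the descriptor is closed.
        os.close(fd)


def write_reg(window, offset, value):
    """Store a 32-bit little-endian value at `offset` in the window."""
    struct.pack_into("<I", window, offset, value)


def read_reg(window, offset):
    """Load a 32-bit little-endian value from `offset` in the window."""
    return struct.unpack_from("<I", window, offset)[0]


def demo():
    """Write then read back one register on a stand-in 'device' file."""
    with tempfile.NamedTemporaryFile(delete=False) as fake:
        fake.write(b"\0" * REG_SIZE)  # zero-filled stand-in for registers
        path = fake.name
    try:
        window = open_register_window(path)
        try:
            write_reg(window, 0x10, 0xDEADBEEF)  # hypothetical register
            return read_reg(window, 0x10)
        finally:
            window.close()
    finally:
        os.unlink(path)
```

The only kernel involvement is the one-time mapping; after that each
register access is a plain load or store, which is the property that
makes small, frequent hardware operations cheap.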

There are zillions of useful things that can be done better with just a
little extra hardware support, from userspace. Two things that have
come up of late are CAM memory comparisons and echo cancelling, both of
which are easy to do efficiently in hardware, with performance a
conventional von Neumann architecture can't match.

Other things on my mind are things like packet scheduling (as per the
senic paper I posted earlier), and much, much more. And it would be
great if the EEs and CS folk started going to the same parties again.

Last regret of the day: Back in the 80s, the LISP machine was a 36 bit
tagged architecture. I loved it. I have been mildly bugged that we
didn't use the top 4 bits on 64 bit architectures for tags; it would
make things like garbage collection so much easier...

-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article