* [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Jim Reisert AD1C @ 2014-07-18 13:48 UTC
To: cerowrt-devel

"Fastpass is a datacenter network framework that aims for high
utilization with zero queueing. It provides low median and tail
latencies for packets, high data rates between machines, and flexible
network resource allocation policies. The key idea in Fastpass is
fine-grained control over packet transmission times and network
paths."

Read more at http://fastpass.mit.edu/

-- 
Jim Reisert AD1C, <jjreisert@alum.mit.edu>, http://www.ad1c.us
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: David Lang @ 2014-07-18 23:20 UTC
To: Jim Reisert AD1C; +Cc: cerowrt-devel

On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:

> "Fastpass is a datacenter network framework that aims for high
> utilization with zero queueing. It provides low median and tail
> latencies for packets, high data rates between machines, and flexible
> network resource allocation policies. The key idea in Fastpass is
> fine-grained control over packet transmission times and network
> paths."
>
> Read more at http://fastpass.mit.edu/

<sarcasm>
and all it takes is making one central point aware of all the
communications that are going to take place, so that it can coordinate
everything.

That is sure to scale to an entire datacenter, and beyond that to the
Internet.
</sarcasm>

Now, back in the real world, you can't reliably get that information
for a single host, let alone an entire datacenter. Not to mention the
scalability issues in trying to get that info to a central point so
that it can compute the N! possible paths for the data to take, along
with the data from all the other systems, to pack everything most
efficiently into the available paths.

Someone may eventually make something useful out of this, but I think
that it's at best a typical academic ivory-tower type of solution.

David Lang
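[Illustrative aside, not part of the thread: a minimal Python sketch
of the combinatorial point above. The flow counts and the "8
equal-cost paths per flow" figure are assumptions made up for the
example.]

    from math import factorial, log10

    def assignment_space_digits(n_flows, paths_per_flow):
        # Decimal digits in paths_per_flow ** n_flows, computed via
        # logarithms so the huge values never overflow a float.
        return int(n_flows * log10(paths_per_flow)) + 1

    # Joint choice: every concurrent flow independently picks one of
    # (an assumed) 8 equal-cost paths; a central arbiter that packs
    # flows jointly is searching a space of this size.
    for n in (10, 100, 1000):
        print(n, "flows:", assignment_space_digits(n, 8), "digits")

    # Even ignoring paths, merely ordering N packets at one choke
    # point is N! arrangements:
    print("20! =", format(factorial(20), ","))  # already ~2.4e18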
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: David Lang @ 2014-07-18 23:27 UTC
To: Jim Reisert AD1C; +Cc: cerowrt-devel

On Fri, 18 Jul 2014, David Lang wrote:

> On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:
>
>> "Fastpass is a datacenter network framework that aims for high
>> utilization with zero queueing. It provides low median and tail
>> latencies for packets, high data rates between machines, and flexible
>> network resource allocation policies. The key idea in Fastpass is
>> fine-grained control over packet transmission times and network
>> paths."
>>
>> Read more at http://fastpass.mit.edu/
>
> <sarcasm>
> and all it takes is making one central point aware of all the
> communications that are going to take place, so that it can
> coordinate everything.
>
> That is sure to scale to an entire datacenter, and beyond that to
> the Internet.
> </sarcasm>

By the way, this wasn't intended to be an attack on Jim or anyone
else; please continue posting such links. It's important to know about
even the bad ones so that you aren't blindsided by someone referring
to them.

David Lang
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Dave Taht @ 2014-07-19 0:23 UTC
To: David Lang; +Cc: cerowrt-devel

On Fri, Jul 18, 2014 at 4:27 PM, David Lang <david@lang.hm> wrote:
> On Fri, 18 Jul 2014, David Lang wrote:
>
>> On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:
>>
>>> "Fastpass is a datacenter network framework that aims for high
>>> utilization with zero queueing. It provides low median and tail
>>> latencies for packets, high data rates between machines, and
>>> flexible network resource allocation policies. The key idea in
>>> Fastpass is fine-grained control over packet transmission times
>>> and network paths."
>>>
>>> Read more at http://fastpass.mit.edu/
>>
>> <sarcasm>
>> and all it takes is making one central point aware of all the
>> communications that are going to take place, so that it can
>> coordinate everything.
>>
>> That is sure to scale to an entire datacenter, and beyond that to
>> the Internet.
>> </sarcasm>

Your tag is incomplete, therefore the rest of your argument fits under
it too. :)

What I find really puzzling is that this paper makes no reference to
the fair queuing literature at all.

I was even more puzzled by one of the cited papers when it came out,
where what they are implementing is basically just a version of
"shortest queue first":

http://web.stanford.edu/~skatti/pubs/sigcomm13-pfabric.pdf

vs

http://www.internetsociety.org/sites/default/files/pdf/accepted/4_sqf_isoc.pdf

(can't find the SQF paper I wanted to cite; I think it's in the cites
above)

That one also didn't cite any of the fair queuing literature, which
goes back to 1989. They use different terms ("priorities"), different
language, etc., which suggests the earlier work was never read... and
yet, once you translate the terminology, all the ideas and papers
cited in both the MIT and Stanford papers seem to have clear roots in
FQ.

Maybe it takes having to fundamentally change the architecture of the
Internet to get an idea published nowadays? You have to make it
unimplementable and non-threatening to the powers that be? Studiously
avoid work from the previous decade? Maybe the authors have to speak
in code? Or maybe an idea looks better when multiple people discover
it, describe it in different ways, and ultimately get together to
resolve their differences? Don't know...

If you swap pfabric's proposed wholesale replacement of IP and TCP
back out for a conventional IP architecture, drop their version of
shortest queue first, and substitute nearly any form of fair queuing
(SQF is quite good; fq_codel seems better), you get similar, possibly
even better, results without changing anything important.

Going back to the MIT paper: I liked that they measured "normal"
queuing delay in the data center (not looking at the paper now...
3.4 ms?) under those conditions, and showed what could happen if
buffering was reduced by a huge amount (no matter the underlying
algorithm doing so). I also liked that they worked with a very
advanced, clever SDK that rips networking out of the Linux kernel core
(and moves most packet processing into Intel's huge cache), among a
few other things that were very interesting. Yes, the central clock
idea is a bit crazy, but with switching times measured in ns, it might
actually work over short (single-rack) distances.
And the "incast" problem is, really, really hard and benefits from this sort of approach. There's a whole new ietf wg dedicated (dclc I think it's called). There is often hidden value in many a paper, even if the central idea is problematic. In particular, I'd *really love* to rip most of the network stack out of the kernel and into userspace. And I really like the idea of writable hardware that can talk to virtual memory from userspace (the zynq can) -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Dave Taht @ 2014-07-19 0:36 UTC
To: David Lang; +Cc: cerowrt-devel

oops, meant to post this on this thread, not the other:

http://www.eecs.berkeley.edu/~sylvia/cs268-2014/papers//FQ1989.pdf

Reading it now, it is a model of clarity that states things we've had
to discover anew. Also, at the time, VJ's congestion control work was
new. I love history, I love trying to understand things in context...
Now that I've finally found this one, it would be great to find the
papers it cites, too.

On Fri, Jul 18, 2014 at 5:23 PM, Dave Taht <dave.taht@gmail.com> wrote:
> On Fri, Jul 18, 2014 at 4:27 PM, David Lang <david@lang.hm> wrote:
>> On Fri, 18 Jul 2014, David Lang wrote:
>>
>>> On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:
>>>
>>>> "Fastpass is a datacenter network framework that aims for high
>>>> utilization with zero queueing. It provides low median and tail
>>>> latencies for packets, high data rates between machines, and
>>>> flexible network resource allocation policies. The key idea in
>>>> Fastpass is fine-grained control over packet transmission times
>>>> and network paths."
>>>>
>>>> Read more at http://fastpass.mit.edu/
>>>
>>> <sarcasm>
>>> and all it takes is making one central point aware of all the
>>> communications that are going to take place, so that it can
>>> coordinate everything.
>>>
>>> That is sure to scale to an entire datacenter, and beyond that to
>>> the Internet.
>>> </sarcasm>
>
> Your tag is incomplete, therefore the rest of your argument fits under
> it too. :)
>
> What I find really puzzling is that this paper makes no reference to
> the fair queuing literature at all.
>
> I was even more puzzled by one of the cited papers when it came out,
> where what they are implementing is basically just a version of
> "shortest queue first":
>
> http://web.stanford.edu/~skatti/pubs/sigcomm13-pfabric.pdf
>
> vs
>
> http://www.internetsociety.org/sites/default/files/pdf/accepted/4_sqf_isoc.pdf
>
> (can't find the SQF paper I wanted to cite; I think it's in the cites
> above)
>
> That one also didn't cite any of the fair queuing literature, which
> goes back to 1989. They use different terms ("priorities"), different
> language, etc., which suggests the earlier work was never read... and
> yet, once you translate the terminology, all the ideas and papers
> cited in both the MIT and Stanford papers seem to have clear roots in
> FQ.
>
> Maybe it takes having to fundamentally change the architecture of the
> Internet to get an idea published nowadays? You have to make it
> unimplementable and non-threatening to the powers that be? Studiously
> avoid work from the previous decade? Maybe the authors have to speak
> in code? Or maybe an idea looks better when multiple people discover
> it, describe it in different ways, and ultimately get together to
> resolve their differences? Don't know...
>
> If you swap pfabric's proposed wholesale replacement of IP and TCP
> back out for a conventional IP architecture, drop their version of
> shortest queue first, and substitute nearly any form of fair queuing
> (SQF is quite good; fq_codel seems better), you get similar, possibly
> even better, results without changing anything important.
>
> Going back to the MIT paper: I liked that they measured "normal"
> queuing delay in the data center (not looking at the paper now...
> 3.4 ms?) under those conditions, and showed what could happen if
> buffering was reduced by a huge amount (no matter the underlying
> algorithm doing so). I also liked that they worked with a very
> advanced, clever SDK that rips networking out of the Linux kernel
> core (and moves most packet processing into Intel's huge cache),
> among a few other things that were very interesting. Yes, the central
> clock idea is a bit crazy, but with switching times measured in ns,
> it might actually work over short (single-rack) distances. And the
> "incast" problem is really, really hard, and benefits from this sort
> of approach. There's a whole new IETF WG dedicated to it (dclc, I
> think it's called).
>
> There is often hidden value in many a paper, even if the central idea
> is problematic.
>
> In particular, I'd *really love* to rip most of the network stack out
> of the kernel and into userspace. And I really like the idea of
> writable hardware that can talk to virtual memory from userspace (the
> Zynq can).
>
> -- 
> Dave Täht
>
> NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article

-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Valdis.Kletnieks @ 2014-07-19 16:41 UTC
To: Dave Taht; +Cc: cerowrt-devel

On Fri, 18 Jul 2014 17:23:24 -0700, Dave Taht said:

> In particular, I'd *really love* to rip most of the network stack out
> of the kernel and into userspace. And I really like the idea of
> writable hardware that can talk to virtual memory from userspace (the
> Zynq can)

To misquote Lost Boys, "One thing about living in a microkernel I never
could stomach, all the damn context switches."

Or were you thinking of going entirely the opposite direction and
offloading to ever-smarter network cards, sort of a combo of segment
offload and zero-copy, on steroids no less?
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Dave Taht @ 2014-07-19 17:31 UTC
To: Valdis Kletnieks; +Cc: cerowrt-devel

On Sat, Jul 19, 2014 at 9:41 AM, <Valdis.Kletnieks@vt.edu> wrote:
> On Fri, 18 Jul 2014 17:23:24 -0700, Dave Taht said:
>> In particular, I'd *really love* to rip most of the network stack out
>> of the kernel and into userspace. And I really like the idea of
>> writable hardware that can talk to virtual memory from userspace (the
>> Zynq can)

I enjoyed the first comment over here:

https://news.ycombinator.com/item?id=8056001

I stumbled across it in my every-morning google search for
"bufferbloat". I really hate the fragmentation of the conversation
that has happened since netnews got overwhelmed by spam.

> To misquote Lost Boys, "One thing about living in a microkernel I
> never could stomach, all the damn context switches."

Ha. That movie was filmed in my "hometown" (Santa Cruz), and all the
local extras in it look like that, all the time. Love that movie. Been
on those railroad tracks.

> Or were you thinking of going entirely the opposite direction and
> offloading to ever-smarter network cards, sort of a combo of segment
> offload and zero-copy, on steroids no less?

Offloads are a pain with our current architectures. Certainly I'd like
to make it easier to prototype hardware and distribute loads to
alternate CPUs, whether they be centralized or distributed. Things
like netmap and Click are interesting, as is all the work on Open
vSwitch and SDN-related technologies.

http://info.iet.unipi.it/~luigi/netmap/

Not for the ultimate speed of it... but if you can move something that
is very hard to do and experiment with in the kernel to where you can
play with it in userspace, the VM protections there make iterating on
ideas much easier. You can prototype things in really high-level
languages (like Python) and prove them out, and that's a good thing.
Certainly the performance won't be there, but if you can clearly
identify a core performance-enhancing piece, you can move it to C or
an ASIC later.

I think, incidentally, that HAD microkernels been successful, hardware
support for them would have evolved, and it would be a far better,
more reliable computing world. I note that at the time I was working
on things like Mach (early 90s) I didn't feel this way, as moving
virtual memory management to userspace incurred enough overhead to
obviate the advantages. There was plenty of other stuff that was
pretty useful to move to userspace (Plan 9 did it better than Mach),
too, but it all got lost in how hard and slow it was at the time to
abstract so much out of the kernel. (there are good ideas in every bad
paper)

I have since, given how finicky and crash-prone kernel programming is
in general, revised my opinion.

One of my biggest regrets about the "evolution" of computer design
over the last 20 years is that most hardware offloads can only be used
in kernel space. That has made those improvements difficult to code
and design for, and often downright useless, as they can't easily be
applied to small amounts of data without excessive context switching.
Secondly, it has led to a division of labor where EEs, in love with
the billions of transistors at their disposal, burn time building
things that userspace apps can't use.
So I'm VERY bugged about that, and was overjoyed at the prospect, on
the Zynq, of being able to "write hardware" and have it talk through a
virtual memory port, so that if you could identify a thing that could
be done better in hardware, you could get at it via VM with no context
switches, which is particularly valuable on a multi-core CPU
architecture. To me the availability of the virtual memory port on the
Zynq is the greatest innovation I've seen in FPGA design in a decade,
and it may one day re-unify the outlook of the EEs and userspace
programmers to do genuinely useful stuff.

There are zillions of useful things that could be done better with
just a little extra hardware support, from userspace. Two that have
come up of late are CAM memory comparisons and echo cancelling, both
of which are easy to do efficiently in hardware, with performance a
conventional von Neumann architecture can't match. Other things on my
mind are packet scheduling (as per the SENIC paper I posted earlier),
and much, much more. And it would be great if the EEs and CS folk
started going to the same parties again.

Last regret of the day: back in the 80s, the LISP machine was a 36-bit
tagged architecture. I loved it. I have been mildly bugged that we
didn't use the top 4 bits on 64-bit architectures for tags; it would
make things like garbage collection so much easier...

-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
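[Illustrative aside, not part of the thread: a minimal sketch of the
"top 4 bits as a tag" idea from the last paragraph, done in software
with plain integers. The tag layout and tag values are assumptions.]

    TAG_SHIFT = 60
    TAG_MASK = 0xF << TAG_SHIFT           # bits 60..63: the type tag
    PAYLOAD_MASK = (1 << TAG_SHIFT) - 1   # bits 0..59: the payload

    TAG_FIXNUM, TAG_POINTER = 0x1, 0x2    # made-up tag values

    def tag(value, t):
        assert value == value & PAYLOAD_MASK, "payload exceeds 60 bits"
        return (t << TAG_SHIFT) | value

    def untag(word):
        return (word & TAG_MASK) >> TAG_SHIFT, word & PAYLOAD_MASK

    word = tag(0x7F80_0000_1234, TAG_POINTER)
    t, payload = untag(word)
    assert (t, payload) == (TAG_POINTER, 0x7F80_0000_1234)
    # A collector scanning the heap can now tell pointers from fixnums
    # by the tag alone, with no side tables: the convenience being
    # missed above.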
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Valdis.Kletnieks @ 2014-07-19 0:04 UTC
To: David Lang; +Cc: cerowrt-devel

On Fri, 18 Jul 2014 16:20:28 -0700, David Lang said:

> Not to mention the scalability issues in trying to get that info to a
> central point so that it can compute the N! possible paths for the
> data to take, along with the data from all the other systems, to pack
> everything most efficiently into the available paths.

OK, so it converges slower than BGP. But at least it's not subject to
wedgies. :)
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: David Lang @ 2014-07-19 0:31 UTC
To: Valdis.Kletnieks; +Cc: cerowrt-devel

On Fri, 18 Jul 2014, Valdis.Kletnieks@vt.edu wrote:

> On Fri, 18 Jul 2014 16:20:28 -0700, David Lang said:
>> Not to mention the scalability issues in trying to get that info to
>> a central point so that it can compute the N! possible paths for the
>> data to take, along with the data from all the other systems, to
>> pack everything most efficiently into the available paths.
>
> OK, so it converges slower than BGP. But at least it's not subject to
> wedgies. :)

The question is whether it converges faster than the traffic changes
:-)

If you are really aiming for zero queuing throughout the network, with
the links all running at full speed, you have no time to respond to
new traffic.

David Lang
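[Illustrative aside, not part of the thread: a back-of-the-envelope
look at the timescale in question. Link speed, MTU, and host count are
made-up assumptions; a Fastpass-style arbiter allocates transmission
in timeslots of roughly one MTU's duration.]

    LINK_BPS = 10e9    # assumed 10 Gbps host links
    MTU_BYTES = 1500   # assumed MTU-sized timeslots
    HOSTS = 1000       # assumed busy hosts

    timeslot_s = MTU_BYTES * 8 / LINK_BPS
    print("timeslot:", timeslot_s * 1e6, "us")             # 1.2 us
    print("allocations/sec: %.2e" % (HOSTS / timeslot_s))  # ~8.3e8
    # A scheduler that converges slower than about a microsecond is
    # reacting to traffic that no longer exists.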