* [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Jim Reisert AD1C @ 2014-07-18 13:48 UTC
To: cerowrt-devel
"Fastpass is a datacenter network framework that aims for high
utilization with zero queueing. It provides low median and tail
latencies for packets, high data rates between machines, and flexible
network resource allocation policies. The key idea in Fastpass is
fine-grained control over packet transmission times and network
paths."
Read more at....
http://fastpass.mit.edu/
--
Jim Reisert AD1C, <jjreisert@alum.mit.edu>, http://www.ad1c.us
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: David Lang @ 2014-07-18 23:20 UTC
To: Jim Reisert AD1C; +Cc: cerowrt-devel
On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:
> "Fastpass is a datacenter network framework that aims for high
> utilization with zero queueing. It provides low median and tail
> latencies for packets, high data rates between machines, and flexible
> network resource allocation policies. The key idea in Fastpass is
> fine-grained control over packet transmission times and network
> paths."
>
> Read more at....
>
> http://fastpass.mit.edu/
<sarcasam>
and all it takes is making one central point aware of all the communications
that are going to take place, so that it can coordinate everything.
That is sure to scale to an entire datacenter, and beyond that to the Internet.
</sarcasam>
Now, back in the real world, you can't reliably get that information for a
single host, let alone an entire datacenter.
Not to mention the scalability issues in trying to get that info to a central
point so that it can compute the N! possible paths the data could take, along
with the data from all the other systems, to pack them most efficiently into
the available paths.
Someone may eventually make something useful out of this, but I think it's at
best a typical academic ivory-tower solution.
David Lang
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: David Lang @ 2014-07-18 23:27 UTC
To: Jim Reisert AD1C; +Cc: cerowrt-devel
On Fri, 18 Jul 2014, David Lang wrote:
> On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:
>
>> "Fastpass is a datacenter network framework that aims for high
>> utilization with zero queueing. It provides low median and tail
>> latencies for packets, high data rates between machines, and flexible
>> network resource allocation policies. The key idea in Fastpass is
>> fine-grained control over packet transmission times and network
>> paths."
>>
>> Read more at....
>>
>> http://fastpass.mit.edu/
>
> <sarcasam>
> and all it takes is making one central point aware of all the communications
> that is going to take place so that it can coordinate everything.
>
> That is sure to scale to an entire datacenter, and beyond that to the
> Internet
> </sarcasam>
By the way, this wasn't intended as an attack on Jim or anyone else; please
continue posting such links. It's important to know about even the bad ones,
so that you aren't blindsided by someone referring to them.
David Lang
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Valdis.Kletnieks @ 2014-07-19 0:04 UTC
To: David Lang; +Cc: cerowrt-devel
On Fri, 18 Jul 2014 16:20:28 -0700, David Lang said:
> not to mention the scalability issues in trying to get that info to a central
> point so that it can compute the N! possible paths for the data to take, along
> with the data from all the other systems to pack them most efficiently into the
> avilable paths.
OK, so it converges slower than BGP. But at least it's not subject to wedgies. :)
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Dave Taht @ 2014-07-19 0:23 UTC
To: David Lang; +Cc: cerowrt-devel
On Fri, Jul 18, 2014 at 4:27 PM, David Lang <david@lang.hm> wrote:
> On Fri, 18 Jul 2014, David Lang wrote:
>
>> On Fri, 18 Jul 2014, Jim Reisert AD1C wrote:
>>
>>> "Fastpass is a datacenter network framework that aims for high
>>> utilization with zero queueing. It provides low median and tail
>>> latencies for packets, high data rates between machines, and flexible
>>> network resource allocation policies. The key idea in Fastpass is
>>> fine-grained control over packet transmission times and network
>>> paths."
>>>
>>> Read more at....
>>>
>>> http://fastpass.mit.edu/
>>
>>
>> <sarcasam>
>> and all it takes is making one central point aware of all the
>> communications that is going to take place so that it can coordinate
>> everything.
>>
>> That is sure to scale to an entire datacenter, and beyond that to the
>> Internet
>> </sarcasam>
Your tag is incomplete, so the rest of your argument falls under it too. :)
What I find really puzzling is that this paper makes no reference to the fair
queuing literature at all.
I was even more puzzled by one of the cited papers when it came out, where
what they are implementing is basically just a version of "shortest queue
first":
http://web.stanford.edu/~skatti/pubs/sigcomm13-pfabric.pdf
vs
http://www.internetsociety.org/sites/default/files/pdf/accepted/4_sqf_isoc.pdf
(I can't find the SQF paper I wanted to cite; I think it's in the cites above.)
That paper also didn't cite any of the fair queuing literature, which goes
back to 1989. It uses different terms ("priorities"), different language,
etc., which suggests that literature was never read... and yet, once you
translate the terminology, all the ideas and papers cited in both the MIT and
Stanford papers seem to have clear roots in FQ.
Maybe it takes having to fundamentally change the architecture of the
internet to get an idea published nowadays? You have to make it
unimplementable and non-threatening to the powers-that-be? Studiously
avoid work from the previous decade? Maybe the authors have to speak
in code? Or maybe an idea looks better when multiple people discover
it and describe it different ways, and ultimately get together to
resolve their differences? Don't know...
If you swap pfabric's proposed complete replacement for IP and TCP back out
for a conventional IP architecture, drop their version of shortest queue
first, and substitute nearly any form of fair queuing (SQF is quite good,
though fq_codel seems better), you get similar, possibly even better, results,
and you don't need to change anything important.
Going back to the MIT paper: I liked that they measured "normal" queuing
delay in the data center under those conditions (I'm not looking at the paper
now... 3.4 ms?), showed what could happen if buffering was reduced by a huge
amount (no matter the underlying algorithm doing so), and worked with a very
advanced, clever SDK that rips networking out of the Linux kernel core (and
moves most packet processing into Intel's huge cache), among a few other
things that were very interesting. Yes, the central clock idea is a bit
crazy, but with switching times measured in ns it might actually work over
short (single-rack) distances. And the "incast" problem is really, really
hard, and benefits from this sort of approach. There's a whole new IETF WG
dedicated to it (dclc, I think it's called).
There is often hidden value in many a paper, even if the central idea
is problematic.
In particular, I'd *really love* to rip most of the network stack out of the
kernel and into userspace. And I really like the idea of writable hardware
that can talk to virtual memory from userspace (the Zynq can).
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: David Lang @ 2014-07-19 0:31 UTC
To: Valdis.Kletnieks; +Cc: cerowrt-devel
On Fri, 18 Jul 2014, Valdis.Kletnieks@vt.edu wrote:
> On Fri, 18 Jul 2014 16:20:28 -0700, David Lang said:
>> not to mention the scalability issues in trying to get that info to a central
>> point so that it can compute the N! possible paths for the data to take, along
>> with the data from all the other systems to pack them most efficiently into the
>> avilable paths.
>
> OK, so it converges slower than BGP. But at least it's not subject to wedgies. :)
The question is whether it converges faster than the traffic changes. :-)
If you are really aiming for zero queuing throughout the network with the links
all running at full speed, you have no time to respond to new traffic.
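To put a rough number on that (a back-of-envelope, not from the paper): at
10GbE a full 1500-byte frame occupies the wire for only (1500 * 8) / 10^9 s,
about 1.2 microseconds. A scheduler that wants to place every packet into an
explicit timeslot has to get the request to the arbiter, make the allocation,
and get the answer back on a budget of that order, before the traffic mix it
planned for has already changed.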
David Lang
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Dave Taht @ 2014-07-19 0:36 UTC
To: David Lang; +Cc: cerowrt-devel
Oops, I meant to post this on this thread, not the other:
http://www.eecs.berkeley.edu/~sylvia/cs268-2014/papers//FQ1989.pdf
Reading it now, it is a model of clarity that states things we've had to
discover anew.
Also, at the time, VJ's congestion control work was new. I love history; I
love trying to understand things in context...
Now that I've finally found this one, it would be great to find the
papers it cites too.
On Fri, Jul 18, 2014 at 5:23 PM, Dave Taht <dave.taht@gmail.com> wrote:
> [full verbatim quote of the previous message trimmed]
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Valdis.Kletnieks @ 2014-07-19 16:41 UTC
To: Dave Taht; +Cc: cerowrt-devel
On Fri, 18 Jul 2014 17:23:24 -0700, Dave Taht said:
> In particular, I'd *really love* to rip most of the network stack out
> of the kernel and into userspace. And I really like the idea of
> writable hardware that can talk to virtual memory from userspace (the
> zynq can)
To misquote Lost Boys, "One thing about living in a microkernel I never could
stomach, all the damn context switches."
Or were you thinking of going entirely in the opposite direction and
offloading to ever-smarter network cards, sort of a combo of segment offload
and zero-copy, on steroids no less?
* Re: [Cerowrt-devel] Fastpass: A Centralized "Zero-Queue" Datacenter Network
From: Dave Taht @ 2014-07-19 17:31 UTC
To: Valdis Kletnieks; +Cc: cerowrt-devel
On Sat, Jul 19, 2014 at 9:41 AM, <Valdis.Kletnieks@vt.edu> wrote:
> On Fri, 18 Jul 2014 17:23:24 -0700, Dave Taht said:
>> In particular, I'd *really love* to rip most of the network stack out
>> of the kernel and into userspace. And I really like the idea of
>> writable hardware that can talk to virtual memory from userspace (the
>> zynq can)
I enjoyed the first comment over here:
https://news.ycombinator.com/item?id=8056001
I stumbled across it in my every-morning Google search for "bufferbloat".
I really hate the fragmentation of the conversation that has happened since
netnews got overwhelmed by spam.
> To misquote Lost Boys, "One thing about living in a microkernel I never could
> stomach, all the damn context switches."
Ha. That movie was filmed in my "hometown" (Santa Cruz), and all the local
extras in it look like that, all the time. Love that movie. I've been on
those railroad tracks.
> Or were you thinking going entirely the opposite direction and offloading to
> ever-smarter network cards, sort of a combo of segment offload and zero-copy,
> on steroids no less?
Offloads are a pain with our current architectures. Certainly I'd like to
make it easier to prototype hardware and to distribute loads to alternate
CPUs, whether they be centralized or distributed.
Things like netmap and Click are interesting, as is all the work on
openswitch and SDN-related technologies.
http://info.iet.unipi.it/~luigi/netmap/
Not for the ultimate speed of it... but if you can move something that is
very hard to do and experiment with in the kernel out to where you can play
with it in userspace, having the VM protections there makes iterating on
ideas much easier. You can prototype things in really high-level languages
(like Python) and prove them out, and that's a good thing. Certainly the
performance won't be there, but if you can clearly identify a core
performance-enhancing piece, you can move it to C or an ASIC later.
I think, incidentally, that HAD microkernels been successful, hardware
support for them would have evolved, and it would be a far better, more
reliable computing world. I note that at the time I was working on things
like Mach (early 90s) I didn't feel this way!, as moving virtual memory
management to userspace incurred such a substantial overhead as to obviate
the advantages. There was plenty of other stuff that was pretty useful to
move to userspace (Plan 9 did it better than Mach), too, but it all got lost
in how hard and slow it was at the time to abstract so much out of the
kernel. (There are good ideas in every bad paper.)
I have since revised my opinion, given how much hassle, finickiness, and
crashiness kernel programming involves in general.
One of my biggest regrets about the "evolution" of computer design over the
last 20 years is that most hardware offloads can only be used in kernel
space. That has made those improvements difficult to code and design for,
since they only work there, and often downright useless, since they can't
easily be used on small amounts of data without excessive context switching.
Secondly, it has led to a division of labor where EEs, in love with the
billions of transistors at their disposal, burn time writing things that
userspace apps can't use. So I'm VERY bugged about that, and was overjoyed at
the prospect, with the Zynq, of being able to "write hardware" and have it
talk through a virtual memory port: if you identify a thing that could be
done better in hardware, you can get at it via VM with no context switches,
which is particularly valuable on a multi-core CPU architecture.
To me the availability of the virtual memory port on the Zynq is the greatest
innovation I've seen in FPGA design in a decade, and it may one day re-unify
the outlook of EEs and userspace programmers to do genuinely useful stuff.
There are zillions of useful things that could be done better, from
userspace, with just a little extra hardware support. Two that have come up
of late are CAM comparisons and echo cancelling, both of which are easy to do
efficiently in hardware, with performance a conventional von Neumann
architecture can't match. Other things on my mind are packet scheduling (as
per the SENIC paper I posted earlier), and much, much more.
And it would be great if the EEs and CS folk started going to the same
parties again.
Last regret of the day: back in the 80s, the LISP machine was a 36-bit tagged
architecture. I loved it. I have been mildly bugged that we didn't use the
top 4 bits of 64-bit architectures for tags; it would make things like
garbage collection so much easier...
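(For anyone who hasn't played with tagged pointers, a minimal sketch of what
faking it in software looks like today, in illustrative C that is not from
any particular collector: the tag rides in the top 4 bits, and since x86-64
insists on canonical addresses you have to mask it off before every single
dereference, which is exactly the per-access cost a tagged architecture would
make free.)

/* Minimal sketch: a 4-bit type tag in the top bits of a 64-bit pointer.
 * Illustrative only. On x86-64 the tag must be stripped before any
 * dereference because the hardware requires canonical addresses; a tagged
 * architecture would do that for free. */
#include <stdint.h>

#define TAG_SHIFT 60
#define TAG_MASK  (UINT64_C(0xF) << TAG_SHIFT)

enum tag { TAG_FIXNUM = 1, TAG_CONS = 2, TAG_STRING = 3 };

static inline void *tag_ptr(void *p, enum tag t)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    return (void *)(uintptr_t)((bits & ~TAG_MASK) | ((uint64_t)t << TAG_SHIFT));
}

static inline enum tag tag_of(void *p)
{
    return (enum tag)(((uint64_t)(uintptr_t)p & TAG_MASK) >> TAG_SHIFT);
}

static inline void *untag(void *p)
{
    /* This masking is the cost paid on every single access. */
    return (void *)(uintptr_t)((uint64_t)(uintptr_t)p & ~TAG_MASK);
}

A collector can then tell a fixnum from a heap reference just by looking at
tag_of(p), without touching memory, which is where the "so much easier" part
comes in.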
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article