[Cerowrt-devel] Ubiquiti QOS
David P. Reed
dpreed at reed.com
Thu May 29 08:11:30 EDT 2014
ECN-style signaling has the right properties ... just like TTL it can provide valid and current sampling of the packet's environment as it travels. The idea is to sample what is happening at a bottleneck for the packet's flow. The bottleneck is the link with the greatest likelihood of a collision among the flows sharing that link.
A control-theoretic estimator of recent collision likelihood is easy to implement at each queue. All active flows would receive that signal, with the busiest ones getting it most quickly. It is also reasonable to count all potentially colliding flows at all outbound queues, and report that count.
The estimator can then provide the signal that each flow responds to.
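
A minimal sketch of such an estimator, assuming that a "collision" means a packet arriving to find the queue already occupied, and that an exponentially weighted moving average stands in for the control-theoretic estimator. The class name, the gain parameter and the probabilistic marking rule are illustrative assumptions, not a worked-out design:

```python
import random


class CollisionEstimator:
    """EWMA estimate of how often an arriving packet finds the queue occupied."""

    def __init__(self, gain=0.1, rng=None):
        self.gain = gain            # assumed smoothing gain, 0 < gain <= 1
        self.likelihood = 0.0       # current estimate, always within [0, 1]
        self.rng = rng or random.Random()

    def on_arrival(self, queue_depth):
        # A packet "collides" if at least one other packet is already queued.
        collided = 1.0 if queue_depth > 0 else 0.0
        # First-order low-pass filter: move the estimate toward the new sample.
        self.likelihood += self.gain * (collided - self.likelihood)
        return self.likelihood

    def should_mark(self):
        # Mark (ECN-style) each packet with probability equal to the estimate.
        return self.rng.random() < self.likelihood
```

Because the mark is applied per packet, a flow that sends more packets through the queue sees marks sooner, which is how the busiest flows would get the signal most quickly.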
The problem of "defectors" is best dealt with by punishment... An aggressive packet-drop policy, under which causing congestion reduces the offender's throughput and increases its latency, is the best kind of answer. Since the router can remember recent flow behavior, it can penalize flows according to what they have recently done.
A Bloom-style filter can remember flow statistics for both of these local policies. A great use for the memory no longer misapplied to buffering....
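
A rough sketch of the memory side, assuming a counting Bloom-style filter (several hashed counters per flow, reading back the minimum) and a hypothetical penalty rule that scales drop probability with how far a flow exceeds a fair share. The names FlowMemory, drop_probability and fair_share are made up for illustration:

```python
import hashlib


class FlowMemory:
    """Counting Bloom-style filter: approximate per-flow counters in fixed memory."""

    def __init__(self, slots=1024, hashes=3):
        self.slots = slots
        self.hashes = hashes
        self.counts = [0] * slots

    def _indexes(self, flow_id):
        # Derive several slot indexes per flow from salted SHA-256 digests.
        found = set()
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{flow_id}".encode()).digest()
            found.add(int.from_bytes(digest[:8], "big") % self.slots)
        return found

    def record(self, flow_id, amount=1):
        # Charge the flow, e.g. once per packet that arrived during congestion.
        for index in self._indexes(flow_id):
            self.counts[index] += amount

    def estimate(self, flow_id):
        # The minimum over the flow's slots never underestimates its true count.
        return min(self.counts[index] for index in self._indexes(flow_id))

    def decay(self):
        # Halve every counter periodically so that only recent behavior counts.
        self.counts = [count // 2 for count in self.counts]


def drop_probability(memory, flow_id, fair_share):
    # Hypothetical penalty: no drops at or below fair_share, certain drop at 2x.
    excess = memory.estimate(flow_id) - fair_share
    return min(1.0, max(0.0, excess / fair_share))
```

Hash collisions can only push an estimate up, so a well-behaved flow is occasionally overcharged; the periodic decay bounds how long any such error (or any penalty) persists.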
Simple?
On May 28, 2014, David Lang <david at lang.hm> wrote:
>On Wed, 28 May 2014, dpreed at reed.com wrote:
>
>> I did not mean that "pacing". Sorry I used a generic term. I meant what my
>> longer description described - a specific mechanism for reducing bunching that
>> is essentially "cooperative" among all active flows through a bottlenecked
>> link. That's part of a "closed loop" control system driving each TCP endpoint
>> into a cooperative mode.
>
>how do you think we can get feedback from the bottleneck node to all the
>different senders?
>
>what happens to the ones who try to play nice if one doesn't, including what
>happens if one isn't just ignorant of the new cooperative mode, but actively
>tries to cheat? (as I understand it, this is the fatal flaw in many of the past
>buffering improvement proposals)
>
>While the in-house router is the first bottleneck that a user's traffic hits, the
>bigger problems happen when the bottleneck is in the peering between ISPs, many
>hops away from any sender, with many different senders competing for the
>available bandwidth.
>
>This is where the new buffering approaches win. If the traffic is below the
>congestion level, they add very close to zero overhead, but when congestion
>happens, they manage the resulting buffers in a way that works better for
>people (allowing short, fast connections to be fast with only a small impact on
>very long connections)
>
>David Lang
>
>> The thing you call "pacing" is something quite different. It is disconnected
>> from the TCP control loops involved, which basically means it is flying blind.
>> Introducing that kind of "pacing" almost certainly reduces throughput, because
>> it *delays* packets.
>>
>> The thing I called "pacing" is in no version of Linux that I know of. Give it
>> a different name: "anti-bunching cooperation" or "timing phase management for
>> congestion reduction". Rather than *delaying* packets, it tries to get packets
>> to avoid bunching only when reducing window size, and doing so by tightening
>> the control loop so that the sender transmits as *soon* as it can, not by
>> delaying sending after the sender dallies around not sending when it can.
>>
>> On Tuesday, May 27, 2014 11:23am, "Jim Gettys" <jg at freedesktop.org> said:
>>
>> On Sun, May 25, 2014 at 4:00 PM, <dpreed at reed.com> wrote:
>>
>> Not that it is directly relevant, but there is no essential reason to require 50 ms. of buffering. That might be true of some particular QOS-related router algorithm. 50 ms. is about all one can tolerate in any router between source and destination for today's networks - an upper-bound rather than a minimum.
>>
>> The optimum buffer state for throughput is 1-2 packets' worth - in other words, if we have an MTU of 1500, 1500-3000 bytes. Only the bottleneck buffer (the input queue to the lowest speed link along the path) should have this much actually buffered. Buffering more than this increases end-to-end latency beyond its optimal state. Increased end-to-end latency reduces the effectiveness of control loops, creating more congestion.
>>
>> The rationale for having 50 ms. of buffering is probably to avoid disruption of bursty mixed flows where the bursts might persist for 50 ms. and then die. One reason for this is that source nodes run operating systems that tend to release packets in bursts. That's a whole other discussion - in an ideal world, source nodes would avoid bursty packet releases by letting the control by the receiver window be "tight" timing-wise. That is, to transmit a packet immediately at the instant an ACK arrives increasing the window. This would pace the flow - current OS's tend (due to scheduling mismatches) to send bursts of packets, "catching up" on sending that could have been spaced out and done earlier if the feedback from the receiver's window advancing were heeded.
>>
>>
>>
>> That is, endpoint network stacks (TCP implementations) can worsen congestion by "dallying". The ideal end-to-end flows occupying a congested router would have their packets paced so that the packets end up being sent in the least bursty manner that an application can support. The effect of this pacing is to move the "backlog" for each flow quickly into the source node for that flow, which then provides back pressure on the application driving the flow, which ultimately is necessary to stanch congestion. The ideal congestion control mechanism slows the sender part of the application to a pace that can go through the network without contributing to buffering.
>>
>> Pacing is in Linux 3.12(?). How long it will take to see widespread deployment is another question, and as for other operating systems, who knows.
>> See: https://lwn.net/Articles/564978/
>>
>>
>> Current network stacks (including Linux's) don't achieve that goal - their pushback on application sources is minimal - instead they accumulate buffering internal to the network implementation.
>> This is much, much less true than it once was. There have been substantial changes in the Linux TCP stack in the last year or two, to avoid generating packets before necessary. Again, how long it will take for people to deploy this on Linux (and implement on other OS's) is a question.
>>
>> This contributes to end-to-end latency as well. But if you think about it, this is almost as bad as switch-level bufferbloat in terms of degrading user experience. The reason I say "almost" is that there are tools, rarely used in practice, that allow an application to specify that buffering should not build up in the network stack (in the kernel or wherever it is). But the default is not to use those APIs, and to buffer way too much.
>>
>> Remember, the network send stack can act similarly to a congested switch (it is a switch among all the user applications running on that node). If there is a heavy file transfer, the file transfer's buffering acts to increase latency for all other networked communications on that machine.
>>
>> Traditionally this problem has been thought of only as a within-node fairness issue, but in fact it has a big effect on the switches in between source and destination due to the lack of dispersed pacing of the packets at the source - in other words, the current design does nothing to stem the "burst groups" from a single source mentioned above.
>>
>> So we do need the source nodes to implement less "bursty" sending stacks. This is especially true for multiplexed source nodes, such as web servers implementing thousands of flows.
>>
>> A combination of codel-style switch-level buffer management and the stack at the sender being implemented to spread packets in a particular TCP flow out over time would improve things a lot. To achieve best throughput, the optimal way to spread packets out on an end-to-end basis is to update the receive window (sending ACK) at the receive end as quickly as possible, and to respond to the updated receive window as quickly as possible when it increases.
>>
>> Just like the "bufferbloat" issue, the problem is caused by applications like streaming video, file transfers and big web pages that the application programmer sees as not having a latency requirement within the flow, so the application programmer does not have an incentive to control pacing. Thus the operating system has got to push back on the applications' flow somehow, so that the flow ends up paced once it enters the Internet itself. So there's no real problem caused by large buffering in the network stack at the endpoint, as long as the stack's delivery to the Internet is paced by some mechanism, e.g. tight management of receive window control on an end-to-end basis.
>>
>> I don't think this can be fixed by cerowrt, so this is out of place here. It's partially ameliorated by cerowrt, if it aggressively drops packets from flows that burst without pacing. fq_codel does this, if the buffer size it aims for is small - but the problem is that the OS stacks don't respond by pacing... they tend to respond by bursting, not because TCP doesn't provide the mechanisms for pacing, but because the OS stack doesn't transmit as soon as it is allowed to - thus building up a burst unnecessarily.
>>
>> Bursts on a flow are thus bad in general. They make congestion happen when it need not.
>> By far the biggest headache is what the Web does to the network. It has turned the web into a burst generator.
>> A typical web page may have 10 (or even more) images. See the "connections per page" plot in the link below.
>> A browser downloads the base page, and then, over N connections, essentially simultaneously downloads those embedded objects. Many/most of them are small in size (4-10 packets). You never even get near slow start.
>> So you get an IW amount of data/TCP connection, with no pacing, and no congestion avoidance. It is easy to observe 50-100 packets (or more) back to back at the bottleneck.
>> This is (in practice) the amount you have to buffer today: that burst of packets from a web page. Without flow queuing, you are screwed. With it, it's annoying, but can be tolerated.
>> I go over this in detail in:
>> http://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/
>> So far, I don't believe anyone has tried pacing the IW burst of packets. I'd certainly like to see that, but pacing needs to be across TCP connections (host pairs) to be possibly effective to outwit the gaming the web has done to the network.
>> - Jim
>>
>> On Sunday, May 25, 2014 11:42am, "Mikael Abrahamsson" <swmike at swm.pp.se> said:
>>
>>
>>
>>> On Sun, 25 May 2014, Dane Medic wrote:
>>>
>>> > Is it true that devices with less than 64 MB can't handle QOS? ->
>>> > https://lists.chambana.net/pipermail/commotion-dev/2014-May/001816.html
>>> >
>>> At gig speeds you need around 50ms worth of buffering. 1 gigabit/s =
>>> 125 megabyte/s meaning for 50ms you need 6.25 megabyte of buffer.
>>>
>>> I also don't see why performance and memory size would be relevant, I'd
>>> say forwarding performance has more to do with CPU speed than anything
>>> else.
>>>
>>> --
>>> Mikael Abrahamsson email: swmike at swm.pp.se
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel at lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel