From: Michael Welzl <michawe@ifi.uio.no>
To: Dave Taht <dave.taht@gmail.com>
Cc: bloat <bloat@lists.bufferbloat.net>
Date: Thu, 21 Aug 2014 21:46:54 +0200
Subject: Re: [Bloat] Remy: Computer-Generated Congestion Control

On 21. aug. 2014, at 19:04, Dave Taht <dave.taht@gmail.com> wrote:

> On Thu, Aug 21, 2014 at 5:21 AM, Michael Welzl <michawe@ifi.uio.no> wrote:
>> Dave,
>>
>> About this point that I've seen you make repeatedly:
>>
>>> My biggest problem with all the work so far is that it starts with a
>>> constant baseline 150 ms or 100 ms RTT, then tries various algorithms
>>> over a wide range of bandwidths, and then elides that base RTT in all
>>> future plots. Unless you read carefully, you don't see that...
>>>
>>> The original bufferbloat experiment was at a physical RTT of 10 ms,
>>> with bidirectional traffic flooding the queues in both directions, on
>>> an asymmetric network. I keep waiting for more experimenters to try
>>> that as a baseline experiment.
>>
>> This sounds like a call for realism, when the point is to measure the
>> things that matter to the system you're investigating.
>
> Heh. It is my wish, certainly, to see the Remy concept extended to
> solving the problems I consider important. The prospect of merely
> warming my feet on a nice 80-core box for a week, and evaluating the
> results, is quite appealing.
>
> I'll gladly trade someone else's code, and someone else's compute
> time, for a vacation (or unemployment, if it works out!). I can also
> arrange for a very hefty cluster to tackle this stuff, if the code
> were public.
>
> I have encouraged Keith to look into the netfpga.org project as a
> possible target for the rules being created by Remy's algorithms.
>
>> E.g., if I'm investigating TCP, and I don't specifically work on ACK
>> behavior (ACK congestion control or something like that), I usually
>> don't care much about the backwards traffic or the asymmetry. Yes, it
>> does influence the measured RTT a bit, but then you argue for using a
>> smaller base RTT, where the impact of this gets smaller too.
>>
>> What matters to the function of congestion controllers is the BDP,
>> and the buffer size.
>
> Which varies based on the real physical RTT to the server. One of the
> biggest problems in nets today is that large flow sources (CDNs,
> HAS (Netflix)) keep moving closer and closer to the end node, with all
> the problems in TCP fairness that short RTTs induce on longer ones,
> like your radio flow below.

Sure.

>> As for real RTTs, I like pinging wwoz.org because it's a radio
>> station in New Orleans. I do sometimes listen to it, and then I get
>> traffic via a TCP connection that has this RTT. Very real. My ping
>> delay is consistently above 110 ms.
>
> I too use a low-rate radio stream to measure the impact of other flows
> on my listening experience.
>
> I'd written this before the bufferbloat project got rolling, before
> I'd even met Jim:
>
> http://nex-6.taht.net/posts/Did_Bufferbloat_Kill_My_Net_Radio/
>
> ... haven't had a problem for 2 years now. :)
>
> I still use the stream-ripping thing for taking stuff on the road.
>
>> On a side note, I can't help but mention that the "original
>> bufferbloat experiment" features ping over FQ... measuring ping all
>> by itself, pretty much :-)
>
> Huh? The original bufferbloat experiments were against a Comcast cable
> modem, and various other devices, lacking FQ entirely. There was a
> YouTube video, a paper, things like "daddy, why is the internet slow
> today?", and various other resources. So the earliest stuff was
> "upload + ping + complaints about the network from the kids", and the
> later simultaneous up + down + ping abstraction was to get the kids
> out of the equation.

Oops, sorry. I referred to something I'd seen presented so often that I
thought it must be the "original experiment". Of course it isn't;
FQ_CoDel wasn't there from day 1 :)  Apologies.

> I agree that up + down + ping is not as useful as we'd like now that
> FQ is on the table.
>
> I have been attempting to formalize where we are now, adding rigor
> based on those early benchmarks, with netperf-wrapper's
> tcp_bidirectional test, the more advanced rrul and rtt_fairness tests,
> and a model in ns3. Given how much bloat we've killed elsewhere, I
> would like to continue improving these tests to properly show
> interactions with truly isochronous streams (there are several tests
> in netperf-wrapper that leverage D-ITG for that now - thx, Toke!), and
> with more bursty yet rate-limited flows like videoconferencing, whose
> structure we've been discussing with various members of the rmcat
> group.
>
> My complaint comes from seeing very low bandwidths, long RTTs, and
> short queue lengths in the materials used to explain bufferbloat,
> like this:

Short queue lengths are of course odd. Low bandwidth is sometimes
convenient to isolate an effect, as it makes it less likely for the CPU,
rwnd, lack of window scaling, or whatever else to be your bottleneck.
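For concreteness, a minimal back-of-the-envelope sketch of the
BDP-vs-buffer-size arithmetic (in Python; the rates, RTTs and the
1500-byte packet size are purely illustrative, not taken from any
particular experiment):

#!/usr/bin/env python3
# Rough BDP arithmetic: how many bytes (and full-sized packets) are
# "in flight" at a given rate and base RTT, and how much delay a fixed
# 100-packet queue adds once it is full.

MTU = 1500  # bytes; a full-sized Ethernet packet

def bdp_bytes(rate_mbit, rtt_ms):
    """Bandwidth-delay product in bytes."""
    return rate_mbit * 1e6 / 8 * (rtt_ms / 1e3)

for rate_mbit in (1.5, 10, 100):
    for rtt_ms in (10, 20, 100, 150):
        bdp = bdp_bytes(rate_mbit, rtt_ms)
        print(f"{rate_mbit:5.1f} Mbit/s @ {rtt_ms:3d} ms RTT: "
              f"BDP = {bdp / 1e3:7.1f} kB = {bdp / MTU:6.1f} packets")

# The same 100-packet queue, expressed as the delay it adds when full:
queue_bytes = 100 * MTU
for rate_mbit in (1.5, 10, 100):
    drain_ms = queue_bytes * 8 / (rate_mbit * 1e6) * 1e3
    print(f"100 x {MTU} B queue at {rate_mbit:5.1f} Mbit/s "
          f"adds up to {drain_ms:6.1f} ms of delay")

At 1.5 Mbit/s a full 100-packet queue of 1500-byte packets is roughly
800 ms of added delay, while at 100 Mbit/s the same queue is only about
12 ms - one reason the same packet count means very different things at
different rates.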
> https://www.udacity.com/wiki/cn/assignment5-buffer-bloat
>
> when in reality:
>
> 1) There is a double bump you see in TCP, for example, at truly
> excessive queue lengths. You also see truly terrible ramp-up
> performance when data and ACK flows compete.
>
> 2) The example assumes byte-to-packet-length equivalence (100 packets
> = 150 kB), which is completely untrue - 100 packets is anywhere in the
> range of 6.4 kB to 150 kB - and a reason why we have much larger
> packet-based queue lengths in commonly deployed real-world equipment
> IS optimizing for ACKs, not data.
>
> If you want to test a byte-based queue length, by all means do so (I
> have gradually come to the opinion that byte-based queues are best),
> but test either packet- or byte-based queues under conditions that are
> ACK-dominated or data-dominated - preferably both, covering all
> scenarios.
>
> 3) It is at 1.5 Mbit/s, not at 5, 10, or 100 Mbit/s. Below 10 Mbit/s
> you see IW10 or IW4 effects, you see ssthresh not being hit, and a
> variety of other TCP-specific issues, rather than the generic
> bufferbloat problem.
>
> 4) The 20 ms in that course is actually reasonable. :)
>
> I don't understand what harm it would have done the student to use
> 1000 packets, and to test up, down, and up + down, at 10 or 100 Mbit/s.
>
> 5) I'd have liked it if the final experiment in the above course
> material had switched to huge queue lengths (say, for example, Cisco's
> 2800 default for their 4500 line), or had explained the real-world
> situation better, so the takeaway was closer to the real size and
> scope of the problem.
>
> Anyway...
>
> Nearly every paper I've read makes one or more of the above
> modifications to what I'd viewed as the original experiment, and I'd
> like to add rigor to the definitions of phrases floating around, like
> "bufferbloat episode", and so on.
>
>> Michael
>
> --
> Dave Täht
>
> NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
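PS: re point 2 above, a quick, purely illustrative sketch of the
packet-count vs. byte-count arithmetic (assuming 64-byte minimum
Ethernet frames for ACKs and 1500-byte frames for data):

#!/usr/bin/env python3
# "100 packets" is not a fixed number of bytes: the same packet-counted
# queue spans a wide byte range (and drain time) depending on whether it
# holds ACK-sized or full-sized packets.

MIN_FRAME = 64    # bytes; roughly an ACK on the wire
MTU_FRAME = 1500  # bytes; a full-sized data packet
QUEUE_PKTS = 100
RATE_MBIT = 1.5   # the rate used in the course assignment above

for name, size in (("ACK-sized", MIN_FRAME), ("full-sized", MTU_FRAME)):
    queue_bytes = QUEUE_PKTS * size
    drain_ms = queue_bytes * 8 / (RATE_MBIT * 1e6) * 1e3
    print(f"{QUEUE_PKTS} {name} packets = {queue_bytes / 1e3:6.1f} kB, "
          f"drains in {drain_ms:6.1f} ms at {RATE_MBIT} Mbit/s")

The same 100 slots hold about 23 times more bytes (and delay) when
filled with full-sized data packets than with ACKs - which is the point
about packet-counted queues being sized for ACKs and then ballooning,
in bytes and milliseconds, once bulk data fills them.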