From: Kathleen Nichols <nichols@pollere.com>
To: "codel@lists.bufferbloat.net" <codel@lists.bufferbloat.net>
Subject: [Codel] missing part of section 6.2 of draft-nichols-tsvwg-codel-02
Date: Tue, 11 Mar 2014 15:32:29 -0700
Message-ID: <531F8EFD.5010807@pollere.com>
It has come to my attention that part of section 6.2 of the draft is
missing, apparently due to passing through several programs in order to
conform to the holy Internet Draft format. Since the missing part is
quite interesting, here it is.
When people without time to mess around with this archaic stuff stop
doing Internet Drafts, the IETF will then only have drafts by people
with time to mess around with this archaic stuff.
Here is how it should read (points a, b, and c were missing):
6.2 CoDel in the datacenter
Nandita Dukkipati’s team at Google was quick to realize that the CoDel
building blocks could be applied to bufferbloat problems in datacenter
servers, not just to Internet routers. The Linux CoDel queueing
discipline (Qdisc) was adapted in three ways to tackle this bufferbloat
problem.
a) The default CoDel action was modified to be direct feedback from the
Qdisc to the TCP layer at dequeue. The direct feedback simply reduces
TCP’s congestion window, just as congestion control would do in the
event of a drop. The scheme falls back to ECN marking or packet drop if
the TCP socket lock cannot be acquired at dequeue.
b) Being located in the server makes it possible to monitor the actual
RTT to use as CoDel’s interval rather than making a “best guess” of RTT.
The CoDel interval is dynamically adjusted by using the maximum TCP
round-trip time (RTT) of those connections sharing the same
Qdisc/bucket. In particular, there is a history entry of the maximum RTT
experienced over the last second. As a packet is dequeued, the RTT
estimate is accessed from its TCP socket. If the estimate is larger than
the current CoDel interval, the CoDel interval is immediately refreshed
to the new value. If the CoDel interval is not refreshed for over a
second, it is decreased to the history entry and the process is
repeated. The use of the dynamic TCP RTT estimate lets the interval
adapt to the actual maximum value currently seen and thus lets the
controller space its drop intervals appropriately.
c) Since the mathematics of computing the set point are invariant, a
target of 5% of the RTT or CoDel interval was used here.
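The interval tracking in (b) and the target in (c) can be sketched
roughly as follows. This is a minimal illustration in Python, not the
Linux Qdisc code. The class name AdaptiveInterval, its fields, and the
on_dequeue() method are hypothetical. The sketch also assumes that the
"history entry" is the largest RTT seen since the interval was last
refreshed or decayed, which the text above does not spell out.

```python
from dataclasses import dataclass

HISTORY_WINDOW = 1.0    # seconds without a refresh before decaying (from the text)
TARGET_FRACTION = 0.05  # target is 5% of the interval (from the text)


@dataclass
class AdaptiveInterval:
    """Hypothetical per-Qdisc/bucket state; all times are in seconds."""
    interval: float            # current CoDel interval
    last_refresh: float = 0.0  # time of the last refresh or decay
    history_max: float = 0.0   # assumed: max RTT seen since last_refresh

    def on_dequeue(self, rtt: float, now: float) -> float:
        """Update the interval from the dequeued packet's socket RTT estimate."""
        self.history_max = max(self.history_max, rtt)
        if rtt > self.interval:
            # A larger RTT immediately refreshes the interval.
            self.interval = rtt
            self.last_refresh = now
            self.history_max = rtt
        elif now - self.last_refresh > HISTORY_WINDOW and self.history_max > 0:
            # No refresh for over a second: fall back to the history entry
            # and start collecting a new one.
            self.interval = self.history_max
            self.last_refresh = now
            self.history_max = 0.0
        return self.interval

    @property
    def target(self) -> float:
        """CoDel target derived from the current interval."""
        return TARGET_FRACTION * self.interval
```

Starting from a 100 ms interval and feeding RTT estimates of 20 ms and
30 ms within the first second leaves the interval unchanged. The first
dequeue after the one-second mark lowers it to 30 ms, and a later 50 ms
estimate raises it again at once.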
Non-data packets were not dropped as these are typically small and
sometimes critical control packets. Being located on the server, there
is no concern with misbehaving users scamming such a policy as there
would be in an Internet router.
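The choice of action described in (a), together with the exemption for
non-data packets, can be sketched as a small decision function. This is
an illustration only. The names Action and choose_action and the three
boolean inputs are hypothetical. Two points are assumptions rather than
statements from the text: ECN marking is tried before dropping in the
fallback path, and exempt non-data packets are simply passed through.

```python
from enum import Enum


class Action(Enum):
    PASS = "pass"                # forward the packet untouched
    REDUCE_CWND = "reduce_cwnd"  # direct feedback to the TCP layer
    ECN_MARK = "ecn_mark"
    DROP = "drop"


def choose_action(is_data: bool, lock_acquired: bool, ecn_capable: bool) -> Action:
    """Pick the response once CoDel has decided to act on a dequeued packet."""
    if not is_data:
        # Non-data packets are not dropped (assumed: passed through as-is).
        return Action.PASS
    if lock_acquired:
        # Socket lock obtained at dequeue: shrink the congestion window directly.
        return Action.REDUCE_CWND
    # Fallback when the lock is unavailable (assumed order: mark, then drop).
    return Action.ECN_MARK if ecn_capable else Action.DROP
```

For example, choose_action(True, False, False) yields Action.DROP, while
any call with is_data=False yields Action.PASS.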
In several datacenter workload benchmarks, which are typically bursty,
CoDel reduced the queueing latencies at the Qdisc, and thereby improved
the mean and 99th-percentile latencies from several tens of milliseconds
to less than one millisecond. The minimum tracking part of the CoDel
framework proved useful in disambiguating “good” queue from “bad”
queue, which was particularly helpful in controlling Qdisc buffers that
are inherently bursty because of TCP Segmentation Offload.