On Fri, Apr 5, 2019 at 3:42 AM Dave Taht wrote:

> I see from the iccrg preso at 7 minutes 55 s in, that there is a test
> described as:
>
>   20 BBRv2 flows
>   starting each 100ms, 1G, 1ms
>   Linux codel with ECN ce_threshold at 242us sojourn time.

Hi, Dave! Thanks for your e-mail.

> I interpret this as
>
>   20 flows, starting 100ms apart
>   on a 1G link
>   with a 1ms transit time
>   and linux codel with ce_threshold 242us

Yes, except that the 1ms is the end-to-end two-way propagation time.

> 0) This is iperf? There is no crypto?

Each flow is a netperf TCP stream, with no crypto.

> 1) "sojourn time" not as in setting the codel target to 242us?
>
> I tend to mentally tie the concept of sojourn time to the target
> variable, not ce_threshold

Right. I didn't mean setting the codel target to 242us. Where the slide
says "Linux codel with ECN ce_threshold at 242us sojourn time" I
literally mean a Linux machine with a codel qdisc configured as:

  codel ce_threshold 242us

This uses the ce_threshold feature added in:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=80ba92fa1a92dea1

... for which the commit message says:

  "A DCTCP enabled egress port simply have a queue occupancy threshold
  above which ECT packets get CE mark. In codel language this translates
  to a sojourn time, so that one doesn't have to worry about bytes or
  bandwidth but delays."

The 242us comes from the serialization delay of 20 packets at 1Gbps.

> 2) In our current SCE work we have repurposed ce_threshold to do sce
> instead (to save on cpu and also to make it possible to fiddle without
> making a userspace api change). Should we instead create a separate
> sce_threshold option to allow for backward compatible usage?

Yes, you would need to maintain the semantics of ce_threshold for
backwards compatibility with users who rely on the current semantics.
IMHO your suggestion to use a separate sce_threshold sounds like the way
to go, if adding SCE to qdiscs in Linux.

> 3) Transit time on your typical 1G link is actually 13us for a big
> packet, why 1ms?

The 1ms is the path's two-way propagation delay ("min RTT"). We run a
range of RTTs in our tests, and this graph happens to be for an RTT of
1ms.

> is that 1ms from netem?

Yes.

> 4) What is the topology here?
>
>   host -> qdisc -> wire -> host?
>
>   host -> qdisc -> wire -> router -> host?

Those two won't work with Linux TCP, because putting the qdisc on the
sender pulls the qdisc delays inside the TSQ control loop, giving
behavior very different from reality (even CUBIC won't bloat if the
network emulation qdiscs are on the sender host). What we use for our
testing is:

  host -> wire -> qdiscs -> host

where "qdiscs" includes netem and whatever AQM is in use, if any.

> 5) What was the result with fq_codel instead?

With fq_codel and the same ECN marking threshold (fq_codel ce_threshold
242us), we see slightly smoother fairness properties (not surprising)
but slightly higher latency. The basic summary:

  retransmits: 0

  flow throughput: [46.77 .. 51.48] Mbit/s

  RTT samples at various percentiles:

       %  |  RTT (ms)
    ------+----------
       0  |   1.009
      50  |   1.334
      60  |   1.416
      70  |   1.493
      80  |   1.569
      90  |   1.655
      95  |   1.725
      99  |   1.902
    99.9  |   2.328
     100  |   6.414

Bandwidth share graphs are attached. (Hopefully the graphs will make it
through the various lists; if not, you can check the bbr-dev group
thread.)

best,
neal
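P.S. To make the flow setup concrete: the 20 flows are started roughly
like this (the receiver hostname and the 60-second duration below are
placeholders, not the exact values from our runs):

  # start 20 netperf TCP_STREAM flows, 100ms apart
  for i in $(seq 1 20); do
    netperf -H "$RECEIVER" -t TCP_STREAM -l 60 &
    sleep 0.1
  done
  wait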
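The 242us number can be checked by hand; assuming 1514-byte Ethernet
frames (the frame size is an assumption here), the serialization delay
of 20 packets at 1Gbps is:

  # 20 frames * 1514 bytes * 8 bits / 10^9 bits/sec
  echo "20 * 1514 * 8 / 1000000000" | bc -l   # -> ~0.000242 s, i.e. ~242us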
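And a rough sketch of the qdisc setup on the emulation box in the
"host -> wire -> qdiscs -> host" topology (the interface names are
placeholders, and the exact placement of the delay vs. the AQM may
differ from our actual testbed):

  # egress toward the receiver: the AQM under test
  tc qdisc replace dev eth1 root codel ce_threshold 242us
  # ... or, for the fq_codel variant above:
  #   tc qdisc replace dev eth1 root fq_codel ce_threshold 242us

  # egress toward the sender: netem adds the propagation delay
  # (here the full 1ms two-way delay is applied on the return path;
  #  splitting it 500us/500us across both directions also works)
  tc qdisc replace dev eth0 root netem delay 1ms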