* Re: [Cake] [Starlink] [Bloat] In loving memory of Dave Täht <3
2025-04-02 19:51 0% ` David P. Reed
@ 2025-04-03 3:28 0% ` the keyboard of geoff goodfellow
0 siblings, 0 replies; 200+ results
From: the keyboard of geoff goodfellow @ 2025-04-03 3:28 UTC (permalink / raw)
To: David P. Reed
Cc: Livingood, Jason, cerowrt-commits, bloat-ietf, Herbert Wolverson,
Make-Wifi-fast, cerowrt-users, libreqos, Jeremy Austin via Rpm,
Frantisek (Frank) Borsik,
Network Neutrality is back! Let´s make the technical
aspects heard this time!,
codel-wireless, cerowrt-devel, bloat, Cake List, codel,
Dave Taht via Starlink, Robert Chacón, Internet-history
vis-a-vis "thinking about how we could get Dave recognized for his
contributions" ➔➔ At The Very Least Dave should immediately
be posthumously nominated to The InternetHallOfFame.org as Dave Most
Certainly Qualifies For "Recognizing the People Who Bring the Internet
to Life"
geoff
On Wed, Apr 2, 2025 at 12:52 PM David P. Reed via Starlink <
starlink@lists.bufferbloat.net> wrote:
> Hi all -
>
>
>
> I've already shared my sadness and appreciation of my good friend Dave on
> LinkedIn.
>
> I met him through Jim Gettys at the beginning of the Bufferbloat
> discovery, and besides our long correspondence, I hope I have given him
> enough support over the years - including introducing him to my network of
> friends, some of whom are on this list. Others he found by himself.
> He's been a one-person social network out there, who got things done
> beyond what institutions seem to be able to do. (And he amazed me by
> managing to get a stodgy IETF crowd to pay attention to the congestion
> control issue, despite much institutional resistance, and academic
> networking researchers who never got the point). Of course, Jason Livingood
> worked behind the scenes very hard to bypass corporate resistance, too.
>
> Also, I can share something that few knew about - I brought Dave into an
> ex parte policy discussion at the FCC about an idea being promoted that the
> FCC should require all routers the FCC certified to have a complete "locked
> down" configuration that could not be changed by users. I got brought in
> because of my FCC TAC involvement around Software Defined Radio. But the
> folks behind the proposal were just using that as an excuse - they really
> wanted to block WISPs by raising the cost of WiFi routers. Dave, who knew
> better than anyone why re-flashing routers made them MORE secure and could
> explain it in a disarming way to lawyers and policymakers, managed to get
> the commissioners to understand that security wasn't something the FCC
> could certify, and also why commercial routers weren't at all secure. He
> was so much better at explaining things in what you might call an inclusive,
> folksy way that he changed the FCC's approach significantly - away from
> Certifying Security entirely. (The SDR issue ended up not being relevant to
> routers, though SDR is still a complex policy issue that is holding back
> innovation in wireless systems.) I'm certain Dave has had much impact of
> this sort.
>
>
>
> However, Dave's passing is very frustrating to me because of two things:
>
>
>
> 1) there is no one who can replace Dave. The things he made happen will
> continue, but he was only getting started on issues like improving WiFi.
> Again, the resistance to improving WiFi is both institutional and
> corporate, and researchers won't challenge the institutional and corporate
> shibboleths that get in the way of solving critical problems in the 802.11
> implementation and systems architecture domain. (Unfortunately, WiFi has
> become a political term that is being used by "wireless" operators and
> their suppliers to fight for or against monopoly control of the airwaves,
> very parallel to the problems of getting engineering solutions on Internet
> fabric that deal with congestion. So it can't be done in the institutions
> and corporations focused away from the engineering challenges. That's why
> Dave was needed.)
>
> 2) I was thinking about how we could get Dave recognized for his
> contributions. Like other unsung heroes, Dave didn't work for BBN or some
> other moneyed entity who would commission a book or a memorial. (BBN paid
> Katie Hafner to write the text that later turned into her book "Where
> Wizards Stay Up Late", which oddly only talked about the ARPANET/Internet
> pioneers who worked for BBN, omitting many of my Internet colleagues.)
> Dave wasn't the kind of guy that gets Awards from the Computer History
> Museum or the ACM or IEEE. He wasn't beloved at IETF or ISOC that I know
> of. He's in the category of folks like Noel Chiappa or Bram Cohen or
> Richard Stallman or Aaron Swartz - people I think really changed the way we
> think about computing and internetworking, but who won't be in the official
> histories.
>
> I was hoping (before this week) to try to
>
> On Wednesday, April 2, 2025 09:59, "Livingood, Jason via Cake" <
> cake@lists.bufferbloat.net> said:
>
> > Very sad news indeed! I had the pleasure of working closely with Dave
> for 15
> > years. He was generous with his time and had a unique way of bringing
> people
> > together to make the internet better for everyone!
> >
> >
> > I had to go down memory lane to recall when I first really started
> working with
> > him. It may have been around 2010 or so. In 2012, I started sending
> funds his way
> > via my day job to help him and his merry network of collaborators work
> to develop
> > the CoDel AQM.
> >
> >
> > Funding him was not necessarily easy, as Dave had a unique way of
> working and was
> > best when he had complete autonomy and only loosely outlined goals -
> typically
> > hard to sell in a big company. But he could make things happen, so it
> worked. And
> > I knew when he started complaining about maintenance needs on his boat,
> or the
> > need to recruit a new person to the project, or about a great new (and
> practical!)
> > idea, that it was time to top up his funding. ;-)
> >
> >
> > That initial CoDel support in 2012 was extended to underwrite work on
> his idea to
> > develop RRUL, the first real working latency test that I can remember
> > (https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/
> > <https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/>). He was
> also
> > helpful in introducing me to Simon Kelley, developer of dnsmasq, so we
> could
> > underwrite some IPv6 features in dnsmasq (and Dave convinced Simon to
> come to an
> > IETF meeting to help gather requirements and meet folks).
> >
> >
> > Dave got CoDel working, so we developed a compelling demo of CoDel on a
> DOCSIS
> > network (via a CeroWrt-based router connected to a cable modem) and
> brought him
> > along to IETF-86 in March 2013 in Orlando - see interview with Dave at
> > https://youtu.be/NuHYOu4aAqg?si=p0SJHLNpp_6n7XP9&t=195
> > <https://youtu.be/NuHYOu4aAqg?si=p0SJHLNpp_6n7XP9&t=195>.
> >
> >
> > From 2014-2017, I was able to make additional financial support happen
> for him, so
> > he could do R&D into how to improve buffer bloat in WiFi network links
> and
> > equipment, a project he called "Make WiFi Fast". In 2020-2021 and 2024,
> I found
> > funding for his work again, this time to work on accelerating AQM
> adoption in the
> > real world & work related to the CAKE AQM.
> >
> >
> > Thanks in part to my longstanding collaboration with Dave, tens of
> millions of
> > DOCSIS users in our network have AQM and thus far better network
> responsiveness.
> > The same is true for AQMs he worked on, CeroWrt, LibreQoS, and other
> projects. He
> > succeeded in his goal to make the internet better for everyone!
> >
> >
> > We will miss you, Dave!
> >
> >
> > Jason
> >
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> >
> _______________________________________________
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink
>
--
Geoff.Goodfellow@iconia.com
living as The Truth is True
* Re: [Cake] [Bloat] In loving memory of Dave Täht <3
2025-04-02 13:59 1% ` Livingood, Jason
@ 2025-04-02 19:51 0% ` David P. Reed
2025-04-03 3:28 0% ` [Cake] [Starlink] " the keyboard of geoff goodfellow
0 siblings, 1 reply; 200+ results
From: David P. Reed @ 2025-04-02 19:51 UTC (permalink / raw)
To: Livingood, Jason
Cc: Vint Cerf, Frantisek Borsik, codel-wireless,
Jeremy Austin via Rpm, cerowrt-commits, Make-Wifi-fast, libreqos,
Dave Taht via Starlink, Herbert Wolverson,
Frantisek (Frank) Borsik,
Network Neutrality is back! Let´s make the technical
aspects heard this time!,
codel, cerowrt-devel, bloat, Cake List, bloat-ietf,
cerowrt-users, Robert Chacón
Hi all -
I've already shared my sadness and appreciation of my good friend Dave on LinkedIn.
I met him through Jim Gettys at the beginning of the Bufferbloat discovery, and besides our long correspondence, I hope I have given him enough support over the years - including introducing him to my network of friends, some of whom are on this list. Others he found by himself.
He's been a one-person social network out there, who got things done beyond what institutions seem to be able to do. (And he amazed me by managing to get a stodgy IETF crowd to pay attention to the congestion control issue, despite much institutional resistance, and academic networking researchers who never got the point). Of course, Jason Livingood worked behind the scenes very hard to bypass corporate resistance, too.
Also, I can share something that few knew about - I brought Dave into an ex parte policy discussion at the FCC about an idea being promoted that the FCC should require all routers the FCC certified to have a complete "locked down" configuration that could not be changed by users. I got brought in because of my FCC TAC involvement around Software Defined Radio. But the folks behind the proposal were just using that as an excuse - they really wanted to block WISPs by raising the cost of WiFi routers. Dave, who knew better than anyone why re-flashing routers made them MORE secure and could explain it in a disarming way to lawyers and policymakers, managed to get the commissioners to understand that security wasn't something the FCC could certify, and also why commercial routers weren't at all secure. He was so much better at explaining things in what you might call an inclusive, folksy way that he changed the FCC's approach significantly - away from Certifying Security entirely. (The SDR issue ended up not being relevant to routers, though SDR is still a complex policy issue that is holding back innovation in wireless systems.) I'm certain Dave has had much impact of this sort.
However, Dave's passing is very frustrating to me because of two things:
1) there is no one who can replace Dave. The things he made happen will continue, but he was only getting started on issues like improving WiFi. Again, the resistance to improving WiFi is both institutional and corporate, and researchers won't challenge the institutional and corporate shibboleths that get in the way of solving critical problems in the 802.11 implementation and systems architecture domain. (Unfortunately, WiFi has become a political term that is being used by "wireless" operators and their suppliers to fight for or against monopoly control of the airwaves, very parallel to the problems of getting engineering solutions on Internet fabric that deal with congestion. So it can't be done in the institutions and corporations focused away from the engineering challenges. That's why Dave was needed.)
2) I was thinking about how we could get Dave recognized for his contributions. Like other unsung heroes, Dave didn't work for BBN or some other moneyed entity who would commission a book or a memorial. (BBN paid Katie Hafner to write the text that later turned into her book "Where Wizards Stay Up Late", which oddly only talked about the ARPANET/Internet pioneers who worked for BBN, omitting many of my Internet colleagues.) Dave wasn't the kind of guy that gets Awards from the Computer History Museum or the ACM or IEEE. He wasn't beloved at IETF or ISOC that I know of. He's in the category of folks like Noel Chiappa or Bram Cohen or Richard Stallman or Aaron Swartz - people I think really changed the way we think about computing and internetworking, but who won't be in the official histories.
I was hoping (before this week) to try to
On Wednesday, April 2, 2025 09:59, "Livingood, Jason via Cake" <cake@lists.bufferbloat.net> said:
> Very sad news indeed! I had the pleasure of working closely with Dave for 15
> years. He was generous with his time and had a unique way of bringing people
> together to make the internet better for everyone!
>
>
> I had to go down memory lane to recall when I first really started working with
> him. It may have been around 2010 or so. In 2012, I started sending funds his way
> via my day job to help him and his merry network of collaborators work to develop
> the CoDel AQM.
>
>
> Funding him was not necessarily easy, as Dave had a unique way of working and was
> best when he had complete autonomy and only loosely outlined goals - typically
> hard to sell in a big company. But he could make things happen, so it worked. And
> I knew when he started complaining about maintenance needs on his boat, or the
> need to recruit a new person to the project, or about a great new (and practical!)
> idea, that it was time to top up his funding. ;-)
>
>
> That initial CoDel support in 2012 was extended to underwrite work on his idea to
> develop RRUL, the first real working latency test that I can remember
> (https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/
> <https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/>). He was also
> helpful in introducing me to Simon Kelley, developer of dnsmasq, so we could
> underwrite some IPv6 features in dnsmasq (and Dave convinced Simon to come to an
> IETF meeting to help gather requirements and meet folks).
>
>
> Dave got CoDel working, so we developed a compelling demo of CoDel on a DOCSIS
> network (via a CeroWrt-based router connected to a cable modem) and brought him
> along to IETF-86 in March 2013 in Orlando - see interview with Dave at
> https://youtu.be/NuHYOu4aAqg?si=p0SJHLNpp_6n7XP9&t=195
> <https://youtu.be/NuHYOu4aAqg?si=p0SJHLNpp_6n7XP9&t=195>.
>
>
> From 2014-2017, I was able to make additional financial support happen for him, so
> he could do R&D into how to improve buffer bloat in WiFi network links and
> equipment, a project he called "Make WiFi Fast". In 2020-2021 and 2024, I found
> funding for his work again, this time to work on accelerating AQM adoption in the
> real world & work related to the CAKE AQM.
>
>
> Thanks in part to my longstanding collaboration with Dave, tens of millions of
> DOCSIS users in our network have AQM and thus far better network responsiveness.
> The same is true for AQMs he worked on, CeroWrt, LibreQoS, and other projects. He
> succeeded in his goal to make the internet better for everyone!
>
>
> We will miss you, Dave!
>
>
> Jason
>
* Re: [Cake] [Bloat] In loving memory of Dave Täht <3
[not found] ` <976DC4FC-44CA-4C7E-90E0-DE39B57F01E1@comcast.com>
@ 2025-04-02 13:59 1% ` Livingood, Jason
2025-04-02 19:51 0% ` David P. Reed
0 siblings, 1 reply; 200+ results
From: Livingood, Jason @ 2025-04-02 13:59 UTC (permalink / raw)
To: Vint Cerf, Frantisek Borsik, codel-wireless,
Jeremy Austin via Rpm, cerowrt-commits, Make-Wifi-fast, libreqos,
Dave Taht via Starlink, Herbert Wolverson,
Frantisek (Frank) Borsik,
Network Neutrality is back! Let´s make the technical
aspects heard this time!,
codel, cerowrt-devel, bloat, Cake List, bloat-ietf,
cerowrt-users, Robert Chacón
Very sad news indeed! I had the pleasure of working closely with Dave for 15 years. He was generous with his time and had a unique way of bringing people together to make the internet better for everyone!
I had to go down memory lane to recall when I first really started working with him. It may have been around 2010 or so. In 2012, I started sending funds his way via my day job to help him and his merry network of collaborators work to develop the CoDel AQM.
Funding him was not necessarily easy, as Dave had a unique way of working and was best when he had complete autonomy and only loosely outlined goals - typically hard to sell in a big company. But he could make things happen, so it worked. And I knew when he started complaining about maintenance needs on his boat, or the need to recruit a new person to the project, or about a great new (and practical!) idea, that it was time to top up his funding. ;-)
That initial CoDel support in 2012 was extended to underwrite work on his idea to develop RRUL, the first real working latency test that I can remember (https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/ <https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/>). He was also helpful in introducing me to Simon Kelley, developer of dnsmasq, so we could underwrite some IPv6 features in dnsmasq (and Dave convinced Simon to come to an IETF meeting to help gather requirements and meet folks).
Dave got CoDel working, so we developed a compelling demo of CoDel on a DOCSIS network (via a CeroWrt-based router connected to a cable modem) and brought him along to IETF-86 in March 2013 in Orlando - see interview with Dave at https://youtu.be/NuHYOu4aAqg?si=p0SJHLNpp_6n7XP9&t=195 <https://youtu.be/NuHYOu4aAqg?si=p0SJHLNpp_6n7XP9&t=195>.
From 2014-2017, I was able to make additional financial support happen for him, so he could do R&D into how to improve buffer bloat in WiFi network links and equipment, a project he called "Make WiFi Fast". In 2020-2021 and 2024, I found funding for his work again, this time to work on accelerating AQM adoption in the real world & work related to the CAKE AQM.
Thanks in part to my longstanding collaboration with Dave, tens of millions of DOCSIS users in our network have AQM and thus far better network responsiveness. The same is true for AQMs he worked on, CeroWrt, LibreQoS, and other projects. He succeeded in his goal to make the internet better for everyone!
We will miss you, Dave!
Jason
* Re: [Cake] [PATCH net] sched: sch_cake: add bounds checks to host bulk flow fairness counts
@ 2025-01-07 3:14 1% ` kernel test robot
0 siblings, 0 replies; 200+ results
From: kernel test robot @ 2025-01-07 3:14 UTC (permalink / raw)
To: Toke Høiland-Jørgensen,
Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
Jiri Pirko, Paolo Abeni
Cc: oe-kbuild-all, syzbot+f63600d288bfb7057424, Eric Dumazet,
Jakub Kicinski, Simon Horman, cake, netdev
Hi Toke,
kernel test robot noticed the following build warnings:
[auto build test WARNING on net/main]
url: https://github.com/intel-lab-lkp/linux/commits/Toke-H-iland-J-rgensen/sched-sch_cake-add-bounds-checks-to-host-bulk-flow-fairness-counts/20250106-214156
base: net/main
patch link: https://lore.kernel.org/r/20250106133837.18609-1-toke%40redhat.com
patch subject: [PATCH net] sched: sch_cake: add bounds checks to host bulk flow fairness counts
config: i386-buildonly-randconfig-004-20250107 (https://download.01.org/0day-ci/archive/20250107/202501071052.ZOECqwS9-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250107/202501071052.ZOECqwS9-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501071052.ZOECqwS9-lkp@intel.com/
All warnings (new ones prefixed by >>):
net/sched/sch_cake.c: In function 'cake_dequeue':
>> net/sched/sch_cake.c:1975:37: warning: variable 'dsthost' set but not used [-Wunused-but-set-variable]
1975 | struct cake_host *srchost, *dsthost;
| ^~~~~~~
>> net/sched/sch_cake.c:1975:27: warning: variable 'srchost' set but not used [-Wunused-but-set-variable]
1975 | struct cake_host *srchost, *dsthost;
| ^~~~~~~
vim +/dsthost +1975 net/sched/sch_cake.c
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1970
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1971 static struct sk_buff *cake_dequeue(struct Qdisc *sch)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1972 {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1973 struct cake_sched_data *q = qdisc_priv(sch);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1974 struct cake_tin_data *b = &q->tins[q->cur_tin];
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 @1975 struct cake_host *srchost, *dsthost;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1976 ktime_t now = ktime_get();
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1977 struct cake_flow *flow;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1978 struct list_head *head;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1979 bool first_flow = true;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1980 struct sk_buff *skb;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1981 u64 delay;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1982 u32 len;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1983
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1984 begin:
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1985 if (!sch->q.qlen)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1986 return NULL;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1987
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1988 /* global hard shaper */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1989 if (ktime_after(q->time_next_packet, now) &&
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1990 ktime_after(q->failsafe_next_packet, now)) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1991 u64 next = min(ktime_to_ns(q->time_next_packet),
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1992 ktime_to_ns(q->failsafe_next_packet));
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1993
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1994 sch->qstats.overlimits++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1995 qdisc_watchdog_schedule_ns(&q->watchdog, next);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1996 return NULL;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1997 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1998
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 1999 /* Choose a class to work on. */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2000 if (!q->rate_ns) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2001 /* In unlimited mode, can't rely on shaper timings, just balance
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2002 * with DRR
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2003 */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2004 bool wrapped = false, empty = true;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2005
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2006 while (b->tin_deficit < 0 ||
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2007 !(b->sparse_flow_count + b->bulk_flow_count)) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2008 if (b->tin_deficit <= 0)
cbd22f172df782 Kevin 'ldir' Darbyshire-Bryant 2019-12-18 2009 b->tin_deficit += b->tin_quantum;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2010 if (b->sparse_flow_count + b->bulk_flow_count)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2011 empty = false;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2012
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2013 q->cur_tin++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2014 b++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2015 if (q->cur_tin >= q->tin_cnt) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2016 q->cur_tin = 0;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2017 b = q->tins;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2018
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2019 if (wrapped) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2020 /* It's possible for q->qlen to be
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2021 * nonzero when we actually have no
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2022 * packets anywhere.
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2023 */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2024 if (empty)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2025 return NULL;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2026 } else {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2027 wrapped = true;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2028 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2029 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2030 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2031 } else {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2032 /* In shaped mode, choose:
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2033 * - Highest-priority tin with queue and meeting schedule, or
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2034 * - The earliest-scheduled tin with queue.
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2035 */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2036 ktime_t best_time = KTIME_MAX;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2037 int tin, best_tin = 0;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2038
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2039 for (tin = 0; tin < q->tin_cnt; tin++) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2040 b = q->tins + tin;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2041 if ((b->sparse_flow_count + b->bulk_flow_count) > 0) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2042 ktime_t time_to_pkt = \
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2043 ktime_sub(b->time_next_packet, now);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2044
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2045 if (ktime_to_ns(time_to_pkt) <= 0 ||
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2046 ktime_compare(time_to_pkt,
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2047 best_time) <= 0) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2048 best_time = time_to_pkt;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2049 best_tin = tin;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2050 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2051 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2052 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2053
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2054 q->cur_tin = best_tin;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2055 b = q->tins + best_tin;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2056
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2057 /* No point in going further if no packets to deliver. */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2058 if (unlikely(!(b->sparse_flow_count + b->bulk_flow_count)))
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2059 return NULL;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2060 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2061
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2062 retry:
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2063 /* service this class */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2064 head = &b->decaying_flows;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2065 if (!first_flow || list_empty(head)) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2066 head = &b->new_flows;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2067 if (list_empty(head)) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2068 head = &b->old_flows;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2069 if (unlikely(list_empty(head))) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2070 head = &b->decaying_flows;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2071 if (unlikely(list_empty(head)))
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2072 goto begin;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2073 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2074 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2075 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2076 flow = list_first_entry(head, struct cake_flow, flowchain);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2077 q->cur_flow = flow - b->flows;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2078 first_flow = false;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2079
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2080 /* triple isolation (modified DRR++) */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2081 srchost = &b->hosts[flow->srchost];
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2082 dsthost = &b->hosts[flow->dsthost];
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2083
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2084 /* flow isolation (DRR++) */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2085 if (flow->deficit <= 0) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2086 /* Keep all flows with deficits out of the sparse and decaying
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2087 * rotations. No non-empty flow can go into the decaying
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2088 * rotation, so they can't get deficits
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2089 */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2090 if (flow->set == CAKE_SET_SPARSE) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2091 if (flow->head) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2092 b->sparse_flow_count--;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2093 b->bulk_flow_count++;
712639929912c5 George Amanakis 2019-03-01 2094
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2095 cake_inc_srchost_bulk_flow_count(b, flow, q->flow_mode);
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2096 cake_inc_dsthost_bulk_flow_count(b, flow, q->flow_mode);
712639929912c5 George Amanakis 2019-03-01 2097
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2098 flow->set = CAKE_SET_BULK;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2099 } else {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2100 /* we've moved it to the bulk rotation for
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2101 * correct deficit accounting but we still want
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2102 * to count it as a sparse flow, not a bulk one.
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2103 */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2104 flow->set = CAKE_SET_SPARSE_WAIT;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2105 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2106 }
712639929912c5 George Amanakis 2019-03-01 2107
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2108 flow->deficit += cake_get_flow_quantum(b, flow, q->flow_mode);
712639929912c5 George Amanakis 2019-03-01 2109 list_move_tail(&flow->flowchain, &b->old_flows);
712639929912c5 George Amanakis 2019-03-01 2110
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2111 goto retry;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2112 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2113
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2114 /* Retrieve a packet via the AQM */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2115 while (1) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2116 skb = cake_dequeue_one(sch);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2117 if (!skb) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2118 /* this queue was actually empty */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2119 if (cobalt_queue_empty(&flow->cvars, &b->cparams, now))
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2120 b->unresponsive_flow_count--;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2121
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2122 if (flow->cvars.p_drop || flow->cvars.count ||
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2123 ktime_before(now, flow->cvars.drop_next)) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2124 /* keep in the flowchain until the state has
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2125 * decayed to rest
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2126 */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2127 list_move_tail(&flow->flowchain,
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2128 &b->decaying_flows);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2129 if (flow->set == CAKE_SET_BULK) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2130 b->bulk_flow_count--;
712639929912c5 George Amanakis 2019-03-01 2131
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2132 cake_dec_srchost_bulk_flow_count(b, flow, q->flow_mode);
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2133 cake_dec_dsthost_bulk_flow_count(b, flow, q->flow_mode);
712639929912c5 George Amanakis 2019-03-01 2134
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2135 b->decaying_flow_count++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2136 } else if (flow->set == CAKE_SET_SPARSE ||
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2137 flow->set == CAKE_SET_SPARSE_WAIT) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2138 b->sparse_flow_count--;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2139 b->decaying_flow_count++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2140 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2141 flow->set = CAKE_SET_DECAYING;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2142 } else {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2143 /* remove empty queue from the flowchain */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2144 list_del_init(&flow->flowchain);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2145 if (flow->set == CAKE_SET_SPARSE ||
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2146 flow->set == CAKE_SET_SPARSE_WAIT)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2147 b->sparse_flow_count--;
712639929912c5 George Amanakis 2019-03-01 2148 else if (flow->set == CAKE_SET_BULK) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2149 b->bulk_flow_count--;
712639929912c5 George Amanakis 2019-03-01 2150
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2151 cake_dec_srchost_bulk_flow_count(b, flow, q->flow_mode);
c75152104797f8 Toke Høiland-Jørgensen 2025-01-06 2152 cake_dec_dsthost_bulk_flow_count(b, flow, q->flow_mode);
712639929912c5 George Amanakis 2019-03-01 2153 } else
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2154 b->decaying_flow_count--;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2155
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2156 flow->set = CAKE_SET_NONE;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2157 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2158 goto begin;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2159 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2160
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2161 /* Last packet in queue may be marked, shouldn't be dropped */
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2162 if (!cobalt_should_drop(&flow->cvars, &b->cparams, now, skb,
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2163 (b->bulk_flow_count *
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2164 !!(q->rate_flags &
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2165 CAKE_FLAG_INGRESS))) ||
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2166 !flow->head)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2167 break;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2168
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2169 /* drop this packet, get another one */
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2170 if (q->rate_flags & CAKE_FLAG_INGRESS) {
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2171 len = cake_advance_shaper(q, b, skb,
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2172 now, true);
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2173 flow->deficit -= len;
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2174 b->tin_deficit -= len;
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2175 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2176 flow->dropped++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2177 b->tin_dropped++;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2178 qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb));
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2179 qdisc_qstats_drop(sch);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2180 kfree_skb(skb);
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2181 if (q->rate_flags & CAKE_FLAG_INGRESS)
7298de9cd7255a Toke Høiland-Jørgensen 2018-07-06 2182 goto retry;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2183 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2184
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2185 b->tin_ecn_mark += !!flow->cvars.ecn_marked;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2186 qdisc_bstats_update(sch, skb);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2187
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2188 /* collect delay stats */
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2189 delay = ktime_to_ns(ktime_sub(now, cobalt_get_enqueue_time(skb)));
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2190 b->avge_delay = cake_ewma(b->avge_delay, delay, 8);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2191 b->peak_delay = cake_ewma(b->peak_delay, delay,
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2192 delay > b->peak_delay ? 2 : 8);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2193 b->base_delay = cake_ewma(b->base_delay, delay,
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2194 delay < b->base_delay ? 2 : 8);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2195
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2196 len = cake_advance_shaper(q, b, skb, now, false);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2197 flow->deficit -= len;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2198 b->tin_deficit -= len;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2199
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2200 if (ktime_after(q->time_next_packet, now) && sch->q.qlen) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2201 u64 next = min(ktime_to_ns(q->time_next_packet),
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2202 ktime_to_ns(q->failsafe_next_packet));
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2203
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2204 qdisc_watchdog_schedule_ns(&q->watchdog, next);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2205 } else if (!sch->q.qlen) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2206 int i;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2207
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2208 for (i = 0; i < q->tin_cnt; i++) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2209 if (q->tins[i].decaying_flow_count) {
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2210 ktime_t next = \
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2211 ktime_add_ns(now,
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2212 q->tins[i].cparams.target);
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2213
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2214 qdisc_watchdog_schedule_ns(&q->watchdog,
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2215 ktime_to_ns(next));
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2216 break;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2217 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2218 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2219 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2220
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2221 if (q->overflow_timeout)
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2222 q->overflow_timeout--;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2223
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2224 return skb;
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2225 }
046f6fd5daefac Toke Høiland-Jørgensen 2018-07-06 2226
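For readers tracing the blame output above: the three cake_ewma() calls near the end of the listing keep the per-tin average, peak, and base delay statistics as shift-based integer EWMAs, with the peak tracking upward quickly (shift 2) and decaying slowly (shift 8), and the base doing the reverse. A minimal Python model of that update rule (this mirrors the avg - (avg >> shift) + (sample >> shift) form of cake_ewma() in sch_cake.c; the sample delays are made up for illustration):

```python
def cake_ewma(avg: int, sample: int, shift: int) -> int:
    """Integer EWMA: new = avg - avg/2^shift + sample/2^shift."""
    avg -= avg >> shift
    avg += sample >> shift
    return avg

# Model of the delay-stat updates in cake_dequeue(); delays in ns.
avge_delay = peak_delay = base_delay = 0
for delay in (100_000, 500_000, 80_000):  # made-up packet sojourn times
    avge_delay = cake_ewma(avge_delay, delay, 8)
    # peak rises fast (shift 2) when the sample exceeds it, decays slowly
    peak_delay = cake_ewma(peak_delay, delay,
                           2 if delay > peak_delay else 8)
    # base falls fast when the sample undercuts it, rises slowly
    base_delay = cake_ewma(base_delay, delay,
                           2 if delay < base_delay else 8)
```

The asymmetric shifts are what let peak_delay react to a burst within a few packets while forgetting it over roughly 2^8 samples.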
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [relevance 1%]
* [Cake] Fwd: [NetFPGA-announce] Announcing NetFPGA PLUS 1.0
[not found] <AD02259F-4E80-42B7-9B02-A50023EEF2F7@cl.cam.ac.uk>
@ 2021-09-29 16:21 2% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2021-09-29 16:21 UTC (permalink / raw)
To: Make-Wifi-fast, Cake List
---------- Forwarded message ---------
From: Andrew Moore <andrew.moore@cl.cam.ac.uk>
Date: Fri, Sep 24, 2021 at 2:58 PM
Subject: [NetFPGA-announce] Announcing NetFPGA PLUS 1.0
To: <cl-netfpga-announce@lists.cam.ac.uk>
It is with great excitement we announce the release of NetFPGA PLUS.
NetFPGA PLUS 1.0
NetFPGA PLUS 1.0 has arrived, available in a public repository to all,
links on the netfpga.org website. I’ve reprinted the outline, included
as part of the original announcement, at the bottom of this
newsletter. The overly optimistic timetable fell to the brutal
realities of the last 9 months.
NetFPGA PLUS has been a momentous effort that has largely fallen
to the broad shoulders of the increasingly slim NetFPGA team at
Cambridge; one person in particular deserves much credit for this huge
effort and for us achieving this first release.
On behalf of us all, I thank Yuta Tokusashi who has led the NetFPGA
PLUS work throughout this effort and who has managed this despite the
extraordinary challenges of the last 18 months.
Many critical issues were managed and overcome with the expert
guidance of Noa Zilberman, while release testing and preparation would
not have been possible without the assistance of Salvator Galea.
This entire effort was enabled by many members of the excellent Xilinx
team from Gordon Brebner’s leadership and enthusiasm through to the
phenomenal efforts of the Open-NIC team; notably Yan Zhang, and Chris
Neely, as well as critical advice from Cathal McCabe, part of Xilinx
in Dublin.
My personal thanks and on behalf of the NetFPGA community to each of
them. (I’m excruciatingly aware the moment I send this email I will
realise I’ve not credited a critical member of the team - my apologies
in advance.)
I will leave some details to a future newsletter - in preparation -
but promise it shortly, as soon as we have all caught up on our sleep.
Do check out the new website, thanks to Adam Pettigrew for his efforts
there; and of course do check out the public, openly available, Apache
licensed, NetFPGA PLUS codebase too!
Items planned for the next announcement will include
1. License change for NetFPGA
2. NetFPGA PLUS plans
3. NetFPGA SUME status
Thank you all,
Andrew Moore
on behalf of the NetFPGA team.
[direct copy of the PLUS announcement from the December 2020 NetFPGA newsletter]
5. Announcing NetFPGA PLUS (formerly NetFPGA 2020) - 100Gbps and beyond.
At the ACM SOSR19 keynote, I announced the NetFPGA 2020 project,
taking forward the NetFPGA ecosystem to 100Gbps.
Called NetFPGA PLUS, this work does not require a bespoke NetFPGA
board. Instead, the codebase is designed to work across a number of the
(commodity) Alveo boards that utilise the Xilinx UltraScale+ FPGA
family. This project will provide more options for the NetFPGA
community and more opportunities for NetFPGA work to continue to be
the foundation stone of future education, future designs, future
research, and ongoing success.
At this time, we have been testing across a subset of the Xilinx Alveo
board family: U200, U250, U280, and also the ancestor VCU1525 board.
A typical specification (VCU1525/U200 in this case) is support for two
QSFP28 100G ports, PCIe Gen3 x16 or Gen4 x8, up to 64GB of DDR4, and
an FPGA which sports 2,586K system logic cells, 345Mbit of on-chip
memory, and a great many other features besides. The U250 and U280 are
even higher specification systems.
Built upon the Xilinx Vivado toolchain, the initial release of the
NetFPGA-PLUS system still provides the same nf_datapath architecture
that we know and love. The hybrid approach of using NetFPGA and
Xilinx components brings standard interfaces and board-specific blocks
(e.g., CMAC, PCIe), holds promise of an easier migration between
platforms, while holding constant the NetFPGA datapath and networking
capabilities, alongside host software and the build, test and
simulation infrastructure critical for development.
In the first instance we are focussed upon those users with one or
more Alveo boards in hand (or accessible remotely). The initial
release (due early in the new year) will have the basic reference
designs of NetFPGA-SUME:
- Network Interface Card reference project
- Switch reference project (simple switch and learning switch), and
- IPv4 Router reference project
along with the standard NetFPGA Python3 based simulation and hardware
testing framework.
Also on the planning list (a release for Q3 2021):
- Fully integrated P4 compilation support, to provide an open P4
hardware platform
- MAC/PHY support for QSFP28 to 4xSFP28, permitting up to 8 10/25Gbps ports
- New generation open source network tester capable of many 100Gbps.
_______________________________________________
cl-netfpga-announce mailing list
cl-netfpga-announce@lists.cam.ac.uk
https://lists.cam.ac.uk/mailman/listinfo/cl-netfpga-announce
--
Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw
Dave Täht CEO, TekLibre, LLC
^ permalink raw reply [relevance 2%]
* [Cake] Fwd: [Tech-board-discuss] Reminder: Voting procedures for the Linux Foundation Technical Advisory Board
[not found] <fccbdadc-a57a-f6fe-68d2-0fbac2fd6b81@labbott.name>
@ 2021-09-09 16:58 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2021-09-09 16:58 UTC (permalink / raw)
To: cerowrt-devel, bloat, Cake List
For 5 positions there are presently only 2 nominees. I run for the
Linux Foundation TAB once in a while, just to try and somehow insert
my wish that the LF do more to improve embedded linux development
processes, with no luck. I'm thinking about running,
now, using my starlink story as the canonical example of what's been
going increasingly wrong in that market.
But: would anyone else like to step up and into this particular
blender?
I like very much this new strategy for voting (it turned out I barely
qualified), described below.
---------- Forwarded message ---------
From: Laura Abbott <laura@labbott.name>
Date: Thu, Sep 9, 2021 at 9:49 AM
Subject: [Tech-board-discuss] Reminder: Voting procedures for the
Linux Foundation Technical Advisory Board
To: <ksummit@lists.linux.dev>, linux-kernel@vger.kernel.org
<linux-kernel@vger.kernel.org>,
tech-board-discuss@lists.linuxfoundation.org
<tech-board-discuss@lists.linuxfoundation.org>
Hi,
Reminder that the Linux Foundation Technical Advisory Board (TAB) annual
election will be held virtually during the 2021 Kernel Summit and Linux
Plumbers Conference. Voting will run from September 20th to September
23rd 16:00 GMT-4 (US/Eastern). The voting criteria for the 2021 election
are:
There exist three kernel commits in a mainline or stable released
kernel that both
- Have a commit date in the year 2020 or 2021
- Contain an e-mail address in one of the following tags or merged
tags (e.g. Reviewed-and-tested-by)
-- Signed-off-by
-- Tested-by
-- Reported-by
-- Reviewed-by
-- Acked-by
If you have more than 50 commits that meet this requirement you will
receive a ballot automatically.
If you have between 3 and 49 commits that meet this requirement please
e-mail tab-elections@lists.linuxfoundation.org to request your ballot.
We strongly encourage everyone who meets these criteria to request a
ballot.
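The ballot criteria above amount to scanning commit messages for attribution trailers. A toy filter sketching that check (purely illustrative: the regex, the helper name, and the sample message and addresses are all made up here, and this is not the actual election tooling):

```python
import re

# Tags from the ballot criteria above; merged tags such as
# "Reviewed-and-tested-by" contain one of these as a substring.
QUALIFYING = ("signed-off-by", "tested-by", "reported-by",
              "reviewed-by", "acked-by")

def commit_qualifies(message: str, email: str) -> bool:
    """True if some qualifying tag line in the commit message names email."""
    for line in message.lower().splitlines():
        m = re.match(r"\s*([a-z-]+-by):(.*)", line)
        if m and email.lower() in m.group(2):
            if any(tag in m.group(1) for tag in QUALIFYING):
                return True
    return False

# Hypothetical commit message for illustration:
msg = """net: fix a thing

Reviewed-and-tested-by: Jane Dev <jane@example.org>
Signed-off-by: Someone Else <x@example.com>"""
```

Merged tags such as Reviewed-and-tested-by qualify here because they contain one of the plain tags as a substring, matching the "tags or merged tags" wording above.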
We will be using Condorcet Internet Voting
Service (CIVS) https://civs1.civs.us/ . This is a voting service
focused on security and privacy. There are sample polls on the
website if you would like to see what a ballot will look like.
If you have any questions please e-mail
tab-elections@lists.linuxfoundation.org.
Thanks,
Laura
P.S. Please also consider this another reminder to consider running for
the TAB as well
_______________________________________________
Tech-board-discuss mailing list
Tech-board-discuss@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/tech-board-discuss
--
Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw
Dave Täht CEO, TekLibre, LLC
^ permalink raw reply [relevance 1%]
* [Cake] Fwd: Update | Starlink Beta
[not found] ` <CANmPVK-wsLrn4bp+pJ8j4K-ZYxQfVYqDQSBPLPKoK02KXdHBow@mail.gmail.com>
@ 2021-04-06 14:50 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2021-04-06 14:50 UTC (permalink / raw)
To: bloat, cerowrt-devel, Make-Wifi-fast, Cake List
[-- Attachment #1: Type: text/plain, Size: 3997 bytes --]
Send a resume to:
*starlinksoftwarejobs@spacex.com <starlinksoftwarejobs@spacex.com>*
If anyone here would like to apply. Got no idea what "dynamic frame
allocation" is, but they are still testing out as quite bloated....
---------- Forwarded message ---------
From: myfriendontheinside <myfriendontheinside@gmail.com>
Date: Mon, Apr 5, 2021 at 10:18 PM
Subject: Fwd: Update | Starlink Beta
To: Dave Taht <dave.taht@gmail.com>
Starlink update email
---------- Forwarded message ---------
From: Starlink <no-reply@starlink.com>
Date: Mon, Apr 5, 2021, 7:23 PM
Subject: Update | Starlink Beta
[image: Starlink Logo]
Throughout the beta program, customer feedback has helped drive some of our
most important changes to date as we continue to test and scale the network.
The Starlink team has implemented a number of improvements since our last
update. Below are some of the key highlights:
*Starlink Expansion*
Since rollout of initial U.S. service in October 2020, Starlink now offers
limited beta service in Canada, U.K., Germany and New Zealand. To date, we
have deposits from almost every country around the world; going forward,
our ability to expand service will be driven in large part by governments
granting us licensing internationally.
*Preventative Maintenance*
Recently some beta users saw short but more frequent outages, particularly
in the evening hours. This was caused by two main issues: preventive
maintenance on various ground gateways, coupled with a network logic bug
that intermittently caused some packet processing services to hang until
they were reset. The good news is fixes were implemented and users should
no longer see this particular issue.
*Gateway Availability*
As more users come online, the team is seeing an increase in surges of
activity, particularly during peak hours. The gateway infrastructure to
support these types of surges is in place, but we are awaiting final
regulatory approval to use all available channels. Near term fixes have
been implemented to facilitate better load balancing in the interim, and
this issue will fully resolve once all approvals are received.
*Dynamic Frame Allocation*
The Starlink software team recently rolled out our dynamic frame allocation
feature which dynamically allocates additional bandwidth to beta users
based on real time usage. This feature enables the network to better
balance load and deliver higher speeds to the user.
*Connecting to the Best Satellite*
Today, your Starlink speaks to a single satellite assigned to your terminal
for a particular period of time. In the future, if communication with your
assigned satellite is interrupted for any reason, your Starlink will
seamlessly switch to a different satellite, resulting in far fewer network
disruptions. There can only be one satellite connected to your Starlink at
any time, but this feature will allow for choice of the best satellite.
This feature will be available to most beta users in April and is expected
to deliver one of our most notable reliability improvements to date.
These upgrades are part of our overall effort to build a network that not
only reaches underserved users, but also performs significantly better than
traditional satellite internet.
To that end, the Starlink team is always looking for great software,
integration and network engineers. If you want to help us build the
internet in space, please send your resume to *starlinksoftwarejobs@spacex.com
<starlinksoftwarejobs@spacex.com>*.
Thank you for your feedback and continued support!
The Starlink Team
Space Exploration Technologies Corp | 1 Rocket Road, Hawthorne, CA 90250 |
Unsubscribe
Questions? See Starlink FAQs <https://www.starlink.com/faq>
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
[-- Attachment #2: Type: text/html, Size: 20087 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Cerowrt-devel] wireguard almost takes a bullet
@ 2021-03-31 16:08 1% ` Theodore Ts'o
0 siblings, 0 replies; 200+ results
From: Theodore Ts'o @ 2021-03-31 16:08 UTC (permalink / raw)
To: David P. Reed
Cc: Dave Taht, Cake List, Make-Wifi-fast, Jason A. Donenfeld,
cerowrt-devel, bloat
On Tue, Mar 30, 2021 at 09:23:50PM -0400, David P. Reed wrote:
>
> On the other hand, they are pretty damn high salaries for a
> non-profit. Are they appropriate? Depends. There are no stockholders
> and no profits, just a pretty substantial net worth.
For better or for worse, senior software engineers who work for Big
Tech will be making similar amounts of money. Without being
inappropriately specific, my total compensation isn't that different
from Linus's, once you take things like equity compensation into
account. This was true both in my current employer, as well as my
previous employer (IBM).
Is that right or wrong? Unfortunately, it is what it is. Part of it
is that it's very much a supply and demand question. I know, because
I've tried to find talented file system kernel developers for my
team.... and it's hard to find them.
I know that in many ways, it's hugely unfair. When I was at IBM, I
was the high powered senior developer who could meet with the senior
technical leaders at a major US defense contractor. I was the one
with the TS/SCI security clearance, and yes, I was the senior
architect who got an IBM award for my team's work on real-time Linux
capable of running real-time Java for use in the DDG(1000) Zumwalt
class destroyer. And yet, the vast majority of the work was done by
much more junior engineers, and they didn't get any of the major
awards, and their salary was much less. It was one of the best teams
I had the pleasure of working with, and I'm glad to see that they are
now working in much more senior roles at other companies.
So while it's easy to criticize the Linux Foundation, it's by no means
unique to it, and to the extent that it's part of a larger tech
ecosystem that pays engineers in a very disproportionate way, it has
to pay its people commensurate with what they could get if they had
decided to go work for a company like Facebook or Amazon.
> Regarding the organizaton of "Linux, Inc." as a hierachical control
> structure - I'll just point out that hierarchical control of the
> development of Linux suggests that it is not at all a "community
> project" (if it ever was). It's a product development organization
> with multiple levels of management.
So I'd argue that *any* successful, very large open source project
needs to have multiple levels of management. These are *technical*
managers, and I would point out that it's really servant
leadership more than anything else. I may be the ext4 maintainer, but
that means that in order to make ext4 successful, I end up doing the
work that no one else finds "fun" to do, or that companies
aren't willing to pay engineers to do as part of their day job. So
for example, the test framework[1] for ext4 is something I had to create
and maintain, because no one else would do it. And code review is not
necessarily *fun*, but someone has to do it. Much of this work actually
happens late at night or on weekend, on my own time, because I care
enough about it that it's something I *choose* to do it.
[1] https://thunk.org/gce-xfstests
So if your definition of a "community project" is one which has a
non-scalable governance structure, I'm going to have to disagree. In
the early 90's, when I was first getting started with Linux, there were
attempts from senior leaders at NetBSD and GNU HURD who tried to woo
me to do work for their kernel instead. Let's just say that even
then, after seeing the toxicity/drama of those governance structures,
(and it didn't help that living in Cambridge, I had the ability to
meet and break bread with some of these senior people face-to-face),
one of the primary *reasons* why I declined to work on *BSD and HURD
was due to the leaders of those projects that I would have had to work
with. This despite the fact that my first OS/systems programming
experience, courtesy of MIT Project Athena, was on BSD 4.3.
> Yet the developers are employees of a small number of major
> corporations. In this sense, it is like a "joint venture" among
> those companies.
>
> To the extent that those companies gain (partial) control of the
> Linux kernel, as appears to be the case, I think Linux misrepresents
> itself as a "community project", and in particular, the actual users
> of the software may have little say in the direction development
> takes going forwards.
There are certainly still developers in Linux that are hobbyists, and
not everyone works for Big Tech. In fact, Jason worked at a startup
that was certainly not what I would call an example of Big Tech.
Sure, his startup let him spend a significant amount of his time
working on getting WireGuard upstream, but WireGuard was very much
accepted on the basis of the merits of his work. It was not because
someone paid $$$ to the Linux Foundation, or anything crazy like that.
I also started out as a hobbyist. For a long time, being tech lead
for Kerberos at MIT, and doing IETF standards work (e.g., I was ipsec
working group chair) was my day job, and Linux as my hobby. Sure, I
was the first North American Linux kernel developer, and that got me
invited to a bunch of conferences who were willing to pay my travel
expenses (since I was a starving academic), but I was paid a very
small salary compared to industry. (We were wondering why MIT kept on
losing people to industry, so my department brought in a salary
consultant who determined that MIT was paying its people at the tenth
percentile of industry at that time.) I doubled my salary when I went
to work for a startup, and given that VA Linux Systems imploded before
I was able to sell most of my stock, that figure was *before* any
equity compensation.
Some of the people who were smarter than me, at least in terms of
deciding to go out into industry much sooner, and who were able to
benefit from Red Hat's IPO, have multiple expensive houses and can
travel between them as the ski seasons open up. And I also know
people working in Indiana contributing to Linux who make a tiny
fraction of what one can make in Big Tech. I try not to get envious
over those who have done financially much better than I, and I also
try not to think that I'm inherently superior just because I've been
incredibly blessed and lucky and have a very privileged existence.
Life is unfair, and all you can do is to try to your best to make the
world a better place than when you entered it.
> There's little safeguard, for example, against "senior management"
> biases in support of certain vendors, if other vendors are excluded
> from effective participation by one of many techniques. In other
> words, there's no way it can be a level playing field for
> innovation.
The safeguard is in the maintainers' hands. Remember, we "own" our
subsystems and to the extent that we are passionate to let it be
successful, we'll take the help from whereever we can get it. I might
be at Company A one year, and Company B another, and if I take crappy
code from one Company, I'll end up owning that crap and I'll
ultimately need to fix it later, perhaps when I'm at another company.
It is true that as someone who manages volunteers (regardless of
whether such volunteers are hobbyists or people who are doing the work
paid by a certain company), we can't force someone to implement a
particular feature or fix a certain bug. As I learned from my service
on the IETF, the only power of such leaders is to say "No". But if we
stop a good feature from getting in, that ultimately is going to be to
the detriment of our subsystem.
And if that does happen for some reason, one of the roles that Linus
plays is as an authority that someone can appeal to. I've never seen
a support for some CPU architecture get denied just because it might
threaten existing "Big Companys", for example. I'm sure the ARM SOC's
weren't happy to see RISC-V support land in the kernel. But if there
was an attempt to keep RISC-V out of the kernel, that's a case where
Linus would intervene, since ultimately it's *his* choice to accept a
new subsystem and a new maintainer.
> (one that is not transparent at all about functioning as such -
> preferring to give the impression that the kernel is developed by
> part-time voluntary "contributions").
Actually, the Linux Foundation has been quite transparent about this;
every few years, it releases a "Who Writes Linux Report". Anyone who
had such an impression hasn't been paying attention:
https://www.linuxfoundation.org/wp-content/uploads/linux-kernel-report-2017.pdf
https://www.linuxfoundation.org/wp-content/uploads/2020_kernel_history_report_082720.pdf
From these reports, you'll see that in 2017 we had 8.2% of the
contributions coming from people who weren't getting financial
contributions (with another 4.1% where the source of financial support
couldn't be determined). This was down from the 2013 report, where
14.6% of the contributions came from hobbyists.
In the 2020 report, "None" was 11.95%, with the next highest
contributor being Intel at 10%, Red Hat at 8.9%, "Unknown" at 4%, and
IBM at 3.8%. (Google was much farther down the list, at 2.8%). Not
to put too fine a point on it, "people who receive no financial
contributions" are #1 on the "Top 20 committers list".
> The contrast with other open source communities is quite sharp
> now. There is little eleemosynary intent that can be detected any
> more. I think that is too bad, but things change.
If you look at the members of the Git, Perl and Python communities, I
believe you'll find that most of the major contributors do work for
companies that pay for at least part of their open source
contributions. Given that most people do enjoy having food with their
meals, if a OSS project is successful, this really shouldn't be a
surprise.
It is true that there is a huge long tail of OSS projects which have
not been successful, and which only exist as abandonware on
SourceForge or GitHub. (Or in the case of OpenOffice, as part of the
Apache Consortium :-P) But just as the vast majority of startups end
up imploding, with less than 1% becoming the IPO unicorns, I'm not so
sure it's anything more than sour grapes for people to claim that the
startups which made it big were never "real startups" to begin with,
and that the story of startups is all a Big Lie.
Cheers,
- Ted
^ permalink raw reply [relevance 1%]
* Re: [Cake] quantum configuration
@ 2021-01-26 15:46 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2021-01-26 15:46 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Luca Muscariello, Cake List
I have kind of thought we could scale the quantum much higher as we
crack 200Mbit to see if that helps performance.
On Fri, Jul 24, 2020 at 5:26 AM Toke Høiland-Jørgensen via Cake
<cake@lists.bufferbloat.net> wrote:
>
> Luca Muscariello <muscariello@ieee.org> writes:
>
> > Is there a reason why in cake the quantum cannot be configured to a
> > different value like in fq_codel?
>
> I think this was mostly to be as no-knob as possible; so the quantum is
> auto-scaled with the tin bandwidths, instead of being configurable.
>
> Jonathan can probably expand on this...
>
> -Toke
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Bloat] New board that looks interesting
2020-12-18 23:48 1% ` Aaron Wood
@ 2021-01-04 2:11 1% ` Dean Scarff
0 siblings, 0 replies; 200+ results
From: Dean Scarff @ 2021-01-04 2:11 UTC (permalink / raw)
To: cake
Any stats on how much power it pulled during your tests and when idle?
On Fri, 18 Dec 2020 15:48:46 -0800, Aaron Wood wrote:
> I have, finally. It's been running for a week or so, now.
>
> OpenWRT was an _adventure_. The board is UEFI, not standard BIOS.
> And while it will merrily boot OpenWRT's non-UEFI images off of USB,
> it won't boot the non-UEFI setup from the internal storage (I'm using
> the eMMC). So _that_ was fun (and I made some dumb mistakes that
> were especially fun to correct).
>
> But it's running OpenWRT 19.07 (and a UEFI bootloader before grub
> that's from ToT OpenWRT).
>
> Anyway, I have cake running, 950Mbps ingress and 35Mbps egress (modem
> is provisioned at 1.3G ingress, and a bit over 35Mbps egress).
> fq_codel was defaulted, in multi-queue mode. While I'm using cake
> on ingress, my local link hasn't been hitting the limiter very often:
>
> Tin 0
> thresh 950Mbit
> target 1.5ms
> interval 30.0ms
> pk_delay 22us
> av_delay 9us
> sp_delay 2us
> backlog 0b
> pkts 243608193
> bytes 250748364896
> way_inds 13167720
> way_miss 1245030
> way_cols 0
> drops 1075
> marks 101
> ack_drop 0
> sp_flows 0
> bk_flows 1
> un_flows 0
> max_len 69876
> quantum 1514
>
> Given that most of the hosts that I interact with are only about
> 10-15ms away, I'm probably going to change the interval target to
> better match that.
>
> Interestingly, while it has a pair of multiqueue NICs (i211s), the
> igb driver isn't configuring them for RSS. Both output queues are
> being used, but not the ingress queues:
>
> wan interface:
>
> tx_queue_0_packets: 56635989
> tx_queue_1_packets: 39777210
> rx_queue_0_packets: 243646072
> rx_queue_1_packets: 0
>
> lan interface:
>
> tx_queue_0_packets: 85047897
> tx_queue_1_packets: 162004500
> rx_queue_0_packets: 111174855
> rx_queue_1_packets: 0
>
> Since I have housemates that don't appreciate me messing with the
> network during their meetings, I haven't gotten around to poking more
> deeply at that (or at experimenting with running cake on two ingress
> queues).
>
> That being said, I bench-tested this before I put it into operation
> and was able to see 940Mbps of iperf goodput through cake and NAT...
> Took all of a core, though (and that core was still cold and
> therefore
> potentially able to boost to 2.5GHz). I haven't determined how long
> it will take to thermally throttle, and if bandwidth suffers as a
> result.
>
> Pretty happy with it so far, though.
^ permalink raw reply [relevance 1%]
* Re: [Cake] ECN not working?
@ 2020-12-22 20:15 0% ` Jonathan Morton
0 siblings, 0 replies; 200+ results
From: Jonathan Morton @ 2020-12-22 20:15 UTC (permalink / raw)
To: xnor; +Cc: cake
> On 22 Dec, 2020, at 10:06 pm, xnor <xnoreq@gmail.com> wrote:
>
> The client initiates the IPv4 TCP connection with:
> IP Differentiated Services Field: 0x02 (DSCP: CS0, ECN: ECT(0))
> TCP Flags: 0x0c2 (SYN, ECN, CWR)
> Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM=1
>
> The server responds:
> Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
> Flags: 0x012 (SYN, ACK)
> Seq=0 Ack=1 Win=64240 Len=0 MSS=1460 SACK_PERM=1 WS=128
>
> Shouldn't the server respond with ECT set in the SYN ACK packet
> and possibly also have ECN-related flags set in the TCP header?
Not all servers have ECN support enabled. A SYN-ACK without the ECE bit set indicates it does not. The connection then proceeds as Not-ECT.
I'm reasonably sure Akamai has specifically enabled ECN support. A lot of smaller webservers are probably running with the default passive-mode ECN support as well (i.e. they will negotiate inbound but not initiate outbound).
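The negotiation above can be decoded mechanically from the TCP flag bytes in the trace. A hedged sketch (flag constants from the TCP header layout; `ecn_negotiated` is a hypothetical helper, not from any library):

```python
# TCP flag bits relevant to ECN negotiation (RFC 3168)
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80

def ecn_negotiated(client_syn_flags, server_synack_flags):
    # Client requests ECN by setting ECE+CWR on the SYN;
    # the server accepts by setting ECE (without CWR) on the SYN-ACK.
    asked = (client_syn_flags & (SYN | ECE | CWR)) == (SYN | ECE | CWR)
    accepted = (server_synack_flags & (SYN | ACK | ECE)) == (SYN | ACK | ECE)
    return asked and accepted

# Flags from the trace above: client 0x0c2 (SYN, ECN, CWR),
# server 0x012 (SYN, ACK) -- the server declined ECN.
print(ecn_negotiated(0x0C2, 0x012))  # False
```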
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Bloat] New board that looks interesting
2020-04-27 2:45 1% ` Dave Taht
@ 2020-12-18 23:48 1% ` Aaron Wood
2021-01-04 2:11 1% ` Dean Scarff
0 siblings, 1 reply; 200+ results
From: Aaron Wood @ 2020-12-18 23:48 UTC (permalink / raw)
To: Dave Taht; +Cc: Cake List, David P. Reed, Make-Wifi-fast, bloat
[-- Attachment #1: Type: text/plain, Size: 5526 bytes --]
I have, finally. It's been running for a week or so, now.
OpenWRT was an _adventure_. The board is UEFI, not standard BIOS. And
while it will merrily boot OpenWRT's non-UEFI images off of USB, it won't
boot the non-UEFI setup from the internal storage (I'm using the eMMC). So
_that_ was fun (and I made some dumb mistakes that were especially fun to
correct).
But it's running OpenWRT 19.07 (and a UEFI bootloader before grub that's
from ToT OpenWRT).
Anyway, I have cake running, 950Mbps ingress and 35Mbps egress (modem is
provisioned at 1.3G ingress, and a bit over 35Mbps egress). fq_codel was
defaulted, in multi-queue mode. While I'm using cake on ingress, my local
link hasn't been hitting the limiter very often:
Tin 0
thresh 950Mbit
target 1.5ms
interval 30.0ms
pk_delay 22us
av_delay 9us
sp_delay 2us
backlog 0b
pkts 243608193
bytes 250748364896
way_inds 13167720
way_miss 1245030
way_cols 0
drops 1075
marks 101
ack_drop 0
sp_flows 0
bk_flows 1
un_flows 0
max_len 69876
quantum 1514
Given that most of the hosts that I interact with are only about 10-15ms
away, I'm probably going to change the interval target to better match that.
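For reference, cake exposes this tuning as the `rtt` parameter, from which both `target` and `interval` are derived. A hedged sketch only: `eth0` and the bandwidth figure are assumptions taken from the message above, and the default rtt is 100ms ("internet").

```shell
# Tell cake to assume ~15ms paths instead of the 100ms internet default;
# interval and target scale together from this value.
tc qdisc change dev eth0 root cake bandwidth 950mbit rtt 15ms
```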
Interestingly, while it has a pair of multiqueue NICs (i211s), the igb
driver isn't configuring them for RSS. Both output queues are being used,
but not the ingress queues:
wan interface:
tx_queue_0_packets: 56635989
tx_queue_1_packets: 39777210
rx_queue_0_packets: 243646072
rx_queue_1_packets: 0
lan interface:
tx_queue_0_packets: 85047897
tx_queue_1_packets: 162004500
rx_queue_0_packets: 111174855
rx_queue_1_packets: 0
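One way to inspect whether the NIC exposes multiple RX queues, and how RSS spreads flows across them, is via ethtool. A hedged sketch: `eth0` is a placeholder, and each of these options depends on driver support.

```shell
# Show configured vs. hardware-supported channel (queue) counts
ethtool -l eth0
# Show the RSS indirection table and hash key, if the driver exposes them
ethtool -x eth0
# Request two combined channels (driver/hardware permitting)
ethtool -L eth0 combined 2
```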
Since I have housemates that don't appreciate me messing with the network
during their meetings, I haven't gotten around to poking more deeply at
that (or at experimenting with running cake on two ingress queues).
That being said, I bench-tested this before I put it into operation and was
able to see 940Mbps of iperf goodput through cake and NAT... Took all of a
core, though (and that core was still cold and therefore potentially able
to boost to 2.5GHz). I haven't determined how long it will take to
thermally throttle, and if bandwidth suffers as a result.
Pretty happy with it so far, though.
On Sun, Apr 26, 2020 at 7:46 PM Dave Taht <dave.taht@gmail.com> wrote:
> anyone got around to hacking on this board yet?
>
> On Sat, Apr 4, 2020 at 9:27 AM Aaron Wood <woody77@gmail.com> wrote:
> >
> > The comparison of chipset performance link (to OpemWRT forums) that went
> out had this chip, the J4105 as the fastest. Able to do a gigabit with
> cake (nearly able to do it in both directions).
> >
> > I think this has replaced the apu2 as the board I’m going with as my
> edge router.
> >
> > On Sat, Apr 4, 2020 at 9:10 AM Dave Taht <dave.taht@gmail.com> wrote:
> >>
> >> Historically I've found the "Celeron" chips rather weak, but it's just
> >> a brand. I haven't the foggiest idea how well this variant will
> >> perform.
> >>
> >> The intel ethernet chips are best of breed in linux, however. It's
> >> been my hope that the 211 variant with the timed networking support
> >> would show up in the field (sch_etf) so we could fiddle with that,
> >> (the apu2s aren't using that version) but I cannot for the life of me
> >> remember the right keywords to look it up at the moment. this feature
> >> lets you program when a packet emerges from the driver and is sort of
> >> a whole new ballgame when it comes to scheduling - there hasn't been
> >> an aqm designed for it, and you can do fq by playing tricks with the
> >> sent timestamp.
> >>
> >> All the other features look rather nice on this board.
> >>
> >> On Sat, Apr 4, 2020 at 7:47 AM David P. Reed <dpreed@deepplum.com>
> wrote:
> >> >
> >> > Thanks! I ordered one just now. In my experience, this company does
> rather neat stuff. Their XMOS based microphone array (ReSpeaker) is really
> useful. What's the state of play in Linux/OpenWRT for Intel 9560
> capabilities regarding AQM?
> >> >
> >> > On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com>
> said:
> >> >
> >> > > _______________________________________________
> >> > > Cake mailing list
> >> > > Cake@lists.bufferbloat.net
> >> > > https://lists.bufferbloat.net/listinfo/cake
> >> > > https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
> >> > >
> >> > > quad-core Celeron J4105 1.5-2.5 GHz x64
> >> > > 8GB Ram
> >> > > 2x i211t intel ethernet controllers
> >> > > intel 9560 802.11ac (wave2) wifi/bluetooth chipset
> >> > > intel built-in graphics
> >> > > onboard ARM Cortex-M0 and RPi & Arduino headers
> >> > > m.2 and PCIe adapters
> >> > > <$200
> >> > >
> >> >
> >> >
> >> > _______________________________________________
> >> > Bloat mailing list
> >> > Bloat@lists.bufferbloat.net
> >> > https://lists.bufferbloat.net/listinfo/bloat
> >>
> >>
> >>
> >> --
> >> Make Music, Not War
> >>
> >> Dave Täht
> >> CTO, TekLibre, LLC
> >> http://www.teklibre.com
> >> Tel: 1-831-435-0729
> >
> > --
> > - Sent from my iPhone.
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>
[-- Attachment #2: Type: text/html, Size: 8094 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] NLA_F_NESTED is missing
2020-11-04 5:48 1% ` Dean Scarff
@ 2020-11-04 11:27 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-11-04 11:27 UTC (permalink / raw)
To: Dean Scarff, cake
Dean Scarff <dos@scarff.id.au> writes:
> On Tue, 03 Nov 2020 12:00:55 +0100, Toke Høiland-Jørgensen wrote:
>> Dean Scarff <dos@scarff.id.au> writes:
>>
>>> On Mon, 02 Nov 2020 13:37:00 +0100, Toke wrote:
>>>> Dean Scarff <dos@scarff.id.au> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I've been happily running the out-of-tree sch_cake on my
>>>>> Raspberry
>>>>> Pi
>>>>> since 2015. However, I recently upgraded my kernel (to 5.4.72
>>>>> from
>>>>> Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
>>>>> sch_cake in mainline. Now, when running:
>>>>>
>>>>> sudo /sbin/tc qdisc add dev ppp0 root cake
>>>>>
>>>>> I get the error:
>>>>>
>>>>> Error: NLA_F_NESTED is missing.
>>>>>
>>>>> I get this error with the sch_cake in mainline, and also with
>>>>> sch_cake
>>>>> built out-of-tree. I also get the error with both Debian's
>>>>> iproute2
>>>>> 5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's
>>>>> tc-adv
>>>>> repo.
>>>>>
>>>>> Any ideas on what this error means and how to fix it?
>>>>
>>>> I just tried building a 5.4.72 kernel and couldn't reproduce this,
>>>> so
>>>> it
>>>> seems it's a fault with the raspberry pi kernel; I guess opening a
>>>> bug
>>>> against that would be the way to go?
>>>>
>>>> As for what's actually causing this, I couldn't find anything
>>>> obvious
>>>> that touches this code in the qdisc layer; but I suppose it has
>>>> something to do with the core qdisc netlink parsing code?
>>>>
>>>> -Toke
>>>
>>> Thanks for the data point.
>>>
>>> For the record, the relevant kernel source is:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/net/netlink.h?h=v5.4.72#n1143
>>> and the Pi branch:
>>>
>>> https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20201022-1/include/net/netlink.h#L1143
>>>
>>> It seems very unlikely that the Pi folks are patching the netlink
>>> stuff, so I don't think I'll get much traction there unless I can
>>> call
>>> out something specifically wrong with their patchset.
>>
>> Well, something odd is certainly going on. The error message you're
>> quoting comes from a part of the netlink parsing code (in the kernel)
>> that shouldn't even be hit by the qdisc addition: NLA_F_NESTED
>> parsing
>> is only enabled in 'strict' validation mode, which is not used for
>> qdiscs.
>>
>> So IDK, maybe a compiler issue or a bit that gets set wrong
>> somewhere?
>> Bisecting the kernel may be the only option here, I don't think
>> you're
>> going to find anything in userspace...
>
> Yeah, I came to the same conclusion. I verified the userspace was sane
> via gdb (see earlier post), and I also read through the sch_api.c and
> nlattr.c kernel code and it sure looks impossible for the strict
> validation to be getting hit.
>
> Safe to say this was random corruption: I downgraded the kernel, things
> worked as expected, then I upgraded back to the 5.4.72 and it worked
> too! Interestingly, the problem persisted across reboots (so it wasn't
> just RAM corruption), and all the kernel files also matched their "dpkg"
> MD5s (so it wasn't like the binaries were obviously corrupt on disk).
> I've replaced the Pi's microSD card just to be safe, though... kernel
> corruption is scary.
Ugh, Heisenbugs are the worst! Great to hear you managed to resolve it,
though :)
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] NLA_F_NESTED is missing
2020-11-03 11:00 0% ` Toke Høiland-Jørgensen
@ 2020-11-04 5:48 1% ` Dean Scarff
2020-11-04 11:27 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Dean Scarff @ 2020-11-04 5:48 UTC (permalink / raw)
To: cake
On Tue, 03 Nov 2020 12:00:55 +0100, Toke Høiland-Jørgensen wrote:
> Dean Scarff <dos@scarff.id.au> writes:
>
>> On Mon, 02 Nov 2020 13:37:00 +0100, Toke wrote:
>>> Dean Scarff <dos@scarff.id.au> writes:
>>>
>>>> Hi,
>>>>
>>>> I've been happily running the out-of-tree sch_cake on my
>>>> Raspberry
>>>> Pi
>>>> since 2015. However, I recently upgraded my kernel (to 5.4.72
>>>> from
>>>> Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
>>>> sch_cake in mainline. Now, when running:
>>>>
>>>> sudo /sbin/tc qdisc add dev ppp0 root cake
>>>>
>>>> I get the error:
>>>>
>>>> Error: NLA_F_NESTED is missing.
>>>>
>>>> I get this error with the sch_cake in mainline, and also with
>>>> sch_cake
>>>> built out-of-tree. I also get the error with both Debian's
>>>> iproute2
>>>> 5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's
>>>> tc-adv
>>>> repo.
>>>>
>>>> Any ideas on what this error means and how to fix it?
>>>
>>> I just tried building a 5.4.72 kernel and couldn't reproduce this,
>>> so
>>> it
>>> seems it's a fault with the raspberry pi kernel; I guess opening a
>>> bug
>>> against that would be the way to go?
>>>
>>> As for what's actually causing this, I couldn't find anything
>>> obvious
>>> that touches this code in the qdisc layer; but I suppose it has
>>> something to do with the core qdisc netlink parsing code?
>>>
>>> -Toke
>>
>> Thanks for the data point.
>>
>> For the record, the relevant kernel source is:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/net/netlink.h?h=v5.4.72#n1143
>> and the Pi branch:
>>
>> https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20201022-1/include/net/netlink.h#L1143
>>
>> It seems very unlikely that the Pi folks are patching the netlink
>> stuff, so I don't think I'll get much traction there unless I can
>> call
>> out something specifically wrong with their patchset.
>
> Well, something odd is certainly going on. The error message you're
> quoting comes from a part of the netlink parsing code (in the kernel)
> that shouldn't even be hit by the qdisc addition: NLA_F_NESTED
> parsing
> is only enabled in 'strict' validation mode, which is not used for
> qdiscs.
>
> So IDK, maybe a compiler issue or a bit that gets set wrong
> somewhere?
> Bisecting the kernel may be the only option here, I don't think
> you're
> going to find anything in userspace...
Yeah, I came to the same conclusion. I verified the userspace was sane
via gdb (see earlier post), and I also read through the sch_api.c and
nlattr.c kernel code and it sure looks impossible for the strict
validation to be getting hit.
Safe to say this was random corruption: I downgraded the kernel, things
worked as expected, then I upgraded back to the 5.4.72 and it worked
too! Interestingly, the problem persisted across reboots (so it wasn't
just RAM corruption), and all the kernel files also matched their "dpkg"
MD5s (so it wasn't like the binaries were obviously corrupt on disk).
I've replaced the Pi's microSD card just to be safe, though... kernel
corruption is scary.
^ permalink raw reply [relevance 1%]
* Re: [Cake] NLA_F_NESTED is missing
2020-11-03 1:11 1% ` Dean Scarff
2020-11-03 8:07 1% ` Dean Scarff
@ 2020-11-03 11:00 0% ` Toke Høiland-Jørgensen
2020-11-04 5:48 1% ` Dean Scarff
1 sibling, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-11-03 11:00 UTC (permalink / raw)
To: Dean Scarff, cake
Dean Scarff <dos@scarff.id.au> writes:
> On Mon, 02 Nov 2020 13:37:00 +0100, Toke wrote:
>> Dean Scarff <dos@scarff.id.au> writes:
>>
>>> Hi,
>>>
>>> I've been happily running the out-of-tree sch_cake on my Raspberry
>>> Pi
>>> since 2015. However, I recently upgraded my kernel (to 5.4.72 from
>>> Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
>>> sch_cake in mainline. Now, when running:
>>>
>>> sudo /sbin/tc qdisc add dev ppp0 root cake
>>>
>>> I get the error:
>>>
>>> Error: NLA_F_NESTED is missing.
>>>
>>> I get this error with the sch_cake in mainline, and also with
>>> sch_cake
>>> built out-of-tree. I also get the error with both Debian's
>>> iproute2
>>> 5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's
>>> tc-adv
>>> repo.
>>>
>>> Any ideas on what this error means and how to fix it?
>>
>> I just tried building a 5.4.72 kernel and couldn't reproduce this, so
>> it
>> seems it's a fault with the raspberry pi kernel; I guess opening a
>> bug
>> against that would be the way to go?
>>
>> As for what's actually causing this, I couldn't find anything obvious
>> that touches this code in the qdisc layer; but I suppose it has
>> something to do with the core qdisc netlink parsing code?
>>
>> -Toke
>
> Thanks for the data point.
>
> For the record, the relevant kernel source is:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/net/netlink.h?h=v5.4.72#n1143
> and the Pi branch:
> https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20201022-1/include/net/netlink.h#L1143
>
> It seems very unlikely that the Pi folks are patching the netlink
> stuff, so I don't think I'll get much traction there unless I can call
> out something specifically wrong with their patchset.
Well, something odd is certainly going on. The error message you're
quoting comes from a part of the netlink parsing code (in the kernel)
that shouldn't even be hit by the qdisc addition: NLA_F_NESTED parsing
is only enabled in 'strict' validation mode, which is not used for
qdiscs.
So IDK, maybe a compiler issue or a bit that gets set wrong somewhere?
Bisecting the kernel may be the only option here, I don't think you're
going to find anything in userspace...
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] NLA_F_NESTED is missing
2020-11-03 1:11 1% ` Dean Scarff
@ 2020-11-03 8:07 1% ` Dean Scarff
2020-11-03 11:00 0% ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 200+ results
From: Dean Scarff @ 2020-11-03 8:07 UTC (permalink / raw)
To: cake
On Tue, 03 Nov 2020 12:11:06 +1100, Dean Scarff wrote:
> I should be able to figure it out by poking around in tc with gdb.
I did this, and I confirmed that tc isn't trying to send any nested
attributes. So I think the problem is on the kernel side, since it
seems to be hallucinating attributes it expects to be nested but aren't.
Note that "tc" does send an empty options attribute:
addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
https://salsa.debian.org/debian/iproute2/-/blob/v5.7.0/tc/q_cake.c#L356
It's the same in upstream iproute2 and iproute2-next:
https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/tc/q_cake.c#n356
https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/tree/tc/q_cake.c#n356
This looks valid to me. While I'm less sure about all the other
attributes being added in cake_parse_opt (i.e. whether they should be
nested under TCA_OPTIONS), that's moot in my repro case, because they're
not being set anyway.
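As a hedged illustration of what is at stake on the wire (constants reproduced by hand from the uapi headers; this is a sketch, not the kernel's validation code): NLA_F_NESTED is bit 15 of the 16-bit attribute type field, and strict validation requires it on nested attributes such as TCA_OPTIONS.

```python
import struct

# Netlink attribute header: 16-bit length, then 16-bit type, host-endian
# (little-endian on the Pi). NLA_F_NESTED is bit 15 of the type field.
NLA_F_NESTED = 1 << 15
TCA_OPTIONS = 2

# The empty TCA_OPTIONS attribute tc sends: length 4 (header only), flag clear.
plain = struct.pack("<HH", 4, TCA_OPTIONS)
print(plain.hex())  # 04000200 -- matches the bytes in the gdb dump below

# What strict validation would want for a nested attribute:
nested = struct.pack("<HH", 4, TCA_OPTIONS | NLA_F_NESTED)
print(nested.hex())  # 04000280
```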
---
Interesting parts of the gdb session:
(gdb) run qdisc add dev ppp0 root cake
Starting program: /home/dean/iproute2/tc/tc qdisc add dev ppp0 root
cake
Breakpoint 10, rtnl_talk (rtnl=0xc72d0 <rth>, n=0x7efefb78, answer=0x0)
at libnetlink.c:1048
1048 return __rtnl_talk(rtnl, n, answer, true, NULL);
(gdb) p *rtnl
$14 = {fd = 3, local = {nl_family = 16, nl_pad = 0, nl_pid = 18698,
nl_groups = 0}, peer = {nl_family = 0, nl_pad = 0, nl_pid = 0,
nl_groups = 0}, seq = 1604370876, dump = 1604370876, proto = 0,
dump_fp = 0x0, flags = 0}
(gdb) p *n
$15 = {nlmsg_len = 52, nlmsg_type = 36, nlmsg_flags = 1537, nlmsg_seq =
0,
nlmsg_pid = 0}
(gdb) p sizeof(struct nlmsghdr)
$16 = 16
(gdb) call print_qdisc(n, stdout)
added qdisc cake 0: dev ppp0 root refcnt 0 nonat nowash no-ack-filter
no-split-gso noatm overhead 0
$17 = 0
I've annotated the following to show the structure of the request.
There are only two attributes, TCA_KIND and TCA_OPTIONS, and neither of
those is nested.
(gdb) x/52xb n
nlmsghdr:
0x7efefb78: [0x34] 0x00 0x00 0x00 [0x24] 0x00 0x01 0x06
len=52 RTM_NEWQDISC
0x7efefb80: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
payload:
family header:
0x7efefb88: [0x00][0x00][0x00 0x00][0x05 0x00][0x00 0x00]
family=AF_UNSPEC ifindex=ppp0
pad1 pad2 alignment
0x7efefb90: [0x00 0x00 0x00 0x00][0xff 0xff 0xff 0xff]
handle=0 parent=TC_H_ROOT
attributes:
0x7efefb98: [0x00 0x00 0x00 0x00][0x09 0x00][0x01 0x00]
info=0 rta_len=9 rta_type=TCA_KIND
0x7efefba0: [0x63 0x61 0x6b 0x65 0x00][0x00 0x00 0x00]
rta_data="cake" alignment
0x7efefba8: [0x04 0x00][0x02 0x00]
rta_len=4 rta_type=TCA_OPTIONS
(gdb) up
#1 0x000199a4 in tc_qdisc_modify (cmd=36, flags=1536, argc=0,
argv=0x7efffd70)
at tc_qdisc.c:208
208 if (rtnl_talk(&rth, &req.n, NULL) < 0)
(gdb) p req.t
$19 = {tcm_family = 0 '\000', tcm__pad1 = 0 '\000', tcm__pad2 = 0,
tcm_ifindex = 5, tcm_handle = 0, tcm_parent = 4294967295, tcm_info =
0}
^ permalink raw reply [relevance 1%]
* Re: [Cake] NLA_F_NESTED is missing
2020-11-03 1:14 0% ` Jonathan Morton
@ 2020-11-03 1:51 1% ` Dean Scarff
0 siblings, 0 replies; 200+ results
From: Dean Scarff @ 2020-11-03 1:51 UTC (permalink / raw)
To: cake
On Tue, 3 Nov 2020 03:14:37 +0200, Jonathan Morton wrote:
>> On 1 Nov, 2020, at 12:15 pm, Dean Scarff <dos@scarff.id.au> wrote:
>>
>> Error: NLA_F_NESTED is missing.
>
> Since you're running an up-to-date kernel, you should check you are
> also running up-to-date userspace tools. That flag is associated
> with
> the interface between the two.
>
> - Jonathan Morton
Thanks. I figured the same thing (see my other post today), but if
anything, one of the userspace versions I tested (iproute2 5.9.0) is
*too* new (released Oct 19 for 5.9 kernels, see:
https://lwn.net/Articles/834755/ ). For good measure, I also tested
with Debian's iproute2_5.7.0-1 ;)
Either way though, I can debug the userspace tools, which should get me
to the root cause.
^ permalink raw reply [relevance 1%]
* Re: [Cake] NLA_F_NESTED is missing
2020-11-01 16:53 1% ` Y
@ 2020-11-03 1:14 0% ` Jonathan Morton
2020-11-03 1:51 1% ` Dean Scarff
2 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-11-03 1:14 UTC (permalink / raw)
To: Dean Scarff; +Cc: cake
> On 1 Nov, 2020, at 12:15 pm, Dean Scarff <dos@scarff.id.au> wrote:
>
> Error: NLA_F_NESTED is missing.
Since you're running an up-to-date kernel, you should check you are also running up-to-date userspace tools. That flag is associated with the interface between the two.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] NLA_F_NESTED is missing
@ 2020-11-03 1:11 1% ` Dean Scarff
2020-11-03 8:07 1% ` Dean Scarff
2020-11-03 11:00 0% ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 200+ results
From: Dean Scarff @ 2020-11-03 1:11 UTC (permalink / raw)
To: cake
On Mon, 02 Nov 2020 13:37:00 +0100, Toke wrote:
> Dean Scarff <dos@scarff.id.au> writes:
>
>> Hi,
>>
>> I've been happily running the out-of-tree sch_cake on my Raspberry
>> Pi
>> since 2015. However, I recently upgraded my kernel (to 5.4.72 from
>> Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
>> sch_cake in mainline. Now, when running:
>>
>> sudo /sbin/tc qdisc add dev ppp0 root cake
>>
>> I get the error:
>>
>> Error: NLA_F_NESTED is missing.
>>
>> I get this error with the sch_cake in mainline, and also with
>> sch_cake
>> built out-of-tree. I also get the error with both Debian's
>> iproute2
>> 5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's
>> tc-adv
>> repo.
>>
>> Any ideas on what this error means and how to fix it?
>
> I just tried building a 5.4.72 kernel and couldn't reproduce this, so
> it
> seems it's a fault with the raspberry pi kernel; I guess opening a
> bug
> against that would be the way to go?
>
> As for what's actually causing this, I couldn't find anything obvious
> that touches this code in the qdisc layer; but I suppose it has
> something to do with the core qdisc netlink parsing code?
>
> -Toke
Thanks for the data point.
For the record, the relevant kernel source is:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/net/netlink.h?h=v5.4.72#n1143
and the Pi branch:
https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20201022-1/include/net/netlink.h#L1143
It seems very unlikely that the Pi folks are patching the netlink
stuff, so I don't think I'll get much traction there unless I can call
out something specifically wrong with their patchset.
My current theory (despite the 4 combinations I tried) is that there's
some mismatch between Raspbian/Debian's tc and the kernel (somewhere in
the tc's qdisc code it's calling nla_parse_nested but not setting
nla_type), but it's odd that nobody else can repro. tbh the Debian
patches look pretty innocent too:
https://salsa.debian.org/debian/iproute2/-/tree/558bae88bd0befc1bf3e1070733bafd522e44992/debian/patches
I should be able to figure it out by poking around in tc with gdb.
^ permalink raw reply [relevance 1%]
* Re: [Cake] NLA_F_NESTED is missing
@ 2020-11-01 16:53 1% ` Y
2020-11-03 1:14 0% ` Jonathan Morton
2 siblings, 0 replies; 200+ results
From: Y @ 2020-11-01 16:53 UTC (permalink / raw)
To: cake, Dean Scarff
My Pi doesn't show this error using cake through eth0.
Le dimanche 1 novembre 2020 à 19:15:54 UTC+9, Dean Scarff <dos@scarff.id.au> a écrit :
Hi,
I've been happily running the out-of-tree sch_cake on my Raspberry Pi
since 2015. However, I recently upgraded my kernel (to 5.4.72 from
Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
sch_cake in mainline. Now, when running:
sudo /sbin/tc qdisc add dev ppp0 root cake
I get the error:
Error: NLA_F_NESTED is missing.
I get this error with the sch_cake in mainline, and also with sch_cake
built out-of-tree. I also get the error with both Debian's iproute2
5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's tc-adv
repo.
Any ideas on what this error means and how to fix it?
_______________________________________________
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake, low speed ADSL & fwmark
2020-07-28 16:51 0% ` Jim Geo
2020-07-28 16:54 0% ` Jonathan Morton
@ 2020-07-28 16:56 0% ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-28 16:56 UTC (permalink / raw)
To: Jim Geo, Jonathan Morton; +Cc: cake
Jim Geo <dim.geo@gmail.com> writes:
>>
>> > On 28 Jul, 2020, at 12:41 am, Jim Geo <dim.geo@gmail.com> wrote:
>> >
>> > Thank you for all the efforts you have done to make internet usable.
>> >
>> > I currently use htb & fq_codel in my low speed ADSL 6Mbps downlink/1 Mbps uplink. I use fwmark to control both uplink and downlink with good results in terms of bandwidth allocation. Streaming video is chopping bulk traffic successfully.
>> >
>> > Is setting up cake worth the effort at such low speeds? Would it reduce latency?
>>
>> Cake has a better-quality shaper than HTB does, and a more sophisticated flow-isolation scheme than fq_codel does. These tend to matter more at low speeds, not less. It's also generally easier to set up than a compound qdisc scheme.
>>
>> > Regarding fwmark can you please elaborate more on the calculations performed? Man page is not that helpful.
>> >
>> > My understanding is this:
>> >
>> > I use 1,2,3,4 as marks of traffic.
>> > If I set the mask to 0xffffff[..] the marks will remain unchanged. Then right shifting will occur for the unset bits, so they will land on tins
>> > 1,1,3,1
>> >
>> > Can you please correct me? Is a logical AND performed between the mask and the mark value?
>>
>> Since there's only a few "tins" at a time used in Cake, and the fwmark is a direct mapping into those tins, a narrow mask is probably safer to use than a wide one. The reason for the mask is so you can encode several values into different parts of the mark value. The shift is simply to move the field covered by the mask to the low end of the word, so that it is useful to Cake.
>>
>> For your use case, a mask of 0xF will be completely sufficient. It would allow you to specify mark values of 1-15, to map directly in the first 15 tins used by Cake, or a mark value of 0 to fall back to Cake's default Diffserv handling. None of Cake's tin setups use more than 8 tins, and most use fewer.
>>
>> - Jonathan Morton
>>
>
> Thanks for the info! I've noticed that by using 0xF, marks 1-4 become
> tins 0-3. Tin 0 is special? I assumed it's for bulk traffic. I use
> diffserv8.
Nah, it's just that the fwmark uses 1-indexed tin numbers (because a
mark of 0 is the same as 'unset').
The code in cake_select_tin() that handles the mark is literally just this:
else if (mark && mark <= q->tin_cnt)
tin = q->tin_order[mark - 1];
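A hedged sketch of the overall mask/shift/index behaviour in one place (function and variable names are hypothetical; in the kernel the masking and shifting happen before the snippet above):

```python
def select_tin(fwmark, fwmark_mask, tin_cnt, tin_order):
    """Mimic cake's fwmark-to-tin mapping: mask, shift, 1-indexed lookup."""
    mark = fwmark & fwmark_mask
    if fwmark_mask:
        # shift the masked field down to the low bits of the word
        mark >>= (fwmark_mask & -fwmark_mask).bit_length() - 1
    if 1 <= mark <= tin_cnt:
        return tin_order[mark - 1]
    return None  # mark 0 or out of range: fall back to Diffserv handling

# diffserv8-style identity tin order, mask 0xF: marks 1-4 land in tins 0-3
order = list(range(8))
print([select_tin(m, 0xF, 8, order) for m in range(5)])  # [None, 0, 1, 2, 3]
```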
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] Cake, low speed ADSL & fwmark
2020-07-28 16:51 0% ` Jim Geo
@ 2020-07-28 16:54 0% ` Jonathan Morton
2020-07-28 16:56 0% ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 200+ results
From: Jonathan Morton @ 2020-07-28 16:54 UTC (permalink / raw)
To: Jim Geo; +Cc: cake
> On 28 Jul, 2020, at 7:51 pm, Jim Geo <dim.geo@gmail.com> wrote:
>
> Thanks for the info! I've noticed that by using 0xF, marks 1-4 become
> tins 0-3. Tin 0 is special? I assumed it's for bulk traffic. I use
> diffserv8.
Mark 0 (not tin 0) is special because it corresponds to "no mark set". Otherwise, what you see is what you get, and mark N goes into tin N-1.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Cake, low speed ADSL & fwmark
2020-07-27 22:46 0% ` Jonathan Morton
@ 2020-07-28 16:51 0% ` Jim Geo
2020-07-28 16:54 0% ` Jonathan Morton
2020-07-28 16:56 0% ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 200+ results
From: Jim Geo @ 2020-07-28 16:51 UTC (permalink / raw)
To: Jonathan Morton; +Cc: cake
>
> > On 28 Jul, 2020, at 12:41 am, Jim Geo <dim.geo@gmail.com> wrote:
> >
> > Thank you for all the efforts you have done to make internet usable.
> >
> > I currently use htb & fq_codel in my low speed ADSL 6Mbps downlink/1 Mbps uplink. I use fwmark to control both uplink and downlink with good results in terms of bandwidth allocation. Streaming video is chopping bulk traffic successfully.
> >
> > Is setting up cake worth the effort at such low speeds? Would it reduce latency?
>
> Cake has a better-quality shaper than HTB does, and a more sophisticated flow-isolation scheme than fq_codel does. These tend to matter more at low speeds, not less. It's also generally easier to set up than a compound qdisc scheme.
>
> > Regarding fwmark can you please elaborate more on the calculations performed? Man page is not that helpful.
> >
> > My understanding is this:
> >
> > I use 1,2,3,4 as marks of traffic.
> > If I set the mask to 0xffffff[..] the marks will remain unchanged. Then right shifting will occur for the unset bits, so they will land on tins
> > 1,1,3,1
> >
> > Can you please correct me? Is a logical AND performed between the mask and the mark value?
>
> Since there's only a few "tins" at a time used in Cake, and the fwmark is a direct mapping into those tins, a narrow mask is probably safer to use than a wide one. The reason for the mask is so you can encode several values into different parts of the mark value. The shift is simply to move the field covered by the mask to the low end of the word, so that it is useful to Cake.
>
> For your use case, a mask of 0xF will be completely sufficient. It would allow you to specify mark values of 1-15, to map directly into the first 15 tins used by Cake, or a mark value of 0 to fall back to Cake's default Diffserv handling. None of Cake's tin setups use more than 8 tins, and most use fewer.
>
> - Jonathan Morton
>
Thanks for the info! I've noticed that by using 0xF, marks 1-4 become
tins 0-3. Tin 0 is special? I assumed it's for bulk traffic. I use
diffserv8.
^ permalink raw reply [relevance 0%]
* Re: [Cake] Cake, low speed ADSL & fwmark
2020-07-27 22:46 0% ` Jonathan Morton
@ 2020-07-28 14:52 1% ` Y
1 sibling, 0 replies; 200+ results
From: Y @ 2020-07-28 14:52 UTC (permalink / raw)
To: cake
Hi,all
My situation is/was similar.
I prefer to use cake because it uses less CPU time than htb + fq_codel.
tc qdisc add root dev eth0 cake bandwidth 810kbit pppoa-vcmux diffserv4
ack-filter-aggressive dual-srchost
pi@raspberrypi:~ $ tc -s qdisc show dev eth0
qdisc cake 8023: root refcnt 2 bandwidth 810Kbit diffserv4 dual-srchost
nonat nowash ack-filter-aggressive split-gso rtt 100.0ms atm overhead 10
Sent 18265833249 bytes 23590044 pkt (dropped 7172987, overlimits
53950415 requeues 11)
backlog 1444b 1p requeues 11
memory used: 130147b of 4Mb
capacity estimate: 810Kbit
min/max network layer size: 30 / 1478
min/max overhead-adjusted size: 53 / 1643
average network hdr offset: 14
                   Bulk   Best Effort         Video         Voice
  thresh       50624bit       810Kbit       405Kbit     202496bit
  target        356.5ms        22.3ms        44.6ms        89.1ms
  interval      713.0ms       117.3ms       139.6ms       184.1ms
  pk_delay       62.6ms       132.6ms        13.8ms        86.4ms
  av_delay        9.0ms        42.3ms         7.2ms        14.8ms
  sp_delay        1.3ms         5.5ms         981us         3.6ms
  backlog            0b         1444b            0b            0b
  pkts              369      30744151          8116         10396
  bytes           19926   23924477414        438264       5958198
  way_inds            0       6553855             4             1
  way_miss          250       1048934          4749           205
  way_cols            0             0             0             0
  drops               0       4430387             0             0
  marks               0          7611             0             0
  ack_drop            0       2742600             0             0
  sp_flows            1             4             1             1
  bk_flows            0             2             0             0
  un_flows            0             0             0             0
  max_len            54          2984            54           590
  quantum           300           300           300           300
On 28/07/2020 06:41, Jim Geo wrote:
> Hello,
>
> Thank you for all the efforts you have done to make internet usable.
>
> I currently use htb & fq_codel in my low speed ADSL 6Mbps downlink/1
> Mbps uplink. I use fwmark to control both uplink and downlink with good
> results in terms of bandwidth allocation. Streaming video is chopping
> bulk traffic successfully.
>
> Is setting up cake worth the effort at such low speeds? Would it reduce
> latency?
>
> Regarding fwmark can you please elaborate more on the calculations
> performed? Man page is not that helpful.
>
> My understanding is this:
>
> I use 1,2,3,4 as marks of traffic.
> If I set the mask to 0xffffff[..] the marks will remain unchanged. Then
> right shifting will occur for the unset bits, so they will land on tins
> 1,1,3,1
>
> Can you please correct me? Is a logical AND performed between the mask
> and the mark value?
>
> Thanks,
> Jim
>
>
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake, low speed ADSL & fwmark
@ 2020-07-27 22:46 0% ` Jonathan Morton
2020-07-28 16:51 0% ` Jim Geo
2020-07-28 14:52 1% ` Y
1 sibling, 1 reply; 200+ results
From: Jonathan Morton @ 2020-07-27 22:46 UTC (permalink / raw)
To: Jim Geo; +Cc: cake
> On 28 Jul, 2020, at 12:41 am, Jim Geo <dim.geo@gmail.com> wrote:
>
> Thank you for all the efforts you have done to make internet usable.
>
> I currently use htb & fq_codel in my low speed ADSL 6Mbps downlink/1 Mbps uplink. I use fwmark to control both uplink and downlink with good results in terms of bandwidth allocation. Streaming video is chopping bulk traffic successfully.
>
> Is setting up cake worth the effort at such low speeds? Would it reduce latency?
Cake has a better-quality shaper than HTB does, and a more sophisticated flow-isolation scheme than fq_codel does. These tend to matter more at low speeds, not less. It's also generally easier to set up than a compound qdisc scheme.
> Regarding fwmark can you please elaborate more on the calculations performed? Man page is not that helpful.
>
> My understanding is this:
>
> I use 1,2,3,4 as marks of traffic.
> If I set the mask to 0xffffff[..] the marks will remain unchanged. Then right shifting will occur for the unset bits, so they will land on tins
> 1,1,3,1
>
> > Can you please correct me? Is a logical AND performed between the mask and the mark value?
Since there's only a few "tins" at a time used in Cake, and the fwmark is a direct mapping into those tins, a narrow mask is probably safer to use than a wide one. The reason for the mask is so you can encode several values into different parts of the mark value. The shift is simply to move the field covered by the mask to the low end of the word, so that it is useful to Cake.
For your use case, a mask of 0xF will be completely sufficient. It would allow you to specify mark values of 1-15, to map directly into the first 15 tins used by Cake, or a mark value of 0 to fall back to Cake's default Diffserv handling. None of Cake's tin setups use more than 8 tins, and most use fewer.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 20:04 0% ` Sebastian Moeller
@ 2020-07-25 21:33 0% ` Kevin Darbyshire-Bryant
0 siblings, 0 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-07-25 21:33 UTC (permalink / raw)
To: Sebastian Moeller, David P. Reed; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 10297 bytes --]
Apologies. I think this is a cultural divide and misunderstanding, certainly on my part. Bellhead is uncomfortably close to bellend and in the context of ‘absurd’ I’m afraid I rather took your bellhead to be rather closer to my interpretation of bellend. In essence I thought you were saying I was a bit of a cock, which I probably am but for reasons other than the ones mentioned.
Possibly getting back to the subject: CAKE is probably the fairest thing I know, and it worships the god of (low) latency. When I first heard of cake many years ago, I couldn’t believe the promise of ‘full link’ and ‘low latency’. I’ve subsequently learned it’s all about queues: effective queue management and picking the right packet at the right time to fill the right tx slot. It’s all about latency. It’s all about shooting/marking the right packets to signal correctly and keep the latency of each flow under control. I understand the phrase “It’s the latency, stupid”.
Cake is so fair across flows that it offers deliberate unfairness features, things that bias that fairness, some obvious, some not. The obvious one is using packet categorisation to choose priority levels. Instead of the default ‘every packet is equal’ of besteffort, a choice of categorisations is available, from ‘diffserv3’ (a 3-tier system) to ‘diffserv8’ (an 8-tier system), all designed to introduce an unfairness from ‘least important’ to ‘most important’. The categories, or tins in cake parlance, also have bandwidth thresholds, representing minimum capacities for that tin in the presence of traffic in competing tins.
A less obvious but deliberate unfairness mechanism in cake is ‘host fairness’. This counts the number of flows to/from each host and divides the bandwidth amongst the flows such that each host gets an even share. e.g. 1 host with 9 flows vs 1 host with 1 flow will end up with the 9 flow host getting 50% of bandwidth across those 9 flows whilst the 1 flow host will get 50% of bandwidth across 1 flow. This prevents gaming of bandwidth allocation simply by starting more flows.
Going all the way back to the start of this thread which spoke about ‘more important to de-prioritise’ my domestic use case/problem is to support ‘bittorrent’ but at a lower importance than bulk (my backups) at a lower importance than best effort (browsing) at a lower importance than latency/jitter sensitive video at a lower importance than voip. That’s 5 categories (LE, BK, BE, VI, VO), but it could easily be 4 (LE, BK, BE, VO) with ‘video’ lumped in with Best Effort as is done with diffserv3.
Cake’s tin bandwidth thresholds say ‘you’re allowed to have at least this much’ and in my diffserv5 implementation it’s 1/64th of configured bandwidth, simply ‘cos it can’t really be zero. In the absence of other traffic, Least Effort under CAKE will happily consume ALL the bandwidth: great, nothing more important... bittorrent, you go right ahead. But as soon as something more important comes along, well, that takes (within limits) priority. I think this is called ’soft admission’ but not totally sure.
I apologise if I have incorrectly used bandwidth/capacity/rate/whatever, but hopefully everyone will understand what I’m trying to say.
Kevin
> On 25 Jul 2020, at 21:04, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi David,
>
> I believe that folks here, certainly Kevin, have accepted the domain specificity of diffserv and we are mainly discussing how many tiers of differential latency tolerance we desire ;). It goes without saying that this is all within the "home domain", and the goal is how little/few priority games we need to play for decent performance under load. Sure, end2end signaling would be desirable, but as you point out does not exist in diffserv today and is also not very likely to appear anytime soon, but in one's home it is still relatively easy to identify a few special cases, like bit-torrent (don't get in the way of "real" traffic) or VoIP (quite latency and especially jitter averse, but also typically of modest rate) that could/should be taken care of.
> As far as I can tell that is all the cake/sqm's diffserv modes try to accomplish. DSCPs are simply used, as there exists machinery for routers/end-hosts OS to selectively set/re-set them and all IP packets will carry them.
> Regarding the Bell-headedness, sure I might qualify for that moniker/abuse*, but the relevant factor, IMHO, is not so much the exact rate cut-offs of the different priority tiers, but the simple realization that low latency via prioritization requires relatively low rates, otherwise "priority" traffic will self congest, so these thresholds serve to establish a "cost" for using each priority tier.
>
> Best Regards
> Sebastian
>
> *) By virtue of intellectual laziness, it is simply often easier for me to think in rate shares than the alternatives. But hey, I do this as a hobby, so I cut myself a lot of slack here ;) But I take no offense in being labeled that.
>
>
>> On Jul 25, 2020, at 21:35, David P. Reed <dpreed@deepplum.com> wrote:
>>
>> I want to apologize for any implication that you, Mr. Darbyshire-Bryant, were a "bellhead". AFAIK, you were quoting a statement from the designers of diffserv4, who apparently still believe in bandwidth division as a metric.
>>
>> But I understand it might be painful to hear my critique of the diffserv design process.
>>
>> Just be aware that it's my problem, not yours. I don't mean to offend you. I do, however, feel like the folks who did "design" diffserv (and continue to promote it) completely miss the whole point of why the Internet is architected the way it is. And since they haven't managed to respond to a clue-by-4 yet, I'm tired of just pointing out that the idea doesn't actually achieve any benefits, because no one (literally no one) has ever done a consistent assignment of end-to-end meaning to the various diffserv labels after decades of failed testing.
>>
>> Since this is a group discussion, and not just a response to you, my comment was aimed at the general group (which is not dedicated to bellhead thinking, thank goodness).
>>
>> And to be clear, AQM (cake, being an example) is not about bandwidth allocation. It does focus on latency/queueing-delay, for the most part.
>>
>> Hence my concern that diffserv's fundamental misunderstanding of the responsibility of router queue management might contaminate a very, very important project.
>>
>>
>> On Saturday, July 25, 2020 1:54pm, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
>>
>>> I didn’t sign up for this abuse. Bellhead eh? Well f**k off!
>>>
>>> I’ve had enough - bye.
>>>
>>>> On 25 Jul 2020, at 18:48, David P. Reed <dpreed@deepplum.com> wrote:
>>>>
>>>> This idea (dividing the link rate capacity, since "bandwidth" is an incorrect
>>> term not to be promulgated), is absurd, but typical of "bellhead" thinking.
>>>>
>>>> Per packet latency is the key control variable, even for TCP. That's because
>>> capacity/rate is not controllable by routers, but by routing in a general Internet
>>> situation.
>>>>
>>>> Latency is controlled by queuing delay in a packet network, not bitrate. And
>>> in mixed traffic, which after all is why traffic is classified in the first place,
>>> by its characteristics and response to increased latency end-to-end, is the core
>>> "control" for the internetwork as a whole.
>>>>
>>>> So, by promoting thinking about "bandwidth" a whole sequence of
>>> misformulations of network management is embedded into the thinking of those
>>> designing queue management algorithms.
>>>>
>>>> And make no mistake, queue management is the ONLY knob other than sending
>>> different packets on different routes that one has for routers.
>>>>
>>>> I don't know who proposed this fractional division, but it is clearly a
>>> bellhead-influenced thinker who thinks all protocols are CBR flows like in the old
>>> phone system.
>>>>
>>>> But almost no flows in the internet are CBR flows! File transfers are not,
>>> streaming TV is not, web traffic is not, game traffic is not. Only
>>> non-statistically multiplexed real-time telephony and *some* video conferencing is
>>> CBR.
>>>>
>>>> Yet this bizarre idea of dividing "bandwidth" among all categories of flows
>>> pops up. Probably from employees of phone companies or phone equipment suppliers.
>>> Or folks who went to Uni and were trained in "communications" by former phone
>>> engineers.
>>>>
>>>> Latency, latency, latency. Queue delay, queue delay, queue delay. Not link
>>> speed! Change your brains.
>>>>
>>>> It's hard to fight this bellhead crowd (or the bellheadedness in your own
>>> thinking) but think about packets and queues instead.
>>>>
>>>> My good friend Len Kleinrock didn't invent "Bandwidth Theory"! He invented
>>> Queueing Theory. For a reason.
>>>>
>>>> On Saturday, July 25, 2020 6:12am, "Kevin Darbyshire-Bryant"
>>> <kevin@darbyshire-bryant.me.uk> said:
>>>>
>>>>> _______________________________________________
>>>>> Cake mailing list
>>>>> Cake@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>>
>>>>>
>>>>>> On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant
>>>>> <kevin@darbyshire-bryant.me.uk> wrote:
>>>>>>
>>>>>>
>>>>>> The move from diffserv4 to diffserv5 WAS about de-prioritization.
>>>>>
>>>>> It was also about minimum bandwidth allocations:
>>>>>
>>>>> LE: 1/64th
>>>>> BK: 1/16th
>>>>> BE: 1/1
>>>>> VI: 1/2
>>>>> VO: 1/4
>>>>>
>>>>> So worst case, best effort should get 11/64ths in the extreme case of
>>> all other
>>>>> tins in use.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Kevin D-B
>>>>>
>>>>> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>>>>>
>>>>>
>>>
>>>
>>> Cheers,
>>>
>>> Kevin D-B
>>>
>>> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>>>
>>>
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 19:35 1% ` David P. Reed
2020-07-25 20:04 0% ` Sebastian Moeller
@ 2020-07-25 21:27 0% ` Jonathan Morton
1 sibling, 0 replies; 200+ results
From: Jonathan Morton @ 2020-07-25 21:27 UTC (permalink / raw)
To: David P. Reed; +Cc: Kevin Darbyshire-Bryant, Cake List
> On 25 Jul, 2020, at 10:35 pm, David P. Reed <dpreed@deepplum.com> wrote:
>
> And to be clear, AQM (cake, being an example) is not about bandwidth allocation. It does focus on latency/queueing-delay, for the most part.
Cake is not *just* an AQM, though I understand your point. It is a qdisc with many interwoven functions.
Cake's Diffserv support is neither a pure priority scheme nor a pure bandwidth allocation. By using a hybrid of the two for bandwidth allocation, I was hoping to avoid the main pitfalls that the simple Bell-headed approaches routinely encounter. Each tin also has its own AQM parameters, which feed into the distinction between high-throughput and low-latency classes of traffic.
There are doubtless other approaches that could be tried, of course. And there might be endless debate over exactly how many traffic classes are actually needed; I don't think five is the right number, and the symmetry argument is not persuasive. But can we at least agree that Cake's attempt is a step in the right direction?
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 19:35 1% ` David P. Reed
@ 2020-07-25 20:04 0% ` Sebastian Moeller
2020-07-25 21:33 0% ` Kevin Darbyshire-Bryant
2020-07-25 21:27 0% ` Jonathan Morton
1 sibling, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-07-25 20:04 UTC (permalink / raw)
To: David P. Reed; +Cc: Kevin Darbyshire-Bryant, Cake List
Hi David,
I believe that folks here, certainly Kevin, have accepted the domain specificity of diffserv and we are mainly discussing how many tiers of differential latency tolerance we desire ;). It goes without saying that this is all within the "home domain", and the goal is how little/few priority games we need to play for decent performance under load. Sure, end2end signaling would be desirable, but as you point out does not exist in diffserv today and is also not very likely to appear anytime soon, but in one's home it is still relatively easy to identify a few special cases, like bit-torrent (don't get in the way of "real" traffic) or VoIP (quite latency and especially jitter averse, but also typically of modest rate) that could/should be taken care of.
As far as I can tell that is all the cake/sqm's diffserv modes try to accomplish. DSCPs are simply used, as there exists machinery for routers/end-hosts OS to selectively set/re-set them and all IP packets will carry them.
Regarding the Bell-headedness, sure I might qualify for that moniker/abuse*, but the relevant factor, IMHO, is not so much the exact rate cut-offs of the different priority tiers, but the simple realization that low latency via prioritization requires relatively low rates, otherwise "priority" traffic will self congest, so these thresholds serve to establish a "cost" for using each priority tier.
Best Regards
Sebastian
*) By virtue of intellectual laziness, it is simply often easier for me to think in rate shares than the alternatives. But hey, I do this as a hobby, so I cut myself a lot of slack here ;) But I take no offense in being labeled that.
> On Jul 25, 2020, at 21:35, David P. Reed <dpreed@deepplum.com> wrote:
>
> I want to apologize for any implication that you, Mr. Darbyshire-Bryant, were a "bellhead". AFAIK, you were quoting a statement from the designers of diffserv4, who apparently still believe in bandwidth division as a metric.
>
> But I understand it might be painful to hear my critique of the diffserv design process.
>
> Just be aware that it's my problem, not yours. I don't mean to offend you. I do, however, feel like the folks who did "design" diffserv (and continue to promote it) completely miss the whole point of why the Internet is architected the way it is. And since they haven't managed to respond to a clue-by-4 yet, I'm tired of just pointing out that the idea doesn't actually achieve any benefits, because no one (literally no one) has ever done a consistent assignment of end-to-end meaning to the various diffserv labels after decades of failed testing.
>
> Since this is a group discussion, and not just a response to you, my comment was aimed at the general group (which is not dedicated to bellhead thinking, thank goodness).
>
> And to be clear, AQM (cake, being an example) is not about bandwidth allocation. It does focus on latency/queueing-delay, for the most part.
>
> Hence my concern that diffserv's fundamental misunderstanding of the responsibility of router queue management might contaminate a very, very important project.
>
>
> On Saturday, July 25, 2020 1:54pm, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
>
> > I didn’t sign up for this abuse. Bellhead eh? Well f**k off!
> >
> > I’ve had enough - bye.
> >
> > > On 25 Jul 2020, at 18:48, David P. Reed <dpreed@deepplum.com> wrote:
> > >
> > > This idea (dividing the link rate capacity, since "bandwidth" is an incorrect
> > term not to be promulgated), is absurd, but typical of "bellhead" thinking.
> > >
> > > Per packet latency is the key control variable, even for TCP. That's because
> > capacity/rate is not controllable by routers, but by routing in a general Internet
> > situation.
> > >
> > > Latency is controlled by queuing delay in a packet network, not bitrate. And
> > in mixed traffic, which after all is why traffic is classified in the first place,
> > by its characteristics and response to increased latency end-to-end, is the core
> > "control" for the internetwork as a whole.
> > >
> > > So, by promoting thinking about "bandwidth" a whole sequence of
> > misformulations of network management is embedded into the thinking of those
> > designing queue management algorithms.
> > >
> > > And make no mistake, queue management is the ONLY knob other than sending
> > different packets on different routes that one has for routers.
> > >
> > > I don't know who proposed this fractional division, but it is clearly a
> > bellhead-influenced thinker who thinks all protocols are CBR flows like in the old
> > phone system.
> > >
> > > But almost no flows in the internet are CBR flows! File transfers are not,
> > streaming TV is not, web traffic is not, game traffic is not. Only
> > non-statistically multiplexed real-time telephony and *some* video conferencing is
> > CBR.
> > >
> > > Yet this bizarre idea of dividing "bandwidth" among all categories of flows
> > pops up. Probably from employees of phone companies or phone equipment suppliers.
> > Or folks who went to Uni and were trained in "communications" by former phone
> > engineers.
> > >
> > > Latency, latency, latency. Queue delay, queue delay, queue delay. Not link
> > speed! Change your brains.
> > >
> > > It's hard to fight this bellhead crowd (or the bellheadedness in your own
> > thinking) but think about packets and queues instead.
> > >
> > > My good friend Len Kleinrock didn't invent "Bandwidth Theory"! He invented
> > Queueing Theory. For a reason.
> > >
> > > On Saturday, July 25, 2020 6:12am, "Kevin Darbyshire-Bryant"
> > <kevin@darbyshire-bryant.me.uk> said:
> > >
> > > > _______________________________________________
> > > > Cake mailing list
> > > > Cake@lists.bufferbloat.net
> > > > https://lists.bufferbloat.net/listinfo/cake
> > > >
> > > >
> > > > > On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant
> > > > <kevin@darbyshire-bryant.me.uk> wrote:
> > > > >
> > > > >
> > > > > The move from diffserv4 to diffserv5 WAS about de-prioritization.
> > > >
> > > > It was also about minimum bandwidth allocations:
> > > >
> > > > LE: 1/64th
> > > > BK: 1/16th
> > > > BE: 1/1
> > > > VI: 1/2
> > > > VO: 1/4
> > > >
> > > > So worst case, best effort should get 11/64ths in the extreme case of
> > all other
> > > > tins in use.
> > > >
> > > > Cheers,
> > > >
> > > > Kevin D-B
> > > >
> > > > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> > > >
> > > >
> >
> >
> > Cheers,
> >
> > Kevin D-B
> >
> > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> >
> >
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 17:54 0% ` Kevin Darbyshire-Bryant
@ 2020-07-25 19:35 1% ` David P. Reed
2020-07-25 20:04 0% ` Sebastian Moeller
2020-07-25 21:27 0% ` Jonathan Morton
0 siblings, 2 replies; 200+ results
From: David P. Reed @ 2020-07-25 19:35 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 4724 bytes --]
I want to apologize for any implication that you, Mr. Darbyshire-Bryant, were a "bellhead". AFAIK, you were quoting a statement from the designers of diffserv4, who apparently still believe in bandwidth division as a metric.
But I understand it might be painful to hear my critique of the diffserv design process.
Just be aware that it's my problem, not yours. I don't mean to offend you. I do, however, feel like the folks who did "design" diffserv (and continue to promote it) completely miss the whole point of why the Internet is architected the way it is. And since they haven't managed to respond to a clue-by-4 yet, I'm tired of just pointing out that the idea doesn't actually achieve any benefits, because no one (literally no one) has ever done a consistent assignment of end-to-end meaning to the various diffserv labels after decades of failed testing.
Since this is a group discussion, and not just a response to you, my comment was aimed at the general group (which is not dedicated to bellhead thinking, thank goodness).
And to be clear, AQM (cake, being an example) is not about bandwidth allocation. It does focus on latency/queueing-delay, for the most part.
Hence my concern that diffserv's fundamental misunderstanding of the responsibility of router queue management might contaminate a very, very important project.
On Saturday, July 25, 2020 1:54pm, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
> I didn’t sign up for this abuse. Bellhead eh? Well f**k off!
>
> I’ve had enough - bye.
>
> > On 25 Jul 2020, at 18:48, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > This idea (dividing the link rate capacity, since "bandwidth" is an incorrect
> term not to be promulgated), is absurd, but typical of "bellhead" thinking.
> >
> > Per packet latency is the key control variable, even for TCP. That's because
> capacity/rate is not controllable by routers, but by routing in a general Internet
> situation.
> >
> > Latency is controlled by queuing delay in a packet network, not bitrate. And
> in mixed traffic, which after all is why traffic is classified in the first place,
> by its characteristics and response to increased latency end-to-end, is the core
> "control" for the internetwork as a whole.
> >
> > So, by promoting thinking about "bandwidth" a whole sequence of
> misformulations of network management is embedded into the thinking of those
> designing queue management algorithms.
> >
> > And make no mistake, queue management is the ONLY knob other than sending
> different packets on different routes that one has for routers.
> >
> > I don't know who proposed this fractional division, but it is clearly a
> bellhead-influenced thinker who thinks all protocols are CBR flows like in the old
> phone system.
> >
> > But almost no flows in the internet are CBR flows! File transfers are not,
> streaming TV is not, web traffic is not, game traffic is not. Only
> non-statistically multiplexed real-time telephony and *some* video conferencing is
> CBR.
> >
> > Yet this bizarre idea of dividing "bandwidth" among all categories of flows
> pops up. Probably from employees of phone companies or phone equipment suppliers.
> Or folks who went to Uni and were trained in "communications" by former phone
> engineers.
> >
> > Latency, latency, latency. Queue delay, queue delay, queue delay. Not link
> speed! Change your brains.
> >
> > It's hard to fight this bellhead crowd (or the bellheadedness in your own
> thinking) but think about packets and queues instead.
> >
> > My good friend Len Kleinrock didn't invent "Bandwidth Theory"! He invented
> Queueing Theory. For a reason.
> >
> > On Saturday, July 25, 2020 6:12am, "Kevin Darbyshire-Bryant"
> <kevin@darbyshire-bryant.me.uk> said:
> >
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> > >
> > >
> > > > On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant
> > > <kevin@darbyshire-bryant.me.uk> wrote:
> > > >
> > > >
> > > > The move from diffserv4 to diffserv5 WAS about de-prioritization.
> > >
> > > It was also about minimum bandwidth allocations:
> > >
> > > LE: 1/64th
> > > BK: 1/16th
> > > BE: 1/1
> > > VI: 1/2
> > > VO: 1/4
> > >
> > > So worst case, best effort should get 11/64ths in the extreme case of
> all other
> > > tins in use.
> > >
> > > Cheers,
> > >
> > > Kevin D-B
> > >
> > > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> > >
> > >
>
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
>
[-- Attachment #2: Type: text/html, Size: 7236 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 17:48 1% ` David P. Reed
@ 2020-07-25 17:54 0% ` Kevin Darbyshire-Bryant
2020-07-25 19:35 1% ` David P. Reed
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-07-25 17:54 UTC (permalink / raw)
To: David P. Reed; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 3063 bytes --]
I didn’t sign up for this abuse. Bellhead eh? Well f**k off!
I’ve had enough - bye.
> On 25 Jul 2020, at 18:48, David P. Reed <dpreed@deepplum.com> wrote:
>
> This idea (dividing the link rate capacity, since "bandwidth" is an incorrect term not to be promulgated), is absurd, but typical of "bellhead" thinking.
>
> Per packet latency is the key control variable, even for TCP. That's because capacity/rate is not controllable by routers, but by routing in a general Internet situation.
>
> Latency is controlled by queuing delay in a packet network, not bitrate. And in mixed traffic, which after all is why traffic is classified in the first place, by its characteristics and response to increased latency end-to-end, is the core "control" for the internetwork as a whole.
>
> So, by promoting thinking about "bandwidth" a whole sequence of misformulations of network management is embedded into the thinking of those designing queue management algorithms.
>
> And make no mistake, queue management is the ONLY knob other than sending different packets on different routes that one has for routers.
>
> I don't know who proposed this fractional division, but it is clearly a bellhead-influenced thinker who thinks all protocols are CBR flows like in the old phone system.
>
> But almost no flows in the internet are CBR flows! File transfers are not, streaming TV is not, web traffic is not, game traffic is not. Only non-statistically multiplexed real-time telephony and *some* video conferencing is CBR.
>
> Yet this bizarre idea of dividing "bandwidth" among all categories of flows pops up. Probably from employees of phone companies or phone equipment suppliers. Or folks who went to Uni and were trained in "communications" by former phone engineers.
>
> Latency, latency, latency. Queue delay, queue delay, queue delay. Not link speed! Change your brains.
>
> It's hard to fight this bellhead crowd (or the bellheadedness in your own thinking) but think about packets and queues instead.
>
> My good friend Len Kleinrock didn't invent "Bandwidth Theory"! He invented Queueing Theory. For a reason.
>
> On Saturday, July 25, 2020 6:12am, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
>
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> >
> >
> > > On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant
> > <kevin@darbyshire-bryant.me.uk> wrote:
> > >
> > >
> > > The move from diffserv4 to diffserv5 WAS about de-prioritization.
> >
> > It was also about minimum bandwidth allocations:
> >
> > LE: 1/64th
> > BK: 1/16th
> > BE: 1/1
> > VI: 1/2
> > VO: 1/4
> >
> > So worst case, best effort should get 11/64ths in the extreme case of all other
> > tins in use.
> >
> > Cheers,
> >
> > Kevin D-B
> >
> > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> >
> >
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 10:12 0% ` Kevin Darbyshire-Bryant
2020-07-25 17:18 0% ` Sebastian Moeller
@ 2020-07-25 17:48 1% ` David P. Reed
2020-07-25 17:54 0% ` Kevin Darbyshire-Bryant
1 sibling, 1 reply; 200+ results
From: David P. Reed @ 2020-07-25 17:48 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 2721 bytes --]
This idea (dividing the link rate capacity, since "bandwidth" is an incorrect term not to be promulgated) is absurd, but typical of "bellhead" thinking.
Per packet latency is the key control variable, even for TCP. That's because capacity/rate is not controllable by routers, but by routing in a general Internet situation.
Latency is controlled by queuing delay in a packet network, not bitrate. And in mixed traffic (which, after all, is why traffic is classified in the first place, by its characteristics and response to increased end-to-end latency) queuing delay is the core "control" for the internetwork as a whole.
So, by promoting thinking about "bandwidth" a whole sequence of misformulations of network management is embedded into the thinking of those designing queue management algorithms.
And make no mistake, queue management is the ONLY knob other than sending different packets on different routes that one has for routers.
I don't know who proposed this fractional division, but it is clearly a bellhead-influenced thinker who thinks all protocols are CBR flows like in the old phone system.
But almost no flows in the internet are CBR flows! File transfers are not, streaming TV is not, web traffic is not, game traffic is not. Only non-statistically multiplexed real-time telephony and *some* video conferencing is CBR.
Yet this bizarre idea of dividing "bandwidth" among all categories of flows pops up. Probably from employees of phone companies or phone equipment suppliers. Or folks who went to Uni and were trained in "communications" by former phone engineers.
Latency, latency, latency. Queue delay, queue delay, queue delay. Not link speed! Change your brains.
It's hard to fight this bellhead crowd (or the bellheadedness in your own thinking) but think about packets and queues instead.
My good friend Len Kleinrock didn't invent "Bandwidth Theory"! He invented Queueing Theory. For a reason.
On Saturday, July 25, 2020 6:12am, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
>
> > On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
> >
> >
> > The move from diffserv4 to diffserv5 WAS about de-prioritization.
>
> It was also about minimum bandwidth allocations:
>
> LE: 1/64th
> BK: 1/16th
> BE: 1/1
> VI: 1/2
> VO: 1/4
>
> So worst case, best effort should get 11/64ths in the extreme case of all other
> tins in use.
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
>
[-- Attachment #2: Type: text/html, Size: 5438 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 17:18 0% ` Sebastian Moeller
@ 2020-07-25 17:47 0% ` Jonathan Morton
0 siblings, 0 replies; 200+ results
From: Jonathan Morton @ 2020-07-25 17:47 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Kevin Darbyshire-Bryant, Cake List
> On 25 Jul, 2020, at 8:18 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> I am confused... but I am also confused by cake's output:
>
> Bulk Best Effort Voice
> thresh 3062Kbit 49Mbit 12250Kbit"
>
> as far as I can tell, Bulk's 3062Kbit must be the minimum, while BE and Voice give their maxima... That, or I am missing something important...
> (I wonder whether it would not be clearer to give both min and max for each tin, then again I am probably missing all the details of the actual implementation...)
Cake delivers from the highest-priority tin that both has data to send and is "behind" its local schedule, defined by the threshold rate. If no tin with data to send is behind schedule, then some tin that does have data to send is chosen (so Cake as a whole is work-conserving, modulo its global shaper). IIRC, it'll be the highest priority such tin.
The notion of which tin is highest priority is a little counter-intuitive. One tin must be at the global shaper rate, and will be the lowest priority tin - and normally that is the "best effort" tin. So the BK tin is actually at a higher priority, but only up to its very limited threshold rate. To avoid starving the best effort tin under all possible combinations of traffic, it is necessary and sufficient to ensure that the sum of all higher-priority tins' threshold rates is less than the global rate.
In the case of Diffserv3, the BK and VO tins both have higher priority than BE and sum to 5/16ths of the global rate. So with all tins saturated, the BE traffic gets 11/16ths which is pretty respectable. If the BE and VO traffic goes away, BK is then able to use all available bandwidth.
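The two-pass tin selection described above can be sketched as follows. This is a hypothetical simplified model (the tin array, schedule timestamps, and the `pick_tin` helper are all invented for illustration), not the actual sch_cake code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

struct tin {
	bool backlogged;	/* tin has data to send */
	uint64_t time_next;	/* tin's local schedule, ns */
};

/* tins[] is ordered highest priority first; returns the index of
 * the tin to deliver from, or -1 if the qdisc is idle */
static int pick_tin(const struct tin *tins, int n, uint64_t now)
{
	int i;

	/* first pass: highest-priority tin that is behind its schedule */
	for (i = 0; i < n; i++)
		if (tins[i].backlogged && tins[i].time_next <= now)
			return i;

	/* second pass: any backlogged tin, so the scheduler as a
	 * whole stays work-conserving */
	for (i = 0; i < n; i++)
		if (tins[i].backlogged)
			return i;

	return -1;
}
```

A tin whose threshold is small (like BK) advances its schedule slowly, so it is only "behind" up to its threshold rate; beyond that it falls through to the work-conserving pass.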
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 10:12 0% ` Kevin Darbyshire-Bryant
@ 2020-07-25 17:18 0% ` Sebastian Moeller
2020-07-25 17:47 0% ` Jonathan Morton
2020-07-25 17:48 1% ` David P. Reed
1 sibling, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-07-25 17:18 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
Hi Kevin,
> On Jul 25, 2020, at 12:12, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
>
>
>> On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>>
>>
>> The move from diffserv4 to diffserv5 WAS about de-prioritization.
>
> It was also about minimum bandwidth allocations:
>
> LE: 1/64th
That is 6 binary orders of magnitude; on a slow link, LE is effectively starved and there will be no real forward progress. For real scavenger services this might well be a sane policy, but this requires being very selective with assigning flows to this tin ;)
> BK: 1/16th
> BE: 1/1
> VI: 1/2
> VO: 1/4
So I see 1/64 + 1/16 + 1/1 + 1/2 + 1/4 = 1.828125 which seems excessive for actually guaranteed minimums. I was under the naive? impression the minima should add up to <= 1, no?
>
> So worst case, best effort should get 11/64ths in the extreme case of all other tins in use.
This seems only true if, on overload, the lowest-priority tiers get their allotment first, no?
I am confused... but I am also confused by cake's output:
"
Bulk Best Effort Voice
thresh 3062Kbit 49Mbit 12250Kbit"
as far as I can tell, Bulk's 3062Kbit must be the minimum, while BE and Voice give their maxima... That, or I am missing something important...
(I wonder whether it would not be clearer to give both min and max for each tin, then again I am probably missing all the details of the actual implementation...)
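For what it's worth, those thresh numbers are consistent with each tin's threshold simply being a fixed fraction of the shaper rate (BK = 1/16, BE = 1/1, VO = 1/4 for diffserv3, with integer division in kbit). A hypothetical back-of-envelope check (the helper name is invented, this is not the actual cake code):

```c
#include <assert.h>

/* hypothetical helper: a tin's threshold as rate/divisor, in kbit */
static unsigned long tin_thresh_kbit(unsigned long rate_kbit,
				     unsigned int divisor)
{
	return rate_kbit / divisor;
}
```

With a 49Mbit shaper this reproduces Bulk 3062Kbit, Best Effort 49Mbit and Voice 12250Kbit exactly, so all three columns appear to be thresholds of the same kind rather than a mix of minima and maxima.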
Best Regards
Sebastian
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-25 3:13 0% ` Jonathan Morton
@ 2020-07-25 17:05 1% ` David P. Reed
0 siblings, 0 replies; 200+ results
From: David P. Reed @ 2020-07-25 17:05 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Justin Kilpatrick, cake
[-- Attachment #1: Type: text/plain, Size: 4998 bytes --]
+1000%
I believe the problem with diffserv arose at conception, because it violated the core idea of IETF's operation:
"rough consensus and working code"
It was clear very, very early (to everyone but those working on it!) that no working approximate implementation ever existed, nor could it!
Had someone proposed a single better-efforts category, whose implementation would be Autonomous System by Autonomous System defined by a scheme roughly equivalent to "Paris Metro Pricing", it would have afforded experience at scale. (In Paris Metro Pricing, there are two kinds of cars on each train, First Class and Second Class. If you pay for first class, you get to go into the first class cars. Cars change from second to first class iff the seats in first class are tending to be full. Trains run more often when there are lines waiting for second class cars. The analogy with router decisions should be clear, except since trains can't run more often, congestion is signaled by drop or marking, which means that second class packets would be dropped or marked unless there were no first class packets.)
But instead the designers ignored implementation entirely, and invented "wish-based" classes.
This also violated an end-to-end argument - you should only put "in the network" functions that can be completely *implemented* "within the network".
And the TOS/QOS idea isn't meaningful to routers.
On Friday, July 24, 2020 11:13pm, "Jonathan Morton" <chromatix99@gmail.com> said:
> > On 24 Jul, 2020, at 6:56 pm, Justin Kilpatrick <justin@althea.net>
> wrote:
> >
> > "sqm-scripts used 3 tiers of priority pretty successfully as does free.fr. -
> de-prioritization seems a good idea, prioritization not so much."
> >
> > This is the best comment on why diffserv3 is the default that I could find on
> bufferbloat.net. I'm interested in hearing about what data (anecdotes welcome)
> led to this conclusion.
>
> In Cake, Diffserv4 maps conceptually (but not in detail) to the four priority
> buckets in Wifi - BK, BE, VI, VO. In Diffserv3 the VI bucket is dropped, because
> Cake's flow isolation within BE is already good enough to give decent video
> streaming performance. The BK and VO buckets are still useful to deal with
> certain specific problems; BK is the place to put "swarm" protocols which intend
> to be scavengers but which flow-isolation tends to prioritise, and VO is for
> latency-sensitive protocols which the user wants to specifically protect from
> random traffic fluctuations.
>
> Thinking more broadly, I believe Diffserv would have been far more successful if
> it had replaced Precedence/TOS with a simple two-bit, four-way set of PHBs:
>
> 00: High Throughput - equivalent to traditional Best Effort service.
>
> 01: High Reliability - "Every Packet's Sacred".
>
> 10: Low Cost - a scavenging service for altruistic applications.
>
> 11: Low Latency - for the many protocols that are sensitive to delays more than
> throughput.
>
> It may also have been reasonable to include a couple of extra bits for uses
> internal to an AS, on the understanding that the basic two bits would be preserved
> end-to-end as an indication of application intent.
>
> Of the above four classes, Diffserv3 provides three - omitting only the High
> Reliability class. But that is a class most useful within a datacentre, where it
> is actually practical to implement a lossless backplane with backpressure signals
> instead of loss.
>
> What we *actually* have is a six-bit field with ill-defined semantics, that is
> neither preserved nor respected end-to-end, is consequently ignored by most
> applications, and consumes all the space in the former TOS byte that is not
> specifically set aside for ECN (a field which could profitably have been larger).
> It is a serious problem.
>
> Implementations of PHBs still tend to think in terms of bandwidth reservation (a
> Bell-head paradigm) and/or strict priority (like the Precedence system which was
> lifted directly from telegraphy practice). Both approaches are inefficient, and
> go along with the misconception that if we can merely categorise traffic on the
> fly into a large enough number of pigeonholes, some magical method of dealing with
> the pigeonholes will make itself obvious. However, both the easy, universal
> method of categorisation and the magical delivery strategy have failed to
> materialise. It rather suggests that they're doing it wrong.
>
> So that is why Diffserv3 is the default in Cake. It offers explicit "low cost"
> and "low latency" service for suitably marked traffic, and for everything else the
> Best Effort service uses flow and host isolation strategies to maintain good
> behaviour. It usually works well.
>
> - Jonathan Morton
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
[-- Attachment #2: Type: text/html, Size: 7271 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-24 17:42 0% ` Kevin Darbyshire-Bryant
@ 2020-07-25 10:12 0% ` Kevin Darbyshire-Bryant
2020-07-25 17:18 0% ` Sebastian Moeller
2020-07-25 17:48 1% ` David P. Reed
0 siblings, 2 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-07-25 10:12 UTC (permalink / raw)
To: Cake List
[-- Attachment #1: Type: text/plain, Size: 459 bytes --]
> On 24 Jul 2020, at 18:42, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
>
> The move from diffserv4 to diffserv5 WAS about de-prioritization.
It was also about minimum bandwidth allocations:
LE: 1/64th
BK: 1/16th
BE: 1/1
VI: 1/2
VO: 1/4
So worst case, best effort should get 11/64ths in the extreme case of all other tins in use.
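The 11/64ths figure follows directly from the shares above; a trivial hypothetical check (the helper name is invented):

```c
#include <assert.h>

/* worst-case best-effort share, in 64ths of the link rate, when
 * every other tin consumes its full minimum allocation */
static int be_worst_case_64ths(int le, int bk, int vi, int vo)
{
	return 64 - (le + bk + vi + vo);
}
```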
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
2020-07-24 17:42 0% ` Kevin Darbyshire-Bryant
@ 2020-07-25 3:13 0% ` Jonathan Morton
2020-07-25 17:05 1% ` David P. Reed
1 sibling, 1 reply; 200+ results
From: Jonathan Morton @ 2020-07-25 3:13 UTC (permalink / raw)
To: Justin Kilpatrick; +Cc: cake
> On 24 Jul, 2020, at 6:56 pm, Justin Kilpatrick <justin@althea.net> wrote:
>
> "sqm-scripts used 3 tiers of priority pretty successfully as does free.fr. - de-prioritization seems a good idea, prioritization not so much."
>
> This is the best comment on why diffserv3 is the default that I could find on bufferbloat.net. I'm interested in hearing about what data (anecdotes welcome) led to this conclusion.
In Cake, Diffserv4 maps conceptually (but not in detail) to the four priority buckets in Wifi - BK, BE, VI, VO. In Diffserv3 the VI bucket is dropped, because Cake's flow isolation within BE is already good enough to give decent video streaming performance. The BK and VO buckets are still useful to deal with certain specific problems; BK is the place to put "swarm" protocols which intend to be scavengers but which flow-isolation tends to prioritise, and VO is for latency-sensitive protocols which the user wants to specifically protect from random traffic fluctuations.
Thinking more broadly, I believe Diffserv would have been far more successful if it had replaced Precedence/TOS with a simple two-bit, four-way set of PHBs:
00: High Throughput - equivalent to traditional Best Effort service.
01: High Reliability - "Every Packet's Sacred".
10: Low Cost - a scavenging service for altruistic applications.
11: Low Latency - for the many protocols that are sensitive to delays more than throughput.
It may also have been reasonable to include a couple of extra bits for uses internal to an AS, on the understanding that the basic two bits would be preserved end-to-end as an indication of application intent.
Of the above four classes, Diffserv3 provides three - omitting only the High Reliability class. But that is a class most useful within a datacentre, where it is actually practical to implement a lossless backplane with backpressure signals instead of loss.
What we *actually* have is a six-bit field with ill-defined semantics, that is neither preserved nor respected end-to-end, is consequently ignored by most applications, and consumes all the space in the former TOS byte that is not specifically set aside for ECN (a field which could profitably have been larger). It is a serious problem.
Implementations of PHBs still tend to think in terms of bandwidth reservation (a Bell-head paradigm) and/or strict priority (like the Precedence system which was lifted directly from telegraphy practice). Both approaches are inefficient, and go along with the misconception that if we can merely categorise traffic on the fly into a large enough number of pigeonholes, some magical method of dealing with the pigeonholes will make itself obvious. However, both the easy, universal method of categorisation and the magical delivery strategy have failed to materialise. It rather suggests that they're doing it wrong.
So that is why Diffserv3 is the default in Cake. It offers explicit "low cost" and "low latency" service for suitably marked traffic, and for everything else the Best Effort service uses flow and host isolation strategies to maintain good behaviour. It usually works well.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] diffserv3 vs diffserv4
@ 2020-07-24 17:42 0% ` Kevin Darbyshire-Bryant
2020-07-25 10:12 0% ` Kevin Darbyshire-Bryant
2020-07-25 3:13 0% ` Jonathan Morton
1 sibling, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-07-24 17:42 UTC (permalink / raw)
To: Cake List
[-- Attachment #1: Type: text/plain, Size: 1091 bytes --]
> On 24 Jul 2020, at 16:56, Justin Kilpatrick <justin@althea.net> wrote:
>
> "sqm-scripts used 3 tiers of priority pretty successfully as does free.fr. - de-prioritization seems a good idea, prioritization not so much."
>
> This is the best comment on why diffserv3 is the default that I could find on bufferbloat.net. I'm interested in hearing about what data (anecdotes welcome) led to this conclusion.
As someone who is currently trying (but not that hard) to get a diffserv5 implementation in upstream as opposed to a local hack, the aim being to have 2 lower-than-default classes, I have some opinion on this. My use case is straightforward: I want somewhere to put ‘Least Effort’ traffic (eg Bittorrent) as a scavenger class that loses out to my Bulk transfers (backups). At the other end of things I do want to prioritise Voice (VOIP) above Video (netflix/facetime) above ‘Best Effort’. LE, BK, BE, VI, VO
The move from diffserv4 to diffserv5 WAS about de-prioritization.
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] quantum configuration
@ 2020-07-21 22:29 1% ` Y
1 sibling, 0 replies; 200+ results
From: Y @ 2020-07-21 22:29 UTC (permalink / raw)
To: cake
Hi.
tc -s qdisc show dev eth0
shows the configured quantum.
On 2020/07/22 0:32, Luca Muscariello wrote:
> Is there a reason why in cake the quantum cannot be configured to a
> different value like in fq_codel?
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
^ permalink raw reply [relevance 1%]
* [Cake] [PATCH for v5.9] sch_cake: Replace HTTP links with HTTPS ones
@ 2020-07-19 12:22 2% Alexander A. Klimov
0 siblings, 0 replies; 200+ results
From: Alexander A. Klimov @ 2020-07-19 12:22 UTC (permalink / raw)
To: toke, jhs, xiyou.wangcong, jiri, davem, kuba, cake, netdev, linux-kernel
Cc: Alexander A. Klimov
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
---
Continuing my work started at 93431e0607e5.
See also: git log --oneline '--author=Alexander A. Klimov <grandmaster@al2klimov.de>' v5.7..master
(Actually letting a shell for loop submit all this stuff for me.)
If there are any URLs to be removed completely
or at least not (just) HTTPSified:
Just clearly say so and I'll *undo my change*.
See also: https://lkml.org/lkml/2020/6/27/64
If there are any valid, but yet not changed URLs:
See: https://lkml.org/lkml/2020/6/26/837
If you apply the patch, please let me know.
Sorry again to all maintainers who complained about subject lines.
Now I realized that you want an actually perfect prefixes,
not just subsystem ones.
I tried my best...
And yes, *I could* (at least half-)automate it.
Impossible is nothing! :)
net/sched/sch_cake.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index ebaeec1e5c82..2f6c0daa2337 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -363,7 +363,7 @@ static const u8 bulk_order[] = {1, 0, 2, 3};
#define REC_INV_SQRT_CACHE (16)
static u32 cobalt_rec_inv_sqrt_cache[REC_INV_SQRT_CACHE] = {0};
-/* http://en.wikipedia.org/wiki/Methods_of_computing_square_roots
+/* https://en.wikipedia.org/wiki/Methods_of_computing_square_roots
* new_invsqrt = (invsqrt / 2) * (3 - count * invsqrt^2)
*
* Here, invsqrt is a fixed point number (< 1.0), 32bit mantissa, aka Q0.32
--
2.27.0
^ permalink raw reply [relevance 2%]
* Re: [Cake] [PATCH net v2] vlan: consolidate VLAN parsing code and limit max parsing depth
@ 2020-07-07 22:49 1% ` David Miller
0 siblings, 0 replies; 200+ results
From: David Miller @ 2020-07-07 22:49 UTC (permalink / raw)
To: toke
Cc: netdev, cake, dcaratti, jiri, jhs, xiyou.wangcong,
toshiaki.makita1, daniel
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Tue, 7 Jul 2020 13:03:25 +0200
> Toshiaki pointed out that we now have two very similar functions to extract
> the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
> that the unbounded parsing loop makes it possible for maliciously crafted
> packets to loop through potentially hundreds of tags.
>
> Fix both of these issues by consolidating the two parsing functions and
> limiting the VLAN tag parsing to a max depth of 8 tags. As part of this,
> switch over __vlan_get_protocol() to use skb_header_pointer() instead of
> pskb_may_pull(), to avoid the possible side effects of the latter and keep
> the skb pointer 'const' through all the parsing functions.
>
> v2:
> - Use limit of 8 tags instead of 32 (matching XMIT_RECURSION_LIMIT)
>
> Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Applied, thank you.
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net] vlan: consolidate VLAN parsing code and limit max parsing depth
2020-07-07 10:57 0% ` Toke Høiland-Jørgensen
@ 2020-07-07 11:01 1% ` Toshiaki Makita
0 siblings, 0 replies; 200+ results
From: Toshiaki Makita @ 2020-07-07 11:01 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: davem, netdev, cake, Davide Caratti, Jiri Pirko,
Jamal Hadi Salim, Cong Wang, Daniel Borkmann
On 2020/07/07 19:57, Toke Høiland-Jørgensen wrote:
> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>
>> On 2020/07/06 21:29, Toke Høiland-Jørgensen wrote:
>>> Toshiaki pointed out that we now have two very similar functions to extract
>>> the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
>>> that the unbounded parsing loop makes it possible for maliciously crafted
>>> packets to loop through potentially hundreds of tags.
>>>
>>> Fix both of these issues by consolidating the two parsing functions and
>>> limiting the VLAN tag parsing to an arbitrarily-chosen, but hopefully
>>> conservative, max depth of 32 tags. As part of this, switch over
>>> __vlan_get_protocol() to use skb_header_pointer() instead of
>>> pskb_may_pull(), to avoid the possible side effects of the latter and keep
>>> the skb pointer 'const' through all the parsing functions.
>>>
>>> Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
>>> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
>>> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>> ---
>> ...
>>> @@ -623,13 +597,12 @@ static inline __be16 __vlan_get_protocol(struct sk_buff *skb, __be16 type,
>>> vlan_depth = ETH_HLEN;
>>> }
>>> do {
>>> - struct vlan_hdr *vh;
>>> + struct vlan_hdr vhdr, *vh;
>>>
>>> - if (unlikely(!pskb_may_pull(skb,
>>> - vlan_depth + VLAN_HLEN)))
>>> + vh = skb_header_pointer(skb, vlan_depth, sizeof(vhdr), &vhdr);
>>
>> Some drivers which use vlan_get_protocol to get IP protocol for checksum offload discards
>> packets when it cannot get the protocol.
>> I guess for such users this function should try to get protocol even if it is not in skb header?
>> I'm not sure such a case can happen, but since you care about this, you know real cases where
>> vlan tag can be in skb frags?
>
> skb_header_pointer() will still succeed in reading the data, it'll just
> do so by copying it into the buffer on the stack (vhdr) instead of
> moving the SKB data itself around...
True, probably I need some more coffee...
Thanks.
Toshiaki Makita
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net] vlan: consolidate VLAN parsing code and limit max parsing depth
2020-07-07 10:44 1% ` Toshiaki Makita
@ 2020-07-07 10:57 0% ` Toke Høiland-Jørgensen
2020-07-07 11:01 1% ` Toshiaki Makita
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-07 10:57 UTC (permalink / raw)
To: Toshiaki Makita, davem
Cc: netdev, cake, Davide Caratti, Jiri Pirko, Jamal Hadi Salim,
Cong Wang, Daniel Borkmann
Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
> On 2020/07/06 21:29, Toke Høiland-Jørgensen wrote:
>> Toshiaki pointed out that we now have two very similar functions to extract
>> the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
>> that the unbounded parsing loop makes it possible for maliciously crafted
>> packets to loop through potentially hundreds of tags.
>>
>> Fix both of these issues by consolidating the two parsing functions and
>> limiting the VLAN tag parsing to an arbitrarily-chosen, but hopefully
>> conservative, max depth of 32 tags. As part of this, switch over
>> __vlan_get_protocol() to use skb_header_pointer() instead of
>> pskb_may_pull(), to avoid the possible side effects of the latter and keep
>> the skb pointer 'const' through all the parsing functions.
>>
>> Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
>> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
>> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> ---
> ...
>> @@ -623,13 +597,12 @@ static inline __be16 __vlan_get_protocol(struct sk_buff *skb, __be16 type,
>> vlan_depth = ETH_HLEN;
>> }
>> do {
>> - struct vlan_hdr *vh;
>> + struct vlan_hdr vhdr, *vh;
>>
>> - if (unlikely(!pskb_may_pull(skb,
>> - vlan_depth + VLAN_HLEN)))
>> + vh = skb_header_pointer(skb, vlan_depth, sizeof(vhdr), &vhdr);
>
> Some drivers which use vlan_get_protocol to get IP protocol for checksum offload discards
> packets when it cannot get the protocol.
> I guess for such users this function should try to get protocol even if it is not in skb header?
> I'm not sure such a case can happen, but since you care about this, you know real cases where
> vlan tag can be in skb frags?
skb_header_pointer() will still succeed in reading the data, it'll just
do so by copying it into the buffer on the stack (vhdr) instead of
moving the SKB data itself around...
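A toy model of the two properties under discussion, copy-out header access (as with skb_header_pointer()) plus a fixed parsing depth, might look like this. Everything here (the flat buffer model, the struct and constant names, the helper) is invented for illustration and is not the kernel implementation:

```c
#include <assert.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

#define ETH_P_8021Q	0x8100
#define ETH_P_8021AD	0x88A8
#define VLAN_MAX_DEPTH	8	/* matches the v2 limit in this thread */

struct vlan_hdr_toy {
	uint16_t tci;
	uint16_t encapsulated_proto;
};

/* walk nested VLAN headers in a flat buffer; returns the first
 * non-VLAN ethertype, or 0 if the packet is truncated or nested
 * deeper than VLAN_MAX_DEPTH */
static uint16_t parse_l3_proto(const uint8_t *buf, size_t len,
			       uint16_t type)
{
	size_t off = 0;
	int depth;

	for (depth = 0; depth < VLAN_MAX_DEPTH; depth++) {
		struct vlan_hdr_toy vh;

		if (type != ETH_P_8021Q && type != ETH_P_8021AD)
			return type;
		if (off + sizeof(vh) > len)
			return 0;	/* truncated */
		/* copy the header out instead of touching the buffer,
		 * analogous to skb_header_pointer() */
		memcpy(&vh, buf + off, sizeof(vh));
		type = vh.encapsulated_proto;
		off += sizeof(vh);
	}
	return 0;	/* too many nested tags */
}
```

Because the header is copied into a stack buffer, the input stays `const` throughout, which is the property that lets the consolidated function avoid pskb_may_pull()'s side effects.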
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net] vlan: consolidate VLAN parsing code and limit max parsing depth
2020-07-07 10:49 1% ` Toshiaki Makita
@ 2020-07-07 10:54 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-07 10:54 UTC (permalink / raw)
To: Toshiaki Makita, Daniel Borkmann, davem
Cc: netdev, cake, Davide Caratti, Jiri Pirko, Jamal Hadi Salim, Cong Wang
Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
> On 2020/07/07 7:44, Toke Høiland-Jørgensen wrote:
>> Daniel Borkmann <daniel@iogearbox.net> writes:
>>> On 7/6/20 2:29 PM, Toke Høiland-Jørgensen wrote:
>>>> Toshiaki pointed out that we now have two very similar functions to extract
>>>> the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
>>>> that the unbounded parsing loop makes it possible for maliciously crafted
>>>> packets to loop through potentially hundreds of tags.
>>>>
>>>> Fix both of these issues by consolidating the two parsing functions and
>>>> limiting the VLAN tag parsing to an arbitrarily-chosen, but hopefully
>>>> conservative, max depth of 32 tags. As part of this, switch over
>>>> __vlan_get_protocol() to use skb_header_pointer() instead of
>>>> pskb_may_pull(), to avoid the possible side effects of the latter and keep
>>>> the skb pointer 'const' through all the parsing functions.
>>>>
>>>> Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
>>>> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
>>>> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>> ---
>>>> include/linux/if_vlan.h | 57 ++++++++++++++++-------------------------
>>>> 1 file changed, 22 insertions(+), 35 deletions(-)
>>>>
>>>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>>>> index 427a5b8597c2..855d16192e6a 100644
>>>> --- a/include/linux/if_vlan.h
>>>> +++ b/include/linux/if_vlan.h
>>>> @@ -25,6 +25,8 @@
>>>> #define VLAN_ETH_DATA_LEN 1500 /* Max. octets in payload */
>>>> #define VLAN_ETH_FRAME_LEN 1518 /* Max. octets in frame sans FCS */
>>>>
>>>> +#define VLAN_MAX_DEPTH 32 /* Max. number of nested VLAN tags parsed */
>>>> +
>>>
>>> Any insight on limits of nesting wrt QinQ, maybe from spec side?
>>
>> Don't think so. Wikipedia says this:
>>
>> 802.1ad is upward compatible with 802.1Q. Although 802.1ad is limited
>> to two tags, there is no ceiling on the standard limiting a single
>> frame to more than two tags, allowing for growth in the protocol. In
>> practice Service Provider topologies often anticipate and utilize
>> frames having more than two tags.
>>
>>> Why not 8 as max, for example (I'd probably even consider a depth like
>>> this as utterly broken setup ..)?
>>
>> I originally went with 8, but chickened out after seeing how many places
>> call the parsing function. While I do agree that eight tags is... somewhat
>> excessive... I was trying to make absolutely sure no one would hit this
>> limit in normal use. See also https://xkcd.com/1172/ :)
>
> Considering that XMIT_RECURSION_LIMIT is 8, I also think 8 is sufficient.
Alright, fair enough, I'll send a v2 with a limit of 8 :)
-Toke
^ permalink raw reply [relevance 0%]
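The depth-limited parsing the thread converges on can be sketched as a small userspace model (illustrative only: the constants, the `get16()` helper, and the flat-buffer layout are stand-ins for the kernel's skb machinery):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8
#define MAX_DEPTH    8       /* the limit agreed on above */

static uint16_t get16(const uint8_t *p)  /* big-endian load */
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walk the chain of 802.1Q/802.1ad tags in a raw Ethernet frame and
 * return the first non-VLAN ethertype, or 0 if the frame is truncated
 * or nests more than MAX_DEPTH tags. Every access is bounds-checked
 * against 'len', mirroring what skb_header_pointer() provides in the
 * kernel version, and the frame stays const throughout. */
static uint16_t l3_ethertype(const uint8_t *frame, size_t len)
{
	size_t off = 12;         /* ethertype field of the outer header */
	int depth = 0;
	uint16_t proto;

	if (len < off + 2)
		return 0;
	proto = get16(frame + off);
	off += 2;

	while (proto == ETH_P_8021Q || proto == ETH_P_8021AD) {
		if (depth++ >= MAX_DEPTH)
			return 0;        /* maliciously deep nesting */
		if (len < off + 4)
			return 0;        /* truncated VLAN header */
		proto = get16(frame + off + 2); /* encapsulated proto */
		off += 4;                /* skip TCI + next ethertype */
	}
	return proto;
}
```

A frame with a single tag resolves to the inner ethertype, while nine nested tags trip the limit — the crafted-packet case Daniel raised.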
* Re: [Cake] [PATCH net] vlan: consolidate VLAN parsing code and limit max parsing depth
@ 2020-07-07 10:49 1% ` Toshiaki Makita
2020-07-07 10:54 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Toshiaki Makita @ 2020-07-07 10:49 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, Daniel Borkmann, davem
Cc: netdev, cake, Davide Caratti, Jiri Pirko, Jamal Hadi Salim, Cong Wang
On 2020/07/07 7:44, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
>> On 7/6/20 2:29 PM, Toke Høiland-Jørgensen wrote:
>>> Toshiaki pointed out that we now have two very similar functions to extract
>>> the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
>>> that the unbounded parsing loop makes it possible for maliciously crafted
>>> packets to loop through potentially hundreds of tags.
>>>
>>> Fix both of these issues by consolidating the two parsing functions and
>>> limiting the VLAN tag parsing to an arbitrarily-chosen, but hopefully
>>> conservative, max depth of 32 tags. As part of this, switch over
>>> __vlan_get_protocol() to use skb_header_pointer() instead of
>>> pskb_may_pull(), to avoid the possible side effects of the latter and keep
>>> the skb pointer 'const' through all the parsing functions.
>>>
>>> Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
>>> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
>>> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>> ---
>>> include/linux/if_vlan.h | 57 ++++++++++++++++-------------------------
>>> 1 file changed, 22 insertions(+), 35 deletions(-)
>>>
>>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>>> index 427a5b8597c2..855d16192e6a 100644
>>> --- a/include/linux/if_vlan.h
>>> +++ b/include/linux/if_vlan.h
>>> @@ -25,6 +25,8 @@
>>> #define VLAN_ETH_DATA_LEN 1500 /* Max. octets in payload */
>>> #define VLAN_ETH_FRAME_LEN 1518 /* Max. octets in frame sans FCS */
>>>
>>> +#define VLAN_MAX_DEPTH 32 /* Max. number of nested VLAN tags parsed */
>>> +
>>
>> Any insight on limits of nesting wrt QinQ, maybe from spec side?
>
> Don't think so. Wikipedia says this:
>
> 802.1ad is upward compatible with 802.1Q. Although 802.1ad is limited
> to two tags, there is no ceiling on the standard limiting a single
> frame to more than two tags, allowing for growth in the protocol. In
> practice Service Provider topologies often anticipate and utilize
> frames having more than two tags.
>
>> Why not 8 as max, for example (I'd probably even consider a depth like
>> this as utterly broken setup ..)?
>
> I originally went with 8, but chickened out after seeing how many places
> call the parsing function. While I do agree that eight tags is... somewhat
> excessive... I was trying to make absolutely sure no one would hit this
> limit in normal use. See also https://xkcd.com/1172/ :)
Considering that XMIT_RECURSION_LIMIT is 8, I also think 8 is sufficient.
Toshiaki Makita
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net] vlan: consolidate VLAN parsing code and limit max parsing depth
@ 2020-07-07 10:44 1% ` Toshiaki Makita
2020-07-07 10:57 0% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: Toshiaki Makita @ 2020-07-07 10:44 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, davem
Cc: netdev, cake, Davide Caratti, Jiri Pirko, Jamal Hadi Salim,
Cong Wang, Daniel Borkmann
On 2020/07/06 21:29, Toke Høiland-Jørgensen wrote:
> Toshiaki pointed out that we now have two very similar functions to extract
> the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
> that the unbounded parsing loop makes it possible for maliciously crafted
> packets to loop through potentially hundreds of tags.
>
> Fix both of these issues by consolidating the two parsing functions and
> limiting the VLAN tag parsing to an arbitrarily-chosen, but hopefully
> conservative, max depth of 32 tags. As part of this, switch over
> __vlan_get_protocol() to use skb_header_pointer() instead of
> pskb_may_pull(), to avoid the possible side effects of the latter and keep
> the skb pointer 'const' through all the parsing functions.
>
> Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
...
> @@ -623,13 +597,12 @@ static inline __be16 __vlan_get_protocol(struct sk_buff *skb, __be16 type,
> vlan_depth = ETH_HLEN;
> }
> do {
> - struct vlan_hdr *vh;
> + struct vlan_hdr vhdr, *vh;
>
> - if (unlikely(!pskb_may_pull(skb,
> - vlan_depth + VLAN_HLEN)))
> + vh = skb_header_pointer(skb, vlan_depth, sizeof(vhdr), &vhdr);
Some drivers which use vlan_get_protocol() to get the IP protocol for checksum offload
discard packets when they cannot get the protocol.
I guess for such users this function should try to get the protocol even if it is not in
the skb linear header?
I'm not sure such a case can happen, but since you care about this, do you know of real
cases where the VLAN tag can be in skb frags?
Toshiaki Makita
^ permalink raw reply [relevance 1%]
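Toshiaki's question about tags living outside the linear header hinges on the copy-out semantics of skb_header_pointer(). A userspace analogue (hypothetical types and names, not the kernel implementation) shows why the source buffer can stay const: data in the "linear" region is returned by direct pointer, and data past it is assembled into caller storage:

```c
#include <assert.h>
#include <stddef.h>

/* Two-region packet model: a contiguous "linear" part plus one
 * fragment holding the remaining bytes. Purely illustrative. */
struct fake_pkt {
	const unsigned char *linear;
	size_t linear_len;
	const unsigned char *frag;   /* bytes past linear_len */
	size_t frag_len;
};

/* Return a pointer to 'len' bytes at 'off': a direct pointer when they
 * fit in the linear region, a copy into 'copybuf' when they straddle
 * the boundary, or NULL when the packet is too short. The source is
 * never modified, unlike pskb_may_pull(), which may reallocate. */
static const void *hdr_pointer(const struct fake_pkt *p, size_t off,
			       size_t len, void *copybuf)
{
	size_t i;
	unsigned char *dst = copybuf;

	if (off + len <= p->linear_len)
		return p->linear + off;          /* fast path: no copy */
	if (off + len > p->linear_len + p->frag_len)
		return NULL;                     /* header truncated */
	for (i = 0; i < len; i++) {              /* slow path: copy out */
		size_t pos = off + i;
		dst[i] = pos < p->linear_len
			? p->linear[pos]
			: p->frag[pos - p->linear_len];
	}
	return copybuf;
}
```

This is what lets the consolidated __vlan_get_protocol() read a tag that sits in skb frags without any side effects on the skb.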
* Re: [Cake] [PATCH net v3] sched: consistently handle layer3 header accesses in the presence of VLANs
2020-07-06 4:24 1% ` Toshiaki Makita
@ 2020-07-06 10:53 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-06 10:53 UTC (permalink / raw)
To: Toshiaki Makita
Cc: davem, netdev, bpf, cake, Davide Caratti, Jiri Pirko,
Jamal Hadi Salim, Cong Wang, Roman Mashak, Lawrence Brakmo,
Ilya Ponetayev
Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
> On 2020/07/04 20:33, Toke Høiland-Jørgensen wrote:
>> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>>> On 2020/07/04 5:26, Toke Høiland-Jørgensen wrote:
>>> ...
>>>> +/* A getter for the SKB protocol field which will handle VLAN tags consistently
>>>> + * whether VLAN acceleration is enabled or not.
>>>> + */
>>>> +static inline __be16 skb_protocol(const struct sk_buff *skb, bool skip_vlan)
>>>> +{
>>>> + unsigned int offset = skb_mac_offset(skb) + sizeof(struct ethhdr);
>>>> + __be16 proto = skb->protocol;
>>>> +
>>>> + if (!skip_vlan)
>>>> + /* VLAN acceleration strips the VLAN header from the skb and
>>>> + * moves it to skb->vlan_proto
>>>> + */
>>>> + return skb_vlan_tag_present(skb) ? skb->vlan_proto : proto;
>>>> +
>>>> + while (eth_type_vlan(proto)) {
>>>> + struct vlan_hdr vhdr, *vh;
>>>> +
>>>> + vh = skb_header_pointer(skb, offset, sizeof(vhdr), &vhdr);
>>>> + if (!vh)
>>>> + break;
>>>> +
>>>> + proto = vh->h_vlan_encapsulated_proto;
>>>> + offset += sizeof(vhdr);
>>>> + }
>>>
>>> Why don't you use __vlan_get_protocol() here? It looks quite similar.
>>> Is there any problem with using that?
>>
>> TBH, I completely missed that helper. It seems to have side effects,
>> though (pskb_may_pull()), which is one of the things the original patch
>> to sch_cake that initiated all of this was trying to avoid.
>
> Sorry for not completely following the discussion...
> Is pulling data wrong for cake or other schedulers?
This was not explicit in the current thread, but the reason I started
looking into this in the first place was a pull request on the
out-of-tree version of sch_cake that noticed that there are drivers that
will allocate SKBs in such a way that accessing the packet header causes
it to be reallocated: https://github.com/dtaht/sch_cake/pull/136
I'm not entirely positive that this applies to just reading the header
through pskb_may_pull(), or if it was only on skb_try_make_writable();
but in any case it seems to me that it's better for a helper like
__vlan_get_protocol() to not have side effects.
>> I guess I could just fix that, though, and switch __vlan_get_protocol()
>> over to using skb_header_pointer(). Will send a follow-up to do that.
>>
>> Any opinion on whether it's a good idea to limit the max parse depth
>> while I'm at it (see Daniel's reply)?
>
> The logic was originally introduced by skb_network_protocol() back in
> v3.10, and I have never heard of a security report about it. But yes,
> I guess it potentially can be used for a DoS attack.
Right, I'll add the limit as well, then :)
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net v3] sched: consistently handle layer3 header accesses in the presence of VLANs
2020-07-04 11:33 0% ` Toke Høiland-Jørgensen
@ 2020-07-06 4:24 1% ` Toshiaki Makita
2020-07-06 10:53 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Toshiaki Makita @ 2020-07-06 4:24 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: davem, netdev, bpf, cake, Davide Caratti, Jiri Pirko,
Jamal Hadi Salim, Cong Wang, Roman Mashak, Lawrence Brakmo,
Ilya Ponetayev
On 2020/07/04 20:33, Toke Høiland-Jørgensen wrote:
> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>> On 2020/07/04 5:26, Toke Høiland-Jørgensen wrote:
>> ...
>>> +/* A getter for the SKB protocol field which will handle VLAN tags consistently
>>> + * whether VLAN acceleration is enabled or not.
>>> + */
>>> +static inline __be16 skb_protocol(const struct sk_buff *skb, bool skip_vlan)
>>> +{
>>> + unsigned int offset = skb_mac_offset(skb) + sizeof(struct ethhdr);
>>> + __be16 proto = skb->protocol;
>>> +
>>> + if (!skip_vlan)
>>> + /* VLAN acceleration strips the VLAN header from the skb and
>>> + * moves it to skb->vlan_proto
>>> + */
>>> + return skb_vlan_tag_present(skb) ? skb->vlan_proto : proto;
>>> +
>>> + while (eth_type_vlan(proto)) {
>>> + struct vlan_hdr vhdr, *vh;
>>> +
>>> + vh = skb_header_pointer(skb, offset, sizeof(vhdr), &vhdr);
>>> + if (!vh)
>>> + break;
>>> +
>>> + proto = vh->h_vlan_encapsulated_proto;
>>> + offset += sizeof(vhdr);
>>> + }
>>
>> Why don't you use __vlan_get_protocol() here? It looks quite similar.
>> Is there any problem with using that?
>
> TBH, I completely missed that helper. It seems to have side effects,
> though (pskb_may_pull()), which is one of the things the original patch
> to sch_cake that initiated all of this was trying to avoid.
Sorry for not completely following the discussion...
Is pulling data wrong for cake or other schedulers?
> I guess I could just fix that, though, and switch __vlan_get_protocol()
> over to using skb_header_pointer(). Will send a follow-up to do that.
>
> Any opinion on whether it's a good idea to limit the max parse depth
> while I'm at it (see Daniel's reply)?
The logic was originally introduced by skb_network_protocol() back in v3.10,
and I have never heard of a security report about it. But yes, I guess it
potentially can be used for a DoS attack.
Toshiaki Makita
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net v3] sched: consistently handle layer3 header accesses in the presence of VLANs
2020-07-04 3:24 1% ` Toshiaki Makita
@ 2020-07-04 11:33 0% ` Toke Høiland-Jørgensen
2020-07-06 4:24 1% ` Toshiaki Makita
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-04 11:33 UTC (permalink / raw)
To: Toshiaki Makita
Cc: davem, netdev, bpf, cake, Davide Caratti, Jiri Pirko,
Jamal Hadi Salim, Cong Wang, Roman Mashak, Lawrence Brakmo,
Ilya Ponetayev
Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
> On 2020/07/04 5:26, Toke Høiland-Jørgensen wrote:
> ...
>> +/* A getter for the SKB protocol field which will handle VLAN tags consistently
>> + * whether VLAN acceleration is enabled or not.
>> + */
>> +static inline __be16 skb_protocol(const struct sk_buff *skb, bool skip_vlan)
>> +{
>> + unsigned int offset = skb_mac_offset(skb) + sizeof(struct ethhdr);
>> + __be16 proto = skb->protocol;
>> +
>> + if (!skip_vlan)
>> + /* VLAN acceleration strips the VLAN header from the skb and
>> + * moves it to skb->vlan_proto
>> + */
>> + return skb_vlan_tag_present(skb) ? skb->vlan_proto : proto;
>> +
>> + while (eth_type_vlan(proto)) {
>> + struct vlan_hdr vhdr, *vh;
>> +
>> + vh = skb_header_pointer(skb, offset, sizeof(vhdr), &vhdr);
>> + if (!vh)
>> + break;
>> +
>> + proto = vh->h_vlan_encapsulated_proto;
>> + offset += sizeof(vhdr);
>> + }
>
> Why don't you use __vlan_get_protocol() here? It looks quite similar.
> Is there any problem with using that?
TBH, I completely missed that helper. It seems to have side effects,
though (pskb_may_pull()), which is one of the things the original patch
to sch_cake that initiated all of this was trying to avoid.
I guess I could just fix that, though, and switch __vlan_get_protocol()
over to using skb_header_pointer(). Will send a follow-up to do that.
Any opinion on whether it's a good idea to limit the max parse depth
while I'm at it (see Daniel's reply)?
-Toke
^ permalink raw reply [relevance 0%]
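The accelerated-versus-in-band distinction that the skb_protocol() helper under discussion papers over can be modelled like this (a hypothetical userspace model: field and function names are stand-ins, and the real helper's in-band tag walk is reduced here to the accelerated case):

```c
#include <assert.h>
#include <stdint.h>

#define P_8021Q 0x8100
#define P_IP    0x0800

/* With VLAN acceleration the NIC strips the outer tag from the packet
 * data and stores it out of band, so "the protocol" of a packet
 * depends on whether the caller wants the VLAN ethertype or the L3
 * one. */
struct model_skb {
	uint16_t protocol;     /* ethertype as seen in the frame data */
	int vlan_tag_present;  /* accelerated tag stored out of band? */
	uint16_t vlan_proto;   /* its ethertype, e.g. 0x8100 */
};

static uint16_t model_protocol(const struct model_skb *skb, int skip_vlan)
{
	if (!skip_vlan)
		/* caller wants the outermost ethertype, including the
		 * accelerated tag if there is one */
		return skb->vlan_tag_present ? skb->vlan_proto
					     : skb->protocol;
	/* skip_vlan: the real helper also walks any in-band tags with
	 * a bounded loop; this model covers only the accelerated one */
	return skb->protocol;
}
```

The point of the patch series is that both call sites get consistent answers regardless of whether acceleration stripped the tag.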
* Re: [Cake] [PATCH net v3] sched: consistently handle layer3 header accesses in the presence of VLANs
2020-07-03 21:35 1% ` David Miller
@ 2020-07-04 3:24 1% ` Toshiaki Makita
2020-07-04 11:33 0% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: Toshiaki Makita @ 2020-07-04 3:24 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: davem, netdev, bpf, cake, Davide Caratti, Jiri Pirko,
Jamal Hadi Salim, Cong Wang, Roman Mashak, Lawrence Brakmo,
Ilya Ponetayev
On 2020/07/04 5:26, Toke Høiland-Jørgensen wrote:
...
> +/* A getter for the SKB protocol field which will handle VLAN tags consistently
> + * whether VLAN acceleration is enabled or not.
> + */
> +static inline __be16 skb_protocol(const struct sk_buff *skb, bool skip_vlan)
> +{
> + unsigned int offset = skb_mac_offset(skb) + sizeof(struct ethhdr);
> + __be16 proto = skb->protocol;
> +
> + if (!skip_vlan)
> + /* VLAN acceleration strips the VLAN header from the skb and
> + * moves it to skb->vlan_proto
> + */
> + return skb_vlan_tag_present(skb) ? skb->vlan_proto : proto;
> +
> + while (eth_type_vlan(proto)) {
> + struct vlan_hdr vhdr, *vh;
> +
> + vh = skb_header_pointer(skb, offset, sizeof(vhdr), &vhdr);
> + if (!vh)
> + break;
> +
> + proto = vh->h_vlan_encapsulated_proto;
> + offset += sizeof(vhdr);
> + }
Why don't you use __vlan_get_protocol() here? It looks quite similar.
Is there any problem with using that?
Toshiaki Makita
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net v3] sched: consistently handle layer3 header accesses in the presence of VLANs
@ 2020-07-03 21:35 1% ` David Miller
2020-07-04 3:24 1% ` Toshiaki Makita
1 sibling, 0 replies; 200+ results
From: David Miller @ 2020-07-03 21:35 UTC (permalink / raw)
To: toke
Cc: netdev, bpf, cake, dcaratti, jiri, jhs, xiyou.wangcong, mrv,
brakmo, i.ponetaev
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Fri, 3 Jul 2020 22:26:43 +0200
> There are a couple of places in net/sched/ that check skb->protocol and act
> on the value there. However, in the presence of VLAN tags, the value stored
> in skb->protocol can be inconsistent based on whether VLAN acceleration is
> enabled. The commit quoted in the Fixes tag below fixed the users of
> skb->protocol to use a helper that will always see the VLAN ethertype.
>
> However, most of the callers don't actually handle the VLAN ethertype, but
> expect to find the IP header type in the protocol field. This means that
> things like changing the ECN field, or parsing diffserv values, stops
> working if there's a VLAN tag, or if there are multiple nested VLAN
> tags (QinQ).
>
> To fix this, change the helper to take an argument that indicates whether
> the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
> make sure to skip all of them, so behaviour is consistent even in QinQ
> mode.
>
> To make the helper usable from the ECN code, move it to if_vlan.h instead
> of pkt_sched.h.
>
> v3:
> - Remove empty lines
> - Move vlan variable definitions inside loop in skb_protocol()
> - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
> bpf_skb_ecn_set_ce()
>
> v2:
> - Use eth_type_vlan() helper in skb_protocol()
> - Also fix code that reads skb->protocol directly
> - Change a couple of 'if/else if' statements to switch constructs to avoid
> calling the helper twice
>
> Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
> Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Looks good, applied and queued up for -stable.
Thanks!
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net v2] sched: consistently handle layer3 header accesses in the presence of VLANs
2020-07-03 19:19 1% ` Cong Wang
@ 2020-07-03 20:09 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-03 20:09 UTC (permalink / raw)
To: Cong Wang
Cc: David Miller, Linux Kernel Network Developers, Cake List,
Davide Caratti, Jiri Pirko, Jamal Hadi Salim, Ilya Ponetayev
Cong Wang <xiyou.wangcong@gmail.com> writes:
> On Fri, Jul 3, 2020 at 8:22 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>> index b05e855f1ddd..d0c1cb0d264d 100644
>> --- a/include/linux/if_vlan.h
>> +++ b/include/linux/if_vlan.h
>> @@ -308,6 +308,35 @@ static inline bool eth_type_vlan(__be16 ethertype)
>> }
>> }
>>
>> +/* A getter for the SKB protocol field which will handle VLAN tags consistently
>> + * whether VLAN acceleration is enabled or not.
>> + */
>> +static inline __be16 skb_protocol(const struct sk_buff *skb, bool skip_vlan)
>> +{
>> + unsigned int offset = skb_mac_offset(skb) + sizeof(struct ethhdr);
>> + __be16 proto = skb->protocol;
>> + struct vlan_hdr vhdr, *vh;
>> +
>> + if (!skip_vlan)
>> + /* VLAN acceleration strips the VLAN header from the skb and
>> + * moves it to skb->vlan_proto
>> + */
>> + return skb_vlan_tag_present(skb) ? skb->vlan_proto : proto;
>> +
>> + while (eth_type_vlan(proto)) {
>> + vh = skb_header_pointer(skb, offset, sizeof(vhdr), &vhdr);
>> + if (!vh)
>> + break;
>> +
>> + proto = vh->h_vlan_encapsulated_proto;
>> + offset += sizeof(vhdr);
>> + }
>> +
>> + return proto;
>> +}
>> +
>> +
>> +
>
> Just nit: too many newlines here. Please run checkpatch.pl.
Hmm, I did run checkpatch, but it seems it only complains about multiple
newlines when run with --strict. Will fix, thanks! :)
>> static inline bool vlan_hw_offload_capable(netdev_features_t features,
>> __be16 proto)
>> {
>> diff --git a/include/net/inet_ecn.h b/include/net/inet_ecn.h
>> index 0f0d1efe06dd..82763ba597f2 100644
>> --- a/include/net/inet_ecn.h
>> +++ b/include/net/inet_ecn.h
>> @@ -4,6 +4,7 @@
>>
>> #include <linux/ip.h>
>> #include <linux/skbuff.h>
>> +#include <linux/if_vlan.h>
>>
>> #include <net/inet_sock.h>
>> #include <net/dsfield.h>
>> @@ -172,7 +173,7 @@ static inline void ipv6_copy_dscp(unsigned int dscp, struct ipv6hdr *inner)
>>
>> static inline int INET_ECN_set_ce(struct sk_buff *skb)
>> {
>> - switch (skb->protocol) {
>> + switch (skb_protocol(skb, true)) {
>> case cpu_to_be16(ETH_P_IP):
>> if (skb_network_header(skb) + sizeof(struct iphdr) <=
>> skb_tail_pointer(skb))
>> @@ -191,7 +192,7 @@ static inline int INET_ECN_set_ce(struct sk_buff *skb)
>>
>> static inline int INET_ECN_set_ect1(struct sk_buff *skb)
>> {
>> - switch (skb->protocol) {
>> + switch (skb_protocol(skb, true)) {
>> case cpu_to_be16(ETH_P_IP):
>> if (skb_network_header(skb) + sizeof(struct iphdr) <=
>> skb_tail_pointer(skb))
>
> These two helpers are called by non-net_sched code too; are you sure
> your change is correct for them as well?
>
> For example, IP6_ECN_decapsulate() uses skb->protocol then calls
> INET_ECN_decapsulate() which calls the above, after your change
> they use skb_protocol(). This looks inconsistent to me.
Good point. I'll change IP{,6}_ECN_decapsulate() to also use
skb_protocol().
-Toke
^ permalink raw reply [relevance 0%]
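For context on the ECN side of this exchange, the CE-marking rule that INET_ECN_set_ce() implements can be modelled on a bare TOS byte (a sketch only: the real kernel code also distinguishes IPv4 from IPv6 and fixes up the IPv4 header checksum, both omitted here):

```c
#include <assert.h>
#include <stdint.h>

/* The ECN field is the low two bits of the IPv4 TOS / IPv6
 * traffic-class byte (RFC 3168 codepoints). Only packets that
 * advertised ECN capability, ECT(0) or ECT(1), may be marked CE;
 * Not-ECT packets must be dropped by the AQM instead. */
#define ECN_MASK    0x03
#define ECN_NOT_ECT 0x00
#define ECN_ECT_1   0x01
#define ECN_ECT_0   0x02
#define ECN_CE      0x03

/* Returns 1 if CE was set, 0 if the packet is not ECN-capable. */
static int set_ce(uint8_t *tos)
{
	if ((*tos & ECN_MASK) == ECN_NOT_ECT)
		return 0;
	*tos |= ECN_CE;   /* ECT(0), ECT(1), and CE all become CE */
	return 1;
}
```

The bug being fixed is that this byte was never reached on VLAN-tagged packets, because the callers looked at skb->protocol, saw a VLAN ethertype, and bailed out.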
* Re: [Cake] [PATCH net v2] sched: consistently handle layer3 header accesses in the presence of VLANs
@ 2020-07-03 19:19 1% ` Cong Wang
2020-07-03 20:09 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Cong Wang @ 2020-07-03 19:19 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: David Miller, Linux Kernel Network Developers, Cake List,
Davide Caratti, Jiri Pirko, Jamal Hadi Salim, Ilya Ponetayev
On Fri, Jul 3, 2020 at 8:22 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
> index b05e855f1ddd..d0c1cb0d264d 100644
> --- a/include/linux/if_vlan.h
> +++ b/include/linux/if_vlan.h
> @@ -308,6 +308,35 @@ static inline bool eth_type_vlan(__be16 ethertype)
> }
> }
>
> +/* A getter for the SKB protocol field which will handle VLAN tags consistently
> + * whether VLAN acceleration is enabled or not.
> + */
> +static inline __be16 skb_protocol(const struct sk_buff *skb, bool skip_vlan)
> +{
> + unsigned int offset = skb_mac_offset(skb) + sizeof(struct ethhdr);
> + __be16 proto = skb->protocol;
> + struct vlan_hdr vhdr, *vh;
> +
> + if (!skip_vlan)
> + /* VLAN acceleration strips the VLAN header from the skb and
> + * moves it to skb->vlan_proto
> + */
> + return skb_vlan_tag_present(skb) ? skb->vlan_proto : proto;
> +
> + while (eth_type_vlan(proto)) {
> + vh = skb_header_pointer(skb, offset, sizeof(vhdr), &vhdr);
> + if (!vh)
> + break;
> +
> + proto = vh->h_vlan_encapsulated_proto;
> + offset += sizeof(vhdr);
> + }
> +
> + return proto;
> +}
> +
> +
> +
Just nit: too many newlines here. Please run checkpatch.pl.
> static inline bool vlan_hw_offload_capable(netdev_features_t features,
> __be16 proto)
> {
> diff --git a/include/net/inet_ecn.h b/include/net/inet_ecn.h
> index 0f0d1efe06dd..82763ba597f2 100644
> --- a/include/net/inet_ecn.h
> +++ b/include/net/inet_ecn.h
> @@ -4,6 +4,7 @@
>
> #include <linux/ip.h>
> #include <linux/skbuff.h>
> +#include <linux/if_vlan.h>
>
> #include <net/inet_sock.h>
> #include <net/dsfield.h>
> @@ -172,7 +173,7 @@ static inline void ipv6_copy_dscp(unsigned int dscp, struct ipv6hdr *inner)
>
> static inline int INET_ECN_set_ce(struct sk_buff *skb)
> {
> - switch (skb->protocol) {
> + switch (skb_protocol(skb, true)) {
> case cpu_to_be16(ETH_P_IP):
> if (skb_network_header(skb) + sizeof(struct iphdr) <=
> skb_tail_pointer(skb))
> @@ -191,7 +192,7 @@ static inline int INET_ECN_set_ce(struct sk_buff *skb)
>
> static inline int INET_ECN_set_ect1(struct sk_buff *skb)
> {
> - switch (skb->protocol) {
> + switch (skb_protocol(skb, true)) {
> case cpu_to_be16(ETH_P_IP):
> if (skb_network_header(skb) + sizeof(struct iphdr) <=
> skb_tail_pointer(skb))
These two helpers are called by non-net_sched code too; are you sure
your change is correct for them as well?
For example, IP6_ECN_decapsulate() uses skb->protocol then calls
INET_ECN_decapsulate() which calls the above, after your change
they use skb_protocol(). This looks inconsistent to me.
Thanks.
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net] sched: consistently handle layer3 header accesses in the presence of VLANs
2020-07-03 12:53 1% ` Davide Caratti
@ 2020-07-03 14:37 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-07-03 14:37 UTC (permalink / raw)
To: Davide Caratti, davem
Cc: netdev, cake, Jiri Pirko, Jamal Hadi Salim, Ilya Ponetayev, Cong Wang
Davide Caratti <dcaratti@redhat.com> writes:
> hello Toke,
>
> thanks for answering!
>
> On Fri, 2020-07-03 at 14:05 +0200, Toke Høiland-Jørgensen wrote:
>> while (proto == htons(ETH_P_8021Q) || proto == htons(ETH_P_8021AD)) {
>
> maybe this line can be shortened, since if_vlan.h has [1]:
>
> while (eth_type_vlan(proto)) {
> ...
> }
Good point, missed that! Will fix and send a v2.
> If I read this correctly, the biggest change from a functional point of view is that
> now qdiscs can set the ECN bit also on non-accelerated VLAN packets and
> QinQ-tagged packets, if the IP header is the outer-most header after VLAN;
> and the same applies to almost all net/sched former users of skb->protocol
> or tc_skb_protocol().
Yup, that's the idea.
> Question (sorry in advance because it might be a dumb one :) ):
>
> do you know why cls_flower, act_ct, act_mpls and act_connmark keep reading
> skb->protocol? is that intentional?
Hmm, no not really. I only checked for calls to tc_skb_protocol(), not
for direct uses of skb->protocol. Will fix those as well :)
> (for act_mpls that doesn't look intentional, and probably the result is
> that the BOS bit is not set correctly if someone tries to push/pop a label
> for a non-accelerated or QinQ packet. But I didn't try it experimentally
> :) )
Hmm, you're certainly right that the MPLS code should use the helper to
get consistent use between accelerated/non-accelerated VLAN usage. But I
don't know enough about MPLS to judge whether it should be skipping the
VLAN tags or not. Sounds like you're saying the right thing is to skip
the VLAN tags there as well?
Looking at the others, it looks like act_connmark and act_ct both ought
to skip VLAN tags, while act_flower should probably keep it, since it
seems it has a VLAN match type. Or?
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net] sched: consistently handle layer3 header accesses in the presence of VLANs
@ 2020-07-03 12:53 1% ` Davide Caratti
2020-07-03 14:37 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Davide Caratti @ 2020-07-03 12:53 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, davem
Cc: netdev, cake, Jiri Pirko, Jamal Hadi Salim, Ilya Ponetayev, Cong Wang
hello Toke,
thanks for answering!
On Fri, 2020-07-03 at 14:05 +0200, Toke Høiland-Jørgensen wrote:
> while (proto == htons(ETH_P_8021Q) || proto == htons(ETH_P_8021AD)) {
maybe this line can be shortened, since if_vlan.h has [1]:
while (eth_type_vlan(proto)) {
...
}
If I read this correctly, the biggest change from a functional point of view is that
now qdiscs can set the ECN bit also on non-accelerated VLAN packets and
QinQ-tagged packets, if the IP header is the outer-most header after VLAN;
and the same applies to almost all net/sched former users of skb->protocol
or tc_skb_protocol().
Question (sorry in advance because it might be a dumb one :) ):
do you know why cls_flower, act_ct, act_mpls and act_connmark keep reading
skb->protocol? is that intentional?
(for act_mpls that doesn't look intentional, and probably the result is
that the BOS bit is not set correctly if someone tries to push/pop a label
for a non-accelerated or QinQ packet. But I didn't try it experimentally
:) )
[1] https://elixir.bootlin.com/linux/latest/source/include/linux/if_vlan.h#L300
--
davide
^ permalink raw reply [relevance 1%]
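The VLAN-priority classification Davide describes reads the PCP bits out of the 16-bit tag control information field. The 802.1Q layout — PCP in the top 3 bits, DEI next, then the 12-bit VID — can be sketched with a few illustrative helpers (not kernel API):

```c
#include <assert.h>
#include <stdint.h>

/* 802.1Q TCI field layout: | PCP (3) | DEI (1) | VID (12) | */
static unsigned tci_pcp(uint16_t tci) { return (tci >> 13) & 0x7; }
static unsigned tci_dei(uint16_t tci) { return (tci >> 12) & 0x1; }
static unsigned tci_vid(uint16_t tci) { return tci & 0x0fff; }
```

A filter that maps DSCP to PCP, as suggested above, would write these same bits; a classifier prioritizing a voice VLAN like Davide's would match on the VID.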
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-26 18:52 1% ` Davide Caratti
@ 2020-06-29 10:27 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-06-29 10:27 UTC (permalink / raw)
To: Davide Caratti, David Miller; +Cc: cake, netdev, Jiri Pirko, Jamal Hadi Salim
Davide Caratti <dcaratti@redhat.com> writes:
> hi Toke,
>
> thanks for answering.
>
> On Fri, 2020-06-26 at 14:52 +0200, Toke Høiland-Jørgensen wrote:
>> Davide Caratti <dcaratti@redhat.com> writes:
>
> [...]
>
>> >
>> > > I guess I can trying going through them all and figuring out if
>> > > there's a more generic solution.
>> >
>> > For sch_cake, I think that the qdisc shouldn't look at the IP header when
>> > it schedules packets having a VLAN tag.
>> >
>> > Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
>> > should look at the VLAN priority (PCP) bits (and that's something that
>> > cake doesn't do currently - but I have a small patch in my stash that
>> > implements this: please let me know if you are interested in seeing it :)
>> > ).
>> >
>> > Then, to ensure that the IP precedence is respected, even with different
>> > VLAN tags, users should explicitly configure TC filters that "map" the
>> > DSCP value to a PCP value. This would ensure that configured priority is
>> > respected by the scheduler, and would also be flexible enough to allow
>> > different "mappings".
>>
>> I think you have this the wrong way around :)
>>
>> I.e., classifying based on VLAN priority is even more esoteric than
>> using diffserv markings,
>
> is it so uncommon? I knew that almost every wifi card did something
> similar with 802.11 'access categories'. More generally, I'm not sure if
> it's ok to ignore any QoS information present in the L2 header. Anyway,
>
>> so that should not be the default. Making it
>> the default would also make the behaviour change for the same traffic if
>> there's a VLAN tag present, which is bound to confuse people. I suppose
>> it could be an option, but not really sure that's needed, since as you
>> say you could just implement it with your own TC filters...
>
> you caught me :) ,
>
> I wrote that patch in my stash to fix cake on my home router, where voice
> and data are encapsulated in IP over PPPoE over VLANs, and different
> services go over different VLAN ids (one VLAN dedicated for voice, the
> other one for data) [1]. The quickest thing I did was: to prioritize
> packets having VLAN id equal to 1035.
>
> Now that I look at cake code again (where again means: after almost 1
> year) it would be probably better to assign skb->priority using flower +
> act_skbedit, and then prioritize in the qdisc: if I read the code well,
> this would avoid voice and data falling into the same traffic class (that
> was my original problem).
>
> please note: I didn't try this patch - but I think that even with this
> code I would have voice and data mixed together, because there is PPPoE
> between VLAN and IP.
>
>> > Sure, my proposal does not cover the problem of mangling the CE bit
>> > inside VLAN-tagged packets, i.e. if we should understand if qdiscs
>> > should allow it or not.
>>
>> Hmm, yeah, that's the rub, isn't it? I think this is related to this
>> commit, which first introduced tc_skb_protocol():
>>
>> d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
>>
>> That commit at least made the behaviour consistent across
>> accelerated/non-accelerated VLANs. However, the patch description
>> asserts that 'tc code .. expects vlan protocol type [in skb->protocol]'.
>> Looking at the various callers, I'm not actually sure that's true, in
>> the sense that most of the callers don't handle VLAN ethertypes at all,
>> but expects to find an IP header. This is the case for at least:
>>
>> - act_ctinfo
>> - act_skbedit
>> - cls_flow
>> - em_ipset
>> - em_ipt
>> - sch_cake
>> - sch_dsmark
>
sure, I'm not saying it's not possible to look inside IP headers. What I
understood from Cong's replies [2], and he sort-of convinced me, was: when
I have IP and one or more VLAN tags, no matter whether it is accelerated
or not, it should be sufficient to access the IP header by daisy-chaining
'act_vlan pop' actions -> access to the IP header -> 'act_vlan push'
actions (in reverse order).
>
> oh well, that's still not sufficient in my home router because of PPPoE. I
> should practice with cls_bpf more seriously :-)
>
> or write act_pppoe.c :D
>
>> In fact the only caller that explicitly handles a VLAN ethertype seems
>> to be act_csum; and that handles it in a way that also just skips the
>> VLAN headers, albeit by skb_pull()'ing the header.
>
>
>> cls_api, em_meta and sch_teql don't explicitly handle it; but returning
>> the VLAN ethertypes to those does seem to make sense, since they just
>> pass the value somewhere else.
>>
>> So my suggestion would be to add a new helper that skips the VLAN tags
>> and finds the L3 ethertype (i.e., basically cake_skb_proto() as
>> introduced in this patch), then switch all the callers listed above, as
>> well as the INET_ECN_set_ce() over to using that. Maybe something like
>> 'skb_l3_protocol()' which could go into skbuff.h itself, so the ECN code
>> can also find it?
>
> for setting the CE bit, that's understandable - in one way or the other,
> the behaviour should be made consistent.
>
>> Any objections to this? It's not actually clear to me how the discussion
>> you quoted above landed; but this will at least make things consistent
>> across all the different actions/etc.
>
> well, it just didn't "land". But I agree, inconsistency here can make some
> TC configurations "unreliable" (i.e., they don't do the job they were
> configured for).
Right, I'll send a patch to try to make all this consistent :)
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-26 14:59 0% ` Sebastian Moeller
@ 2020-06-26 22:00 1% ` Stephen Hemminger
1 sibling, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-06-26 22:00 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Davide Caratti, cake, netdev, David Miller
On Fri, 26 Jun 2020 16:11:49 +0300
Jonathan Morton <chromatix99@gmail.com> wrote:
> Toke has already replied, but:
>
> > Sure, my proposal does not cover the problem of mangling the CE bit inside
> > VLAN-tagged packets, i.e. if we should understand if qdiscs should allow
> > it or not.
>
> This is clearly wrong-headed by itself.
>
> Everything I've heard about VLAN tags thus far indicates that they should be *transparent* to nodes which don't care about them; they determine where the packet goes within the LAN, but not how it behaves. In particular this means that AQM should be able to apply congestion control signals to them in the normal way, by modifying the ECN field of the IP header encapsulated within.
>
> The most I would entertain is to incorporate a VLAN tag into the hashes that Cake uses to distinguish hosts and/or flows. This would account for the case where two hosts on different VLANs of the same physical network have the same IP address.
>
> - Jonathan Morton
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
The implementation of VLANs is awkward/flawed. The outer VLAN tag is transparent,
but the inner one is visible. Similarly, the outer VLAN tag doesn't count towards
the MTU, but the inner one does.
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-26 12:52 0% ` Toke Høiland-Jørgensen
2020-06-26 14:01 1% ` Jamal Hadi Salim
@ 2020-06-26 18:52 1% ` Davide Caratti
2020-06-29 10:27 0% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: Davide Caratti @ 2020-06-26 18:52 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, David Miller
Cc: cake, netdev, Jiri Pirko, Jamal Hadi Salim
hi Toke,
thanks for answering.
On Fri, 2020-06-26 at 14:52 +0200, Toke Høiland-Jørgensen wrote:
> Davide Caratti <dcaratti@redhat.com> writes:
[...]
> >
>> > I guess I can try going through them all and figuring out if
> > > there's a more generic solution.
> >
> > For sch_cake, I think that the qdisc shouldn't look at the IP header when
> > it schedules packets having a VLAN tag.
> >
> > Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
> > should look at the VLAN priority (PCP) bits (and that's something that
> > cake doesn't do currently - but I have a small patch in my stash that
> > implements this: please let me know if you are interested in seeing it :)
> > ).
> >
> > Then, to ensure that the IP precedence is respected, even with different
> > VLAN tags, users should explicitly configure TC filters that "map" the
> > DSCP value to a PCP value. This would ensure that configured priority is
> > respected by the scheduler, and would also be flexible enough to allow
> > different "mappings".
>
> I think you have this the wrong way around :)
>
> I.e., classifying based on VLAN priority is even more esoteric than
> using diffserv markings,
Is it so uncommon? I thought almost every wifi card did something
similar with 802.11 'access categories'. More generally, I'm not sure if
it's OK to ignore any QoS information present in the L2 header. Anyway,
> so that should not be the default. Making it
> the default would also make the behaviour change for the same traffic if
> there's a VLAN tag present, which is bound to confuse people. I suppose
> it could be an option, but not really sure that's needed, since as you
> say you could just implement it with your own TC filters...
you caught me :) ,
I wrote that patch in my stash to fix cake on my home router, where voice
and data are encapsulated in IP over PPPoE over VLANs, and different
services go over different VLAN ids (one VLAN dedicated for voice, the
other one for data) [1]. The quickest thing I did was: to prioritize
packets having VLAN id equal to 1035.
Now that I look at the cake code again (where again means: after almost 1
year), it would probably be better to assign skb->priority using flower +
act_skbedit, and then prioritize in the qdisc: if I read the code well,
this would avoid voice and data falling into the same traffic class (that
was my original problem).
please note: I didn't try this patch - but I think that even with this
code I would have voice and data mixed together, because there is PPPoE
between VLAN and IP.
> > Sure, my proposal does not cover the problem of mangling the CE bit
> > inside VLAN-tagged packets, i.e. if we should understand if qdiscs
> > should allow it or not.
>
> Hmm, yeah, that's the rub, isn't it? I think this is related to this
> commit, which first introduced tc_skb_protocol():
>
> d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
>
> That commit at least made the behaviour consistent across
> accelerated/non-accelerated VLANs. However, the patch description
> asserts that 'tc code .. expects vlan protocol type [in skb->protocol]'.
> Looking at the various callers, I'm not actually sure that's true, in
> the sense that most of the callers don't handle VLAN ethertypes at all,
> but expect to find an IP header. This is the case for at least:
>
> - act_ctinfo
> - act_skbedit
> - cls_flow
> - em_ipset
> - em_ipt
> - sch_cake
> - sch_dsmark
sure, I'm not saying it's not possible to look inside IP headers. What I
understood from Cong's replies [2], and he sort-of convinced me, was: when
I have IP and one or more VLAN tags, no matter whether it is accelerated
or not, it should be sufficient to access the IP header by daisy-chaining:
'act_vlan pop' actions -> access to the IP header -> 'act_vlan push'
actions (in the reverse order).
oh well, that's still not sufficient in my home router because of PPPoE. I
should practice with cls_bpf more seriously :-)
or write act_pppoe.c :D
> In fact the only caller that explicitly handles a VLAN ethertype seems
> to be act_csum; and that handles it in a way that also just skips the
> VLAN headers, albeit by skb_pull()'ing the header.
> cls_api, em_meta and sch_teql don't explicitly handle it; but returning
> the VLAN ethertypes to those does seem to make sense, since they just
> pass the value somewhere else.
>
> So my suggestion would be to add a new helper that skips the VLAN tags
> and finds the L3 ethertype (i.e., basically cake_skb_proto() as
> introduced in this patch), then switch all the callers listed above, as
> well as the INET_ECN_set_ce() over to using that. Maybe something like
> 'skb_l3_protocol()' which could go into skbuff.h itself, so the ECN code
> can also find it?
for setting the CE bit, that's understandable - in one way or the other,
the behaviour should be made consistent.
> Any objections to this? It's not actually clear to me how the discussion
> you quoted above landed; but this will at least make things consistent
> across all the different actions/etc.
well, it just didn't "land". But I agree, inconsistency here can make some
TC configurations "unreliable" (i.e., they don't do the job they were
configured for).
thanks!
--
davide
[1] https://gist.github.com/teknoraver/9524e539061d0b1e9f8774aa96902082
(by the way, thanks to Matteo Croce for this :) )
[2] https://lore.kernel.org/netdev/CAM_iQpWir7R3AQ7KSeFA5QNXSPHGK-1Nc7WsRM1vhkFyxB5ekA@mail.gmail.com/
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-26 14:59 0% ` Sebastian Moeller
@ 2020-06-26 16:36 0% ` Jonathan Morton
0 siblings, 0 replies; 200+ results
From: Jonathan Morton @ 2020-06-26 16:36 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Cake List, dcaratti
> On 26 Jun, 2020, at 5:59 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> thinking this over, I wonder whether a hypothetical carrier-grade cake might not actually grow a classify-by-vlan-priority keyword to allow switching over to using VLAN priority tags instead of DSCPs? That would avoid tempting carriers to re-map deep-encapsulated DSCPs if they can just ignore them for good. And it scratches my pet itch that 3 bits of classification should be enough for >80% of the cases ;)
>
> What do you think?
If carriers could use Ethernet VLANs for internal purposes instead of DSCPs, I would count that as progress towards allowing DSCPs to carry end-to-end information. And if there's a desire for a software qdisc which fits that paradigm, then we can do a requirements analysis which might well lead to something useful being developed.
But that isn't going to be Cake. It'll be a different qdisc which might share some features and technology with Cake, but definitely arranged in a different order and with a different focus.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
@ 2020-06-26 14:59 0% ` Sebastian Moeller
2020-06-26 16:36 0% ` Jonathan Morton
2020-06-26 22:00 1% ` Stephen Hemminger
1 sibling, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-06-26 14:59 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List, dcaratti
Hi Jonathan,
thinking this over, I wonder whether a hypothetical carrier-grade cake might not actually grow a classify-by-vlan-priority keyword to allow switching over to using VLAN priority tags instead of DSCPs? That would avoid tempting carriers to re-map deep-encapsulated DSCPs if they can just ignore them for good. And it scratches my pet itch that 3 bits of classification should be enough for >80% of the cases ;)
What do you think?
Best Regards
Sebastian
P.S.: I reduced the CC list since I doubt that netdev is the right venue for mere hypotheticals ;)
> On Jun 26, 2020, at 15:11, Jonathan Morton <chromatix99@gmail.com> wrote:
>
> Toke has already replied, but:
>
>> Sure, my proposal does not cover the problem of mangling the CE bit inside
>> VLAN-tagged packets, i.e. if we should understand if qdiscs should allow
>> it or not.
>
> This is clearly wrong-headed by itself.
>
> Everything I've heard about VLAN tags thus far indicates that they should be *transparent* to nodes which don't care about them; they determine where the packet goes within the LAN, but not how it behaves. In particular this means that AQM should be able to apply congestion control signals to them in the normal way, by modifying the ECN field of the IP header encapsulated within.
>
> The most I would entertain is to incorporate a VLAN tag into the hashes that Cake uses to distinguish hosts and/or flows. This would account for the case where two hosts on different VLANs of the same physical network have the same IP address.
>
> - Jonathan Morton
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-26 12:52 0% ` Toke Høiland-Jørgensen
@ 2020-06-26 14:01 1% ` Jamal Hadi Salim
2020-06-26 18:52 1% ` Davide Caratti
1 sibling, 0 replies; 200+ results
From: Jamal Hadi Salim @ 2020-06-26 14:01 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, Davide Caratti, David Miller
Cc: cake, netdev, Jiri Pirko, Simon Horman
On 2020-06-26 8:52 a.m., Toke Høiland-Jørgensen wrote:
> Davide Caratti <dcaratti@redhat.com> writes:
>
>> hello,
>>
>> my 2 cents:
>>
>> On Thu, 2020-06-25 at 21:53 +0200, Toke Høiland-Jørgensen wrote:
>>> I think it depends a little on the use case; some callers actually care
>>> about the VLAN tags themselves and handle that specially (e.g.,
>>> act_csum).
>>
>> I remember that something similar was discussed about 1 year ago [1].
>
> Ah, thank you for the pointer!
>
>>> Whereas others (e.g., sch_dsmark) probably will have the same
>>> issue.
>>
>> I'd say that the issue "propagates" to all qdiscs that mangle the ECN-CE
>> bit (i.e., calling INET_ECN_set_ce() [2]), most notably all the RED
>> variants and "codel/fq_codel".
>
> Yeah, I think we should fix INET_ECN_set_ce() instead of re-implementing
> it in CAKE. See below, though.
>
>>> I guess I can try going through them all and figuring out if
>>> there's a more generic solution.
>>
>> For sch_cake, I think that the qdisc shouldn't look at the IP header when
>> it schedules packets having a VLAN tag.
>>
>> Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
>> should look at the VLAN priority (PCP) bits (and that's something that
>> cake doesn't do currently - but I have a small patch in my stash that
>> implements this: please let me know if you are interested in seeing it :)
>> ).
>>
>> Then, to ensure that the IP precedence is respected, even with different
>> VLAN tags, users should explicitly configure TC filters that "map" the
>> DSCP value to a PCP value. This would ensure that configured priority is
>> respected by the scheduler, and would also be flexible enough to allow
>> different "mappings".
>
> I think you have this the wrong way around :)
>
> I.e., classifying based on VLAN priority is even more esoteric than
> using diffserv markings, so that should not be the default. Making it
> the default would also make the behaviour change for the same traffic if
> there's a VLAN tag present, which is bound to confuse people. I suppose
> it could be an option, but not really sure that's needed, since as you
> say you could just implement it with your own TC filters...
>
>> Sure, my proposal does not cover the problem of mangling the CE bit
>> inside VLAN-tagged packets, i.e. if we should understand if qdiscs
>> should allow it or not.
>
> Hmm, yeah, that's the rub, isn't it? I think this is related to this
> commit, which first introduced tc_skb_protocol():
>
> d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
>
I didn't quite follow the discussion - but the patch you are referencing
was to fix an earlier commit which had broken things (we didn't
have the "fixes" tag back then).
> That commit at least made the behaviour consistent across
> accelerated/non-accelerated VLANs. However, the patch description
> asserts that 'tc code .. expects vlan protocol type [in skb->protocol]'.
> Looking at the various callers, I'm not actually sure that's true, in
> the sense that most of the callers don't handle VLAN ethertypes at all,
> but expect to find an IP header. This is the case for at least:
>
> - act_ctinfo
> - act_skbedit
> - cls_flow
> - em_ipset
> - em_ipt
> - sch_cake
> - sch_dsmark
>
> In fact the only caller that explicitly handles a VLAN ethertype seems
> to be act_csum; and that handles it in a way that also just skips the
> VLAN headers, albeit by skb_pull()'ing the header.
>
The earlier change broke a few things, unfortunately. There was a more
recent discussion with Simon Horman, which I can't find now, on breakage
with some classifiers in the presence of double VLANs.
+cc Simon - maybe he can find the discussion.
> cls_api, em_meta and sch_teql don't explicitly handle it; but returning
> the VLAN ethertypes to those does seem to make sense, since they just
> pass the value somewhere else.
>
> So my suggestion would be to add a new helper that skips the VLAN tags
> and finds the L3 ethertype (i.e., basically cake_skb_proto() as
> introduced in this patch), then switch all the callers listed above, as
> well as the INET_ECN_set_ce() over to using that. Maybe something like
> 'skb_l3_protocol()' which could go into skbuff.h itself, so the ECN code
> can also find it?
>
> Any objections to this? It's not actually clear to me how the discussion
> you quoted above landed; but this will at least make things consistent
> across all the different actions/etc.
>
I didn't follow the original discussion - will try to read in order
to form an opinion.
cheers,
jamal
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-26 8:27 1% ` Davide Caratti
@ 2020-06-26 12:52 0% ` Toke Høiland-Jørgensen
2020-06-26 14:01 1% ` Jamal Hadi Salim
2020-06-26 18:52 1% ` Davide Caratti
1 sibling, 2 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-06-26 12:52 UTC (permalink / raw)
To: Davide Caratti, David Miller; +Cc: cake, netdev, Jiri Pirko, Jamal Hadi Salim
Davide Caratti <dcaratti@redhat.com> writes:
> hello,
>
> my 2 cents:
>
> On Thu, 2020-06-25 at 21:53 +0200, Toke Høiland-Jørgensen wrote:
>> I think it depends a little on the use case; some callers actually care
>> about the VLAN tags themselves and handle that specially (e.g.,
>> act_csum).
>
> I remember that something similar was discussed about 1 year ago [1].
Ah, thank you for the pointer!
>> Whereas others (e.g., sch_dsmark) probably will have the same
>> issue.
>
> I'd say that the issue "propagates" to all qdiscs that mangle the ECN-CE
> bit (i.e., calling INET_ECN_set_ce() [2]), most notably all the RED
> variants and "codel/fq_codel".
Yeah, I think we should fix INET_ECN_set_ce() instead of re-implementing
it in CAKE. See below, though.
>> I guess I can try going through them all and figuring out if
>> there's a more generic solution.
>
> For sch_cake, I think that the qdisc shouldn't look at the IP header when
> it schedules packets having a VLAN tag.
>
> Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
> should look at the VLAN priority (PCP) bits (and that's something that
> cake doesn't do currently - but I have a small patch in my stash that
> implements this: please let me know if you are interested in seeing it :)
> ).
>
> Then, to ensure that the IP precedence is respected, even with different
> VLAN tags, users should explicitly configure TC filters that "map" the
> DSCP value to a PCP value. This would ensure that configured priority is
> respected by the scheduler, and would also be flexible enough to allow
> different "mappings".
I think you have this the wrong way around :)
I.e., classifying based on VLAN priority is even more esoteric than
using diffserv markings, so that should not be the default. Making it
the default would also make the behaviour change for the same traffic if
there's a VLAN tag present, which is bound to confuse people. I suppose
it could be an option, but not really sure that's needed, since as you
say you could just implement it with your own TC filters...
> Sure, my proposal does not cover the problem of mangling the CE bit
> inside VLAN-tagged packets, i.e. if we should understand if qdiscs
> should allow it or not.
Hmm, yeah, that's the rub, isn't it? I think this is related to this
commit, which first introduced tc_skb_protocol():
d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
That commit at least made the behaviour consistent across
accelerated/non-accelerated VLANs. However, the patch description
asserts that 'tc code .. expects vlan protocol type [in skb->protocol]'.
Looking at the various callers, I'm not actually sure that's true, in
the sense that most of the callers don't handle VLAN ethertypes at all,
but expect to find an IP header. This is the case for at least:
- act_ctinfo
- act_skbedit
- cls_flow
- em_ipset
- em_ipt
- sch_cake
- sch_dsmark
In fact the only caller that explicitly handles a VLAN ethertype seems
to be act_csum; and that handles it in a way that also just skips the
VLAN headers, albeit by skb_pull()'ing the header.
cls_api, em_meta and sch_teql don't explicitly handle it; but returning
the VLAN ethertypes to those does seem to make sense, since they just
pass the value somewhere else.
So my suggestion would be to add a new helper that skips the VLAN tags
and finds the L3 ethertype (i.e., basically cake_skb_proto() as
introduced in this patch), then switch all the callers listed above, as
well as the INET_ECN_set_ce() over to using that. Maybe something like
'skb_l3_protocol()' which could go into skbuff.h itself, so the ECN code
can also find it?
Any objections to this? It's not actually clear to me how the discussion
you quoted above landed; but this will at least make things consistent
across all the different actions/etc.
Adding in Jiri and Jamal as well since they were involved in the patch I
quoted above.
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-25 19:53 0% ` Toke Høiland-Jørgensen
2020-06-25 20:00 1% ` David Miller
@ 2020-06-26 8:27 1% ` Davide Caratti
2020-06-26 12:52 0% ` Toke Høiland-Jørgensen
1 sibling, 2 replies; 200+ results
From: Davide Caratti @ 2020-06-26 8:27 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, David Miller; +Cc: netdev, cake
hello,
my 2 cents:
On Thu, 2020-06-25 at 21:53 +0200, Toke Høiland-Jørgensen wrote:
> I think it depends a little on the use case; some callers actually care
> about the VLAN tags themselves and handle that specially (e.g.,
> act_csum).
I remember that something similar was discussed about 1 year ago [1].
> Whereas others (e.g., sch_dsmark) probably will have the same
> issue.
I'd say that the issue "propagates" to all qdiscs that mangle the ECN-CE
bit (i.e., calling INET_ECN_set_ce() [2]), most notably all the RED
variants and "codel/fq_codel".
> I guess I can try going through them all and figuring out if
> there's a more generic solution.
For sch_cake, I think that the qdisc shouldn't look at the IP header when
it schedules packets having a VLAN tag.
Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
should look at the VLAN priority (PCP) bits (and that's something that
cake doesn't do currently - but I have a small patch in my stash that
implements this: please let me know if you are interested in seeing it :)
).
Then, to ensure that the IP precedence is respected, even with different
VLAN tags, users should explicitly configure TC filters that "map" the
DSCP value to a PCP value. This would ensure that configured priority is
respected by the scheduler, and would also be flexible enough to allow
different "mappings".
Sure, my proposal does not cover the problem of mangling the CE bit inside
VLAN-tagged packets, i.e. the question of whether qdiscs should be allowed
to do it or not.
WDYT?
thank you in advance!
--
davide
[1] https://lore.kernel.org/netdev/CAM_iQpUmuHH8S35ERuJ-sFS=17aa-C8uHSWF-WF7toANX2edCQ@mail.gmail.com/#t
[2] https://elixir.bootlin.com/linux/latest/C/ident/INET_ECN_set_ce
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH RESEND net-next] sch_cake: add RFC 8622 LE PHB support to CAKE diffserv handling
@ 2020-06-25 23:31 1% ` David Miller
0 siblings, 0 replies; 200+ results
From: David Miller @ 2020-06-25 23:31 UTC (permalink / raw)
To: toke; +Cc: ldir, netdev, cake
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 25 Jun 2020 22:18:00 +0200
> From: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
>
> Change tin mapping on diffserv3, 4 & 8 for LE PHB support, in essence
> making LE a member of the Bulk tin.
>
> Bulk has the least priority and minimum of 1/16th total bandwidth in the
> face of higher priority traffic.
>
> NB: Diffserv 3 & 4 swap tin 0 & 1 priorities from the default order as
> found in diffserv8, in case anyone is wondering why it looks a bit odd.
>
> Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
> [ reword commit message slightly ]
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Applied, thanks!
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net 0/3] sched: A couple of fixes for sch_cake
@ 2020-06-25 23:25 1% ` David Miller
0 siblings, 0 replies; 200+ results
From: David Miller @ 2020-06-25 23:25 UTC (permalink / raw)
To: toke; +Cc: netdev, cake
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 25 Jun 2020 22:12:06 +0200
> This series contains a couple of fixes for diffserv handling in sch_cake that
> provide a nice speedup (with a somewhat pedantic nit fix tacked on to the end).
>
> Not quite sure about whether this should go to stable; it does provide a nice
> speedup, but it's not strictly a fix in the "correctness" sense. I lean towards
> including this in stable as well, since our most important consumer of that
> (OpenWrt) is likely to backport the series anyway.
Series applied and queued up for -stable, thanks.
^ permalink raw reply [relevance 1%]
* Re: [Cake] Why are target & interval increased on the reduced bandwidth tins?
2020-06-25 13:40 0% ` Kevin Darbyshire-Bryant
@ 2020-06-25 20:42 0% ` Jonathan Morton
0 siblings, 0 replies; 200+ results
From: Jonathan Morton @ 2020-06-25 20:42 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Sebastian Moeller, Cake List
> On 25 Jun, 2020, at 4:40 pm, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> So the scenario I have in my head says that BK traffic could burst at the full bandwidth rate (or higher), filling upstream ISP buffers and thus inducing delays on all other traffic, because "we think it's a slow link and have high interval and target values", delaying our response to the burst. Whereas if we retain the default interval & target from the true link capacity calculation, we'll jump on it in time.
You might be forgetting about ack clocking. This gives the sender information about how quickly data is reaching the receiver, and normally the sender will generate either one or two packets for each packet acked. So even without an immediate AQM action, there is still *some* restraint on the sender's behaviour within approximately one RTT.
This is one of the many subtle factors that Codel relies on.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-25 19:53 0% ` Toke Høiland-Jørgensen
@ 2020-06-25 20:00 1% ` David Miller
2020-06-26 8:27 1% ` Davide Caratti
1 sibling, 0 replies; 200+ results
From: David Miller @ 2020-06-25 20:00 UTC (permalink / raw)
To: toke; +Cc: netdev, cake
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 25 Jun 2020 21:53:53 +0200
> I think it depends a little on the use case; some callers actually care
> about the VLAN tags themselves and handle that specially (e.g.,
> act_csum). Whereas others (e.g., sch_dsmark) probably will have the same
> issue. I guess I can try going through them all and figuring out if
> there's a more generic solution.
That makes sense.
> I'll split out the diffserv parsing fixes and send those for your net
> tree straight away, then circle back to this one...
Great, thank you.
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
2020-06-25 19:29 1% ` David Miller
@ 2020-06-25 19:53 0% ` Toke Høiland-Jørgensen
2020-06-25 20:00 1% ` David Miller
2020-06-26 8:27 1% ` Davide Caratti
0 siblings, 2 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-06-25 19:53 UTC (permalink / raw)
To: David Miller; +Cc: netdev, cake
David Miller <davem@davemloft.net> writes:
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 25 Jun 2020 13:55:03 +0200
>
>> From: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
>>
>> CAKE was using the return value of tc_skb_protocol() and expecting it to be
>> the IP protocol type. This can fail in the presence of QinQ VLAN tags,
>> making CAKE unable to handle ECN marking and diffserv parsing in this case.
>> Fix this by implementing our own version of tc_skb_protocol(), which will
>> use skb->protocol directly, but also parse and skip over any VLAN tags and
>> return the inner protocol number instead.
>>
>> Also fix CE marking by implementing a version of INET_ECN_set_ce() that
>> uses the same parsing routine.
>>
>> Fixes: ea82511518f4 ("sch_cake: Add NAT awareness to packet classifier")
>> Fixes: b2100cc56fca ("sch_cake: Use tc_skb_protocol() helper for getting packet protocol")
>> Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
>> Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
>> [ squash original two patches, rewrite commit message ]
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> First, this is a bug fix and should probably be steered to 'net'.
>
> Also, other users of tc_skb_protocol() are almost certainly hitting a
> similar problem aren't they? Maybe fix this generically.
I think it depends a little on the use case; some callers actually care
about the VLAN tags themselves and handle that specially (e.g.,
act_csum). Whereas others (e.g., sch_dsmark) probably will have the same
issue. I guess I can try going through them all and figuring out if
there's a more generic solution.
I'll split out the diffserv parsing fixes and send those for your net
tree straight away, then circle back to this one...
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 0/5] sched: A series of fixes and optimisations for sch_cake
2020-06-25 19:31 1% ` [Cake] [PATCH net-next 0/5] sched: A series of fixes and optimisations for sch_cake David Miller
@ 2020-06-25 19:49 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-06-25 19:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev, cake
David Miller <davem@davemloft.net> writes:
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 25 Jun 2020 13:55:02 +0200
>
>> The first three patches in the series are candidates for inclusion into stable.
>
> Stable candidates, ie. fixes, should target 'net' not 'net-next'.
Right, sure, can do; I was just being lazy and putting everything in one
series :)
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net-next 0/5] sched: A series of fixes and optimisations for sch_cake
@ 2020-06-25 19:31 1% ` David Miller
2020-06-25 19:49 0% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: David Miller @ 2020-06-25 19:31 UTC (permalink / raw)
To: toke; +Cc: netdev, cake
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 25 Jun 2020 13:55:02 +0200
> The first three patches in the series are candidates for inclusion into stable.
Stable candidates, ie. fixes, should target 'net' not 'net-next'.
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags
@ 2020-06-25 19:29 1% ` David Miller
2020-06-25 19:53 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: David Miller @ 2020-06-25 19:29 UTC (permalink / raw)
To: toke; +Cc: netdev, cake
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 25 Jun 2020 13:55:03 +0200
> From: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
>
> CAKE was using the return value of tc_skb_protocol() and expecting it to be
> the IP protocol type. This can fail in the presence of QinQ VLAN tags,
> making CAKE unable to handle ECN marking and diffserv parsing in this case.
> Fix this by implementing our own version of tc_skb_protocol(), which will
> use skb->protocol directly, but also parse and skip over any VLAN tags and
> return the inner protocol number instead.
>
> Also fix CE marking by implementing a version of INET_ECN_set_ce() that
> uses the same parsing routine.
>
> Fixes: ea82511518f4 ("sch_cake: Add NAT awareness to packet classifier")
> Fixes: b2100cc56fca ("sch_cake: Use tc_skb_protocol() helper for getting packet protocol")
> Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
> Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
> [ squash original two patches, rewrite commit message ]
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
First, this is a bug fix and should probably be steered to 'net'.
Also, other users of tc_skb_protocol() are almost certainly hitting a
similar problem, aren't they? Maybe fix this generically.
^ permalink raw reply [relevance 1%]
* Re: [Cake] Why are target & interval increased on the reduced bandwidth tins?
2020-06-24 14:40 0% ` Sebastian Moeller
@ 2020-06-25 13:40 0% ` Kevin Darbyshire-Bryant
2020-06-25 20:42 0% ` Jonathan Morton
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-06-25 13:40 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 1727 bytes --]
> On 24 Jun 2020, at 15:40, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Kevin,
>
> so, the way codel is designed, target is best understood as a function of interval (allowing 5-10% of interval as standing queue gives a fine trade-off between bandwidth utilization and latency under load).
> Now, interval is basically akin to the time you are willing to give a flow to react to signals; it should be in the same order of magnitude as the path RTT. Reducing the bandwidth allocation for a traffic class will increase its saturation-load RTT, and hence increasing the interval seems justified; target just follows along due to still wanting a reasonable bandwidth/latency trade-off.
> So in short these scale the shaper to work well under loaded conditions. But Jonathan & Toke will be able to give the real explanation ;)
>
> Best Regards
> Sebastian
So crudely, interval is the delay before we start jumping on packets. And I think that’s the wrong thing to do for ingress. The scenario I have in my head says that BK traffic could burst at full bandwidth rate (or higher), filling upstream ISP buffers and thus inducing delays on all other traffic, because “we think it’s a slow link and have high interval and target values”, delaying our response to the burst. Whereas if we retain the default interval & target from the true link capacity calculation, we’ll jump on it in time.
The same thing happens in traditional egress mode, but it doesn’t matter as much there, since we are in control of our own buffer/queue: we’ll see the higher-priority traffic, skip to servicing that, and gradually bring the BK tin back under control.
What’s the error in my thinking?
Kevin
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] Why are target & interval increased on the reduced bandwidth tins?
@ 2020-06-24 14:40 0% ` Sebastian Moeller
2020-06-25 13:40 0% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-06-24 14:40 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
Hi Kevin,
so, the way codel is designed, target is best understood as a function of interval (allowing 5-10% of interval as standing queue gives a fine trade-off between bandwidth utilization and latency under load).
Now, interval is basically akin to the time you are willing to give a flow to react to signals; it should be in the same order of magnitude as the path RTT. Reducing the bandwidth allocation for a traffic class will increase its saturation-load RTT, and hence increasing the interval seems justified; target just follows along due to still wanting a reasonable bandwidth/latency trade-off.
So in short these scale the shaper to work well under loaded conditions. But Jonathan & Toke will be able to give the real explanation ;)
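The relation described above can be sketched roughly as follows. This is a simplified illustrative model, not the actual sch_cake code; the 5% factor and the MTU-serialization floor are assumptions drawn from the discussion, and the function name is invented:

```python
def scale_codel_params(base_interval_us, tin_rate_bps, mtu_bytes=1518):
    """Rough sketch: target is kept at ~5% of interval, but never
    below the time to serialize one MTU at the tin's rate; interval
    is then kept comfortably above target."""
    # time to serialize one full-size packet at this tin's rate
    serialization_us = mtu_bytes * 8 * 1e6 / tin_rate_bps
    target_us = max(0.05 * base_interval_us, serialization_us)
    # interval stays at least 2x target to preserve codel's
    # bandwidth/latency trade-off
    interval_us = max(base_interval_us, 2 * target_us)
    return target_us, interval_us
```

For example, at a 1 Mbit/s tin a 1518-byte packet takes ~12 ms to serialize, so target rises well above the usual 5 ms even with a 100 ms interval.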
Best Regards
Sebastian
> On Jun 24, 2020, at 16:33, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> Genuine question. For the reduced bandwidth tins in diffserv3/4/8 a different rate and hence different target & interval values are also calculated. I get why a target/interval calculation is desirable for the ‘main’ tin - this forms a ‘best case’ of how long each byte takes to transmit and is fundamental to the shaper. What I’m less clear on is why increased targets & intervals are used for the reduced threshold tins.
>
> To my mind it means those tins can be more ‘bursty’ before codel jumps on them. That’s possibly ok on an egress path but I’m not so convinced on an ingress path.
>
> Please point out the error in my thinking!
>
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected
2020-06-23 16:08 0% ` Sebastian Moeller
@ 2020-06-23 16:25 0% ` Jonathan Morton
0 siblings, 0 replies; 200+ results
From: Jonathan Morton @ 2020-06-23 16:25 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Toke Høiland-Jørgensen, cake, marco maniezzo
> On 23 Jun, 2020, at 7:08 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> But I assume that you bound the bursts somehow, do you remember your burst sizing method by chance?
It bursts exactly enough to catch up to the schedule. No more, no less - unless the queue is emptied in the process.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected
2020-06-23 15:21 0% ` Jonathan Morton
@ 2020-06-23 16:08 0% ` Sebastian Moeller
2020-06-23 16:25 0% ` Jonathan Morton
0 siblings, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-06-23 16:08 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Toke Høiland-Jørgensen, cake, marco maniezzo
Hi Jonathan,
> On Jun 23, 2020, at 17:21, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 23 Jun, 2020, at 5:41 pm, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Right, well if you're not running out of CPU I guess it could be a
>> timing issue. The CAKE shaper relies on accurate timestamps and the
>> qdisc watchdog timer to schedule the transmission of packets. A loaded
>> system can simply miss deadlines, which would be consistent with the
>> behaviour you're seeing.
>>
>> In fact, when looking into this a bit more, I came across this commit
>> that seemed to observe the same behaviour in sch_fq:
>> https://git.kernel.org/torvalds/c/fefa569a9d4b
>>
>> So I guess we could try to do something similar in CAKE.
>
> Actually, we already do. The first version of Cake's shaper was based closely on the one in sch_fq at the time, and I modified it after noticing that it had a very limited maximum throughput when timer resolution was poor (eg. at 1kHz on an old PC without HPET hardware, we could only get 1k pps). Now, any late timer event will result in a burst being issued to catch up with the proper schedule. The only time that wouldn't work is if the queue is empty.
This nicely and effortlessly explains why cake, unlike HTB+fq_codel, maintains the set bandwidth better under CPU load (but then these bursts also increase latency under load a bit more; then again, we had to make the burst buffering for HTB configurable so it would not be as bad at dropping bandwidth on the floor). But I assume that you bound the bursts somehow, do you remember your burst sizing method by chance? (For HTB/TBF, sqm now simply allows the user to configure a maximum burst service time in microseconds, which at least allows making a conscious trade-off).
>
> If the patches currently being trialled are not sufficient, then perhaps we could try something counter-intuitive: switch on "flows" instead of "flowblind", and enable the ack-filter. That should result in fewer small packets to process, as the ack-filter will coalesce some acks, especially under load. It might also help to select "satellite" AQM parameters, reducing the amount of processing needed at that layer.
>
> - Jonathan Morton
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected
2020-06-23 14:41 0% ` Toke Høiland-Jørgensen
@ 2020-06-23 15:21 0% ` Jonathan Morton
2020-06-23 16:08 0% ` Sebastian Moeller
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-06-23 15:21 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Jose Blanquicet, cake, marco maniezzo
> On 23 Jun, 2020, at 5:41 pm, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Right, well if you're not running out of CPU I guess it could be a
> timing issue. The CAKE shaper relies on accurate timestamps and the
> qdisc watchdog timer to schedule the transmission of packets. A loaded
> system can simply miss deadlines, which would be consistent with the
> behaviour you're seeing.
>
> In fact, when looking into this a bit more, I came across this commit
> that seemed to observe the same behaviour in sch_fq:
> https://git.kernel.org/torvalds/c/fefa569a9d4b
>
> So I guess we could try to do something similar in CAKE.
Actually, we already do. The first version of Cake's shaper was based closely on the one in sch_fq at the time, and I modified it after noticing that it had a very limited maximum throughput when timer resolution was poor (eg. at 1kHz on an old PC without HPET hardware, we could only get 1k pps). Now, any late timer event will result in a burst being issued to catch up with the proper schedule. The only time that wouldn't work is if the queue is empty.
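The catch-up behaviour described above can be illustrated with a toy virtual-clock shaper. This is a sketch of the idea only, not the sch_cake implementation; the function and parameter names are invented:

```python
def dequeue_burst(queue, now_ns, next_send_ns, rate_bps):
    """Toy shaper tick: if the timer fired late (now is past the
    scheduled send time), release enough packets back-to-back to
    bring the schedule up to 'now' -- no more, no less -- unless
    the queue empties first."""
    sent = []
    while queue and next_send_ns <= now_ns:
        pkt_bytes = queue.pop(0)
        sent.append(pkt_bytes)
        # advance the virtual schedule by this packet's serialization time
        next_send_ns += pkt_bytes * 8 * 1_000_000_000 // rate_bps
    return sent, next_send_ns
```

With coarse 1 kHz timers, each late wakeup then emits a multi-packet burst instead of capping throughput at one packet per tick.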
If the patches currently being trialled are not sufficient, then perhaps we could try something counter-intuitive: switch on "flows" instead of "flowblind", and enable the ack-filter. That should result in fewer small packets to process, as the ack-filter will coalesce some acks, especially under load. It might also help to select "satellite" AQM parameters, reducing the amount of processing needed at that layer.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected
2020-06-23 13:05 1% ` Jose Blanquicet
@ 2020-06-23 14:41 0% ` Toke Høiland-Jørgensen
2020-06-23 15:21 0% ` Jonathan Morton
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-06-23 14:41 UTC (permalink / raw)
To: Jose Blanquicet; +Cc: cake, marco maniezzo
Jose Blanquicet <blanquicet@gmail.com> writes:
> Hi Toke,
>
> Thanks for your reply.
>
> On Mon, Jun 22, 2020 at 5:47 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> > We have an embedded system with limited CPU resources that acts as a
>> > gateway to provide Internet access from LTE to a private USB-NCM
>> > network (And also to a Wi-Fi private network but we will work on it
>> > later). Our problem is that the bandwidth on LTE and USB link is
>> > higher than what the system is able to handle thus it reaches 100% of
>> > CPU load when we perform a simple speed test from a device on the
>> > private network.
>>
>> What speeds were you getting without shaping?
>
> Between 35 and 40Mbps.
>
>> > Therefore, we want to limit the bandwidth to avoid system getting
>> > saturated in such use-case. To do so, we thought to use the CAKE on
>> > the USB interface. For instance, we tried:
>> >
>> > tc qdisc replace root dev eth0 cake bandwidth 20mbit ethernet
>> > internet flowblind nonat besteffort nowash
>> >
>> > It worked correctly and the maximum rate was limited but there are two
>> > things that are worrying us:
>> >
>> > 1) The maximum rate reached after applying CAKE was in between 12Mbps
>> > and 15Mbps which is quite lower than the 20Mbps we are configuring, we
>> > were expecting around 18-19. Why? Is there something in the parameters
>> > we are doing wrong? Please take into account that our goal is to limit
>> > the rate but adding as little CPU load as possible.
>>
>> Hmm, are you actually running out of CPU? I.e., is the CPU pegged at
>> 100% when you hit this limit? What kind of platform are you running on?
>> And what kernel and CAKE versions are you using?
>
> I checked the CPU with top and there is still free CPU to be used. We
> also tried with lower values like 10 and it is again far away from the
> configured limit.
>
> We have just a percentage of an ARM Cortex A7 (1.2GHz) because the
> rest is reserved for modem. We are now trying to optimize all the
> applications in the system, but LTE<->WIFI/USB data transfer is indeed
> the use-case that puts our system in crisis.
>
> The kernel version is 3.18, and for CAKE we are using the latest commit
> on the master branch (9d79e2b). If needed, we could change CAKE but
> not the kernel version, or at most apply some specific patches.
Right, well if you're not running out of CPU I guess it could be a
timing issue. The CAKE shaper relies on accurate timestamps and the
qdisc watchdog timer to schedule the transmission of packets. A loaded
system can simply miss deadlines, which would be consistent with the
behaviour you're seeing.
In fact, when looking into this a bit more, I came across this commit
that seemed to observe the same behaviour in sch_fq:
https://git.kernel.org/torvalds/c/fefa569a9d4b
So I guess we could try to do something similar in CAKE.
Could you please post the output of 'tc -s qdisc' after a test run? That
should give some indication on how much the shaper is throttling...
>> > 2) The CPU load added by CAKE was not negligible for our system. In
>> > fact, we compared the CPU load when limitation was done by CAKE and by
>> > the device on the private network, e.g. curl tool with parameter
>> > "--limit-rate". As a result, we found that the CPU load when using
>> > CAKE was 30%. Is there any way to make it lighter with a different
>> > configuration?
>>
>> No, you've already turned off most of the features that might incur
>> overhead, so I don't think there's anything more you can do
>> configuration-wise to improve CPU load. Shaping does tend to use up a
>> lot of CPU, so it's not too surprising you run into issues here.
>
> Could you please help us to identify which one is still active? We
> thought we had already turned off all the features not needed to apply
> a limitation with a single queue (Besteffort mode).
Well the only thing more you can turn off by configuration is the shaper
itself :)
>> We did recently get a pull request whose author states that he was
>> seeing a 1/3 improvement in performance from it. See:
>> https://github.com/dtaht/sch_cake/pull/136
>>
>> You could try this; if your ingress network device driver has the same
>> issue with skbs being allocated in smaller bits, you may see a similar
>> increase with this patch. For a quick test you could also just try
>> commenting out the call to cake_handle_diffserv() entirely since you're
>> running in besteffort mode anyway :)
>
> Interesting. We will try this, we commented out the call to
> cake_handle_diffserv() as you said and just to be sure, we also
> applied the 2nd commit of the PR. I will be back soon with news.
OK, great!
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected
@ 2020-06-23 13:05 1% ` Jose Blanquicet
2020-06-23 14:41 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Jose Blanquicet @ 2020-06-23 13:05 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: cake, marco maniezzo
Hi Toke,
Thanks for your reply.
On Mon, Jun 22, 2020 at 5:47 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > We have an embedded system with limited CPU resources that acts as a
> > gateway to provide Internet access from LTE to a private USB-NCM
> > network (And also to a Wi-Fi private network but we will work on it
> > later). Our problem is that the bandwidth on LTE and USB link is
> > higher than what the system is able to handle thus it reaches 100% of
> > CPU load when we perform a simple speed test from a device on the
> > private network.
>
> What speeds were you getting without shaping?
Between 35 and 40Mbps.
> > Therefore, we want to limit the bandwidth to avoid system getting
> > saturated in such use-case. To do so, we thought to use the CAKE on
> > the USB interface. For instance, we tried:
> >
> > tc qdisc replace root dev eth0 cake bandwidth 20mbit ethernet
> > internet flowblind nonat besteffort nowash
> >
> > It worked correctly and the maximum rate was limited but there are two
> > things that are worrying us:
> >
> > 1) The maximum rate reached after applying CAKE was in between 12Mbps
> > and 15Mbps which is quite lower than the 20Mbps we are configuring, we
> > were expecting around 18-19. Why? Is there something in the parameters
> > we are doing wrong? Please take into account that our goal is to limit
> > the rate but adding as little CPU load as possible.
>
> Hmm, are you actually running out of CPU? I.e., is the CPU pegged at
> 100% when you hit this limit? What kind of platform are you running on?
> And what kernel and CAKE versions are you using?
I checked the CPU with top and there is still free CPU to be used. We
also tried with lower values like 10 and it is again far away from the
configured limit.
We have just a percentage of an ARM Cortex A7 (1.2GHz) because the
rest is reserved for modem. We are now trying to optimize all the
applications in the system, but LTE<->WIFI/USB data transfer is indeed
the use-case that puts our system in crisis.
The kernel version is 3.18, and for CAKE we are using the latest commit
on the master branch (9d79e2b). If needed, we could change CAKE but
not the kernel version, or at most apply some specific patches.
> > 2) The CPU load added by CAKE was not negligible for our system. In
> > fact, we compared the CPU load when limitation was done by CAKE and by
> > the device on the private network, e.g. curl tool with parameter
> > "--limit-rate". As a result, we found that the CPU load when using
> > CAKE was 30%. Is there any way to make it lighter with a different
> > configuration?
>
> No, you've already turned off most of the features that might incur
> overhead, so I don't think there's anything more you can do
> configuration-wise to improve CPU load. Shaping does tend to use up a
> lot of CPU, so it's not too surprising you run into issues here.
Could you please help us to identify which one is still active? We
thought we had already turned off all the features not needed to apply
a limitation with a single queue (Besteffort mode).
> We did recently get a pull request whose author states that he was
> seeing a 1/3 improvement in performance from it. See:
> https://github.com/dtaht/sch_cake/pull/136
>
> You could try this; if your ingress network device driver has the same
> issue with skbs being allocated in smaller bits, you may see a similar
> increase with this patch. For a quick test you could also just try
> commenting out the call to cake_handle_diffserv() entirely since you're
> running in besteffort mode anyway :)
Interesting. We will try this, we commented out the call to
cake_handle_diffserv() as you said and just to be sure, we also
applied the 2nd commit of the PR. I will be back soon with news.
Thanks,
Jose Blanquicet
^ permalink raw reply [relevance 1%]
* Re: [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected
@ 2020-06-22 14:25 1% ` Y
1 sibling, 0 replies; 200+ results
From: Y @ 2020-06-22 14:25 UTC (permalink / raw)
To: cake
You should paste this result.
tc -s qdisc show dev eth0
Yutaka
On 22/06/2020 22:10, Jose Blanquicet wrote:
> Hi everyone,
>
> We have an embedded system with limited CPU resources that acts as a
> gateway to provide Internet access from LTE to a private USB-NCM
> network (And also to a Wi-Fi private network but we will work on it
> later). Our problem is that the bandwidth on LTE and USB link is
> higher than what the system is able to handle thus it reaches 100% of
> CPU load when we perform a simple speed test from a device on the
> private network.
>
> Therefore, we want to limit the bandwidth to avoid system getting
> saturated in such use-case. To do so, we thought to use the CAKE on
> the USB interface. For instance, we tried:
>
> tc qdisc replace root dev eth0 cake bandwidth 20mbit ethernet
> internet flowblind nonat besteffort nowash
>
> It worked correctly and the maximum rate was limited but there are two
> things that are worrying us:
>
> 1) The maximum rate reached after applying CAKE was in between 12Mbps
> and 15Mbps which is quite lower than the 20Mbps we are configuring, we
> were expecting around 18-19. Why? Is there something in the parameters
> we are doing wrong? Please take into account that our goal is to limit
> the rate but adding as little CPU load as possible.
>
> 2) The CPU load added by CAKE was not negligible for our system. In
> fact, we compared the CPU load when limitation was done by CAKE and by
> the device on the private network, e.g. curl tool with parameter
> "--limit-rate". As a result, we found that the CPU load when using
> CAKE was 30%. Is there any way to make it lighter with a different
> configuration?
>
> Thanks in advance for the support. Any suggestion is welcome.
>
> Jose Blanquicet
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
2020-06-16 5:22 1% ` Avakash bhat
@ 2020-06-16 5:31 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-06-16 5:31 UTC (permalink / raw)
To: Avakash bhat
Cc: Jonathan Morton, Toke Høiland-Jørgensen, Cake List,
Vybhav Pai, Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
so glad to see that merged! I know how hard it is to make progress
that anyone can (re)use.
Thanks so much!
On Mon, Jun 15, 2020 at 10:22 PM Avakash bhat <avakash261@gmail.com> wrote:
>
> Hi all,
>
> Thank you for the clarification. We will try implementing a similar test.
>
> Thanks to the Cake community's continued support we were able to successfully merge the set-associative flow hash module into ns-3 (https://gitlab.com/nsnam/ns-3-dev/-/merge_requests/209).
>
> Hopefully, we are able to achieve a similar result with the ack filter module and we will continue to work to do so.
>
> Thanks,
> Avakash Bhat
>
> On Sun, Jun 14, 2020 at 8:13 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>>
>> > On 14 Jun, 2020, at 3:43 pm, Avakash bhat <avakash261@gmail.com> wrote:
>> >
>> > I wanted another clarification on the results obtained by the Ack filtering experiment( Fig 6) .
>> > Was the experiment conducted with only ack filtering enabled?
>> > Or was set associative hash and the other modules of Cake enabled along with Ack filtering while running this experiment ?
>>
>> The test was run on a complete implementation of Cake, set up in the normal way. I think we kept the configuration simple for this test, so everything at defaults except for choosing the shaped bandwidth in each direction.
>>
>> The ack-filter relies on having fairly good flow isolation, so that consecutive packets in the appropriate queue belong to the same ack stream. So at minimum it is appropriate to have the set-associative flow hash enabled.
>>
>> The host-fairness and Diffserv features were probably enabled, but did not have relevant effects in this case, since only one pair of hosts and the Best Effort DSCP were used in the traffic.
>>
>> - Jonathan Morton
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
2020-06-14 14:43 0% ` Jonathan Morton
@ 2020-06-16 5:22 1% ` Avakash bhat
2020-06-16 5:31 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Avakash bhat @ 2020-06-16 5:22 UTC (permalink / raw)
To: Jonathan Morton
Cc: Toke Høiland-Jørgensen, Cake List, Dave Taht,
Vybhav Pai, Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
[-- Attachment #1: Type: text/plain, Size: 1563 bytes --]
Hi all,
Thank you for the clarification. We will try implementing a similar test.
Thanks to the Cake community's continued support we were able to
successfully merge the set-associative flow hash module into ns-3 (
https://gitlab.com/nsnam/ns-3-dev/-/merge_requests/209).
Hopefully, we are able to achieve a similar result with the ack filter
module and we will continue to work to do so.
Thanks,
Avakash Bhat
On Sun, Jun 14, 2020 at 8:13 PM Jonathan Morton <chromatix99@gmail.com>
wrote:
> > On 14 Jun, 2020, at 3:43 pm, Avakash bhat <avakash261@gmail.com> wrote:
> >
> > I wanted another clarification on the results obtained by the Ack
> filtering experiment( Fig 6) .
> > Was the experiment conducted with only ack filtering enabled?
> > Or was set associative hash and the other modules of Cake enabled along
> with Ack filtering while running this experiment ?
>
> The test was run on a complete implementation of Cake, set up in the
> normal way. I think we kept the configuration simple for this test, so
> everything at defaults except for choosing the shaped bandwidth in each
> direction.
>
> The ack-filter relies on having fairly good flow isolation, so that
> consecutive packets in the appropriate queue belong to the same ack
> stream. So at minimum it is appropriate to have the set-associative flow
> hash enabled.
>
> The host-fairness and Diffserv features were probably enabled, but did not
> have relevant effects in this case, since only one pair of hosts and the
> Best Effort DSCP were used in the traffic.
>
> - Jonathan Morton
[-- Attachment #2: Type: text/html, Size: 2127 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
2020-06-14 12:43 1% ` Avakash bhat
@ 2020-06-14 14:43 0% ` Jonathan Morton
2020-06-16 5:22 1% ` Avakash bhat
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-06-14 14:43 UTC (permalink / raw)
To: Avakash bhat
Cc: Toke Høiland-Jørgensen, Cake List, Dave Taht,
Vybhav Pai, Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
> On 14 Jun, 2020, at 3:43 pm, Avakash bhat <avakash261@gmail.com> wrote:
>
> I wanted another clarification on the results obtained by the Ack filtering experiment( Fig 6) .
> Was the experiment conducted with only ack filtering enabled?
> Or was set associative hash and the other modules of Cake enabled along with Ack filtering while running this experiment ?
The test was run on a complete implementation of Cake, set up in the normal way. I think we kept the configuration simple for this test, so everything at defaults except for choosing the shaped bandwidth in each direction.
The ack-filter relies on having fairly good flow isolation, so that consecutive packets in the appropriate queue belong to the same ack stream. So at minimum it is appropriate to have the set-associative flow hash enabled.
The host-fairness and Diffserv features were probably enabled, but did not have relevant effects in this case, since only one pair of hosts and the Best Effort DSCP were used in the traffic.
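The ack-filter idea described above can be sketched as follows. This is a simplified model of the concept, not Cake's actual C implementation (which also handles SACK and other edge cases); it assumes the queue already provides the per-flow isolation noted above, and the field names are invented:

```python
def ack_filter_enqueue(flow_queue, new_pkt):
    """On enqueue of a pure cumulative ACK, drop an older queued
    pure ACK for the same flow that the new one supersedes (the new
    ACK acknowledges at least as much data), so redundant ACKs are
    coalesced under load."""
    if new_pkt["is_pure_ack"]:
        # scan from the tail: most recent queued ACK first
        for i in range(len(flow_queue) - 1, -1, -1):
            old = flow_queue[i]
            if old["is_pure_ack"] and old["ack_seq"] <= new_pkt["ack_seq"]:
                del flow_queue[i]  # superseded: its information is redundant
                break
    flow_queue.append(new_pkt)
```

Packets carrying data are never filtered; only back-to-back pure ACKs in the same per-flow queue collapse into the newest one.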
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-25 11:58 0% ` Toke Høiland-Jørgensen
@ 2020-06-14 12:43 1% ` Avakash bhat
2020-06-14 14:43 0% ` Jonathan Morton
0 siblings, 1 reply; 200+ results
From: Avakash bhat @ 2020-06-14 12:43 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: Jonathan Morton, Cake List, Dave Taht, Vybhav Pai,
Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
[-- Attachment #1: Type: text/plain, Size: 1820 bytes --]
Hi all,
I wanted another clarification on the results obtained by the Ack filtering
experiment( Fig 6) .
Was the experiment conducted with only ack filtering enabled?
Or was set associative hash and the other modules of Cake enabled along
with Ack filtering while running this experiment ?
Thanks,
Avakash Bhat
On Mon, May 25, 2020, 5:28 PM Toke Høiland-Jørgensen <toke@redhat.com>
wrote:
> Jonathan Morton <chromatix99@gmail.com> writes:
>
> >> On 25 May, 2020, at 8:17 am, Avakash bhat <avakash261@gmail.com> wrote:
> >>
> >> We had another query we would like to resolve. We wanted to verify the
> working of ack filter in ns-3,
> >> so we decided to replicate the Fig 6 graph in the CAKE paper(
> https://ieeexplore.ieee.org/document/8475045).
> >> While trying to build the topology we realized that we do not know the
> number of packets or bytes sent from
> >> the source to the destination for each of the TCP connections ( We are
> assuming it is a point to point connection with 4 TCP flows).
> >>
> >> Could we get a bit more details about how the experiment was conducted?
> >
> > I believe this was conducted using the RRUL test in Flent. This opens
> > four saturating TCP flows in each direction, and also sends a small
> > amount of latency measuring traffic. On this occasion I don't think
> > we added any simulated path delays, and only imposed the quoted
> > asymmetric bandwidth limits (30Mbps down, 1Mbps up).
>
> See https://www.cs.kau.se/tohojo/cake/ - the link to the data files near
> the bottom of that page also contains the Flent batch file and setup
> scripts used to run the whole thing.
>
> (And there's no explicit "number of bytes sent", but rather the flows
> are capacity-seeking flows running for a limited *time*).
>
> -Toke
>
>
[-- Attachment #2: Type: text/html, Size: 2791 bytes --]
^ permalink raw reply [relevance 1%]
* [Cake] Fwd: [tsvwg] Fwd: Working Group Last Call: QUIC protocol drafts
[not found] ` <CALGR9oZ-MzUh6JZrM7w97i=64OEZ3JzjzhVir2RBTWm210Fw7w@mail.gmail.com>
@ 2020-06-10 2:23 2% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-06-10 2:23 UTC (permalink / raw)
To: bloat, ECN-Sane, Make-Wifi-fast, Cake List, cerowrt-devel
I am happy to see quic in last call. there are a ton of interoperble
implementations now.
---------- Forwarded message ---------
From: Lucas Pardue <lucaspardue.24.7@gmail.com>
Date: Tue, Jun 9, 2020 at 7:08 PM
Subject: [tsvwg] Fwd: Working Group Last Call: QUIC protocol drafts
To: HTTP Working Group <ietf-http-wg@w3.org>, <tsvwg@ietf.org>, <tls@ietf.org>
Cc: Mark Nottingham <mnot@mnot.net>
Hello folks,
Please see the forwarded Working Group Last Call announcement for the
QUIC protocol drafts.
The QUIC WG will only consider feedback directed towards it so please
do not respond to this thread. Instead, review comments need to be
opened on the GitHub repo or sent to the QUIC mailing list as
described in the guidance below.
Cheers,
Lars, Lucas and Mark
QUIC WG Chairs
---------- Forwarded message ---------
From: Lucas Pardue <lucaspardue.24.7@gmail.com>
Date: Wed, Jun 10, 2020 at 2:36 AM
Subject: Working Group Last Call: QUIC protocol drafts
To: QUIC WG <quic@ietf.org>
Hello,
After more than three and a half years and substantial discussion, all
845 of the design issues raised against the QUIC protocol drafts have
gained consensus or have a proposed resolution. In that time the
protocol has been considerably transformed; it has become more secure,
much more widely implemented, and has been shown to be interoperable.
Both the Chairs and the Editors feel that it is ready to proceed in
standardisation.
Therefore, this email announces a Working Group Last Call (WGLC) for
the following QUIC documents:
* QUIC Transport
https://tools.ietf.org/html/draft-ietf-quic-transport-29
* QUIC Loss Detection and Congestion Control
https://tools.ietf.org/html/draft-ietf-quic-recovery-29
* Using TLS to Secure QUIC
https://tools.ietf.org/html/draft-ietf-quic-tls-29
* Version-Independent Properties of QUIC
https://tools.ietf.org/html/draft-ietf-quic-invariants-09
* HTTP/3
https://tools.ietf.org/html/draft-ietf-quic-http-29
* QPACK Header Compression for HTTP/3
https://tools.ietf.org/html/draft-ietf-quic-qpack-16
The WGLC will run for four weeks, ending on 8 July 2020.
As a reminder, we have been operating under the Late-Stage Process;
see https://github.com/quicwg/base-drafts/blob/master/CONTRIBUTING.md#late-stage-process.
In theory, this means that the contents of the drafts above already
have consensus. However, the Chairs would like to actively reaffirm
that consensus and start the process of wider review through a formal
WGLC.
Please review the documents above and open issues for your review
comments in our repository at https://github.com/quicwg/base-drafts.
You may also send comments to quic@ietf.org.
Issues raised during WGLC will be handled in accordance with the
Late-Stage Process defined in the Contribution Guidelines (see link
above). Please note that design issues that revisit a topic where
there's already declared consensus (see
https://github.com/quicwg/base-drafts/issues?q=is%3Aclosed+label%3Ahas-consensus)
need to provide compelling reasons to warrant reopening the
discussion.
As part of this WGLC, we seek consensus on the remaining open design
issue #3661 “Include epoch in the AAD or the nonce?”
(https://github.com/quicwg/base-drafts/issues/3661). The proposed
resolution for this issue is to close with no action, which means that
the drafts above already reflect this emerging consensus.
Subject to the feedback received during this WGLC, a subsequent
smaller WGLC may be run in the near future to confirm any changes to
the drafts made between now and then.
The Applicability and Manageability drafts have some dependencies on
the core drafts, so we'll run separate WGLCs for them.
Cheers,
Lars, Lucas and Mark
QUIC WG Chairs
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
^ permalink raw reply [relevance 2%]
* Re: [Cake] [Bloat] anyone using google stadia?
2020-06-03 19:09 1% ` [Cake] [Bloat] " Pedro Tumusok
@ 2020-06-04 17:27 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-06-04 17:27 UTC (permalink / raw)
To: Pedro Tumusok
Cc: bloat, cerowrt-devel, Make-Wifi-fast, Cake List, ECN-Sane, Ozer, Sebnem
On Wed, Jun 3, 2020 at 12:09 PM Pedro Tumusok <pedro.tumusok@gmail.com> wrote:
>
> I have it, but have not fired up for awhile.
Do you have the controller?
> Lets see if I can get around to do some captures in a couple of days, if anybody wants it, please feel free :)
Thx so much! Testing what happens normally, and then when load is
offered (e.g. rrul), should be "interesting" in
the long run, but just having a few reference captures from around the
world, and on different service types (fiber,cable, wifi, dsl),
would be a good start to understanding the state of the art here -
pacing, actual traffic volume (both ways), etc - can be easily
derived even from encrypted traffic.
... and being stuck at home myself, I was considering getting the
service to play with! What's the best game y'all have tried on it? I'm
not much of a first-person shooter person, the last game I played much
of was starcraft.... I would enjoy taking some of y'all on on some
game, while
testing...
> Pedro
>
> On Wed, Jun 3, 2020 at 9:36 AM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> and willing to share some packet captures of 5 minutes of gameplay
>> start to finish? Over wired and wifi?
>>
>> thx.
>>
>> --
>> "For a successful technology, reality must take precedence over public
>> relations, for Mother Nature cannot be fooled" - Richard Feynman
>>
>> dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
>
>
>
> --
> Best regards / Mvh
> Jan Pedro Tumusok
>
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Bloat] anyone using google stadia?
@ 2020-06-03 19:09 1% ` Pedro Tumusok
2020-06-04 17:27 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Pedro Tumusok @ 2020-06-03 19:09 UTC (permalink / raw)
To: Dave Taht
Cc: bloat, cerowrt-devel, Make-Wifi-fast, Cake List, ECN-Sane, Ozer, Sebnem
[-- Attachment #1: Type: text/plain, Size: 791 bytes --]
I have it, but have not fired up for awhile.
Lets see if I can get around to do some captures in a couple of days, if
anybody wants it, please feel free :)
Pedro
On Wed, Jun 3, 2020 at 9:36 AM Dave Taht <dave.taht@gmail.com> wrote:
> and willing to share some packet captures of 5 minutes of gameplay
> start to finish? Over wired and wifi?
>
> thx.
>
> --
> "For a successful technology, reality must take precedence over public
> relations, for Mother Nature cannot be fooled" - Richard Feynman
>
> dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
>
--
Best regards / Mvh
Jan Pedro Tumusok
[-- Attachment #2: Type: text/html, Size: 1431 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
2020-05-31 19:01 1% ` Dave Taht
@ 2020-05-31 19:25 0% ` Sebastian Moeller
0 siblings, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-31 19:25 UTC (permalink / raw)
To: Dave Täht; +Cc: Kevin Darbyshire-Bryant, Cake List, Make-Wifi-fast
Hi Dave,
> On May 31, 2020, at 21:01, Dave Taht <dave.taht@gmail.com> wrote:
>
> On Sun, May 31, 2020 at 11:08 AM Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
>>
>>
>>
>>> On 31 May 2020, at 18:26, John Yates <john@yates-sheets.org> wrote:
>>>
>>> On Sun, May 31, 2020 at 1:08 PM Kevin Darbyshire-Bryant
>>> <kevin@darbyshire-bryant.me.uk> wrote:
>>>> I have absolutely no idea, don’t appear to have that thread :-)
>>>
>>> Mea culpa. Should have included this link to the thread:
>>>
>>> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-May/002860.html
>>>
>>> /john
>>
>> Ah, well after the initial excitement that ‘oh an application actually sets DSCP’ I checked what marking my zoom packets had on the next conference…to find… Best effort. Crushing disappointment led to this in my firewall box:
>>
>> #Zoom - connections go to Zoom with dest ports 8801-8810
>> $IPTABLES -t mangle -A QOS_MARK_F_${IFACE} -p udp -m udp -m set --match-set Zoom4 dst -m multiport --dports 8801:8810 -j DSCP --set-dscp-class CS3 -m comment --comment "Zoom CS3 VI"
>> $IP6TABLES -t mangle -A QOS_MARK_F_${IFACE} -p udp -m udp -m set --match-set Zoom6 dst -m multiport --dports 8801:8810 -j DSCP --set-dscp-class CS3 -m comment --comment "Zoom CS3 VI"
>>
>> With dnsmasq configured to fill the Zoom4/6 ipsets with relevant IP addresses
>>
>> ipset=/zoom.us/Zoom4,Zoom6
>>
>> Works a treat.
>
> groovy. nicer than what I did, except that I don't remember where CS3
> lands in wifi anymore! CS4 and CS5 land in the VI queue....
>
> As for the "EF" (and for that matter, CS6 and CS7) issue on wifi, it
> lands in the VO queue.
According to https://tools.ietf.org/html/rfc8325#page-10 it actually often lands in AC_VI:
"Voice (EF-101110) will be mapped to UP 5 (101), and treated in the
Video Access Category (AC_VI) rather than the Voice Access
Category (AC_VO), for which it is intended"
Which IMHO is the right thing: AC_VO is so ruthless in acquiring airtime when competing with the other ACs that it should only be kept as the "nuclear option" in case an AP does not get sufficient airtime when competing with AC_VO-hogging stations, but not as part of a normal scheme, UNLESS there are only managed APs in the environment and any airtime hogging is not at the expense of the neighbors...
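The default mapping the RFC 8325 excerpt above describes (User Priority taken from the top three bits of the DSCP, then mapped to an 802.11 access category) can be sketched as follows. This is purely illustrative; real drivers and APs may differ:

```python
# Sketch of the default DSCP -> 802.11 User Priority -> Access Category
# mapping described in the RFC 8325 excerpt above. UP comes from the top
# three bits of the DSCP; the UP -> AC table is the standard IEEE 802.11
# one. Illustrative only, not code from any driver.

UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",   # background
    0: "AC_BE", 3: "AC_BE",   # best effort
    4: "AC_VI", 5: "AC_VI",   # video
    6: "AC_VO", 7: "AC_VO",   # voice
}

def default_access_category(dscp):
    """Map a DSCP value to an access category via the default UP rule."""
    return UP_TO_AC[dscp >> 3]

# EF (DSCP 46 = 0b101110) yields UP 5, so voice traffic is served from
# AC_VI rather than AC_VO -- exactly the behaviour quoted above. CS6
# (DSCP 48) yields UP 6 and lands in AC_VO.
print(default_access_category(46))  # EF  -> AC_VI
print(default_access_category(48))  # CS6 -> AC_VO
```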
> babel and BGP sets CS6 last I looked, and the
> VO queue on 802.11n (which is still quite common on both clients and
> APs) cannot aggregate. Given the rise of videoconferencing, mapping
> stuff into the VI queue makes the most sense for all forms of wifi,
> for both voice and video traffic. I like to think we've conclusively
> proven that
> packet aggregation is way more efficient in terms of airtime and media
> acquisition for both 802.11n and 802.11ac at this point.
+1; not that I can back this up with data...
>
> Worse than that, EF once meant "expedited forwarding", an early
> attempt to create a paid-for "fast lane" on the internet. I'd not use
> that
> for anything nowadays.
>
> So I could see EF landing in VI, and CS6, at least in the babel case,
> being a good candidate for VI also, but the existing usage of CS6 for
> BGP (tcp transfers) is a lousy place to put stuff into the VI queue, also.
>
> And all in all, our existing fq_codel for wifi code does not work well
> when we saturate all four wifi queues in the first place, and in
> general
> I think it's better that APs just use the BE queue at all times with
> our existing codebase for that. Someday, perhaps, the scheduler
> will only feed 1-2 hw queues at a time....
+1, I would assume the issue is that queued packets in AC_VO will basically stall all other queues until the VO queue clears. I have a macbook, which in rrul_cs8 tests over wifi basically hogs all airtime for its two AC_VO flows, essentially starving all other flows in both directions to a trickle. Hence my assessment that AC_VO is a tad anti-social.
>
> On the other hand, other APs, with massively overbuffered BE queues,
> will probably do better with videoconferencing-style traffic landing
> in VI,
> so long as it's access controlled to a reasonable extent.
>
> Clients SHOULD put videoconferencing (and gaming and latency sensitive
> traffic, like interactive ssh and mosh) into the VI queue
> to minimize media acquisition delays.
>
> On the gripping hand, the best thing anyone can do for wifi is for all
> devices to be located as close to the AP as possible.
Puzzled; using zoom over an ath9k radio with Toke's and your airtime fairness/fq_codel fixes works quite well with competing traffic, even with zoom in the default AC_BE. Maybe this is because my nominal 100/40 link, shaped with layer_cake down to 49/36 Mbps, does not make the wifi into the real bottleneck in the first place...
Best Regards
Sebastian
>
>>
>> Cheers,
>>
>> Kevin D-B
>>
>> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>>
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>
>
>
> --
> "For a successful technology, reality must take precedence over public
> relations, for Mother Nature cannot be fooled" - Richard Feynman
>
> dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
2020-05-31 18:08 0% ` Kevin Darbyshire-Bryant
@ 2020-05-31 19:01 1% ` Dave Taht
2020-05-31 19:25 0% ` Sebastian Moeller
0 siblings, 1 reply; 200+ results
From: Dave Taht @ 2020-05-31 19:01 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: John Yates, Cake List, Make-Wifi-fast
On Sun, May 31, 2020 at 11:08 AM Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
>
>
>
> > On 31 May 2020, at 18:26, John Yates <john@yates-sheets.org> wrote:
> >
> > On Sun, May 31, 2020 at 1:08 PM Kevin Darbyshire-Bryant
> > <kevin@darbyshire-bryant.me.uk> wrote:
> >> I have absolutely no idea, don’t appear to have that thread :-)
> >
> > Mea culpa. Should have included this link to the thread:
> >
> > https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-May/002860.html
> >
> > /john
>
> Ah, well after the initial excitement that ‘oh an application actually sets DSCP’ I checked what marking my zoom packets had on the next conference…to find… Best effort. Crushing disappointment led to this in my firewall box:
>
> #Zoom - connections go to Zoom with dest ports 8801-8810
> $IPTABLES -t mangle -A QOS_MARK_F_${IFACE} -p udp -m udp -m set --match-set Zoom4 dst -m multiport --dports 8801:8810 -j DSCP --set-dscp-class CS3 -m comment --comment "Zoom CS3 VI"
> $IP6TABLES -t mangle -A QOS_MARK_F_${IFACE} -p udp -m udp -m set --match-set Zoom6 dst -m multiport --dports 8801:8810 -j DSCP --set-dscp-class CS3 -m comment --comment "Zoom CS3 VI"
>
> With dnsmasq configured to fill the Zoom4/6 ipsets with relevant IP addresses
>
> ipset=/zoom.us/Zoom4,Zoom6
>
> Works a treat.
groovy. nicer than what I did, except that I don't remember where CS3
lands in wifi anymore! CS4 and CS5 land in the VI queue....
As for the "EF" (and for that matter, CS6 and CS7) issue on wifi, it
lands in the VO queue. babel and BGP sets CS6 last I looked, and the
VO queue on 802.11n (which is still quite common on both clients and
APs) cannot aggregate. Given the rise of videoconferencing, mapping
stuff into the VI queue makes the most sense for all forms of wifi,
for both voice and video traffic. I like to think we've conclusively
proven that
packet aggregation is way more efficient in terms of airtime and media
acquisition for both 802.11n and 802.11ac at this point.
Worse than that, EF once meant "expedited forwarding", an early
attempt to create a paid-for "fast lane" on the internet. I'd not use
that
for anything nowadays.
So I could see EF landing in VI, and CS6, at least in the babel case,
being a good candidate for VI also, but the existing usage of CS6 for
BGP (tcp transfers) is a lousy place to put stuff into the VI queue, also.
And all in all, our existing fq_codel for wifi code does not work well
when we saturate all four wifi queues in the first place, and in
general
I think it's better that APs just use the BE queue at all times with
our existing codebase for that. Someday, perhaps, the scheduler
will only feed 1-2 hw queues at a time....
On the other hand, other APs, with massively overbuffered BE queues,
will probably do better with videoconferencing-style traffic landing
in VI,
so long as it's access controlled to a reasonable extent.
Clients SHOULD put videoconferencing (and gaming and latency sensitive
traffic, like interactive ssh and mosh) into the VI queue
to minimize media acquisition delays.
On the gripping hand, the best thing anyone can do for wifi is for all
devices to be located as close to the AP as possible.
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
2020-05-31 17:26 2% ` John Yates
@ 2020-05-31 18:08 0% ` Kevin Darbyshire-Bryant
2020-05-31 19:01 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-05-31 18:08 UTC (permalink / raw)
To: John Yates; +Cc: Cake List
> On 31 May 2020, at 18:26, John Yates <john@yates-sheets.org> wrote:
>
> On Sun, May 31, 2020 at 1:08 PM Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
>> I have absolutely no idea, don’t appear to have that thread :-)
>
> Mea culpa. Should have included this link to the thread:
>
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-May/002860.html
>
> /john
Ah, well after the initial excitement that ‘oh an application actually sets DSCP’ I checked what marking my zoom packets had on the next conference…to find… Best effort. Crushing disappointment led to this in my firewall box:
#Zoom - connections go to Zoom with dest ports 8801-8810
$IPTABLES -t mangle -A QOS_MARK_F_${IFACE} -p udp -m udp -m set --match-set Zoom4 dst -m multiport --dports 8801:8810 -j DSCP --set-dscp-class CS3 -m comment --comment "Zoom CS3 VI"
$IP6TABLES -t mangle -A QOS_MARK_F_${IFACE} -p udp -m udp -m set --match-set Zoom6 dst -m multiport --dports 8801:8810 -j DSCP --set-dscp-class CS3 -m comment --comment "Zoom CS3 VI"
With dnsmasq configured to fill the Zoom4/6 ipsets with relevant IP addresses
ipset=/zoom.us/Zoom4,Zoom6
Works a treat.
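For anyone checking a capture to confirm the marking above took effect: class selectors are the top three bits of the DSCP field, and the TOS/traffic-class byte carries DSCP in its top six bits. A tiny illustrative sketch of that arithmetic (helper names are mine, not from any tool):

```python
# Class-selector arithmetic for verifying DSCP marks in a capture:
# CSn = n << 3, and the TOS byte shown by tcpdump is DSCP << 2.
# Illustrative helpers only.

def cs(n):
    """DSCP value of class selector CSn."""
    return n << 3

def tos_byte(dscp):
    """TOS/traffic-class byte corresponding to a DSCP value."""
    return dscp << 2

# CS3 (as set by the rules above) is DSCP 24, TOS byte 0x60.
print(cs(3), hex(tos_byte(cs(3))))
```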
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
^ permalink raw reply [relevance 0%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
2020-05-31 17:08 0% ` Kevin Darbyshire-Bryant
@ 2020-05-31 17:26 2% ` John Yates
2020-05-31 18:08 0% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: John Yates @ 2020-05-31 17:26 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
On Sun, May 31, 2020 at 1:08 PM Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
> I have absolutely no idea, don’t appear to have that thread :-)
Mea culpa. Should have included this link to the thread:
https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-May/002860.html
/john
^ permalink raw reply [relevance 2%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
@ 2020-05-31 17:08 0% ` Kevin Darbyshire-Bryant
2020-05-31 17:26 2% ` John Yates
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-05-31 17:08 UTC (permalink / raw)
To: Cake List
> On 31 May 2020, at 17:38, John Yates <john@yates-sheets.org> wrote:
>
> Kevin,
>
> I am curious how this effort relates to Dave Taht's point in his May
> 20th "not really huge on EF landing where it does in wifi" thread.
>
> /john
Hi John,
I have absolutely no idea, don’t appear to have that thread :-) My own DSCP/CAKE interests are aligned to exercising CAKE’s built-in classification/bandwidth allocations across my WAN link. In essence I have traffic types of ‘Least Effort’ (bittorrent - nearly starvable), ‘Bulk’ (backups/long term up/downloads - low minimum b/w), ’Normal’ (most short term stuff - all bandwidth), VI (video conference calls/streaming, long-term more important flows, latency important - up to 1/2 b/w), VO (voice/latency critical - up to 1/4 b/w). Or two levels of ’not so important’ and two levels of ‘more important’ around normal/everything.
The classification process happens as a combination of iptables/ipsets rules on the internet router using tc act_ctinfo to preserve the DSCP classification of flows across the WAN.
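Using the right-shift scheme from the diffserv5 patch posted elsewhere in this thread (LE = rate>>8, BK = rate>>4, BE = full rate, VI = rate>>1, VO = rate>>2), those tin thresholds work out as sketched below; the figures are illustrative, not taken from a running system:

```python
# Sketch of the five-tin rate thresholds implied by the diffserv5 patch's
# right-shifts: each tin's threshold is the base shaper rate divided by a
# power of two. Illustrative arithmetic only.

TIN_SHIFTS = {"LE": 8, "BK": 4, "BE": 0, "VI": 1, "VO": 2}

def tin_rates(base_rate_mbit):
    """Per-tin threshold rates (Mbit/s) for a given base shaper rate."""
    return {tin: base_rate_mbit / (1 << s) for tin, s in TIN_SHIFTS.items()}

# e.g. on a 100 Mbit/s link: LE ~0.39, BK 6.25, BE 100, VI 50, VO 25.
for tin, rate in tin_rates(100).items():
    print(f"{tin}: {rate:g} Mbit/s threshold")
```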
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
^ permalink raw reply [relevance 0%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
2020-05-29 15:24 0% ` Kevin Darbyshire-Bryant
@ 2020-05-31 10:04 1% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-05-31 10:04 UTC (permalink / raw)
To: Cake List
[-- Attachment #1: Type: text/plain, Size: 129 bytes --]
This is currently what I’m playing with:
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: 0001-experiment-with-diffserv5-incl-an-LE-class.patch --]
[-- Type: application/octet-stream, Size: 5325 bytes --]
From fe1d1fb237aaa8d5728a81707d1c2af6e89aeb23 Mon Sep 17 00:00:00 2001
From: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Date: Wed, 27 May 2020 17:05:51 +0100
Subject: [PATCH] experiment with diffserv5 incl an LE class
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
---
pkt_sched.h | 3 +-
sch_cake.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 90 insertions(+), 1 deletion(-)
diff --git a/pkt_sched.h b/pkt_sched.h
index 745cbc7..6d6106a 100644
--- a/pkt_sched.h
+++ b/pkt_sched.h
@@ -947,7 +947,7 @@ enum {
CAKE_FLOW_DUAL_SRC, /* = CAKE_FLOW_SRC_IP | CAKE_FLOW_FLOWS */
CAKE_FLOW_DUAL_DST, /* = CAKE_FLOW_DST_IP | CAKE_FLOW_FLOWS */
CAKE_FLOW_TRIPLE, /* = CAKE_FLOW_HOSTS | CAKE_FLOW_FLOWS */
- CAKE_FLOW_MAX,
+ CAKE_FLOW_MAX
};
enum {
@@ -956,6 +956,7 @@ enum {
CAKE_DIFFSERV_DIFFSERV8,
CAKE_DIFFSERV_BESTEFFORT,
CAKE_DIFFSERV_PRECEDENCE,
+ CAKE_DIFFSERV_DIFFSERV5,
CAKE_DIFFSERV_MAX
};
diff --git a/sch_cake.c b/sch_cake.c
index cb9bbf7..524c5a6 100644
--- a/sch_cake.c
+++ b/sch_cake.c
@@ -333,6 +333,17 @@ static const u8 diffserv8[] = {
7, 2, 2, 2, 2, 2, 2, 2,
};
+static const u8 diffserv5[] = {
+ 0, 1, 0, 0, 3, 0, 0, 0,
+ 2, 0, 0, 0, 0, 0, 0, 0,
+ 3, 0, 3, 0, 3, 0, 3, 0,
+ 3, 0, 3, 0, 3, 0, 3, 0,
+ 4, 0, 3, 0, 3, 0, 3, 0,
+ 4, 0, 0, 0, 4, 0, 4, 0,
+ 4, 0, 0, 0, 0, 0, 0, 0,
+ 4, 0, 0, 0, 0, 0, 0, 0,
+};
+
static const u8 diffserv4[] = {
0, 1, 0, 0, 2, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0,
@@ -370,6 +381,7 @@ static const u8 besteffort[] = {
static const u8 normal_order[] = {0, 1, 2, 3, 4, 5, 6, 7};
static const u8 bulk_order[] = {1, 0, 2, 3};
+static const u8 le_order[] = {1, 2, 0, 3, 4};
#define REC_INV_SQRT_CACHE (16)
static u32 cobalt_rec_inv_sqrt_cache[REC_INV_SQRT_CACHE] = {0};
@@ -2350,6 +2362,17 @@ static void cake_set_rate(struct cake_tin_data *b, u64 rate, u32 mtu,
b->cparams.p_dec = 1 << 20; /* 1/4096 */
}
+static void cake_config_ingress(struct cake_sched_data *q)
+{
+ u32 i;
+
+ for (i = 1; i < q->tin_cnt ; i++) {
+ q->tins[i].cparams.target = q->tins[0].cparams.target;
+ q->tins[i].cparams.interval = q->tins[0].cparams.interval;
+ q->tins[i].flow_quantum = q->tins[0].flow_quantum;
+ }
+}
+
static int cake_config_besteffort(struct Qdisc *sch)
{
struct cake_sched_data *q = qdisc_priv(sch);
@@ -2397,6 +2420,8 @@ static int cake_config_precedence(struct Qdisc *sch)
quantum *= 7;
quantum >>= 3;
}
+/* if (q->rate_flags & CAKE_FLAG_INGRESS)*/
+ cake_config_ingress(q);
return 0;
}
@@ -2489,6 +2514,58 @@ static int cake_config_diffserv8(struct Qdisc *sch)
quantum *= 7;
quantum >>= 3;
}
+/* if (q->rate_flags & CAKE_FLAG_INGRESS)*/
+ cake_config_ingress(q);
+
+ return 0;
+}
+
+static int cake_config_diffserv5(struct Qdisc *sch)
+{
+/* Further pruned list of traffic classes for five-class system:
+ *
+ * Latency Sensitive (CS7, CS6, EF, VA, CS5, CS4)
+ * Streaming Media (AF4x, AF3x, CS3, AF2x, TOS4, CS2, TOS1)
+ * Background Traffic (CS1)
+ * Best Effort (CS0, AF1x, TOS2, and those not specified)
+ * Least Effort (LE)
+ *
+ * Total 5 traffic classes.
+ */
+
+ struct cake_sched_data *q = qdisc_priv(sch);
+ u32 mtu = psched_mtu(qdisc_dev(sch));
+ u64 rate = q->rate_bps;
+ u32 quantum = 1024;
+ u32 i;
+
+ q->tin_cnt = 5;
+
+ /* codepoint to class mapping */
+ q->tin_index = diffserv5;
+ q->tin_order = le_order;
+
+ /* class characteristics */
+ cake_set_rate(&q->tins[0], rate, mtu,
+ us_to_ns(q->target), us_to_ns(q->interval));
+ cake_set_rate(&q->tins[1], rate >> 8, mtu,
+ us_to_ns(q->target), us_to_ns(q->interval));
+ cake_set_rate(&q->tins[2], rate >> 4, mtu,
+ us_to_ns(q->target), us_to_ns(q->interval));
+ cake_set_rate(&q->tins[3], rate >> 1, mtu,
+ us_to_ns(q->target), us_to_ns(q->interval));
+ cake_set_rate(&q->tins[4], rate >> 2, mtu,
+ us_to_ns(q->target), us_to_ns(q->interval));
+
+ /* bandwidth-sharing weights */
+ q->tins[0].tin_quantum = quantum; /*BE*/
+ q->tins[1].tin_quantum = quantum >> 8; /*LE*/
+ q->tins[2].tin_quantum = quantum >> 4; /*BK*/
+ q->tins[3].tin_quantum = quantum >> 1; /*VI*/
+ q->tins[4].tin_quantum = quantum >> 2; /*VO*/
+
+/* if (q->rate_flags & CAKE_FLAG_INGRESS)*/
+ cake_config_ingress(q);
return 0;
}
@@ -2509,6 +2586,7 @@ static int cake_config_diffserv4(struct Qdisc *sch)
u32 mtu = psched_mtu(qdisc_dev(sch));
u64 rate = q->rate_bps;
u32 quantum = 1024;
+ u32 i;
q->tin_cnt = 4;
@@ -2532,6 +2610,9 @@ static int cake_config_diffserv4(struct Qdisc *sch)
q->tins[2].tin_quantum = quantum >> 1;
q->tins[3].tin_quantum = quantum >> 2;
+/* if (q->rate_flags & CAKE_FLAG_INGRESS)*/
+ cake_config_ingress(q);
+
return 0;
}
@@ -2566,6 +2647,9 @@ static int cake_config_diffserv3(struct Qdisc *sch)
q->tins[1].tin_quantum = quantum >> 4;
q->tins[2].tin_quantum = quantum >> 2;
+/* if (q->rate_flags & CAKE_FLAG_INGRESS)*/
+ cake_config_ingress(q);
+
return 0;
}
@@ -2587,6 +2671,10 @@ static void cake_reconfigure(struct Qdisc *sch)
ft = cake_config_diffserv8(sch);
break;
+ case CAKE_DIFFSERV_DIFFSERV5:
+ ft = cake_config_diffserv5(sch);
+ break;
+
case CAKE_DIFFSERV_DIFFSERV4:
ft = cake_config_diffserv4(sch);
break;
--
2.24.3 (Apple Git-128)
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net] sch_cake: Take advantage of skb->hash where appropriate
2020-05-29 13:02 7% ` Toke Høiland-Jørgensen
2020-05-29 17:57 1% ` Jakub Kicinski
@ 2020-05-31 4:52 1% ` David Miller
2 siblings, 0 replies; 200+ results
From: David Miller @ 2020-05-31 4:52 UTC (permalink / raw)
To: toke; +Cc: netdev, cake
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Fri, 29 May 2020 14:43:44 +0200
> While the other fq-based qdiscs take advantage of skb->hash and don't
> recompute it if it is already set, sch_cake does not.
>
> This was a deliberate choice because sch_cake hashes various parts of the
> packet header to support its advanced flow isolation modes. However,
> foregoing the use of skb->hash entirely loses a few important benefits:
>
> - When skb->hash is set by hardware, a few CPU cycles can be saved by not
> hashing again in software.
>
> - Tunnel encapsulations will generally preserve the value of skb->hash from
> before the encapsulation, which allows flow-based qdiscs to distinguish
> between flows even though the outer packet header no longer has flow
> information.
>
> It turns out that we can preserve these desirable properties in many cases,
> while still supporting the advanced flow isolation properties of sch_cake.
> This patch does so by reusing the skb->hash value as the flow_hash part of
> the hashing procedure in cake_hash() only in the following conditions:
>
> - If the skb->hash is marked as covering the flow headers (skb->l4_hash is
> set)
>
> AND
>
> - NAT header rewriting is either disabled, or did not change any values
> used for hashing. The latter is important to match local-origin packets
> such as those of a tunnel endpoint.
>
> The immediate motivation for fixing this was the recent patch to WireGuard
> to preserve the skb->hash on encapsulation. As such, this is also what I
> tested against; with this patch, added latency under load for competing
> flows drops from ~8 ms to sub-1ms on an RRUL test over a WireGuard tunnel
> going through a virtual link shaped to 1Gbps using sch_cake. This matches
> the results we saw with a similar setup using sch_fq_codel when testing the
> WireGuard patch.
>
> Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Applied to net-next, thanks.
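The two reuse conditions in the commit message above amount to a small predicate, sketched here for illustration (the real logic lives in cake_hash() in sch_cake.c and is considerably more involved; function and parameter names are mine):

```python
# Illustrative predicate for the skb->hash reuse conditions described in
# the commit message: reuse only if the stack/hardware hash covers the
# flow (L4) headers, AND NAT-aware hashing either is disabled or did not
# rewrite any field that feeds the hash. Not the actual kernel code.

def can_reuse_skb_hash(l4_hash: bool, nat_mode: bool,
                       nat_changed_keys: bool) -> bool:
    if not l4_hash:
        return False  # hash does not cover the flow headers
    if nat_mode and nat_changed_keys:
        return False  # conntrack rewrote fields used for hashing
    return True

print(can_reuse_skb_hash(True, False, False))  # plain flow hash: reuse
print(can_reuse_skb_hash(True, True, True))    # NAT rewrote keys: rehash
```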
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH net] sch_cake: Take advantage of skb->hash where appropriate
2020-05-29 17:57 1% ` Jakub Kicinski
@ 2020-05-29 18:31 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-05-29 18:31 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: davem, netdev, cake
Jakub Kicinski <kuba@kernel.org> writes:
> On Fri, 29 May 2020 14:43:44 +0200 Toke Høiland-Jørgensen wrote:
>> + * enabled there's another check below after doing the conntrack lookup.
>> + */
>
> nit: alignment
Ah, right, seems I forgot to hit <TAB> after adding that end marker.
Davem, can I get you to fix that when applying, or should I send a v2?
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net] sch_cake: Take advantage of skb->hash where appropriate
2020-05-29 13:02 7% ` Toke Høiland-Jørgensen
@ 2020-05-29 17:57 1% ` Jakub Kicinski
2020-05-29 18:31 0% ` Toke Høiland-Jørgensen
2020-05-31 4:52 1% ` David Miller
2 siblings, 1 reply; 200+ results
From: Jakub Kicinski @ 2020-05-29 17:57 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: davem, netdev, cake
On Fri, 29 May 2020 14:43:44 +0200 Toke Høiland-Jørgensen wrote:
> + * enabled there's another check below after doing the conntrack lookup.
> + */
nit: alignment
^ permalink raw reply [relevance 1%]
* Re: [Cake] Playing with ingredients = ruined the CAKE
@ 2020-05-29 15:24 0% ` Kevin Darbyshire-Bryant
2020-05-31 10:04 1% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-05-29 15:24 UTC (permalink / raw)
To: Cake List
> On 29 May 2020, at 11:06, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> I’m trying to create a ‘diffserv5’ for the purposes of implementing a 'Least Effort’ class: something like LE=Bittorrent, BK=Backups/long term down/uploads, BE=Best Effort/Normal, VI=Streaming media/facetime/zoom, VO=VOIP/SIP. Not too hard you’d think, take diffserv4 and add a tin.
>
> I did this with tin allocation: 0=LE, 1=BE, 2=BK, 3=VI, 4=VO. BW allocation relative to base rate = LE>>8, BE>>0, BK>>4, VI>>1, VO>>2. Tin display order = 0, 2, 1, 3, 4. In theory I don’t mind LE being starved hence the above order. This pretty much ‘jammed’ the shaper as soon as any traffic went into LE with other higher priority tins seeing huge latencies, lots of drops and general bad news all over.
>
> I tried again with a slightly different tin allocation: 0=BE, 1=LE, 2=BK, 3=VI, 4=VO more in keeping with the existing arrangement and display order 1, 2, 0, 3, 4. The shaper doesn’t appear to obviously wedge, though I have seen some latency spikes that I don’t normally see, so it feels like there’s still a corner case being hit.
>
> Does anyone have any ideas?
Pondering out loud: Is setting different (i.e. increased) codel intervals & targets sensible for ‘artificially’ reduced bandwidths, especially in ingress mode? If I have a 100mbit link and wish to have a minimum reservation for a low bandwidth, low priority tin, e.g. 1mbit, does it make sense to make that tin respond slower as if it were a 1mbit link, whereas it’s a minimally reserved portion of a 100mbit link and could burst up 100 times quicker than I think? Egress I suspect is less of a problem in that we’ll queue the packets and eventually throw them on the floor, but ingress???
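Cake already scales each tin's codel target to the serialization delay at that tin's rate, which bears directly on the question above: a standing packet at 1 Mbit/s takes many milliseconds just to transmit, so the default 5 ms target would misread it as "bad" queue. A rough sketch of that arithmetic follows; the constants are approximate and illustrative, not the exact cake_set_rate() computation in sch_cake.c:

```python
# Rough sketch of rate-scaled codel targets in the spirit of cake's
# cake_set_rate(): if sending ~1.5 MTUs at the tin rate takes longer
# than the default 5 ms target, grow the target accordingly.
# Approximate, illustrative constants -- not the kernel code.

MTU_BYTES = 1518
DEFAULT_TARGET_MS = 5.0

def scaled_target_ms(rate_mbit):
    """Codel target (ms) scaled to the serialization time of one MTU."""
    serialization_ms = (MTU_BYTES * 8) / (rate_mbit * 1e3)  # ms per MTU
    return max(DEFAULT_TARGET_MS, 1.5 * serialization_ms)

# At 100 Mbit/s the default 5 ms target stands; a 1 Mbit/s tin needs
# roughly 18 ms -- hence the slower codel response Kevin is asking about.
print(scaled_target_ms(100))
print(scaled_target_ms(1))
```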
Kevin
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH net] sch_cake: Take advantage of skb->hash where appropriate
@ 2020-05-29 13:02 7% ` Toke Høiland-Jørgensen
2020-05-29 17:57 1% ` Jakub Kicinski
2020-05-31 4:52 1% ` David Miller
2 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-05-29 13:02 UTC (permalink / raw)
To: cake
[-- Attachment #1: Type: text/plain, Size: 565 bytes --]
> The immediate motivation for fixing this was the recent patch to WireGuard
> to preserve the skb->hash on encapsulation. As such, this is also what I
> tested against; with this patch, added latency under load for competing
> flows drops from ~8 ms to sub-1ms on an RRUL test over a WireGuard tunnel
> going through a virtual link shaped to 1Gbps using sch_cake. This matches
> the results we saw with a similar setup using sch_fq_codel when testing the
> WireGuard patch.
See attached Flent data files for the full results (dropping netdev@ for
these).
-Toke
[-- Attachment #2: rrul-2020-05-29T125220.474228.cake_no-patch_wg_patched.flent.gz --]
[-- Type: application/gzip, Size: 116901 bytes --]
[-- Attachment #3: rrul-2020-05-29T140517.282425.cake_patch_wg_patched.flent.gz --]
[-- Type: application/gzip, Size: 102927 bytes --]
[-- Attachment #4: rtt_fair_var-2020-05-29T134324.452771.cake_no-patch_wg_patched_nonat.flent.gz --]
[-- Type: application/gzip, Size: 82092 bytes --]
[-- Attachment #5: rtt_fair_var-2020-05-29T135038.694646.cake_no-patch_wg_patched_nonat_dual-srchost.flent.gz --]
[-- Type: application/gzip, Size: 82338 bytes --]
[-- Attachment #6: rtt_fair_var-2020-05-29T135627.430220.cake_no-patch_wg_patched_nonat_dual-srchost.flent.gz --]
[-- Type: application/gzip, Size: 100479 bytes --]
[-- Attachment #7: rtt_fair_var-2020-05-29T135809.540527.cake_no-patch_wg_patched_nat_dual-srchost.flent.gz --]
[-- Type: application/gzip, Size: 101197 bytes --]
[-- Attachment #8: rtt_fair_var-2020-05-29T140711.738176.cake_patch_wg_patched_nonat.flent.gz --]
[-- Type: application/gzip, Size: 80263 bytes --]
[-- Attachment #9: rtt_fair_var-2020-05-29T141034.197074.cake_patch_wg_patched_nonat_dual-srchost.flent.gz --]
[-- Type: application/gzip, Size: 98289 bytes --]
[-- Attachment #10: rtt_fair_var-2020-05-29T141217.145841.cake_patch_wg_patched_nat_dual-srchost.flent.gz --]
[-- Type: application/gzip, Size: 98860 bytes --]
[-- Attachment #11: rtt_fair_var-2020-05-29T141551.629034.cake_patch_wg_patched_nat_dual-srchost.flent.gz --]
[-- Type: application/gzip, Size: 115860 bytes --]
[-- Attachment #12: tcp_nup-2020-05-29T123116.704500.cake_no-patch_wg_patch.flent.gz --]
[-- Type: application/gzip, Size: 216035 bytes --]
[-- Attachment #13: tcp_nup-2020-05-29T140336.810381.cake_patch_wg_patch.flent.gz --]
[-- Type: application/gzip, Size: 353354 bytes --]
^ permalink raw reply [relevance 7%]
* [Cake] Fwd: [PATCH net-next] net: dsa: sja1105: offload the Credit-Based Shaper qdisc
[not found] <20200527165527.1085151-1-olteanv@gmail.com>
@ 2020-05-27 19:29 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-05-27 19:29 UTC (permalink / raw)
To: Cake List
---------- Forwarded message ---------
From: Vladimir Oltean <olteanv@gmail.com>
Date: Wed, May 27, 2020 at 12:15 PM
Subject: [PATCH net-next] net: dsa: sja1105: offload the Credit-Based
Shaper qdisc
To: <davem@davemloft.net>
Cc: <andrew@lunn.ch>, <f.fainelli@gmail.com>,
<vivien.didelot@gmail.com>, <netdev@vger.kernel.org>,
<vinicius.gomes@intel.com>
From: Vladimir Oltean <vladimir.oltean@nxp.com>
SJA1105, being AVB/TSN switches, provide hardware assist for the
Credit-Based Shaper as described in the IEEE 802.1Q-2018 document.
First generation has 10 shapers, freely assignable to any of the 4
external ports and 8 traffic classes, and second generation has 16
shapers.
We also need to provide a dummy implementation of mqprio qdisc offload,
since this seems to be necessary for shaping any traffic class other
than zero.
The Credit-Based Shaper tables are accessed through the dynamic
reconfiguration interface, so we have to restore them manually after a
switch reset. The tables are backed up by the static config only on
P/Q/R/S, and we don't want to add custom code only for that family,
since the procedure that is in place now works for both.
Tested with the following commands:
data_rate_kbps=34000
port_transmit_rate_kbps=1000000
idleslope=$data_rate_kbps
sendslope=$(($idleslope - $port_transmit_rate_kbps))
locredit=$((-0x7fffffff))
hicredit=$((0x7fffffff))
tc qdisc add dev sw1p3 root handle 1: mqprio num_tc 8
tc qdisc add dev sw1p3 parent 1:1 cbs \
idleslope $idleslope \
sendslope $sendslope \
hicredit $hicredit \
locredit $locredit \
offload 1
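The arithmetic behind these shell variables, and the kbit/s-to-bytes/s conversion the driver performs (its BYTES_PER_KBIT = 1000/8), can be sketched as follows; the values are simply the ones from the commands above:

```python
# Sketch of the CBS parameter arithmetic from the tc commands above.
# User space expresses slopes in kbit/s; the hardware wants bytes/s,
# with negative slopes passed as magnitudes (the sign is implicit).
BYTES_PER_KBIT = 1000 // 8  # 125

idleslope_kbps = 34000       # data_rate_kbps
port_rate_kbps = 1000000     # port_transmit_rate_kbps
sendslope_kbps = idleslope_kbps - port_rate_kbps

idle_slope = idleslope_kbps * BYTES_PER_KBIT        # bytes/s
send_slope = abs(sendslope_kbps * BYTES_PER_KBIT)   # magnitude only

print(sendslope_kbps)  # -> -966000 (kbit/s, negative by definition)
print(idle_slope)      # -> 4250000 (bytes/s)
print(send_slope)      # -> 120750000 (bytes/s)
```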
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/dsa/sja1105/sja1105.h | 2 +
.../net/dsa/sja1105/sja1105_dynamic_config.c | 76 ++++++++++++
drivers/net/dsa/sja1105/sja1105_main.c | 108 ++++++++++++++++++
drivers/net/dsa/sja1105/sja1105_spi.c | 6 +
.../net/dsa/sja1105/sja1105_static_config.h | 15 +++
5 files changed, 207 insertions(+)
diff --git a/drivers/net/dsa/sja1105/sja1105.h
b/drivers/net/dsa/sja1105/sja1105.h
index 303b21470d77..ee79749aacf1 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -84,6 +84,7 @@ struct sja1105_info {
* the egress timestamps.
*/
int ptpegr_ts_bytes;
+ int num_cbs_shapers;
const struct sja1105_dynamic_table_ops *dyn_ops;
const struct sja1105_table_ops *static_ops;
const struct sja1105_regs *regs;
@@ -218,6 +219,7 @@ struct sja1105_private {
struct mutex mgmt_lock;
bool expect_dsa_8021q;
enum sja1105_vlan_state vlan_state;
+ struct sja1105_cbs_entry *cbs;
struct sja1105_tagger_data tagger_data;
struct sja1105_ptp_data ptp_data;
struct sja1105_tas_data tas_data;
diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
index f98c98a063e7..05b62d544eb9 100644
--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -142,6 +142,12 @@
#define SJA1105_SIZE_RETAGGING_DYN_CMD \
(SJA1105_SIZE_DYN_CMD + SJA1105_SIZE_RETAGGING_ENTRY)
+#define SJA1105ET_SIZE_CBS_DYN_CMD \
+ SJA1105ET_SIZE_CBS_ENTRY
+
+#define SJA1105PQRS_SIZE_CBS_DYN_CMD \
+ (SJA1105_SIZE_DYN_CMD + SJA1105PQRS_SIZE_CBS_ENTRY)
+
#define SJA1105_MAX_DYN_CMD_SIZE \
SJA1105PQRS_SIZE_GENERAL_PARAMS_DYN_CMD
@@ -572,6 +578,60 @@ sja1105_retagging_cmd_packing(void *buf, struct
sja1105_dyn_cmd *cmd,
sja1105_packing(p, &cmd->index, 5, 0, size, op);
}
+static void sja1105et_cbs_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+ enum packing_op op)
+{
+ u8 *p = buf + SJA1105ET_SIZE_CBS_ENTRY;
+ const int size = SJA1105_SIZE_DYN_CMD;
+
+ sja1105_packing(p, &cmd->valid, 31, 31, size, op);
+ sja1105_packing(p, &cmd->index, 19, 16, size, op);
+}
+
+static size_t sja1105et_cbs_entry_packing(void *buf, void *entry_ptr,
+ enum packing_op op)
+{
+ const size_t size = SJA1105ET_SIZE_CBS_ENTRY;
+ struct sja1105_cbs_entry *entry = entry_ptr;
+ u8 *cmd = buf + size;
+ u8 *p = buf;
+
+ sja1105_packing(cmd, &entry->port, 5, 3, SJA1105_SIZE_DYN_CMD, op);
+ sja1105_packing(cmd, &entry->prio, 2, 0, SJA1105_SIZE_DYN_CMD, op);
+ sja1105_packing(p + 3, &entry->credit_lo, 31, 0, size, op);
+ sja1105_packing(p + 2, &entry->credit_hi, 31, 0, size, op);
+ sja1105_packing(p + 1, &entry->send_slope, 31, 0, size, op);
+ sja1105_packing(p + 0, &entry->idle_slope, 31, 0, size, op);
+ return size;
+}
+
+static void sja1105pqrs_cbs_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+ enum packing_op op)
+{
+ u8 *p = buf + SJA1105PQRS_SIZE_CBS_ENTRY;
+ const int size = SJA1105_SIZE_DYN_CMD;
+
+ sja1105_packing(p, &cmd->valid, 31, 31, size, op);
+ sja1105_packing(p, &cmd->rdwrset, 30, 30, size, op);
+ sja1105_packing(p, &cmd->errors, 29, 29, size, op);
+ sja1105_packing(p, &cmd->index, 3, 0, size, op);
+}
+
+static size_t sja1105pqrs_cbs_entry_packing(void *buf, void *entry_ptr,
+ enum packing_op op)
+{
+ const size_t size = SJA1105PQRS_SIZE_CBS_ENTRY;
+ struct sja1105_cbs_entry *entry = entry_ptr;
+
+ sja1105_packing(buf, &entry->port, 159, 157, size, op);
+ sja1105_packing(buf, &entry->prio, 156, 154, size, op);
+ sja1105_packing(buf, &entry->credit_lo, 153, 122, size, op);
+ sja1105_packing(buf, &entry->credit_hi, 121, 90, size, op);
+ sja1105_packing(buf, &entry->send_slope, 89, 58, size, op);
+ sja1105_packing(buf, &entry->idle_slope, 57, 26, size, op);
+ return size;
+}
+
#define OP_READ BIT(0)
#define OP_WRITE BIT(1)
#define OP_DEL BIT(2)
@@ -661,6 +721,14 @@ struct sja1105_dynamic_table_ops
sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
.packed_size = SJA1105_SIZE_RETAGGING_DYN_CMD,
.addr = 0x31,
},
+ [BLK_IDX_CBS] = {
+ .entry_packing = sja1105et_cbs_entry_packing,
+ .cmd_packing = sja1105et_cbs_cmd_packing,
+ .max_entry_count = SJA1105ET_MAX_CBS_COUNT,
+ .access = OP_WRITE,
+ .packed_size = SJA1105ET_SIZE_CBS_DYN_CMD,
+ .addr = 0x2c,
+ },
[BLK_IDX_XMII_PARAMS] = {0},
};
@@ -755,6 +823,14 @@ struct sja1105_dynamic_table_ops
sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
.packed_size = SJA1105_SIZE_RETAGGING_DYN_CMD,
.addr = 0x38,
},
+ [BLK_IDX_CBS] = {
+ .entry_packing = sja1105pqrs_cbs_entry_packing,
+ .cmd_packing = sja1105pqrs_cbs_cmd_packing,
+ .max_entry_count = SJA1105PQRS_MAX_CBS_COUNT,
+ .access = OP_WRITE,
+ .packed_size = SJA1105PQRS_SIZE_CBS_DYN_CMD,
+ .addr = 0x32,
+ },
[BLK_IDX_XMII_PARAMS] = {0},
};
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c
b/drivers/net/dsa/sja1105/sja1105_main.c
index 44ce7882dfb1..53bbf01ecb8e 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1640,6 +1640,98 @@ static void sja1105_bridge_leave(struct
dsa_switch *ds, int port,
sja1105_bridge_member(ds, port, br, false);
}
+#define BYTES_PER_KBIT (1000LL / 8)
+
+static int sja1105_find_unused_cbs_shaper(struct sja1105_private *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->info->num_cbs_shapers; i++)
+ if (!priv->cbs[i].idle_slope && !priv->cbs[i].send_slope)
+ return i;
+
+ return -1;
+}
+
+static int sja1105_delete_cbs_shaper(struct sja1105_private *priv, int port,
+ int prio)
+{
+ int i;
+
+ for (i = 0; i < priv->info->num_cbs_shapers; i++) {
+ struct sja1105_cbs_entry *cbs = &priv->cbs[i];
+
+ if (cbs->port == port && cbs->prio == prio) {
+ memset(cbs, 0, sizeof(*cbs));
+ return sja1105_dynamic_config_write(priv, BLK_IDX_CBS,
+ i, cbs, true);
+ }
+ }
+
+ return 0;
+}
+
+static int sja1105_setup_tc_cbs(struct dsa_switch *ds, int port,
+ struct tc_cbs_qopt_offload *offload)
+{
+ struct sja1105_private *priv = ds->priv;
+ struct sja1105_cbs_entry *cbs;
+ int index;
+
+ if (!offload->enable)
+ return sja1105_delete_cbs_shaper(priv, port, offload->queue);
+
+ index = sja1105_find_unused_cbs_shaper(priv);
+ if (index < 0)
+ return -ENOSPC;
+
+ cbs = &priv->cbs[index];
+ cbs->port = port;
+ cbs->prio = offload->queue;
+ /* locredit and sendslope are negative by definition. In hardware,
+ * positive values must be provided, and the negative sign is implicit.
+ */
+ cbs->credit_hi = offload->hicredit;
+ cbs->credit_lo = abs(offload->locredit);
+ /* User space is in kbits/sec, hardware in bytes/sec */
+ cbs->idle_slope = offload->idleslope * BYTES_PER_KBIT;
+ cbs->send_slope = abs(offload->sendslope * BYTES_PER_KBIT);
+ /* Convert the negative values from 64-bit 2's complement
+ * to 32-bit 2's complement (for the case of 0x80000000 whose
+ * negative is still negative).
+ */
+ cbs->credit_lo &= GENMASK_ULL(31, 0);
+ cbs->send_slope &= GENMASK_ULL(31, 0);
+
+ return sja1105_dynamic_config_write(priv, BLK_IDX_CBS, index, cbs,
+ true);
+}
+
+static int sja1105_reload_cbs(struct sja1105_private *priv)
+{
+ int rc = 0, i;
+
+ for (i = 0; i < priv->info->num_cbs_shapers; i++) {
+ struct sja1105_cbs_entry *cbs = &priv->cbs[i];
+
+ if (!cbs->idle_slope && !cbs->send_slope)
+ continue;
+
+ rc = sja1105_dynamic_config_write(priv, BLK_IDX_CBS, i, cbs,
+ true);
+ if (rc)
+ break;
+ }
+
+ return rc;
+}
+
+static int sja1105_setup_tc_mqprio(struct dsa_switch *ds, int port,
+ struct tc_mqprio_qopt *offload)
+{
+ return 0;
+}
+
static const char * const sja1105_reset_reasons[] = {
[SJA1105_VLAN_FILTERING] = "VLAN filtering",
[SJA1105_RX_HWTSTAMPING] = "RX timestamping",
@@ -1754,6 +1846,10 @@ int sja1105_static_config_reload(struct
sja1105_private *priv,
sja1105_sgmii_pcs_force_speed(priv, speed);
}
}
+
+ rc = sja1105_reload_cbs(priv);
+ if (rc < 0)
+ goto out;
out:
mutex_unlock(&priv->mgmt_lock);
@@ -3131,6 +3227,10 @@ static int sja1105_port_setup_tc(struct
dsa_switch *ds, int port,
switch (type) {
case TC_SETUP_QDISC_TAPRIO:
return sja1105_setup_tc_taprio(ds, port, type_data);
+ case TC_SETUP_QDISC_MQPRIO:
+ return sja1105_setup_tc_mqprio(ds, port, type_data);
+ case TC_SETUP_QDISC_CBS:
+ return sja1105_setup_tc_cbs(ds, port, type_data);
default:
return -EOPNOTSUPP;
}
@@ -3408,6 +3508,14 @@ static int sja1105_probe(struct spi_device *spi)
if (rc)
return rc;
+ if (IS_ENABLED(CONFIG_NET_SCH_CBS)) {
+ priv->cbs = devm_kcalloc(dev, priv->info->num_cbs_shapers,
+ sizeof(struct sja1105_cbs_entry),
+ GFP_KERNEL);
+ if (!priv->cbs)
+ return -ENOMEM;
+ }
+
/* Connections between dsa_port and sja1105_port */
for (port = 0; port < SJA1105_NUM_PORTS; port++) {
struct sja1105_port *sp = &priv->ports[port];
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c
b/drivers/net/dsa/sja1105/sja1105_spi.c
index a0dacae803cc..bb52b9c841b2 100644
--- a/drivers/net/dsa/sja1105/sja1105_spi.c
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -515,6 +515,7 @@ struct sja1105_info sja1105e_info = {
.qinq_tpid = ETH_P_8021Q,
.ptp_ts_bits = 24,
.ptpegr_ts_bytes = 4,
+ .num_cbs_shapers = SJA1105ET_MAX_CBS_COUNT,
.reset_cmd = sja1105et_reset_cmd,
.fdb_add_cmd = sja1105et_fdb_add,
.fdb_del_cmd = sja1105et_fdb_del,
@@ -530,6 +531,7 @@ struct sja1105_info sja1105t_info = {
.qinq_tpid = ETH_P_8021Q,
.ptp_ts_bits = 24,
.ptpegr_ts_bytes = 4,
+ .num_cbs_shapers = SJA1105ET_MAX_CBS_COUNT,
.reset_cmd = sja1105et_reset_cmd,
.fdb_add_cmd = sja1105et_fdb_add,
.fdb_del_cmd = sja1105et_fdb_del,
@@ -545,6 +547,7 @@ struct sja1105_info sja1105p_info = {
.qinq_tpid = ETH_P_8021AD,
.ptp_ts_bits = 32,
.ptpegr_ts_bytes = 8,
+ .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT,
.setup_rgmii_delay = sja1105pqrs_setup_rgmii_delay,
.reset_cmd = sja1105pqrs_reset_cmd,
.fdb_add_cmd = sja1105pqrs_fdb_add,
@@ -561,6 +564,7 @@ struct sja1105_info sja1105q_info = {
.qinq_tpid = ETH_P_8021AD,
.ptp_ts_bits = 32,
.ptpegr_ts_bytes = 8,
+ .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT,
.setup_rgmii_delay = sja1105pqrs_setup_rgmii_delay,
.reset_cmd = sja1105pqrs_reset_cmd,
.fdb_add_cmd = sja1105pqrs_fdb_add,
@@ -577,6 +581,7 @@ struct sja1105_info sja1105r_info = {
.qinq_tpid = ETH_P_8021AD,
.ptp_ts_bits = 32,
.ptpegr_ts_bytes = 8,
+ .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT,
.setup_rgmii_delay = sja1105pqrs_setup_rgmii_delay,
.reset_cmd = sja1105pqrs_reset_cmd,
.fdb_add_cmd = sja1105pqrs_fdb_add,
@@ -594,6 +599,7 @@ struct sja1105_info sja1105s_info = {
.qinq_tpid = ETH_P_8021AD,
.ptp_ts_bits = 32,
.ptpegr_ts_bytes = 8,
+ .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT,
.setup_rgmii_delay = sja1105pqrs_setup_rgmii_delay,
.reset_cmd = sja1105pqrs_reset_cmd,
.fdb_add_cmd = sja1105pqrs_fdb_add,
diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.h
b/drivers/net/dsa/sja1105/sja1105_static_config.h
index 5946847bb5b9..9b62b9b5549d 100644
--- a/drivers/net/dsa/sja1105/sja1105_static_config.h
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.h
@@ -30,11 +30,13 @@
#define SJA1105ET_SIZE_L2_LOOKUP_PARAMS_ENTRY 4
#define SJA1105ET_SIZE_GENERAL_PARAMS_ENTRY 40
#define SJA1105ET_SIZE_AVB_PARAMS_ENTRY 12
+#define SJA1105ET_SIZE_CBS_ENTRY 16
#define SJA1105PQRS_SIZE_L2_LOOKUP_ENTRY 20
#define SJA1105PQRS_SIZE_MAC_CONFIG_ENTRY 32
#define SJA1105PQRS_SIZE_L2_LOOKUP_PARAMS_ENTRY 16
#define SJA1105PQRS_SIZE_GENERAL_PARAMS_ENTRY 44
#define SJA1105PQRS_SIZE_AVB_PARAMS_ENTRY 16
+#define SJA1105PQRS_SIZE_CBS_ENTRY 20
/* UM10944.pdf Page 11, Table 2. Configuration Blocks */
enum {
@@ -56,6 +58,7 @@ enum {
BLKID_AVB_PARAMS = 0x10,
BLKID_GENERAL_PARAMS = 0x11,
BLKID_RETAGGING = 0x12,
+ BLKID_CBS = 0x13,
BLKID_XMII_PARAMS = 0x4E,
};
@@ -78,6 +81,7 @@ enum sja1105_blk_idx {
BLK_IDX_AVB_PARAMS,
BLK_IDX_GENERAL_PARAMS,
BLK_IDX_RETAGGING,
+ BLK_IDX_CBS,
BLK_IDX_XMII_PARAMS,
BLK_IDX_MAX,
/* Fake block indices that are only valid for dynamic access */
@@ -105,6 +109,8 @@ enum sja1105_blk_idx {
#define SJA1105_MAX_RETAGGING_COUNT 32
#define SJA1105_MAX_XMII_PARAMS_COUNT 1
#define SJA1105_MAX_AVB_PARAMS_COUNT 1
+#define SJA1105ET_MAX_CBS_COUNT 10
+#define SJA1105PQRS_MAX_CBS_COUNT 16
#define SJA1105_MAX_FRAME_MEMORY 929
#define SJA1105_MAX_FRAME_MEMORY_RETAGGING 910
@@ -289,6 +295,15 @@ struct sja1105_retagging_entry {
u64 destports;
};
+struct sja1105_cbs_entry {
+ u64 port;
+ u64 prio;
+ u64 credit_hi;
+ u64 credit_lo;
+ u64 send_slope;
+ u64 idle_slope;
+};
+
struct sja1105_xmii_params_entry {
u64 phy_mac[5];
u64 xmii_mode[5];
--
2.25.1
--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman
dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Bloat] dslreports is no longer free
2020-05-27 9:08 0% ` [Cake] " Matthew Ford
@ 2020-05-27 9:32 0% ` Sebastian Moeller
0 siblings, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-27 9:32 UTC (permalink / raw)
To: Matthew Ford
Cc: Dave Täht, Cake List, Make-Wifi-fast, cerowrt-devel, bloat
Hi Mat,
> On May 27, 2020, at 11:08, Matthew Ford <ford@isoc.org> wrote:
>
> What's the bufferbloat verdict on https://speed.cloudflare.com/ ?
Not a verdict per se, but this has potential; it is not quite there yet, though.
Pros: Decent reporting of the download rates, including intermediate values
Decent reporting of the idle latency (I like the box-and-whisker plots and the details revealed on mouse-over, as well as the individual samples)
Cons: Upload testing seems to be missing
Latency is only measured during a pre-download idle phase. That is important, but for bufferbloat testing we really need to see the latency-under-load numbers (separately for down- and upload).
Test duration is not configurable. A number of ISP techniques, like power-boost, can give higher throughput for a limited amount of time, which often accidentally coincides with the typical duration of a speedtest*, so being able to confirm bufferbloat remedies at longer test run times is really helpful (nothing crazy, but if a test can run 30-60 seconds instead of just 10-20 seconds, that already helps a lot).
Best Regards
Sebastian
*) I believe this to be accidental, as the durations for "fair" power-boosting naturally fall in the same few-dozens-of-seconds range as typical speedtests take; nothing nefarious here.
>
> Mat
>
>> On 1 May 2020, at 20:48, Sebastian Moeller <moeller0@gmx.de> wrote:
>>
>> Hi Dave,
>>
>> well, it was a free service and it lasted a long time. I want to raise a toast to Justin and convey my sincere thanks for years of investing into the "good" of the internet.
>>
>> Now, the question is which test is going to be the rightful successor?
>>
>> Short of running netperf/irtt/iperf2/iperf3 on a hosted server, I see lots of potential, but none of the tests are really there yet (grievances in no particular order):
>>
>> OOKLA: speedtest.net.
>> Pros: ubiquitous, allows selection of single-flow versus multi-flow test, allows server selection
>> Cons: only IPv4, only static unloaded RTT measurement, no control over measurement duration
>> BUFFERBLOAT verdict: incomplete, maybe usable as load generator
>>
>>
>> NETFLIX: fast.com.
>> Pros: allows selection of upload testing, supposedly decent back-end, duration configurable
>> allows unloaded, loaded download and loaded upload RTT measurements (but reports single numbers for loaded and unloaded RTT, which are not the max)
>> Cons: RTT report as two numbers one for the loaded and one for unloaded RTT, time-course of RTTs missing
>> BUFFERBLOAT verdict: incomplete, but oh, so close...
>>
>>
>> NPERF: nperf.com
>> Pros: allows server selection, RTT measurement and report as time course, also reports average rates and static RTT/jitter for Up- and Download
>> Cons: RTT measurement for unloaded only, reported RTT is static only, no control over measurement duration
>> BUFFERBLOAT verdict: incomplete,
>>
>>
>> THINKBROADBAND: www.thinkbroadband.com/speedtest
>> Pros: IPv6, reports coarse RTT time courses for all three measurement phases
>> Cons: only static unloaded RTT report in final results, time courses only visible immediately after testing, no control over measurement duration
>> BUFFERBLOAT verdict: a bit coarse, might work for users within a reasonable distance to the UK for acute de-bloating sessions (history reporting is bad though)
>>
>>
>> honorable mentioning:
>> BREITBANDMESSUNG: breitbandmessung.de
>> Pros: query of contracted internet access speed before measurement, with a scheduler that will only start a test when the backend has sufficient capacity to saturate the user-supplied contracted rates, IPv6 (happy-eyeballs)
>> Cons: only static unloaded RTT measurement, no control over measurement duration
>> BUFFERBLOAT verdict: unsuitable, except as load generator, but the bandwidth reservation feature is quite nice.
>>
>> Best Regards
>> Sebastian
>>
>>
>>> On May 1, 2020, at 18:44, Dave Taht <dave.taht@gmail.com> wrote:
>>>
>>> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed_test_no_longer_free/
>>>
>>> They ran out of bandwidth.
>>>
>>> Message to users here:
>>>
>>> http://www.dslreports.com/speedtest
>>>
>>>
>>> --
>>> Make Music, Not War
>>>
>>> Dave Täht
>>> CTO, TekLibre, LLC
>>> http://www.teklibre.com
>>> Tel: 1-831-435-0729
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
>>
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Bloat] dslreports is no longer free
2020-05-01 19:48 0% ` [Cake] " Sebastian Moeller
2020-05-01 20:09 1% ` [Bloat] " Sergey Fedorov
[not found] ` <mailman.170.1588363787.24343.bloat@lists.bufferbloat.net>
@ 2020-05-27 9:08 0% ` Matthew Ford
2020-05-27 9:32 0% ` Sebastian Moeller
2 siblings, 1 reply; 200+ results
From: Matthew Ford @ 2020-05-27 9:08 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Dave Täht, Cake List, Make-Wifi-fast, cerowrt-devel, bloat
What's the bufferbloat verdict on https://speed.cloudflare.com/ ?
Mat
> On 1 May 2020, at 20:48, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Dave,
>
> well, it was a free service and it lasted a long time. I want to raise a toast to Justin and convey my sincere thanks for years of investing into the "good" of the internet.
>
> Now, the question is which test is going to be the rightful successor?
>
> Short of running netperf/irtt/iperf2/iperf3 on a hosted server, I see lots of potential, but none of the tests are really there yet (grievances in no particular order):
>
> OOKLA: speedtest.net.
> Pros: ubiquitous, allows selection of single-flow versus multi-flow test, allows server selection
> Cons: only IPv4, only static unloaded RTT measurement, no control over measurement duration
> BUFFERBLOAT verdict: incomplete, maybe usable as load generator
>
>
> NETFLIX: fast.com.
> Pros: allows selection of upload testing, supposedly decent back-end, duration configurable
> allows unloaded, loaded download and loaded upload RTT measurements (but reports single numbers for loaded and unloaded RTT, which are not the max)
> Cons: RTT report as two numbers one for the loaded and one for unloaded RTT, time-course of RTTs missing
> BUFFERBLOAT verdict: incomplete, but oh, so close...
>
>
> NPERF: nperf.com
> Pros: allows server selection, RTT measurement and report as time course, also reports average rates and static RTT/jitter for Up- and Download
> Cons: RTT measurement for unloaded only, reported RTT is static only, no control over measurement duration
> BUFFERBLOAT verdict: incomplete,
>
>
> THINKBROADBAND: www.thinkbroadband.com/speedtest
> Pros: IPv6, reports coarse RTT time courses for all three measurement phases
> Cons: only static unloaded RTT report in final results, time courses only visible immediately after testing, no control over measurement duration
> BUFFERBLOAT verdict: a bit coarse, might work for users within a reasonable distance to the UK for acute de-bloating sessions (history reporting is bad though)
>
>
> honorable mentioning:
> BREITBANDMESSUNG: breitbandmessung.de
> Pros: query of contracted internet access speed before measurement, with a scheduler that will only start a test when the backend has sufficient capacity to saturate the user-supplied contracted rates, IPv6 (happy-eyeballs)
> Cons: only static unloaded RTT measurement, no control over measurement duration
> BUFFERBLOAT verdict: unsuitable, except as load generator, but the bandwidth reservation feature is quite nice.
>
> Best Regards
> Sebastian
>
>
>> On May 1, 2020, at 18:44, Dave Taht <dave.taht@gmail.com> wrote:
>>
>> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed_test_no_longer_free/
>>
>> They ran out of bandwidth.
>>
>> Message to users here:
>>
>> http://www.dslreports.com/speedtest
>>
>>
>> --
>> Make Music, Not War
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-435-0729
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-25 9:42 0% ` Jonathan Morton
@ 2020-05-25 11:58 0% ` Toke Høiland-Jørgensen
2020-06-14 12:43 1% ` Avakash bhat
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-05-25 11:58 UTC (permalink / raw)
To: Jonathan Morton, Avakash bhat
Cc: Cake List, Dave Taht, Vybhav Pai, Shrinidhi Varna,
Mohit P. Tahiliani, Deepak K
Jonathan Morton <chromatix99@gmail.com> writes:
>> On 25 May, 2020, at 8:17 am, Avakash bhat <avakash261@gmail.com> wrote:
>>
>> We had another query we would like to resolve. We wanted to verify the working of ack filter in ns-3,
>> so we decided to replicate the Fig 6 graph in the CAKE paper(https://ieeexplore.ieee.org/document/8475045).
>> While trying to build the topology we realized that we do not know the number of packets or bytes sent from
>> the source to the destination for each of the TCP connections ( We are assuming it is a point to point connection with 4 TCP flows).
>>
>> Could we get a bit more details about how the experiment was conducted?
>
> I believe this was conducted using the RRUL test in Flent. This opens
> four saturating TCP flows in each direction, and also sends a small
> amount of latency measuring traffic. On this occasion I don't think
> we added any simulated path delays, and only imposed the quoted
> asymmetric bandwidth limits (30Mbps down, 1Mbps up).
See https://www.cs.kau.se/tohojo/cake/ - the link to the data files near
the bottom of that page also contains the Flent batch file and setup
scripts used to run the whole thing.
(And there's no explicit "number of bytes sent", but rather the flows
are capacity-seeking flows running for a limited *time*).
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-25 5:17 1% ` Avakash bhat
@ 2020-05-25 9:42 0% ` Jonathan Morton
2020-05-25 11:58 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-05-25 9:42 UTC (permalink / raw)
To: Avakash bhat
Cc: Cake List, Toke Høiland-Jørgensen, Dave Taht,
Vybhav Pai, Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
> On 25 May, 2020, at 8:17 am, Avakash bhat <avakash261@gmail.com> wrote:
>
> We had another query we would like to resolve. We wanted to verify the working of ack filter in ns-3,
> so we decided to replicate the Fig 6 graph in the CAKE paper(https://ieeexplore.ieee.org/document/8475045).
> While trying to build the topology we realized that we do not know the number of packets or bytes sent from
> the source to the destination for each of the TCP connections ( We are assuming it is a point to point connection with 4 TCP flows).
>
> Could we get a bit more details about how the experiment was conducted?
I believe this was conducted using the RRUL test in Flent. This opens four saturating TCP flows in each direction, and also sends a small amount of latency measuring traffic. On this occasion I don't think we added any simulated path delays, and only imposed the quoted asymmetric bandwidth limits (30Mbps down, 1Mbps up).
> Also is this the best way to verify the correctness of our implementation?
Obviously with limited space in our paper, we could only include a small selection of test results. Many other tests were run in practice, and we have expanded our test repertoire since.
In particular, we now routinely run tests with a simulated typical Internet path delay inserted, e.g. 20ms, 80ms, 160ms baseline RTTs to represent reaching a local-ish CDN, crossing the Atlantic, and going from Europe to the US West Coast. You will also want to include multiple traffic mixes in the analysis, in particular different congestion control algorithms (at least Reno and CUBIC), and running with ECN both enabled and disabled at the endpoints.
A useful torture test we used was to send many bulk flows up the narrow side of the link and a single bulk flow down the wide side. For example, 50:1 flow counts with 1:10, 1:20 and 1:30 bandwidth asymmetries. The acks of the single flow then have to compete with the heavy load of the many flows, and the total goodput of that single flow is an important metric, along with both the total goodput and the Jain's fairness of the upload traffic. This should show a particularly strong effect of the ack filter, as otherwise individual acks have to be dropped by the AQM, which Codel is not very good at adapting to quickly.
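The Jain's fairness index mentioned above is a standard one-line formula; a minimal sketch for per-flow goodput measurements (the sample rate vectors are illustrative, not from an actual test run):

```python
# Jain's fairness index: J = (sum x)^2 / (n * sum x^2).
# J = 1 means perfectly equal shares; J = 1/n means one flow got everything.
def jains_fairness(rates):
    n = len(rates)
    s = sum(rates)
    return s * s / (n * sum(r * r for r in rates))

print(jains_fairness([1.0, 1.0, 1.0, 1.0]))  # -> 1.0  (perfectly fair)
print(jains_fairness([4.0, 0.0, 0.0, 0.0]))  # -> 0.25 (one flow starves the rest)
```

Applied to the upload flows in the 50:1 torture test, a drop in J flags the case where dropped acks let a few flows dominate.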
In evaluating the above, you will want to be vigilant not only for observed gross performance, but also the extent to which the ack filter preserves or loses information from the ack stream. This is particularly true in the runs without ECN, in which congestion signals can only be applied through packet loss, and the feedback of that signal is through dup-acks and SACK. I think you will find that the "aggressive" setting loses some information, and its performance suffers accordingly in some cases.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-08 15:40 1% ` Dave Taht
@ 2020-05-25 5:17 1% ` Avakash bhat
2020-05-25 9:42 0% ` Jonathan Morton
0 siblings, 1 reply; 200+ results
From: Avakash bhat @ 2020-05-25 5:17 UTC (permalink / raw)
To: Cake List
Cc: Jonathan Morton, Toke Høiland-Jørgensen, Dave Taht,
Vybhav Pai, Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
[-- Attachment #1: Type: text/plain, Size: 2386 bytes --]
Hi all,
We had another query we would like to resolve. We wanted to verify the
working of ack filter in ns-3,
so we decided to replicate the Fig 6 graph in the CAKE paper(
https://ieeexplore.ieee.org/document/8475045).
While trying to build the topology we realized that we do not know the
number of packets or bytes sent from
the source to the destination for each of the TCP connections ( We are
assuming it is a point to point connection with 4 TCP flows).
Could we get a bit more details about how the experiment was conducted?
Also is this the best way to verify the correctness of our implementation?
Thanks,
Avakash Bhat
On Fri, May 8, 2020 at 9:11 PM Dave Taht <dave.taht@gmail.com> wrote:
> acks at the time you have reached a point of dropping them
> significantly have filled the pipe, also.
>
> What I saw here was that the first flow to really get going, and
> really get dropped, dominated over the others,
> because I thought it was consistently ending up in the priority queue.
>
> http://blog.cerowrt.org/post/ack_filtering/
>
> Look, all I'm proposing is this idea be tried and tested. Cynically...
> since there's a new model coming out as
> the result of this work, it immediately turns into something a good
> paper can hinge on.
>
> On Fri, May 8, 2020 at 8:20 AM Jonathan Morton <chromatix99@gmail.com>
> wrote:
> >
> > >> The ACK filter runs on enqueue, so if a queue has only ACKs in it, it
> > >> will never accumulate anything in the first place...
> > >
> > > but the side effect is that on dequeue, it flips it into the fast
> > > queue drr rotation, not the slow, so it can't accumulate
> > > as many acks before delivering the one it has left.
> > >
> > > Or so I thought, way back when....
> >
> > The ack filter converts a stream of acks that might be treated as a bulk
> flow into a sparse flow, which is delivered promptly. This is a good
> thing; an ack should not be held back solely to see whether another one
> will arrive.
> >
> > I think of it as an optimisation to reduce delay of the information in
> the ack stream, not solely as a way to reduce the bandwidth consumed by the
> ack stream; the latter is a happy side effect.
> >
> > - Jonathan Morton
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>
[-- Attachment #2: Type: text/html, Size: 3218 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Is target a command-line option?
[not found] <mailman.404.1590061333.24343.cake@lists.bufferbloat.net>
@ 2020-05-22 14:18 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-05-22 14:18 UTC (permalink / raw)
To: Aarti Nandagiri., cake
"Aarti Nandagiri. via Cake" <cake@lists.bufferbloat.net> writes:
> From: "Aarti Nandagiri." <aarti.183is001@nitk.edu.in>
> Subject: Is target a command-line option?
> To: cake@lists.bufferbloat.net
> Date: Thu, 21 May 2020 17:12:00 +0530
>
> Hello,
>
> Can I use a different 'target' value by passing it as a command-line option
> for the cake qdisc?
Not directly. You can use one of the presets (datacentre, lan, metro,
regional, internet, oceanic, satellite, interplanetary) to select a
preset target/interval setting, or you can use 'rtt X' to set the CoDel
interval, in which case 'target' will be set to interval/20.
-Toke
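For reference, a sketch of both styles of invocation (the interface name and rates here are placeholders, not taken from the thread):

```shell
# Pick a preset keyword; each implies a matched target/interval pair:
tc qdisc replace dev eth0 root cake bandwidth 20mbit metro

# Or give the expected path RTT directly; 'interval' tracks the rtt value
# and 'target' is derived from it as interval/20 (150ms -> 7.5ms):
tc qdisc replace dev eth0 root cake bandwidth 20mbit rtt 150ms
```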
^ permalink raw reply [relevance 0%]
* Re: [Cake] [PATCH] net/sch_generic.h: use sizeof_member() and get rid of unused variable
2020-05-20 18:17 1% ` David Miller
@ 2020-05-20 21:25 1% ` Antonio Quartulli
0 siblings, 0 replies; 200+ results
From: Antonio Quartulli @ 2020-05-20 21:25 UTC (permalink / raw)
To: David Miller; +Cc: netdev, cake, toke, jhs, xiyou.wangcong, jiri, stephen
On 20/05/2020 20:17, David Miller wrote:
> From: Antonio Quartulli <a@unstable.cc>
> Date: Wed, 20 May 2020 10:39:33 +0200
>
>> I don't think it's BUILD_BUG_ON()'s fault, because qcb->data is passed
>> to sizeof() first.
>>
>> My best guess is that gcc is somewhat optimizing the sizeof(qcb->data)
>> and thus leaving the qcb variable unused.
>
> If you remove the argument from the function but leave the BUILD_BUG_ON()
> calls the same, the compilation will fail.
>
> Any such optimization is therefore unreasonable.
>
> The variable is used; otherwise compilation would not fail when you
> remove it, right?
You're correct.
I guess this should be reported to gcc then.
Regards,
--
Antonio Quartulli
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH] net/sch_generic.h: use sizeof_member() and get rid of unused variable
2020-05-20 8:39 1% ` Antonio Quartulli
@ 2020-05-20 18:17 1% ` David Miller
2020-05-20 21:25 1% ` Antonio Quartulli
0 siblings, 1 reply; 200+ results
From: David Miller @ 2020-05-20 18:17 UTC (permalink / raw)
To: a; +Cc: netdev, cake, toke, jhs, xiyou.wangcong, jiri, stephen
From: Antonio Quartulli <a@unstable.cc>
Date: Wed, 20 May 2020 10:39:33 +0200
> I don't think it's BUILD_BUG_ON()'s fault, because qcb->data is passed
> to sizeof() first.
>
> My best guess is that gcc is somewhat optimizing the sizeof(qcb->data)
> and thus leaving the qcb variable unused.
If you remove the argument from the function but leave the BUILD_BUG_ON()
calls the same, the compilation will fail.
Any such optimization is therefore unreasonable.
The variable is used; otherwise compilation would not fail when you
remove it, right?
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH] net/sch_generic.h: use sizeof_member() and get rid of unused variable
2020-05-19 22:40 1% ` David Miller
@ 2020-05-20 8:39 1% ` Antonio Quartulli
2020-05-20 18:17 1% ` David Miller
0 siblings, 1 reply; 200+ results
From: Antonio Quartulli @ 2020-05-20 8:39 UTC (permalink / raw)
To: David Miller; +Cc: netdev, cake, toke, jhs, xiyou.wangcong, jiri, stephen
Hi David,
On 20/05/2020 00:40, David Miller wrote:
> From: Antonio Quartulli <a@unstable.cc>
> Date: Tue, 19 May 2020 11:13:33 +0200
>
>> Compiling with -Wunused triggers the following warning:
>>
>> ./include/net/sch_generic.h: In function ‘qdisc_cb_private_validate’:
>> ./include/net/sch_generic.h:464:23: warning: unused variable ‘qcb’ [-Wunused-variable]
>> 464 | struct qdisc_skb_cb *qcb;
>> | ^~~
>>
>> as the qcb variable is only used to compute the sizeof one of its members.
>
> It's referenced in the code, therefore it is not "unused".
True.
>
> If in some configuration BUILD_BUG_ON() does not reference its arguments,
> that's the bug that needs to be fixed.
>
I don't think it's BUILD_BUG_ON()'s fault, because qcb->data is passed
to sizeof() first.
My best guess is that gcc is somewhat optimizing the sizeof(qcb->data)
and thus leaving the qcb variable unused.
That said, I think it's better for code style (and for the compiler)
if we use sizeof_member().
Should I resend the patch with a commit message that does not mention
the "unused" warning?
Thanks a lot.
Best Regards,
--
Antonio Quartulli
^ permalink raw reply [relevance 1%]
* Re: [Cake] [PATCH] net/sch_generic.h: use sizeof_member() and get rid of unused variable
@ 2020-05-19 22:40 1% ` David Miller
2020-05-20 8:39 1% ` Antonio Quartulli
0 siblings, 1 reply; 200+ results
From: David Miller @ 2020-05-19 22:40 UTC (permalink / raw)
To: a; +Cc: netdev, cake, toke, jhs, xiyou.wangcong, jiri, stephen
From: Antonio Quartulli <a@unstable.cc>
Date: Tue, 19 May 2020 11:13:33 +0200
> Compiling with -Wunused triggers the following warning:
>
> ./include/net/sch_generic.h: In function ‘qdisc_cb_private_validate’:
> ./include/net/sch_generic.h:464:23: warning: unused variable ‘qcb’ [-Wunused-variable]
> 464 | struct qdisc_skb_cb *qcb;
> | ^~~
>
> as the qcb variable is only used to compute the sizeof one of its members.
It's referenced in the code, therefore it is not "unused".
If in some configuration BUILD_BUG_ON() does not reference its arguments,
that's the bug that needs to be fixed.
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
@ 2020-05-08 15:40 1% ` Dave Taht
2020-05-25 5:17 1% ` Avakash bhat
0 siblings, 1 reply; 200+ results
From: Dave Taht @ 2020-05-08 15:40 UTC (permalink / raw)
To: Jonathan Morton
Cc: Toke Høiland-Jørgensen, Avakash bhat, Vybhav Pai,
Shrinidhi Varna, Cake List, Mohit P. Tahiliani, Deepak K
acks at the time you have reached a point of dropping them
significantly have filled the pipe, also.
What I saw here was that the first flow to really get going, and
really get dropped, dominated over the others,
because I thought it was consistently ending up in the priority queue.
http://blog.cerowrt.org/post/ack_filtering/
Look, all I'm proposing is this idea be tried and tested. Cynically...
since there's a new model coming out as
the result of this work, it immediately turns into something a good
paper can hinge on.
On Fri, May 8, 2020 at 8:20 AM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> >> The ACK filter runs on enqueue, so if a queue has only ACKs in it, it
> >> will never accumulate anything in the first place...
> >
> > but the side effect is that on dequeue, it flips it into the fast
> > queue drr rotation, not the slow, so it can't accumulate
> > as many acks before delivering the one it has left.
> >
> > Or so I thought, way back when....
>
> The ack filter converts a stream of acks that might be treated as a bulk flow into a sparse flow, which is delivered promptly. This is a good thing; an ack should not be held back solely to see whether another one will arrive.
>
> I think of it as an optimisation to reduce delay of the information in the ack stream, not solely as a way to reduce the bandwidth consumed by the ack stream; the latter is a happy side effect.
>
> - Jonathan Morton
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
2020-05-08 15:08 0% ` Toke Høiland-Jørgensen
@ 2020-05-08 15:11 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Dave Taht @ 2020-05-08 15:11 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: Avakash bhat, Vybhav Pai, Shrinidhi Varna, Cake List,
Mohit P. Tahiliani, Deepak K
On Fri, May 8, 2020 at 8:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Dave Taht <dave.taht@gmail.com> writes:
>
> > On Thu, May 7, 2020 at 11:36 PM Avakash bhat <avakash261@gmail.com> wrote:
> >>
> >> Ok thanks so much for the clarifications.
> >> That cleared it up quite a bit.
> >
> > I note that there was something really subtle that could have been
> > done to improve cake's ack handling, and for all I know
> > it actually happened in the final codebase.
> >
> > so, please, go forth and duplicate the existing implementation, and
> > ignore me, cause looking at this hairy code gives me a
> > headache.
> >
> > anyway, to try and describe what I thought I saw as an interaction with
> > the scheduler back in the day.
> >
> > The ack-filter runs, deleting all but one packet from the ack queue,
> > and delivers that.
> > the scheduler runs, serves a bunch of other flows, then returns to the
> > ack queue, which has accumulated a couple more packets,
> > the ack-filter runs, deleting all but one packet from the ack queue,
> > > and delivers that, but doesn't exhaust its quantum
>
> The ACK filter runs on enqueue, so if a queue has only ACKs in it, it
> will never accumulate anything in the first place...
but the side effect is that on dequeue, it flips it into the fast
queue drr rotation, not the slow, so it can't accumulate
as many acks before delivering the one it has left.
Or so I thought, way back when....
>
> -Toke
>
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
2020-05-08 6:50 1% ` Dave Taht
2020-05-08 7:41 0% ` Sebastian Moeller
@ 2020-05-08 15:08 0% ` Toke Høiland-Jørgensen
2020-05-08 15:11 1% ` Dave Taht
1 sibling, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-05-08 15:08 UTC (permalink / raw)
To: Dave Taht, Avakash bhat
Cc: Vybhav Pai, Shrinidhi Varna, Cake List, Mohit P. Tahiliani, Deepak K
Dave Taht <dave.taht@gmail.com> writes:
> On Thu, May 7, 2020 at 11:36 PM Avakash bhat <avakash261@gmail.com> wrote:
>>
>> Ok thanks so much for the clarifications.
>> That cleared it up quite a bit.
>
> I note that there was something really subtle that could have been
> done to improve cake's ack handling, and for all I know
> it actually happened in the final codebase.
>
> so, please, go forth and duplicate the existing implementation, and
> ignore me, cause looking at this hairy code gives me a
> headache.
>
> anyway, to try and describe what I thought I saw as an interaction with
> the scheduler back in the day.
>
> The ack-filter runs, deleting all but one packet from the ack queue,
> and delivers that.
> the scheduler runs, serves a bunch of other flows, then returns to the
> ack queue, which has accumulated a couple more packets,
> the ack-filter runs, deleting all but one packet from the ack queue,
> and delivers that, but doesn't exhaust its quantum
The ACK filter runs on enqueue, so if a queue has only ACKs in it, it
will never accumulate anything in the first place...
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-07 7:07 1% ` Sebastian Moeller
2020-05-08 6:36 1% ` Avakash bhat
@ 2020-05-08 8:23 0% ` Sebastian Moeller
1 sibling, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-08 8:23 UTC (permalink / raw)
To: Cake List, Avakash bhat, Toke Høiland-Jørgensen
Cc: Mohit P. Tahiliani, Shrinidhi Varna, Deepak K, Vybhav Pai
Dear All,
just as a side-note. I believe that ACK filtering is one more application that directly profits from flow-queueing (as the set of packets to compare with is already separated out from the set of all queued packets), since one needs to collect ACKs according to their 4-tuples, which FQ does naturally.
Best Regards
Sebastian
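Sebastian's observation can be sketched as follows: once packets are hashed per flow, an ACK filter only has to look at the tail of one queue. This is a toy model; the hash choice, queue count, and names are illustrative (Cake itself uses a keyed Jenkins hash with set-associative collision handling):

```python
from collections import deque
import hashlib

NUM_QUEUES = 1024  # Cake defaults to 1024 queues per tin

def flow_hash(src, dst, sport, dport, proto):
    """Stochastically map a flow tuple to a queue index (simplified)."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % NUM_QUEUES

queues = [deque() for _ in range(NUM_QUEUES)]

def enqueue(pkt, five_tuple):
    q = queues[flow_hash(*five_tuple)]
    # An ACK filter only needs to compare against q[-1] here,
    # never against packets of unrelated flows.
    q.append(pkt)
```

Because all ACKs of one reverse flow land in the same queue, the filter's candidate set is already "pre-sorted" for free.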
> On May 7, 2020, at 09:07, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> I think that you will remove all redundant ACKs in one go, considerably advancing the new ACK in the queue. And more importantly, in most relevant modes cake will apply one queue per flow stochastically, so almost all packets in a reverse ACK flow will be ACKs with identical 5-tuples....
>
> On 7 May 2020 08:44:59 CEST, Avakash bhat <avakash261@gmail.com> wrote:
>
> Thanks for the quick response. I also had a followup question.
>
> If the ack filter adds the new ack to the tail of the queue after removing an ack from the queue, won't it be starving the ack?
> The replaced ack was much ahead in the queue than the ack we replaced at the tail right?
>
> Thanks,
> Avakash Bhat
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-08 6:50 1% ` Dave Taht
@ 2020-05-08 7:41 0% ` Sebastian Moeller
2020-05-08 15:08 0% ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-08 7:41 UTC (permalink / raw)
To: Dave Täht
Cc: Avakash bhat, Vybhav Pai, Shrinidhi Varna, Cake List,
Mohit P. Tahiliani, Deepak K
Hi Dave,
> On May 8, 2020, at 08:50, Dave Taht <dave.taht@gmail.com> wrote:
>
> On Thu, May 7, 2020 at 11:36 PM Avakash bhat <avakash261@gmail.com> wrote:
>>
>> Ok thanks so much for the clarifications.
>> That cleared it up quite a bit.
>
> I note that there was something really subtle that could have been
> done to improve cake's ack handling, and for all I know
> it actually happened in the final codebase.
>
> so, please, go forth and duplicate the existing implementation, and
> ignore me, cause looking at this hairy code gives me a
> headache.
>
> anyway, to try and describe what I thought I saw as an interaction with
> the scheduler back in the day.
>
> The ack-filter runs, deleting all but one packet from the ack queue,
> and delivers that.
> the scheduler runs, serves a bunch of other flows, then returns to the
> ack queue, which has accumulated a couple more packets,
> the ack-filter runs, deleting all but one packet from the ack queue,
> and delivers that, but doesn't exhaust its quantum
> but now that flow is in the "fast" queue, and we service just a few
> other flows, and return to it, delete a couple, service one... and
> stay stuck in the fast queue.
Why would that be a problem? In that case ACKs did not bunch up (otherwise there would be backlog in the queue and it would forfeit its sparseness boost) and hence delivering the only ACK in a timely fashion should preserve the ACK clock, especially for non-ABC ACKs (https://tools.ietf.org/html/rfc3465) that rely on ACK count? Sure, if the single ACK had already matured a bit and immediately after sending it a fresher ACK would have been enqueued that looks suboptimal, but that race seems to exist no matter what? Now if the goal is to weed out ACKs to conserve bandwidth, sure, not filtering ACKs is sub-optimal, but for the clocking?
Side-note, whoever invented the term "ACK-clocking" seemingly had a very fuzzy concept of what a clock is and what precision a clock can be expected to deliver ;)
Best Regards
Sebastian
P.S.: As so often, I might simply be confused about the actual subtlety...
>
> better, I thought, was once the ack filter exceeded the quantum of
> packets for that flow in that drr round, even if it only delivered one
> packet,
> that it should always return it to the bulk queue, because tons more
> packets would arrive in the interval between servicing
> all the rest of the flows, thus more of which could be safely removed,
> while maintaining a steadier clock for tcp.
>
> I've already seen cake remove over 25% of all ack packets with no harm
> to the other flows. So for all I know (and I'd have to
> look) it's already doing it this way.
>
>>
>> Thanks,
>> Avakash Bhat
>>
>> On Thu, May 7, 2020 at 12:37 PM Sebastian Moeller <moeller0@gmx.de> wrote:
>>>
>>> I think that you will remove all redundant ACKs in one go, considerably advancing the new ACK in the queue. And more importantly, in most relevant modes cake will apply one queue per flow stochastically, so almost all packets in a reverse ACK flow will be ACKs with identical 5-tuples....
>>>
>>> On 7 May 2020 08:44:59 CEST, Avakash bhat <avakash261@gmail.com> wrote:
>>>>
>>>>
>>>> Thanks for the quick response. I also had a followup question.
>>>>
>>>> If the ack filter adds the new ack to the tail of the queue after removing an ack from the queue, won't it be starving the ack?
>>>> The replaced ack was much ahead in the queue than the ack we replaced at the tail right?
>>>>
>>>> Thanks,
>>>> Avakash Bhat
>>>
>>>
>>> --
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-08 6:36 1% ` Avakash bhat
@ 2020-05-08 6:50 1% ` Dave Taht
2020-05-08 7:41 0% ` Sebastian Moeller
2020-05-08 15:08 0% ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 200+ results
From: Dave Taht @ 2020-05-08 6:50 UTC (permalink / raw)
To: Avakash bhat
Cc: Sebastian Moeller, Vybhav Pai, Shrinidhi Varna, Cake List,
Mohit P. Tahiliani, Deepak K
On Thu, May 7, 2020 at 11:36 PM Avakash bhat <avakash261@gmail.com> wrote:
>
> Ok thanks so much for the clarifications.
> That cleared it up quite a bit.
I note that there was something really subtle that could have been
done to improve cake's ack handling, and for all I know
it actually happened in the final codebase.
so, please, go forth and duplicate the existing implementation, and
ignore me, cause looking at this hairy code gives me a
headache.
anyway, to try and describe what I thought I saw as an interaction with
the scheduler back in the day.
The ack-filter runs, deleting all but one packet from the ack queue,
and delivers that.
the scheduler runs, serves a bunch of other flows, then returns to the
ack queue, which has accumulated a couple more packets,
the ack-filter runs, deleting all but one packet from the ack queue,
and delivers that, but doesn't exhaust its quantum
but now that flow is in the "fast" queue, and we service just a few
other flows, and return to it, delete a couple, service one... and
stay stuck in the fast queue.
better, I thought, was once the ack filter exceeded the quantum of
packets for that flow in that drr round, even if it only delivered one
packet,
that it should always return it to the bulk queue, because tons more
packets would arrive in the interval between servicing
all the rest of the flows, thus more of which could be safely removed,
while maintaining a steadier clock for tcp.
I've already seen cake remove over 25% of all ack packets with no harm
to the other flows. So for all I know (and I'd have to
look) it's already doing it this way.
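The "fast queue" here is Cake's list of sparse ("new") flows, which is serviced ahead of the bulk ("old") DRR rotation. A toy sketch of that two-list mechanism, with names and the quantum value invented for illustration (this is not the kernel code; Dave's proposal amounts to demoting an ACK-filtered flow to the bulk list even while it still delivers a packet per visit):

```python
from collections import deque

QUANTUM = 1514  # one MTU of credit per DRR round (illustrative value)

class Flow:
    def __init__(self):
        self.queue = deque()   # packets, as bytes objects
        self.deficit = 0

def dequeue(new_flows, old_flows):
    """FQ-CoDel/Cake-style two-list DRR: sparse ("new") flows are served
    before the bulk ("old") rotation; a sparse flow that runs dry drops
    into the bulk rotation rather than keeping its priority."""
    while True:
        if new_flows:
            lst = new_flows
        elif old_flows:
            lst = old_flows
        else:
            return None
        flow = lst[0]
        if flow.deficit <= 0:
            flow.deficit += QUANTUM
            lst.rotate(-1)              # to the back of the same list
            continue
        if not flow.queue:
            lst.popleft()
            if lst is new_flows:
                old_flows.append(flow)  # demote drained sparse flow to bulk
            continue
        pkt = flow.queue.popleft()
        flow.deficit -= len(pkt)
        return pkt
```

An ACK filter that keeps a flow's queue nearly empty keeps that flow looking sparse, which is the loop Dave describes; forcing it back into `old_flows` after a quantum's worth of removals would let more ACKs accumulate (and be filtered) between visits.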
>
> Thanks,
> Avakash Bhat
>
> On Thu, May 7, 2020 at 12:37 PM Sebastian Moeller <moeller0@gmx.de> wrote:
>>
>> I think that you will remove all redundant ACKs in one go, considerably advancing the new ACK in the queue. And more importantly, in most relevant modes cake will apply one queue per flow stochastically, so almost all packets in a reverse ACK flow will be ACKs with identical 5-tuples....
>>
>> On 7 May 2020 08:44:59 CEST, Avakash bhat <avakash261@gmail.com> wrote:
>>>
>>>
>>> Thanks for the quick response. I also had a followup question.
>>>
>>> If the ack filter adds the new ack to the tail of the queue after removing an ack from the queue, won't it be starving the ack?
>>> The replaced ack was much ahead in the queue than the ack we replaced at the tail right?
>>>
>>> Thanks,
>>> Avakash Bhat
>>
>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
2020-05-07 7:07 1% ` Sebastian Moeller
@ 2020-05-08 6:36 1% ` Avakash bhat
2020-05-08 6:50 1% ` Dave Taht
2020-05-08 8:23 0% ` Sebastian Moeller
1 sibling, 1 reply; 200+ results
From: Avakash bhat @ 2020-05-08 6:36 UTC (permalink / raw)
To: Sebastian Moeller
Cc: cake, Toke Høiland-Jørgensen, Vybhav Pai,
Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
[-- Attachment #1: Type: text/plain, Size: 962 bytes --]
Ok thanks so much for the clarifications.
That cleared it up quite a bit.
Thanks,
Avakash Bhat
On Thu, May 7, 2020 at 12:37 PM Sebastian Moeller <moeller0@gmx.de> wrote:
> I think that you will remove all redundant ACKs in one go, considerably
> advancing the new ACK in the queue. And more importantly, in most relevant
> modes cake will apply one queue per flow stochastically, so almost all
> packets in a reverse ACK flow will be ACKs with identical 5-tuples....
>
> On 7 May 2020 08:44:59 CEST, Avakash bhat <avakash261@gmail.com> wrote:
>>
>>
>> Thanks for the quick response. I also had a followup question.
>>
>> If the ack filter adds the new ack to the tail of the queue after
>> removing an ack from the queue, won't it be starving the ack?
>> The replaced ack was much ahead in the queue than the ack we replaced at
>> the tail right?
>>
>> Thanks,
>> Avakash Bhat
>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
[-- Attachment #2: Type: text/html, Size: 1622 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Latency target curiosity
2020-05-07 8:09 0% ` Jonathan Morton
@ 2020-05-07 9:11 0% ` Kevin Darbyshire-Bryant
0 siblings, 0 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-05-07 9:11 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 2415 bytes --]
> On 7 May 2020, at 09:09, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 7 May, 2020, at 10:58 am, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>>
>> A curiosity has arisen: I use diffserv4 mode on a 20Mbit egress link. Bulk tin has ‘capacity’ threshold of 1.2Mbit and because it’s a slow ’tin', the default target & interval values get overridden to 14.6ms and 109.6ms respectively. The 3 other tins are 5ms & 100ms defaults.
>>
>> I have a backup job that bulk uploads 5 simultaneous flows to Onedrive. The sparse_delay, average_delay & peak_delay figures settle on 32, 38 & 43 ms respectively with around 9 drops per second on that tin.
>>
>> I’m curious as to why the reported delays are over double the target latency?
>
> It's likely that there's a minimum cwnd in your sender's TCP stack, which may be as large as 4 segments. In Linux it is 2 segments. No matter how much congestion signalling is asserted, the volume of data in flight (including retransmissions of dropped packets) will always correspond to at least that minimum per flow. If the path is short, most of that volume will exist in queues instead of on the wire.
This is a Qnap NAS box running (a form of) Linux but to say I don’t trust the IP stack as being vanilla is an understatement. I’ve seen it do some really odd things when enabling ECN on egress…I don’t know if that’s a qnap or onedrive issue.
>
> Fortunately, backups are unlikely to suffer from a small amount of extra latency, and Cake will isolate their influence from other flows that may be more sensitive.
Absolutely! That’s why the traffic is in the Bulk tin, I don’t care about its ‘interactivity’, I just want the data transferred eventually :-) Cake does really well, I’ve had these backups running for days, maxing out the line but I simply couldn’t tell. I can also reasonably reliably classify facetime calls, so I put those in ‘Video’, so just for fun, whilst the other half was engaged on a 2 hour(!) facetime call with family I started backups, ran speed tests etc to try to disturb that call. Cake just kept that video call running smoothly all the time, brilliant! I know that’s more of a tin isolation test rather than a flow isolation test but it’s still great!
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] Latency target curiosity
@ 2020-05-07 8:09 0% ` Jonathan Morton
2020-05-07 9:11 0% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-05-07 8:09 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
> On 7 May, 2020, at 10:58 am, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> A curiosity has arisen: I use diffserv4 mode on a 20Mbit egress link. Bulk tin has ‘capacity’ threshold of 1.2Mbit and because it’s a slow ’tin', the default target & interval values get overridden to 14.6ms and 109.6ms respectively. The 3 other tins are 5ms & 100ms defaults.
>
> I have a backup job that bulk uploads 5 simultaneous flows to Onedrive. The sparse_delay, average_delay & peak_delay figures settle on 32, 38 & 43 ms respectively with around 9 drops per second on that tin.
>
> I’m curious as to why the reported delays are over double the target latency?
It's likely that there's a minimum cwnd in your sender's TCP stack, which may be as large as 4 segments. In Linux it is 2 segments. No matter how much congestion signalling is asserted, the volume of data in flight (including retransmissions of dropped packets) will always correspond to at least that minimum per flow. If the path is short, most of that volume will exist in queues instead of on the wire.
Fortunately, backups are unlikely to suffer from a small amount of extra latency, and Cake will isolate their influence from other flows that may be more sensitive.
- Jonathan Morton
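A back-of-envelope check of the minimum-cwnd arithmetic (assuming 1514-byte segments and the Linux minimum cwnd of 2 segments; these figures are illustrative, not from the thread). The bulk tin's true drain rate sits somewhere between its 1.2 Mbit threshold and the full 20 Mbit link depending on competing traffic, so the reported 32-43 ms delays fall between the two bounds computed here:

```python
def min_standing_delay_ms(flows, min_cwnd_segments, mss_bytes, rate_bps):
    """Queue delay if the whole minimum-cwnd volume sits in the queue."""
    inflight_bits = flows * min_cwnd_segments * mss_bytes * 8
    return 1000.0 * inflight_bits / rate_bps

# 5 backup flows, Linux minimum cwnd of 2, full-size segments:
print(round(min_standing_delay_ms(5, 2, 1514, 1_200_000), 1))   # 100.9 ms at the tin threshold
print(round(min_standing_delay_ms(5, 2, 1514, 20_000_000), 1))  # 6.1 ms at full link rate
```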
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-07 6:59 0% ` Jonathan Morton
@ 2020-05-07 7:07 1% ` Sebastian Moeller
2020-05-08 6:36 1% ` Avakash bhat
2020-05-08 8:23 0% ` Sebastian Moeller
1 sibling, 2 replies; 200+ results
From: Sebastian Moeller @ 2020-05-07 7:07 UTC (permalink / raw)
To: cake, Avakash bhat, Toke Høiland-Jørgensen
Cc: Vybhav Pai, Shrinidhi Varna, Mohit P. Tahiliani, Deepak K
[-- Attachment #1: Type: text/plain, Size: 764 bytes --]
I think that you will remove all redundant ACKs in one go, considerably advancing the new ACK in the queue. And more importantly, in most relevant modes cake will apply one queue per flow stochastically, so almost all packets in a reverse ACK flow will be ACKs with identical 5-tuples....
On 7 May 2020 08:44:59 CEST, Avakash bhat <avakash261@gmail.com> wrote:
>Thanks for the quick response. I also had a followup question.
>
>If the ack filter adds the new ack to the tail of the queue after
>removing
>an ack from the queue, won't it be starving the ack?
>The replaced ack was much ahead in the queue than the ack we replaced
>at
>the tail right?
>
>Thanks,
>Avakash Bhat
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
[-- Attachment #2: Type: text/html, Size: 1074 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Query on ACK
@ 2020-05-07 6:59 0% ` Jonathan Morton
2020-05-07 7:07 1% ` Sebastian Moeller
1 sibling, 0 replies; 200+ results
From: Jonathan Morton @ 2020-05-07 6:59 UTC (permalink / raw)
To: Avakash bhat
Cc: Toke Høiland-Jørgensen, cake, Mohit P. Tahiliani,
Vybhav Pai, Deepak K, Shrinidhi Varna
> On 7 May, 2020, at 9:44 am, Avakash bhat <avakash261@gmail.com> wrote:
>
> Thanks for the quick response. I also had a followup question.
>
> If the ack filter adds the new ack to the tail of the queue after removing an ack from the queue, won't it be starving the ack?
> The replaced ack was much ahead in the queue than the ack we replaced at the tail right?
No, if you are doing this on enqueue, then you are comparing the new ack with an ack immediately preceding it in the same queue, which will also be at the tail. And if you are doing it on dequeue then both packets were enqueued some time ago, and both are already due for delivery very soon.
In general, the second packet is delivered sooner, in place of the first one that was removed. This means it reduces feedback latency to the (forward path) sender.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
2020-05-06 19:01 0% ` Jonathan Morton
@ 2020-05-06 19:13 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-05-06 19:13 UTC (permalink / raw)
To: Jonathan Morton, Avakash bhat
Cc: cake, Mohit P. Tahiliani, Vybhav Pai, Deepak K, Shrinidhi Varna
Jonathan Morton <chromatix99@gmail.com> writes:
>> On 6 May, 2020, at 9:43 pm, Avakash bhat <avakash261@gmail.com> wrote:
>>
>> We are trying to implement the ACK filtering module of CAKE in ns-3 (Network Simulator).
>
> Ah yes. Sorry I didn't respond to the introduction earlier - we were right in the middle of preparing for an IETF virtual meeting. The debris is still falling from orbit…
>
>> We had a question on the working of ack filtering.
>> If an incoming ack which can replace an eligible ack in the queue is about to be enqueued, do we replace the ack in the queue with the incoming ack
>> or do we enqueue the ack to the tail of the queue and remove the eligible ack from the queue?
>
> That sounds like an implementation detail. But what we do in Cake is
> to simply enqueue all the packets, and deal with everything
> complicated on dequeue.
The ACK filter is run on enqueue, actually :)
> At that point, we check whether the two packets at the head of the
> queue are acks for the same flow, and if so, we further check whether
> the information in the first packet is redundant given the presence of
> the second packet. If there is information in the first packet that is
> not also provided by the second packet, the first packet is delivered.
> Otherwise the first packet is dropped, and the second packet moves to
> the head of the queue. This process may repeat several times if there
> are several consecutive, redundant acks in the queue.
>
> The important part is the set of rules determining whether the ack is
> redundant.
Yes, indeed. Please feel free to go through cake_ack_filter() in
sch_cake.c and make sure you get all those edge cases in your
eligibility check...
-Toke
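A heavily simplified sketch of that enqueue-time check; the field set and redundancy rules below are invented for illustration, while the real cake_ack_filter() in sch_cake.c compares TCP options, SACK blocks, flags, and window scaling in far more detail:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Ack:
    """A pure TCP ACK reduced to the fields this sketch compares."""
    ack_seq: int            # cumulative acknowledgement number
    window: int             # advertised receive window
    has_sack: bool = False  # SACK blocks present (treated as never redundant)

def is_redundant(older: "Ack", newer: "Ack") -> bool:
    """True if `older` carries no information absent from `newer`.
    A crude subset of the real eligibility checks."""
    return (not older.has_sack
            and older.ack_seq <= newer.ack_seq
            and older.window == newer.window)

def enqueue_ack(queue: deque, ack: Ack) -> None:
    """Enqueue-time filtering: a redundant ACK at the tail of the same
    flow's queue is replaced by the newer one, so the freshest feedback
    takes the older packet's place instead of queuing behind it."""
    if queue and is_redundant(queue[-1], ack):
        queue.pop()
    queue.append(ack)
```

Note how this also answers the earlier starvation question: the surviving ACK inherits the dropped one's position at the tail, and delivery of information is never delayed, only deduplicated.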
^ permalink raw reply [relevance 0%]
* Re: [Cake] Query on ACK
@ 2020-05-06 19:01 0% ` Jonathan Morton
2020-05-06 19:13 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-05-06 19:01 UTC (permalink / raw)
To: Avakash bhat
Cc: cake, Mohit P. Tahiliani, Shrinidhi Varna, Deepak K, Vybhav Pai
> On 6 May, 2020, at 9:43 pm, Avakash bhat <avakash261@gmail.com> wrote:
>
> We are trying to implement the ACK filtering module of CAKE in ns-3 (Network Simulator).
Ah yes. Sorry I didn't respond to the introduction earlier - we were right in the middle of preparing for an IETF virtual meeting. The debris is still falling from orbit…
> We had a question on the working of ack filtering.
> If an incoming ack which can replace an eligible ack in the queue is about to be enqueued, do we replace the ack in the queue with the incoming ack
> or do we enqueue the ack to the tail of the queue and remove the eligible ack from the queue?
That sounds like an implementation detail. But what we do in Cake is to simply enqueue all the packets, and deal with everything complicated on dequeue.
At that point, we check whether the two packets at the head of the queue are acks for the same flow, and if so, we further check whether the information in the first packet is redundant given the presence of the second packet. If there is information in the first packet that is not also provided by the second packet, the first packet is delivered. Otherwise the first packet is dropped, and the second packet moves to the head of the queue. This process may repeat several times if there are several consecutive, redundant acks in the queue.
The important part is the set of rules determining whether the ack is redundant.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Slightly OT Re: [Make-wifi-fast] [Bloat] dslreports is no longer free
@ 2020-05-06 15:51 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-05-06 15:51 UTC (permalink / raw)
To: David P. Reed
Cc: Sebastian Moeller, Sergey Fedorov, Make-Wifi-fast,
Jannie Hanekom, Cake List, bloat
On Wed, May 6, 2020 at 8:39 AM David P. Reed <dpreed@deepplum.com> wrote:
>
> While the jury is still out for me on the "best" speed test to recommend to my friends, family, and even enemies, I think the progression has been good.
>
>
>
> Originally, I used to recommend the web-embedded Java test called Netalyzer from ICSI. That did extensive tests, and included tests that are important to me like detecting DNS spoofing, various middlebox mucking with packets, ... as well as measuring lag under load in a simple way. But then I had to teach each person I recommended it to what everything meant. That was a BIG burden on me.
>
>
>
> Then I switched to dslreports.com, because of several factors - it highlighted lag under load as a bufferbloat grade that made sense.
>
>
>
> Now, I have to say that fast.com is likely to become my new recommendation. However, I have two issues with it. The biggest one is that lag-under-load is obscured in the interface, as is the asymmetry of upload vs. download.
>
>
>
> The problem for me is that I usually get asked to recommend a test under circumstances where someone isn't looking for "bragging rights" but is experiencing a problem of disrupted service quality. The NUMBER ONE problem they usually have is the lag-under-load problem in some form. But all they know is what "download speed" they bought.
>
>
>
> Many, many people are using videoconferencing now, not just web and TV watching. And that is hypersensitive to lag-under-load (also on WiFi due to airtime scheduling).
>
> And no one seems to be aware that their quality of experience is not about speed, but about instability of lag-under-load. So it's a new idea.
>
>
>
> Yeah, I do once in a while want to know if my service is delivering the top speed advertised - just as I once in a while measure the time of my car in the quarter mile on dragstrip :-)
>
>
>
> But mostly I want to know what's making my *applications* so slow. And it's almost never the case that they need a nitro-burning funny car level of speed. Instead, they need either: elimination of lag under load, or eliminating all the crap running in tabs on the browser (like animated JavaScript attention-seeking ads filling memory with garbage and causing the JS garbage collector to run constantly).
An analogy I'd been making to myself and was hoping to use somewhere
relevant to the l4s debate, was
james dean poising his dragster at the top of winding dirt mountain
road, with his best girl, Isadora Duncan,
scarf flowing out, by his side.
what could go wrong?
>
>
>
> So I would change fast.com, if I could, to emphasize the *problems* (as netalyzer did) and not the speed.
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-04 17:04 1% ` Sergey Fedorov
2020-05-05 21:02 1% ` David P. Reed
@ 2020-05-06 8:19 0% ` Sebastian Moeller
1 sibling, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-05-06 8:19 UTC (permalink / raw)
To: Sergey Fedorov
Cc: David P. Reed, Dave Täht, Make-Wifi-fast, Jannie Hanekom,
Cake List, bloat
Hi Sergey,
> On May 4, 2020, at 19:04, Sergey Fedorov <sfedorov@netflix.com> wrote:
>
> Sergey - I wasn't assuming anything about fast.com. The document you shared wasn't clear about the methodology's details here. Others, sadly, have actually used ICMP pings in the way I described. I was making a generic comment of concern.
>
> That said, it sounds like what you are doing is really helpful (esp. given that your measure is aimed at end user experiential qualities).
> David - my apologies, I incorrectly interpreted your statement as being said in context of fast.com measurements. The blog post linked indeed doesn't provide the latency measurement details - was written before we added the extra metrics. We'll see if we can publish an update.
>
> 1) a clear definition of lag under load that is from end-to-end in latency, and involves, ideally, independent traffic from multiple sources through the bottleneck.
> Curious if by multiple sources you mean multiple clients (devices) or multiple connections sending data?
Not trying to speak for David obviously, but the dslreports speedtest, when using multiple streams, mostly recruited streams from different server locations and reported those locations in some of the detailed report sections. For normal use that level of detail is overkill, but for problematic cases it was really illuminating (it reported the retransmit count for up to 5 server sites):
Server                                  Nett Speed   Avg RTT / Jitter   Avg Re-xmit   Avg Cwnd
Singapore (softlayer) d1                7.3 Mb/s     200.5±7ms          0.1%          154
Houston, USA (softlayer) d3             3.07 Mb/s    157.6±3.6ms        0.4%          125
Dallas, USA (softlayer) d3              2.65 Mb/s    150.1±3.3ms        0.6%          131
San Jose, USA (softlayer) d3            2.77 Mb/s    185.6±5ms          0.5%          126
Nashville, TN, USA (Twinlakes coop) d3  2.34 Mb/s    127.6±4ms          0.6%          76
Run Log:
0.00s setting download file size to 40mb max for Safari
0.00s Start testing DSL
00.43s Servers available: 10
00.46s pinging 10 locations
01.66s geo location failed
05.47s 19ms Amsterdam, Netherlands, EU
05.47s 63ms Nashville, TN, USA
05.47s 72ms Dallas, USA
05.47s 75ms Houston, USA
05.47s 89ms San Jose, USA
05.47s 96ms Singapore
05.47s could not reach Silver Spring, MD, USA https://t70.dslreports.com
05.47s could not reach Newcastle, Delaware, USA https://t68.dslreports.com
05.47s could not reach Westland, Michigan, USA https://t67.dslreports.com
05.47s could not reach Beaverton, Oregon, USA https://t69.dslreports.com
05.48s 5 seconds measuring idle buffer bloat
10.96s Trial download normal
10.99s Using GET for upload testing
10.99s preference https set to 1
10.99s preference fixrids set to 1
10.99s preference streamsDown set to 16
10.99s preference dnmethod set to websocket
10.99s preference upmethod set to websocket
10.99s preference upduration set to 30
10.99s preference streamsUp set to 16
10.99s preference dnduration set to 30
10.99s preference bloathf set to 1
10.99s preference rids set to [object Object]
10.99s preference compress set to 1
19.11s stream0 4.71 megabit Amsterdam, Netherlands, EU
19.11s stream1 2.74 megabit Dallas, USA
19.11s stream2 4.68 megabit Singapore
19.11s stream3 2.23 megabit Dallas, USA
19.11s stream4 3.31 megabit Houston, USA
19.11s stream5 3.19 megabit Houston, USA
19.11s stream6 2.83 megabit Amsterdam, Netherlands, EU
19.11s stream7 1.13 megabit Dallas, USA
19.11s stream8 2.15 megabit Amsterdam, Netherlands, EU
19.11s stream9 2.35 megabit San Jose, USA
19.11s stream10 1.46 megabit Nashville, TN, USA
19.11s stream11 1.42 megabit Nashville, TN, USA
19.11s stream12 2.92 megabit Nashville, TN, USA
19.11s stream13 2.19 megabit Houston, USA
19.11s stream14 2.16 megabit San Jose, USA
19.11s stream15 1.2 megabit San Jose, USA
41.26s End of download testing. Starting upload in 2 seconds
43.27s Capping upload streams to 6 because of download result
43.27s starting websocket upload with 16 streams
43.27s minimum upload speed of 0.3 per stream
43.48s sent first packet to t56.dslreports.com
44.08s sent first packet to t59.dslreports.com
44.48s sent first packet to t59.dslreports.com
44.48s sent first packet to t57.dslreports.com
44.68s sent first packet to t56.dslreports.com
44.78s sent first packet to t58.dslreports.com
44.79s got first reply from t56.dslreports.com 221580
44.98s sent first packet to t58.dslreports.com
45.08s sent first packet to t56.dslreports.com
45.14s got first reply from t59.dslreports.com 221580
45.28s sent first packet to t59.dslreports.com
45.53s got first reply from t59.dslreports.com 155106
45.55s got first reply from t57.dslreports.com 70167
45.78s got first reply from t58.dslreports.com 210501
45.85s got first reply from t56.dslreports.com 162492
45.88s sent first packet to t60.dslreports.com
45.88s sent first packet to t71.dslreports.com
46.00s got first reply from t58.dslreports.com 44316
46.08s sent first packet to t71.dslreports.com
46.26s got first reply from t56.dslreports.com 177264
46.28s sent first packet to t71.dslreports.com
46.41s got first reply from t59.dslreports.com 221580
46.58s sent first packet to t58.dslreports.com
46.88s sent first packet to t60.dslreports.com
46.89s got first reply from t60.dslreports.com 99711
47.08s sent first packet to t60.dslreports.com
47.61s got first reply from t58.dslreports.com 221580
47.93s got first reply from t60.dslreports.com 158799
48.09s got first reply from t60.dslreports.com 107097
62.87s Recording upload 21.45
62.87s Timer drops: frames=0 total ms=0
62.87s END TEST
64.88s Total megabytes consumed: 198.8 (down:155 up:43.8)
Not sure how trustworthy these numbers were, but high retransmit counts correlated with relatively low measured goodput...
I realize that this level of detail is explicitly out of scope for fast.com, but if you collect similar data, exposing it for interested parties following a chain of links would be swell. I am thinking along the lines of Douglas Adams' "It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard'" here ;)
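For what it's worth, Sebastian's eyeballed correlation holds up if one runs the five table rows through a quick Pearson computation (values transcribed from the table above; no new measurements involved):

```python
import math

# (avg re-xmit %, nett speed Mb/s) per server, from the dslreports table above
rows = [(0.1, 7.3), (0.4, 3.07), (0.6, 2.65), (0.5, 2.77), (0.6, 2.34)]

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)

r = pearson(rows)
print(f"retransmit rate vs goodput: r = {r:.2f}")  # strongly negative
```

Five points is far too few for statistical weight, of course; this just shows the direction of the relationship the report surfaced.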
Best Regards
Sebastian
>
> SERGEY FEDOROV
> Director of Engineering
> sfedorov@netflix.com
> 121 Albright Way | Los Gatos, CA 95032
>
>
>
>
> On Sun, May 3, 2020 at 8:07 AM David P. Reed <dpreed@deepplum.com> wrote:
> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off the entry device that has the external IP address for the NAT gets most of the RTT measure, and if there's no queueing built up in the NAT device, that's a reasonable measure. But...
>
> However, if the router has "taken up the queueing delay" by rate limiting its uplink traffic to slightly less than the capacity (as with Cake and other TC shaping that isn't as good as cake), then there is a queue in the TC layer itself. This is what concerns me as a distortion in the measurement that can fool one into thinking the TC shaper is doing a good job, when in fact, lag under load may be quite high from inside the routed domain (the home).
>
> As you point out this unmeasured queueing delay can also be a problem with WiFi inside the home. But it isn't limited to that.
>
> A badly set up shaping/congestion management subsystem inside the NAT can look "very good" in its echo of ICMP packets, but be terrible in response time to trivial HTTP requests from inside, or equally terrible in twitch games and video conferencing.
>
> So, for example, for tuning settings with "Cake" it is useless.
>
> To be fair, usually the Access Provider has no control of what is done after the cable is terminated at the home, so as a way to decide if the provider is badly engineering its side, a ping from a server is a reasonable quality measure of the provider.
>
> But not a good measure of the user experience, and if the provider provides the NAT box, even if it has a good shaper in it, like Cake or fq_codel, it will just confuse the user and create the opportunity for a "finger pointing" argument where neither side understands what is going on.
>
> This is why we need
>
> 1) a clear definition of lag under load that is from end-to-end in latency, and involves, ideally, independent traffic from multiple sources through the bottleneck.
>
> 2) ideally, a better way to localize where the queues are building up and present that to users and access providers. The flent graphs are not interpretable by most non-experts. What we need is a simple visualization of a sketch-map of the path (like traceroute might provide) with queueing delay measures shown at key points that the user can understand.
> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de> said:
>
>> Hi David,
>>
>> in principle I agree, a NATed IPv4 ICMP probe will be at best reflected at the NAT
>> router (CPE) (some commercial home gateways do not respond to ICMP echo requests
>> in the name of security theatre). So it is pretty hard to measure the full end to
>> end path in that configuration. I believe that IPv6 should make that
>> easier/simpler in that NAT hopefully will be out of the path (but let's see what
>> ingenuity ISPs will come up with).
>> Then again, traditionally the relevant bottlenecks often are a) the internet
>> access link itself and there the CPE is in a reasonable position as a reflector on
>> the other side of the bottleneck as seen from an internet server, b) the home
>> network between CPE and end-host, often with variable rate wifi, here I agree
>> reflecting echos at the CPE hides part of the issue.
>>
>>
>>
>>> On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
>>>
>>> I am still a bit worried about properly defining "latency under load" for a
>> NAT routed situation. If the test is based on ICMP Ping packets *from the server*,
>> it will NOT be measuring the full path latency, and if the potential congestion
>> is in the uplink path from the access provider's residential box to the access
>> provider's router/switch, it will NOT measure congestion caused by bufferbloat
>> reliably on either side, since the bufferbloat will be outside the ICMP Ping
>> path.
>>
>> Puzzled, as i believe it is going to be the residential box that will respond
>> here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo requests?
>>
>>>
>>> I realize that a browser based speed test has to be basically run from the
>> "server" end, because browsers are not that good at time measurement on a packet
>> basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a
>> cooperative server.
>>>
>>> I once built a test that fixed this issue reasonably well. It carefully
>> created a TCP based RTT measurement channel (over HTTP) that made the echo have to
>> traverse the whole end-to-end path, which is the best and only way to accurately
>> define lag under load from the user's perspective. The client end of an unloaded
>> TCP connection can depend on TCP (properly prepared by getting it past slowstart)
>> to generate a single packet response.
>>>
>>> This "TCP ping" is thus compatible with getting the end-to-end measurement on
>> the server end of a true RTT.
>>>
>>> It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes
>> into thinking this is a real, serious packet, not an optional low priority
>> packet.
>>>
>>> The same issue comes up with non-browser-based techniques for measuring true
>> lag-under-load.
>>>
>>> Now as we move HTTP to QUIC, this actually gets easier to do.
>>>
>>> One other opportunity I haven't explored, but which is pregnant with
>> potential is the use of WebRTC, which runs over UDP internally. Since JavaScript
>> has direct access to create WebRTC connections (multiple ones), this makes
>> detailed testing in the browser quite reasonable.
>>>
>>> And the time measurements can resolve well below 100 microseconds, if the JS
>> is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine
>> code speed if the code is restricted and in a loop). Then again, there is Web
>> Assembly if you want to write C code that runs in the browser fast. WebAssembly is
>> a low level language that compiles to machine code in the browser execution, and
>> still has access to all the browser networking facilities.
>>
>> Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to spectre
>> side-channel vulnerabilities many browsers seemed to have lowered the timer
>> resolution, but even the ~1ms resolution should be fine for typical RTTs.
>>
>> Best Regards
>> Sebastian
>>
>> P.S.: I assume that I simply do not see/understand the full scope of the issue at
>> hand yet.
>>
>>
>>>
>>> On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
>> said:
>>>
>>>> On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
>> wrote:
>>>>>
>>>>>> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> as ~7ms
>>>>
>>>> I guess one of my questions is that with a switch to BBR netflix is
>>>> going to do pretty well. If fast.com is using bbr, well... that
>>>> excludes much of the current side of the internet.
>>>>
>>>>> For download, I show 6ms unloaded and 6-7 loaded. But for upload
>> the loaded
>>>> shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
>> any
>>>> traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
>> bloat would
>>>> be nice.
>>>>
>>>> The tests do need to last a fairly long time.
>>>>
>>>>> On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
>> <jannie@hanekom.net>
>>>> wrote:
>>>>>>
>>>>>> Michael Richardson <mcr@sandelman.ca>:
>>>>>>> Does it find/use my nearest Netflix cache?
>>>>>>
>>>>>> Thankfully, it appears so. The DSLReports bloat test was
>> interesting,
>>>> but
>>>>>> the jitter on the ~240ms base latency from South Africa (and
>> other parts
>>>> of
>>>>>> the world) was significant enough that the figures returned
>> were often
>>>>>> unreliable and largely unusable - at least in my experience.
>>>>>>
>>>>>> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> as ~7ms
>>>> and
>>>>>> mentions servers located in local cities. I finally have a test
>> I can
>>>> share
>>>>>> with local non-technical people!
>>>>>>
>>>>>> (Agreed, upload test would be nice, but this is a huge step
>> forward from
>>>>>> what I had access to before.)
>>>>>>
>>>>>> Jannie Hanekom
>>>>>>
>>>>>> _______________________________________________
>>>>>> Cake mailing list
>>>>>> Cake@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>>
>>>>> _______________________________________________
>>>>> Cake mailing list
>>>>> Cake@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>
>>>>
>>>>
>>>> --
>>>> Make Music, Not War
>>>>
>>>> Dave Täht
>>>> CTO, TekLibre, LLC
>>>> http://www.teklibre.com
>>>> Tel: 1-831-435-0729
>>>> _______________________________________________
>>>> Cake mailing list
>>>> Cake@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
>>
>>
>
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-03 15:06 1% [Cake] [Make-wifi-fast] " David P. Reed
2020-05-04 17:04 1% ` Sergey Fedorov
[not found] ` <mailman.253.1588611897.24343.make-wifi-fast@lists.bufferbloat.net>
@ 2020-05-06 8:08 0% ` Sebastian Moeller
2 siblings, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-06 8:08 UTC (permalink / raw)
To: David P. Reed
Cc: Dave Täht, Make-Wifi-fast, Jannie Hanekom, Cake List,
Sergey Fedorov, bloat, Toke Høiland-Jørgensen
Dear David,
Thanks for the elaboration below, and indeed I was not appreciating the full scope of the challenge.
> On May 3, 2020, at 17:06, David P. Reed <dpreed@deepplum.com> wrote:
>
> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off the entry device that has the external IP address for the NAT gets most of the RTT measure, and if there's no queueing built up in the NAT device, that's a reasonable measure. But...
Yes, I see; I really hope that with IPv6 coming online more and more, and hence less NAT, end-to-end RTT measurements will be simpler in the future. But cue the people who will, for example, recommend dropping/ignoring ICMP in the name of security theater... It's the same mindset that recommends ignoring ICMP and/or IP timestamps because of "information leakage", while all the information that leaks from a standards-conformant host is the time since midnight UTC (and potentially an idea of how the local clock is set). I fail to understand the threat model behind eschewing this... For our purposes, one-way timestamps would be most excellent to have, to be able to assess on which "leg" the overload actually happens.
>
> However, if the router has "taken up the queueing delay" by rate limiting its uplink traffic to slightly less than the capacity (as with Cake and other TC shaping that isn't as good as cake), then there is a queue in the TC layer itself. This is what concerns me as a distortion in the measurement that can fool one into thinking the TC shaper is doing a good job, when in fact, lag under load may be quite high from inside the routed domain (the home).
As long as the shaper is instantiated on the NAT box, the latency probes reflected by that NAT box will also travel through the shaper; but now that you mention it, in SQM we do ingress shaping via an IFB and hence will also shape the incoming latency probes. However, I started to recommend doing ingress shaping as egress shaping on the LAN-wards interface of a router (to avoid the computational cost of the IFB redirection dance, and to allow people to use iptables for ingress*), and in such a configuration router-reflected/emitted WAN probes will avoid the ingress TC queues...
*) With nftables having a hook at ingress, that second rationale will become moot in the near future...
>
> As you point out this unmeasured queueing delay can also be a problem with WiFi inside the home. But it isn't limited to that.
>
> A badly set up shaping/congestion management subsystem inside the NAT can look "very good" in its echo of ICMP packets, but be terrible in response time to trivial HTTP requests from inside, or equally terrible in twitch games and video conferencing.
Good point, and one of Dave's pet peeves: in former times people recommended to up-prioritize ICMP packets to make RTTs look good, falling exactly into the trap you described.
>
> So, for example, for tuning settings with "Cake" it is useless.
I believe that at least for the way we instantiate things by default in SQM-scripts we avoid that pitfall. What do you think @Toke?
>
> To be fair, usually the Access Provider has no control of what is done after the cable is terminated at the home, so as a way to decide if the provider is badly engineering its side, a ping from a server is a reasonable quality measure of the provider.
Most providers in Germany will try to steer customers to rent a wifi router from the ISP, so bloat in the wifi link would also be under the responsibility of the ISP to some degree, no?
>
> But not a good measure of the user experience, and if the provider provides the NAT box, even if it has a good shaper in it, like Cake or fq_codel, it will just confuse the user and create the opportunity for a "finger pointing" argument where neither side understands what is going on.
>
> This is why we need
>
> 1) a clear definition of lag under load that is from end-to-end in latency, and involves, ideally, independent traffic from multiple sources through the bottleneck.
I am all for it; in addition, in the past we also reasoned that this definition needs to be relatively simple so it can be easily explained, to turn naive laypersons into informed amateurs ;) The multiple-sources thing is something that dslreports did well: they typically tried to serve from multiple server sites and reported some stats per site. Now that it is basically gone, it becomes clear how much clue went into that speedtest; a pity that most of the competition has not followed their lead yet (I am especially looking at you, Ookla...).
>
> 2) ideally, a better way to localize where the queues are building up and present that to users and access providers.
Yes. Now, how to do this robustly and reliably escapes me, albeit enabling one-way timestamps might help; then a saturating speedtest could be accompanied not just by a conceptually "simple" ICMP echo request, but by a repeated traceroute that gets there-and-back delay measurements for the approximate path (approximate because of the complications of interpreting traceroute results).
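The kind of end-to-end "TCP ping" David describes earlier in the thread can be sketched in a few lines: measure small request/response RTTs over an open TCP connection while a parallel connection generates load. The local echo server below is a stand-in assumption that replaces the cooperative remote server just to make the sketch runnable; over loopback the numbers will not show real bufferbloat, so aim both connections across a real bottleneck to get meaningful idle-versus-loaded figures:

```python
import socket
import threading
import time

def echo_server(port, ready):
    """Tiny stand-in for a cooperative measurement server: echoes 1-byte
    probes, silently drains everything else (the bulk load)."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(4)
    ready.set()
    def serve(conn):
        try:
            while data := conn.recv(65536):
                if data == b"x":           # probe byte: echo it back
                    conn.sendall(data)     # bulk payload is just drained
        except OSError:
            pass
    while True:
        c, _ = srv.accept()
        threading.Thread(target=serve, args=(c,), daemon=True).start()

def tcp_ping(sock, n=5):
    """RTT (ms) of a 1-byte request/response over an open TCP connection."""
    rtts = []
    for _ in range(n):
        t0 = time.perf_counter()
        sock.sendall(b"x")
        sock.recv(1)
        rtts.append((time.perf_counter() - t0) * 1000)
    return rtts

PORT = 50917                               # arbitrary local port for the sketch
ready = threading.Event()
threading.Thread(target=echo_server, args=(PORT, ready), daemon=True).start()
ready.wait()

bulk = socket.create_connection(("127.0.0.1", PORT))
probe = socket.create_connection(("127.0.0.1", PORT))

idle = tcp_ping(probe)                     # baseline RTTs, no load

stop = threading.Event()
def blast():
    """Generate load: a bulk upload on the parallel connection."""
    payload = b"\0" * 65536
    try:
        while not stop.is_set():
            bulk.sendall(payload)
    except OSError:
        pass
threading.Thread(target=blast, daemon=True).start()

loaded = tcp_ping(probe)                   # RTTs while the bulk flow runs
stop.set()
print(f"idle median ~{sorted(idle)[len(idle)//2]:.2f} ms, "
      f"loaded median ~{sorted(loaded)[len(loaded)//2]:.2f} ms")
```

Because only one probe byte is ever in flight, each echo unambiguously matches its request, which is what makes the per-probe RTT well defined without any packet-level timestamping in the client.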
> The flent graphs are not interpretable by most non-experts.
And sometimes not even by experts ;)
> What we need is a simple visualization of a sketch-map of the path (like traceroute might provide) with queueing delay measures shown at key points that the user can understand.
I am on the fence; personally I would absolutely love that, but I am not sure how the rest of my family would receive something like that? I guess it depends on the simplicity of the representation and probably, following fast.com's lead, on a way to also compress the expanded results into a reasonable one-number representation. I hate one-number representations for complex issues, but people generally will come up with one themselves if none is supplied. (And I get this; outside our areas of expertise we all prefer the world to be simple.)
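One possible shape for such a one-number summary, in the spirit of the dslreports bufferbloat grade; the letter thresholds below are invented for illustration and are not the actual dslreports or fast.com scale:

```python
def bloat_grade(idle_rtt_ms: float, loaded_rtt_ms: float) -> str:
    """Collapse a latency-under-load result into a single letter grade.

    Grades the *added* latency, not the absolute RTT, so a user on a
    240 ms base path is judged by the same yardstick as one on 5 ms.
    Thresholds are illustrative assumptions only.
    """
    added = max(0.0, loaded_rtt_ms - idle_rtt_ms)
    for grade, limit in [("A", 5), ("B", 30), ("C", 60), ("D", 200)]:
        if added < limit:
            return grade
    return "F"

print(bloat_grade(20, 24))    # small added latency -> A
print(bloat_grade(20, 420))   # heavy bufferbloat   -> F
```

Grading the delta rather than the absolute number is the point: it keeps the score about the bottleneck's queueing, the thing the user can actually do something about, rather than about geography.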
Best Regards
Sebastian
> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de> said:
>
>> Hi David,
>>
>> in principle I agree, a NATed IPv4 ICMP probe will be at best reflected at the NAT
>> router (CPE) (some commercial home gateways do not respond to ICMP echo requests
>> in the name of security theatre). So it is pretty hard to measure the full end to
>> end path in that configuration. I believe that IPv6 should make that
>> easier/simpler in that NAT hopefully will be out of the path (but let's see what
>> ingenuity ISPs will come up with).
>> Then again, traditionally the relevant bottlenecks often are a) the internet
>> access link itself and there the CPE is in a reasonable position as a reflector on
>> the other side of the bottleneck as seen from an internet server, b) the home
>> network between CPE and end-host, often with variable rate wifi, here I agree
>> reflecting echos at the CPE hides part of the issue.
>>
>>
>>
>>> On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
>>>
>>> I am still a bit worried about properly defining "latency under load" for a
>> NAT routed situation. If the test is based on ICMP Ping packets *from the server*,
>> it will NOT be measuring the full path latency, and if the potential congestion
>> is in the uplink path from the access provider's residential box to the access
>> provider's router/switch, it will NOT measure congestion caused by bufferbloat
>> reliably on either side, since the bufferbloat will be outside the ICMP Ping
>> path.
>>
>> Puzzled, as i believe it is going to be the residential box that will respond
>> here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo requests?
>>
>>>
>>> I realize that a browser based speed test has to be basically run from the
>> "server" end, because browsers are not that good at time measurement on a packet
>> basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a
>> cooperative server.
>>>
>>> I once built a test that fixed this issue reasonably well. It carefully
>> created a TCP based RTT measurement channel (over HTTP) that made the echo have to
>> traverse the whole end-to-end path, which is the best and only way to accurately
>> define lag under load from the user's perspective. The client end of an unloaded
>> TCP connection can depend on TCP (properly prepared by getting it past slowstart)
>> to generate a single packet response.
>>>
>>> This "TCP ping" is thus compatible with getting the end-to-end measurement on
>> the server end of a true RTT.
>>>
>>> It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes
>> into thinking this is a real, serious packet, not an optional low priority
>> packet.
>>>
>>> The same issue comes up with non-browser-based techniques for measuring true
>> lag-under-load.
>>>
>>> Now as we move HTTP to QUIC, this actually gets easier to do.
>>>
>>> One other opportunity I haven't explored, but which is pregnant with
>> potential is the use of WebRTC, which runs over UDP internally. Since JavaScript
>> has direct access to create WebRTC connections (multiple ones), this makes
>> detailed testing in the browser quite reasonable.
>>>
>>> And the time measurements can resolve well below 100 microseconds, if the JS
>> is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine
>> code speed if the code is restricted and in a loop). Then again, there is Web
>> Assembly if you want to write C code that runs in the browser fast. WebAssembly is
>> a low level language that compiles to machine code in the browser execution, and
>> still has access to all the browser networking facilities.
>>
>> Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to spectre
>> side-channel vulnerabilities many browsers seemed to have lowered the timer
>> resolution, but even the ~1ms resolution should be fine for typical RTTs.
>>
>> Best Regards
>> Sebastian
>>
>> P.S.: I assume that I simply do not see/understand the full scope of the issue at
>> hand yet.
>>
>>
>>>
>>> On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
>> said:
>>>
>>>> On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
>> wrote:
>>>>>
>>>>>> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> as ~7ms
>>>>
>>>> I guess one of my questions is that with a switch to BBR netflix is
>>>> going to do pretty well. If fast.com is using bbr, well... that
>>>> excludes much of the current side of the internet.
>>>>
>>>>> For download, I show 6ms unloaded and 6-7 loaded. But for upload
>> the loaded
>>>> shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
>> any
>>>> traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
>> bloat would
>>>> be nice.
>>>>
>>>> The tests do need to last a fairly long time.
>>>>
>>>>> On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
>> <jannie@hanekom.net>
>>>> wrote:
>>>>>>
>>>>>> Michael Richardson <mcr@sandelman.ca>:
>>>>>>> Does it find/use my nearest Netflix cache?
>>>>>>
>>>>>> Thankfully, it appears so. The DSLReports bloat test was
>> interesting,
>>>> but
>>>>>> the jitter on the ~240ms base latency from South Africa (and
>> other parts
>>>> of
>>>>>> the world) was significant enough that the figures returned
>> were often
>>>>>> unreliable and largely unusable - at least in my experience.
>>>>>>
>>>>>> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> as ~7ms
>>>> and
>>>>>> mentions servers located in local cities. I finally have a test
>> I can
>>>> share
>>>>>> with local non-technical people!
>>>>>>
>>>>>> (Agreed, upload test would be nice, but this is a huge step
>> forward from
>>>>>> what I had access to before.)
>>>>>>
>>>>>> Jannie Hanekom
>>>>>>
>>>>>> _______________________________________________
>>>>>> Cake mailing list
>>>>>> Cake@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>>
>>>>> _______________________________________________
>>>>> Cake mailing list
>>>>> Cake@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>
>>>>
>>>>
>>>> --
>>>> Make Music, Not War
>>>>
>>>> Dave Täht
>>>> CTO, TekLibre, LLC
>>>> http://www.teklibre.com
>>>> Tel: 1-831-435-0729
>>>> _______________________________________________
>>>> Cake mailing list
>>>> Cake@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
>>
>>
>
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-04 17:04 1% ` Sergey Fedorov
@ 2020-05-05 21:02 1% ` David P. Reed
2020-05-06 8:19 0% ` Sebastian Moeller
1 sibling, 0 replies; 200+ results
From: David P. Reed @ 2020-05-05 21:02 UTC (permalink / raw)
To: Sergey Fedorov
Cc: Sebastian Moeller, Dave Täht, Michael Richardson,
Make-Wifi-fast, Jannie Hanekom, Cake List, bloat
[-- Attachment #1: Type: text/plain, Size: 12966 bytes --]
I think the real test should be multiple clients, not multiple sources, but coordinating is hard. The middleboxes on the way may treat distinct IP host addresses specially, and of course there is an edge case because a single NIC by definition never sends two datagrams at once, which distorts things as you look at edge performance issues.
The classic problem (Jim Gettys' "Daddy, why is the Internet broken?" when uploading a big file from Dad's computer affects the web performance of the kid in the kid's bedroom) is an example of a UX issue that *really matters*. At HP Cambridge Research Lab, I used to have the local network management come to my office and yell at me because I was often uploading huge datasets to other HP locations, and it absolutely destroyed every other person's web usability when I did. (As usual, RTT went to multiple seconds, not affecting my file uploads at all, but it was the first example of what was later called Bufferbloat that got me focused on the issue of overbuffering.) It turned out that the problem was a Frame Relay link configured with the "don't ever discard packets" setting.
That was ALSO the first time I encountered "network experts" who absolutely denied that more buffering was bad. They thought that more buffering was GOOD. It was shocking to realize that almost no one understood that congestion was about excess queueing delay.
I still see badly misconfigured networks that destroy the ability to do Zoom or any other teleconferencing when someone is uploading files. And for some weird, weird reason, the work done by the Bloat team is constantly disparaged at IETF, to the point that their work isn't influencing anyone outside the Linux-based-router community. (Including Arista Networks, where they build overbuffered high speed switches and claim that is "a feature", and Andy Bechtolsheim refuses to listen to me or anyone else about it).
On Monday, May 4, 2020 1:04pm, "Sergey Fedorov" <sfedorov@netflix.com> said:
> Sergey - I wasn't assuming anything about fast.com. The document you shared wasn't clear about the methodology's details here. Others, sadly, have actually used ICMP pings in the way I described. I was making a generic comment of concern.
> That said, it sounds like what you are doing is really helpful (esp. given that your measure is aimed at end user experiential qualities).
David - my apologies, I incorrectly interpreted your statement as being said in context of fast.com measurements. The blog post linked indeed doesn't provide the latency measurement details - was written before we added the extra metrics. We'll see if we can publish an update.
> 1) a clear definition of lag under load that is from end-to-end in latency, and involves, ideally, independent traffic from multiple sources through the bottleneck.
Curious if by multiple sources you mean multiple clients (devices) or multiple connections sending data?
SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032
On Sun, May 3, 2020 at 8:07 AM David P. Reed <dpreed@deepplum.com> wrote:
Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off the entry device that has the external IP address for the NAT gets most of the RTT measure, and if there's no queueing built up in the NAT device, that's a reasonable measure. But...
However, if the router has "taken up the queueing delay" by rate limiting its uplink traffic to slightly less than the capacity (as with Cake and other TC shaping that isn't as good as cake), then there is a queue in the TC layer itself. This is what concerns me as a distortion in the measurement that can fool one into thinking the TC shaper is doing a good job, when in fact, lag under load may be quite high from inside the routed domain (the home).
As you point out this unmeasured queueing delay can also be a problem with WiFi inside the home. But it isn't limited to that.
A badly set up shaping/congestion management subsystem inside the NAT can look "very good" in its echo of ICMP packets, but be terrible in response time to trivial HTTP requests from inside, or equally terrible in twitch games and video conferencing.
So, for example, for tuning settings with "Cake" it is useless.
To be fair, usually the Access Provider has no control of what is done after the cable is terminated at the home, so as a way to decide if the provider is badly engineering its side, a ping from a server is a reasonable quality measure of the provider.
But not a good measure of the user experience, and if the provider provides the NAT box, even if it has a good shaper in it, like Cake or fq_codel, it will just confuse the user and create the opportunity for a "finger pointing" argument where neither side understands what is going on.
This is why we need
1) a clear definition of lag under load that is from end-to-end in latency, and involves, ideally, independent traffic from multiple sources through the bottleneck.
2) ideally, a better way to localize where the queues are building up and present that to users and access providers. The flent graphs are not interpretable by most non-experts. What we need is a simple visualization of a sketch-map of the path (like traceroute might provide) with queueing delay measures shown at key points that the user can understand.
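The "TCP ping" described here can be sketched in a few lines of Python. This is only an illustration, assuming a cooperative one-byte echo service at the far end (stood in for below by a local thread); the names are mine, not those of the original tool:

```python
import socket
import threading
import time


def run_echo_server(host="127.0.0.1"):
    """A cooperative far end: echoes each byte back over the same TCP flow."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            while True:
                b = conn.recv(1)
                if not b:
                    break
                conn.sendall(b)

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()


def tcp_ping(addr, count=5):
    """Application-level RTTs over one established TCP connection.

    Because each probe byte rides a real TCP flow, it traverses the whole
    end-to-end path (NAT, shaper queues and all), unlike an ICMP echo that
    may be reflected at the CPE before the bottleneck queue.
    """
    s = socket.create_connection(addr)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # one packet per probe
    rtts = []
    for _ in range(count):
        t0 = time.perf_counter()
        s.sendall(b"x")
        s.recv(1)
        rtts.append(time.perf_counter() - t0)
    s.close()
    return rtts
```

Run tcp_ping() once while the link is idle and again during a bulk transfer; the difference between the two RTT distributions is lag under load as the user actually experiences it.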
On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de> said:
> Hi David,
>
> in principle I agree, a NATed IPv4 ICMP probe will be at best reflected at the NAT
> router (CPE) (some commercial home gateways do not respond to ICMP echo requests
> in the name of security theatre). So it is pretty hard to measure the full end to
> end path in that configuration. I believe that IPv6 should make that
> easier/simpler in that NAT hopefully will be out of the path (but let's see what
> ingenuity ISPs will come up with).
> Then again, traditionally the relevant bottlenecks often are a) the internet
> access link itself and there the CPE is in a reasonable position as a reflector on
> the other side of the bottleneck as seen from an internet server, b) the home
> network between CPE and end-host, often with variable rate wifi, here I agree
> reflecting echos at the CPE hides part of the issue.
>
>
>
> > On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > I am still a bit worried about properly defining "latency under load" for a
> NAT routed situation. If the test is based on ICMP Ping packets *from the server*,
> it will NOT be measuring the full path latency, and if the potential congestion
> is in the uplink path from the access provider's residential box to the access
> provider's router/switch, it will NOT measure congestion caused by bufferbloat
> reliably on either side, since the bufferbloat will be outside the ICMP Ping
> path.
>
> Puzzled, as I believe it is going to be the residential box that will respond
> here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo requests?
>
> >
> > I realize that a browser based speed test has to be basically run from the
> "server" end, because browsers are not that good at time measurement on a packet
> basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a
> cooperative server.
> >
> > I once built a test that fixed this issue reasonably well. It carefully
> created a TCP based RTT measurement channel (over HTTP) that made the echo have to
> traverse the whole end-to-end path, which is the best and only way to accurately
> define lag under load from the user's perspective. The client end of an unloaded
> TCP connection can depend on TCP (properly prepared by getting it past slowstart)
> to generate a single packet response.
> >
> > This "TCP ping" is thus compatible with getting the end-to-end measurement on
> the server end of a true RTT.
> >
> > It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes
> into thinking this is a real, serious packet, not an optional low priority
> packet.
> >
> > The same issue comes up with non-browser-based techniques for measuring true
> lag-under-load.
> >
> > Now as we move HTTP to QUIC, this actually gets easier to do.
> >
> > One other opportunity I haven't explored, but which is pregnant with
> potential is the use of WebRTC, which runs over UDP internally. Since JavaScript
> has direct access to create WebRTC connections (multiple ones), this makes
> detailed testing in the browser quite reasonable.
> >
> > And the time measurements can resolve well below 100 microseconds, if the JS
> is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine
> code speed if the code is restricted and in a loop). Then again, there is Web
> Assembly if you want to write C code that runs in the brower fast. WebAssembly is
> a low level language that compiles to machine code in the browser execution, and
> still has access to all the browser networking facilities.
>
> Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to spectre
> side-channel vulnerabilities many browsers seemed to have lowered the timer
> resolution, but even the ~1ms resolution should be fine for typical RTTs.
>
> Best Regards
> Sebastian
>
> P.S.: I assume that I simply do not see/understand the full scope of the issue at
> hand yet.
>
>
> >
> > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
> said:
> >
> > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
> wrote:
> > > >
> > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency
> as ~7ms
> > >
> > > I guess one of my questions is that with a switch to BBR netflix is
> > > going to do pretty well. If fast.com is using bbr, well... that
> > > excludes much of the current side of the internet.
> > >
> > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload
> the loaded
> > > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
> any
> > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
> bloat would
> > > be nice.
> > >
> > > The tests do need to last a fairly long time.
> > >
> > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
> <jannie@hanekom.net>
> > > wrote:
> > > >>
> > > >> Michael Richardson <mcr@sandelman.ca>:
> > > >> > Does it find/use my nearest Netflix cache?
> > > >>
> > > >> Thankfully, it appears so. The DSLReports bloat test was
> interesting,
> > > but
> > > >> the jitter on the ~240ms base latency from South Africa (and
> other parts
> > > of
> > > >> the world) was significant enough that the figures returned
> were often
> > > >> unreliable and largely unusable - at least in my experience.
> > > >>
> > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency
> as ~7ms
> > > and
> > > >> mentions servers located in local cities. I finally have a test
> I can
> > > share
> > > >> with local non-technical people!
> > > >>
> > > >> (Agreed, upload test would be nice, but this is a huge step
> forward from
> > > >> what I had access to before.)
> > > >>
> > > >> Jannie Hanekom
> > > >>
> > > >> _______________________________________________
> > > >> Cake mailing list
> > > >> Cake@lists.bufferbloat.net
> > > >> https://lists.bufferbloat.net/listinfo/cake
> > > >
> > > > _______________________________________________
> > > > Cake mailing list
> > > > Cake@lists.bufferbloat.net
> > > > https://lists.bufferbloat.net/listinfo/cake
> > >
> > >
> > >
> > > --
> > > Make Music, Not War
> > >
> > > Dave Täht
> > > CTO, TekLibre, LLC
> > > http://www.teklibre.com
> > > Tel: 1-831-435-0729
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> > >
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
>
[-- Attachment #2: Type: text/html, Size: 20682 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Bloat] [Make-wifi-fast] dslreports is no longer free
[not found] ` <mailman.256.1588636996.24343.bloat@lists.bufferbloat.net>
@ 2020-05-05 0:10 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-05-05 0:10 UTC (permalink / raw)
To: Bob McMahon
Cc: Sergey Fedorov, Make-Wifi-fast, bloat, David P. Reed, Cake List,
Jannie Hanekom
[-- Attachment #1: Type: text/plain, Size: 15914 bytes --]
On Mon, May 4, 2020 at 5:03 PM Bob McMahon via Bloat <
bloat@lists.bufferbloat.net> wrote:
>
>
>
> ---------- Forwarded message ----------
> From: Bob McMahon <bob.mcmahon@broadcom.com>
> To: Sergey Fedorov <sfedorov@netflix.com>
> Cc: "David P. Reed" <dpreed@deepplum.com>, Michael Richardson <
> mcr@sandelman.ca>, Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>,
> bloat <bloat@lists.bufferbloat.net>, Cake List <cake@lists.bufferbloat.net>,
> Jannie Hanekom <jannie@hanekom.net>
> Bcc:
> Date: Mon, 4 May 2020 17:03:02 -0700
> Subject: Re: [Make-wifi-fast] [Cake] [Bloat] dslreports is no longer free
> Sorry for being a bit off topic but we find average latency not all that
> useful. A full CDF is. The next best is a box plot with outliers which
> can be presented parametrically as a few numbers. Most customers want
> visibility into the PDF tail.
>
yea!
Try never to discard the outliers anywhere in the core tests. I always
point to this as a place where, if you stop thinking
the noise is noise, caused by bird droppings in your receiver, you find
structure where you thought it never existed before.
https://theconversation.com/the-cmb-how-an-accidental-discovery-became-the-key-to-understanding-the-universe-45126
A lot of times, just plotting the patterns of the outliers can be
interesting. It's often a lot of bird droppings to sort through!
> Also, we're moving to socket write() to read() latencies for our end/end
> measurements (using the iperf 2.0.14 --trip-times option assumes
> synchronized clocks.). We also now measure TCP connects (3WHS) as well.
>
One thing that may or may not help is the sock_sent_lowat option.
I note that with SSL so common, it helps to be using that, rather than
straight tcp, so it's closer to a 5WHS
Yes, generating the crypto exchange costs time, but with that as a baseline
with the extra round trips...
> Finally, since we have trip times and the application write rates we can
> compute the amount of "end/end bytes in queue" per Little's law.
>
I will reserve comment on littles law for a bit.
> For fault isolation, in-band network telemetry (or something similar) can
> be useful. https://p4.org/assets/INT-current-spec.pdf
>
> Bob
>
> On Mon, May 4, 2020 at 10:05 AM Sergey Fedorov via Make-wifi-fast <
> make-wifi-fast@lists.bufferbloat.net> wrote:
>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergey Fedorov <sfedorov@netflix.com>
>> To: "David P. Reed" <dpreed@deepplum.com>
>> Cc: Sebastian Moeller <moeller0@gmx.de>, "Dave Täht" <dave.taht@gmail.com>,
>> Michael Richardson <mcr@sandelman.ca>, Make-Wifi-fast <
>> make-wifi-fast@lists.bufferbloat.net>, Jannie Hanekom <jannie@hanekom.net>,
>> Cake List <cake@lists.bufferbloat.net>, bloat <
>> bloat@lists.bufferbloat.net>
>> Bcc:
>> Date: Mon, 4 May 2020 10:04:19 -0700
>> Subject: Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
>>
>>> Sergey - I wasn't assuming anything about fast.com. The document you
>>> shared wasn't clear about the methodology's details here. Others sadly,
>>> have actually used ICMP pings in the way I described. I was making a
>>> generic comment of concern.
>>>
>>> That said, it sounds like what you are doing is really helpful (esp.
>>> given that your measure is aimed at end user experiential qualities).
>>
>> David - my apologies, I incorrectly interpreted your statement as being
>> said in context of fast.com measurements. The blog post linked indeed
>> doesn't provide the latency measurement details - was written before we
>> added the extra metrics. We'll see if we can publish an update.
>>
>> 1) a clear definition of lag under load that is from end-to-end in
>>> latency, and involves, ideally, independent traffic from multiple sources
>>> through the bottleneck.
>>
>> Curious if by multiple sources you mean multiple clients (devices) or
>> multiple connections sending data?
>>
>>
>> SERGEY FEDOROV
>>
>> Director of Engineering
>>
>> sfedorov@netflix.com
>>
>> 121 Albright Way | Los Gatos, CA 95032
>>
>>
>>
>>
>> On Sun, May 3, 2020 at 8:07 AM David P. Reed <dpreed@deepplum.com> wrote:
>>
>>> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off
>>> the entry device that has the external IP address for the NAT gets most of
>>> the RTT measure, and if there's no queueing built up in the NAT device,
>>> that's a reasonable measure. But...
>>>
>>>
>>>
>>> However, if the router has "taken up the queueing delay" by rate
>>> limiting its uplink traffic to slightly less than the capacity (as with
>>> Cake and other TC shaping that isn't as good as cake), then there is a
>>> queue in the TC layer itself. This is what concerns me as a distortion in
>>> the measurement that can fool one into thinking the TC shaper is doing a
>>> good job, when in fact, lag under load may be quite high from inside the
>>> routed domain (the home).
>>>
>>>
>>>
>>> As you point out this unmeasured queueing delay can also be a problem
>>> with WiFi inside the home. But it isn't limited to that.
>>>
>>>
>>>
>>> A badly set up shaping/congestion management subsystem inside the NAT
>>> can look "very good" in its echo of ICMP packets, but be terrible in
>>> response time to trivial HTTP requests from inside, or equally terrible in
>>> twitch games and video conferencing.
>>>
>>>
>>>
>>> So, for example, for tuning settings with "Cake" it is useless.
>>>
>>>
>>>
>>> To be fair, usually the Access Provider has no control of what is done
>>> after the cable is terminated at the home, so as a way to decide if the
>>> provider is badly engineering its side, a ping from a server is a
>>> reasonable quality measure of the provider.
>>>
>>>
>>>
>>> But not a good measure of the user experience, and if the provider
>>> provides the NAT box, even if it has a good shaper in it, like Cake or
>>> fq_codel, it will just confuse the user and create the opportunity for a
>>> "finger pointing" argument where neither side understands what is going on.
>>>
>>>
>>>
>>> This is why we need
>>>
>>>
>>>
>>> 1) a clear definition of lag under load that is from end-to-end in
>>> latency, and involves, ideally, independent traffic from multiple sources
>>> through the bottleneck.
>>>
>>>
>>>
>>> 2) ideally, a better way to localize where the queues are building up
>>> and present that to users and access providers. The flent graphs are not
>>> interpretable by most non-experts. What we need is a simple visualization
>>> of a sketch-map of the path (like traceroute might provide) with queueing
>>> delay measures shown at key points that the user can understand.
>>>
>>> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de>
>>> said:
>>>
>>> > Hi David,
>>> >
>>> > in principle I agree, a NATed IPv4 ICMP probe will be at best
>>> reflected at the NAT
>>> > router (CPE) (some commercial home gateways do not respond to ICMP
>>> echo requests
>>> > in the name of security theatre). So it is pretty hard to measure the
>>> full end to
>>> > end path in that configuration. I believe that IPv6 should make that
>>> > easier/simpler in that NAT hopefully will be out of the path (but
>>> let's see what
>>> > ingenuity ISPs will come up with).
>>> > Then again, traditionally the relevant bottlenecks often are a) the
>>> internet
>>> > access link itself and there the CPE is in a reasonable position as a
>>> reflector on
>>> > the other side of the bottleneck as seen from an internet server, b)
>>> the home
>>> > network between CPE and end-host, often with variable rate wifi, here
>>> I agree
>>> > reflecting echos at the CPE hides part of the issue.
>>> >
>>> >
>>> >
>>> > > On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
>>> > >
>>> > > I am still a bit worried about properly defining "latency under
>>> load" for a
>>> > NAT routed situation. If the test is based on ICMP Ping packets *from
>>> the server*,
>>> > it will NOT be measuring the full path latency, and if the potential
>>> congestion
>>> > is in the uplink path from the access provider's residential box to
>>> the access
>>> > provider's router/switch, it will NOT measure congestion caused by
>>> bufferbloat
>>> > reliably on either side, since the bufferbloat will be outside the
>>> ICMP Ping
>>> > path.
>>> >
>>> > Puzzled, as I believe it is going to be the residential box that will
>>> respond
>>> > here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo
>>> requests?
>>> >
>>> > >
>>> > > I realize that a browser based speed test has to be basically run
>>> from the
>>> > "server" end, because browsers are not that good at time measurement
>>> on a packet
>>> > basis. However, there are ways to solve this and avoid the ICMP Ping
>>> issue, with a
>>> > cooperative server.
>>> > >
>>> > > I once built a test that fixed this issue reasonably well. It
>>> carefully
>>> > created a TCP based RTT measurement channel (over HTTP) that made the
>>> echo have to
>>> > traverse the whole end-to-end path, which is the best and only way to
>>> accurately
>>> > define lag under load from the user's perspective. The client end of
>>> an unloaded
>>> > TCP connection can depend on TCP (properly prepared by getting it past
>>> slowstart)
>>> > to generate a single packet response.
>>> > >
>>> > > This "TCP ping" is thus compatible with getting the end-to-end
>>> measurement on
>>> > the server end of a true RTT.
>>> > >
>>> > > It's like tcp-traceroute tool, in that it tricks anyone in the
>>> middle boxes
>>> > into thinking this is a real, serious packet, not an optional low
>>> priority
>>> > packet.
>>> > >
>>> > > The same issue comes up with non-browser-based techniques for
>>> measuring true
>>> > lag-under-load.
>>> > >
>>> > > Now as we move HTTP to QUIC, this actually gets easier to do.
>>> > >
>>> > > One other opportunity I haven't explored, but which is pregnant with
>>> > potential is the use of WebRTC, which runs over UDP internally. Since
>>> JavaScript
>>> > has direct access to create WebRTC connections (multiple ones), this
>>> makes
>>> > detailed testing in the browser quite reasonable.
>>> > >
>>> > > And the time measurements can resolve well below 100 microseconds,
>>> if the JS
>>> > is based on modern JIT compilation (Chrome, Firefox, Edge all compile
>>> to machine
>>> > code speed if the code is restricted and in a loop). Then again, there
>>> is Web
>>> > Assembly if you want to write C code that runs in the browser fast.
>>> WebAssembly is
>>> > a low level language that compiles to machine code in the browser
>>> execution, and
>>> > still has access to all the browser networking facilities.
>>> >
>>> > Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to
>>> spectre
>>> > side-channel vulnerabilities many browsers seemed to have lowered the
>>> timer
>>> > resolution, but even the ~1ms resolution should be fine for typical
>>> RTTs.
>>> >
>>> > Best Regards
>>> > Sebastian
>>> >
>>> > P.S.: I assume that I simply do not see/understand the full scope of
>>> the issue at
>>> > hand yet.
>>> >
>>> >
>>> > >
>>> > > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
>>> > said:
>>> > >
>>> > > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
>>> > wrote:
>>> > > > >
>>> > > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency
>>> > as ~7ms
>>> > > >
>>> > > > I guess one of my questions is that with a switch to BBR netflix is
>>> > > > going to do pretty well. If fast.com is using bbr, well... that
>>> > > > excludes much of the current side of the internet.
>>> > > >
>>> > > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload
>>> > the loaded
>>> > > > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer
>>> using
>>> > any
>>> > > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of
>>> the
>>> > bloat would
>>> > > > be nice.
>>> > > >
>>> > > > The tests do need to last a fairly long time.
>>> > > >
>>> > > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
>>> > <jannie@hanekom.net>
>>> > > > wrote:
>>> > > > >>
>>> > > > >> Michael Richardson <mcr@sandelman.ca>:
>>> > > > >> > Does it find/use my nearest Netflix cache?
>>> > > > >>
>>> > > > >> Thankfully, it appears so. The DSLReports bloat test was
>>> > interesting,
>>> > > > but
>>> > > > >> the jitter on the ~240ms base latency from South Africa (and
>>> > other parts
>>> > > > of
>>> > > > >> the world) was significant enough that the figures returned
>>> > were often
>>> > > > >> unreliable and largely unusable - at least in my experience.
>>> > > > >>
>>> > > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency
>>> > as ~7ms
>>> > > > and
>>> > > > >> mentions servers located in local cities. I finally have a test
>>> > I can
>>> > > > share
>>> > > > >> with local non-technical people!
>>> > > > >>
>>> > > > >> (Agreed, upload test would be nice, but this is a huge step
>>> > forward from
>>> > > > >> what I had access to before.)
>>> > > > >>
>>> > > > >> Jannie Hanekom
>>> > > > >>
>>> > > > >> _______________________________________________
>>> > > > >> Cake mailing list
>>> > > > >> Cake@lists.bufferbloat.net
>>> > > > >> https://lists.bufferbloat.net/listinfo/cake
>>> > > > >
>>> > > > > _______________________________________________
>>> > > > > Cake mailing list
>>> > > > > Cake@lists.bufferbloat.net
>>> > > > > https://lists.bufferbloat.net/listinfo/cake
>>> > > >
>>> > > >
>>> > > >
>>> > > > --
>>> > > > Make Music, Not War
>>> > > >
>>> > > > Dave Täht
>>> > > > CTO, TekLibre, LLC
>>> > > > http://www.teklibre.com
>>> > > > Tel: 1-831-435-0729
>>> > > > _______________________________________________
>>> > > > Cake mailing list
>>> > > > Cake@lists.bufferbloat.net
>>> > > > https://lists.bufferbloat.net/listinfo/cake
>>> > > >
>>> > > _______________________________________________
>>> > > Cake mailing list
>>> > > Cake@lists.bufferbloat.net
>>> > > https://lists.bufferbloat.net/listinfo/cake
>>> >
>>> >
>>>
>>>
>>>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergey Fedorov via Make-wifi-fast <
>> make-wifi-fast@lists.bufferbloat.net>
>> To: "David P. Reed" <dpreed@deepplum.com>
>> Cc: Michael Richardson <mcr@sandelman.ca>, Make-Wifi-fast <
>> make-wifi-fast@lists.bufferbloat.net>, bloat <bloat@lists.bufferbloat.net>,
>> Cake List <cake@lists.bufferbloat.net>, Jannie Hanekom <
>> jannie@hanekom.net>
>> Bcc:
>> Date: Mon, 04 May 2020 10:05:04 -0700 (PDT)
>> Subject: Re: [Make-wifi-fast] [Cake] [Bloat] dslreports is no longer free
>> _______________________________________________
>> Make-wifi-fast mailing list
>> Make-wifi-fast@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/make-wifi-fast
>
>
>
>
> ---------- Forwarded message ----------
> From: Bob McMahon via Bloat <bloat@lists.bufferbloat.net>
> To: Sergey Fedorov <sfedorov@netflix.com>
> Cc: Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>, bloat <
> bloat@lists.bufferbloat.net>, "David P. Reed" <dpreed@deepplum.com>, Cake
> List <cake@lists.bufferbloat.net>, Jannie Hanekom <jannie@hanekom.net>
> Bcc:
> Date: Mon, 04 May 2020 17:03:19 -0700 (PDT)
> Subject: Re: [Bloat] [Make-wifi-fast] [Cake] dslreports is no longer free
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
>
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
[-- Attachment #2: Type: text/html, Size: 25323 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Make-wifi-fast] [Cake] [Bloat] dslreports is no longer free
[not found] ` <mailman.253.1588611897.24343.make-wifi-fast@lists.bufferbloat.net>
@ 2020-05-05 0:03 1% ` Bob McMahon
[not found] ` <mailman.256.1588636996.24343.bloat@lists.bufferbloat.net>
1 sibling, 0 replies; 200+ results
From: Bob McMahon @ 2020-05-05 0:03 UTC (permalink / raw)
To: Sergey Fedorov
Cc: David P. Reed, Michael Richardson, Make-Wifi-fast, bloat,
Cake List, Jannie Hanekom
[-- Attachment #1: Type: text/plain, Size: 13260 bytes --]
Sorry for being a bit off topic but we find average latency not all that
useful. A full CDF is. The next best is a box plot with outliers which
can be presented parametrically as a few numbers. Most customers want
visibility into the PDF tail.
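The summary above (a full CDF, plus a few parametric numbers that keep the tail visible) can be sketched with nothing but the Python standard library. This is an illustrative reading of the description, not the actual tooling, and the function name is mine:

```python
import statistics


def latency_summary(samples_ms):
    """Empirical CDF plus a box-plot style parametric summary.

    A single average hides the distribution tail; the quantiles below
    keep it visible.
    """
    xs = sorted(samples_ms)
    n = len(xs)
    # Empirical CDF: for each sample, the fraction of samples at or below it.
    cdf = [(x, (i + 1) / n) for i, x in enumerate(xs)]
    q = statistics.quantiles(xs, n=100, method="inclusive")  # percentiles 1..99
    summary = {
        "min": xs[0],
        "p25": q[24],
        "median": q[49],
        "p75": q[74],
        "p99": q[98],  # the tail that an average alone would hide
        "max": xs[-1],
    }
    return cdf, summary
```

The summary dict is the "few numbers" form suitable for dashboards; the full CDF list is what you plot when a customer asks about the tail.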
Also, we're moving to socket write() to read() latencies for our end/end
measurements (using the iperf 2.0.14 --trip-times option assumes
synchronized clocks.). We also now measure TCP connects (3WHS) as well.
Finally, since we have trip times and the application write rates we can
compute the amount of "end/end bytes in queue" per Little's law.
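That Little's law step is just L = lambda * W: average occupancy equals arrival rate times mean time in system. A sketch, assuming the application write rate is in bytes per second and the measured write-to-read trip times are in seconds (names are illustrative):

```python
def bytes_in_queue(write_rate_bps, trip_times_s):
    """Little's law estimate of end/end bytes in queue.

    L = lambda * W, with the application write rate as the arrival rate
    (bytes/second) and the mean write()-to-read() trip time as the time
    in system (seconds).
    """
    mean_trip_s = sum(trip_times_s) / len(trip_times_s)
    return write_rate_bps * mean_trip_s


# A 12.5 MB/s (100 Mbit/s) flow with a 40 ms mean trip time implies
# roughly 500 kB sitting in queues somewhere along the path.
```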
For fault isolation, in-band network telemetry (or something similar) can
be useful. https://p4.org/assets/INT-current-spec.pdf
Bob
On Mon, May 4, 2020 at 10:05 AM Sergey Fedorov via Make-wifi-fast <
make-wifi-fast@lists.bufferbloat.net> wrote:
>
>
>
> ---------- Forwarded message ----------
> From: Sergey Fedorov <sfedorov@netflix.com>
> To: "David P. Reed" <dpreed@deepplum.com>
> Cc: Sebastian Moeller <moeller0@gmx.de>, "Dave Täht" <dave.taht@gmail.com>,
> Michael Richardson <mcr@sandelman.ca>, Make-Wifi-fast <
> make-wifi-fast@lists.bufferbloat.net>, Jannie Hanekom <jannie@hanekom.net>,
> Cake List <cake@lists.bufferbloat.net>, bloat <bloat@lists.bufferbloat.net
> >
> Bcc:
> Date: Mon, 4 May 2020 10:04:19 -0700
> Subject: Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
>
>> Sergey - I wasn't assuming anything about fast.com. The document you
>> shared wasn't clear about the methodology's details here. Others sadly,
>> have actually used ICMP pings in the way I described. I was making a
>> generic comment of concern.
>>
>> That said, it sounds like what you are doing is really helpful (esp.
>> given that your measure is aimed at end user experiential qualities).
>
> David - my apologies, I incorrectly interpreted your statement as being
> said in context of fast.com measurements. The blog post linked indeed
> doesn't provide the latency measurement details - was written before we
> added the extra metrics. We'll see if we can publish an update.
>
> 1) a clear definition of lag under load that is from end-to-end in
>> latency, and involves, ideally, independent traffic from multiple sources
>> through the bottleneck.
>
> Curious if by multiple sources you mean multiple clients (devices) or
> multiple connections sending data?
>
>
> SERGEY FEDOROV
>
> Director of Engineering
>
> sfedorov@netflix.com
>
> 121 Albright Way | Los Gatos, CA 95032
>
>
>
>
> On Sun, May 3, 2020 at 8:07 AM David P. Reed <dpreed@deepplum.com> wrote:
>
>> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off
>> the entry device that has the external IP address for the NAT gets most of
>> the RTT measure, and if there's no queueing built up in the NAT device,
>> that's a reasonable measure. But...
>>
>>
>>
>> However, if the router has "taken up the queueing delay" by rate limiting
>> its uplink traffic to slightly less than the capacity (as with Cake and
>> other TC shaping that isn't as good as cake), then there is a queue in the
>> TC layer itself. This is what concerns me as a distortion in the
>> measurement that can fool one into thinking the TC shaper is doing a good
>> job, when in fact, lag under load may be quite high from inside the routed
>> domain (the home).
>>
>>
>>
>> As you point out this unmeasured queueing delay can also be a problem
>> with WiFi inside the home. But it isn't limited to that.
>>
>>
>>
>> A badly set up shaping/congestion management subsystem inside the NAT can
>> look "very good" in its echo of ICMP packets, but be terrible in response
>> time to trivial HTTP requests from inside, or equally terrible in twitch
>> games and video conferencing.
>>
>>
>>
>> So, for example, for tuning settings with "Cake" it is useless.
>>
>>
>>
>> To be fair, usually the Access Provider has no control of what is done
>> after the cable is terminated at the home, so as a way to decide if the
>> provider is badly engineering its side, a ping from a server is a
>> reasonable quality measure of the provider.
>>
>>
>>
>> But not a good measure of the user experience, and if the provider
>> provides the NAT box, even if it has a good shaper in it, like Cake or
>> fq_codel, it will just confuse the user and create the opportunity for a
>> "finger pointing" argument where neither side understands what is going on.
>>
>>
>>
>> This is why we need
>>
>>
>>
>> 1) a clear definition of lag under load that is from end-to-end in
>> latency, and involves, ideally, independent traffic from multiple sources
>> through the bottleneck.
>>
>>
>>
>> 2) ideally, a better way to localize where the queues are building up and
>> present that to users and access providers. The flent graphs are not
>> interpretable by most non-experts. What we need is a simple visualization
>> of a sketch-map of the path (like traceroute might provide) with queueing
>> delay measures shown at key points that the user can understand.
>>
>> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de>
>> said:
>>
>> > Hi David,
>> >
>> > in principle I agree, a NATed IPv4 ICMP probe will be at best reflected
>> at the NAT
>> > router (CPE) (some commercial home gateways do not respond to ICMP echo
>> requests
>> > in the name of security theatre). So it is pretty hard to measure the
>> full end to
>> > end path in that configuration. I believe that IPv6 should make that
>> > easier/simpler in that NAT hopefully will be out of the path (but let's
>> see what
>> > ingenuity ISPs will come up with).
>> > Then again, traditionally the relevant bottlenecks often are a) the
>> internet
>> > access link itself and there the CPE is in a reasonable position as a
>> reflector on
>> > the other side of the bottleneck as seen from an internet server, b)
>> the home
>> > network between CPE and end-host, often with variable rate wifi, here I
>> agree
>> > reflecting echos at the CPE hides part of the issue.
>> >
>> >
>> >
>> > > On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
>> > >
>> > > I am still a bit worried about properly defining "latency under load"
>> for a
>> > NAT routed situation. If the test is based on ICMP Ping packets *from
>> the server*,
>> > it will NOT be measuring the full path latency, and if the potential
>> congestion
>> > is in the uplink path from the access provider's residential box to the
>> access
>> > provider's router/switch, it will NOT measure congestion caused by
>> bufferbloat
>> > reliably on either side, since the bufferbloat will be outside the ICMP
>> Ping
>> > path.
>> >
>> > Puzzled, as i believe it is going to be the residential box that will
>> respond
>> > here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo
>> requests?
>> >
>> > >
>> > > I realize that a browser based speed test has to be basically run
>> from the
>> > "server" end, because browsers are not that good at time measurement on
>> a packet
>> > basis. However, there are ways to solve this and avoid the ICMP Ping
>> issue, with a
>> > cooperative server.
>> > >
>> > > I once built a test that fixed this issue reasonably well. It
>> carefully
>> > created a TCP based RTT measurement channel (over HTTP) that made the
>> echo have to
>> > traverse the whole end-to-end path, which is the best and only way to
>> accurately
>> > define lag under load from the user's perspective. The client end of an
>> unloaded
>> > TCP connection can depend on TCP (properly prepared by getting it past
>> slowstart)
>> > to generate a single packet response.
>> > >
>> > > This "TCP ping" is thus compatible with getting the end-to-end
>> measurement on
>> > the server end of a true RTT.
>> > >
>> > > It's like tcp-traceroute tool, in that it tricks anyone in the middle
>> boxes
>> > into thinking this is a real, serious packet, not an optional low
>> priority
>> > packet.
>> > >
>> > > The same issue comes up with non-browser-based techniques for
>> measuring true
>> > lag-under-load.
>> > >
>> > > Now as we move HTTP to QUIC, this actually gets easier to do.
>> > >
>> > > One other opportunity I haven't explored, but which is pregnant with
>> > potential is the use of WebRTC, which runs over UDP internally. Since
>> JavaScript
>> > has direct access to create WebRTC connections (multiple ones), this
>> makes
>> > detailed testing in the browser quite reasonable.
>> > >
>> > > And the time measurements can resolve well below 100 microseconds, if
>> the JS
>> > is based on modern JIT compilation (Chrome, Firefox, Edge all compile
>> to machine
>> > code speed if the code is restricted and in a loop). Then again, there
>> is Web
>> > Assembly if you want to write C code that runs in the browser fast.
>> WebAssembly is
>> > a low level language that compiles to machine code in the browser
>> execution, and
>> > still has access to all the browser networking facilities.
>> >
>> > Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to
>> spectre
>> > side-channel vulnerabilities many browsers seemed to have lowered the
>> timer
>> > resolution, but even the ~1ms resolution should be fine for typical
>> RTTs.
>> >
>> > Best Regards
>> > Sebastian
>> >
>> > P.S.: I assume that I simply do not see/understand the full scope of
>> the issue at
>> > hand yet.
>> >
>> >
>> > >
>> > > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
>> > said:
>> > >
>> > > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
>> > wrote:
>> > > > >
>> > > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency
>> > as ~7ms
>> > > >
>> > > > I guess one of my questions is that with a switch to BBR netflix is
>> > > > going to do pretty well. If fast.com is using bbr, well... that
>> > > > excludes much of the current side of the internet.
>> > > >
>> > > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload
>> > the loaded
>> > > > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer
>> using
>> > any
>> > > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
>> > bloat would
>> > > > be nice.
>> > > >
>> > > > The tests do need to last a fairly long time.
>> > > >
>> > > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
>> > <jannie@hanekom.net>
>> > > > wrote:
>> > > > >>
>> > > > >> Michael Richardson <mcr@sandelman.ca>:
>> > > > >> > Does it find/use my nearest Netflix cache?
>> > > > >>
>> > > > >> Thankfully, it appears so. The DSLReports bloat test was
>> > interesting,
>> > > > but
>> > > > >> the jitter on the ~240ms base latency from South Africa (and
>> > other parts
>> > > > of
>> > > > >> the world) was significant enough that the figures returned
>> > were often
>> > > > >> unreliable and largely unusable - at least in my experience.
>> > > > >>
>> > > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> > as ~7ms
>> > > > and
>> > > > >> mentions servers located in local cities. I finally have a test
>> > I can
>> > > > share
>> > > > >> with local non-technical people!
>> > > > >>
>> > > > >> (Agreed, upload test would be nice, but this is a huge step
>> > forward from
>> > > > >> what I had access to before.)
>> > > > >>
>> > > > >> Jannie Hanekom
>> > > > >>
>> > > > >> _______________________________________________
>> > > > >> Cake mailing list
>> > > > >> Cake@lists.bufferbloat.net
>> > > > >> https://lists.bufferbloat.net/listinfo/cake
>> > > > >
>> > > > > _______________________________________________
>> > > > > Cake mailing list
>> > > > > Cake@lists.bufferbloat.net
>> > > > > https://lists.bufferbloat.net/listinfo/cake
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Make Music, Not War
>> > > >
>> > > > Dave Täht
>> > > > CTO, TekLibre, LLC
>> > > > http://www.teklibre.com
>> > > > Tel: 1-831-435-0729
>> > > > _______________________________________________
>> > > > Cake mailing list
>> > > > Cake@lists.bufferbloat.net
>> > > > https://lists.bufferbloat.net/listinfo/cake
>> > > >
>> > > _______________________________________________
>> > > Cake mailing list
>> > > Cake@lists.bufferbloat.net
>> > > https://lists.bufferbloat.net/listinfo/cake
>> >
>> >
>>
>>
>>
>
>
>
> ---------- Forwarded message ----------
> From: Sergey Fedorov via Make-wifi-fast <
> make-wifi-fast@lists.bufferbloat.net>
> To: "David P. Reed" <dpreed@deepplum.com>
> Cc: Michael Richardson <mcr@sandelman.ca>, Make-Wifi-fast <
> make-wifi-fast@lists.bufferbloat.net>, bloat <bloat@lists.bufferbloat.net>,
> Cake List <cake@lists.bufferbloat.net>, Jannie Hanekom <jannie@hanekom.net
> >
> Bcc:
> Date: Mon, 04 May 2020 10:05:04 -0700 (PDT)
> Subject: Re: [Make-wifi-fast] [Cake] [Bloat] dslreports is no longer free
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast
[-- Attachment #2: Type: text/html, Size: 20487 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-03 15:06 1% [Cake] [Make-wifi-fast] " David P. Reed
@ 2020-05-04 17:04 1% ` Sergey Fedorov
2020-05-05 21:02 1% ` David P. Reed
2020-05-06 8:19 0% ` Sebastian Moeller
[not found] ` <mailman.253.1588611897.24343.make-wifi-fast@lists.bufferbloat.net>
2020-05-06 8:08 0% ` [Cake] [Make-wifi-fast] [Bloat] " Sebastian Moeller
2 siblings, 2 replies; 200+ results
From: Sergey Fedorov @ 2020-05-04 17:04 UTC (permalink / raw)
To: David P. Reed
Cc: Sebastian Moeller, Dave Täht, Michael Richardson,
Make-Wifi-fast, Jannie Hanekom, Cake List, bloat
[-- Attachment #1: Type: text/plain, Size: 10784 bytes --]
>
> Sergey - I wasn't assuming anything about fast.com. The document you
> shared wasn't clear about the methodology's details here. Others, sadly,
> have actually used ICMP pings in the way I described. I was making a
> generic comment of concern.
>
> That said, it sounds like what you are doing is really helpful (esp. given
> that your measure is aimed at end user experiential qualities).
David - my apologies, I incorrectly interpreted your statement as being
said in the context of fast.com measurements. The blog post linked indeed
doesn't provide the latency measurement details - it was written before we
added the extra metrics. We'll see if we can publish an update.
1) a clear definition of lag under load that is from end-to-end in latency,
> and involves, ideally, independent traffic from multiple sources through
> the bottleneck.
Curious if by multiple sources you mean multiple clients (devices) or
multiple connections sending data?
SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032
On Sun, May 3, 2020 at 8:07 AM David P. Reed <dpreed@deepplum.com> wrote:
> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off
> the entry device that has the external IP address for the NAT gets most of
> the RTT measure, and if there's no queueing built up in the NAT device,
> that's a reasonable measure. But...
>
>
>
> However, if the router has "taken up the queueing delay" by rate limiting
> its uplink traffic to slightly less than the capacity (as with Cake and
> other TC shaping that isn't as good as cake), then there is a queue in the
> TC layer itself. This is what concerns me as a distortion in the
> measurement that can fool one into thinking the TC shaper is doing a good
> job, when in fact, lag under load may be quite high from inside the routed
> domain (the home).
>
>
>
> As you point out this unmeasured queueing delay can also be a problem with
> WiFi inside the home. But it isn't limited to that.
>
>
>
> A badly set up shaping/congestion management subsystem inside the NAT can
> look "very good" in its echo of ICMP packets, but be terrible in response
> time to trivial HTTP requests from inside, or equally terrible in twitch
> games and video conferencing.
>
>
>
> So, for example, for tuning settings with "Cake" it is useless.
>
>
>
> To be fair, usually the Access Provider has no control of what is done
> after the cable is terminated at the home, so as a way to decide if the
> provider is badly engineering its side, a ping from a server is a
> reasonable quality measure of the provider.
>
>
>
> But not a good measure of the user experience, and if the provider
> provides the NAT box, even if it has a good shaper in it, like Cake or
> fq_codel, it will just confuse the user and create the opportunity for a
> "finger pointing" argument where neither side understands what is going on.
>
>
>
> This is why we need
>
>
>
> 1) a clear definition of lag under load that is from end-to-end in
> latency, and involves, ideally, independent traffic from multiple sources
> through the bottleneck.
>
>
>
> 2) ideally, a better way to localize where the queues are building up and
> present that to users and access providers. The flent graphs are not
> interpretable by most non-experts. What we need is a simple visualization
> of a sketch-map of the path (like traceroute might provide) with queueing
> delay measures shown at key points that the user can understand.
>
> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de>
> said:
>
> > Hi David,
> >
> > in principle I agree, a NATed IPv4 ICMP probe will be at best reflected
> at the NAT
> > router (CPE) (some commercial home gateways do not respond to ICMP echo
> requests
> > in the name of security theatre). So it is pretty hard to measure the
> full end to
> > end path in that configuration. I believe that IPv6 should make that
> > easier/simpler in that NAT hopefully will be out of the path (but let's
> see what
> > ingenuity ISPs will come up with).
> > Then again, traditionally the relevant bottlenecks often are a) the
> internet
> > access link itself and there the CPE is in a reasonable position as a
> reflector on
> > the other side of the bottleneck as seen from an internet server, b) the
> home
> > network between CPE and end-host, often with variable rate wifi, here I
> agree
> > reflecting echos at the CPE hides part of the issue.
> >
> >
> >
> > > On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
> > >
> > > I am still a bit worried about properly defining "latency under load"
> for a
> > NAT routed situation. If the test is based on ICMP Ping packets *from
> the server*,
> > it will NOT be measuring the full path latency, and if the potential
> congestion
> > is in the uplink path from the access provider's residential box to the
> access
> > provider's router/switch, it will NOT measure congestion caused by
> bufferbloat
> > reliably on either side, since the bufferbloat will be outside the ICMP
> Ping
> > path.
> >
> > Puzzled, as i believe it is going to be the residential box that will
> respond
> > here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo
> requests?
> >
> > >
> > > I realize that a browser based speed test has to be basically run from
> the
> > "server" end, because browsers are not that good at time measurement on
> a packet
> > basis. However, there are ways to solve this and avoid the ICMP Ping
> issue, with a
> > cooperative server.
> > >
> > > I once built a test that fixed this issue reasonably well. It carefully
> > created a TCP based RTT measurement channel (over HTTP) that made the
> echo have to
> > traverse the whole end-to-end path, which is the best and only way to
> accurately
> > define lag under load from the user's perspective. The client end of an
> unloaded
> > TCP connection can depend on TCP (properly prepared by getting it past
> slowstart)
> > to generate a single packet response.
> > >
> > > This "TCP ping" is thus compatible with getting the end-to-end
> measurement on
> > the server end of a true RTT.
> > >
> > > It's like tcp-traceroute tool, in that it tricks anyone in the middle
> boxes
> > into thinking this is a real, serious packet, not an optional low
> priority
> > packet.
> > >
> > > The same issue comes up with non-browser-based techniques for
> measuring true
> > lag-under-load.
> > >
> > > Now as we move HTTP to QUIC, this actually gets easier to do.
> > >
> > > One other opportunity I haven't explored, but which is pregnant with
> > potential is the use of WebRTC, which runs over UDP internally. Since
> JavaScript
> > has direct access to create WebRTC connections (multiple ones), this
> makes
> > detailed testing in the browser quite reasonable.
> > >
> > > And the time measurements can resolve well below 100 microseconds, if
> the JS
> > is based on modern JIT compilation (Chrome, Firefox, Edge all compile to
> machine
> > code speed if the code is restricted and in a loop). Then again, there
> is Web
> > Assembly if you want to write C code that runs in the browser fast.
> WebAssembly is
> > a low level language that compiles to machine code in the browser
> execution, and
> > still has access to all the browser networking facilities.
> >
> > Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to
> spectre
> > side-channel vulnerabilities many browsers seemed to have lowered the
> timer
> > resolution, but even the ~1ms resolution should be fine for typical RTTs.
> >
> > Best Regards
> > Sebastian
> >
> > P.S.: I assume that I simply do not see/understand the full scope of the
> issue at
> > hand yet.
> >
> >
> > >
> > > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
> > said:
> > >
> > > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
> > wrote:
> > > > >
> > > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency
> > as ~7ms
> > > >
> > > > I guess one of my questions is that with a switch to BBR netflix is
> > > > going to do pretty well. If fast.com is using bbr, well... that
> > > > excludes much of the current side of the internet.
> > > >
> > > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload
> > the loaded
> > > > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer
> using
> > any
> > > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
> > bloat would
> > > > be nice.
> > > >
> > > > The tests do need to last a fairly long time.
> > > >
> > > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
> > <jannie@hanekom.net>
> > > > wrote:
> > > > >>
> > > > >> Michael Richardson <mcr@sandelman.ca>:
> > > > >> > Does it find/use my nearest Netflix cache?
> > > > >>
> > > > >> Thankfully, it appears so. The DSLReports bloat test was
> > interesting,
> > > > but
> > > > >> the jitter on the ~240ms base latency from South Africa (and
> > other parts
> > > > of
> > > > >> the world) was significant enough that the figures returned
> > were often
> > > > >> unreliable and largely unusable - at least in my experience.
> > > > >>
> > > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency
> > as ~7ms
> > > > and
> > > > >> mentions servers located in local cities. I finally have a test
> > I can
> > > > share
> > > > >> with local non-technical people!
> > > > >>
> > > > >> (Agreed, upload test would be nice, but this is a huge step
> > forward from
> > > > >> what I had access to before.)
> > > > >>
> > > > >> Jannie Hanekom
> > > > >>
> > > > >> _______________________________________________
> > > > >> Cake mailing list
> > > > >> Cake@lists.bufferbloat.net
> > > > >> https://lists.bufferbloat.net/listinfo/cake
> > > > >
> > > > > _______________________________________________
> > > > > Cake mailing list
> > > > > Cake@lists.bufferbloat.net
> > > > > https://lists.bufferbloat.net/listinfo/cake
> > > >
> > > >
> > > >
> > > > --
> > > > Make Music, Not War
> > > >
> > > > Dave Täht
> > > > CTO, TekLibre, LLC
> > > > http://www.teklibre.com
> > > > Tel: 1-831-435-0729
> > > > _______________________________________________
> > > > Cake mailing list
> > > > Cake@lists.bufferbloat.net
> > > > https://lists.bufferbloat.net/listinfo/cake
> > > >
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> >
> >
>
>
>
[-- Attachment #2: Type: text/html, Size: 16703 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] fast.com quality
2020-05-03 15:31 1% ` [Cake] fast.com quality David P. Reed
@ 2020-05-03 15:37 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-05-03 15:37 UTC (permalink / raw)
To: David P. Reed
Cc: Sergey Fedorov, Benjamin Cronce, Michael Richardson,
Jannie Hanekom, bloat, Cake List, Make-Wifi-fast
turn off cake, do it over wired. :) TAKE a packet cap of before and after. Thx.
On Sun, May 3, 2020 at 8:31 AM David P. Reed <dpreed@deepplum.com> wrote:
>
> Sergey -
>
>
>
> I am very happy to report that fast.com reports the following from my inexpensive Chromebook, over 802.11ac, my Linux-on-Celeron cake entry router setup, through RCN's "Gigabit service". It's a little surprising, only in how good it is.
>
>
>
> 460 Mbps down/17 Mbps up, 11 ms. unloaded, 18 ms. loaded.
>
>
>
> I'm a little bit curious about the extra 7 ms. due to load. I'm wondering if it is in my WiFi path, or whether Cake is building a queue.
>
>
>
> The 11 ms. to South Boston from my Needham home seems a bit high. I used to be about 7 msec. away from that switch. But I'm not complaining.
>
> On Saturday, May 2, 2020 3:00pm, "Sergey Fedorov" <sfedorov@netflix.com> said:
>
> Dave, thanks for sharing interesting thoughts and context.
>>
>> I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
>>
>> I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
>
> This erroneously assumes that fast.com measures latency from the server side. It does not. The measurements are done from the client, over http, with a parallel connection(s) to the same or similar set of servers, by sending empty requests over a previously established connection (you can see that in the browser web inspector).
> It should be noted that the value is not precisely the "RTT on a TCP/UDP flow that is loaded with traffic", but "user delay given the presence of heavy parallel flows". With that, some of the challenges you mentioned do not apply.
> In line with another point I've shared earlier - the goal is to measure and explain the user experience, not to be a diagnostic tool showing internal transport metrics.
>
> SERGEY FEDOROV
>
> Director of Engineering
>
> sfedorov@netflix.com
>
> 121 Albright Way | Los Gatos, CA 95032
>
>
> On Sat, May 2, 2020 at 10:38 AM David P. Reed <dpreed@deepplum.com> wrote:
>>
>> I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
>>
>>
>>
>> I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
>>
>>
>>
>> I once built a test that fixed this issue reasonably well. It carefully created a TCP based RTT measurement channel (over HTTP) that made the echo have to traverse the whole end-to-end path, which is the best and only way to accurately define lag under load from the user's perspective. The client end of an unloaded TCP connection can depend on TCP (properly prepared by getting it past slowstart) to generate a single packet response.
>>
>>
>>
>> This "TCP ping" is thus compatible with getting the end-to-end measurement on the server end of a true RTT.
>>
>>
>>
>> It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes into thinking this is a real, serious packet, not an optional low priority packet.
>>
>>
>>
>> The same issue comes up with non-browser-based techniques for measuring true lag-under-load.
>>
>>
>>
>> Now as we move HTTP to QUIC, this actually gets easier to do.
>>
>>
>>
>> One other opportunity I haven't explored, but which is pregnant with potential is the use of WebRTC, which runs over UDP internally. Since JavaScript has direct access to create WebRTC connections (multiple ones), this makes detailed testing in the browser quite reasonable.
>>
>>
>>
>> And the time measurements can resolve well below 100 microseconds, if the JS is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine code speed if the code is restricted and in a loop). Then again, there is Web Assembly if you want to write C code that runs in the browser fast. WebAssembly is a low level language that compiles to machine code in the browser execution, and still has access to all the browser networking facilities.
>>
>>
>>
>> On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com> said:
>>
>> > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com> wrote:
>> > >
>> > > > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
>> >
>> > I guess one of my questions is that with a switch to BBR netflix is
>> > going to do pretty well. If fast.com is using bbr, well... that
>> > excludes much of the current side of the internet.
>> >
>> > > For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded
>> > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using any
>> > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the bloat would
>> > be nice.
>> >
>> > The tests do need to last a fairly long time.
>> >
>> > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie@hanekom.net>
>> > wrote:
>> > >>
>> > >> Michael Richardson <mcr@sandelman.ca>:
>> > >> > Does it find/use my nearest Netflix cache?
>> > >>
>> > >> Thankfully, it appears so. The DSLReports bloat test was interesting,
>> > but
>> > >> the jitter on the ~240ms base latency from South Africa (and other parts
>> > of
>> > >> the world) was significant enough that the figures returned were often
>> > >> unreliable and largely unusable - at least in my experience.
>> > >>
>> > >> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
>> > and
>> > >> mentions servers located in local cities. I finally have a test I can
>> > share
>> > >> with local non-technical people!
>> > >>
>> > >> (Agreed, upload test would be nice, but this is a huge step forward from
>> > >> what I had access to before.)
>> > >>
>> > >> Jannie Hanekom
>> > >>
>> > >> _______________________________________________
>> > >> Cake mailing list
>> > >> Cake@lists.bufferbloat.net
>> > >> https://lists.bufferbloat.net/listinfo/cake
>> > >
>> > > _______________________________________________
>> > > Cake mailing list
>> > > Cake@lists.bufferbloat.net
>> > > https://lists.bufferbloat.net/listinfo/cake
>> >
>> >
>> >
>> > --
>> > Make Music, Not War
>> >
>> > Dave Täht
>> > CTO, TekLibre, LLC
>> > http://www.teklibre.com
>> > Tel: 1-831-435-0729
>> > _______________________________________________
>> > Cake mailing list
>> > Cake@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/cake
>> >
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* [Cake] fast.com quality
2020-05-02 19:00 1% ` Sergey Fedorov
2020-05-02 23:23 1% ` David P. Reed
@ 2020-05-03 15:31 1% ` David P. Reed
2020-05-03 15:37 1% ` Dave Taht
1 sibling, 1 reply; 200+ results
From: David P. Reed @ 2020-05-03 15:31 UTC (permalink / raw)
To: Sergey Fedorov
Cc: Dave Taht, Benjamin Cronce, Michael Richardson, Jannie Hanekom,
bloat, Cake List, Make-Wifi-fast
[-- Attachment #1: Type: text/plain, Size: 7571 bytes --]
Sergey -
I am very happy to report that fast.com reports the following from my inexpensive Chromebook, over 802.11ac, my Linux-on-Celeron cake entry router setup, through RCN's "Gigabit service". It's a little surprising, only in how good it is.
460 Mbps down/17 Mbps up, 11 ms. unloaded, 18 ms. loaded.
I'm a little bit curious about the extra 7 ms. due to load. I'm wondering if it is in my WiFi path, or whether Cake is building a queue.
The 11 ms. to South Boston from my Needham home seems a bit high. I used to be about 7 msec. away from that switch. But I'm not complaining.
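[A hedged sketch, not part of the original thread: one way to localize the extra loaded delay David wonders about is to time TCP handshakes to the LAN gateway and to a host beyond the bottleneck while the speed test runs. If only the WAN RTT inflates, the queue is in Cake/the uplink shaper; if the gateway RTT inflates too, the Wi-Fi hop is queueing. The gateway and WAN addresses below are placeholders; the runnable demo at the bottom uses a loopback listener instead.]

```python
# Sketch: localize queueing delay by timing bare TCP handshakes.
import socket
import time

def tcp_rtt_ms(host, port, timeout=2.0):
    """RTT of a bare TCP handshake (SYN -> SYN/ACK), in milliseconds."""
    t0 = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # handshake completed; we don't need to send anything
    return (time.perf_counter() - t0) * 1000.0

# During a loaded test one would compare, e.g. (hypothetical addresses):
#   tcp_rtt_ms("192.168.1.1", 80)    # LAN gateway, before the bottleneck
#   tcp_rtt_ms("example.com", 443)   # beyond the bottleneck
# Self-contained demo against a loopback listener:
srv = socket.create_server(("127.0.0.1", 0))  # kernel completes handshakes
rtt = tcp_rtt_ms(*srv.getsockname())
print(f"loopback handshake ~{rtt:.3f} ms")
srv.close()
```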
On Saturday, May 2, 2020 3:00pm, "Sergey Fedorov" <sfedorov@netflix.com> said:
Dave, thanks for sharing interesting thoughts and context.
> I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
> I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
This erroneously assumes that fast.com measures latency from the server side. It does not. The measurements are done from the client, over http, with a parallel connection(s) to the same or similar set of servers, by sending empty requests over a previously established connection (you can see that in the browser web inspector).
It should be noted that the value is not precisely the "RTT on a TCP/UDP flow that is loaded with traffic", but "user delay given the presence of heavy parallel flows". With that, some of the challenges you mentioned do not apply.
In line with another point I've shared earlier - the goal is to measure and explain the user experience, not to be a diagnostic tool showing internal transport metrics.
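[Editorial sketch, not part of Sergey's message: the client-side style he describes -- small requests timed over a pre-established keep-alive connection, first idle, then while parallel bulk transfers run -- can be illustrated roughly as below. A loopback server stands in for the real fast.com endpoints; the paths and payload sizes are invented for illustration.]

```python
# Sketch: client-side "loaded latency" over a reused HTTP connection.
import threading
import time
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive, so probes reuse the connection
    def do_GET(self):
        body = b"0" * (256 * 1024) if self.path == "/bulk" else b""
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # keep the demo quiet
        pass

srv = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
host, port = srv.server_address

def probe(conn):
    """Time one tiny request on an already-established connection (ms)."""
    t0 = time.perf_counter()
    conn.request("GET", "/empty")
    conn.getresponse().read()
    return (time.perf_counter() - t0) * 1000.0

conn = HTTPConnection(host, port)
probe(conn)  # warm-up: pays for connect and slow start once
unloaded = [probe(conn) for _ in range(10)]

stop = threading.Event()
def bulk():
    c = HTTPConnection(host, port)
    while not stop.is_set():
        c.request("GET", "/bulk")
        c.getresponse().read()
    c.close()

workers = [threading.Thread(target=bulk) for _ in range(4)]
for w in workers:
    w.start()
time.sleep(0.1)  # let the parallel flows get going
loaded = [probe(conn) for _ in range(10)]
stop.set()
for w in workers:
    w.join()
srv.shutdown()

print(f"unloaded ~{min(unloaded):.2f} ms, loaded ~{min(loaded):.2f} ms")
```

On loopback there is no real bottleneck, so the two figures will be close; over an actual access link the loaded probes would show the queueing delay the parallel flows build up.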
SERGEY FEDOROV
Director of Engineering
[ sfedorov@netflix.com ]( mailto:sfedorov@netflix.com )
121 Albright Way | Los Gatos, CA 95032
On Sat, May 2, 2020 at 10:38 AM David P. Reed <[ dpreed@deepplum.com ]( mailto:dpreed@deepplum.com )> wrote:
I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
I once built a test that fixed this issue reasonably well. It carefully created a TCP based RTT measurement channel (over HTTP) that made the echo have to traverse the whole end-to-end path, which is the best and only way to accurately define lag under load from the user's perspective. The client end of an unloaded TCP connection can depend on TCP (properly prepared by getting it past slowstart) to generate a single packet response.
This "TCP ping" is thus compatible with getting the end-to-end measurement on the server end of a true RTT.
It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes into thinking this is a real, serious packet, not an optional low priority packet.
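David's "TCP ping" can be sketched with a cooperative server. A minimal Python sketch, assuming a localhost echo endpoint stands in for the far end (the real tool must of course traverse the actual bottleneck path):

```python
# Minimal "TCP ping" sketch: time one small payload echoed over an
# already-warmed TCP connection. Localhost stands in for the far end.
import socket, threading, time

def echo_server(srv):
    conn, _ = srv.accept()
    with conn:
        while data := conn.recv(4096):
            conn.sendall(data)

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=echo_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(srv.getsockname())
cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # defeat Nagle

# "Warm" the connection first (a stand-in for getting TCP past slow start),
# draining the echoed bytes so only the timed probe remains in flight.
cli.sendall(b"w" * 1024)
got = 0
while got < 1024:
    got += len(cli.recv(4096))

t0 = time.perf_counter()
cli.sendall(b"ping")
_ = cli.recv(4)
rtt_ms = (time.perf_counter() - t0) * 1000
print(f"TCP ping RTT: {rtt_ms:.3f} ms")
cli.close()
```

Because the probe rides a real, established TCP connection, middleboxes treat it as serious traffic rather than optional low-priority ICMP, which is the point of the technique.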
The same issue comes up with non-browser-based techniques for measuring true lag-under-load.
Now as we move HTTP to QUIC, this actually gets easier to do.
One other opportunity I haven't explored, but which is pregnant with potential is the use of WebRTC, which runs over UDP internally. Since JavaScript has direct access to create WebRTC connections (multiple ones), this makes detailed testing in the browser quite reasonable.
And the time measurements can resolve well below 100 microseconds, if the JS is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine code speed if the code is restricted and in a loop). Then again, there is Web Assembly if you want to write C code that runs in the browser fast. WebAssembly is a low level language that compiles to machine code in the browser execution, and still has access to all the browser networking facilities.
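The sub-100-microsecond claim is easy to sanity-check outside the browser. A small Python check (an analogy to what JS high-resolution timers would need, not browser code) of how finely a monotonic clock ticks:

```python
# Measure the smallest observable non-zero step of the high-resolution
# monotonic clock by comparing back-to-back readings.
import time

ticks = []
for _ in range(10000):
    t0 = time.perf_counter_ns()
    t1 = time.perf_counter_ns()
    if t1 > t0:
        ticks.append(t1 - t0)

# Guard against the (unrealistic) case of a clock that never advanced.
min_tick_us = (min(ticks) if ticks else 0) / 1000  # ns -> microseconds
print(f"smallest observed tick: {min_tick_us:.3f} µs")
```

On typical hardware the smallest tick is well under a microsecond, comfortably below the 100 µs resolution David describes.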
On Saturday, May 2, 2020 12:52pm, "Dave Taht" <[ dave.taht@gmail.com ]( mailto:dave.taht@gmail.com )> said:
> On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <[ bcronce@gmail.com ]( mailto:bcronce@gmail.com )> wrote:
> >
> > > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
>
> I guess one of my questions is that with a switch to BBR netflix is
> going to do pretty well. If [ fast.com ]( http://fast.com ) is using bbr, well... that
> excludes much of the current side of the internet.
>
> > For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded
> shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using any
> traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the bloat would
> be nice.
>
> The tests do need to last a fairly long time.
>
> > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <[ jannie@hanekom.net ]( mailto:jannie@hanekom.net )>
> wrote:
> >>
> >> Michael Richardson <[ mcr@sandelman.ca ]( mailto:mcr@sandelman.ca )>:
> >> > Does it find/use my nearest Netflix cache?
> >>
> >> Thankfully, it appears so. The DSLReports bloat test was interesting,
> but
> >> the jitter on the ~240ms base latency from South Africa (and other parts
> of
> >> the world) was significant enough that the figures returned were often
> >> unreliable and largely unusable - at least in my experience.
> >>
> >> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> and
> >> mentions servers located in local cities. I finally have a test I can
> share
> >> with local non-technical people!
> >>
> >> (Agreed, upload test would be nice, but this is a huge step forward from
> >> what I had access to before.)
> >>
> >> Jannie Hanekom
> >>
> >> _______________________________________________
> >> Cake mailing list
> >> [ Cake@lists.bufferbloat.net ]( mailto:Cake@lists.bufferbloat.net )
> >> [ https://lists.bufferbloat.net/listinfo/cake ]( https://lists.bufferbloat.net/listinfo/cake )
> >
> > _______________________________________________
> > Cake mailing list
> > [ Cake@lists.bufferbloat.net ]( mailto:Cake@lists.bufferbloat.net )
> > [ https://lists.bufferbloat.net/listinfo/cake ]( https://lists.bufferbloat.net/listinfo/cake )
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> [ http://www.teklibre.com ]( http://www.teklibre.com )
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> [ Cake@lists.bufferbloat.net ]( mailto:Cake@lists.bufferbloat.net )
> [ https://lists.bufferbloat.net/listinfo/cake ]( https://lists.bufferbloat.net/listinfo/cake )
>
[-- Attachment #2: Type: text/html, Size: 13526 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
@ 2020-05-03 15:06 1% David P. Reed
2020-05-04 17:04 1% ` Sergey Fedorov
` (2 more replies)
0 siblings, 3 replies; 200+ results
From: David P. Reed @ 2020-05-03 15:06 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Dave Täht, Michael Richardson, Make-Wifi-fast,
Jannie Hanekom, Cake List, Sergey Fedorov, bloat
[-- Attachment #1: Type: text/plain, Size: 9033 bytes --]
Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off the entry device that has the external IP address for the NAT gets most of the RTT measure, and if there's no queueing built up in the NAT device, that's a reasonable measure. But...
However, if the router has "taken up the queueing delay" by rate limiting its uplink traffic to slightly less than the capacity (as with Cake and other TC shaping that isn't as good as cake), then there is a queue in the TC layer itself. This is what concerns me as a distortion in the measurement that can fool one into thinking the TC shaper is doing a good job, when in fact, lag under load may be quite high from inside the routed domain (the home).
As you point out this unmeasured queueing delay can also be a problem with WiFi inside the home. But it isn't limited to that.
A badly set up shaping/congestion management subsystem inside the NAT can look "very good" in its echo of ICMP packets, but be terrible in response time to trivial HTTP requests from inside, or equally terrible in twitch games and video conferencing.
So, for example, for tuning settings with "Cake" it is useless.
To be fair, usually the Access Provider has no control of what is done after the cable is terminated at the home, so as a way to decide if the provider is badly engineering its side, a ping from a server is a reasonable quality measure of the provider.
But not a good measure of the user experience, and if the provider provides the NAT box, even if it has a good shaper in it, like Cake or fq_codel, it will just confuse the user and create the opportunity for a "finger pointing" argument where neither side understands what is going on.
This is why we need
1) a clear definition of lag under load that is from end-to-end in latency, and involves, ideally, independent traffic from multiple sources through the bottleneck.
2) ideally, a better way to localize where the queues are building up and present that to users and access providers. The flent graphs are not interpretable by most non-experts. What we need is a simple visualization of a sketch-map of the path (like traceroute might provide) with queueing delay measures shown at key points that the user can understand.
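The "sketch-map with queueing delay at key points" idea in point 2 can be illustrated with a toy computation. The hop names and millisecond figures below are hypothetical; the point is attributing the growth of (loaded minus unloaded) RTT to the segment where it first appears:

```python
# Given unloaded and loaded RTTs to each measurement point along a path
# (hypothetical numbers), attribute queueing delay to the segment where
# the loaded/unloaded gap grows -- a traceroute-like "queue map".
unloaded_ms = {"laptop->router": 1.0, "router->CPE": 2.0,
               "CPE->ISP": 8.0, "ISP->server": 11.0}
loaded_ms   = {"laptop->router": 1.5, "router->CPE": 2.5,
               "CPE->ISP": 25.0, "ISP->server": 28.0}

def queue_map(unloaded, loaded):
    """Per-segment added delay under load: the difference in
    (loaded - unloaded) RTT between consecutive measurement points."""
    bloat = {h: loaded[h] - unloaded[h] for h in unloaded}
    segments, prev = {}, 0.0
    for h, b in bloat.items():
        segments[h] = b - prev  # delay added at this segment only
        prev = b
    return segments

for hop, added in queue_map(unloaded_ms, loaded_ms).items():
    print(f"{hop:16s} +{added:.1f} ms under load")
```

With these made-up numbers the map pins almost all of the added delay on the CPE-to-ISP uplink segment, which is exactly the localization a non-expert-readable visualization would show.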
On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de> said:
> Hi David,
>
> in principle I agree, a NATed IPv4 ICMP probe will be at best reflected at the NAT
> router (CPE) (some commercial home gateways do not respond to ICMP echo requests
> in the name of security theatre). So it is pretty hard to measure the full end to
> end path in that configuration. I believe that IPv6 should make that
> easier/simpler in that NAT hopefully will be out of the path (but let's see what
> ingenuity ISPs will come up with).
> Then again, traditionally the relevant bottlenecks often are a) the internet
> access link itself and there the CPE is in a reasonable position as a reflector on
> the other side of the bottleneck as seen from an internet server, b) the home
> network between CPE and end-host, often with variable rate wifi, here I agree
> reflecting echos at the CPE hides part of the issue.
>
>
>
> > On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > I am still a bit worried about properly defining "latency under load" for a
> NAT routed situation. If the test is based on ICMP Ping packets *from the server*,
> it will NOT be measuring the full path latency, and if the potential congestion
> is in the uplink path from the access provider's residential box to the access
> provider's router/switch, it will NOT measure congestion caused by bufferbloat
> reliably on either side, since the bufferbloat will be outside the ICMP Ping
> path.
>
> Puzzled, as i believe it is going to be the residential box that will respond
> here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo requests?
>
> >
> > I realize that a browser based speed test has to be basically run from the
> "server" end, because browsers are not that good at time measurement on a packet
> basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a
> cooperative server.
> >
> > I once built a test that fixed this issue reasonably well. It carefully
> created a TCP based RTT measurement channel (over HTTP) that made the echo have to
> traverse the whole end-to-end path, which is the best and only way to accurately
> define lag under load from the user's perspective. The client end of an unloaded
> TCP connection can depend on TCP (properly prepared by getting it past slowstart)
> to generate a single packet response.
> >
> > This "TCP ping" is thus compatible with getting the end-to-end measurement on
> the server end of a true RTT.
> >
> > It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes
> into thinking this is a real, serious packet, not an optional low priority
> packet.
> >
> > The same issue comes up with non-browser-based techniques for measuring true
> lag-under-load.
> >
> > Now as we move HTTP to QUIC, this actually gets easier to do.
> >
> > One other opportunity I haven't explored, but which is pregnant with
> potential is the use of WebRTC, which runs over UDP internally. Since JavaScript
> has direct access to create WebRTC connections (multiple ones), this makes
> detailed testing in the browser quite reasonable.
> >
> > And the time measurements can resolve well below 100 microseconds, if the JS
> is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine
> code speed if the code is restricted and in a loop). Then again, there is Web
> Assembly if you want to write C code that runs in the browser fast. WebAssembly is
> a low level language that compiles to machine code in the browser execution, and
> still has access to all the browser networking facilities.
>
> Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to spectre
> side-channel vulnerabilities many browsers seemed to have lowered the timer
> resolution, but even the ~1ms resolution should be fine for typical RTTs.
>
> Best Regards
> Sebastian
>
> P.S.: I assume that I simply do not see/understand the full scope of the issue at
> hand yet.
>
>
> >
> > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
> said:
> >
> > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
> wrote:
> > > >
> > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency
> as ~7ms
> > >
> > > I guess one of my questions is that with a switch to BBR netflix is
> > > going to do pretty well. If fast.com is using bbr, well... that
> > > excludes much of the current side of the internet.
> > >
> > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload
> the loaded
> > > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
> any
> > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
> bloat would
> > > be nice.
> > >
> > > The tests do need to last a fairly long time.
> > >
> > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
> <jannie@hanekom.net>
> > > wrote:
> > > >>
> > > >> Michael Richardson <mcr@sandelman.ca>:
> > > >> > Does it find/use my nearest Netflix cache?
> > > >>
> > > >> Thankfully, it appears so. The DSLReports bloat test was
> interesting,
> > > but
> > > >> the jitter on the ~240ms base latency from South Africa (and
> other parts
> > > of
> > > >> the world) was significant enough that the figures returned
> were often
> > > >> unreliable and largely unusable - at least in my experience.
> > > >>
> > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency
> as ~7ms
> > > and
> > > >> mentions servers located in local cities. I finally have a test
> I can
> > > share
> > > >> with local non-technical people!
> > > >>
> > > >> (Agreed, upload test would be nice, but this is a huge step
> forward from
> > > >> what I had access to before.)
> > > >>
> > > >> Jannie Hanekom
> > > >>
> > > >> _______________________________________________
> > > >> Cake mailing list
> > > >> Cake@lists.bufferbloat.net
> > > >> https://lists.bufferbloat.net/listinfo/cake
> > > >
> > > > _______________________________________________
> > > > Cake mailing list
> > > > Cake@lists.bufferbloat.net
> > > > https://lists.bufferbloat.net/listinfo/cake
> > >
> > >
> > >
> > > --
> > > Make Music, Not War
> > >
> > > Dave Täht
> > > CTO, TekLibre, LLC
> > > http://www.teklibre.com
> > > Tel: 1-831-435-0729
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> > >
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
>
[-- Attachment #2: Type: text/html, Size: 13609 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-02 19:00 1% ` Sergey Fedorov
@ 2020-05-02 23:23 1% ` David P. Reed
2020-05-03 15:31 1% ` [Cake] fast.com quality David P. Reed
1 sibling, 0 replies; 200+ results
From: David P. Reed @ 2020-05-02 23:23 UTC (permalink / raw)
To: Sergey Fedorov
Cc: Dave Taht, Benjamin Cronce, Michael Richardson, Jannie Hanekom,
bloat, Cake List, Make-Wifi-fast
[-- Attachment #1: Type: text/plain, Size: 7376 bytes --]
Sergey - I wasn't assuming anything about fast.com. The document you shared wasn't clear about the methodology's details here. Others sadly, have actually used ICMP pings in the way I described. I was making a generic comment of concern.
That said, it sounds like what you are doing is really helpful (esp. given that your measure is aimed at end user experiential qualities).
Good luck!
On Saturday, May 2, 2020 3:00pm, "Sergey Fedorov" <sfedorov@netflix.com> said:
Dave, thanks for sharing interesting thoughts and context. I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
This erroneously assumes that [ fast.com ]( http://fast.com ) measures latency from the server side. It does not. The measurements are done from the client, over http, with a parallel connection(s) to the same or similar set of servers, by sending empty requests over a previously established connection (you can see that in the browser web inspector).
It should be noted that the value is not precisely the "RTT on a TCP/UDP flow that is loaded with traffic", but "user delay given the presence of heavy parallel flows". With that, some of the challenges you mentioned do not apply.
In line with another point I've shared earlier - the goal is to measure and explain the user experience, not to be a diagnostic tool showing internal transport metrics.
SERGEY FEDOROV
Director of Engineering
[ sfedorov@netflix.com ]( mailto:sfedorov@netflix.com )
121 Albright Way | Los Gatos, CA 95032
On Sat, May 2, 2020 at 10:38 AM David P. Reed <[ dpreed@deepplum.com ]( mailto:dpreed@deepplum.com )> wrote:
I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
I once built a test that fixed this issue reasonably well. It carefully created a TCP based RTT measurement channel (over HTTP) that made the echo have to traverse the whole end-to-end path, which is the best and only way to accurately define lag under load from the user's perspective. The client end of an unloaded TCP connection can depend on TCP (properly prepared by getting it past slowstart) to generate a single packet response.
This "TCP ping" is thus compatible with getting the end-to-end measurement on the server end of a true RTT.
It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes into thinking this is a real, serious packet, not an optional low priority packet.
The same issue comes up with non-browser-based techniques for measuring true lag-under-load.
Now as we move HTTP to QUIC, this actually gets easier to do.
One other opportunity I haven't explored, but which is pregnant with potential is the use of WebRTC, which runs over UDP internally. Since JavaScript has direct access to create WebRTC connections (multiple ones), this makes detailed testing in the browser quite reasonable.
And the time measurements can resolve well below 100 microseconds, if the JS is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine code speed if the code is restricted and in a loop). Then again, there is Web Assembly if you want to write C code that runs in the browser fast. WebAssembly is a low level language that compiles to machine code in the browser execution, and still has access to all the browser networking facilities.
On Saturday, May 2, 2020 12:52pm, "Dave Taht" <[ dave.taht@gmail.com ]( mailto:dave.taht@gmail.com )> said:
> On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <[ bcronce@gmail.com ]( mailto:bcronce@gmail.com )> wrote:
> >
> > > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
>
> I guess one of my questions is that with a switch to BBR netflix is
> going to do pretty well. If [ fast.com ]( http://fast.com ) is using bbr, well... that
> excludes much of the current side of the internet.
>
> > For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded
> shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using any
> traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the bloat would
> be nice.
>
> The tests do need to last a fairly long time.
>
> > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <[ jannie@hanekom.net ]( mailto:jannie@hanekom.net )>
> wrote:
> >>
> >> Michael Richardson <[ mcr@sandelman.ca ]( mailto:mcr@sandelman.ca )>:
> >> > Does it find/use my nearest Netflix cache?
> >>
> >> Thankfully, it appears so. The DSLReports bloat test was interesting,
> but
> >> the jitter on the ~240ms base latency from South Africa (and other parts
> of
> >> the world) was significant enough that the figures returned were often
> >> unreliable and largely unusable - at least in my experience.
> >>
> >> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> and
> >> mentions servers located in local cities. I finally have a test I can
> share
> >> with local non-technical people!
> >>
> >> (Agreed, upload test would be nice, but this is a huge step forward from
> >> what I had access to before.)
> >>
> >> Jannie Hanekom
> >>
> >> _______________________________________________
> >> Cake mailing list
> >> [ Cake@lists.bufferbloat.net ]( mailto:Cake@lists.bufferbloat.net )
> >> [ https://lists.bufferbloat.net/listinfo/cake ]( https://lists.bufferbloat.net/listinfo/cake )
> >
> > _______________________________________________
> > Cake mailing list
> > [ Cake@lists.bufferbloat.net ]( mailto:Cake@lists.bufferbloat.net )
> > [ https://lists.bufferbloat.net/listinfo/cake ]( https://lists.bufferbloat.net/listinfo/cake )
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> [ http://www.teklibre.com ]( http://www.teklibre.com )
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> [ Cake@lists.bufferbloat.net ]( mailto:Cake@lists.bufferbloat.net )
> [ https://lists.bufferbloat.net/listinfo/cake ]( https://lists.bufferbloat.net/listinfo/cake )
>
[-- Attachment #2: Type: text/html, Size: 13135 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-02 17:38 1% ` David P. Reed
2020-05-02 19:00 1% ` Sergey Fedorov
@ 2020-05-02 20:19 0% ` Sebastian Moeller
1 sibling, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-02 20:19 UTC (permalink / raw)
To: David P. Reed
Cc: Dave Täht, Michael Richardson, Make-Wifi-fast,
Jannie Hanekom, Cake List, Sergey Fedorov, bloat
Hi David,
in principle I agree, a NATed IPv4 ICMP probe will be at best reflected at the NAT router (CPE) (some commercial home gateways do not respond to ICMP echo requests in the name of security theatre). So it is pretty hard to measure the full end to end path in that configuration. I believe that IPv6 should make that easier/simpler in that NAT hopefully will be out of the path (but let's see what ingenuity ISPs will come up with).
Then again, traditionally the relevant bottlenecks often are a) the internet access link itself and there the CPE is in a reasonable position as a reflector on the other side of the bottleneck as seen from an internet server, b) the home network between CPE and end-host, often with variable rate wifi, here I agree reflecting echos at the CPE hides part of the issue.
> On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
>
> I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
Puzzled, as i believe it is going to be the residential box that will respond here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo requests?
>
> I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
>
> I once built a test that fixed this issue reasonably well. It carefully created a TCP based RTT measurement channel (over HTTP) that made the echo have to traverse the whole end-to-end path, which is the best and only way to accurately define lag under load from the user's perspective. The client end of an unloaded TCP connection can depend on TCP (properly prepared by getting it past slowstart) to generate a single packet response.
>
> This "TCP ping" is thus compatible with getting the end-to-end measurement on the server end of a true RTT.
>
> It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes into thinking this is a real, serious packet, not an optional low priority packet.
>
> The same issue comes up with non-browser-based techniques for measuring true lag-under-load.
>
> Now as we move HTTP to QUIC, this actually gets easier to do.
>
> One other opportunity I haven't explored, but which is pregnant with potential is the use of WebRTC, which runs over UDP internally. Since JavaScript has direct access to create WebRTC connections (multiple ones), this makes detailed testing in the browser quite reasonable.
>
> And the time measurements can resolve well below 100 microseconds, if the JS is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine code speed if the code is restricted and in a loop). Then again, there is Web Assembly if you want to write C code that runs in the browser fast. WebAssembly is a low level language that compiles to machine code in the browser execution, and still has access to all the browser networking facilities.
Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to spectre side-channel vulnerabilities many browsers seemed to have lowered the timer resolution, but even the ~1ms resolution should be fine for typical RTTs.
Best Regards
Sebastian
P.S.: I assume that I simply do not see/understand the full scope of the issue at hand yet.
>
> On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com> said:
>
> > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com> wrote:
> > >
> > > > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> >
> > I guess one of my questions is that with a switch to BBR netflix is
> > going to do pretty well. If fast.com is using bbr, well... that
> > excludes much of the current side of the internet.
> >
> > > For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded
> > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using any
> > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the bloat would
> > be nice.
> >
> > The tests do need to last a fairly long time.
> >
> > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie@hanekom.net>
> > wrote:
> > >>
> > >> Michael Richardson <mcr@sandelman.ca>:
> > >> > Does it find/use my nearest Netflix cache?
> > >>
> > >> Thankfully, it appears so. The DSLReports bloat test was interesting,
> > but
> > >> the jitter on the ~240ms base latency from South Africa (and other parts
> > of
> > >> the world) was significant enough that the figures returned were often
> > >> unreliable and largely unusable - at least in my experience.
> > >>
> > >> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> > and
> > >> mentions servers located in local cities. I finally have a test I can
> > share
> > >> with local non-technical people!
> > >>
> > >> (Agreed, upload test would be nice, but this is a huge step forward from
> > >> what I had access to before.)
> > >>
> > >> Jannie Hanekom
> > >>
> > >> _______________________________________________
> > >> Cake mailing list
> > >> Cake@lists.bufferbloat.net
> > >> https://lists.bufferbloat.net/listinfo/cake
> > >
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> >
> >
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> >
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-02 17:38 1% ` David P. Reed
@ 2020-05-02 19:00 1% ` Sergey Fedorov
2020-05-02 23:23 1% ` David P. Reed
2020-05-03 15:31 1% ` [Cake] fast.com quality David P. Reed
2020-05-02 20:19 0% ` [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free Sebastian Moeller
1 sibling, 2 replies; 200+ results
From: Sergey Fedorov @ 2020-05-02 19:00 UTC (permalink / raw)
To: David P. Reed
Cc: Dave Taht, Benjamin Cronce, Michael Richardson, Jannie Hanekom,
bloat, Cake List, Make-Wifi-fast
[-- Attachment #1: Type: text/plain, Size: 6670 bytes --]
Dave, thanks for sharing interesting thoughts and context.
> I am still a bit worried about properly defining "latency under load" for
> a NAT routed situation. If the test is based on ICMP Ping packets *from the
> server*, it will NOT be measuring the full path latency, and if the
> potential congestion is in the uplink path from the access provider's
> residential box to the access provider's router/switch, it will NOT measure
> congestion caused by bufferbloat reliably on either side, since the
> bufferbloat will be outside the ICMP Ping path.
>
> I realize that a browser based speed test has to be basically run from the
> "server" end, because browsers are not that good at time measurement on a
> packet basis. However, there are ways to solve this and avoid the ICMP Ping
> issue, with a cooperative server.
This erroneously assumes that fast.com measures latency from the server
side. It does not. The measurements are done from the client, over http,
with a parallel connection(s) to the same or similar set of servers, by
sending empty requests over a previously established connection (you can
see that in the browser web inspector).
It should be noted that the value is not precisely the "RTT on a
TCP/UDP flow that is loaded with traffic", but "user delay given the
presence of heavy parallel flows". With that, some of the challenges you
mentioned do not apply.
In line with another point I've shared earlier - the goal is to measure and
explain the user experience, not to be a diagnostic tool showing internal
transport metrics.
SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032
On Sat, May 2, 2020 at 10:38 AM David P. Reed <dpreed@deepplum.com> wrote:
> I am still a bit worried about properly defining "latency under load" for
> a NAT routed situation. If the test is based on ICMP Ping packets *from the
> server*, it will NOT be measuring the full path latency, and if the
> potential congestion is in the uplink path from the access provider's
> residential box to the access provider's router/switch, it will NOT measure
> congestion caused by bufferbloat reliably on either side, since the
> bufferbloat will be outside the ICMP Ping path.
>
>
>
> I realize that a browser based speed test has to be basically run from the
> "server" end, because browsers are not that good at time measurement on a
> packet basis. However, there are ways to solve this and avoid the ICMP Ping
> issue, with a cooperative server.
>
>
>
> I once built a test that fixed this issue reasonably well. It carefully
> created a TCP based RTT measurement channel (over HTTP) that made the echo
> have to traverse the whole end-to-end path, which is the best and only way
> to accurately define lag under load from the user's perspective. The client
> end of an unloaded TCP connection can depend on TCP (properly prepared by
> getting it past slowstart) to generate a single packet response.
>
>
>
> This "TCP ping" is thus compatible with getting the end-to-end measurement
> on the server end of a true RTT.
>
>
>
> It's like the tcp-traceroute tool, in that it tricks any middleboxes
> into thinking this is a real, serious packet, not an optional low
> priority packet.
>
>
>
> The same issue comes up with non-browser-based techniques for measuring
> true lag-under-load.
>
>
>
> Now as we move HTTP to QUIC, this actually gets easier to do.
>
>
>
> One other opportunity I haven't explored, but which is pregnant with
> potential is the use of WebRTC, which runs over UDP internally. Since
> JavaScript has direct access to create WebRTC connections (multiple ones),
> this makes detailed testing in the browser quite reasonable.
>
>
>
> And the time measurements can resolve well below 100 microseconds, if the
> JS is based on modern JIT compilation (Chrome, Firefox, Edge all compile to
> machine code speed if the code is restricted and in a loop). Then again,
> there is WebAssembly if you want to write C code that runs in the browser
> fast. WebAssembly is a low level language that compiles to machine code in
> the browser execution, and still has access to all the browser networking
> facilities.
>
>
>
> On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com> said:
>
> > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
> wrote:
> > >
> > > > Fast.com reports my unloaded latency as 4ms, my loaded latency as
> ~7ms
> >
> > I guess one of my questions is that with a switch to BBR netflix is
> > going to do pretty well. If fast.com is using bbr, well... that
> > excludes much of the current side of the internet.
> >
> > > For download, I show 6ms unloaded and 6-7 loaded. But for upload the
> loaded
> > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
> any
> > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
> bloat would
> > be nice.
> >
> > The tests do need to last a fairly long time.
> >
> > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie@hanekom.net>
> > wrote:
> > >>
> > >> Michael Richardson <mcr@sandelman.ca>:
> > >> > Does it find/use my nearest Netflix cache?
> > >>
> > >> Thankfully, it appears so. The DSLReports bloat test was interesting,
> > but
> > >> the jitter on the ~240ms base latency from South Africa (and other
> parts
> > of
> > >> the world) was significant enough that the figures returned were often
> > >> unreliable and largely unusable - at least in my experience.
> > >>
> > >> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> > and
> > >> mentions servers located in local cities. I finally have a test I can
> > share
> > >> with local non-technical people!
> > >>
> > >> (Agreed, upload test would be nice, but this is a huge step forward
> from
> > >> what I had access to before.)
> > >>
> > >> Jannie Hanekom
> > >>
> > >> _______________________________________________
> > >> Cake mailing list
> > >> Cake@lists.bufferbloat.net
> > >> https://lists.bufferbloat.net/listinfo/cake
> > >
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> >
> >
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> >
>
[-- Attachment #2: Type: text/html, Size: 11006 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-02 16:52 1% ` Dave Taht
@ 2020-05-02 17:38 1% ` David P. Reed
2020-05-02 19:00 1% ` Sergey Fedorov
2020-05-02 20:19 0% ` [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free Sebastian Moeller
0 siblings, 2 replies; 200+ results
From: David P. Reed @ 2020-05-02 17:38 UTC (permalink / raw)
To: Dave Taht
Cc: Benjamin Cronce, Michael Richardson, Jannie Hanekom, bloat,
Cake List, Sergey Fedorov, Make-Wifi-fast
[-- Attachment #1: Type: text/plain, Size: 4600 bytes --]
I am still a bit worried about properly defining "latency under load" for a NAT routed situation. If the test is based on ICMP Ping packets *from the server*, it will NOT be measuring the full path latency, and if the potential congestion is in the uplink path from the access provider's residential box to the access provider's router/switch, it will NOT measure congestion caused by bufferbloat reliably on either side, since the bufferbloat will be outside the ICMP Ping path.
I realize that a browser based speed test has to be basically run from the "server" end, because browsers are not that good at time measurement on a packet basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a cooperative server.
I once built a test that fixed this issue reasonably well. It carefully created a TCP based RTT measurement channel (over HTTP) that made the echo have to traverse the whole end-to-end path, which is the best and only way to accurately define lag under load from the user's perspective. The client end of an unloaded TCP connection can depend on TCP (properly prepared by getting it past slowstart) to generate a single packet response.
This "TCP ping" is thus compatible with getting the end-to-end measurement on the server end of a true RTT.
It's like the tcp-traceroute tool, in that it tricks any middleboxes into thinking this is a real, serious packet, not an optional low-priority packet.
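A rough sketch of the "TCP ping" idea, under stated assumptions: the host, port, and path are placeholders, Python's http.client stands in for the original tool, and the warm-up below merely establishes the connection rather than pushing it past slow start with bulk data as Reed describes:

```python
import http.client
import time

def tcp_ping(host, port=80, path="/", samples=5):
    """Approximate the full-path RTT by timing tiny HTTP requests on an
    already-established TCP connection (a "TCP ping").  Because the
    probe rides a real TCP flow, NATs and middleboxes on the path treat
    it as ordinary traffic, unlike an ICMP echo."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    # Warm-up request: establishes the connection before timing starts.
    conn.request("GET", path)
    conn.getresponse().read()
    rtts_ms = []
    for _ in range(samples):
        t0 = time.perf_counter()
        conn.request("GET", path)   # small request, intended to fit one packet
        conn.getresponse().read()   # small response back
        rtts_ms.append((time.perf_counter() - t0) * 1000.0)
    conn.close()
    return rtts_ms
```

Because the probe is an ordinary request on an established connection, it traverses the whole end-to-end path, which is the property the "TCP ping" relies on.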
The same issue comes up with non-browser-based techniques for measuring true lag-under-load.
Now as we move HTTP to QUIC, this actually gets easier to do.
One other opportunity I haven't explored, but which is pregnant with potential is the use of WebRTC, which runs over UDP internally. Since JavaScript has direct access to create WebRTC connections (multiple ones), this makes detailed testing in the browser quite reasonable.
And the time measurements can resolve well below 100 microseconds, if the JS is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine code speed if the code is restricted and in a loop). Then again, there is WebAssembly if you want to write C code that runs in the browser fast. WebAssembly is a low-level language that compiles to machine code in the browser, and still has access to all the browser networking facilities.
On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com> said:
> On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com> wrote:
> >
> > > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
>
> I guess one of my questions is that with a switch to BBR netflix is
> going to do pretty well. If fast.com is using bbr, well... that
> excludes much of the current side of the internet.
>
> > For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded
> shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using any
> traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the bloat would
> be nice.
>
> The tests do need to last a fairly long time.
>
> > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie@hanekom.net>
> wrote:
> >>
> >> Michael Richardson <mcr@sandelman.ca>:
> >> > Does it find/use my nearest Netflix cache?
> >>
> >> Thankfully, it appears so. The DSLReports bloat test was interesting,
> but
> >> the jitter on the ~240ms base latency from South Africa (and other parts
> of
> >> the world) was significant enough that the figures returned were often
> >> unreliable and largely unusable - at least in my experience.
> >>
> >> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> and
> >> mentions servers located in local cities. I finally have a test I can
> share
> >> with local non-technical people!
> >>
> >> (Agreed, upload test would be nice, but this is a huge step forward from
> >> what I had access to before.)
> >>
> >> Jannie Hanekom
> >>
> >> _______________________________________________
> >> Cake mailing list
> >> Cake@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/cake
> >
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
[-- Attachment #2: Type: text/html, Size: 7316 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
2020-05-02 16:37 1% ` Benjamin Cronce
@ 2020-05-02 16:52 1% ` Dave Taht
2020-05-02 17:38 1% ` David P. Reed
0 siblings, 1 reply; 200+ results
From: Dave Taht @ 2020-05-02 16:52 UTC (permalink / raw)
To: Benjamin Cronce
Cc: Jannie Hanekom, Cake List, Sergey Fedorov, Make-Wifi-fast,
Michael Richardson, bloat
On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com> wrote:
>
> > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
I guess one of my questions is that with a switch to BBR netflix is
going to do pretty well. If fast.com is using bbr, well... that
excludes much of the current side of the internet.
> For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using any traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the bloat would be nice.
The tests do need to last a fairly long time.
> On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie@hanekom.net> wrote:
>>
>> Michael Richardson <mcr@sandelman.ca>:
>> > Does it find/use my nearest Netflix cache?
>>
>> Thankfully, it appears so. The DSLReports bloat test was interesting, but
>> the jitter on the ~240ms base latency from South Africa (and other parts of
>> the world) was significant enough that the figures returned were often
>> unreliable and largely unusable - at least in my experience.
>>
>> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms and
>> mentions servers located in local cities. I finally have a test I can share
>> with local non-technical people!
>>
>> (Agreed, upload test would be nice, but this is a huge step forward from
>> what I had access to before.)
>>
>> Jannie Hanekom
>>
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free
@ 2020-05-02 16:37 1% ` Benjamin Cronce
2020-05-02 16:52 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Benjamin Cronce @ 2020-05-02 16:37 UTC (permalink / raw)
To: Jannie Hanekom
Cc: Michael Richardson, Cake List, Sergey Fedorov, bloat, Make-Wifi-fast
[-- Attachment #1: Type: text/plain, Size: 1246 bytes --]
> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
For download, I show 6ms unloaded and 6-7 loaded. But for upload the loaded
shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
any traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
bloat would be nice.
On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie@hanekom.net> wrote:
> Michael Richardson <mcr@sandelman.ca>:
> > Does it find/use my nearest Netflix cache?
>
> Thankfully, it appears so. The DSLReports bloat test was interesting, but
> the jitter on the ~240ms base latency from South Africa (and other parts of
> the world) was significant enough that the figures returned were often
> unreliable and largely unusable - at least in my experience.
>
> Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms and
> mentions servers located in local cities. I finally have a test I can
> share
> with local non-technical people!
>
> (Agreed, upload test would be nice, but this is a huge step forward from
> what I had access to before.)
>
> Jannie Hanekom
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
[-- Attachment #2: Type: text/html, Size: 1875 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Bloat] [Cake] dslreports is no longer free
@ 2020-05-01 23:35 1% ` Sergey Fedorov
1 sibling, 0 replies; 200+ results
From: Sergey Fedorov @ 2020-05-01 23:35 UTC (permalink / raw)
To: Michael Richardson
Cc: Sebastian Moeller, Cake List, Make-Wifi-fast, cerowrt-devel, bloat
[-- Attachment #1: Type: text/plain, Size: 1917 bytes --]
Hi Michael,
This blog post <https://netflixtechblog.com/building-fast-com-4857fe0f8adb>
describes how
the test steers to the server(s).
Noted on the other thread, I hope to add the url param option reasonably
soon.
SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032
On Fri, May 1, 2020 at 3:07 PM Michael Richardson <mcr@sandelman.ca> wrote:
>
> {Do I need all the lists?}
>
> Sergey Fedorov via Bloat <bloat@lists.bufferbloat.net> wrote:
> > Just a note that I have a plan to separate the loaded latency into
> > upload/download. It's not great UX now, the way it's implemented.
> > The timeline view is a bit more nuanced, in the spirit of the
> simplistic
> > UX, but I've been thinking on a good way to show that for super
> users as
> > well.
> > Two latency numbers - that's more user friendly, we want the general
> user
> > to understand the meaning. And latency under load is much easier than
> > bufferbloat.
>
> > As a side note, if our backend is decent, I'm curious what are the
> backends
> > for the speed tests that exist that are great :)
>
> Does it find/use my nearest Netflix cache?
>
> As others asked, it would be great if we could put the settings into a URL,
> and having the "latency under upload" is probably the most important number
> that people trying to videoconference need to know.
>
> (it's also the thing that they can mostly directly/cheaply fix)
>
> > SERGEY FEDOROV
> > Director of Engineering
> > sfedorov@netflix.com
> > 121 Albright Way | Los Gatos, CA 95032
>
> Very happy that you are looped in here.
>
> --
> ] Never tell me the odds! | ipv6 mesh
> networks [
> ] Michael Richardson, Sandelman Software Works | IoT
> architect [
> ] mcr@sandelman.ca http://www.sandelman.ca/ | ruby on
> rails [
>
>
[-- Attachment #2: Type: text/html, Size: 4039 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Bloat] [Cake] dslreports is no longer free
2020-05-01 21:11 0% ` [Cake] [Bloat] " Sebastian Moeller
@ 2020-05-01 21:37 1% ` Sergey Fedorov
0 siblings, 0 replies; 200+ results
From: Sergey Fedorov @ 2020-05-01 21:37 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Dave Täht, Cake List, Make-Wifi-fast, cerowrt-devel, bloat
[-- Attachment #1: Type: text/plain, Size: 8302 bytes --]
Thanks for the kind words, Sebastian!
+1; for normal users that is already bliss. For de-bloating a link however
> a bit more time resolution generally makes things a bit easier to reason
> about ;)
Apologies, I misunderstood your original statement. I interpreted it as a
vote to keep a single bufferbloat metric (vs loaded/unloaded latency).
Agreed on time resolution and its value. No question it's useful for
diagnostics. Open question is to what extent browser-based tools should be
used for detailed troubleshooting (due to sandboxing limitations), and when
is the time for the big guns (like flent) to enter the scene.
I like to talk about the latency-under-load-increase when helping people
> to debloat their links, but that also is a tad on the long side.
Fully agree on length, don't like the verboseness as well. Still looking
for a term that is shorter and yet generic enough that I can explain to my
mom.
Ah, I might have tried too hard at understatement, this was the only
> back-end worth mentioning in the "pros" section...
Got it. The breitbandmessung case is indeed interesting.
SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032
On Fri, May 1, 2020 at 2:11 PM Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Sergey,
>
>
>
> > On May 1, 2020, at 22:09, Sergey Fedorov <sfedorov@netflix.com> wrote:
> >
> > Great review, Sebastian!
> >
> > NETFLIX: fast.com.
> > Pros: allows selection of upload testing, supposedly decent
> back-end, duration configurable
> > allows unloaded, loaded download and loaded upload RTT
> measurements (but reports single numbers for loaded and unloaded RTT, that
> are not the max)
> > Cons: RTT report as two numbers one for the loaded and one for
> unloaded RTT, time-course of RTTs missing
> > BUFFERBLOAT verdict: incomplete, but oh, so close...
> > Just a note that I have a plan to separate the loaded latency into
> upload/download. It's not great UX now, the way it's implemented.
>
> Great! I really appreciate the way fast.com evolves carefully to
> not confuse the intended users and to stay true to its core mission while
> it still gains additional features that are not directly part of Netflix's
> business case to operate that test in the first place. Don't get me wrong,
> I absolutely love that I can easily understand why you should be interested
> in getting reliable robust speedtests from all existing or potential
> customers to your back-end; and unlike an ISP's internal speedtest, you are
> not likely to sugar coat things ;) as your goal and the end-user's goal are
> fully aligned.
>
> > The timeline view is a bit more nuanced, in the spirit of the simplistic
> UX, but I've been thinking on a good way to show that for super users as
> well.
>
> Great again! I see the beauty of keeping things simple while maybe
> hiding optional information behind an additional "click".
>
> > Two latency numbers - that's more user friendly, we want the general
> user to understand the meaning.
>
> +1; for normal users that is already bliss. For de-bloating a link
> however a bit more time resolution generally makes things a bit easier to
> reason about ;)
>
> > And latency under load is much easier than bufferbloat.
>
> +1; as far as I can tell that term sort of was a decent
> description of the observed phenomenon that then got a life of its own; in
> retrospect it was not the most self explanatory term. I like to talk about
> the latency-under-load-increase when helping people to debloat their links,
> but that also is a tad on the long side.
>
> >
> > As a side note, if our backend is decent, I'm curious what are the
> backends for the speed tests that exist that are great :)
>
> Ah, I might have tried too hard at understatement, this was the
> only back-end worth mentioning in the "pros" section...
> (well, I also like how breitbandmessung.de deals with their purposefully
> limited backend (all located in a single data center in Germany, in
> an AS that is not directly owned by any ISP; it's the German regulator's
> official speedtest for Germany, against which we can effectively measure and
> get an early exit from contracts if the ISPs can not deliver the contracted
> rates (with a bit of slack)))
>
> Best Regards
> Sebastian
>
> >
> > SERGEY FEDOROV
> > Director of Engineering
> > sfedorov@netflix.com
> > 121 Albright Way | Los Gatos, CA 95032
> >
> >
> >
> > On Fri, May 1, 2020 at 12:48 PM Sebastian Moeller <moeller0@gmx.de>
> wrote:
> > Hi Dave,
> >
> > well, it was a free service and it lasted a long time. I want to raise a
> toast to Justin and convey my sincere thanks for years of investing into
> the "good" of the internet.
> >
> > Now, the question is which test is going to be the rightful successor?
> >
> > Short of running netperf/irtt/iperf2/iperf3 on a hosted server, I see
> lots of potential but none of the tests are really there yet (grievances in
> no particular order):
> >
> > OOKLA: speedtest.net.
> >         Pros: ubiquitous, allows selection of single flow versus
> multi-flow test, allows server selection
> > Cons: only IPv4, only static unloaded RTT measurement, no
> control over measurement duration
> > BUFFERBLOAT verdict: incomplete, maybe usable as load generator
> >
> >
> > NETFLIX: fast.com.
> > Pros: allows selection of upload testing, supposedly decent
> back-end, duration configurable
> > allows unloaded, loaded download and loaded upload RTT
> measurements (but reports single numbers for loaded and unloaded RTT, that
> are not the max)
> > Cons: RTT report as two numbers one for the loaded and one for
> unloaded RTT, time-course of RTTs missing
> > BUFFERBLOAT verdict: incomplete, but oh, so close...
> >
> >
> > NPERF: nperf.com
> > Pros: allows server selection, RTT measurement and report as
> time course, also reports average rates and static RTT/jitter for Up- and
> Download
> > Cons: RTT measurement for unloaded only, reported RTT static
> only, no control over measurement duration
> > BUFFERBLOAT verdict: incomplete,
> >
> >
> > THINKBROADBAND: www.thinkbroadband.com/speedtest
> > Pros: IPv6, reports coarse RTT time courses for all three
> measurement phases
> > Cons: only static unloaded RTT report in final results, time
> courses only visible immediately after testing, no control over measurement
> duration
> > BUFFERBLOAT verdict: a bit coarse, might work for users within a
> reasonable distance to the UK for acute de-bloating sessions (history
> reporting is bad though)
> >
> >
> > honorable mentioning:
> > BREITBANDMESSUNG: breitbandmessung.de
> > Pros: query of contracted internet access speed before
> measurement, with a scheduler that will only start a test when the backend
> has sufficient capacity to saturate the user-supplied contracted rates,
> IPv6 (happy-eyeballs)
> > Cons: only static unloaded RTT measurement, no control over
> measurement duration
> BUFFERBLOAT verdict: unsuitable, except as load generator, but
> the bandwidth reservation feature is quite nice.
> >
> > Best Regards
> > Sebastian
> >
> >
> > > On May 1, 2020, at 18:44, Dave Taht <dave.taht@gmail.com> wrote:
> > >
> > >
> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed_test_no_longer_free/
> > >
> > > They ran out of bandwidth.
> > >
> > > Message to users here:
> > >
> > > http://www.dslreports.com/speedtest
> > >
> > >
> > > --
> > > Make Music, Not War
> > >
> > > Dave Täht
> > > CTO, TekLibre, LLC
> > > http://www.teklibre.com
> > > Tel: 1-831-435-0729
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> >
> > _______________________________________________
> > Bloat mailing list
> > Bloat@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/bloat
>
>
[-- Attachment #2: Type: text/html, Size: 12423 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Bloat] dslreports is no longer free
2020-05-01 20:09 1% ` [Bloat] " Sergey Fedorov
@ 2020-05-01 21:11 0% ` Sebastian Moeller
2020-05-01 21:37 1% ` [Bloat] [Cake] " Sergey Fedorov
0 siblings, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-05-01 21:11 UTC (permalink / raw)
To: Sergey Fedorov
Cc: Dave Täht, Cake List, Make-Wifi-fast, cerowrt-devel, bloat
Hi Sergey,
> On May 1, 2020, at 22:09, Sergey Fedorov <sfedorov@netflix.com> wrote:
>
> Great review, Sebastian!
>
> NETFLIX: fast.com.
> Pros: allows selection of upload testing, supposedly decent back-end, duration configurable
> allows unloaded, loaded download and loaded upload RTT measurements (but reports sinlge numbers for loaded and unloaded RTT, that are not the max)
> Cons: RTT report as two numbers one for the loaded and one for unloaded RTT, time-course of RTTs missing
> BUFFERBLOAT verdict: incomplete, but oh, so close...
> Just a note that I have a plan to separate the loaded latency into upload/download. It's not great UX now, the way it's implemented.
	Great! I really appreciate the way fast.com evolves carefully to not confuse the intended users and to stay true to its core mission while it still gains additional features that are not directly part of Netflix's business case to operate that test in the first place. Don't get me wrong, I absolutely love that I can easily understand why you should be interested in getting reliable robust speedtests from all existing or potential customers to your back-end; and unlike an ISP's internal speedtest, you are not likely to sugar coat things ;) as your goal and the end-user's goal are fully aligned.
> The timeline view is a bit more nuanced, in the spirit of the simplistic UX, but I've been thinking on a good way to show that for super users as well.
Great again! I see the beauty of keeping things simple while maybe hiding optional information behind an additional "click".
> Two latency numbers - that's more user friendly, we want the general user to understand the meaning.
+1; for normal users that is already bliss. For de-bloating a link however a bit more time resolution generally makes things a bit easier to reason about ;)
> And latency under load is much easier than bufferbloat.
+1; as far as I can tell that term sort of was a decent description of the observed phenomenon that then got a life of its own; in retrospect it was not the most self explanatory term. I like to talk about the latency-under-load-increase when helping people to debloat their links, but that also is a tad on the long side.
>
> As a side note, if our backend is decent, I'm curious what are the backends for the speed tests that exist that are great :)
Ah, I might have tried too hard at understatement, this was the only back-end worth mentioning in the "pros" section...
(well, I also like how breitbandmessung.de deals with their purposefully limited backend (all located in a single data center in Germany, in an AS that is not directly owned by any ISP; it's the German regulator's official speedtest for Germany, against which we can effectively measure and get an early exit from contracts if the ISPs cannot deliver the contracted rates (with a bit of slack)))
Best Regards
Sebastian
>
> SERGEY FEDOROV
> Director of Engineering
> sfedorov@netflix.com
> 121 Albright Way | Los Gatos, CA 95032
>
>
>
> On Fri, May 1, 2020 at 12:48 PM Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Dave,
>
> well, it was a free service and it lasted a long time. I want to raise a toast to Justin and convey my sincere thanks for years of investing into the "good" of the internet.
>
> Now, the question is which test is going to be the rightful successor?
>
> Short of running netperf/irtt/iperf2/iperf3 on a hosted server, I see lots of potential but none of the tests are really there yet (grievances in no particular order):
>
> OOKLA: speedtest.net.
> OOKLA: speedtest.net.
>         Pros: ubiquitous, allows selection of single flow versus multi-flow test, allows server selection
> Cons: only IPv4, only static unloaded RTT measurement, no control over measurement duration
> BUFFERBLOAT verdict: incomplete, maybe usable as load generator
>
>
> NETFLIX: fast.com.
> Pros: allows selection of upload testing, supposedly decent back-end, duration configurable
>         allows unloaded, loaded download and loaded upload RTT measurements (but reports single numbers for loaded and unloaded RTT, that are not the max)
> Cons: RTT report as two numbers one for the loaded and one for unloaded RTT, time-course of RTTs missing
> BUFFERBLOAT verdict: incomplete, but oh, so close...
>
>
> NPERF: nperf.com
> Pros: allows server selection, RTT measurement and report as time course, also reports average rates and static RTT/jitter for Up- and Download
>         Cons: RTT measurement for unloaded only, reported RTT static only, no control over measurement duration
> BUFFERBLOAT verdict: incomplete,
>
>
> THINKBROADBAND: www.thinkbroadband.com/speedtest
> Pros: IPv6, reports coarse RTT time courses for all three measurement phases
> Cons: only static unloaded RTT report in final results, time courses only visible immediately after testing, no control over measurement duration
> BUFFERBLOAT verdict: a bit coarse, might work for users within a reasonable distance to the UK for acute de-bloating sessions (history reporting is bad though)
>
>
> honorable mentioning:
> BREITBANDMESSUNG: breitbandmessung.de
> Pros: query of contracted internet access speed before measurement, with a scheduler that will only start a test when the backend has sufficient capacity to saturate the user-supplied contracted rates, IPv6 (happy-eyeballs)
> Cons: only static unloaded RTT measurement, no control over measurement duration
>         BUFFERBLOAT verdict: unsuitable, except as load generator, but the bandwidth reservation feature is quite nice.
>
> Best Regards
> Sebastian
>
>
> > On May 1, 2020, at 18:44, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed_test_no_longer_free/
> >
> > They ran out of bandwidth.
> >
> > Message to users here:
> >
> > http://www.dslreports.com/speedtest
> >
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
^ permalink raw reply [relevance 0%]
* Re: [Bloat] [Cake] dslreports is no longer free
2020-05-01 19:48 0% ` [Cake] " Sebastian Moeller
@ 2020-05-01 20:09 1% ` Sergey Fedorov
2020-05-01 21:11 0% ` [Cake] [Bloat] " Sebastian Moeller
[not found] ` <mailman.170.1588363787.24343.bloat@lists.bufferbloat.net>
2020-05-27 9:08 0% ` [Cake] " Matthew Ford
2 siblings, 1 reply; 200+ results
From: Sergey Fedorov @ 2020-05-01 20:09 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Dave Täht, Cake List, Make-Wifi-fast, cerowrt-devel, bloat
[-- Attachment #1: Type: text/plain, Size: 4738 bytes --]
Great review, Sebastian!
> NETFLIX: fast.com.
> Pros: allows selection of upload testing, supposedly decent
> back-end, duration configurable
> allows unloaded, loaded download and loaded upload RTT
> measurements (but reports sinlge numbers for loaded and unloaded RTT, that
> are not the max)
> Cons: RTT report as two numbers one for the loaded and one for
> unloaded RTT, time-course of RTTs missing
> BUFFERBLOAT verdict: incomplete, but oh, so close...
Just a note that I have a plan to separate the loaded latency into
upload/download. It's not great UX now, the way it's implemented.
The timeline view is a bit more nuanced, in the spirit of the simplistic
UX, but I've been thinking on a good way to show that for super users as
well.
Two latency numbers - that's more user friendly, we want the general user
to understand the meaning. And latency under load is much easier than
bufferbloat.
As a side note, if our backend is decent, I'm curious what are the backends
for the speed tests that exist that are great :)
SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032
On Fri, May 1, 2020 at 12:48 PM Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Dave,
>
> well, it was a free service and it lasted a long time. I want to raise a
> toast to Justin and convey my sincere thanks for years of investing into
> the "good" of the internet.
>
> Now, the question is which test is going to be the rightful successor?
>
> Short of running netperf/irtt/iperf2/iperf3 on a hosted server, I see lots
> of potential but none of the tests are really there yet (grievances in no
> particular order):
>
> OOKLA: speedtest.net.
> Pros: ubiquitous, allows selection of single flow versus
> multi-flow test, allows server selection
> Cons: only IPv4, only static unloaded RTT measurement, no control
> over measurement duration
> BUFFERBLOAT verdict: incomplete, maybe usable as load generator
>
>
> NETFLIX: fast.com.
> Pros: allows selection of upload testing, supposedly decent
> back-end, duration configurable
> allows unloaded, loaded download and loaded upload RTT
> measurements (but reports single numbers for loaded and unloaded RTT, that
> are not the max)
> Cons: RTT report as two numbers one for the loaded and one for
> unloaded RTT, time-course of RTTs missing
> BUFFERBLOAT verdict: incomplete, but oh, so close...
>
>
> NPERF: nperf.com
> Pros: allows server selection, RTT measurement and report as time
> course, also reports average rates and static RTT/jitter for Up- and
> Download
> Cons: RTT measurement for unloaded only, reported RTT static only
> , no control over measurement duration
> BUFFERBLOAT verdict: incomplete,
>
>
> THINKBROADBAND: www.thinkbroadband.com/speedtest
> Pros: IPv6, reports coarse RTT time courses for all three
> measurement phases
> Cons: only static unloaded RTT report in final results, time
> courses only visible immediately after testing, no control over measurement
> duration
> BUFFERBLOAT verdict: a bit coarse, might work for users within a
> reasonable distance to the UK for acute de-bloating sessions (history
> reporting is bad though)
>
>
> honorable mentioning:
> BREITBANDMESSUNG: breitbandmessung.de
> Pros: query of contracted internet access speed before
> measurement, with a scheduler that will only start a test when the backend
> has sufficient capacity to saturate the user-supplied contracted rates,
> IPv6 (happy-eyeballs)
> Cons: only static unloaded RTT measurement, no control over
> measurement duration
> BUFFERBLOAT verdict: unsuitable, except as load generator, but the
> bandwidth reservation feature is quite nice.
>
> Best Regards
> Sebastian
>
>
> > On May 1, 2020, at 18:44, Dave Taht <dave.taht@gmail.com> wrote:
> >
> >
> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed_test_no_longer_free/
> >
> > They ran out of bandwidth.
> >
> > Message to users here:
> >
> > http://www.dslreports.com/speedtest
> >
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
>
[-- Attachment #2: Type: text/html, Size: 7843 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Bloat] dslreports is no longer free
2020-05-01 19:34 1% ` [Cake] [Bloat] " Kenneth Porter
@ 2020-05-01 19:54 0% ` Sebastian Moeller
0 siblings, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-05-01 19:54 UTC (permalink / raw)
To: Kenneth Porter; +Cc: bloat, Cake List
Hi Kenneth,
Run netperf & irtt servers and then use flent (www.flent.org), which drives netperf/irtt, as your measurement tool. You will get high-quality bufferbloat measurements for little effort.
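For concreteness, a minimal flent run might look like this (the server hostname and test length are placeholders; this assumes netperf and irtt servers are already reachable on that host):

```
# Run the RRUL test (bidirectional load plus latency probes) for 60s
flent rrul -H netperf.example.com -l 60 -t "baseline"

# Browse the latency-under-load time courses from the saved data file
flent --gui rrul-*.flent.gz
```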
Best Regards
Sebastian
> On May 1, 2020, at 21:34, Kenneth Porter <shiva@sewingwitch.com> wrote:
>
> --On Friday, May 01, 2020 10:44 AM -0700 Dave Taht <dave.taht@gmail.com> wrote:
>
>> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed
>> _test_no_longer_free/
>>
>> They ran out of bandwidth.
>>
>> Message to users here:
>>
>> http://www.dslreports.com/speedtest
>
> Is there an open source speedtest of comparable quality and usability? I could run one on my Linode for friends and family.
>
>
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
^ permalink raw reply [relevance 0%]
* Re: [Cake] dslreports is no longer free
2020-05-01 19:34 1% ` [Cake] [Bloat] " Kenneth Porter
@ 2020-05-01 19:48 0% ` Sebastian Moeller
2020-05-01 20:09 1% ` [Bloat] " Sergey Fedorov
` (2 more replies)
1 sibling, 3 replies; 200+ results
From: Sebastian Moeller @ 2020-05-01 19:48 UTC (permalink / raw)
To: Dave Täht; +Cc: bloat, cerowrt-devel, Make-Wifi-fast, Cake List
Hi Dave,
well, it was a free service and it lasted a long time. I want to raise a toast to Justin and convey my sincere thanks for years of investing into the "good" of the internet.
Now, the question is which test is going to be the rightful successor?
Short of running netperf/irtt/iperf2/iperf3 on a hosted server, I see lots of potential but none of the tests are really there yet (grievances in no particular order):
OOKLA: speedtest.net.
Pros: ubiquitous, allows selection of single flow versus multi-flow test, allows server selection
Cons: only IPv4, only static unloaded RTT measurement, no control over measurement duration
BUFFERBLOAT verdict: incomplete, maybe usable as load generator
NETFLIX: fast.com.
Pros: allows selection of upload testing, supposedly decent back-end, duration configurable
allows unloaded, loaded download and loaded upload RTT measurements (but reports single numbers for loaded and unloaded RTT, that are not the max)
Cons: RTT report as two numbers one for the loaded and one for unloaded RTT, time-course of RTTs missing
BUFFERBLOAT verdict: incomplete, but oh, so close...
NPERF: nperf.com
Pros: allows server selection, RTT measurement and report as time course, also reports average rates and static RTT/jitter for Up- and Download
Cons: RTT measurement for unloaded only, reported RTT static only , no control over measurement duration
BUFFERBLOAT verdict: incomplete,
THINKBROADBAND: www.thinkbroadband.com/speedtest
Pros: IPv6, reports coarse RTT time courses for all three measurement phases
Cons: only static unloaded RTT report in final results, time courses only visible immediately after testing, no control over measurement duration
BUFFERBLOAT verdict: a bit coarse, might work for users within a reasonable distance to the UK for acute de-bloating sessions (history reporting is bad though)
honorable mentioning:
BREITBANDMESSUNG: breitbandmessung.de
Pros: query of contracted internet access speed before measurement, with a scheduler that will only start a test when the backend has sufficient capacity to saturate the user-supplied contracted rates, IPv6 (happy-eyeballs)
Cons: only static unloaded RTT measurement, no control over measurement duration
BUFFERBLOAT verdict: unsuitable, except as load generator, but the bandwidth reservation feature is quite nice.
Best Regards
Sebastian
> On May 1, 2020, at 18:44, Dave Taht <dave.taht@gmail.com> wrote:
>
> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed_test_no_longer_free/
>
> They ran out of bandwidth.
>
> Message to users here:
>
> http://www.dslreports.com/speedtest
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Bloat] dslreports is no longer free
@ 2020-05-01 19:34 1% ` Kenneth Porter
2020-05-01 19:54 0% ` Sebastian Moeller
2020-05-01 19:48 0% ` [Cake] " Sebastian Moeller
1 sibling, 1 reply; 200+ results
From: Kenneth Porter @ 2020-05-01 19:34 UTC (permalink / raw)
To: bloat, Cake List
--On Friday, May 01, 2020 10:44 AM -0700 Dave Taht <dave.taht@gmail.com>
wrote:
> https://www.reddit.com/r/HomeNetworking/comments/gbd6g0/dsl_reports_speed
> _test_no_longer_free/
>
> They ran out of bandwidth.
>
> Message to users here:
>
> http://www.dslreports.com/speedtest
Is there an open source speedtest of comparable quality and usability? I
could run one on my Linode for friends and family.
^ permalink raw reply [relevance 1%]
* [Cake] Fwd: [Babel-users] OT: Centralised WebRTC server available for testing
[not found] <87368jc3df.wl-jch@irif.fr>
@ 2020-05-01 16:11 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-05-01 16:11 UTC (permalink / raw)
To: cerowrt-devel, Make-Wifi-fast, Cake List, bloat
I kind of have an ulterior motive for tracking this work, as A) the
codebase is new, small and not crufty and B) juliusz
sometimes bothers to answer my emails, and C) I don't really have a
grip on the state of webrtc congestion control.
I used to really enjoy dinking with videoconferencing tools. I'll be
getting this server up somewhere, by and by.
---------- Forwarded message ---------
From: Juliusz Chroboczek <jch@irif.fr>
Date: Fri, May 1, 2020 at 4:16 AM
Subject: [Babel-users] OT: Centralised WebRTC server available for testing
To: babel-users <babel-users@lists.alioth.debian.org>
Hi, and sorry for abusing this list for another off-topic post.
Some of you may remember the peer-to-peer WebRTC server I advertised a few
weeks ago. While it is still what I recommend for one-on-one conversations
(peer-to-peer is good), I've been working on a centralised solution for
larger groups of people.
We've just tested yesterday a meeting with 12 participants (12 incoming
and 132 outgoing flows), and it held. So I guess I might as well make it
available. The demo server is on
https://vps-63c87489.vps.ovh.net:8443/
It is described on
https://www.irif.fr/~jch/software/sfu/
The code is available, and will be licensed under a Free license when I'm
ready.
-- Juliusz
_______________________________________________
Babel-users mailing list
Babel-users@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] [Cerowrt-devel] intel gives up on home gateways
@ 2020-04-28 22:22 1% ` Joel Wirāmu Pauling
0 siblings, 0 replies; 200+ results
From: Joel Wirāmu Pauling @ 2020-04-28 22:22 UTC (permalink / raw)
To: Dave Taht; +Cc: bloat, cerowrt-devel, Make-Wifi-fast, Cake List
[-- Attachment #1: Type: text/plain, Size: 891 bytes --]
Good riddance; most of the current platforms are using Lantiq Wireless
which they got from some acquisition/cross-licence deal and are poorly
documented at best and completely inoperable at worst.
I doubt this will impact any of their Industrial IO and Edge stuff, which is
actually good and which anyone sensible might base their design on.
-Joel
On Wed, 29 Apr 2020 at 04:57, Dave Taht <dave.taht@gmail.com> wrote:
> H/T sebastian:
>
>
> https://investors.maxlinear.com/press-releases/detail/395/maxlinear-to-acquire-intels-home-gateway-platform
>
> Gawd knows what this means.
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
[-- Attachment #2: Type: text/html, Size: 2107 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake tin behaviour - discuss....
2020-04-26 13:53 1% ` David P. Reed
@ 2020-04-27 11:52 0% ` Kevin Darbyshire-Bryant
0 siblings, 0 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-27 11:52 UTC (permalink / raw)
To: David P. Reed; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 2764 bytes --]
> On 26 Apr 2020, at 14:53, David P. Reed <dpreed@deepplum.com> wrote:
>
> Very interesting. However, I'm curious about what is being "ping'ed" from outside.
>
> I would bet that the ping comes in on your router interface and is reflected immediately back. Which would mean that it might not at all be going through the Cake layer. That depends on the details of your setup, which you didn't share.
The address being pinged from the external ‘ping box’ is that of the globally routable IPv6 WAN interface on my APU2 router. The ping packet is going through 2 instances of cake, one on ingress (ifb4eth0), one on egress (eth0).
DSCP is applied to the packets by tc filter action act_ctinfo JUST before cake gets to see the packets. I know DSCP is affecting cake tin selection because I see cake’s tin byte/packet counters adjust accordingly. icmp/icmpv6 traffic is marked as BE by default AND also explicitly by some ip’n’tables rules that set it so.
> As you probably know, Cake works by packet shaping in the box where it runs, in the Linux stack. If the ping responder is on the ISP side of Cake, it will not be measuring lag-under-load *inside* cake.
I think I answered that above, however just for good measure, I’ve set up another ‘ping latency’ test to a box that is definitely on my LAN side, so it’ll go: ingress (cake) eth0 (wan) -> egress eth1 (lan) -> switches -> device under test -> ingress eth1 (lan) -> egress (cake) eth0 (wan)
Note that the DSCP applied by cake on egress is ignored by the ISP. Similarly, it’s a very rare thing to see a non 0 DSCP come in from them. I’m using DSCP ‘internally’ purely to provide CAKE with some traffic identification and hence clue as to how to shape it.
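For reference, the ingress side of such a setup looks roughly like this (interface names and mark/statemask values are illustrative; a rough sketch of act_ctinfo usage, not my exact rules):

```
# Restore the DSCP previously saved into the conntrack mark
# (first argument is the dscp mask within the mark, second is
# the statemask flag bit), then hand the packet to cake on the
# ifb ingress device:
tc filter add dev eth0 parent ffff: protocol all matchall \
    action ctinfo dscp 0xfc000000 0x01000000 \
    action mirred egress redirect dev ifb4eth0
```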
>
> End-to-end lag-under-load on multiple paths sharing a bottleneck is the problem Cake was invented to solve. (Jonathan - you agree?) Yes, it will move that congestion "inside" itself, pulling it out of the bottleneck itself. There it drops and ECN's "as if" the bottleneck were working correctly, rather than being "bufferbloated".
>
> So it would be interesting to learn more about the topology of your test to interpret this ping. A more interesting ping would be along the fujl path that the other flows are taking. Your ISP can't provide that.
My question was trying to determine what cake was doing:
bandwidth / per host fairness / tin weighting or
bandwidth / tin weighting / per host fairness
I was expecting the latter and Jonathan has confirmed my expectation to be the correct one. The results I saw under some circumstance appeared more toward the former, which boggled the mind.
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Bloat] New board that looks interesting
@ 2020-04-27 2:45 1% ` Dave Taht
2020-12-18 23:48 1% ` Aaron Wood
1 sibling, 1 reply; 200+ results
From: Dave Taht @ 2020-04-27 2:45 UTC (permalink / raw)
To: Aaron Wood; +Cc: Cake List, David P. Reed, Make-Wifi-fast, bloat
anyone got around to hacking on this board yet?
On Sat, Apr 4, 2020 at 9:27 AM Aaron Wood <woody77@gmail.com> wrote:
>
> The comparison of chipset performance link (to OpenWrt forums) that went out had this chip, the J4105, as the fastest. Able to do a gigabit with cake (nearly able to do it in both directions).
>
> I think this has replaced the apu2 as the board I’m going with as my edge router.
>
> On Sat, Apr 4, 2020 at 9:10 AM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> Historically I've found the "Celeron" chips rather weak, but it's just
>> a brand. I haven't the foggiest idea how well this variant will
>> perform.
>>
>> The intel ethernet chips are best of breed in linux, however. It's
>> been my hope that the 211 variant with the timed networking support
>> would show up in the field (sch_etx) so we could fiddle with that,
>> (the apu2s aren't using that version) but I cannot for the life of me
>> remember the right keywords to look it up at the moment. this feature
>> lets you program when a packet emerges from the driver and is sort of
>> a whole new ballgame when it comes to scheduling - there hasn't been
>> an aqm designed for it, and you can do fq by playing tricks with the
>> sent timestamp.
>>
>> All the other features look rather nice on this board.
>>
>> On Sat, Apr 4, 2020 at 7:47 AM David P. Reed <dpreed@deepplum.com> wrote:
>> >
>> > Thanks! I ordered one just now. In my experience, this company does rather neat stuff. Their XMOS based microphone array (ReSpeaker) is really useful. What's the state of play in Linux/OpenWRT for Intel 9560 capabilities regarding AQM?
>> >
>> > On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com> said:
>> >
>> > > _______________________________________________
>> > > Cake mailing list
>> > > Cake@lists.bufferbloat.net
>> > > https://lists.bufferbloat.net/listinfo/cake
>> > > https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
>> > >
>> > > quad-core Celeron J4105 1.5-2.5 GHz x64
>> > > 8GB Ram
>> > > 2x i211t intel ethernet controllers
>> > > intel 9560 802.11ac (wave2) wifi/bluetooth chipset
>> > > intel built-in graphics
>> > > onboard ARM Cortex-M0 and RPi & Arduino headers
>> > > m.2 and PCIe adapters
>> > > <$200
>> > >
>> >
>> >
>> > _______________________________________________
>> > Bloat mailing list
>> > Bloat@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/bloat
>>
>>
>>
>> --
>> Make Music, Not War
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-435-0729
>
> --
> - Sent from my iPhone.
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake tin behaviour - discuss....
2020-04-25 21:31 1% ` Kevin Darbyshire-Bryant
@ 2020-04-26 13:53 1% ` David P. Reed
2020-04-27 11:52 0% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: David P. Reed @ 2020-04-26 13:53 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 1838 bytes --]
Very interesting. However, I'm curious about what is being "ping'ed" from outside.
I would bet that the ping comes in on your router interface and is reflected immediately back. Which would mean that it might not at all be going through the Cake layer. That depends on the details of your setup, which you didn't share.
As you probably know, Cake works by packet shaping in the box where it runs, in the Linux stack. If the ping responder is on the ISP side of Cake, it will not be measuring lag-under-load *inside* cake.
End-to-end lag-under-load on multiple paths sharing a bottleneck is the problem Cake was invented to solve. (Jonathan - you agree?) Yes, it will move that congestion "inside" itself, pulling it out of the bottleneck itself. There it drops and ECN's "as if" the bottleneck were working correctly, rather than being "bufferbloated".
So it would be interesting to learn more about the topology of your test to interpret this ping. A more interesting ping would be along the fujl path that the other flows are taking. Your ISP can't provide that.
On Saturday, April 25, 2020 5:31pm, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
>
>
> > On 25 Apr 2020, at 21:56, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Question: what's the "lag under load" experienced when these two loads are
> filling the capacity of the bottleneck router (the DSL link)?
> > I'm wondering whether your cake setup is deliberately building up a big queue
> within the router for any of the 10 bulk/best efforts flows.
>
> https://www.thinkbroadband.com/broadband/monitoring/quality/share/3dec809ecef5cd52f574b6be5da9af28326845d6-25-04-2020
>
> I don’t reckon it’s bad for the past 24 hours, one peak at 50ms. Avg
> latency increase by about 6ms during load.
>
>
>
[-- Attachment #2: Type: text/html, Size: 3359 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake tin behaviour - discuss....
2020-04-25 20:56 1% ` David P. Reed
@ 2020-04-25 21:31 1% ` Kevin Darbyshire-Bryant
2020-04-26 13:53 1% ` David P. Reed
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-25 21:31 UTC (permalink / raw)
To: David P. Reed; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 607 bytes --]
> On 25 Apr 2020, at 21:56, David P. Reed <dpreed@deepplum.com> wrote:
>
> Question: what's the "lag under load" experienced when these two loads are filling the capacity of the bottleneck router (the DSL link)?
> I'm wondering whether your cake setup is deliberately building up a big queue within the router for any of the 10 bulk/best efforts flows.
https://www.thinkbroadband.com/broadband/monitoring/quality/share/3dec809ecef5cd52f574b6be5da9af28326845d6-25-04-2020
I don’t reckon it’s bad for the past 24 hours, one peak at 50ms. Avg latency increase by about 6ms during load.
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake tin behaviour - discuss....
2020-04-25 20:34 0% ` Kevin Darbyshire-Bryant
@ 2020-04-25 20:56 1% ` David P. Reed
2020-04-25 21:31 1% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: David P. Reed @ 2020-04-25 20:56 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 3193 bytes --]
Question: what's the "lag under load" experienced when these two loads are filling the capacity of the bottleneck router (the DSL link)?
I'm wondering whether your cake setup is deliberately building up a big queue within the router for any of the 10 bulk/best efforts flows.
On Saturday, April 25, 2020 4:34pm, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
>
> > On 25 Apr 2020, at 16:25, Jonathan Morton <chromatix99@gmail.com>
> wrote:
> >
> >> On 25 Apr, 2020, at 2:07 pm, Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
> >>
> >> Download from ‘onedrive’ from 1 box, using 5 flows,
> classified as Bulk. Little other traffic going on, sits there at circa 70Mbit, no
> problem.
> >>
> >> If I started another download on another box, say 5 flows, classified as
> Best Effort, what rates would you expect the Bulk & Best effort tins to flow at?
> >
> > Approximately speaking, Cake should give the Best Effort traffic priority
> over Bulk, until the latter is squashed down to its tin's capacity. So you may
> see 5-10Mbps of Bulk and 65-70Mbps of Best Effort, depending on some short-term
> effects.
> >
> > This assumes that the Diffserv marking actually reaches Cake, of course.
>
> Thanks Jonathan. I can assure you diffserv markings are reaching cake both egress
> & ingress due to my pet ‘act_ctinfo/connmark -savedscp’ project.
> Amongst other monitoring methods a simple 'watch -t tc -s qdisc show dev $1’
> albeit with a slightly modified cake module & tc to report per tin traffic as a
> percentage of total & per tin % of threshold is used.
>
> eg:
> Bulk Best Effort Video Voice
> thresh 4812Kbit 77Mbit 38500Kbit 19250Kbit
> target 5.0ms 5.0ms 5.0ms 5.0ms
> interval 100.0ms 100.0ms 100.0ms 100.0ms
> pk_delay 961us 167us 311us 164us
> av_delay 453us 78us 141us 75us
> sp_delay 51us 12us 17us 9us
> backlog 9084b 0b 0b 0b
> pkts 60618617 2006708 460725 11129
> bytes 91414263264 2453185010 636385583 5205008
> traffic% 89 0 0 0
> traftin% 1435 0 0 0
> way_inds 2703134 8957 169 111
> way_miss 922 6192 104 525
> way_cols 0 0 0 0
> drops 8442 230 37 0
> marks 5 0 0 0
> ack_drop 0 0 0 0
> sp_flows 2 3 1 3
> bk_flows 1 0 0 0
> un_flows 0 0 0 0
> max_len 66616 12112 9084 3360
> quantum 300 1514 1174 587
>
> Your expectation is that Best Effort would exert downward pressure on Bulk traffic
> reducing bulk traffic to about bulk threshold level which is my expectation also.
> Tin priority then host (fairness), then flow.
>
> As you may have guessed, that’s not quite what I’m seeing but as
> I’ve managed to see the issue when using ‘flowblind’ I am now much
> less inclined to point the finger at host fairness & friends. I remain confused
> why ‘bulk’ is exceeding its allocation though in what should be
> pressure from best effort but it ends up going all over the place and being a bit
> unstable. Odd.
>
> BTW: The ‘onedrive’ client box is actually running linux.
>
>
[-- Attachment #2: Type: text/html, Size: 4347 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake tin behaviour - discuss....
2020-04-25 15:25 0% ` Jonathan Morton
@ 2020-04-25 20:34 0% ` Kevin Darbyshire-Bryant
2020-04-25 20:56 1% ` David P. Reed
0 siblings, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-25 20:34 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 3350 bytes --]
> On 25 Apr 2020, at 16:25, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 25 Apr, 2020, at 2:07 pm, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>>
>> Download from ‘onedrive’ from 1 box, using 5 flows, classified as Bulk. Little other traffic going on, sits there at circa 70Mbit, no problem.
>>
>> If I started another download on another box, say 5 flows, classified as Best Effort, what rates would you expect the Bulk & Best effort tins to flow at?
>
> Approximately speaking, Cake should give the Best Effort traffic priority over Bulk, until the latter is squashed down to its tin's capacity. So you may see 5-10Mbps of Bulk and 65-70Mbps of Best Effort, depending on some short-term effects.
>
> This assumes that the Diffserv marking actually reaches Cake, of course.
Thanks Jonathan. I can assure you diffserv markings are reaching cake both egress & ingress due to my pet ‘act_ctinfo/connmark -savedscp’ project. Amongst other monitoring methods a simple 'watch -t tc -s qdisc show dev $1’ albeit with a slightly modified cake module & tc to report per tin traffic as a percentage of total & per tin % of threshold is used.
eg:
Bulk Best Effort Video Voice
thresh 4812Kbit 77Mbit 38500Kbit 19250Kbit
target 5.0ms 5.0ms 5.0ms 5.0ms
interval 100.0ms 100.0ms 100.0ms 100.0ms
pk_delay 961us 167us 311us 164us
av_delay 453us 78us 141us 75us
sp_delay 51us 12us 17us 9us
backlog 9084b 0b 0b 0b
pkts 60618617 2006708 460725 11129
bytes 91414263264 2453185010 636385583 5205008
traffic% 89 0 0 0
traftin% 1435 0 0 0
way_inds 2703134 8957 169 111
way_miss 922 6192 104 525
way_cols 0 0 0 0
drops 8442 230 37 0
marks 5 0 0 0
ack_drop 0 0 0 0
sp_flows 2 3 1 3
bk_flows 1 0 0 0
un_flows 0 0 0 0
max_len 66616 12112 9084 3360
quantum 300 1514 1174 587
Your expectation is that Best Effort would exert downward pressure on Bulk traffic reducing bulk traffic to about bulk threshold level which is my expectation also. Tin priority then host (fairness), then flow.
As you may have guessed, that’s not quite what I’m seeing, but as I’ve managed to see the issue even when using ‘flowblind’, I am now much less inclined to point the finger at host fairness & friends. I remain confused why ‘bulk’ is exceeding its allocation under what should be pressure from best effort; it ends up going all over the place and being a bit unstable. Odd.
BTW: The ‘onedrive’ client box is actually running linux.
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] cake on linux 5.6 32 bit x86 might be broken
2020-04-25 19:59 0% ` Jonathan Morton
@ 2020-04-25 20:05 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-04-25 20:05 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List
thx I'm an idiot sometimes.
On Sat, Apr 25, 2020 at 1:00 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> > On 25 Apr, 2020, at 10:09 pm, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > ~# tc qdisc add dev eth1 root cake bandwidth 160mbps
>
> For tc, the "mbps" suffix is interpreted as megaBYTES per second. For megaBITS, use Mbit.
>
> The output and behaviour is consistent with that.
>
> - Jonathan Morton
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] cake on linux 5.6 32 bit x86 might be broken
2020-04-25 19:09 1% ` [Cake] cake on linux 5.6 32 bit x86 might be broken Dave Taht
2020-04-25 19:19 1% ` Y
@ 2020-04-25 19:59 0% ` Jonathan Morton
2020-04-25 20:05 1% ` Dave Taht
1 sibling, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-25 19:59 UTC (permalink / raw)
To: Dave Taht; +Cc: Cake List
> On 25 Apr, 2020, at 10:09 pm, Dave Taht <dave.taht@gmail.com> wrote:
>
> ~# tc qdisc add dev eth1 root cake bandwidth 160mbps
For tc, the "mbps" suffix is interpreted as megaBYTES per second. For megaBITS, use Mbit.
The output and behaviour is consistent with that.
- Jonathan Morton
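Jonathan's point accounts for the numbers exactly; as a quick sketch of the arithmetic (an illustration, not tc's actual parser):

```python
def tc_rate_to_mbit(value: float, suffix: str) -> float:
    """Convert a tc-style rate suffix to megabits per second.

    tc reads "mbps" as megaBYTES per second (x8 to get bits),
    while "mbit"/"Mbit" means megabits directly.
    """
    s = suffix.lower()
    if s == "mbps":    # megabytes per second
        return value * 8
    if s == "mbit":    # megabits per second
        return value
    raise ValueError(f"unhandled suffix: {suffix}")

# "bandwidth 160mbps" is read as 160 MB/s = 1280 Mbit/s
print(tc_rate_to_mbit(160, "mbps"))  # 1280
```

That 1280 is exactly the "bandwidth 1280Mbit" shown in the qdisc dump, so the kernel parsed the command as given rather than misparsing it.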
^ permalink raw reply [relevance 0%]
* Re: [Cake] cake on linux 5.6 32 bit x86 might be broken
2020-04-25 19:09 1% ` [Cake] cake on linux 5.6 32 bit x86 might be broken Dave Taht
@ 2020-04-25 19:19 1% ` Y
2020-04-25 19:59 0% ` Jonathan Morton
1 sibling, 0 replies; 200+ results
From: Y @ 2020-04-25 19:19 UTC (permalink / raw)
To: cake
Hi,Dave
uname -r
5.4.34-1-MANJARO
Cake seeems to work.
Yutaka.
On Sat, 25 Apr 2020 12:09:49 -0700
Dave Taht <dave.taht@gmail.com> wrote:
> the bandwidth parameter is mis-parsed. this is stock kernel, stock 5.6 iproute2
> looks like an alignment bug. Anyone running x86 on 32 bit? anyone
> running this kernel on anything 32bit?
>
> is 5.4 ok?
>
> ---------- Forwarded message ---------
> From: elided
> Date: Fri, Apr 24, 2020 at 7:57 PM
> Subject: Re: PSA pt 1: for better videoconferencing at home on slow links
> To: Dave Taht <dave.taht@gmail.com>
>
>
> Yo Dave!
>
> One step forward, two back.
>
> Linux kernel 5.6.7.
>
> ~# tc -V
> tc utility, iproute2-ss200330
>
> This started cake:
>
> ~# tc qdisc add dev eth1 root cake bandwidth 160mbps
>
> But, my score of B for bufferbloat fell to a C!
>
> http://www.dslreports.com/speedtest/62801576
>
> cake seems to be ignoring my bandwidth:
>
> catbert:~# tc -s qdisc show dev eth1
> qdisc cake 8005: root refcnt 2 bandwidth 1280Mbit diffserv3
> triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms raw
> overhead 0
> Sent 591223311 bytes 562082 pkt (dropped 0, overlimits 666624 requeues 0)
> backlog 0b 0p requeues 0
> memory used: 377936b of 15140Kb
> capacity estimate: 1280Mbit
> min/max network layer size: 42 / 1514
> min/max overhead-adjusted size: 42 / 1514
> average network hdr offset: 14
>
> Bulk Best Effort Voice
> thresh 80Mbit 1280Mbit 320Mbit
> target 5.0ms 5.0ms 5.0ms
> interval 100.0ms 100.0ms 100.0ms
> pk_delay 0us 8us 6us
> av_delay 0us 3us 3us
> sp_delay 0us 1us 1us
> backlog 0b 0b 0b
> pkts 0 447439 114643
> bytes 0 580864114 10359197
> way_inds 0 42159 2526
> way_miss 0 3931 87597
> way_cols 0 11 0
> drops 0 0 0
> marks 0 0 0
> ack_drop 0 0 0
> sp_flows 0 247 46
> bk_flows 0 1 0
> un_flows 0 0 0
> max_len 0 36144 998
> quantum 1514 1514 1514
>
> Back to reading...
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
Y <intruder_tkyf@yahoo.fr>
^ permalink raw reply [relevance 1%]
* [Cake] cake on linux 5.6 32 bit x86 might be broken
[not found] ` <20200424195745.72d725bd@rellim.com>
@ 2020-04-25 19:09 1% ` Dave Taht
2020-04-25 19:19 1% ` Y
2020-04-25 19:59 0% ` Jonathan Morton
0 siblings, 2 replies; 200+ results
From: Dave Taht @ 2020-04-25 19:09 UTC (permalink / raw)
To: Cake List
The bandwidth parameter is mis-parsed. This is a stock kernel, with stock 5.6 iproute2.
Looks like an alignment bug. Anyone running x86 on 32 bit? Anyone
running this kernel on anything 32-bit?
Is 5.4 OK?
---------- Forwarded message ---------
From: elided
Date: Fri, Apr 24, 2020 at 7:57 PM
Subject: Re: PSA pt 1: for better videoconferencing at home on slow links
To: Dave Taht <dave.taht@gmail.com>
Yo Dave!
One step forward, two back.
Linux kernel 5.6.7.
~# tc -V
tc utility, iproute2-ss200330
This started cake:
~# tc qdisc add dev eth1 root cake bandwidth 160mbps
But, my score of B for bufferbloat fell to a C!
http://www.dslreports.com/speedtest/62801576
cake seems to be ignoring my bandwidth:
catbert:~# tc -s qdisc show dev eth1
qdisc cake 8005: root refcnt 2 bandwidth 1280Mbit diffserv3
triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms raw
overhead 0
Sent 591223311 bytes 562082 pkt (dropped 0, overlimits 666624 requeues 0)
backlog 0b 0p requeues 0
memory used: 377936b of 15140Kb
capacity estimate: 1280Mbit
min/max network layer size: 42 / 1514
min/max overhead-adjusted size: 42 / 1514
average network hdr offset: 14
                  Bulk  Best Effort        Voice
  thresh        80Mbit     1280Mbit      320Mbit
  target         5.0ms        5.0ms        5.0ms
  interval     100.0ms      100.0ms      100.0ms
  pk_delay         0us          8us          6us
  av_delay         0us          3us          3us
  sp_delay         0us          1us          1us
  backlog           0b           0b           0b
  pkts               0       447439       114643
  bytes              0    580864114     10359197
  way_inds           0        42159         2526
  way_miss           0         3931        87597
  way_cols           0           11            0
  drops              0            0            0
  marks              0            0            0
  ack_drop           0            0            0
  sp_flows           0          247           46
  bk_flows           0            1            0
  un_flows           0            0            0
  max_len            0        36144          998
  quantum         1514         1514         1514
Back to reading...
^ permalink raw reply [relevance 1%]
* Re: [Cake] Cake tin behaviour - discuss....
2020-04-25 15:14 1% ` David P. Reed
@ 2020-04-25 15:25 0% ` Jonathan Morton
2020-04-25 20:34 0% ` Kevin Darbyshire-Bryant
1 sibling, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-25 15:25 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
> On 25 Apr, 2020, at 2:07 pm, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> Download from ‘onedrive’ from 1 box, using 5 flows, classified as Bulk. Little other traffic going on, sits there at circa 70Mbit, no problem.
>
> If I started another download on another box, say 5 flows, classified as Best Effort, what rates would you expect the Bulk & Best effort tins to flow at?
Approximately speaking, Cake should give the Best Effort traffic priority over Bulk, until the latter is squashed down to its tin's capacity. So you may see 5-10Mbps of Bulk and 65-70Mbps of Best Effort, depending on some short-term effects.
This assumes that the Diffserv marking actually reaches Cake, of course.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Cake tin behaviour - discuss....
@ 2020-04-25 15:14 1% ` David P. Reed
2020-04-25 15:25 0% ` Jonathan Morton
1 sibling, 0 replies; 200+ results
From: David P. Reed @ 2020-04-25 15:14 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 5143 bytes --]
I'll bite.
Assuming a lot of things (you seem to be a Microsoft user, given OneDrive, so your OS's network stack isn't necessarily very good at all).
35/35 split.
Why?
There are no "bursts" in the fundamental flows (disk transfer rates are far higher than 80 Mb/sec., so the only burstiness would come from OS schedulers on either end).
There should be next to zero queueing in the bottleneck, and without queue depth, best effort and bulk are happy to sync to that share, and are stable once sawtoothing around an average of 35.
What's more important is what studying this teaches us:
1. diffserv only makes a difference when queues are allowed to build in switches/routers. But the whole goal of cake is to make the queues zero length.
2. TCP's optimal state is to adjust rate to ensure that there is no queueing delay *inside* the network. (Well, Kleinrock says it should be just under 1 packet's worth of delay at the bottleneck router, and a small fraction of 1 packet on each router that is not bottlenecked.)
3. In terms of "end to end" control, diffserv is about the worst possible mechanism for creating "differentiated service". It is based on a very old (pre-Jacobson AIMD) idea about inter-networking sharing of capacity. Because Van finally demonstrated (unfortunately it never penetrated the thick skulls of the transport-layer folks who invented diffserv) that when *sharing* a path's capacity, the work has to be done at the *endpoints* - simply adjusting the window size on the receive end can do it - to slow the sending rate to the point where the network buffering drops to a mean of < 1 packet at the bottleneck.
4. diffserv is an example of attempting to "put a function in the network" that cannot be provided fully (or much at all) by the network equipment. The function (differentiating service quality) requires attention at the TCP level, not the IP layer, with the receive end and the transmit end cooperating. As one of the creators of The End-to-end argument, this is why I continue to be frustrated at the whole "diffserv" effort, which has wasted decades of sporadic research projects, all failing. My co-author, Dave Clark, has an equally strong critique of diffserv - which is that there is no actual quantitative and *universal* definition of all its "code points" across all AS's in the network. And there never will be because of commercial considerations - even if there were *only* two code points for performance (high and low), there is no way to provide pricing incentives for routers to follow those definitions in ANY algorithm they use.
5. There is paradoxically intense interest among *router vendors* and network operators in any "feature" they can sell that claims to improve game performance or create "very low latency" priceable service. You can see this in the current "5G" pitches about being able to robotically do telesurgery with sub-millisecond latency (faster than the speed of light, note, but the marketers don't care about truth), merely because they have "5G magic". To have a differentiation for your company's Brand, all you have to say is "I support diffserv", and the rubes will buy it. It doesn't work, but you can blame it on the fact that the problem is the other networks on the path, not your fancy routers.
6. If you have a dozen independent flows through a particular router, most likely those flows will be between pairs that do not, and pragmatically cannot, know anything about the other flows sharing the bottleneck. Yet to achieve differentiation among flows, somehow each flow must adjust its *own* rate to share *unequally* with the other flows.
There are 0.000 bits/second of information shared between the distinct flows about their differentiated service requirements.
7. Every time I or others have pointed out that diffserv cannot work, we get met with nasty, very nasty personal attacks. I even wrote a short paper about it, which was killed by the referees (apparently diffserv fanboys). So we generally have just waited for the idea to die.
But it just won't die. It has never worked. But that just makes people want to imagine it will work if they only hold their breath real deep and wish.
On Saturday, April 25, 2020 7:07am, "Kevin Darbyshire-Bryant" <kevin@darbyshire-bryant.me.uk> said:
> I’m confused as to what the ‘correct’ behaviour should be under the
> following (real life) scenario:
>
> Downstream VDSL wan link 79.5Mbit/s cake shaped to 77Mbit/s - diffserv4, so Bulk
> 4.8Mbit, Best effort 77Mbit, Video 38.5, Voice 19.
>
> Download from ‘onedrive’ from 1 box, using 5 flows, classified as
> Bulk. Little other traffic going on, sits there at circa 70Mbit, no problem.
>
> If I started another download on another box, say 5 flows, classified as Best
> Effort, what rates would you expect the Bulk & Best effort tins to flow at?
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
>
[-- Attachment #2: Type: text/html, Size: 8599 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 23:05 0% ` Toke Høiland-Jørgensen
@ 2020-04-23 23:11 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-04-23 23:11 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Maxime Bizon, Cake List
On Thu, Apr 23, 2020 at 4:05 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Maxime Bizon <mbizon@freebox.fr> writes:
>
> > On Thursday 23 Apr 2020 à 20:35:15 (+0200), Toke Høiland-Jørgensen wrote:
> >
> >> I meant more details of your SOC platform. You already said it's
> >> ARM-based, so I guess the most important missing piece is which (Linux)
> >> driver does the Ethernet device(s) use?
> >
> > - Marvell Kirkwood, mv643xx_eth driver
> > - Marvell A8k, mvpp2 driver
>
> No native XDP support in any of those, unfortunately :(
>
> >> Yup, I think so. What does your current solution do with packets that
> >> are destined for the WiFi interface, BTW? Just punt them to the regular
> >> kernel path?
> >
> > yes, but that won't fly anymore for 11ax rates
>
> Indeed, that was partly why I asked :)
> Got any plans?
I would really love to get something going out of this initiative:
https://pointer.ngi.eu/pages/ngi-pointer-opencalls
... either with marvell wifi, 802.11ax, ath11k, mt76... gpon onts,
just something, anything fq-ing at the very least, over fiber of any
sort, would be a start.
Is there anyone out there with some spare time who needs some EU dollars?
I tend to disagree that we need massive offloading on 802.11ax. We
need firmware on the chip that does per-station scheduling; for the
rest of the queue management, a smarter host CPU in the A72 class more
than suffices, especially if the packets are arriving GSO'd.
>
> -Toke
>
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 21:59 1% ` Maxime Bizon
@ 2020-04-23 23:05 0% ` Toke Høiland-Jørgensen
2020-04-23 23:11 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-04-23 23:05 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Dave Taht, Cake List
Maxime Bizon <mbizon@freebox.fr> writes:
> On Thursday 23 Apr 2020 à 20:35:15 (+0200), Toke Høiland-Jørgensen wrote:
>
>> I meant more details of your SOC platform. You already said it's
>> ARM-based, so I guess the most important missing piece is which (Linux)
>> driver does the Ethernet device(s) use?
>
> - Marvell Kirkwood, mv643xx_eth driver
> - Marvell A8k, mvpp2 driver
No native XDP support in any of those, unfortunately :(
>> Yup, I think so. What does your current solution do with packets that
>> are destined for the WiFi interface, BTW? Just punt them to the regular
>> kernel path?
>
> yes, but that won't fly anymore for 11ax rates
Indeed, that was partly why I asked :)
Got any plans?
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 18:35 0% ` Toke Høiland-Jørgensen
@ 2020-04-23 21:59 1% ` Maxime Bizon
2020-04-23 23:05 0% ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 200+ results
From: Maxime Bizon @ 2020-04-23 21:59 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Dave Taht, Cake List
On Thursday 23 Apr 2020 à 20:35:15 (+0200), Toke Høiland-Jørgensen wrote:
> I meant more details of your SOC platform. You already said it's
> ARM-based, so I guess the most important missing piece is which (Linux)
> driver does the Ethernet device(s) use?
- Marvell Kirkwood, mv643xx_eth driver
- Marvell A8k, mvpp2 driver
> Yup, I think so. What does your current solution do with packets that
> are destined for the WiFi interface, BTW? Just punt them to the regular
> kernel path?
yes, but that won't fly anymore for 11ax rates
--
Maxime
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 18:30 0% ` Sebastian Moeller
@ 2020-04-23 21:53 1% ` Maxime Bizon
0 siblings, 0 replies; 200+ results
From: Maxime Bizon @ 2020-04-23 21:53 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Toke Høiland-Jørgensen, Cake List
On Thursday 23 Apr 2020 à 20:30:09 (+0200), Sebastian Moeller wrote:
Hello,
> if I might ask you a tangential question, did you also look at MAP-T
> (translation) and if so, what made you choose MAP-E (encapsulation)?
decision was made a long time ago
I remember it had something to do with the fact that, at that time,
Cisco only supported MAP-E on the switches we planned to use for this,
so we did not even look at MAP-T
--
Maxime
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 17:31 1% ` Maxime Bizon
2020-04-23 18:30 0% ` Sebastian Moeller
@ 2020-04-23 18:35 0% ` Toke Høiland-Jørgensen
2020-04-23 21:59 1% ` Maxime Bizon
1 sibling, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-04-23 18:35 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Dave Taht, Cake List
Maxime Bizon <mbizon@freebox.fr> writes:
> On Thursday 23 Apr 2020 à 18:42:11 (+0200), Toke Høiland-Jørgensen wrote:
>
>> Didn't make it in until 5.5, unfortunately... :(
>>
>> I can try to produce a patch that you can manually apply on top of 5.4
>> if you're interested?
>
> I could do it, but the thing I'm more worried about is the lack of
> test coverage from everyone else.
Yeah, I guess you'd be on the hook for backporting any follow-ups
yourself if you do that; maybe better to wait for the next longterm
kernel release, then...
>> Anyhow, my larger point was that we really do want to enable such use
>> cases for XDP; but we are lacking the details of what exactly is missing
>> before we can get to something that's useful / deployable. So any
>> details you could share about what feature set you are supporting in
>> your own 'fast path' implementation would be really helpful. As would
>> details about the hardware platform you are using. You can send them
>> off-list if you don't want to make it public, of course :)
>
> there is no hardware specific feature used, it's all software
I meant more details of your SOC platform. You already said it's
ARM-based, so I guess the most important missing piece is which (Linux)
driver does the Ethernet device(s) use?
> imagine this "simple" setup, pretty much what anyone's home router is
> doing:
>
> <br0> with <eth0> + <wlan0> inside, private IPv4 address
> <wan0.vlan> with IPv6, vlan interface over <wan0>
> <map0> with IPv4, MAP-E tunnel over <wan0.vlan>
>
> then:
> - IPv6 routing between <br0> and <wan0.vlan>
> - IPv4 routing + NAT between <br0> and <map0>
>
> iptables would be filled with usual rules, per interface ALLOW rules
> in FORWARD chain, DNAT rules in PREROUTING to access LAN from WAN...
>
> and then you want this to be fast :)
>
> What we do is build a "flow" table on top of conntrack, so with a
> single lookup we find the flow, the destination interface, and what
> modifications to apply to the packet (L3 address to change, encap to
> add/remove, etc etc)
>
> Then we do this lookup more or less early in RX path, on our oldest
> platform we even had to do this from the ethernet driver, and do TX
> from there too, skipping qdisc layer and allowing cache maintenance
> hacks (partial invalidation and wback)
This sounds pretty much like what you'd do with an XDP program: Packet comes
in -> XDP program runs, parses the headers, does a flow lookup, modifies
the packet and redirects it out the egress interface. All in one go,
kernel never even builds an skb for the packet.
You can build most of that with XDP today, but you'd need to implement
all the lookups yourself using BPF maps; having a hook into the kernel
conntrack / flow tables would help with that. I guess I should look into
what happened with that hook.
Oh, and we also need to solve queueing in XDP; it's all line rate ATM,
which is obviously not ideal for a CPE :)
> nftables with flowtables seems to have developed something that
> could replace our flow cache, but I'm not sure if it can handle our
> tunneling scheme yet. It even has a notion of offloaded flow for
> hardware that can support it.
Well, the nice thing about XDP is that you can just implement any custom
encapsulation that is not supported by the kernel yourself :)
> If you add an XDP offload to it, with an option to do the
> lookup/modification/tx at the layer you want, depending on the
> performance you need, whether you want a qdisc... that would give you
> pretty much the same thing we use today, but with a cleaner design.
Yup, I think so. What does your current solution do with packets that
are destined for the WiFi interface, BTW? Just punt them to the regular
kernel path?
>> Depends on the TCP stack (I think).
>
> I guess Linux deals with OFO better, but unfortunately that's not the
> main OS used by our subscribers...
Yeah, you really should do something about that ;)
>> Steam is perhaps a bad example as that is doing something very much like
>> bittorrent AFAIK; but point taken, people do occasionally run
>> single-stream downloads and want them to be fast. I'm just annoyed that
>> this becomes the *one* benchmark people run, to the exclusion of
>> everything else that has a much larger impact on the overall user
>> experience :/
>
> that one is easy
>
> convince ookla to add some kind of "latency under load" metric, and
> have them report it as a big red flag when too high, and even better
> add scary messages like "this connection is not suitable for online
> gaming".
>
> subscribers will bug telco, then telco will bug SOCs vendors
Heh. Easy in theory, yeah. I do believe people on this list have tried
to convince them; no luck thus far :/
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 17:31 1% ` Maxime Bizon
@ 2020-04-23 18:30 0% ` Sebastian Moeller
2020-04-23 21:53 1% ` Maxime Bizon
2020-04-23 18:35 0% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: Sebastian Moeller @ 2020-04-23 18:30 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Toke Høiland-Jørgensen, Cake List
Hi Maxime,
if I might ask you a tangential question, did you also look at MAP-T (translation) and if so, what made you choose MAP-E (encapsulation)?
Best Regards
Sebastian
> On Apr 23, 2020, at 19:31, Maxime Bizon <mbizon@freebox.fr> wrote:
>
>
> On Thursday 23 Apr 2020 à 18:42:11 (+0200), Toke Høiland-Jørgensen wrote:
>
>> Didn't make it in until 5.5, unfortunately... :(
>>
>> I can try to produce a patch that you can manually apply on top of 5.4
>> if you're interested?
>
> I could do it, but the thing I'm more worried about is the lack of
> test coverage from everyone else.
>
>> Anyhow, my larger point was that we really do want to enable such use
>> cases for XDP; but we are lacking the details of what exactly is missing
>> before we can get to something that's useful / deployable. So any
>> details you could share about what feature set you are supporting in
>> your own 'fast path' implementation would be really helpful. As would
>> details about the hardware platform you are using. You can send them
>> off-list if you don't want to make it public, of course :)
>
> there is no hardware specific feature used, it's all software
>
> imagine this "simple" setup, pretty much what anyone's home router is
> doing:
>
> <br0> with <eth0> + <wlan0> inside, private IPv4 address
> <wan0.vlan> with IPv6, vlan interface over <wan0>
> <map0> with IPv4, MAP-E tunnel over <wan0.vlan>
>
> then:
> - IPv6 routing between <br0> and <wan0.vlan>
> - IPv4 routing + NAT between <br0> and <map0>
>
> iptables would be filled with usual rules, per interface ALLOW rules
> in FORWARD chain, DNAT rules in PREROUTING to access LAN from WAN...
>
> and then you want this to be fast :)
>
> What we do is build a "flow" table on top of conntrack, so with a
> single lookup we find the flow, the destination interface, and what
> modifications to apply to the packet (L3 address to change, encap to
> add/remove, etc etc)
>
> Then we do this lookup more or less early in RX path, on our oldest
> platform we even had to do this from the ethernet driver, and do TX
> from there too, skipping qdisc layer and allowing cache maintenance
> hacks (partial invalidation and wback)
>
>
> nftables with flowtables seems to have developed something that
> could replace our flow cache, but I'm not sure if it can handle our
> tunneling scheme yet. It even has a notion of offloaded flow for
> hardware that can support it.
>
> If you add an XDP offload to it, with an option to do the
> lookup/modification/tx at the layer you want, depending on the
> performance you need, whether you want a qdisc... that would give you
> pretty much the same thing we use today, but with a cleaner design.
>
>
>> Depends on the TCP stack (I think).
>
> I guess Linux deals with OFO better, but unfortunately that's not the
> main OS used by our subscribers...
>
>> Steam is perhaps a bad example as that is doing something very much like
>> bittorrent AFAIK; but point taken, people do occasionally run
>> single-stream downloads and want them to be fast. I'm just annoyed that
>> this becomes the *one* benchmark people run, to the exclusion of
>> everything else that has a much larger impact on the overall user
>> experience :/
>
> that one is easy
>
> convince ookla to add some kind of "latency under load" metric, and
> have them report it as a big red flag when too high, and even better
> add scary messages like "this connection is not suitable for online
> gaming".
>
> subscribers will bug telco, then telco will bug SOCs vendors
>
> --
> Maxime
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 16:42 0% ` Toke Høiland-Jørgensen
@ 2020-04-23 17:31 1% ` Maxime Bizon
2020-04-23 18:30 0% ` Sebastian Moeller
2020-04-23 18:35 0% ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 200+ results
From: Maxime Bizon @ 2020-04-23 17:31 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Dave Taht, Cake List
On Thursday 23 Apr 2020 à 18:42:11 (+0200), Toke Høiland-Jørgensen wrote:
> Didn't make it in until 5.5, unfortunately... :(
>
> I can try to produce a patch that you can manually apply on top of 5.4
> if you're interested?
I could do it, but the thing I'm more worried about is the lack of
test coverage from everyone else.
> Anyhow, my larger point was that we really do want to enable such use
> cases for XDP; but we are lacking the details of what exactly is missing
> before we can get to something that's useful / deployable. So any
> details you could share about what feature set you are supporting in
> your own 'fast path' implementation would be really helpful. As would
> details about the hardware platform you are using. You can send them
> off-list if you don't want to make it public, of course :)
there is no hardware specific feature used, it's all software
imagine this "simple" setup, pretty much what anyone's home router is
doing:
<br0> with <eth0> + <wlan0> inside, private IPv4 address
<wan0.vlan> with IPv6, vlan interface over <wan0>
<map0> with IPv4, MAP-E tunnel over <wan0.vlan>
then:
- IPv6 routing between <br0> and <wan0.vlan>
- IPv4 routing + NAT between <br0> and <map0>
iptables would be filled with usual rules, per interface ALLOW rules
in FORWARD chain, DNAT rules in PREROUTING to access LAN from WAN...
and then you want this to be fast :)
What we do is build a "flow" table on top of conntrack, so with a
single lookup we find the flow, the destination interface, and what
modifications to apply to the packet (L3 address to change, encap to
add/remove, etc etc)
Then we do this lookup more or less early in RX path, on our oldest
platform we even had to do this from the ethernet driver, and do TX
from there too, skipping qdisc layer and allowing cache maintenance
hacks (partial invalidation and wback)
nftables with flowtables seems to have developed something that
could replace our flow cache, but I'm not sure if it can handle our
tunneling scheme yet. It even has a notion of offloaded flow for
hardware that can support it.
If you add an XDP offload to it, with an option to do the
lookup/modification/tx at the layer you want, depending on the
performance you need, whether you want a qdisc... that would give you
pretty much the same thing we use today, but with a cleaner design.
> Depends on the TCP stack (I think).
I guess Linux deals with OFO better, but unfortunately that's not the
main OS used by our subscribers...
> Steam is perhaps a bad example as that is doing something very much like
> bittorrent AFAIK; but point taken, people do occasionally run
> single-stream downloads and want them to be fast. I'm just annoyed that
> this becomes the *one* benchmark people run, to the exclusion of
> everything else that has a much larger impact on the overall user
> experience :/
that one is easy
convince ookla to add some kind of "latency under load" metric, and
have them report it as a big red flag when too high, and even better
add scary messages like "this connection is not suitable for online
gaming".
subscribers will bug telco, then telco will bug SOCs vendors
--
Maxime
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 12:33 1% ` Maxime Bizon
@ 2020-04-23 16:42 0% ` Toke Høiland-Jørgensen
2020-04-23 17:31 1% ` Maxime Bizon
0 siblings, 1 reply; 200+ results
From: Toke Høiland-Jørgensen @ 2020-04-23 16:42 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Dave Taht, Cake List
Maxime Bizon <mbizon@freebox.fr> writes:
> On Thursday 23 Apr 2020 à 13:57:25 (+0200), Toke Høiland-Jørgensen wrote:
>
> Hello Toke,
>
>> That is awesome! Please make sure you include the AQL patch for ath10k,
>> it really works wonders, as Dave demonstrated:
>>
>> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
>
> Was it in 5.4 ? we try to stick to LTS kernel
Didn't make it in until 5.5, unfortunately... :(
I can try to produce a patch that you can manually apply on top of 5.4
if you're interested?
>> We're working on that in kernel land - ever heard of XDP? On big-iron
>> servers we have no issues pushing 10s and 100s of Gbps in software
>> (well, the latter only given enough cores to throw at the problem :)).
>> There's not a lot of embedded platforms support as of yet, but we do
>> have some people in the ARM world working on that.
>>
>> Personally, I do see embedded platforms as an important (future) use
>> case for XDP, though, in particular for CPEs. So I would be very
>> interested in hearing details about your particular platform, and your
>> DPDK solution, so we can think about what it will take to achieve the
>> same with XDP. If you're interested in this, please feel free to reach
>> out :)
>
> Last time I looked at XDP, its primary use cases were "early drop" /
> "anti ddos".
Yeah, that's the obvious use case (i.e., easiest to implement). But we
really want it to be a general purpose acceleration layer where you can
selectively use only the kernel facilities you need for your use case -
or even skip some of them entirely and reimplement an optimised subset
fitting your use case.
> In our case, each packet has to be routed+NAT, we have VLAN tags, we
> also have MAP-E for IPv4 traffic. So in the vanilla forwarding path,
> this does multiple rounds of RX/TX because of tunneling.
>
> TBH, the hard work in our optimized forwarding code is figuring out
> what modifications to apply to each packets. Now whether modifications
> and tx would be done by XDP or by hand written C code in the kernel is
> more of a detail, even though using XDP is much cleaner of course.
>
> What the kernel always lacked is what DaveM called once the "grand
> unified flow cache", the ability to do a single lookup and be able to
> decide what to do with the packet. Instead we have the bridge
> forwarding table, the ip routing table (used to be a cache), the
> netfilter conntrack lookup, and multiple rounds of those if you do
> tunneling.
>
> Once you have this "flow table" infrastructure, it becomes easy to
> offload forwarding, either to real hardware, or software (for example,
> dedicate a CPU core in polling mode)
>
> The good news is that it seems nftables is building this:
>
> https://wiki.nftables.org/wiki-nftables/index.php/Flowtable
>
> I'm still using iptables, but it seems that the features I was missing
> like TCPMSS are now in nft also, so I will have a look.
I find it useful to think of XDP as a 'software offload' - i.e. a fast
path where you implement the most common functionality as efficiently as
possible and dynamically fall back to the full stack for the edge cases.
Enabling lookups in the flow table from XDP would be an obvious thing to
do, for instance. There were some patches going by to enable some kind
of lookup into conntrack at some point, but I don't recall the details.
Anyhow, my larger point was that we really do want to enable such use
cases for XDP; but we are lacking the details of what exactly is missing
before we can get to something that's useful / deployable. So any
details you could share about what feature set you are supporting in
your own 'fast path' implementation would be really helpful. As would
details about the hardware platform you are using. You can send them
off-list if you don't want to make it public, of course :)
>> Setting aside the fact that those single-stream tests ought to die a
>> horrible death, I do wonder if it would be feasible to do a bit of
>> 'optimising for the test'? With XDP we do have the ability to steer
>> packets between CPUs based on arbitrary criteria, and while it is not as
>> efficient as hardware-based RSS it may be enough to achieve line rate
>> for a single TCP flow?
>
> You cannot do steering for a single TCP flow at those rates because
> you will get out-of-order packets and kill TCP performance.
Depends on the TCP stack (I think).
> I do not consider those single-stream tests to be unrealistic, this is
> exactly what happens if, say, you buy a game on Steam and download it.
Steam is perhaps a bad example as that is doing something very much like
bittorrent AFAIK; but point taken, people do occasionally run
single-stream downloads and want them to be fast. I'm just annoyed that
this becomes the *one* benchmark people run, to the exclusion of
everything else that has a much larger impact on the overall user
experience :/
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 11:57 1% ` Toke Høiland-Jørgensen
2020-04-23 12:29 1% ` Luca Muscariello
2020-04-23 12:33 1% ` Maxime Bizon
@ 2020-04-23 16:28 1% ` Dave Taht
2 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-04-23 16:28 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Maxime Bizon, Cake List, Make-Wifi-fast
[-- Attachment #1: Type: text/plain, Size: 1178 bytes --]
On Thu, Apr 23, 2020 at 4:57 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Maxime Bizon <mbizon@freebox.fr> writes:
>
> > On Wednesday 22 Apr 2020 à 07:48:43 (-0700), Dave Taht wrote:
> >
> > Hello,
> >
> >> > Free has been using SFQ since 2005 (if I remember well).
> >> > They announced the wide deployment of SFQ in the free.fr newsgroup.
> >> > Wi-Fi in the free.fr router was not as good though.
> >>
> >> They're working on it. :)
> >
> > yes indeed.
> >
> > Switching to softmac approach, so now mac80211 will do rate control
> > and scheduling (using wake_tx_queue model).
> >
> > for 5ghz, we use ath10k
>
> That is awesome! Please make sure you include the AQL patch for ath10k,
> it really works wonders, as Dave demonstrated:
>
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
And THIS patch is looking lovely at the higher rates I'm testing at.
Yea, it needs to be more robust, but I'm seeing TCP RTT inflation
of exactly the right buffer size (15ms) on an HT20 or VHT80 link,
where before I'd see 30+ms... (it's nice that flent can plot both TCP
RTT and ping RTT now).
ECN "just works".
[-- Attachment #2: 982-do-codel-more-right.patch --]
[-- Type: text/x-patch, Size: 1767 bytes --]
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index c431722..92ba09b 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -476,10 +476,11 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
sta->sta.max_rc_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_BA;
sta->cparams.ce_threshold = CODEL_DISABLED_THRESHOLD;
- sta->cparams.target = MS2TIME(20);
+ sta->cparams.target = MS2TIME(5);
sta->cparams.interval = MS2TIME(100);
sta->cparams.ecn = true;
-
+ sta_dbg(sdata, "Codel target, interval %d, %d\n", sta->cparams.target,
+ sta->cparams.interval);
sta_dbg(sdata, "Allocated STA %pM\n", sta->sta.addr);
return sta;
@@ -2468,15 +2469,9 @@ static void sta_update_codel_params(struct sta_info *sta, u32 thr)
if (!sta->sdata->local->ops->wake_tx_queue)
return;
- if (thr && thr < STA_SLOW_THRESHOLD * sta->local->num_sta) {
- sta->cparams.target = MS2TIME(50);
- sta->cparams.interval = MS2TIME(300);
- sta->cparams.ecn = false;
- } else {
- sta->cparams.target = MS2TIME(20);
- sta->cparams.interval = MS2TIME(100);
- sta->cparams.ecn = true;
- }
+ sta->cparams.target = MS2TIME(5);
+ sta->cparams.interval = MS2TIME(100);
+ sta->cparams.ecn = true;
}
void ieee80211_sta_set_expected_throughput(struct ieee80211_sta *pubsta,
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 535911b..ca50d0a 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1551,7 +1551,7 @@ int ieee80211_txq_setup_flows(struct ieee80211_local *local)
codel_params_init(&local->cparams);
local->cparams.interval = MS2TIME(100);
- local->cparams.target = MS2TIME(20);
+ local->cparams.target = MS2TIME(5);
local->cparams.ecn = true;
local->cvars = kcalloc(fq->flows_cnt, sizeof(local->cvars[0]),
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 11:57 1% ` Toke Høiland-Jørgensen
2020-04-23 12:29 1% ` Luca Muscariello
@ 2020-04-23 12:33 1% ` Maxime Bizon
2020-04-23 16:42 0% ` Toke Høiland-Jørgensen
2020-04-23 16:28 1% ` Dave Taht
2 siblings, 1 reply; 200+ results
From: Maxime Bizon @ 2020-04-23 12:33 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Dave Taht, Cake List
On Thursday 23 Apr 2020 à 13:57:25 (+0200), Toke Høiland-Jørgensen wrote:
Hello Toke,
> That is awesome! Please make sure you include the AQL patch for ath10k,
> it really works wonders, as Dave demonstrated:
>
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
Was it in 5.4? We try to stick to LTS kernels.
> We're working on that in kernel land - ever heard of XDP? On big-iron
> servers we have no issues pushing 10s and 100s of Gbps in software
> (well, the latter only given enough cores to throw at the problem :)).
> There's not a lot of embedded platform support as of yet, but we do
> have some people in the ARM world working on that.
>
> Personally, I do see embedded platforms as an important (future) use
> case for XDP, though, in particular for CPEs. So I would be very
> interested in hearing details about your particular platform, and your
> DPDK solution, so we can think about what it will take to achieve the
> same with XDP. If you're interested in this, please feel free to reach
> out :)
Last time I looked at XDP, its primary use cases were "early drop" /
"anti ddos".
In our case, each packet has to be routed+NAT, we have VLAN tags, we
also have MAP-E for IPv4 traffic. So in the vanilla forwarding path,
this does multiple rounds of RX/TX because of tunneling.
TBH, the hard work in our optimized forwarding code is figuring out
what modifications to apply to each packet. Whether those modifications
and TX are done by XDP or by hand-written C code in the kernel is
more of a detail, even though using XDP is much cleaner of course.
What the kernel always lacked is what DaveM once called the "grand
unified flow cache": the ability to do a single lookup and be able to
decide what to do with the packet. Instead we have the bridge
forwarding table, the ip routing table (used to be a cache), the
netfilter conntrack lookup, and multiple round of those if you do
tunneling.
Once you have this "flow table" infrastructure, it becomes easy to
offload forwarding, either to real hardware, or software (for example,
dedicate a CPU core in polling mode)
The good news is that it seems nftables is building this:
https://wiki.nftables.org/wiki-nftables/index.php/Flowtable
I'm still using iptables, but it seems that the features I was missing
like TCPMSS are now in nft also, so I will have a look.
> Setting aside the fact that those single-stream tests ought to die a
> horrible death, I do wonder if it would be feasible to do a bit of
> 'optimising for the test'? With XDP we do have the ability to steer
> packets between CPUs based on arbitrary criteria, and while it is not as
> efficient as hardware-based RSS it may be enough to achieve line rate
> for a single TCP flow?
You cannot do steering for a single TCP flow at those rates because
you will get out-of-order packets and kill TCP performance.
I do not consider those single-stream tests to be unrealistic; this is
exactly what happens if, say, you buy a game on Steam and download it.
--
Maxime
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 11:57 1% ` Toke Høiland-Jørgensen
@ 2020-04-23 12:29 1% ` Luca Muscariello
2020-04-23 12:33 1% ` Maxime Bizon
2020-04-23 16:28 1% ` Dave Taht
2 siblings, 0 replies; 200+ results
From: Luca Muscariello @ 2020-04-23 12:29 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Maxime Bizon, Dave Taht, Cake List
[-- Attachment #1: Type: text/plain, Size: 6004 bytes --]
On Thu, Apr 23, 2020 at 1:57 PM Toke Høiland-Jørgensen <toke@redhat.com>
wrote:
> Maxime Bizon <mbizon@freebox.fr> writes:
>
> > On Wednesday 22 Apr 2020 à 07:48:43 (-0700), Dave Taht wrote:
> >
> > Hello,
> >
> >> > Free has been using SFQ since 2005 (if I remember well).
> >> > They announced the wide deployment of SFQ in the free.fr newsgroup.
> >> > Wi-Fi in the free.fr router was not as good though.
> >>
> >> They're working on it. :)
> >
> > yes indeed.
> >
> > Switching to softmac approach, so now mac80211 will do rate control
> > and scheduling (using wake_tx_queue model).
> >
> > for 5ghz, we use ath10k
>
> That is awesome! Please make sure you include the AQL patch for ath10k,
> it really works wonders, as Dave demonstrated:
>
>
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
>
> >> I am very, very happy for y'all. Fiber has always been the sanest
> >> thing. Is there an SFP+ GPON card yet that I can plug into a conventional
> >> open source router?
> >
> > FYI Free.fr uses 10G-EPON, not GPON.
> >
> > Also most deployments use an additional terminal device
> > called an "ONT" or "ONU" that handles the PON part and exposes an ethernet
> > port where the operator CPE is plugged. So we are back to the early
> > days of DSL, where the hardest part (scheduling) is done inside a
> > black box. That makes it easier to replace the operator CPE with your
> > own standard ethernet router though.
> >
> > At least SOCs with integrated PON (supporting all flavours
> > GPON/EPON/..) are starting to be deployed. Nothing available in
> > opensource.
> >
> > Also note that it's not just kernel drivers, you also need some higher
> > OAM stack to make that work, and there are a lot of existing
> > standards, DPOE (EPON), OMCI (GPON)... all with interop challenges.
>
> It always bugged me that there was no open source support for these
> esoteric protocols and standards. It would seem like an obvious place to
> pool resources, but I guess proprietary vendors are going to keep doing
> their thing :/
>
> >> > The challenge becomes to keep up with these link rates in software
> >> > as there is a lot of hardware offloading.
> >
> > Yes that's our pain point, because that's what the SOCs vendors
> > deliver and you need to use that because there is no alternative.
> >
> > It's not entirely the SoC vendors' fault though.
> >
> > 15 years ago, your average SOC's CPU would be something like 200Mhz
> > MIPS, Linux standard forwarding path (softirq => routing+netfilter =>
> > qdisc) was too slow for this, too much cache footprint/overhead. So
> > every vendor started building alternative forwarding paths in their
> > hardware and never looked back.
> >
> > Nowadays, the baseline SoC CPU would be an ARM Cortex-A53 @ ~1GHz, which
> > with a non-crappy network driver and internal fabric should be able
> > to route 1Gbit/s with out-of-the-box kernel forwarding.
> >
> > But that's too late. SoC vendors compete against each other, and the
> > big telcos need a way to tell which SOC is better to make a buying
> > decision. So synthetic benchmarks have become the norm, and since
> > everybody was able to fill their pipe with 1500-byte packets,
> > benchmarks have moved to unrealistic 64-byte packets (so-called
> > wirespeed).
>
Yes, I'm not working anymore on these kinds of platforms
but I do remember the pain.
Hardware offloading may also have unexpected behaviours
with stateful offloads. A flow starts in a slow path and
then moves to the fast path in hardware.
Out-of-order delivery at this stage can be nasty for a TCP connection.
Worse, a packet loss.
> >
> > If you don't have hardware acceleration for forwarding, you don't
> > exist in those benchmarks and will not sell your chipset. Also they
> > invested so much in their alternative network stack that it's
> > difficult to stop (huge R&D teams). That being said, they do have a
> > point: when speeds go above 1Gbit/s, the kernel becomes the bottleneck.
> >
> > For the Free.fr 10Gbit/s offer, we had to develop an alternative
> > (software) forwarding path using a polling-mode model (DPDK style);
> > otherwise even our albeit powerful ARM Cortex-A72 @ 2GHz could not forward
> > more than 2Gbit/s.
>
> We're working on that in kernel land - ever heard of XDP? On big-iron
> servers we have no issues pushing 10s and 100s of Gbps in software
> (well, the latter only given enough cores to throw at the problem :)).
> There's not a lot of embedded platform support as of yet, but we do
> have some people in the ARM world working on that.
>
> Personally, I do see embedded platforms as an important (future) use
> case for XDP, though, in particular for CPEs. So I would be very
> interested in hearing details about your particular platform, and your
> DPDK solution, so we can think about what it will take to achieve the
> same with XDP. If you're interested in this, please feel free to reach
> out :)
>
> > And going multicore/RSS does not fly when the test case is a
> > single-stream TCP session, which is what most speedtest applications do
> > (Ookla only recently added a multi-connection test).
>
> Setting aside the fact that those single-stream tests ought to die a
> horrible death, I do wonder if it would be feasible to do a bit of
> 'optimising for the test'? With XDP we do have the ability to steer
> packets between CPUs based on arbitrary criteria, and while it is not as
> efficient as hardware-based RSS it may be enough to achieve line rate
> for a single TCP flow?
>
Toke, yes, I was implicitly thinking about XDP, but I have not
yet read about any experience using it in CPEs.
DPDK, netmap and other kernel-bypass approaches may be an option, but
you lose all the qdiscs.
>
> -Toke
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
[-- Attachment #2: Type: text/html, Size: 8769 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-23 9:29 1% ` Maxime Bizon
@ 2020-04-23 11:57 1% ` Toke Høiland-Jørgensen
2020-04-23 12:29 1% ` Luca Muscariello
` (2 more replies)
0 siblings, 3 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-04-23 11:57 UTC (permalink / raw)
To: Maxime Bizon, Dave Taht; +Cc: Cake List
Maxime Bizon <mbizon@freebox.fr> writes:
> On Wednesday 22 Apr 2020 à 07:48:43 (-0700), Dave Taht wrote:
>
> Hello,
>
>> > Free has been using SFQ since 2005 (if I remember well).
>> > They announced the wide deployment of SFQ in the free.fr newsgroup.
>> > Wi-Fi in the free.fr router was not as good though.
>>
>> They're working on it. :)
>
> yes indeed.
>
> Switching to softmac approach, so now mac80211 will do rate control
> and scheduling (using wake_tx_queue model).
>
> for 5ghz, we use ath10k
That is awesome! Please make sure you include the AQL patch for ath10k,
it really works wonders, as Dave demonstrated:
https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
>> I am very, very happy for y'all. Fiber has always been the sanest
>> thing. Is there an SFP+ GPON card yet that I can plug into a conventional
>> open source router?
>
> FYI Free.fr uses 10G-EPON, not GPON.
>
> Also most deployments use an additional terminal device
> called an "ONT" or "ONU" that handles the PON part and exposes an ethernet
> port where the operator CPE is plugged. So we are back to the early
> days of DSL, where the hardest part (scheduling) is done inside a
> black box. That makes it easier to replace the operator CPE with your
> own standard ethernet router though.
>
> At least SOCs with integrated PON (supporting all flavours
> GPON/EPON/..) are starting to be deployed. Nothing available in
> opensource.
>
> Also note that it's not just kernel drivers, you also need some higher
> OAM stack to make that work, and there are a lot of existing
> standards, DPOE (EPON), OMCI (GPON)... all with interop challenges.
It always bugged me that there was no open source support for these
esoteric protocols and standards. It would seem like an obvious place to
pool resources, but I guess proprietary vendors are going to keep doing
their thing :/
>> > The challenge becomes to keep up with these link rates in software
>> > as there is a lot of hardware offloading.
>
> Yes that's our pain point, because that's what the SOCs vendors
> deliver and you need to use that because there is no alternative.
>
> It's not entirely the SoC vendors' fault though.
>
> 15 years ago, your average SOC's CPU would be something like 200Mhz
> MIPS, Linux standard forwarding path (softirq => routing+netfilter =>
> qdisc) was too slow for this, too much cache footprint/overhead. So
> every vendor started building alternative forwarding paths in their
> hardware and never looked back.
>
> Nowadays, the baseline SoC CPU would be an ARM Cortex-A53 @ ~1GHz, which
> with a non-crappy network driver and internal fabric should be able
> to route 1Gbit/s with out-of-the-box kernel forwarding.
>
> But that's too late. SoC vendors compete against each other, and the
> big telcos need a way to tell which SOC is better to make a buying
> decision. So synthetic benchmarks have become the norm, and since
> everybody was able to fill their pipe with 1500-byte packets,
> benchmarks have moved to unrealistic 64-byte packets (so-called
> wirespeed).
>
> If you don't have hardware acceleration for forwarding, you don't
> exist in those benchmarks and will not sell your chipset. Also they
> invested so much in their alternative network stack that it's
> difficult to stop (huge R&D teams). That being said, they do have a
> point: when speeds go above 1Gbit/s, the kernel becomes the bottleneck.
>
> For the Free.fr 10Gbit/s offer, we had to develop an alternative
> (software) forwarding path using a polling-mode model (DPDK style);
> otherwise even our albeit powerful ARM Cortex-A72 @ 2GHz could not forward
> more than 2Gbit/s.
We're working on that in kernel land - ever heard of XDP? On big-iron
servers we have no issues pushing 10s and 100s of Gbps in software
(well, the latter only given enough cores to throw at the problem :)).
There's not a lot of embedded platform support as of yet, but we do
have some people in the ARM world working on that.
Personally, I do see embedded platforms as an important (future) use
case for XDP, though, in particular for CPEs. So I would be very
interested in hearing details about your particular platform, and your
DPDK solution, so we can think about what it will take to achieve the
same with XDP. If you're interested in this, please feel free to reach
out :)
> And going multicore/RSS does not fly when the test case is a
> single-stream TCP session, which is what most speedtest applications do
> (Ookla only recently added a multi-connection test).
Setting aside the fact that those single-stream tests ought to die a
horrible death, I do wonder if it would be feasible to do a bit of
'optimising for the test'? With XDP we do have the ability to steer
packets between CPUs based on arbitrary criteria, and while it is not as
efficient as hardware-based RSS it may be enough to achieve line rate
for a single TCP flow?
-Toke
^ permalink raw reply [relevance 1%]
* Re: [Cake] DSCP ramblings
2020-04-22 16:44 1% ` Stephen Hemminger
2020-04-22 16:58 1% ` Dave Taht
@ 2020-04-23 10:50 0% ` Kevin Darbyshire-Bryant
1 sibling, 0 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-23 10:50 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Dave Taht, Cake List
[-- Attachment #1: Type: text/plain, Size: 2439 bytes --]
> On 22 Apr 2020, at 17:44, Stephen Hemminger <stephen@networkplumber.org> wrote:
>
> In my experience, except for a small number of cases (RDMA etc) Diffserv is a
> complete waste of time. There is no global ordering, there is no guarantee against
> starvation and any sane ISP strips the bits off or ignores them.
>
> Diffserv is even an issue at scale in the cloud. What does DSCP mean exactly on
> outer headers, who gets to decide for which service. And what about inner headers
> and propagating inner to outer. It's a mess.
That’s…depressing :-/ And suggests there are at least 6 bits spare in the ’TOS’ byte, perhaps that should go forward at the next IETF meeting ;-)
In my own naive home network fiefdom I would like to have some level of control as to who/what gets the lion’s/fairshare of my ISP wan link. I don’t actually care whether that DSCP decision makes it/returns on the wan link…I have my own combination of tc act_ctinfo (in upstream linux) and ‘iptables connmark —setdscp’ (not in upstream linux yet) to do my DSCP mark saving/restoration, so what I set sticks and affects CAKE’s tin allocation & shaping decisions. I want my wife’s facetime call to her sister to work perfectly, I want my bittorrenting to sit there in the background unnoticed, I want my network backup job/s to run in good time but not at the expense of interactive web browsing, ssh’ing etc. I want my BBC iplayer radio to sit there streaming away at the best quality, no interruptions. It’s all possible.
As it currently stands I’ve a series of guestimate iptables rules mainly based on source or destination ip address (occasionally a bit of port number) but that’s pretty coarse. Why can’t the application do it?
LE - Least Effort - 0/16 can be starved - bittorrent
GE - Good Effort - 1/16 not starved - background downloads, windows update
BE - Best Effort - 16/16 Normal activity, default
SP - Streaming Priority - 8/16 high bitrate streaming, Video/Audio streaming, Video portion of video conferencing
IIP - Interactive/Important Priority - 4/16 low bitrate streaming, SIP/VOIP, interactive SSH, DNS, Audio portion of video conferencing
I’m still stunned/shocked by the lack of obvious DSCP support in libcurl. If DSCP setting isn’t easy to do, no one will do it. Perhaps if it is built, people will come?
I’m probably stupid. And now depressed!
Kevin
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-22 14:48 1% ` Dave Taht
2020-04-22 15:28 1% ` Luca Muscariello
@ 2020-04-23 9:29 1% ` Maxime Bizon
2020-04-23 11:57 1% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: Maxime Bizon @ 2020-04-23 9:29 UTC (permalink / raw)
To: Dave Taht; +Cc: Luca Muscariello, Jonathan Morton, Cake List
On Wednesday 22 Apr 2020 à 07:48:43 (-0700), Dave Taht wrote:
Hello,
> > Free has been using SFQ since 2005 (if I remember well).
> > They announced the wide deployment of SFQ in the free.fr newsgroup.
> > Wi-Fi in the free.fr router was not as good though.
>
> They're working on it. :)
yes indeed.
Switching to softmac approach, so now mac80211 will do rate control
and scheduling (using wake_tx_queue model).
for 5ghz, we use ath10k
> I am very, very happy for y'all. Fiber has always been the sanest
> thing. Is there an SFP+ GPON card yet that I can plug into a conventional
> open source router?
FYI Free.fr uses 10G-EPON, not GPON.
Also most deployments use an additional terminal device
called an "ONT" or "ONU" that handles the PON part and exposes an ethernet
port where the operator CPE is plugged. So we are back to the early
days of DSL, where the hardest part (scheduling) is done inside a
black box. That makes it easier to replace the operator CPE with your
own standard ethernet router though.
At least SOCs with integrated PON (supporting all flavours
GPON/EPON/..) are starting to be deployed. Nothing available in
opensource.
Also note that it's not just kernel drivers, you also need some higher
OAM stack to make that work, and there are a lot of existing
standards, DPOE (EPON), OMCI (GPON)... all with interop challenges.
> > The challenge becomes to keep up with these link rates in software
> > as there is a lot of hardware offloading.
Yes that's our pain point, because that's what the SOCs vendors
deliver and you need to use that because there is no alternative.
It's not entirely the SoC vendors' fault though.
15 years ago, your average SOC's CPU would be something like 200Mhz
MIPS, Linux standard forwarding path (softirq => routing+netfilter =>
qdisc) was too slow for this, too much cache footprint/overhead. So
every vendor started building alternative forwarding paths in their
hardware and never looked back.
Nowadays, the baseline SoC CPU would be an ARM Cortex-A53 @ ~1GHz, which
with a non-crappy network driver and internal fabric should be able
to route 1Gbit/s with out-of-the-box kernel forwarding.
But that's too late. SoC vendors compete against each other, and the
big telcos need a way to tell which SOC is better to make a buying
decision. So synthetic benchmarks have become the norm, and since
everybody was able to fill their pipe with 1500-byte packets,
benchmarks have moved to unrealistic 64-byte packets (so-called
wirespeed).
If you don't have hardware acceleration for forwarding, you don't
exist in those benchmarks and will not sell your chipset. Also they
invested so much in their alternative network stack that it's
difficult to stop (huge R&D teams). That being said, they do have a
point: when speeds go above 1Gbit/s, the kernel becomes the bottleneck.
For the Free.fr 10Gbit/s offer, we had to develop an alternative
(software) forwarding path using a polling-mode model (DPDK style);
otherwise even our albeit powerful ARM Cortex-A72 @ 2GHz could not forward
more than 2Gbit/s.
And going multicore/RSS does not fly when the test case is a
single-stream TCP session, which is what most speedtest applications do
(Ookla only recently added a multi-connection test).
--
Maxime
^ permalink raw reply [relevance 1%]
* Re: [Cake] DSCP ramblings
2020-04-22 17:17 0% ` Kevin Darbyshire-Bryant
@ 2020-04-22 17:45 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-04-22 17:45 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
On Wed, Apr 22, 2020 at 10:17 AM Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
>
>
>
> > On 22 Apr 2020, at 17:20, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > and because of your I'm off building collectd because those graphs
> > look so good. :)
>
> Oh dear, sorry about that :-) The collection bit https://github.com/ldir-EDB0/packages/commit/932bb4b022bdbf3ab0fa1e43842f7c94da7f046a
> The display bit https://github.com/ldir-EDB0/luci/commit/a0a95da1703079887a85c4d9b6929e74d2c77a29
>
> Don’t break it, or if you do, send fixes ;-)
I just - amazingly - for being years out of practice - before my first
cup of coffee - patched openwrt for reducing the codel target, updated
babeld to 1.9.2, and the kernel to 5.4... reflashed a ubnt mesh ap...
and it worked, first time. I am afraid to push my luck, further.
>
> The idea of using collectd_exec and hence a sh script was the quickest way of spinning something up. It is inherently going to be heavier than a proper C based plugin/collector…and beyond my skill/patience limits
I get it. :)
My intent however is to somehow have the collector send stuff back to
another collector elsewhere and not
store anything locally. I haven't figured out how to do that yet. (The
uap-lite mesh only has 8MB of flash.)
>(You should have seen how many combinations of ‘*' & ‘&’ were involved in getting https-dns-proxy/libcurl "static curl_socket_t opensocket_callback(void *clientp, curlsocktype purpose, struct curl_sockaddr *addr) (void)setsockopt(sock, IPPROTO_IPV6, IP_TOS, (int *)clientp, sizeof(int));” to work :-) )
the cdecl tool is your friend here.
> Kevin
>
>
>
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-22 15:28 1% ` Luca Muscariello
@ 2020-04-22 17:42 2% ` David P. Reed
0 siblings, 0 replies; 200+ results
From: David P. Reed @ 2020-04-22 17:42 UTC (permalink / raw)
To: Luca Muscariello; +Cc: Dave Taht, Cake List, Maxime Bizon
[-- Attachment #1: Type: text/plain, Size: 8216 bytes --]
Having asymmetric gigabit cable modem service (1 Gb/s down) and very short latencies (5 ms) to many servers of interest that can source 1 Gb/s, I would just comment that I find it very, very useful for "normal" use.
Perhaps my point is this: "normal" isn't a narrow gaussian distribution of performance needs. It's what might be called a time-varying long tailed distribution.
I pay for 1 Gb/s because it is "worth it" to clone from, say, GitHub, or pull a Docker container image, in under 1 second.
To think that isn't valuable is to miss the point that the Internet's performance isn't about isochronous flows or slow FTPs - it's not about throughput. It's about service delay.
And congestion control is about mitigating service delays under load, by eliminating sustained queueing delays that build up due to multiplexed use otherwise.
To talk about one use at a time, and treat an average throughput as the goal metric is to miss the entire point.
A home access connection is frequently multiplexed over unrelated uses. If you are single, live in your own apartment, ... you have a very, very warped idea of real usage.
On Wednesday, April 22, 2020 11:28am, "Luca Muscariello" <muscariello@ieee.org> said:
On Wed, Apr 22, 2020 at 4:48 PM Dave Taht <dave.taht@gmail.com> wrote:
On Wed, Apr 22, 2020 at 2:04 AM Luca Muscariello <muscariello@ieee.org> wrote:
>
>
>
> On Wed, Apr 22, 2020 at 12:44 AM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> On Tue, Apr 21, 2020 at 3:33 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>> >
>> > > On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
>> > >
>> > > My curiosity is piqued. Can you elaborate on this? What does free.fr do?
>> >
>> > They're a large French ISP. They made their own CPE devices, and debloated both them and their network quite a while ago. In that sense, at least, they're a model for others to follow - but few have.
>> >
>> > - Jonathan Morton
>>
>> they are one of the few ISPs that insisted on getting full source code
>> to their DSL stack, and retained the chops to be able to modify it. I
>> really admire their revolution v6 product. First introduced in 2010,
>> it's been continuously updated, did ipv6 at the outset, got fq_codel
>> when it first came out, and they update the kernel regularly. All
>> kinds of great features on it, and ecn is enabled by default for those
>> also (things like samba). over 3 million boxes now I hear....
>>
>> with <1ms of delay in the dsl driver, they don't need to shape, they
>> just run at line rate using three tiers of DRR that look a lot like
>> cake. They shared their config with me, and before I lost heart for
>> future internet drafts, I'd stuck it here:
>>
>> https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd
>>
>> Occasionally they share some data with me. Sometimes I wish I lived in
>> paris just so I could have good internet! (their fiber offering is
>> reasonably buffered (not fq_codeled) and the wifi... maybe I can get
>> them to talk about what they did)
>>
>> When free.fr shipped fq_codel 2 months after we finalized it, I
>> figured the rest of the world was only months behind. How hard is it
>> to add 50 lines of BQL oriented code to a DSL firmware?
>>
>
> Free has been using SFQ since 2005 (if I remember well).
> They announced the wide deployment of SFQ in the free.fr newsgroup.
> Wi-Fi in the free.fr router was not as good though.
They're working on it. :)
> In Paris there is a lot of GPON now that is replacing DSL. But there is
> a nation-wide effort funded by local administrations to get fiber
> everywhere. There are small towns in the countryside with fiber.
> Public money has made, and is making that possible.
> There is still a little of Euro-DOCSIS, but frankly compared to fiber
> it has no chance to survive.
I am very, very happy for y'all. Fiber has always been the sanest
thing. Is there
a SPF+ gpon card yet I can plug into a convention open source router yet?
>
> I currently have 2Gbps/600Mbps access with orange.fr, and free.fr has a subscription
> at 10Gbps GPON. I won't tell you the price because you may feel depressed
> compared to other countries where prices are much higher.
I'd emigrate!!!
> The challenge becomes to keep up with these link rates in software
> as there is a lot of hardware offloading.
I just meant that these routers tend to use HW offloading
and kernel qdiscs may be bypassed.
At this point, I kind of buy the Stanford sqrt(BDP) argument. All you
really need for gigE+ fiber access to work well
for most modern traffic is a fairly short fifo (say, 20ms). Any form
of FQ would help but be hardly noticible. I think
there needs to be work on the hop between the internet and the subscriber...
Web traffic is dominated by RTT above 40mbit (presently).
streaming video traffic - is no more than 20Mbit, and your occasional
big download is a dozen big streams that would
bounce off a short fifo well.
gbit access to the home is (admittedly glorious, wonderful!) overkill
for all present forms of traffic.
I'm pretty sure if I had gig fiber I could come up with a way to use
it up (exiting the cloud entirely comes to mind), but
lacking new applications that demand that much bandwidth...
I, of course, would like to see LoLa (https://lola.conts.it/) finally
work, and videoconferencing and game stream with high rates and faster
(even raw) encoding also has potential to reduce e2e latencies
enormously at that layer.
>
> As soon as 802.11ax becomes the norm, software scheduling will become
> a challenge.
Do you mean in fiber or wireless? Wireless is really problematic at ANY speed.
I meant that software scheduling becomes a challenge for the same
reason as above. Increase in total throughput of the box
will call for hardware offloading and kernel qdisc may be bypassed.
It is not a challenge per se, it is a challenge because traffic
may not be managed by the kernel.
at gfiber, the buffering moved to the wifi, and there are other
problems that really impact achievable bandwidth. When I was last in
paris, I could "hear" 300+ access points from my apt, and could only
get 100-200kbit per second out of the wireless n ap I had, unless I
cheated and stuck my traffic in the VI queue. A friend of mine there,
couldn't even get wifi across the room! Beacons ate into a lot of the
available
bandwidth. Since 5ghz (and soon 6ghz - is 6E a thing in france) is
shorter range I'm hoping that's got better, but with
802.11ac and ax peeing on half the wifi spectrum by default, I imagine
achievable rates in high density locations with many APs will be very
low... and very jittery... and thus still require good ATF, fq, and
aqm technologies.
I have high hopes for OFDMA and DU but thus far haven't found an AP
doing it. I'm not sure what to do about the beaconing problem except
offer a free trade-in to all my neighbors still emitting G style
frames....
And in looking over some preliminary code for the mt76 ax chip, I
worry about both bad design of the firmware, and
insufficient resources on-chip to manage well.
How is the 5G rollout going in france?
Good question. I've just seen a speed test at Gbps on a phone
which can drain your battery in less than 5 minutes. Amazing tech!
I recently learned that much of japan is... wait for it... wimax.
>
> Luca
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
[-- Attachment #2: Type: text/html, Size: 12406 bytes --]
^ permalink raw reply [relevance 2%]
* Re: [Cake] DSCP ramblings
2020-04-22 16:20 1% ` Dave Taht
2020-04-22 16:44 1% ` Stephen Hemminger
@ 2020-04-22 17:17 0% ` Kevin Darbyshire-Bryant
2020-04-22 17:45 1% ` Dave Taht
1 sibling, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-22 17:17 UTC (permalink / raw)
To: Dave Taht; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 1000 bytes --]
> On 22 Apr 2020, at 17:20, Dave Taht <dave.taht@gmail.com> wrote:
>
> and because of your I'm off building collectd because those graphs
> look so good. :)
Oh dear, sorry about that :-) The collection bit https://github.com/ldir-EDB0/packages/commit/932bb4b022bdbf3ab0fa1e43842f7c94da7f046a
The display bit https://github.com/ldir-EDB0/luci/commit/a0a95da1703079887a85c4d9b6929e74d2c77a29
Don’t break it, or if you do, send fixes ;-)
The idea of using collectd_exec and hence a sh script was the quickest way of spinning something up. It is inherently going to be heavier than a proper C based plugin/collector…and beyond my skill/patience limits. (You should have seen how many combinations of ‘*' & ‘&’ were involved in getting the https-dns-proxy/libcurl opensocket callback “static curl_socket_t opensocket_callback(void *clientp, curlsocktype purpose, struct curl_sockaddr *addr)” and its “(void)setsockopt(sock, IPPROTO_IPV6, IP_TOS, (int *)clientp, sizeof(int));” to work :-) )
Kevin
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] DSCP ramblings
2020-04-22 16:44 1% ` Stephen Hemminger
@ 2020-04-22 16:58 1% ` Dave Taht
2020-04-23 10:50 0% ` Kevin Darbyshire-Bryant
1 sibling, 0 replies; 200+ results
From: Dave Taht @ 2020-04-22 16:58 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Kevin Darbyshire-Bryant, Cake List
On Wed, Apr 22, 2020 at 9:44 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 22 Apr 2020 09:20:29 -0700
> Dave Taht <dave.taht@gmail.com> wrote:
>
> > and because of your I'm off building collectd because those graphs
> > look so good. :)
> >
> > https://forum.openwrt.org/t/sqm-reporting/59960/24
> >
> > I have long just used snmpd, and collectd looks interesting. I fear
> > it's too heavyweight, particularly shelling out to a script....
> >
> > On Wed, Apr 22, 2020 at 9:15 AM Dave Taht <dave.taht@gmail.com> wrote:
> > >
> > > On Wed, Apr 22, 2020 at 8:58 AM Kevin Darbyshire-Bryant
> > > <kevin@darbyshire-bryant.me.uk> wrote:
> > > >
> > > > During these strange times of lockdown I’ve been trying to keep myself occupied/entertained/sane(???) by ‘fiddling with stuff’ and improving my coding. This started with an idea of learning Python which was great until the on-line bit of it ran out and someone posted an idea on the Openwrt forum about graphing Cake stats.
> > > >
> > > > That had nothing to do with Python and involved (new to me) technologies such as ‘collectd’, ‘JSON’, a bit of javascript and my usual level of cobbling something together in ‘ash’…. So that course was well spent :-)
> > > >
> > > > Anyway, data was collected and graphs produced in a very small household. What’s immediately apparent from those graphs and cake in ‘diffserv4’ mode is that very, very few applications are using DSCP at all. Most things are to port 443.
> > > >
> > > > I was also a little surprised to see that my DNS over foo proxies such as stubby & https-dns-proxy don’t use DSCP coding. It surprised me even more to see RFC recommendations that DNS be treated as ‘Best Effort’. Now in the days of udp only and no dnssec (with fallback to tcp) this may be good enough, but I wonder if this is realistic these days?
> > > >
> > > > So putting aside the discussion of what codepoint should be used, I then wondered how hard it would be to actually set a dscp in these applications. And this is where I had another surprise. For example https-dns-proxy uses libcurl. libcurl has no standard ‘in-library’ method for setting a socket’s dscp. I cobbled a workaround in the application https://github.com/aarond10/https_dns_proxy/pull/83 - it works.
> > > >
> > > > Next I attacked stubby, which uses getdns. getdns doesn’t even have a callback or parameters passing so you can set a dscp on the socket from a client application, pure ‘hack the library’ stuff.
> > > >
> > > > To be blunt and on a small sample of 2 libraries/applications, it seems that DSCP is completely ignored. Applications signalling ’this is/isnt latency sensitive/bulk’ isn’t going to happen if it isn’t easy to do.
> > > >
> > > > Apple should be marking facetime calls as being ‘video conference’ or whatever. BBC iplayer Radio apps should be marking ‘audio streaming’. But every f*ing thing is CS0 port 443. And I’m wondering how much of this is because library support is simply missing. Maybe gaming apps are better? (I don’t game)
> > > >
> > > > Right, I’m off for a lie down. Sorry for the rant.
> > >
> > > Welcome to my explorations... in 2011. Diffserv is rather underused, isn't it?
> > >
> > > I took a survey of every (500+) gaming console at a convention. nearly
> > > zero diffserv usage and it was all over the map, and I think, mostly,
> > > from osx.
> > >
> > > windows requires admin privs to set the tos bits at all
> > > webrtc has an api to set the bits, but it doesn't work on windows.
> > >
> > > ssh will set the imm bit for interactive, I forget what it sets for bulk
> > > bgp sets cs6. so does babel. Arguably both usages are wrong.
> > > some windows stuff sets cs1 for things like ping
> > > I got the mosh folk to use AF42 as a (worldwide) test, for nearly a
> > > year. they had one user with a problem and they turned it off. It was
> > > funny, keith thought I was making an expert recommendation rather than
> > > a test and just copy pasted my code into the tree and shipped it.
> > >
> > > linux implements a strict priority queue in pfifo_fast. You can dos it
> > > if you hit it by setting the bits.
> > > irtt and netperf let you set the bits. iperf also.
> > >
> > > I produced a patch for rsync in particular (since I use it heavily)
> > >
> > > sqm at least used to mark dns and ntp as some elivated prio, but I
> > > forget which and for all I know the cake qos system doesn't implement
> > > those filters.
> > >
> > > A few multi-queue ethernet devices actually do interpret the bits.
> > > Undocumented as to which one..
> > >
> > > and lets not get started on ecn.
> > >
> > > >
> > > >
> > > > Hack for getdns/stubby
> > > >
> > > > diff --git a/src/stub.c b/src/stub.c
> > > > index 2547d10f..7e47aba5 100644
> > > > --- a/src/stub.c
> > > > +++ b/src/stub.c
> > > > @@ -52,6 +52,7 @@
> > > > #include "platform.h"
> > > > #include "general.h"
> > > > #include "pubkey-pinning.h"
> > > > +#include <netinet/ip.h>
> > > >
> > > > /* WSA TODO:
> > > > * STUB_TCP_RETRY added to deal with edge triggered event loops (versus
> > > > @@ -381,6 +382,9 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> > > > # else
> > > > static const int enable = 1;
> > > > # endif
> > > > +#endif
> > > > +#if defined(IP_TOS)
> > > > + int dscp = IPTOS_CLASS_CS4;
> > > > #endif
> > > > int fd = -1;
> > > >
> > > > @@ -390,6 +394,12 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> > > > __FUNC__, (void*)upstream);
> > > > if ((fd = socket(upstream->addr.ss_family, SOCK_STREAM, IPPROTO_TCP)) == -1)
> > > > return -1;
> > > > +#if defined(IP_TOS)
> > > > + if (upstream->addr.ss_family == AF_INET6)
> > > > + (void)setsockopt(fd, IPPROTO_IPV6, IP_TOS, &dscp, sizeof(dscp));
> > > > + else if (upstream->addr.ss_family == AF_INET)
> > > > + (void)setsockopt(fd, IPPROTO_IP, IP_TOS, &dscp, sizeof(dscp));
> > > > +#endif
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Kevin D-B
> > > >
> > > > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> > > >
>
> In my experience, except for a small number of cases (RDMA etc) Diffserv is a
> complete waste of time. There is no global ordering, there is no guarantee against
> starvation and any sane ISP strips the bits off or ignores them.
I am under the impression that comcast at least, uses diffserv
internally, and also has all kinds of fun
mapping from MPLS back and forth. they also annoyingly, on some
CMTSes, remark most traffic to CS1,
which is why I was so insistent on getting the wash option into cake.
diffserv is totally OK on your internal network, but most of the time,
you really, really don't care. really time
sensitive traffic gets tossed onto a priority vlan....
given the track record of diffserv, the idea of handing ECT(1) to the
same folk as an identifier does not seem like
a good idea, either.
the first thing cake got when it hit the real world was proper tc
support so you could identify and prioritize packets
without twiddling the diffserv field. That said, I do rather support
the ability to mark your own traffic within your domain,
and in particular, I like the idea of a least effort or background
marking for traffic you don't care about much.
transmission does support qos markings... for tcp traffic. ledbat, no.
Bunch of one-line patches
to a lot of tools and then where do we get?
> Diffserv is even an issue at scale in the cloud. What does DSCP mean exactly on
> outer headers, who gets to decide for which service. And what about inner headers
> and propogating inner to outer. Its a mess.
yep. we tried, by following the proposed standards for webrtc and so
on, to come up with
a sane set of interpretations. If you think diffserv was a messy idea... see:
https://datatracker.ietf.org/doc/draft-white-tsvwg-nqb/
and for that matter the recent thread on a new hop by hop options
header from china telecom on netdev titled
"net: ipv6: support Application-aware IPv6 Network (APN6)"
inmates, asylum, running amuck.
>
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] DSCP ramblings
2020-04-22 16:20 1% ` Dave Taht
@ 2020-04-22 16:44 1% ` Stephen Hemminger
2020-04-22 16:58 1% ` Dave Taht
2020-04-23 10:50 0% ` Kevin Darbyshire-Bryant
2020-04-22 17:17 0% ` Kevin Darbyshire-Bryant
1 sibling, 2 replies; 200+ results
From: Stephen Hemminger @ 2020-04-22 16:44 UTC (permalink / raw)
To: Dave Taht; +Cc: Kevin Darbyshire-Bryant, Cake List
On Wed, 22 Apr 2020 09:20:29 -0700
Dave Taht <dave.taht@gmail.com> wrote:
> and because of your I'm off building collectd because those graphs
> look so good. :)
>
> https://forum.openwrt.org/t/sqm-reporting/59960/24
>
> I have long just used snmpd, and collectd looks interesting. I fear
> it's too heavyweight, particularly shelling out to a script....
>
> On Wed, Apr 22, 2020 at 9:15 AM Dave Taht <dave.taht@gmail.com> wrote:
> >
> > On Wed, Apr 22, 2020 at 8:58 AM Kevin Darbyshire-Bryant
> > <kevin@darbyshire-bryant.me.uk> wrote:
> > >
> > > During these strange times of lockdown I’ve been trying to keep myself occupied/entertained/sane(???) by ‘fiddling with stuff’ and improving my coding. This started with an idea of learning Python which was great until the on-line bit of it ran out and someone posted an idea on the Openwrt forum about graphing Cake stats.
> > >
> > > That had nothing to do with Python and involved (new to me) technologies such as ‘collectd’, ‘JSON’, a bit of javascript and my usual level of cobbling something together in ‘ash’…. So that course was well spent :-)
> > >
> > > Anyway, data was collected and graphs produced in a very small household. What’s immediately apparent from those graphs and cake in ‘diffserv4’ mode is that very, very few applications are using DSCP at all. Most things are to port 443.
> > >
> > > I was also a little surprised to see that my DNS over foo proxies such as stubby & https-dns-proxy don’t use DSCP coding. It surprised me even more to see RFC recommendations that DNS be treated as ‘Best Effort’. Now in the days of udp only and no dnssec (with fallback to tcp) this may be good enough, but I wonder if this is realistic these days?
> > >
> > > So putting aside the discussion of what codepoint should be used, I then wondered how hard it would be to actually set a dscp in these applications. And this is where I had another surprise. For example https-dns-proxy uses libcurl. libcurl has no standard ‘in-library’ method for setting a socket’s dscp. I cobbled a workaround in the application https://github.com/aarond10/https_dns_proxy/pull/83 - it works.
> > >
> > > Next I attacked stubby, which uses getdns. getdns doesn’t even have a callback or parameters passing so you can set a dscp on the socket from a client application, pure ‘hack the library’ stuff.
> > >
> > > To be blunt and on a small sample of 2 libraries/applications, it seems that DSCP is completely ignored. Applications signalling ’this is/isnt latency sensitive/bulk’ isn’t going to happen if it isn’t easy to do.
> > >
> > > Apple should be marking facetime calls as being ‘video conference’ or whatever. BBC iplayer Radio apps should be marking ‘audio streaming’. But every f*ing thing is CS0 port 443. And I’m wondering how much of this is because library support is simply missing. Maybe gaming apps are better? (I don’t game)
> > >
> > > Right, I’m off for a lie down. Sorry for the rant.
> >
> > Welcome to my explorations... in 2011. Diffserv is rather underused, isn't it?
> >
> > I took a survey of every (500+) gaming console at a convention. nearly
> > zero diffserv usage and it was all over the map, and I think, mostly,
> > from osx.
> >
> > windows requires admin privs to set the tos bits at all
> > webrtc has an api to set the bits, but it doesn't work on windows.
> >
> > ssh will set the imm bit for interactive, I forget what it sets for bulk
> > bgp sets cs6. so does babel. Arguably both usages are wrong.
> > some windows stuff sets cs1 for things like ping
> > I got the mosh folk to use AF42 as a (worldwide) test, for nearly a
> > year. they had one user with a problem and they turned it off. It was
> > funny, keith thought I was making an expert recommendation rather than
> > a test and just copy pasted my code into the tree and shipped it.
> >
> > linux implements a strict priority queue in pfifo_fast. You can dos it
> > if you hit it by setting the bits.
> > irtt and netperf let you set the bits. iperf also.
> >
> > I produced a patch for rsync in particular (since I use it heavily)
> >
> > sqm at least used to mark dns and ntp as some elivated prio, but I
> > forget which and for all I know the cake qos system doesn't implement
> > those filters.
> >
> > A few multi-queue ethernet devices actually do interpret the bits.
> > Undocumented as to which one..
> >
> > and lets not get started on ecn.
> >
> > >
> > >
> > > Hack for getdns/stubby
> > >
> > > diff --git a/src/stub.c b/src/stub.c
> > > index 2547d10f..7e47aba5 100644
> > > --- a/src/stub.c
> > > +++ b/src/stub.c
> > > @@ -52,6 +52,7 @@
> > > #include "platform.h"
> > > #include "general.h"
> > > #include "pubkey-pinning.h"
> > > +#include <netinet/ip.h>
> > >
> > > /* WSA TODO:
> > > * STUB_TCP_RETRY added to deal with edge triggered event loops (versus
> > > @@ -381,6 +382,9 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> > > # else
> > > static const int enable = 1;
> > > # endif
> > > +#endif
> > > +#if defined(IP_TOS)
> > > + int dscp = IPTOS_CLASS_CS4;
> > > #endif
> > > int fd = -1;
> > >
> > > @@ -390,6 +394,12 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> > > __FUNC__, (void*)upstream);
> > > if ((fd = socket(upstream->addr.ss_family, SOCK_STREAM, IPPROTO_TCP)) == -1)
> > > return -1;
> > > +#if defined(IP_TOS)
> > > + if (upstream->addr.ss_family == AF_INET6)
> > > + (void)setsockopt(fd, IPPROTO_IPV6, IP_TOS, &dscp, sizeof(dscp));
> > > + else if (upstream->addr.ss_family == AF_INET)
> > > + (void)setsockopt(fd, IPPROTO_IP, IP_TOS, &dscp, sizeof(dscp));
> > > +#endif
> > >
> > >
> > > Cheers,
> > >
> > > Kevin D-B
> > >
> > > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> > >
In my experience, except for a small number of cases (RDMA etc) Diffserv is a
complete waste of time. There is no global ordering, there is no guarantee against
starvation and any sane ISP strips the bits off or ignores them.
Diffserv is even an issue at scale in the cloud. What does DSCP mean exactly on
outer headers, who gets to decide for which service. And what about inner headers
and propagating inner to outer. It's a mess.
^ permalink raw reply [relevance 1%]
* Re: [Cake] DSCP ramblings
2020-04-22 16:15 1% ` Dave Taht
@ 2020-04-22 16:20 1% ` Dave Taht
2020-04-22 16:44 1% ` Stephen Hemminger
2020-04-22 17:17 0% ` Kevin Darbyshire-Bryant
0 siblings, 2 replies; 200+ results
From: Dave Taht @ 2020-04-22 16:20 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
and because of you I'm off building collectd because those graphs
look so good. :)
https://forum.openwrt.org/t/sqm-reporting/59960/24
I have long just used snmpd, and collectd looks interesting. I fear
it's too heavyweight, particularly shelling out to a script....
On Wed, Apr 22, 2020 at 9:15 AM Dave Taht <dave.taht@gmail.com> wrote:
>
> On Wed, Apr 22, 2020 at 8:58 AM Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
> >
> > During these strange times of lockdown I’ve been trying to keep myself occupied/entertained/sane(???) by ‘fiddling with stuff’ and improving my coding. This started with an idea of learning Python which was great until the on-line bit of it ran out and someone posted an idea on the Openwrt forum about graphing Cake stats.
> >
> > That had nothing to do with Python and involved (new to me) technologies such as ‘collectd’, ‘JSON’, a bit of javascript and my usual level of cobbling something together in ‘ash’…. So that course was well spent :-)
> >
> > Anyway, data was collected and graphs produced in a very small household. What’s immediately apparent from those graphs and cake in ‘diffserv4’ mode is that very, very few applications are using DSCP at all. Most things are to port 443.
> >
> > I was also a little surprised to see that my DNS over foo proxies such as stubby & https-dns-proxy don’t use DSCP coding. It surprised me even more to see RFC recommendations that DNS be treated as ‘Best Effort’. Now in the days of udp only and no dnssec (with fallback to tcp) this may be good enough, but I wonder if this is realistic these days?
> >
> > So putting aside the discussion of what codepoint should be used, I then wondered how hard it would be to actually set a dscp in these applications. And this is where I had another surprise. For example https-dns-proxy uses libcurl. libcurl has no standard ‘in-library’ method for setting a socket’s dscp. I cobbled a workaround in the application https://github.com/aarond10/https_dns_proxy/pull/83 - it works.
> >
> > Next I attacked stubby, which uses getdns. getdns doesn’t even have a callback or parameters passing so you can set a dscp on the socket from a client application, pure ‘hack the library’ stuff.
> >
> > To be blunt and on a small sample of 2 libraries/applications, it seems that DSCP is completely ignored. Applications signalling ’this is/isnt latency sensitive/bulk’ isn’t going to happen if it isn’t easy to do.
> >
> > Apple should be marking facetime calls as being ‘video conference’ or whatever. BBC iplayer Radio apps should be marking ‘audio streaming’. But every f*ing thing is CS0 port 443. And I’m wondering how much of this is because library support is simply missing. Maybe gaming apps are better? (I don’t game)
> >
> > Right, I’m off for a lie down. Sorry for the rant.
>
> Welcome to my explorations... in 2011. Diffserv is rather underused, isn't it?
>
> I took a survey of every (500+) gaming console at a convention. nearly
> zero diffserv usage and it was all over the map, and I think, mostly,
> from osx.
>
> windows requires admin privs to set the tos bits at all
> webrtc has an api to set the bits, but it doesn't work on windows.
>
> ssh will set the imm bit for interactive, I forget what it sets for bulk
> bgp sets cs6. so does babel. Arguably both usages are wrong.
> some windows stuff sets cs1 for things like ping
> I got the mosh folk to use AF42 as a (worldwide) test, for nearly a
> year. they had one user with a problem and they turned it off. It was
> funny, keith thought I was making an expert recommendation rather than
> a test and just copy pasted my code into the tree and shipped it.
>
> linux implements a strict priority queue in pfifo_fast. You can dos it
> if you hit it by setting the bits.
> irtt and netperf let you set the bits. iperf also.
>
> I produced a patch for rsync in particular (since I use it heavily)
>
> sqm at least used to mark dns and ntp as some elivated prio, but I
> forget which and for all I know the cake qos system doesn't implement
> those filters.
>
> A few multi-queue ethernet devices actually do interpret the bits.
> Undocumented as to which one..
>
> and lets not get started on ecn.
>
> >
> >
> > Hack for getdns/stubby
> >
> > diff --git a/src/stub.c b/src/stub.c
> > index 2547d10f..7e47aba5 100644
> > --- a/src/stub.c
> > +++ b/src/stub.c
> > @@ -52,6 +52,7 @@
> > #include "platform.h"
> > #include "general.h"
> > #include "pubkey-pinning.h"
> > +#include <netinet/ip.h>
> >
> > /* WSA TODO:
> > * STUB_TCP_RETRY added to deal with edge triggered event loops (versus
> > @@ -381,6 +382,9 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> > # else
> > static const int enable = 1;
> > # endif
> > +#endif
> > +#if defined(IP_TOS)
> > + int dscp = IPTOS_CLASS_CS4;
> > #endif
> > int fd = -1;
> >
> > @@ -390,6 +394,12 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> > __FUNC__, (void*)upstream);
> > if ((fd = socket(upstream->addr.ss_family, SOCK_STREAM, IPPROTO_TCP)) == -1)
> > return -1;
> > +#if defined(IP_TOS)
> > + if (upstream->addr.ss_family == AF_INET6)
> > + (void)setsockopt(fd, IPPROTO_IPV6, IP_TOS, &dscp, sizeof(dscp));
> > + else if (upstream->addr.ss_family == AF_INET)
> > + (void)setsockopt(fd, IPPROTO_IP, IP_TOS, &dscp, sizeof(dscp));
> > +#endif
> >
> >
> > Cheers,
> >
> > Kevin D-B
> >
> > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> >
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] DSCP ramblings
@ 2020-04-22 16:15 1% ` Dave Taht
2020-04-22 16:20 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Dave Taht @ 2020-04-22 16:15 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
On Wed, Apr 22, 2020 at 8:58 AM Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
>
> During these strange times of lockdown I’ve been trying to keep myself occupied/entertained/sane(???) by ‘fiddling with stuff’ and improving my coding. This started with an idea of learning Python which was great until the on-line bit of it ran out and someone posted an idea on the Openwrt forum about graphing Cake stats.
>
> That had nothing to do with Python and involved (new to me) technologies such as ‘collectd’, ‘JSON’, a bit of javascript and my usual level of cobbling something together in ‘ash’…. So that course was well spent :-)
>
> Anyway, data was collected and graphs produced in a very small household. What’s immediately apparent from those graphs and cake in ‘diffserv4’ mode is that very, very few applications are using DSCP at all. Most things are to port 443.
>
> I was also a little surprised to see that my DNS over foo proxies such as stubby & https-dns-proxy don’t use DSCP coding. It surprised me even more to see RFC recommendations that DNS be treated as ‘Best Effort’. Now in the days of udp only and no dnssec (with fallback to tcp) this may be good enough, but I wonder if this is realistic these days?
>
> So putting aside the discussion of what codepoint should be used, I then wondered how hard it would be to actually set a dscp in these applications. And this is where I had another surprise. For example https-dns-proxy uses libcurl. libcurl has no standard ‘in-library’ method for setting a socket’s dscp. I cobbled a workaround in the application https://github.com/aarond10/https_dns_proxy/pull/83 - it works.
>
> Next I attacked stubby, which uses getdns. getdns doesn’t even have a callback or parameters passing so you can set a dscp on the socket from a client application, pure ‘hack the library’ stuff.
>
> To be blunt and on a small sample of 2 libraries/applications, it seems that DSCP is completely ignored. Applications signalling ’this is/isnt latency sensitive/bulk’ isn’t going to happen if it isn’t easy to do.
>
> Apple should be marking facetime calls as being ‘video conference’ or whatever. BBC iplayer Radio apps should be marking ‘audio streaming’. But every f*ing thing is CS0 port 443. And I’m wondering how much of this is because library support is simply missing. Maybe gaming apps are better? (I don’t game)
>
> Right, I’m off for a lie down. Sorry for the rant.
Welcome to my explorations... in 2011. Diffserv is rather underused, isn't it?
I took a survey of every (500+) gaming console at a convention. nearly
zero diffserv usage and it was all over the map, and I think, mostly,
from osx.
windows requires admin privs to set the tos bits at all
webrtc has an api to set the bits, but it doesn't work on windows.
ssh will set the imm bit for interactive, I forget what it sets for bulk
bgp sets cs6. so does babel. Arguably both usages are wrong.
some windows stuff sets cs1 for things like ping
I got the mosh folk to use AF42 as a (worldwide) test, for nearly a
year. they had one user with a problem and they turned it off. It was
funny, keith thought I was making an expert recommendation rather than
a test and just copy pasted my code into the tree and shipped it.
linux implements a strict priority queue in pfifo_fast. You can DoS it
if you hit it by setting the bits.
irtt and netperf let you set the bits. iperf also.
I produced a patch for rsync in particular (since I use it heavily)
sqm at least used to mark dns and ntp as some elevated prio, but I
forget which and for all I know the cake qos system doesn't implement
those filters.
A few multi-queue ethernet devices actually do interpret the bits.
Undocumented as to which one..
and let's not get started on ecn.
>
>
> Hack for getdns/stubby
>
> diff --git a/src/stub.c b/src/stub.c
> index 2547d10f..7e47aba5 100644
> --- a/src/stub.c
> +++ b/src/stub.c
> @@ -52,6 +52,7 @@
> #include "platform.h"
> #include "general.h"
> #include "pubkey-pinning.h"
> +#include <netinet/ip.h>
>
> /* WSA TODO:
> * STUB_TCP_RETRY added to deal with edge triggered event loops (versus
> @@ -381,6 +382,9 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> # else
> static const int enable = 1;
> # endif
> +#endif
> +#if defined(IP_TOS)
> + int dscp = IPTOS_CLASS_CS4;
> #endif
> int fd = -1;
>
> @@ -390,6 +394,12 @@ tcp_connect(getdns_upstream *upstream, getdns_transport_list_t transport)
> __FUNC__, (void*)upstream);
> if ((fd = socket(upstream->addr.ss_family, SOCK_STREAM, IPPROTO_TCP)) == -1)
> return -1;
> +#if defined(IP_TOS)
> + if (upstream->addr.ss_family == AF_INET6)
> + (void)setsockopt(fd, IPPROTO_IPV6, IP_TOS, &dscp, sizeof(dscp));
> + else if (upstream->addr.ss_family == AF_INET)
> + (void)setsockopt(fd, IPPROTO_IP, IP_TOS, &dscp, sizeof(dscp));
> +#endif
>
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-22 14:48 1% ` Dave Taht
@ 2020-04-22 15:28 1% ` Luca Muscariello
2020-04-22 17:42 2% ` David P. Reed
2020-04-23 9:29 1% ` Maxime Bizon
1 sibling, 1 reply; 200+ results
From: Luca Muscariello @ 2020-04-22 15:28 UTC (permalink / raw)
To: Dave Taht; +Cc: Jonathan Morton, Cake List, Maxime Bizon
[-- Attachment #1: Type: text/plain, Size: 6687 bytes --]
On Wed, Apr 22, 2020 at 4:48 PM Dave Taht <dave.taht@gmail.com> wrote:
> On Wed, Apr 22, 2020 at 2:04 AM Luca Muscariello <muscariello@ieee.org>
> wrote:
> >
> >
> >
> > On Wed, Apr 22, 2020 at 12:44 AM Dave Taht <dave.taht@gmail.com> wrote:
> >>
> >> On Tue, Apr 21, 2020 at 3:33 PM Jonathan Morton <chromatix99@gmail.com>
> wrote:
> >> >
> >> > > On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
> >> > >
> >> > > My curiosity is piqued. Can you elaborate on this? What does
> free.fr do?
> >> >
> >> > They're a large French ISP. They made their own CPE devices, and
> debloated both them and their network quite a while ago. In that sense, at
> least, they're a model for others to follow - but few have.
> >> >
> >> > - Jonathan Morton
> >>
> >> they are one of the few ISPs that insisted on getting full source code
> >> to their DSL stack, and retained the chops to be able to modify it. I
> >> really admire their revolution v6 product. First introduced in 2010,
> >> it's been continuously updated, did ipv6 at the outset, got fq_codel
> >> when it first came out, and they update the kernel regularly. All
> >> kinds of great features on it, and ecn is enabled by default for those
> >> also (things like samba). over 3 million boxes now I hear....
> >>
> >> with <1ms of delay in the dsl driver, they don't need to shape, they
> >> just run at line rate using three tiers of DRR that look a lot like
> >> cake. They shared their config with me, and before I lost heart for
> >> future internet drafts, I'd stuck it here:
> >>
> >>
> https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd
> >>
> >> Occasionally they share some data with me. Sometimes I wish I lived in
> >> paris just so I could have good internet! (their fiber offering is
> >> reasonably buffered (not fq_codeled) and the wifi... maybe I can get
> >> them to talk about what they did)
> >>
> >> When free.fr shipped fq_codel 2 months after we finalized it, I
> >> figured the rest of the world was only months behind. How hard is it
> >> to add 50 lines of BQL oriented code to a DSL firmware?
> >>
> >
> > Free has been using SFQ since 2005 (if I remember well).
> > They announced the wide deployment of SFQ in the free.fr newsgroup.
> > Wi-Fi in the free.fr router was not as good though.
>
> They're working on it. :)
>
> > In Paris there is a lot of GPON now that is replacing DSL. But there is
> > a nation-wide effort funded by local administrations to get fiber
> > everywhere. There are small towns in the countryside with fiber.
> > Public money has made, and is making that possible.
> > There is still a little of Euro-DOCSIS, but frankly compared to fiber
> > it has no chance to survive.
>
> I am very, very happy for y'all. Fiber has always been the sanest
> thing. Is there
> an SFP+ GPON card yet that I can plug into a conventional open source router?
>
> >
> > I currently have 2Gbps/600Mbps access with orange.fr and free.fr has a
> subscription
> > at 10Gbps GPON. I won't tell you the price because you may feel depressed
> > compared to other countries where prices are much higher.
>
> I'd emigrate!!!
>
> > The challenge becomes to keep up with these link rates in software
> > as there is a lot of hardware offloading.
>
I just meant that these routers tend to use HW offloading
and kernel qdiscs may be bypassed.
>
> At this point, I kind of buy the stanford sqrt(bdp) argument. All you
> really need for gigE+ fiber access to work well
> for most modern traffic is a fairly short fifo (say, 20ms). Any form
> of FQ would help but be hardly noticeable. I think
> there needs to be work on the hop between the internet and the
> subscriber...
>
> Web traffic is dominated by RTT above 40mbit (presently).
> streaming video traffic is no more than 20Mbit, and your occasional
> big download is a dozen big streams that would
> bounce off a short fifo well.
> gbit access to the home is (admittedly glorious, wonderful!) overkill
> for all present forms of traffic.
>
> I'm pretty sure if I had gig fiber I could come up with a way to use
> it up (exiting the cloud entirely comes to mind), but
> lacking new applications that demand that much bandwidth...
>
> I of course, would like to see lola ( https://lola.conts.it/ ) finally
> work, and videoconferencing and game streaming with high rates and faster
> (even raw) encoding also has potential to reduce e2e latencies
> enormously at that layer.
>
> >
> > As soon as 802.11ax becomes the norm, software scheduling will become
> > a challenge.
>
> Do you mean in fiber or wireless? wireless is really problematic at ANY
> speed.
>
I meant that software scheduling becomes a challenge for the same
reason as above: an increase in the total throughput of the box
will call for hardware offloading, and the kernel qdisc may be bypassed.
It is not a challenge per se; it is a challenge because traffic
may not be managed by the kernel.
>
> at gfiber, the buffering moved to the wifi, and there are other
> problems that really impact achievable bandwidth. When I was last in
> paris, I could "hear" 300+ access points from my apt, and could only
> get 100-200kbit per second out of the wireless n ap I had, unless I
> cheated and stuck my traffic in the VI queue. A friend of mine there,
> couldn't even get wifi across the room! Beacons ate into a lot of the
> available
> bandwidth. Since 5ghz (and soon 6ghz - is 6E a thing in france) is
> shorter range I'm hoping that's got better, but with
> 802.11ac and ax peeing on half the wifi spectrum by default, I imagine
> achievable rates in high density locations with many APs will be very
> low... and very jittery... and thus still require good ATF, fq, and
> aqm technologies.
>
> I have high hopes for OFDMA and DU but thus far haven't found an AP
> doing it. I'm not sure what to do about the beaconing problem except
> offer a free trade-in to all my neighbors still emitting G style
> frames....
>
> And in looking over some preliminary code for the mt76 ax chip, I
> worry about both bad design of the firmware, and
> insufficient resources on-chip to manage well.
>
> How is the 5G rollout going in france?
>
Good question. I've just seen a Gbps speed test on a phone,
which can drain your battery in less than 5 minutes. Amazing tech!
>
> I recently learned that much of japan is... wait for it... wimax.
>
> >
> > Luca
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>
[-- Attachment #2: Type: text/html, Size: 9795 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-22 9:03 1% ` Luca Muscariello
@ 2020-04-22 14:48 1% ` Dave Taht
2020-04-22 15:28 1% ` Luca Muscariello
2020-04-23 9:29 1% ` Maxime Bizon
0 siblings, 2 replies; 200+ results
From: Dave Taht @ 2020-04-22 14:48 UTC (permalink / raw)
To: Luca Muscariello; +Cc: Jonathan Morton, Cake List, Maxime Bizon
On Wed, Apr 22, 2020 at 2:04 AM Luca Muscariello <muscariello@ieee.org> wrote:
>
>
>
> On Wed, Apr 22, 2020 at 12:44 AM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> On Tue, Apr 21, 2020 at 3:33 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>> >
>> > > On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
>> > >
>> > > My curiosity is piqued. Can you elaborate on this? What does free.fr do?
>> >
>> > They're a large French ISP. They made their own CPE devices, and debloated both them and their network quite a while ago. In that sense, at least, they're a model for others to follow - but few have.
>> >
>> > - Jonathan Morton
>>
>> they are one of the few ISPs that insisted on getting full source code
>> to their DSL stack, and retained the chops to be able to modify it. I
>> really admire their revolution v6 product. First introduced in 2010,
>> it's been continuously updated, did ipv6 at the outset, got fq_codel
>> when it first came out, and they update the kernel regularly. All
>> kinds of great features on it, and ecn is enabled by default for those
>> also (things like samba). over 3 million boxes now I hear....
>>
>> with <1ms of delay in the dsl driver, they don't need to shape, they
>> just run at line rate using three tiers of DRR that look a lot like
>> cake. They shared their config with me, and before I lost heart for
>> future internet drafts, I'd stuck it here:
>>
>> https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd
>>
>> Occasionally they share some data with me. Sometimes I wish I lived in
>> paris just so I could have good internet! (their fiber offering is
>> reasonably buffered (not fq_codeled) and the wifi... maybe I can get
>> them to talk about what they did)
>>
>> When free.fr shipped fq_codel 2 months after we finalized it, I
>> figured the rest of the world was only months behind. How hard is it
>> to add 50 lines of BQL oriented code to a DSL firmware?
>>
>
> Free has been using SFQ since 2005 (if I remember well).
> They announced the wide deployment of SFQ in the free.fr newsgroup.
> Wi-Fi in the free.fr router was not as good though.
They're working on it. :)
> In Paris there is a lot of GPON now that is replacing DSL. But there is
> a nation-wide effort funded by local administrations to get fiber
> everywhere. There are small towns in the countryside with fiber.
> Public money has made, and is making that possible.
> There is still a little of Euro-DOCSIS, but frankly compared to fiber
> it has no chance to survive.
I am very, very happy for y'all. Fiber has always been the sanest
thing. Is there
an SFP+ GPON card yet that I can plug into a conventional open source router?
>
> I currently have 2Gbps/600Mbps access with orange.fr and free.fr has a subscription
> at 10Gbps GPON. I won't tell you the price because you may feel depressed
> compared to other countries where prices are much higher.
I'd emigrate!!!
> The challenge becomes to keep up with these link rates in software
> as there is a lot of hardware offloading.
At this point, I kind of buy the stanford sqrt(bdp) argument. All you
really need for gigE+ fiber access to work well
for most modern traffic is a fairly short fifo (say, 20ms). Any form
of FQ would help but be hardly noticeable. I think
there needs to be work on the hop between the internet and the subscriber...
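The arithmetic behind that "fairly short fifo (say, 20ms)" is easy to sketch; the 1 Gbit/s line rate below is an assumed example, not a figure from the thread:

```shell
# Back-of-envelope FIFO sizing: a 20 ms buffer at an assumed 1 Gbit/s
# line rate holds about 2.5 MB.
rate_bps=$((1000 * 1000 * 1000))            # link rate in bits/s (example)
fifo_ms=20                                  # target FIFO depth in ms
bytes=$(( rate_bps / 8 * fifo_ms / 1000 ))  # bits/s -> bytes/s -> bytes
echo "$bytes"                               # prints 2500000 (~2.5 MB)
```

Compare against a full-BDP buffer at 100 ms RTT (12.5 MB at the same rate) to see why the short FIFO is a 5x reduction before any sqrt(n-flows) scaling is even applied.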
Web traffic is dominated by RTT above 40mbit (presently).
streaming video traffic is no more than 20Mbit, and your occasional
big download is a dozen big streams that would
bounce off a short fifo well.
gbit access to the home is (admittedly glorious, wonderful!) overkill
for all present forms of traffic.
I'm pretty sure if I had gig fiber I could come up with a way to use
it up (exiting the cloud entirely comes to mind), but
lacking new applications that demand that much bandwidth...
I of course, would like to see lola ( https://lola.conts.it/ ) finally
work, and videoconferencing and game streaming with high rates and faster
(even raw) encoding also has potential to reduce e2e latencies
enormously at that layer.
>
> As soon as 802.11ax becomes the norm, software scheduling will become
> a challenge.
Do you mean in fiber or wireless? wireless is really problematic at ANY speed.
at gfiber, the buffering moved to the wifi, and there are other
problems that really impact achievable bandwidth. When I was last in
paris, I could "hear" 300+ access points from my apt, and could only
get 100-200kbit per second out of the wireless n ap I had, unless I
cheated and stuck my traffic in the VI queue. A friend of mine there,
couldn't even get wifi across the room! Beacons ate into a lot of the
available
bandwidth. Since 5ghz (and soon 6ghz - is 6E a thing in france) is
shorter range I'm hoping that's got better, but with
802.11ac and ax peeing on half the wifi spectrum by default, I imagine
achievable rates in high density locations with many APs will be very
low... and very jittery... and thus still require good ATF, fq, and
aqm technologies.
I have high hopes for OFDMA and DU but thus far haven't found an AP
doing it. I'm not sure what to do about the beaconing problem except
offer a free trade-in to all my neighbors still emitting G style
frames....
And in looking over some preliminary code for the mt76 ax chip, I
worry about both bad design of the firmware, and
insufficient resources on-chip to manage well.
How is the 5G rollout going in france?
I recently learned that much of japan is... wait for it... wimax.
>
> Luca
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 22:44 1% ` Dave Taht
2020-04-22 8:28 0% ` Thibaut
@ 2020-04-22 9:03 1% ` Luca Muscariello
2020-04-22 14:48 1% ` Dave Taht
2 siblings, 1 reply; 200+ results
From: Luca Muscariello @ 2020-04-22 9:03 UTC (permalink / raw)
To: Dave Taht; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 2814 bytes --]
On Wed, Apr 22, 2020 at 12:44 AM Dave Taht <dave.taht@gmail.com> wrote:
> On Tue, Apr 21, 2020 at 3:33 PM Jonathan Morton <chromatix99@gmail.com>
> wrote:
> >
> > > On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
> > >
> > > My curiosity is piqued. Can you elaborate on this? What does free.fr
> do?
> >
> > They're a large French ISP. They made their own CPE devices, and
> debloated both them and their network quite a while ago. In that sense, at
> least, they're a model for others to follow - but few have.
> >
> > - Jonathan Morton
>
> they are one of the few ISPs that insisted on getting full source code
> to their DSL stack, and retained the chops to be able to modify it. I
> really admire their revolution v6 product. First introduced in 2010,
> it's been continuously updated, did ipv6 at the outset, got fq_codel
> when it first came out, and they update the kernel regularly. All
> kinds of great features on it, and ecn is enabled by default for those
> also (things like samba). over 3 million boxes now I hear....
>
> with <1ms of delay in the dsl driver, they don't need to shape, they
> just run at line rate using three tiers of DRR that look a lot like
> cake. They shared their config with me, and before I lost heart for
> future internet drafts, I'd stuck it here:
>
>
> https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd
>
> Occasionally they share some data with me. Sometimes I wish I lived in
> paris just so I could have good internet! (their fiber offering is
> reasonably buffered (not fq_codeled) and the wifi... maybe I can get
> them to talk about what they did)
>
> When free.fr shipped fq_codel 2 months after we finalized it, I
> figured the rest of the world was only months behind. How hard is it
> to add 50 lines of BQL oriented code to a DSL firmware?
>
>
Free has been using SFQ since 2005 (if I remember well).
They announced the wide deployment of SFQ in the free.fr newsgroup.
Wi-Fi in the free.fr router was not as good though.
In Paris there is a lot of GPON now that is replacing DSL. But there is
a nation-wide effort funded by local administrations to get fiber
everywhere. There are small towns in the countryside with fiber.
Public money has made, and is making that possible.
There is still a little of Euro-DOCSIS, but frankly compared to fiber
it has no chance to survive.
I currently have 2Gbps/600Mbps access with orange.fr and free.fr has a
subscription
at 10Gbps GPON. I won't tell you the price because you may feel depressed
compared to other countries where prices are much higher.
The challenge becomes to keep up with these link rates in software
as there is a lot of hardware offloading.
As soon as 802.11ax becomes the norm, software scheduling will become
a challenge.
Luca
[-- Attachment #2: Type: text/html, Size: 5457 bytes --]
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 22:44 1% ` Dave Taht
@ 2020-04-22 8:28 0% ` Thibaut
2020-04-22 9:03 1% ` Luca Muscariello
2 siblings, 0 replies; 200+ results
From: Thibaut @ 2020-04-22 8:28 UTC (permalink / raw)
To: Dave Taht; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 2889 bytes --]
> Le 22 avr. 2020 à 00:44, Dave Taht <dave.taht@gmail.com> a écrit :
>
> On Tue, Apr 21, 2020 at 3:33 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>>
>>> On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
>>>
>>> My curiosity is piqued. Can you elaborate on this? What does free.fr do?
>>
>> They're a large French ISP. They made their own CPE devices, and debloated both them and their network quite a while ago. In that sense, at least, they're a model for others to follow - but few have.
>>
>> - Jonathan Morton
>
> they are one of the few ISPs that insisted on getting full source code
> to their DSL stack, and retained the chops to be able to modify it. I
> really admire their revolution v6 product. First introduced in 2010,
> it's been continuously updated, did ipv6 at the outset, got fq_codel
> when it first came out, and they update the kernel regularly. All
> kinds of great features on it, and ecn is enabled by default for those
> also (things like samba). over 3 million boxes now I hear....
>
> with <1ms of delay in the dsl driver, they don't need to shape, they
> just run at line rate using three tiers of DRR that look a lot like
> cake. They shared their config with me, and before I lost heart for
> future internet drafts, I'd stuck it here:
>
> https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd <https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd>
Very interesting, thanks. I wonder if they trickled down these improvements to the older V5: I had to plug my old V5 back in after my DSLAM was moved to native IPV6 (I couldn’t find a way to talk to it over the VDSL2 modem, as I did before the switch), and though there was a massive drop in uplink bandwidth (from 10Mbps VDSL2 to 1Mbps ADSL), I noticed that I no longer needed cake on the router wan interface. Latency remained very well controlled without having to do anything special. In fact, enabling cake with the previous settings was wreaking havoc! (Maybe it interfered with whatever Free is doing in the box).
> Occasionally they share some data with me. Sometimes I wish I lived in
> paris just so I could have good internet! (their fiber offering is
> reasonably buffered (not fq_codeled) and the wifi... maybe I can get
> them to talk about what they did)
You don’t have to live in Paris to enjoy good internet: I’m currently stranded in the countryside and I enjoy a better connection than many a Parisian, thanks to a public/private FTTH network that appears to be very well handled by my current ISP (K-Net) :)
> When free.fr shipped fq_codel 2 months after we finalized it, I
> figured the rest of the world was only months behind. How hard is it
> to add 50 lines of BQL oriented code to a DSL firmware?
Heh.
Cheers,
Thibaut
[-- Attachment #2: Type: text/html, Size: 4220 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 23:07 0% ` Jonathan Morton
@ 2020-04-21 23:27 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-04-21 23:27 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Thibaut, Cake List
On Tue, Apr 21, 2020 at 4:07 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> > On 22 Apr, 2020, at 1:50 am, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > Jon, how's SCE looking? ready for a backport yet?
>
> We can't do any sort of wide deployment of SCE until it's been approved as an Internet experiment by the IETF.
If the present algorithm on the qdisc is stable, I'd like to try it on wifi.
> Interested individuals can already compile the SCE-enabled kernel and jump through the hoops to try things out - carefully. There's a bit of infrastructure code to go with the new TCP algorithms and qdiscs, so I'm not certain how easy a backport would be; better to just build the (relatively) current code for now.
Not on openwrt. It seemed easy to backport just cake there, and the
rest of the code on dedicated servers and clients. Similarly I'd try
for the wifi attempt also.
I care about the cpu impact a lot. Also, a recent string of postings on
netdev (which I have had too much PTSD to reply to) seems to indicate that
accecn *requires* that the tcp offload engine be disabled, which is
difficult to swallow. Can SCE work with tcp offloads enabled (on the
server, client, and qdisc)?
The same post claimed that apple proved we could "just turn ecn on",
and to explore that claim I updated my osx to the latest only to
immediately find apple's heuristics *disabled* attempts at ecn
negotiation on the second of two rrul tests, and I'd also poked into a
worldwide dataset that showed WAY less ecn attempts making it from
apple gear to the test server.
Another post (which I have not responded to either) pointed to an
improvement in the 3WHS that may or may not be genuinely useful, but
at that point, I went back to fixing wifi with what I knew worked.
The code itself was not bad. Perhaps some review of that set of
patches and thread is needed by some others with stronger stomachs.
https://www.spinics.net/lists/netdev/msg638882.html
>
> IETF TSVWG interim meeting next week (the second of two replacing planned in-person sessions at Vancouver) will discuss the big ECT(1) issue, which is hotly disputed between SCE and L4S. The key question is whether ECT(1) should become a classifier input to the network (orthogonal to Diffserv but with some of the same basic problems), or an additional congestion signal output from the network (indicating a lesser degree of congestion, to which a smaller and more nuanced response is desired). It's anyone's guess how that will turn out, but the technical merit is on our side and that really should count for something.
>
> If you're keeping an eye on the TSVWG list, expect a major bombshell to drop there in the next few days.
I do. I wish more did. Beverage in hand. :)
> - Jonathan Morton
>
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 23:06 1% ` Justin Kilpatrick
@ 2020-04-21 23:19 1% ` Dave Taht
0 siblings, 0 replies; 200+ results
From: Dave Taht @ 2020-04-21 23:19 UTC (permalink / raw)
To: Justin Kilpatrick; +Cc: Cake List
On Tue, Apr 21, 2020 at 4:07 PM Justin Kilpatrick <justin@althea.net> wrote:
>
> On Tue, Apr 21, 2020, at 2:44 PM, Dave Taht wrote:
> > It has always been my dream, that at least for outbound, there would
> > be sufficient backpressure from the driver
> > to not have to shape at all, or monitor the link. We have that now in
> > BQL and AQL. free.fr's dsl driver "does the right thing" - no other
> > dsl driver does. Nor usb network devices. I hope more folk roll up
> > their sleeves and test the ath10k some, it's looking lovely from here.
> >
> > https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/
> >
> > next up either the new mediatek chip or intel..
>
> I'm curious if you have any opinions about the WiFi stacks for the Marvell Armada
Marvell's wifi is currently "not horrible", but certainly overbuffered.
> and Qualcomm IPQ40xx.
I was under the impression this looked like an ath10k to the world. Am
I wrong? What products is it in these days?
> Any trees I should be barking up for better performance? We have had some complaints in higher interference areas...
>
> These devices have the best WireGuard performance per dollar for Althea's use case so we're deploying them pretty heavily.
>
> --
> Justin Kilpatrick
> justin@althea.net
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
@ 2020-04-21 23:07 0% ` Jonathan Morton
2020-04-21 23:27 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-21 23:07 UTC (permalink / raw)
To: Dave Taht; +Cc: Thibaut, Cake List
> On 22 Apr, 2020, at 1:50 am, Dave Taht <dave.taht@gmail.com> wrote:
>
> Jon, how's SCE looking? ready for a backport yet?
We can't do any sort of wide deployment of SCE until it's been approved as an Internet experiment by the IETF. Interested individuals can already compile the SCE-enabled kernel and jump through the hoops to try things out - carefully. There's a bit of infrastructure code to go with the new TCP algorithms and qdiscs, so I'm not certain how easy a backport would be; better to just build the (relatively) current code for now.
IETF TSVWG interim meeting next week (the second of two replacing planned in-person sessions at Vancouver) will discuss the big ECT(1) issue, which is hotly disputed between SCE and L4S. The key question is whether ECT(1) should become a classifier input to the network (orthogonal to Diffserv but with some of the same basic problems), or an additional congestion signal output from the network (indicating a lesser degree of congestion, to which a smaller and more nuanced response is desired). It's anyone's guess how that will turn out, but the technical merit is on our side and that really should count for something.
If you're keeping an eye on the TSVWG list, expect a major bombshell to drop there in the next few days.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 18:44 1% ` Dave Taht
2020-04-21 22:25 0% ` Thibaut
@ 2020-04-21 23:06 1% ` Justin Kilpatrick
2020-04-21 23:19 1% ` Dave Taht
1 sibling, 1 reply; 200+ results
From: Justin Kilpatrick @ 2020-04-21 23:06 UTC (permalink / raw)
To: Dave Taht; +Cc: cake
On Tue, Apr 21, 2020, at 2:44 PM, Dave Taht wrote:
> It has always been my dream, that at least for outbound, there would
> be sufficient backpressure from the driver
> to not have to shape at all, or monitor the link. We have that now in
> BQL and AQL. free.fr's dsl driver "does the right thing" - no other
> dsl driver does. Nor usb network devices. I hope more folk roll up
> their sleeves and test the ath10k some, it's looking lovely from here.
>
> https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/
>
> next up either the new mediatek chip or intel..
I'm curious if you have any opinions about the WiFi stacks for the Marvell Armada and Qualcomm IPQ40xx. Any trees I should be barking up for better performance? We have had some complaints in higher interference areas...
These devices have the best WireGuard performance per dollar for Althea's use case so we're deploying them pretty heavily.
--
Justin Kilpatrick
justin@althea.net
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 22:33 0% ` Jonathan Morton
@ 2020-04-21 22:44 1% ` Dave Taht
` (2 more replies)
0 siblings, 3 replies; 200+ results
From: Dave Taht @ 2020-04-21 22:44 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Thibaut, Cake List
On Tue, Apr 21, 2020 at 3:33 PM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> > On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
> >
> > My curiosity is piqued. Can you elaborate on this? What does free.fr do?
>
> They're a large French ISP. They made their own CPE devices, and debloated both them and their network quite a while ago. In that sense, at least, they're a model for others to follow - but few have.
>
> - Jonathan Morton
they are one of the few ISPs that insisted on getting full source code
to their DSL stack, and retained the chops to be able to modify it. I
really admire their revolution v6 product. First introduced in 2010,
it's been continuously updated, did ipv6 at the outset, got fq_codel
when it first came out, and they update the kernel regularly. All
kinds of great features on it, and ecn is enabled by default for those
also (things like samba). over 3 million boxes now I hear....
with <1ms of delay in the dsl driver, they don't need to shape, they
just run at line rate using three tiers of DRR that look a lot like
cake. They shared their config with me, and before I lost heart for
future internet drafts, I'd stuck it here:
https://github.com/dtaht/bufferbloat-rfcs/blob/master/home_gateway_queue_management/middle.mkd
Occasionally they share some data with me. Sometimes I wish I lived in
paris just so I could have good internet! (their fiber offering is
reasonably buffered (not fq_codeled) and the wifi... maybe I can get
them to talk about what they did)
When free.fr shipped fq_codel 2 months after we finalized it, I
figured the rest of the world was only months behind. How hard is it
to add 50 lines of BQL oriented code to a DSL firmware?
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
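For context on the BQL that Dave alludes to above: on drivers that implement it, the dynamic byte limits are exposed under sysfs, so you can check whether a given interface has BQL at all. The device and queue names below are examples; the files only exist on BQL-enabled drivers.

```shell
# Inspect BQL state for one tx queue (eth0/tx-0 are placeholder names).
bql=/sys/class/net/eth0/queues/tx-0/byte_queue_limits
for f in limit limit_min limit_max inflight; do
  printf '%s: %s\n' "$f" "$(cat "$bql/$f")"
done
```

If the `byte_queue_limits` directory is absent, the driver has no BQL and the qdisc sits behind an uncontrolled hardware ring.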
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 22:25 0% ` Thibaut
@ 2020-04-21 22:33 0% ` Jonathan Morton
2020-04-21 22:44 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-21 22:33 UTC (permalink / raw)
To: Thibaut; +Cc: Dave Taht, Cake List
> On 22 Apr, 2020, at 1:25 am, Thibaut <hacks@slashdirt.org> wrote:
>
> My curiosity is piqued. Can you elaborate on this? What does free.fr do?
They're a large French ISP. They made their own CPE devices, and debloated both them and their network quite a while ago. In that sense, at least, they're a model for others to follow - but few have.
- Jonathan Morton
^ permalink raw reply [relevance 0%]
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 18:44 1% ` Dave Taht
@ 2020-04-21 22:25 0% ` Thibaut
2020-04-21 22:33 0% ` Jonathan Morton
2020-04-21 23:06 1% ` Justin Kilpatrick
1 sibling, 1 reply; 200+ results
From: Thibaut @ 2020-04-21 22:25 UTC (permalink / raw)
To: Dave Taht; +Cc: Jonathan Morton, Cake List
[-- Attachment #1: Type: text/plain, Size: 3454 bytes --]
Hi,
> Le 21 avr. 2020 à 20:44, Dave Taht <dave.taht@gmail.com> a écrit :
>
> It has always been my dream, that at least for outbound, there would
> be sufficient backpressure from the driver
> to not have to shape at all, or monitor the link. We have that now in
> BQL and AQL. free.fr's dsl driver "does the right thing" - no other
> dsl driver does.
My curiosity is piqued. Can you elaborate on this? What does free.fr <http://free.fr/> do?
Thanks,
Thibaut
> Nor usb network devices. I hope more folk roll up
> their sleeves and test the ath10k some, it's looking lovely from here.
>
> https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/
>
> next up either the new mediatek chip or intel..
>
> On Tue, Apr 21, 2020 at 11:40 AM Jonathan Morton <chromatix99@gmail.com> wrote:
>>
>>> On 21 Apr, 2020, at 9:22 pm, Justin Kilpatrick <justin@althea.net> wrote:
>>>
>>> I have a frequently changing link I'm using automated tools to monitor and tune using Cake. Currently I'm only tuning bandwidth parameter using latency and packet loss data.
>>>
>>> My reading of the codel RFC seems to say that trying to tune the 'interval' value using known path and link latency won't provide any advantages over just tuning the bandwidth parameter.
>>>
>>> Obviously codel is just one part of the Cake setup and I'm wondering if there are any advantages I'm missing by not providing this extra input using data I already gather.
>>
>> The default latency parameters are tuned well for general Internet paths. The median path length on the public Internet is about 80ms, for which the default interval of 100ms and target of 5ms works well. Codel is also designed to accommodate a significant deviation from the expected path length without too much difficulty.
>>
>> I think it's only worth trying to adjust this if your typical path is substantially different from that norm. If all your traffic goes over a satellite link, for example, the default parameters might be too tight. If the vast majority of it goes to a local CDN, you could try the "metro" keyword to tighten things up a bit. Otherwise, you'll be fine.
>>
>> Also, most protocols are actually not very sensitive to how tight the AQM is set in the first place. Either they don't really care about latency at all (eg. bulk downloads) or they are latency-sensitive but also sparse (eg. DNS, NTP, VoIP). So they are more interested in being isolated from the influence of other flows, which Cake does pretty well regardless of the AQM settings.
>>
>> It's *considerably* more important to ensure that your shaper is configured correctly. That means setting not only the bandwidth parameter, but the overhead parameters as well. A bad shaper setting could result in some or all of your traffic not seeing Cake as the effective bottleneck, and thus not receiving its care. This can be an orders-of-magnitude effect, depending on just how bloated the underlying hardware is.
>>
>> - Jonathan Morton
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
[-- Attachment #2: Type: text/html, Size: 4760 bytes --]
^ permalink raw reply [relevance 0%]
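Jonathan's point above about setting the overhead parameters as well as bandwidth maps onto cake's link-layer keywords. A sketch, with device names and rates as illustrative placeholders (see tc-cake(8) for the full keyword list):

```shell
# Illustrative only: device names and bandwidths are placeholders.
# DOCSIS cable modem: 'docsis' tells cake the per-packet framing cost.
tc qdisc replace dev eth0 root cake bandwidth 18Mbit docsis
# ADSL over PPPoE with VC-mux: account for ATM cell and PPP overhead.
tc qdisc replace dev pppoe-wan root cake bandwidth 950Kbit pppoe-vcmux
# Or supply a raw per-packet overhead in bytes if you know it.
tc qdisc replace dev eth0 root cake bandwidth 100Mbit overhead 38
```

Getting this wrong on an ATM-framed link can make the shaper over-admit by tens of percent, which is exactly the "not seeing Cake as the effective bottleneck" failure described above.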
* Re: [Cake] Advantages to tightly tuning latency
2020-04-21 18:40 0% ` Jonathan Morton
@ 2020-04-21 18:44 1% ` Dave Taht
2020-04-21 22:25 0% ` Thibaut
2020-04-21 23:06 1% ` Justin Kilpatrick
0 siblings, 2 replies; 200+ results
From: Dave Taht @ 2020-04-21 18:44 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Justin Kilpatrick, Cake List
It has always been my dream, that at least for outbound, there would
be sufficient backpressure from the driver
to not have to shape at all, or monitor the link. We have that now in
BQL and AQL. free.fr's dsl driver "does the right thing" - no other
dsl driver does, nor do usb network devices. I hope more folk roll up
their sleeves and test the ath10k some, it's looking lovely from here.
https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/
next up either the new mediatek chip or intel..
On Tue, Apr 21, 2020 at 11:40 AM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> > On 21 Apr, 2020, at 9:22 pm, Justin Kilpatrick <justin@althea.net> wrote:
> >
> > I have a frequently changing link I'm using automated tools to monitor and tune using Cake. Currently I'm only tuning the bandwidth parameter using latency and packet loss data.
> >
> > My reading of the codel RFC seems to say that trying to tune the 'interval' value using known path and link latency won't provide any advantages over just tuning the bandwidth parameter.
> >
> > Obviously codel is just one part of the Cake setup and I'm wondering if there are any advantages I'm missing by not providing this extra input using data I already gather.
>
> The default latency parameters are tuned well for general Internet paths. The median path length on the public Internet is about 80ms, for which the default interval of 100ms and target of 5ms works well. Codel is also designed to accommodate a significant deviation from the expected path length without too much difficulty.
>
> I think it's only worth trying to adjust this if your typical path is substantially different from that norm. If all your traffic goes over a satellite link, for example, the default parameters might be too tight. If the vast majority of it goes to a local CDN, you could try the "metro" keyword to tighten things up a bit. Otherwise, you'll be fine.
>
> Also, most protocols are actually not very sensitive to how tight the AQM is set in the first place. Either they don't really care about latency at all (eg. bulk downloads) or they are latency-sensitive but also sparse (eg. DNS, NTP, VoIP). So they are more interested in being isolated from the influence of other flows, which Cake does pretty well regardless of the AQM settings.
>
> It's *considerably* more important to ensure that your shaper is configured correctly. That means setting not only the bandwidth parameter, but the overhead parameters as well. A bad shaper setting could result in some or all of your traffic not seeing Cake as the effective bottleneck, and thus not receiving its care. This can be an orders-of-magnitude effect, depending on just how bloated the underlying hardware is.
>
> - Jonathan Morton
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
* Re: [Cake] Advantages to tightly tuning latency
@ 2020-04-21 18:40 0% ` Jonathan Morton
2020-04-21 18:44 1% ` Dave Taht
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-21 18:40 UTC (permalink / raw)
To: Justin Kilpatrick; +Cc: cake
> On 21 Apr, 2020, at 9:22 pm, Justin Kilpatrick <justin@althea.net> wrote:
>
> I have a frequently changing link I'm using automated tools to monitor and tune using Cake. Currently I'm only tuning the bandwidth parameter using latency and packet loss data.
>
> My reading of the codel RFC seems to say that trying to tune the 'interval' value using known path and link latency won't provide any advantages over just tuning the bandwidth parameter.
>
> Obviously codel is just one part of the Cake setup and I'm wondering if there are any advantages I'm missing by not providing this extra input using data I already gather.
The default latency parameters are tuned well for general Internet paths. The median path length on the public Internet is about 80ms, for which the default interval of 100ms and target of 5ms works well. Codel is also designed to accommodate a significant deviation from the expected path length without too much difficulty.
I think it's only worth trying to adjust this if your typical path is substantially different from that norm. If all your traffic goes over a satellite link, for example, the default parameters might be too tight. If the vast majority of it goes to a local CDN, you could try the "metro" keyword to tighten things up a bit. Otherwise, you'll be fine.
Also, most protocols are actually not very sensitive to how tight the AQM is set in the first place. Either they don't really care about latency at all (eg. bulk downloads) or they are latency-sensitive but also sparse (eg. DNS, NTP, VoIP). So they are more interested in being isolated from the influence of other flows, which Cake does pretty well regardless of the AQM settings.
It's *considerably* more important to ensure that your shaper is configured correctly. That means setting not only the bandwidth parameter, but the overhead parameters as well. A bad shaper setting could result in some or all of your traffic not seeing Cake as the effective bottleneck, and thus not receiving its care. This can be an orders-of-magnitude effect, depending on just how bloated the underlying hardware is.
- Jonathan Morton
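To make the advice above concrete, a cake setup that sets both the bandwidth and the overhead might look like the following. This is a hypothetical illustration only: the interface name, the rate and the overhead value are placeholders to be adapted to your own link, not a recommendation.

```shell
# Hypothetical example; "eth0", the rate and "overhead 34" are placeholders.
# The shaper settings stressed above: bandwidth plus per-packet link overhead.
tc qdisc replace dev eth0 root cake bandwidth 19900kbit overhead 34
# If the vast majority of traffic goes to a local CDN, the "metro" rtt
# preset tightens the AQM a bit, as suggested in the message:
tc qdisc replace dev eth0 root cake bandwidth 19900kbit overhead 34 metro
```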
^ permalink raw reply [relevance 0%]
* Re: [Cake] Thinking about ingress shaping & cake
2020-04-12 11:02 0% ` Jonathan Morton
@ 2020-04-12 13:12 0% ` Kevin Darbyshire-Bryant
0 siblings, 0 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-12 13:12 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 3804 bytes --]
> On 12 Apr 2020, at 12:02, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 12 Apr, 2020, at 11:23 am, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>>
>> I’m wondering what the relationship between actual incoming rate vs shaped rate and latency peaks is? My brain can’t compute that but I suspect it is related to the rtt of the flow/s and hence how quickly the signalling manages to control the incoming rate.
>
> There are two important cases to consider here: the slow-start and congestion-avoidance phases of TCP. But in general, the bigger the difference between the link rate and Cake's shaped rate, the fewer latency peaks you will notice.
>
> Slow-start basically doubles the send rate every RTT until terminated by a congestion signal. It's therefore likely that you'll get a full RTT of queued data at the moment of slow-start exit, which then has to drain - and most of this will occur in the dumb FIFO upstream of you. Typical Internet RTTs are about 80ms. You should expect a slow-start related latency spike every time you start a bulk flow, although some of them will be avoided by the HyStart algorithm, which uses increases in latency as a congestion signal specifically for governing slow-start exit.
>
> In congestion avoidance, TCP typically adds one segment to the congestion window per RTT. If you assume the shaper is saturated, you can calculate the excess bandwidth caused by this "Reno linear growth" as 8 bits per byte * 1500 bytes * flow count / RTT seconds. For a single flow at 80ms, that's 150 Kbps. At 20ms it would be 600 Kbps. If that number totals less than the margin you've left, then the peaks of the AIMD sawtooth should not collect in the dumb FIFO and will be handled entirely by Cake.
Thank you. That is really useful.
In case you all fancied a laugh at my expense and to show you what state of stir crazy I’m in due to lock down, here’s the analogy of queuing I came up with that explained to me why my queue departure rate must be less than the inbound rate.
So I imagined a farmer with a single cow only milking machine and a transporter that moves cows from the field to the milking machine(!) As Mr Farmer turns up at the field, the cows saunter over to the gate. The gate opens when there’s space for a cow on the transporter. The transporter can move a single cow to the milking machine at an arbitrary 1 cow per 10 seconds (6 cows a minute). The cows are interested at the thought of being milked so they arrive at the gate from around the field faster than 6 cows a minute. So the cows naturally form a queue and wait their turn to go through the gate.
Mr Farmer has some special cows that must be milked in preference to standard cows. So he installs some fencing and arranges them into two funnel shapes arriving at the gate. The gate has been upgraded too and it can choose from which funnel to accept a cow. If a cow is available in the special queue then it takes that cow, else it takes a standard cow. A helper assists in directing the cows to the correct queue.
It’s at this point I realised that for the special/standard cow preference to make any difference the cows must be arriving faster than they can depart, otherwise there’s never the case that a standard cow has to wait for a special cow, they just walk on through. I have to have a queue.
I won’t take the analogy any further since I’m aware of the ’special cow’ queue starving access to the ’normal cow’ queue and I’m not sure that controlling queue length when they all come running over (cow burst!) by culling cows is exactly ideal either :-)
Anyway welcome to my Easter madness :-)
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] Thinking about ingress shaping & cake
2020-04-12 8:23 0% ` Kevin Darbyshire-Bryant
2020-04-12 9:47 0% ` Sebastian Moeller
@ 2020-04-12 11:02 0% ` Jonathan Morton
2020-04-12 13:12 0% ` Kevin Darbyshire-Bryant
1 sibling, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-12 11:02 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
> On 12 Apr, 2020, at 11:23 am, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> I’m wondering what the relationship between actual incoming rate vs shaped rate and latency peaks is? My brain can’t compute that but I suspect it is related to the rtt of the flow/s and hence how quickly the signalling manages to control the incoming rate.
There are two important cases to consider here: the slow-start and congestion-avoidance phases of TCP. But in general, the bigger the difference between the link rate and Cake's shaped rate, the fewer latency peaks you will notice.
Slow-start basically doubles the send rate every RTT until terminated by a congestion signal. It's therefore likely that you'll get a full RTT of queued data at the moment of slow-start exit, which then has to drain - and most of this will occur in the dumb FIFO upstream of you. Typical Internet RTTs are about 80ms. You should expect a slow-start related latency spike every time you start a bulk flow, although some of them will be avoided by the HyStart algorithm, which uses increases in latency as a congestion signal specifically for governing slow-start exit.
In congestion avoidance, TCP typically adds one segment to the congestion window per RTT. If you assume the shaper is saturated, you can calculate the excess bandwidth caused by this "Reno linear growth" as 8 bits per byte * 1500 bytes * flow count / RTT seconds. For a single flow at 80ms, that's 150 Kbps. At 20ms it would be 600 Kbps. If that number totals less than the margin you've left, then the peaks of the AIMD sawtooth should not collect in the dumb FIFO and will be handled entirely by Cake.
- Jonathan Morton
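The "Reno linear growth" arithmetic in that last paragraph is easy to sanity-check. A minimal sketch (the helper name is mine, not from the thread): each flow adds one MSS-sized segment per RTT, so the AIMD sawtooth peaks overshoot the shaped rate by roughly this much.

```python
def reno_excess_bps(flows, rtt_s, mss_bytes=1500):
    """Excess bandwidth (bits/s) of the AIMD sawtooth peaks above the shaper:
    8 bits/byte * MSS bytes * flow count / RTT seconds."""
    return 8 * mss_bytes * flows / rtt_s

# One flow at 80 ms comes to about 150 kbit/s, and at 20 ms to about
# 600 kbit/s, matching the figures in the message.
```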
^ permalink raw reply [relevance 0%]
* Re: [Cake] Thinking about ingress shaping & cake
2020-04-12 8:23 0% ` Kevin Darbyshire-Bryant
@ 2020-04-12 9:47 0% ` Sebastian Moeller
2020-04-12 11:02 0% ` Jonathan Morton
1 sibling, 0 replies; 200+ results
From: Sebastian Moeller @ 2020-04-12 9:47 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Jonathan Morton, Cake List
Hi Kevin.
> On Apr 12, 2020, at 10:23, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
>
>
>> On 10 Apr 2020, at 15:14, Jonathan Morton <chromatix99@gmail.com> wrote:
>>
>>
>> No. If the dequeue rate is never less than the enqueue rate, then the backlog remains at zero pretty much all the time. There are some short-term effects which can result in transient queuing of a small number of packets, but these will all drain out promptly.
>>
>> For Cake to actually gain control of the bottleneck queue, it needs to *become* the bottleneck - which, when downstream of the nominal bottleneck, can only be achieved by shaping to a slower rate. I would try 79Mbit for your case.
>>
>> - Jonathan Morton
>>
>
> Thanks for correcting my erroneous thinking Jonathan!
I can not see erroneous thinking here at all, you just independently discovered the approximate nature of post-bottleneck traffic shaping ;) As I see it, this theoretical concern made people ignore ingress shaping for far too long. As the bufferbloat effort demonstrated, even approximate traffic shaping helps considerably. And Jonathan's "ingress" mode for cake makes that approximation less dependent on the number of concurrent flows, so, as so often, cake turns it up to eleven ;)
> As I was typing it I was thinking “how does that actually work?” I should have thought more.
You identified the core issue of ingress shaping quite succinctly for the rest there is the mailing list.
> I typically run ingress rate as 97.5% of modem sync rate (78000 of 80000) which gives me a little wiggle room when the modem doesn’t quite make the 80000 target (often 79500ish). Egress is easy, 99.5% of 20000 ie. 19900, all is wonderful.
Those were the good old days, when I could just assume the sync rate would be the limiting factor; over on this side of the channel ISPs pretty much all switched to using traffic shapers at their end (typically not in the DSLAM, which seems to be more or less just a L2 switch with fancy media-converters for the individual subscriber lines). Getting reliable information about the settings of these shapers is near impossible... (And yes, the ISPs also seem to shape my egress, so they have to deal with approximate shaping at their end as well). My current approach is to make a few speedtests without SQM enabled and take my best estimate of the applicable maximum speed (this is harder than it looks, as a number of speedtests are really imprecise and try to get instantaneous rate estimates, which suffer from windowing effects, resulting in reported speeds higher than a DSL link can theoretically carry maximally). Then I take this and just plug this net rate into sqm as the gross shaper rate, and things should be a decent starting point (plus my best theoretical estimate of the per-packet-overhead PPO).
Since I usually can not help it, I then take my PPO estimate and reverse-calculate the gross rate from the net rate (e.g. for VDSL2/PTM/PPPoE/IPv4/TCP/RFC1323 timestamps):
gross shaper rate = net speedtest result * 65/64 * ((1500 + 26) / (1500 - 8 - 20 -20 -12))
and compare this with my sync rate, if this is <= my syncrate I then set:
egress gross shaper rate = egress net speedtest result * ((1500 + 26) / (1500 - 8 - 20 -20 -12)) * 0.995
ingress gross shaper rate = ingress net speedtest result * ((1500 + 26) / (1500 - 8 - 20 -20 -12)) * 0.95
if the calculated rate is > my syncrate I repeat with a different speedtest, while mumbling and cursing like a sailor...
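The conversion above can be sketched as a small calculation. The constants are the ones in the formulas (65/64 PTM encoding, 26 bytes of per-packet overhead, and 8 + 20 + 20 + 12 bytes of PPPoE/IPv4/TCP/timestamp headers); the function names and example figures are purely illustrative assumptions, not part of the original message.

```python
PTM_ENCODING = 65 / 64            # 64b/65b PTM encoding expansion
MTU = 1500
WIRE_OVERHEAD = 26                # per-packet overhead bytes (framing)
PAYLOAD = MTU - 8 - 20 - 20 - 12  # minus PPPoE, IPv4, TCP, timestamp options

def gross_from_net(net_kbit):
    """Gross link rate implied by a measured net TCP goodput (kbit/s)."""
    return net_kbit * PTM_ENCODING * (MTU + WIRE_OVERHEAD) / PAYLOAD

def shaper_rates(egress_net_kbit, ingress_net_kbit):
    """Shaper settings with the 99.5% egress / 95% ingress safety margins."""
    factor = (MTU + WIRE_OVERHEAD) / PAYLOAD
    return (egress_net_kbit * factor * 0.995,
            ingress_net_kbit * factor * 0.95)
```

For instance, `gross_from_net(76000)` gives the sync rate a hypothetical 76 Mbit/s speedtest result would imply, to compare against the modem's reported sync.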
>
> I’m wondering what the relationship between actual incoming rate vs shaped rate and latency peaks is?
Good question! My mental image is bound to the water and pipe model of the internet (series of tubes ;)) if the inrush is too high for the current bottleneck element/pipe there is going to be "back-spill" into the buffers upstream of the bottleneck. So, bursts and DOS traffic will flood back into the ISPs typically under-managed and over-sized buffers increasing the latency.
> My brain can’t compute that but I suspect it is related to the rtt of the flow/s and hence how quickly the signalling manages to control the incoming rate.
I agree.
>
> I guess ultimately we’re dependent on the upstream (ISP) shaper configuration, ie if that’s a large buffer and we’ve an unresponsive flow incoming then no matter what we do, we’re stuffed, that flow will fill the buffer & induce latency on other flows.
Yes, but this is where cake's ingress mode helps: by aiming its rate target at the ingress side it effectively sends a stronger signal, so the endpoints react faster, reducing the likelihood of back-spill. But in the end, it would be a great help if the ISP's shaper had acceptable buffer management...
Best Regards
Sebastian
>
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [relevance 0%]
* Re: [Cake] Thinking about ingress shaping & cake
2020-04-10 14:14 0% ` Jonathan Morton
@ 2020-04-12 8:23 0% ` Kevin Darbyshire-Bryant
2020-04-12 9:47 0% ` Sebastian Moeller
2020-04-12 11:02 0% ` Jonathan Morton
0 siblings, 2 replies; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-12 8:23 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 1637 bytes --]
> On 10 Apr 2020, at 15:14, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>
> No. If the dequeue rate is never less than the enqueue rate, then the backlog remains at zero pretty much all the time. There are some short-term effects which can result in transient queuing of a small number of packets, but these will all drain out promptly.
>
> For Cake to actually gain control of the bottleneck queue, it needs to *become* the bottleneck - which, when downstream of the nominal bottleneck, can only be achieved by shaping to a slower rate. I would try 79Mbit for your case.
>
> - Jonathan Morton
>
Thanks for correcting my erroneous thinking Jonathan! As I was typing it I was thinking “how does that actually work?” I should have thought more. I typically run ingress rate as 97.5% of modem sync rate (78000 of 80000) which gives me a little wiggle room when the modem doesn’t quite make the 80000 target (often 79500ish). Egress is easy, 99.5% of 20000 ie. 19900, all is wonderful.
I’m wondering what the relationship between actual incoming rate vs shaped rate and latency peaks is? My brain can’t compute that but I suspect it is related to the rtt of the flow/s and hence how quickly the signalling manages to control the incoming rate.
I guess ultimately we’re dependent on the upstream (ISP) shaper configuration, ie if that’s a large buffer and we’ve an unresponsive flow incoming then no matter what we do, we’re stuffed, that flow will fill the buffer & induce latency on other flows.
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] Thinking about ingress shaping & cake
@ 2020-04-10 14:14 0% ` Jonathan Morton
2020-04-12 8:23 0% ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 200+ results
From: Jonathan Morton @ 2020-04-10 14:14 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
> On 10 Apr, 2020, at 4:16 pm, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>
> I have a 80/20mbit FTTC line into the house. Egress shaping/control with cake is simple, easy, beautiful and it works. Tell it to use 19900Kbit, set some min packet size, a bit of overhead and off you go. Ingress has more problems:
>
> Assuming I do actually get 80Mbit incoming then the naive bandwidth setting for CAKE would be 80Mbit. Cake internally dequeues at that 80Mbit rate and therefore the only way any flows can accumulate backlog is when they’re competing with each other in terms of fairness (Tin/Host) and quantums become involved…I think.
No. If the dequeue rate is never less than the enqueue rate, then the backlog remains at zero pretty much all the time. There are some short-term effects which can result in transient queuing of a small number of packets, but these will all drain out promptly.
For Cake to actually gain control of the bottleneck queue, it needs to *become* the bottleneck - which, when downstream of the nominal bottleneck, can only be achieved by shaping to a slower rate. I would try 79Mbit for your case.
- Jonathan Morton
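For reference, shaping ingress as suggested (so that Cake becomes the bottleneck at, say, 79Mbit) is usually done by redirecting inbound traffic through an ifb device. A hypothetical sketch, with the interface name as a placeholder:

```shell
# Hypothetical ingress-shaping sketch; "eth0" is a placeholder.
# Create an ifb device and redirect all ingress traffic through it:
ip link add name ifb0 type ifb
ip link set ifb0 up
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: matchall \
    action mirred egress redirect dev ifb0
# Shape on the ifb at slightly below the sync rate; cake's "ingress"
# keyword accounts for shaping downstream of the real bottleneck:
tc qdisc add dev ifb0 root cake bandwidth 79Mbit ingress
```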
^ permalink raw reply [relevance 0%]
* Re: [Cake] cake and nat in openwrt... on by default?
2020-04-05 19:56 0% ` Kevin Darbyshire-Bryant
@ 2020-04-06 10:15 0% ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 200+ results
From: Toke Høiland-Jørgensen @ 2020-04-06 10:15 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant, Dave Taht; +Cc: Cake List
Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>> On 5 Apr 2020, at 16:22, Dave Taht <dave.taht@gmail.com> wrote:
>>
>>>
>>
>> I'd still be willing to bet, then, that the majority of instances were
>> not turning nat mode on, when
>> they should have been.
>
> If memory serves, at the time there was a lot of concern about cpu
> usage and (I felt) that every line, every potential instruction and
> cache hit was being scrutinised. NAT lookup by default was deemed too
> much. Cake ended up being ‘cpu heavy’ though I’ve yet to see a better
> combination of shaper, aqm, flow fairness, host fairness, DSCP
> awareness and ack filtering in one overall package, let alone one that
> did it in less cpu.
Well, one thing we could do is change the defaults in sqm-scripts? :)
-Toke
^ permalink raw reply [relevance 0%]
* Re: [Cake] [Bloat] New board that looks interesting
@ 2020-04-05 20:17 1% ` David P. Reed
0 siblings, 0 replies; 200+ results
From: David P. Reed @ 2020-04-05 20:17 UTC (permalink / raw)
To: Dave Taht; +Cc: Aaron Wood, Cake List, Make-Wifi-fast, bloat
FYI - Fedora 31 continued not trying to "make my life easier" by inventing new packaging and containerization in the base distro. I think the folks who make Ubuntu think it is a consumer product. Making it harder to self-configure in developer-hacker friendly ways.
I expect Fedora32 will have Wireguard (if not earlier).
On Saturday, April 4, 2020 1:36pm, "Dave Taht" <dave.taht@gmail.com> said:
> I think I'll wait for y'all to try it and report back. I trust my
> apu2s and I actually kind of like they lack a graphics chip and need
> to be configured via serial port.
>
> In other news I've started testing ubuntu 20.4, which among other
> things, has wireguard in it. I've been really frustrated with the
> state of distributions lately, trying to get any complex thing done
> has required snaps and docker containers and I really prefer running
> stuff natively when possible. Tools that I still rely on like mrtg and
> smokeping are undermaintained, trying to get zoneminder to co-exist
> and co-install with anything else (notably jitsi thus far) has been a
> real PITA.
>
> I am pleased at the increasing size of the ipv6 deployment, my phone
> got it last month....
>
> I think I've found a babel bug with default routes...
>
> and I fired up a kernel build to go hack on the ax200 chips.
>
> On Sat, Apr 4, 2020 at 9:27 AM Aaron Wood <woody77@gmail.com> wrote:
>>
>> The comparison of chipset performance link (to OpenWrt forums) that went out had
>> this chip, the J4105, as the fastest. Able to do a gigabit with cake (nearly able
>> to do it in both directions).
>>
>> I think this has replaced the apu2 as the board I’m going with as my edge
>> router.
>>
>> On Sat, Apr 4, 2020 at 9:10 AM Dave Taht <dave.taht@gmail.com> wrote:
>>>
>>> Historically I've found the "Celeron" chips rather weak, but it's just
>>> a brand. I haven't the foggiest idea how well this variant will
>>> perform.
>>>
>>> The intel ethernet chips are best of breed in linux, however. It's
>>> been my hope that the 211 variant with the timed networking support
>>> would show up in the field (sch_etx) so we could fiddle with that,
>>> (the apu2s aren't using that version) but I cannot for the life of me
>>> remember the right keywords to look it up at the moment. this feature
>>> lets you program when a packet emerges from the driver and is sort of
>>> a whole new ballgame when it comes to scheduling - there hasn't been
>>> an aqm designed for it, and you can do fq by playing tricks with the
>>> sent timestamp.
>>>
>>> All the other features look rather nice on this board.
>>>
>>> On Sat, Apr 4, 2020 at 7:47 AM David P. Reed <dpreed@deepplum.com> wrote:
>>> >
>>> > Thanks! I ordered one just now. In my experience, this company does rather
>>> neat stuff. Their XMOS based microphone array (ReSpeaker) is really useful.
>>> What's the state of play in Linux/OpenWRT for Intel 9560 capabilities
>>> regarding AQM?
>>> >
>>> > On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com> said:
>>> >
>>> > > _______________________________________________
>>> > > Cake mailing list
>>> > > Cake@lists.bufferbloat.net
>>> > > https://lists.bufferbloat.net/listinfo/cake
>>> > > https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
>>> > >
>>> > > quad-core Celeron J4105 1.5-2.5 GHz x64
>>> > > 8GB Ram
>>> > > 2x i211t intel ethernet controllers
>>> > > intel 9560 802.11ac (wave2) wifi/bluetooth chipset
>>> > > intel built-in graphics
>>> > > onboard ARM Cortex-M0 and RPi & Arduino headers
>>> > > m.2 and PCIe adapters
>>> > > <$200
>>> > >
>>> >
>>> >
>>> > _______________________________________________
>>> > Bloat mailing list
>>> > Bloat@lists.bufferbloat.net
>>> > https://lists.bufferbloat.net/listinfo/bloat
>>>
>>>
>>>
>>> --
>>> Make Music, Not War
>>>
>>> Dave Täht
>>> CTO, TekLibre, LLC
>>> http://www.teklibre.com
>>> Tel: 1-831-435-0729
>>
>> --
>> - Sent from my iPhone.
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>
^ permalink raw reply [relevance 1%]
* Re: [Cake] cake and nat in openwrt... on by default?
2020-04-05 18:03 1% ` Dave Taht
@ 2020-04-05 19:56 0% ` Kevin Darbyshire-Bryant
2020-04-06 10:15 0% ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 200+ results
From: Kevin Darbyshire-Bryant @ 2020-04-05 19:56 UTC (permalink / raw)
To: Dave Taht; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 742 bytes --]
> On 5 Apr 2020, at 16:22, Dave Taht <dave.taht@gmail.com> wrote:
>
>>
>
> I'd still be willing to bet, then, that the majority of instances were
> not turning nat mode on, when
> they should have been.
If memory serves, at the time there was a lot of concern about cpu usage and (I felt) that every line, every potential instruction and cache hit was being scrutinised. NAT lookup by default was deemed too much. Cake ended up being ‘cpu heavy’ though I’ve yet to see a better combination of shaper, aqm, flow fairness, host fairness, DSCP awareness and ack filtering in one overall package, let alone one that did it in less cpu.
Cheers,
Kevin D-B
gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
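Since, as established above, NAT mode is not on by default, it has to be requested explicitly. A hypothetical example (interface, rate and the per-host fairness keyword are placeholders to adapt):

```shell
# Hypothetical example; "eth0" and the rate are placeholders.
# "nat" makes cake look up conntrack so per-host fairness sees the
# internal addresses behind the NAT; "dual-srchost" pairs with it on
# an egress/WAN interface.
tc qdisc replace dev eth0 root cake bandwidth 19900kbit nat dual-srchost
```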
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [relevance 0%]
* Re: [Cake] cake and nat in openwrt... on by default?
@ 2020-04-05 18:03 1% ` Dave Taht
2020-04-05 19:56 0% ` Kevin Darbyshire-Bryant
1 sibling, 0 replies; 200+ results
From: Dave Taht @ 2020-04-05 18:03 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: Cake List
https://www.reddit.com/search/?q=bufferbloat&t=week
Anyway I posted a PSA. Grump. It looks like comment threads are not
unified across forums.
https://www.reddit.com/r/HomeNetworking/comments/fvhr4w/psa_sqm_cake_nat_and_bufferbloat_tuning/
I do wish our mailing lists showed up on google searches.
On Sun, Apr 5, 2020 at 8:22 AM Dave Taht <dave.taht@gmail.com> wrote:
>
> On Sun, Apr 5, 2020 at 12:57 AM Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
> >
> >
> >
> > > On 5 Apr 2020, at 05:17, Dave Taht <dave.taht@gmail.com> wrote:
> > >
> > > I see cake is moving to the upstreamed version. As best as I recall,
> > > nat mode was on by default in the openwrt code, but not the upstreamed
> > > code.
> > >
> > > People not setting nat mode on would explain a few things i've seen
> > > 'round the intertubes this week.
> >
> > From sch_cake repo and hence ‘out of tree’ cake
> >
> > if (tb[TCA_CAKE_NAT]) {
> > #if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> > q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
> > q->flow_mode |= CAKE_FLOW_NAT_FLAG *
> > !!nla_get_u32(tb[TCA_CAKE_NAT]);
> > #else
> > #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0)
> > NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
> > "No conntrack support in kernel");
> > #endif
> > return -EOPNOTSUPP;
> > #endif
> > }
> >
> >
> > From kernel 5.4 as found in openwrt build dir
> >
> > if (tb[TCA_CAKE_NAT]) {
> > #if IS_ENABLED(CONFIG_NF_CONNTRACK)
> > q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
> > q->flow_mode |= CAKE_FLOW_NAT_FLAG *
> > !!nla_get_u32(tb[TCA_CAKE_NAT]);
> > #else
> > NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
> > "No conntrack support in kernel");
> > return -EOPNOTSUPP;
> > #endif
> >
> >
> >
> > cake_init(…) in both does:
> >
> > q->flow_mode = CAKE_FLOW_TRIPLE;
> >
> >
> > So openwrt doesn’t, by default, enable NAT mode in cake.
> >
> > I honestly don’t think that there are enough instances of cake out there, let alone instances of cake from openwrt, let alone instances of cake from master which switched to upstream cake 2-3 days ago, to make any sort of difference anyway.
>
> I'd still be willing to bet, then, that the majority of instances were
> not turning nat mode on, when
> they should have been.
>
> >
> > >
> > > --
> > > Make Music, Not War
> > >
> > > Dave Täht
> > > CTO, TekLibre, LLC
> > > http://www.teklibre.com
> > > Tel: 1-831-435-0729
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> >
> >
> > Cheers,
> >
> > Kevin D-B
> >
> > gpg: 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
> >
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [relevance 1%]
-- links below jump to the message on this page --
2020-04-04 4:12 [Cake] New board that looks interesting Aaron Wood
2020-04-04 14:47 ` David P. Reed
2020-04-04 16:10 ` [Cake] [Bloat] " Dave Taht
2020-04-04 16:27 ` Aaron Wood
2020-04-04 17:36 ` Dave Taht
2020-04-05 20:17 1% ` David P. Reed
2020-04-27 2:45 1% ` Dave Taht
2020-12-18 23:48 1% ` Aaron Wood
2021-01-04 2:11 1% ` Dean Scarff
2020-04-05 4:17 [Cake] cake and nat in openwrt... on by default? Dave Taht
2020-04-05 7:57 ` Kevin Darbyshire-Bryant
2020-04-05 15:22 ` Dave Taht
2020-04-05 18:03 1% ` Dave Taht
2020-04-05 19:56 0% ` Kevin Darbyshire-Bryant
2020-04-06 10:15 0% ` Toke Høiland-Jørgensen
2020-04-10 13:16 [Cake] Thinking about ingress shaping & cake Kevin Darbyshire-Bryant
2020-04-10 14:14 0% ` Jonathan Morton
2020-04-12 8:23 0% ` Kevin Darbyshire-Bryant
2020-04-12 9:47 0% ` Sebastian Moeller
2020-04-12 11:02 0% ` Jonathan Morton
2020-04-12 13:12 0% ` Kevin Darbyshire-Bryant
2020-04-21 18:22 [Cake] Advantages to tightly tuning latency Justin Kilpatrick
2020-04-21 18:40 0% ` Jonathan Morton
2020-04-21 18:44 1% ` Dave Taht
2020-04-21 22:25 0% ` Thibaut
2020-04-21 22:33 0% ` Jonathan Morton
2020-04-21 22:44 1% ` Dave Taht
2020-04-21 22:50 ` Dave Taht
2020-04-21 23:07 0% ` Jonathan Morton
2020-04-21 23:27 1% ` Dave Taht
2020-04-22 8:28 0% ` Thibaut
2020-04-22 9:03 1% ` Luca Muscariello
2020-04-22 14:48 1% ` Dave Taht
2020-04-22 15:28 1% ` Luca Muscariello
2020-04-22 17:42 2% ` David P. Reed
2020-04-23 9:29 1% ` Maxime Bizon
2020-04-23 11:57 1% ` Toke Høiland-Jørgensen
2020-04-23 12:29 1% ` Luca Muscariello
2020-04-23 12:33 1% ` Maxime Bizon
2020-04-23 16:42 0% ` Toke Høiland-Jørgensen
2020-04-23 17:31 1% ` Maxime Bizon
2020-04-23 18:30 0% ` Sebastian Moeller
2020-04-23 21:53 1% ` Maxime Bizon
2020-04-23 18:35 0% ` Toke Høiland-Jørgensen
2020-04-23 21:59 1% ` Maxime Bizon
2020-04-23 23:05 0% ` Toke Høiland-Jørgensen
2020-04-23 23:11 1% ` Dave Taht
2020-04-23 16:28 1% ` Dave Taht
2020-04-21 23:06 1% ` Justin Kilpatrick
2020-04-21 23:19 1% ` Dave Taht
2020-04-22 15:58 [Cake] DSCP ramblings Kevin Darbyshire-Bryant
2020-04-22 16:15 1% ` Dave Taht
2020-04-22 16:20 1% ` Dave Taht
2020-04-22 16:44 1% ` Stephen Hemminger
2020-04-22 16:58 1% ` Dave Taht
2020-04-23 10:50 0% ` Kevin Darbyshire-Bryant
2020-04-22 17:17 0% ` Kevin Darbyshire-Bryant
2020-04-22 17:45 1% ` Dave Taht
2020-04-25 11:07 [Cake] Cake tin behaviour - discuss Kevin Darbyshire-Bryant
2020-04-25 15:14 1% ` David P. Reed
2020-04-25 15:25 0% ` Jonathan Morton
2020-04-25 20:34 0% ` Kevin Darbyshire-Bryant
2020-04-25 20:56 1% ` David P. Reed
2020-04-25 21:31 1% ` Kevin Darbyshire-Bryant
2020-04-26 13:53 1% ` David P. Reed
2020-04-27 11:52 0% ` Kevin Darbyshire-Bryant
[not found] <CAA93jw71YdABJPeRkFDrzLGY2PtWy-zqaLoGrnFWuFhOPz48xg@mail.gmail.com>
[not found] ` <20200424120317.4d3d3e98@rellim.com>
[not found] ` <20200424120423.1f57def6@rellim.com>
[not found] ` <CAA93jw7e6k4sxh2+5H-dSBmdUkA53=VxJu7FmTdrSKTsbP0rWg@mail.gmail.com>
[not found] ` <20200424121344.2bc8e62c@rellim.com>
[not found] ` <CAA93jw5i7ccwc3VwSKiNk9XL-FXHgwznxzCHUDytpHFDsNGfoA@mail.gmail.com>
[not found] ` <20200424123005.64aef3bf@rellim.com>
[not found] ` <CAA93jw5xygaNsqYb9z9cF00TpH=8cOSDzFGZJxrDW-SkQFey4g@mail.gmail.com>
[not found] ` <20200424195745.72d725bd@rellim.com>
2020-04-25 19:09 1% ` [Cake] cake on linux 5.6 32 bit x86 might be broken Dave Taht
2020-04-25 19:19 1% ` Y
2020-04-25 19:59 0% ` Jonathan Morton
2020-04-25 20:05 1% ` Dave Taht
2020-04-28 16:57 [Cake] intel gives up on home gateways Dave Taht
2020-04-28 22:22 1% ` [Cake] [Cerowrt-devel] " Joel Wirāmu Pauling
[not found] <87368jc3df.wl-jch@irif.fr>
2020-05-01 16:11 1% ` [Cake] Fwd: [Babel-users] OT: Centralised WebRTC server available for testing Dave Taht
2020-05-01 16:44 [Cake] dslreports is no longer free Dave Taht
2020-05-01 19:34 1% ` [Cake] [Bloat] " Kenneth Porter
2020-05-01 19:54 0% ` Sebastian Moeller
2020-05-01 19:48 0% ` [Cake] " Sebastian Moeller
2020-05-01 20:09 1% ` [Bloat] " Sergey Fedorov
2020-05-01 21:11 0% ` [Cake] [Bloat] " Sebastian Moeller
2020-05-01 21:37 1% ` [Bloat] [Cake] " Sergey Fedorov
[not found] ` <mailman.170.1588363787.24343.bloat@lists.bufferbloat.net>
2020-05-01 22:07 ` [Cake] [Bloat] " Michael Richardson
2020-05-01 23:35 1% ` [Bloat] [Cake] " Sergey Fedorov
2020-05-02 1:14 ` [Cake] [Make-wifi-fast] [Bloat] " Jannie Hanekom
2020-05-02 16:37 1% ` Benjamin Cronce
2020-05-02 16:52 1% ` Dave Taht
2020-05-02 17:38 1% ` David P. Reed
2020-05-02 19:00 1% ` Sergey Fedorov
2020-05-02 23:23 1% ` David P. Reed
2020-05-03 15:31 1% ` [Cake] fast.com quality David P. Reed
2020-05-03 15:37 1% ` Dave Taht
2020-05-02 20:19 0% ` [Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free Sebastian Moeller
2020-05-27 9:08 0% ` [Cake] " Matthew Ford
2020-05-27 9:32 0% ` Sebastian Moeller
2020-05-03 15:06 1% [Cake] [Make-wifi-fast] " David P. Reed
2020-05-04 17:04 1% ` Sergey Fedorov
2020-05-05 21:02 1% ` David P. Reed
2020-05-06 8:19 0% ` Sebastian Moeller
2020-05-06 15:39 ` [Cake] Slightly OT " David P. Reed
2020-05-06 15:51 1% ` Dave Taht
[not found] ` <mailman.253.1588611897.24343.make-wifi-fast@lists.bufferbloat.net>
2020-05-05 0:03 1% ` [Make-wifi-fast] [Cake] " Bob McMahon
[not found] ` <mailman.256.1588636996.24343.bloat@lists.bufferbloat.net>
2020-05-05 0:10 1% ` [Cake] [Bloat] [Make-wifi-fast] " Dave Taht
2020-05-06 8:08 0% ` [Cake] [Make-wifi-fast] [Bloat] " Sebastian Moeller
2020-05-06 18:43 [Cake] Query on ACK Avakash bhat
2020-05-06 19:01 0% ` Jonathan Morton
2020-05-06 19:13 0% ` Toke Høiland-Jørgensen
2020-05-07 6:44 ` Avakash bhat
2020-05-07 6:59 0% ` Jonathan Morton
2020-05-07 7:07 1% ` Sebastian Moeller
2020-05-08 6:36 1% ` Avakash bhat
2020-05-08 6:50 1% ` Dave Taht
2020-05-08 7:41 0% ` Sebastian Moeller
2020-05-08 15:08 0% ` Toke Høiland-Jørgensen
2020-05-08 15:11 1% ` Dave Taht
2020-05-08 15:20 ` Jonathan Morton
2020-05-08 15:40 1% ` Dave Taht
2020-05-25 5:17 1% ` Avakash bhat
2020-05-25 9:42 0% ` Jonathan Morton
2020-05-25 11:58 0% ` Toke Høiland-Jørgensen
2020-06-14 12:43 1% ` Avakash bhat
2020-06-14 14:43 0% ` Jonathan Morton
2020-06-16 5:22 1% ` Avakash bhat
2020-06-16 5:31 1% ` Dave Taht
2020-05-08 8:23 0% ` Sebastian Moeller
2020-05-07 7:58 [Cake] Latency target curiosity Kevin Darbyshire-Bryant
2020-05-07 8:09 0% ` Jonathan Morton
2020-05-07 9:11 0% ` Kevin Darbyshire-Bryant
2020-05-19 9:13 [Cake] [PATCH] net/sch_generic.h: use sizeof_member() and get rid of unused variable Antonio Quartulli
2020-05-19 22:40 1% ` David Miller
2020-05-20 8:39 1% ` Antonio Quartulli
2020-05-20 18:17 1% ` David Miller
2020-05-20 21:25 1% ` Antonio Quartulli
[not found] <mailman.404.1590061333.24343.cake@lists.bufferbloat.net>
2020-05-22 14:18 0% ` [Cake] Is target a command-line option? Toke Høiland-Jørgensen
[not found] <20200527165527.1085151-1-olteanv@gmail.com>
2020-05-27 19:29 1% ` [Cake] Fwd: [PATCH net-next] net: dsa: sja1105: offload the Credit-Based Shaper qdisc Dave Taht
2020-05-29 10:06 [Cake] Playing with ingredients = ruined the CAKE Kevin Darbyshire-Bryant
2020-05-29 15:24 0% ` Kevin Darbyshire-Bryant
2020-05-31 10:04 1% ` Kevin Darbyshire-Bryant
2020-05-31 16:38 ` John Yates
2020-05-31 17:08 0% ` Kevin Darbyshire-Bryant
2020-05-31 17:26 2% ` John Yates
2020-05-31 18:08 0% ` Kevin Darbyshire-Bryant
2020-05-31 19:01 1% ` Dave Taht
2020-05-31 19:25 0% ` Sebastian Moeller
2020-05-29 12:43 [Cake] [PATCH net] sch_cake: Take advantage of skb->hash where appropriate Toke Høiland-Jørgensen
2020-05-29 13:02 7% ` Toke Høiland-Jørgensen
2020-05-29 17:57 1% ` Jakub Kicinski
2020-05-29 18:31 0% ` Toke Høiland-Jørgensen
2020-05-31 4:52 1% ` David Miller
2020-06-03 7:35 [Cake] anyone using google stadia? Dave Taht
2020-06-03 19:09 1% ` [Cake] [Bloat] " Pedro Tumusok
2020-06-04 17:27 1% ` Dave Taht
[not found] <CALGR9oZ9u=huobnQig0mMPS=-Fu7Mu3q8GHLTBOxd2W5u0h_kw@mail.gmail.com>
[not found] ` <CALGR9oZ-MzUh6JZrM7w97i=64OEZ3JzjzhVir2RBTWm210Fw7w@mail.gmail.com>
2020-06-10 2:23 2% ` [Cake] Fwd: [tsvwg] Fwd: Working Group Last Call: QUIC protocol drafts Dave Taht
2020-06-22 13:10 [Cake] [CAKE] Rate is much lower than expected - CPU load is higher than expected Jose Blanquicet
2020-06-22 14:25 1% ` Y
2020-06-22 15:47 ` Toke Høiland-Jørgensen
2020-06-23 13:05 1% ` Jose Blanquicet
2020-06-23 14:41 0% ` Toke Høiland-Jørgensen
2020-06-23 15:21 0% ` Jonathan Morton
2020-06-23 16:08 0% ` Sebastian Moeller
2020-06-23 16:25 0% ` Jonathan Morton
2020-06-24 14:33 [Cake] Why are target & interval increased on the reduced bandwidth tins? Kevin Darbyshire-Bryant
2020-06-24 14:40 0% ` Sebastian Moeller
2020-06-25 13:40 0% ` Kevin Darbyshire-Bryant
2020-06-25 20:42 0% ` Jonathan Morton
2020-06-25 11:55 [Cake] [PATCH net-next 0/5] sched: A series of fixes and optimisations for sch_cake Toke Høiland-Jørgensen
2020-06-25 11:55 ` [Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags Toke Høiland-Jørgensen
2020-06-25 19:29 1% ` David Miller
2020-06-25 19:53 0% ` Toke Høiland-Jørgensen
2020-06-25 20:00 1% ` David Miller
2020-06-26 8:27 1% ` Davide Caratti
2020-06-26 12:52 0% ` Toke Høiland-Jørgensen
2020-06-26 14:01 1% ` Jamal Hadi Salim
2020-06-26 18:52 1% ` Davide Caratti
2020-06-29 10:27 0% ` Toke Høiland-Jørgensen
2020-06-26 13:11 ` Jonathan Morton
2020-06-26 14:59 0% ` Sebastian Moeller
2020-06-26 16:36 0% ` Jonathan Morton
2020-06-26 22:00 1% ` Stephen Hemminger
2020-06-25 19:31 1% ` [Cake] [PATCH net-next 0/5] sched: A series of fixes and optimisations for sch_cake David Miller
2020-06-25 19:49 0% ` Toke Høiland-Jørgensen
2020-06-25 20:12 [Cake] [PATCH net 0/3] sched: A couple of fixes " Toke Høiland-Jørgensen
2020-06-25 23:25 1% ` David Miller
2020-06-25 20:18 [Cake] [PATCH RESEND net-next] sch_cake: add RFC 8622 LE PHB support to CAKE diffserv handling Toke Høiland-Jørgensen
2020-06-25 23:31 1% ` David Miller
2020-07-03 12:05 [Cake] [PATCH net] sched: consistently handle layer3 header accesses in the presence of VLANs Toke Høiland-Jørgensen
2020-07-03 12:53 1% ` Davide Caratti
2020-07-03 14:37 0% ` Toke Høiland-Jørgensen
2020-07-03 15:22 [Cake] [PATCH net v2] " Toke Høiland-Jørgensen
2020-07-03 19:19 1% ` Cong Wang
2020-07-03 20:09 0% ` Toke Høiland-Jørgensen
2020-07-03 20:26 [Cake] [PATCH net v3] " Toke Høiland-Jørgensen
2020-07-03 21:35 1% ` David Miller
2020-07-04 3:24 1% ` Toshiaki Makita
2020-07-04 11:33 0% ` Toke Høiland-Jørgensen
2020-07-06 4:24 1% ` Toshiaki Makita
2020-07-06 10:53 0% ` Toke Høiland-Jørgensen
2020-07-06 12:29 [Cake] [PATCH net] vlan: consolidate VLAN parsing code and limit max parsing depth Toke Høiland-Jørgensen
2020-07-06 20:01 ` Daniel Borkmann
2020-07-06 22:44 ` Toke Høiland-Jørgensen
2020-07-07 10:49 1% ` Toshiaki Makita
2020-07-07 10:54 0% ` Toke Høiland-Jørgensen
2020-07-07 10:44 1% ` Toshiaki Makita
2020-07-07 10:57 0% ` Toke Høiland-Jørgensen
2020-07-07 11:01 1% ` Toshiaki Makita
2020-07-07 11:03 [Cake] [PATCH net v2] " Toke Høiland-Jørgensen
2020-07-07 22:49 1% ` David Miller
2020-07-19 12:22 2% [Cake] [PATCH for v5.9] sch_cake: Replace HTTP links with HTTPS ones Alexander A. Klimov
2020-07-21 15:32 [Cake] quantum configuration Luca Muscariello
2020-07-21 22:29 1% ` Y
2020-07-24 12:26 ` Toke Høiland-Jørgensen
2021-01-26 15:46 1% ` Dave Taht
2020-07-24 15:56 [Cake] diffserv3 vs diffserv4 Justin Kilpatrick
2020-07-24 17:42 0% ` Kevin Darbyshire-Bryant
2020-07-25 10:12 0% ` Kevin Darbyshire-Bryant
2020-07-25 17:18 0% ` Sebastian Moeller
2020-07-25 17:47 0% ` Jonathan Morton
2020-07-25 17:48 1% ` David P. Reed
2020-07-25 17:54 0% ` Kevin Darbyshire-Bryant
2020-07-25 19:35 1% ` David P. Reed
2020-07-25 20:04 0% ` Sebastian Moeller
2020-07-25 21:33 0% ` Kevin Darbyshire-Bryant
2020-07-25 21:27 0% ` Jonathan Morton
2020-07-25 3:13 0% ` Jonathan Morton
2020-07-25 17:05 1% ` David P. Reed
2020-07-27 21:41 [Cake] Cake, low speed ADSL & fwmark Jim Geo
2020-07-27 22:46 0% ` Jonathan Morton
2020-07-28 16:51 0% ` Jim Geo
2020-07-28 16:54 0% ` Jonathan Morton
2020-07-28 16:56 0% ` Toke Høiland-Jørgensen
2020-07-28 14:52 1% ` Y
2020-11-01 10:15 [Cake] NLA_F_NESTED is missing Dean Scarff
2020-11-01 16:53 1% ` Y
2020-11-02 12:37 ` Toke Høiland-Jørgensen
2020-11-03 1:11 1% ` Dean Scarff
2020-11-03 8:07 1% ` Dean Scarff
2020-11-03 11:00 0% ` Toke Høiland-Jørgensen
2020-11-04 5:48 1% ` Dean Scarff
2020-11-04 11:27 0% ` Toke Høiland-Jørgensen
2020-11-03 1:14 0% ` Jonathan Morton
2020-11-03 1:51 1% ` Dean Scarff
2020-12-22 20:06 [Cake] ECN not working? xnor
2020-12-22 20:15 0% ` Jonathan Morton
2021-03-28 15:56 [Cake] wireguard almost takes a bullet Dave Taht
2021-03-29 20:28 ` [Cake] [Cerowrt-devel] " David P. Reed
2021-03-30 1:52 ` Theodore Ts'o
2021-03-31 1:23 ` David P. Reed
2021-03-31 16:08 1% ` Theodore Ts'o
[not found] <wCPnMFHETQCgTR9s6iHn8w@geopod-ismtpd-2-0>
[not found] ` <CANmPVK-wsLrn4bp+pJ8j4K-ZYxQfVYqDQSBPLPKoK02KXdHBow@mail.gmail.com>
2021-04-06 14:50 1% ` [Cake] Fwd: Update | Starlink Beta Dave Taht
[not found] <fccbdadc-a57a-f6fe-68d2-0fbac2fd6b81@labbott.name>
2021-09-09 16:58 1% ` [Cake] Fwd: [Tech-board-discuss] Reminder: Voting procedures for the Linux Foundation Technical Advisory Board Dave Taht
[not found] <AD02259F-4E80-42B7-9B02-A50023EEF2F7@cl.cam.ac.uk>
2021-09-29 16:21 2% ` [Cake] Fwd: [NetFPGA-announce] Announcing NetFPGA PLUS 1.0 Dave Taht
2025-01-06 13:38 [Cake] [PATCH net] sched: sch_cake: add bounds checks to host bulk flow fairness counts Toke Høiland-Jørgensen
2025-01-07 3:14 1% ` kernel test robot
2025-04-01 17:27 [Cake] In loving memory of Dave Täht <3 Frantisek Borsik
2025-04-02 1:21 ` [Cake] [Bloat] " David Lang
2025-04-02 9:06 ` Toke Høiland-Jørgensen
[not found] ` <976DC4FC-44CA-4C7E-90E0-DE39B57F01E1@comcast.com>
2025-04-02 13:59 1% ` Livingood, Jason
2025-04-02 19:51 0% ` David P. Reed
2025-04-03 3:28 0% ` [Cake] [Starlink] " the keyboard of geoff goodfellow