Re: Help me get started please - Toke Høiland-Jørgensen

Historic archive of defunct list bloat-devel@lists.bufferbloat.net

From: "Toke Høiland-Jørgensen" <toke@toke.dk>
To: Siavash Eghlimi <eghlimidev@gmail.com>,
	bloat-devel@lists.bufferbloat.net
Subject: Re: Help me get started please
Date: Sat, 28 Oct 2017 22:38:58 +0200
Message-ID: <87o9oruhwd.fsf@toke.dk>
In-Reply-To: <CAEPXwpNxNSDh-b5iXwTKp_24vOAvZ+aP-Hk-Yneqahon+5N4wA@mail.gmail.com>

Hi Siavash,

Awesome that you think bufferbloat is an interesting subject!

> in other words: "I wanna start working on this problem in Linux. what
> should i do? what to read? etc. etc."

Well that's easy: You can do anything! Such is the beauty of open source :)

Which means that what you'll probably want to do is to figure out what
you think is most interesting and work from there. The bufferbloat issue
touches on many different aspects of networking, from the lowest to the
highest layers, such as:

- Low-layer network protocols' design and their impact on buffering (ex:
  WiFi networks can't do aggregation without some buffering)

- Interactions between networking hardware and the operating system
  drivers (ex: many drivers will push a ton of packets into the hardware
  only to have it queued there; some have been fixed, but DSL drivers
  are still really bad at this, to name one)

- Interactions between network drivers and the rest of the networking
  stack (ex: we had to completely restructure the way the WiFi subsystem
  interacts with the rest of the networking stack to fix bufferbloat
  there)

- The general purpose queueing layer itself (ex: implementing AQM and
  packet scheduling algorithms as Linux qdiscs)

- The interface between higher level protocols and sockets and the
  networking stack (ex: TCP small queues makes sure that only a minimal
  amount of data is pushed into the qdisc layer)

- The end-to-end feedback loop between endpoints (ex: the BBR congestion
  control)

So a good starting point is trying to figure out which of these areas
you think is most interesting and start looking into that.
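
As a rough illustration of why the buffering in the examples above
matters at any of these layers: the delay a full FIFO buffer adds is
just its backlog divided by the link rate. The sketch below is only
that arithmetic; the function name and the example numbers (256
packets of 1500 bytes, a 1 Mbit/s uplink) are assumptions picked for
illustration, not measurements.

```python
def queueing_delay_ms(backlog_bytes, link_rate_bps):
    """Time (in ms) needed to drain a FIFO backlog at a given link rate.

    A packet arriving behind `backlog_bytes` of queued data has to wait
    this long before it is transmitted.
    """
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return backlog_bytes * 8 * 1000 / link_rate_bps


# Hypothetical example: a buffer holding 256 full-size (1500 byte)
# packets in front of a 1 Mbit/s link.
example_delay_ms = queueing_delay_ms(256 * 1500, 1_000_000)
print(f"queueing delay: {example_delay_ms:.0f} ms")
```

The same backlog on a link ten times faster drains ten times quicker,
which is why a fixed-size buffer that is harmless on a fast link can
add seconds of delay on a slow one.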

Secondly, it will probably help to take a step back and think about what
it is you want to achieve (at a high level) before you start diving into
code. Learning the intricacies of the kernel is an almost limitless
rathole that you can spend months or even years in without getting
any closer to your goal, especially if that goal is not well-defined.

Thirdly, think about what is most important to you: Diving into the code
and hacking on stuff, or writing a good master's thesis. Sadly, those
two are often quite distinct; and though there is some overlap in the
things you'll need to do to achieve them, conflating the two objectives
is a good way to do badly at both.

As for concrete ideas: One way to get into an area such as this is to do
a measurement study or another kind of evaluation of existing
functionality. I did two of those myself during my master's studies, one
of them for my thesis. See [1] and [2] below for links. The nice thing
about starting with that is that you can improve your understanding of
the system you are measuring while simultaneously producing useful data;
and if you do want to change some functionality you are going to need to
do measurements anyway to evaluate the results of your changes. So
pick a subsystem (obvious contenders: BBR and other TCP congestion
controls if you want to go high in the stack, different qdiscs if
you're into queueing and packet scheduling, or non-Ethernet drivers
such as DSL or WiFi if you want to work at the lowest levels),
figure out what you want to do, and start measuring :)
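
A minimal sketch of what the analysis side of such a measurement could
look like, assuming you already have two lists of round-trip times in
milliseconds: one collected on an idle link and one collected while
the link is saturated (for instance from ping output). The function
names and the sample values are hypothetical and only there to show
the idea; a real study would use many more samples and a proper test
tool.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * p / 100))
    return ordered[rank - 1]


def summarize(idle_rtts_ms, loaded_rtts_ms):
    """Compare RTTs measured on an idle link with RTTs under load."""
    idle_median = statistics.median(idle_rtts_ms)
    loaded_median = statistics.median(loaded_rtts_ms)
    return {
        "idle_median_ms": idle_median,
        "loaded_median_ms": loaded_median,
        "loaded_p95_ms": percentile(loaded_rtts_ms, 95),
        # Extra latency induced by the load, i.e. the queueing delay.
        "induced_latency_ms": loaded_median - idle_median,
    }


# Made-up sample data, purely for illustration.
idle = [10, 11, 12, 10, 11]
loaded = [50, 200, 120, 80, 150]
for name, value in summarize(idle, loaded).items():
    print(f"{name}: {value}")
```

The difference between the loaded and idle medians is the number to
watch: it is the latency added by queueing somewhere on the path, and
it is what a change to a qdisc, driver or congestion control should
bring down.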

And as for literature: Take a look at the references on the Wikipedia
Bufferbloat page. I updated that recently and added a couple of new
ones, but the old ACM article[3] is still the most comprehensive overview,
I think. "Understanding Linux Network Internals"[4] is nice for an
introduction to Linux network stack concepts, but it is quite dated now.
Ditto for "Linux Device Drivers"[5]. I'm not aware of any up-to-date
books on the kernel network stack. Free Electrons have some free
training material[6], mostly in slide form, that you may find useful.

Hope the above is useful; and good luck with your thesis work!

-Toke

[1] https://rex.kb.dk/primo-explore/fulldisplay?docid=RUC_studentstudentproject/75e69709-0000-48ab-91a9-32df79bbdfcf&context=L&vid=NUI&search_scope=RUC_student&tab=default_tab&lang=da_DK

[2] https://rex.kb.dk/primo-explore/fulldisplay?docid=RUC_studentstudentproject/60796a38-9395-4347-bf03-cc325dfbe28a&context=L&vid=NUI&search_scope=RUC_student&tab=default_tab&lang=da_DK

[3] https://cacm.acm.org/magazines/2012/1/144810-bufferbloat/fulltext

[4] https://www.amazon.com/Understanding-Linux-Network-Internals-Networking/dp/0596002556/

[5] https://lwn.net/Kernel/LDD3/

[6] http://free-electrons.com/docs/

