[Bloat] cake + ipv6

Tue Aug 18 00:15:55 EDT 2020

On 18/08/2020 06:44, Daniel Sterling wrote:
> ...is it possible to identify (and thus classify)
> plain old bulk downloads, as separate from video streams? They're both
> going to use http / https (or possibly QUIC) -- and they're both
> likely to come from CDN networks... I can't think of a simple way to
> tell them apart.

If there were an easy way to do it, I would already have done so.  We are
unfortunately hamstrung by some bad design and deployment around 
Diffserv, which might otherwise provide a useful end-to-end visible 
signal here.

> Is this enough of a problem that people would try to make a list of
> netblocks / prefixes that belong to video vs other CDN content?

It's possible that someone is doing this, but I don't specifically know 
of such a source of information.  It would of course be better to find a 
solution that didn't rely on white/black lists, which have a distressing 
habit of going stale.

But one of the more reliable ways might be to use Autonomous System (AS) 
information.  An AS is an organisational unit used for assigning IP
address ranges and for routing, and usually corresponds to a more-or-less
significant Internet organisation.  It should be feasible to map an
observed IP address to an AS, then look up the address blocks assigned 
to that AS, thereby capturing a whole range of related IP addresses.
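
As a rough illustration of that two-step lookup, here is a minimal
Python sketch.  The PREFIX_TO_AS table is entirely hypothetical (it uses
documentation address ranges and private-use AS numbers); a real table
would have to be built from routing or registry data, which this sketch
does not attempt.  The function names are mine, not part of any existing
tool.

```python
import ipaddress

# Hypothetical prefix -> AS number table, for illustration only.
# Documentation prefixes and private-use ASNs, not real assignments.
PREFIX_TO_AS = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64500,
    "203.0.113.0/24": 64501,
    "2001:db8:1::/48": 64500,
}

def as_for_address(addr, table=PREFIX_TO_AS):
    """Return the AS number of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if net.version == ip.version and ip in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, asn)
    return best[1] if best else None

def prefixes_for_as(asn, table=PREFIX_TO_AS):
    """Return every prefix in the table assigned to the given AS."""
    return sorted(p for p, a in table.items() if a == asn)

def related_prefixes(addr, table=PREFIX_TO_AS):
    """Map one observed address to all prefixes of the AS it belongs to."""
    asn = as_for_address(addr, table)
    return [] if asn is None else prefixes_for_as(asn, table)

if __name__ == "__main__":
    # One observed IPv4 address pulls in the AS's other v4 and v6 blocks.
    print(related_prefixes("192.0.2.55"))
```

The resulting prefix list could then feed whatever classifier you are
experimenting with, and it covers both IPv4 and IPv6 blocks of the same
organisation.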

> I do notice video streams are much more bursty than plain downloads
> for me, but that may not hold for all users.
> 
> That is, for me at least, a video stream may average 5mbps over, say,
> 1 minute, but it will sit at 0mbps for a while and then burst at
> 20mbps for a bit.

Correct, YouTube at least likes to fetch a big block of data from disk 
and send it all at once, then rely on the client buffer to tide it over 
while the disk services other requests.  It makes some sense when you 
consider how slow disk seeks are relative to the number of clients they 
need to support, each of which will generally be watching a different 
video (or at least a different part of the same one).

However, this burstiness disappears on the wire just when you would like 
to use it to identify traffic, i.e. when the video traffic saturates the
bandwidth available to it.  If there's only just enough bandwidth, or 
even *less* than what is required, then YouTube sends data continuously 
into the client buffer, trying to keep it as full as possible.
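
To make the effect concrete, here is a toy Python sketch using the
numbers quoted above (20 Mbit/s bursts averaging 5 Mbit/s).  It measures
burstiness as a peak-to-mean ratio over per-interval throughput samples,
and models the bottleneck as a simple fluid queue with a rate cap.  That
queue model, the shape() and peak_to_mean() helpers, and the sample
values are all my assumptions for illustration; this is not how YouTube
or cake actually behaves internally.

```python
def peak_to_mean(samples_mbps):
    """Peak-to-mean ratio of throughput samples; 1.0 means perfectly smooth."""
    mean = sum(samples_mbps) / len(samples_mbps)
    if mean == 0:
        return 0.0
    return max(samples_mbps) / mean

def shape(offered_mbps, cap_mbps):
    """Toy fluid model: pass traffic through a link capped at cap_mbps,
    carrying any excess over to later intervals as backlog."""
    carried = []
    backlog = 0.0
    for offered in offered_mbps:
        backlog += offered
        sent = min(backlog, cap_mbps)
        carried.append(sent)
        backlog -= sent
    return carried

# One interval at 20 Mbit/s followed by three idle ones, repeated:
# the average is 5 Mbit/s.
offered = [20, 0, 0, 0] * 3

if __name__ == "__main__":
    print(peak_to_mean(shape(offered, 100)))  # plenty of headroom: bursty
    print(peak_to_mean(shape(offered, 5)))    # link just fits the average: smooth
```

With plenty of headroom the ratio stays at 4, but once the cap equals
the average rate every interval carries the same amount and the ratio
drops to 1, so the signal is gone exactly when the link is saturated.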

There are no easy answers here.  But I've suggested some things to look 
for and try out.

  - Jonathan Morton