[Bloat] mosh, ecn, and diffserv marking

Dave Taht dave.taht at gmail.com
Wed Jun 13 23:02:25 EDT 2012


mosh list: sorry for broadening this discussion to my own list.

bloat list:

This is a discussion of the innovative new partial ssh replacement, mosh,
which is documented here:

http://mosh.mit.edu/

This particular protocol has made my (often crashy) wifi lab,
intercontinental connections, and life so much more manageable and
productive of late that I encourage as many folk as possible to play
with it, and port it to various pieces of equipment...

and we're playing with diffserv/ecn in this new context...

On Wed, Jun 13, 2012 at 3:27 PM, Keith Winstein <keithw at mit.edu> wrote:
> Hi Dave,
>
> We have been testing it (mosh set to AF42 + ECT) at MIT for a few weeks with no reported difficulty
> getting packets through. (I didn't put this in the 1.2.2 release though
> because we haven't really tested it enough, and I'm skittish about sending
> ECT when we just ignore ECN -- it seems like cheating.)
>
> I have found:
>
> (a) OS X refuses to let users set the DiffServ codepoint & ECT and returns
> an error. I think OS X may only permit the original IP TOS types. I haven't
> tested whether the DiffServ codepoint or ECT works separately.

I have asked some other mac folk I know to look into this.

> It works fine
> on Linux and I assume FreeBSD (haven't heard any reports to the contrary).
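
For anyone who wants to try the same marking, here is a minimal sketch of
how AF42 + ECT(0) maps onto the TOS byte and how a program might fall back
when the OS rejects the combined value (as reported above for OS X). The
names tos_byte, apply_marking and PREFERENCES are made up for this
illustration and are not taken from mosh.

```python
import socket

DSCP_AF42 = 36      # AF42 = class 4, drop precedence 2: 8*4 + 2*2
ECN_ECT0 = 0b10     # ECT(0), "ECN-capable transport"


def tos_byte(dscp, ecn):
    """Pack a 6-bit DSCP and a 2-bit ECN field into the IPv4 TOS byte."""
    if not 0 <= dscp < 64 or not 0 <= ecn < 4:
        raise ValueError("dscp must fit in 6 bits and ecn in 2 bits")
    return (dscp << 2) | ecn


def apply_marking(sock, candidates):
    """Try each TOS value in order; return the first one the OS accepts.

    Returns None if every attempt fails, leaving the socket unmarked.
    """
    for tos in candidates:
        try:
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
            return tos
        except OSError:
            continue
    return None


# Preferred marking first, then ECT only, as a fallback for stacks that
# reject the combined value (the fallback order is an assumption).
PREFERENCES = [tos_byte(DSCP_AF42, ECN_ECT0), tos_byte(0, ECN_ECT0)]
```

Usage would be along the lines of creating a UDP socket with
socket.socket(socket.AF_INET, socket.SOCK_DGRAM) and calling
apply_marking(sock, PREFERENCES) before sending.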

My principal concern with enabling ECN in mosh was sites that would
filter out udp packets marked such entirely, so that they would not
reach their destination. Thus far, in my limited testing (and yours),
that seems not to be the case.

If it were, I would recommend a backoff step for mosh's packets:
reverting to not having ECN marking after a lossy period, and
attempting to re-enable it after recovery. This might still be a good
idea...
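
A rough sketch of that backoff step, assuming the sender can tell per
packet whether it was acknowledged. The class name and both thresholds are
hypothetical; nothing here is tuned or taken from mosh.

```python
class EcnBackoff:
    """Disable ECT after a lossy period; re-enable after a clean one."""

    def __init__(self, loss_limit=3, recovery_limit=10):
        self.loss_limit = loss_limit          # consecutive losses before backing off
        self.recovery_limit = recovery_limit  # consecutive successes before retrying ECT
        self.ecn_enabled = True
        self._losses = 0
        self._successes = 0

    def on_result(self, delivered):
        """Record whether one packet was acknowledged; return the new ECN state."""
        if delivered:
            self._successes += 1
            self._losses = 0
            if not self.ecn_enabled and self._successes >= self.recovery_limit:
                self.ecn_enabled = True
                self._successes = 0
        else:
            self._losses += 1
            self._successes = 0
            if self.ecn_enabled and self._losses >= self.loss_limit:
                self.ecn_enabled = False
                self._losses = 0
        return self.ecn_enabled
```

The sender would consult ecn_enabled before choosing the TOS byte for the
next packet.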

I have found multiple firewalls that don't allow mosh packets back at
all, but that has nothing to do with diffserv or ECN marking.
>
> (b) MIT's routers seem to strip the DiffServ codepoint after one hop.

The whole thing? Yeesh.
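
To compare what was sent with what arrived, something like the following
could classify the remarking seen on a path. It is only a sketch: the
function names are invented here, and obtaining the received TOS byte (for
example via IP_RECVTOS ancillary data on Linux) is not shown.

```python
def split_tos(tos):
    """Return (dscp, ecn) from an IPv4 TOS / IPv6 traffic-class byte."""
    return tos >> 2, tos & 0b11


def describe_remarking(sent, received):
    """Summarise what the path did to the DSCP and ECN fields."""
    sent_dscp, sent_ecn = split_tos(sent)
    recv_dscp, recv_ecn = split_tos(received)
    if recv_dscp == sent_dscp:
        dscp = "DSCP preserved"
    elif recv_dscp == 0:
        dscp = "DSCP stripped"
    else:
        dscp = "DSCP rewritten to %d" % recv_dscp
    if recv_ecn == sent_ecn:
        ecn = "ECN preserved"
    elif recv_ecn == 0b11:
        ecn = "ECN marked CE"
    elif recv_ecn == 0:
        ecn = "ECN cleared"
    else:
        ecn = "ECN rewritten"
    return dscp + ", " + ecn
```

For a packet sent as 0x92 (AF42 + ECT(0)), a received byte of 0x02 would be
reported as the codepoint being stripped with ECT left intact.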

> ===
>
> Separate question: If we're setting ECT, what do you recommend we _do_ when
> we get ECN?

> My best idea is that we just alter the timestamp_reply we send
> back, which would be a clean way of getting the other side to gradually slow
> down its transmissions (because it would affect the SRTT, which governs our
> "frame rate").

Sounds promising to me! I would hope to get some of the more radical
thinkers on my list to make suggestions, also...
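
To make the timestamp_reply idea concrete, here is a small sketch of how
backdating the echoed timestamp inflates the sender's RTT sample and hence
its SRTT. The 500 ms penalty and the function names are assumptions for
illustration; the 1/8 gain is the one used by the standard TCP estimator
(RFC 6298). Timestamp wraparound is ignored.

```python
ALPHA = 1.0 / 8.0          # SRTT gain, as in the standard TCP estimator
PENALTY_MS = 500           # hypothetical penalty applied per CE-marked packet


def echoed_timestamp(received_ts_ms, congestion_experienced):
    """Timestamp the receiver echoes; backdated when CE was seen."""
    if congestion_experienced:
        return received_ts_ms - PENALTY_MS
    return received_ts_ms


def update_srtt(srtt_ms, now_ms, timestamp_reply_ms):
    """Fold one RTT sample into the smoothed RTT (EWMA)."""
    sample = now_ms - timestamp_reply_ms
    return (1 - ALPHA) * srtt_ms + ALPHA * sample
```

With an SRTT of 100 ms and a true RTT of 100 ms, one CE-marked packet makes
the sample look like 600 ms, so the SRTT moves up by one eighth of the
difference and then decays again if no further marks arrive.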

>
> But I'm a little concerned about the security implications here -- I don't
> want a bad guy to be able to sniff one packet from us and then replay it a
> million times with ECN set to cause a DoS.

The current implementations of codel and fq_codel have this flaw in
the general case (although ECN defaults to off in codel and on in
fq_codel): any misbehaving stream can bypass the default fq_codel
drop strategy merely by setting ECT, since its packets are then
marked rather than dropped. So it's of much larger concern generally.

(I do have to note that it would take many ECN-marked streams to
actually cause a problem with fq_codel, as it will balance all streams
as best it can.)

As it is very early days of codel deployment I feel we have time to
address this issue with various strategies. The simplest would be to
drop ecn marked packets when the codel target is being exceeded by
(say) 2x.
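
As a sketch of that simplest strategy: once codel has already decided to
signal congestion on a packet, the mark-or-drop choice could depend on how
far the sojourn time is over target. This deliberately leaves out codel's
interval and control law, and the function name is made up. The 5 ms value
is codel's default target; the 2x factor is the guess from above.

```python
TARGET_MS = 5.0         # codel's default target sojourn time
OVERLOAD_FACTOR = 2.0   # the "(say) 2x" from the text


def signal_congestion(sojourn_ms, ect, ecn_enabled=True):
    """Choose how to signal congestion for a packet codel has selected.

    Returns "mark" (set CE and forward) or "drop".
    """
    if not (ecn_enabled and ect):
        return "drop"
    if sojourn_ms > OVERLOAD_FACTOR * TARGET_MS:
        return "drop"   # overloaded: ECT no longer shields the packet
    return "mark"
```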

> So my thinking is we would only
> respond to ECN if it's a _new_ packet (with sequence number greater than any
> seen before). We use the same filter to decide whether to update our timing
> statistics so it seems somewhat natural.

Sounds good to me. Because your frame rate is so low in the first place,
I have actually only seen codel mark a mosh packet twice in tens of GB
of background transfers!
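
The "only react to new packets" filter could be as small as this sketch.
It assumes the sequence number is covered by the protocol's authentication
(the ECN bits in the IP header are not), so a replayed packet can never
carry a fresh sequence number. The class name is hypothetical.

```python
class EcnReplayFilter:
    """React to a CE mark only on packets newer than any seen before."""

    def __init__(self):
        self.highest_seq = -1

    def should_react(self, seq, congestion_experienced):
        """Return True if this packet's CE mark should slow the sender."""
        if seq <= self.highest_seq:
            return False          # replayed or reordered: ignore its ECN bits
        self.highest_seq = seq
        return congestion_experienced
```

A million replays of one sniffed packet would then count at most once.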

However, for future mosh-like applications with higher transfer rates, a
satisfactory answer needs to be worked out.

One possibility is to actually back off even more than what a packet
drop would normally do. This is an option I'm exploring with uTP as I
write.


>
> Have you seen anybody discuss the appropriate use of ECN in a
> security-sensitive transport protocol? TCP does not seem worried about this
> DoS.

Oh brave new world that has such protocols in it!

Not yet. cc'ing the bloat list, too.

>
> ===
>
> Third question for you: I had to take out the "out-of-order delivery"
> warning because we found that MIT's 802.11n network aggressively reordered
> packets, apparently because of the "block acks" feature in the protocol. In
> a full-throttle TCP download, something like 30-50% of all packets will be
> "out of order"! This seems like it will hurt TCP performance (since TCP will
> send an ACK immediately on any out-of-order packet). I have also heard a few
> reports from people outside MIT.
>
> RFC 3366 ("Advice to link designers on link Automatic Repeat reQuest")
> basically says that if a link layer is going to reorder packets, it should
> not cause a huge number of out-of-order deliveries to IP (and should put
> packets back in order itself if necessary and possible without a large
> delay).
>
> Have you heard anything about this and whether it's common in major 802.11n
> deployments or was discussed by the IETF community? It seems like it may be
> really hurting TCP performance.
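
For anyone wanting to measure this on their own network, one simple way to
put a number on reordering is to count packets that arrive after a
higher-numbered packet has already been seen. This is only a sketch of one
possible definition, with an invented function name; it is not how the
30-50% figure quoted above was obtained.

```python
def reordered_fraction(seqs):
    """Fraction of packets that arrived after a higher-numbered packet."""
    highest = None
    late = 0
    for seq in seqs:
        if highest is not None and seq < highest:
            late += 1
        else:
            highest = seq
    return late / len(seqs) if seqs else 0.0
```

Feeding it the list of sequence numbers in arrival order from a capture
gives a figure that can be compared across networks.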

I am painfully aware of issues like these in the wireless-n deployment
and should probably speak to them in a separate thread from this one,
and perhaps on different lists entirely, too.

See also scary stuff like the infinite retry bugs present in millions
of deployed wireless-n network drivers...

http://www.bufferbloat.net/issues/390
http://www.bufferbloat.net/issues/216

A few more major bits of data regarding dense wireless networks also
went by my eyeballs recently; I'll have to find the latest and repost
it. (Summary: in dense 802.11 networks, over 70% of packets are not
actual data but frame and beacon related.)

Wifi has a major, major, major set of interrelated issues that is
going to take a major effort to resolve.

>
> Cheers,
> Keith
>
>
> On Wed, 13 Jun 2012, Dave Taht wrote:
>
>> I am curious if you were able to measure/see
>> any differences in network latency (particularly over wifi) by the
>> switch to AF42 + ECN, or
>> had any difficulties getting packets through?
>>
>> I saw all sorts of interesting behavior on a recent test across the
>> country as to stripping bits (some ended up with ECN set, but nothing
>> else, others ended up with CS1 set, some preserved all the bits), but
>> lacking a means to compare A/B
>>
>> all I can say is that it feels better on saturated best effort wifi
>> networks.
>>
>>
>> On Wed, Jun 13, 2012 at 12:31 PM, Keith Winstein <keithw at mit.edu> wrote:
>>>
>>> Hello Mosh users and developers,
>>>
>>> mosh 1.2.2 has been released.
>>>
>>> The source code is at:
>>> https://github.com/downloads/keithw/mosh/mosh-1.2.2.tar.gz
>>>
>>> This is a minor maintenance release that:
>>>
>>> * Removes the warning on out-of-order packets (this was firing too
>>> often on some Wi-Fi networks)
>>>
>>> * Adds an experimental "--predict=experimental" mode to predict
>>> immediately, even if recent predictions were incorrect
>>>
>>> ===
>>>
>>> mosh 1.2.2 is backwards-compatible with mosh clients back to 0.96 and
>>> mosh servers back to 1.0.9. Please let us know of any problems
>>> (https://github.com/keithw/mosh/issues).
>>>
>>> Best regards from the Mosh team,
>>> Keith
>>> _______________________________________________
>>> mosh-devel mailing list
>>> mosh-devel at mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/mosh-devel
>>
>>
>>
>>
>> --
>> Dave Täht
>> SKYPE: davetaht
>> http://ronsravings.blogspot.com/



-- 
Dave Täht
SKYPE: davetaht
http://ronsravings.blogspot.com/


