From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-la0-f53.google.com (mail-la0-f53.google.com [209.85.215.53])
        (using TLSv1 with cipher RC4-SHA (128/128 bits))
        (Client CN "smtp.gmail.com", Issuer "Google Internet Authority" (verified OK))
        by huchra.bufferbloat.net (Postfix) with ESMTPS id 65B2221F0BD
        for ; Fri, 4 Jan 2013 13:19:37 -0800 (PST)
Received: by mail-la0-f53.google.com with SMTP id fn20so10313836lab.40
        for ; Fri, 04 Jan 2013 13:19:35 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20120113;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :cc:content-type;
        bh=8rQllRxFJzq0meU0VwcqXHIW8ajeygxFZQBGp+52NBE=;
        b=flsvvEcawivK5965NlggnSC57qg4Xq/zzRYpDIhPkaz1JdnOZbLS8BaK/xRqAdfA3h
         QHoQ/9ifFO2K++pL4eCq8u7UhzhhMTsZwi1r8xyvZhhteb+Q6mG8AdaXIuSWIs5r3S+W
         IGnXgSoUk2s7vLujty4JtJJ0jxHZLlEqooAvN3+Z8Kqniun3c92SjrTz3Yhod2K3tl/f
         Q/LFCf+YifLyH2CpeWu8y+jy6BwoFoesRxAJ72EXKArC9pG1Pq+/M/R4GVbWDzGee+kf
         aI/lGQoEy87shpqYP09J8yZYMGLOOs9GeHPn20PZaBjXGusoaF9CQrqexuVQCqVVtRaB
         mY3g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20120113;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :cc:content-type:x-gm-message-state;
        bh=8rQllRxFJzq0meU0VwcqXHIW8ajeygxFZQBGp+52NBE=;
        b=SKOxaYyF4vEeUfNO8WVBp9dLJGeardAugdSxN++STSfOYScls4qPqFC/8UxDw18J9h
         0KLXHz+iKU2mJ7NeRODXqIHsvgxi61ctPOXTyzXAAepaTatz3oZS5lvbCG0IJ09Q8jZI
         czAoxOlKBIpD0r2J52uhYqX95B0OhZzbbcNGZGXFuB+uOem2vJIDZGaTX5JgdpegiHwJ
         zTATTk7QK7MSgo1oH980V188n3g3NuI7lT3nY2fXMrnOu0fhESrRNeygkKBStxTlr9d2
         jQZLvDAKEV9KyCIGTTNutiaL73OKh11CNW/cC6qEkpVD3wYVUyHyYF0KbPuiKsUl9hRi
         fWIQ==
MIME-Version: 1.0
Received: by 10.112.24.161 with SMTP id v1mr21822756lbf.28.1357334374929;
        Fri, 04 Jan 2013 13:19:34 -0800 (PST)
Received: by 10.112.49.134 with HTTP; Fri, 4 Jan 2013 13:19:34 -0800 (PST)
In-Reply-To:
References:
Date: Fri, 4 Jan 2013 13:19:34 -0800
Message-ID:
From: Jerry Chu
To: Dave Taht
Content-Type: multipart/alternative;
        boundary=485b390f7de442544504d27d0955
X-Gm-Message-State: ALoCoQkmJRYbBO/Y8Bkeav212Nm5Ht4YB4RQZAykqQUDwFfJjK36T/RZoWHg9dQ1CKqkOHTC0pOwpwfiLtjyApEaYl8LrOC9ba9hHBKPknvai68k+sLOiFvME2heHhRHPOK4+IVhvGPdj3atQZuJ6QFstWSc8SIfH3k2zc7o/0Tq6Fael6JjvbPAmC+UJyLh/2H7b+NtAFtgL5yMPWH8iX3AkjmmtToFuw==
X-Mailman-Approved-At: Fri, 04 Jan 2013 13:45:31 -0800
Cc: Eric Dumazet, cerowrt-devel, Yuchung Cheng
Subject: Re: [Cerowrt-devel] TFO crashes cerowrt 3.7.1-1
X-BeenThere: cerowrt-devel@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Development issues regarding the cerowrt test router project
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Jan 2013 21:19:38 -0000

--485b390f7de442544504d27d0955
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

+ycheng

On Fri, Jan 4, 2013 at 1:11 PM, Dave Taht <dave.taht@gmail.com> wrote:

> Hmm. I would lean towards there being an issue with the new (freshly
> ported forward to 3.7.1) unaligned checksum code for mips based on
> what you say here. Or an offload...
>
> As for the 239.x multicast issue, hmm... separate issue entirely.
> Probably...
>
> And then there's TFO. I note that in order to use it properly you need
> to turn it on in proc. Last I remember that was
>
> echo 3 > /proc/sys/net/ipv4/tcp_fastopen
>

Correct - to enable the normal use of TFO for both client and server.
There are other flags for advanced usage:

/* Bit Flags for sysctl_tcp_fastopen */
#define TFO_CLIENT_ENABLE       1
#define TFO_SERVER_ENABLE       2
#define TFO_CLIENT_NO_COOKIE    4       /* Send data-in-SYN w/o cookie */

/* Process SYN data but skip cookie validation */
#define TFO_SERVER_COOKIE_NOT_CHKED     0x100
/* Accept SYN data w/o any cookie option */
#define TFO_SERVER_COOKIE_NOT_REQD      0x200

/* Force enable TFO on all listeners, i.e., not requiring the
 * TCP_FASTOPEN socket option. SOCKOPT1/2 determine how to set max_qlen.
 */
#define TFO_SERVER_WO_SOCKOPT1  0x400
#define TFO_SERVER_WO_SOCKOPT2  0x800
/* Always create TFO child sockets on a TFO listener even when
 * cookie/data not present. (For testing purpose!)
 */
#define TFO_SERVER_ALWAYS       0x1000

> However that's an old memory and there is this tcp_fastopen_key file I
> don't know anything about yet (this is such bleeding edge stuff!)
>
> ... and with tcp_fastopen disabled things should still work right...
> so I'm thinking something else is busted in the stack.
>
> I've also observed a dns slowdown in what I've been testing but hadn't
> dug into packet dumps. (and was assuming, until now, it was due to me
> fiddling with ULAs inside the network) Thanks for digging this deep!
>
> I never said this first attempt at 3.7 for cero was going to be
> perfect, but we've entered a new age of subtle problems here.
>
> I strongly suggest nobody else try this dev build as a default gw, and
> that the TFO folk ignore the noise for now.
>

SG.

Jerry

>
> I just got a 3.7.1 box built on x86_64 so as to a/b some captures.
> Regrettably I'm short on time through the weekend...
>
> On Fri, Jan 4, 2013 at 12:42 PM, Maciej Soltysiak <maciej@soltysiak.com> wrote:
> > I am seeing something strange here, with polipo related to TFO but also DNS.
> > When I just took 3.7.1-1 and set my windows 7 laptop to use gw.home.lan:8123
> > as http proxy it didn't work. What I observed was:
> > A) after quite a while polipo's response to browser was 504 Host
> > www.osnews.com lookup failed: Timeout
> > b) this error in ssh console: Host osnews.com lookup failed: Timeout
> > (131072)
> > c) Disabling TFO by adding option useTCPFastOpen 'false' to config 'polipo'
> > 'general' works around the problem
> > d) Alternatively, you can keep TFO enabled in polipo but change option
> > 'dnsUseGethostbyname' from 'reluctantly' to 'true' (!)
> > This is very weird, because TFO is TCP and the DNS queries fired off by
> > polipo are UDP:
> > root@OpenWrt:/tmp/log# tcpdump -n -v -vv -vvv -x -X -s 1500 -i lo
> > 20:21:56.160245 IP (tos 0x0, ttl 64, id 50129, offset 0, flags [DF], proto
> > UDP (17), length 60)
> > 127.0.0.1.47304 > 127.0.0.1.53: [bad udp cksum 0xfe3b -> 0xd17f!] 55396+ A?
> > www.osnews.com. (32)
> > 0x0000:  4500 003c c3d1 4000 4011 78dd 7f00 0001  E..<..@.@.x.....
> > 0x0010:  7f00 0001 b8c8 0035 0028 fe3b d864 0100  .......5.(.;.d..
> > 0x0020:  0001 0000 0000 0000 0377 7777 066f 736e  .........www.osn
> > 0x0030:  6577 7303 636f 6d00 0001 0001            ews.com.....
> > 20:21:56.160319 IP (tos 0x0, ttl 64, id 50130, offset 0, flags [DF], proto
> > UDP (17), length 60)
> > 127.0.0.1.47304 > 127.0.0.1.53: [bad udp cksum 0xfe3b -> 0xd164!] 55396+
> > AAAA? www.osnews.com. (32)
> > 0x0000:  4500 003c c3d2 4000 4011 78dc 7f00 0001  E..<..@.@.x.....
> > 0x0010:  7f00 0001 b8c8 0035 0028 fe3b d864 0100  .......5.(.;.d..
> > 0x0020:  0001 0000 0000 0000 0377 7777 066f 736e  .........www.osn
> > 0x0030:  6577 7303 636f 6d00 001c 0001            ews.com.....
> > 20:21:56.169942 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP
> > (17), length 123)
> > 127.0.0.1.53 > 127.0.0.1.47304: [bad udp cksum 0xfe7a -> 0x5f73!] 55396 q:
> > A? www.osnews.com. 1/2/0 www.osnews.com. [29m3s] A 74.86.31.159 ns:
> > osnews.com. [29m3s] NS ns2.swelter.net., osnews.com. [29m3s] NS
> > ns1.swelter.net. (95)
> > 0x0000:  4500 007b 0000 4000 4011 3c70 7f00 0001  E..{..@.@.<p....
> > 0x0010:  7f00 0001 0035 b8c8 0067 fe7a d864 8180  .....5...g.z.d..
> > 0x0020:  0001 0001 0002 0000 0377 7777 066f 736e  .........www.osn
> > 0x0030:  6577 7303 636f 6d00 0001 0001 c00c 0001  ews.com.........
> > 0x0040:  0001 0000 06cf 0004 4a56 1f9f c010 0002  ........JV......
> > 0x0050:  0001 0000 06cf 0011 036e 7332 0773 7765  .........ns2.swe
> > 0x0060:  6c74 6572 036e 6574 00c0 1000 0200 0100  lter.net........
> > 0x0070:  0006 cf00 0603 6e73 31c0 40              ......ns1.@
> > 20:21:56.173901 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP
> > (17), length 135)
> > 127.0.0.1.53 > 127.0.0.1.47304: [bad udp cksum 0xfe86 -> 0x8ecb!] 55396 q:
> > AAAA? www.osnews.com. 1/2/0 www.osnews.com. [54m44s] AAAA
> > 2607:f0d0:1002:62::3 ns: osnews.com. [29m3s] NS ns1.swelter.net.,
> > osnews.com. [29m3s] NS ns2.swelter.net. (107)
> > 0x0000:  4500 0087 0000 4000 4011 3c64 7f00 0001  E.....@.@.<d....
> > 0x0010:  7f00 0001 0035 b8c8 0073 fe86 d864 8180  .....5...s...d..
> > 0x0020:  0001 0001 0002 0000 0377 7777 066f 736e  .........www.osn
> > 0x0030:  6577 7303 636f 6d00 001c 0001 c00c 001c  ews.com.........
> > 0x0040:  0001 0000 0cd4 0010 2607 f0d0 1002 0062  ........&......b
> > 0x0050:  0000 0000 0000 0003 c010 0002 0001 0000  ................
> > 0x0060:  06cf 0011 036e 7331 0773 7765 6c74 6572  .....ns1.swelter
> > 0x0070:  036e 6574 00c0 1000 0200 0100 0006 cf00  .net............
> > 0x0080:  0603 6e73 32c0 4c                        ..ns2.L
> > This is the only DNS traffic I saw during the attempts. The tcpdumps have
> > udp bad checksum but when I disabled TFO in polipo, the UDP where still bad
> > checksum but they worked.
> > Really weird.
> > p.s. UPNP still works for port forwarding negotiation as it did in 3.6.11-4
> > I still couldn't get the UPNP/SSDP broadcasts (udp to 239.255.255.250) to
> > being forwarded between se00 and sw00/sw10. Last time it worked was ~3.3.8.
> > I'm starting not to question why it doesn't work, I'm starting to wonder why
> > it did work then ;-)
> > Regards,
> > Maciej
> > On Fri, Jan 4, 2013 at 6:33 PM, Dave Taht <dave.taht@gmail.com> wrote:
> >>
> >> On Fri, Jan 4, 2013 at 9:27 AM, Eric Dumazet <edumazet@google.com> wrote:
> >> > Sorry, could you give us a copy of the panic stack trace ?
> >>
> >> I will get a serial console up on a wndr3800 by sunday. (sorry, just
> >> landed in california, am in disarray)
> >>
> >> The latest dev build of cero for the wndr3800 and wndr3700v2 is at:
> >>
> >> http://snapon.lab.bufferbloat.net/~cero2/cerowrt/wndr/3.7.1-1/
> >>
> >> --
> >> Dave Täht
> >>
> >> Fixing bufferbloat with cerowrt:
> >> http://www.teklibre.com/cerowrt/subscribe.html
> >> _______________________________________________
> >> Cerowrt-devel mailing list
> >> Cerowrt-devel@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> >
> >
>
>
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt:
> http://www.teklibre.com/cerowrt/subscribe.html
>
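For anyone decoding a tcp_fastopen value by hand, the bit flags quoted in this message can be unpacked with a few lines of Python. This is only a sketch: the flag names and values are copied from the definitions quoted in this thread (3.7-era kernel), the helper name decode_tfo is made up for illustration, and other kernel versions may define different bits.

```python
# Bit flags for the tcp_fastopen sysctl, copied from the definitions quoted
# in this thread. Assumption: other kernel versions may differ.
TFO_FLAGS = {
    0x1: "TFO_CLIENT_ENABLE",
    0x2: "TFO_SERVER_ENABLE",
    0x4: "TFO_CLIENT_NO_COOKIE",
    0x100: "TFO_SERVER_COOKIE_NOT_CHKED",
    0x200: "TFO_SERVER_COOKIE_NOT_REQD",
    0x400: "TFO_SERVER_WO_SOCKOPT1",
    0x800: "TFO_SERVER_WO_SOCKOPT2",
    0x1000: "TFO_SERVER_ALWAYS",
}


def decode_tfo(value):
    """Hypothetical helper: return (names of set flags, leftover unknown bits)."""
    names = [name for bit, name in TFO_FLAGS.items() if value & bit]
    unknown = value
    for bit in TFO_FLAGS:
        unknown &= ~bit  # clear every bit the table knows about
    return names, unknown


if __name__ == "__main__":
    # 3 is the value written by "echo 3 > /proc/sys/net/ipv4/tcp_fastopen".
    print(decode_tfo(3))
```

With the value 3 this reports the client and server enable bits and no leftover bits, which matches the "normal use" case described in the reply.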
--485b390f7de442544504d27d0955--
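A note on the "bad udp cksum" lines in the capture: in each packet the checksum field equals the one's-complement sum of the UDP pseudo-header alone (0xfe3b for the 40-byte queries). That is consistent with the sender leaving the final checksum to be filled in later (checksum offload), which is commonly seen on loopback captures and is usually harmless. It also fits the observation in the thread that the checksums stayed "bad" even when lookups worked. The sketch below recomputes both values from the first packet's hex dump. It assumes the dump was transcribed completely, and the helper names are illustrative only.

```python
import struct

# First packet from the tcpdump output in this thread:
# IPv4 header + UDP header + DNS query for www.osnews.com (A record).
PACKET = bytes.fromhex(
    "4500 003c c3d1 4000 4011 78dd 7f00 0001 "
    "7f00 0001 b8c8 0035 0028 fe3b d864 0100 "
    "0001 0000 0000 0000 0377 7777 066f 736e "
    "6577 7303 636f 6d00 0001 0001"
)


def ones_complement_sum(data):
    """16-bit one's-complement sum of data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return total


def udp_checksums(packet):
    """Return (pseudo-header sum, checksum field as captured, correct checksum)."""
    ihl = (packet[0] & 0x0F) * 4                 # IPv4 header length in bytes
    udp = packet[ihl:]
    length = struct.unpack("!H", udp[4:6])[0]    # UDP length field
    udp = udp[:length]
    # Pseudo-header: source address, destination address, zero, protocol 17, UDP length.
    pseudo = packet[12:20] + struct.pack("!BBH", 0, 17, length)
    field = struct.unpack("!H", udp[6:8])[0]
    zeroed = udp[:6] + b"\x00\x00" + udp[8:]     # checksum field set to zero
    correct = 0xFFFF ^ ones_complement_sum(pseudo + zeroed)
    return ones_complement_sum(pseudo), field, correct


if __name__ == "__main__":
    print("pseudo=0x%04x field=0x%04x correct=0x%04x" % udp_checksums(PACKET))
```

For this packet the pseudo-header sum and the captured field are both 0xfe3b, and the recomputed checksum is 0xd17f, the same pair tcpdump prints as "0xfe3b -> 0xd17f".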