Without TFO all worked fine. The problem is when the TFO server is on the cero box.
I will try both enabling ECN on the laptop and disabling ECN on cero, with TFO on.
Will report the behavior seen.

Thanks,
Ketan.

On Jan 5, 2013 7:50 AM, "Yuchung Cheng" wrote:
> On Fri, Jan 4, 2013 at 5:59 PM, Ketan Kulkarni wrote:
> > Well, I was trying polipo server on cero box and httping from laptop. On
> > both the boxes I set 3 in tcp_fastopen.
> >
> > The panic is seen only when server is on cero box.
> > If I run server on my laptop and httping from cero all TFO connections are
> > successful.
> > So I doubt its the only problem is SYN+DATA.
> Just to confirm: you meant the problem is SYN/data processing on the
> server side?
>
> Maybe we hit some ECN / TFO bug. Some crash log would be great. Thanks
> for trying TFO!
>
> >
> > Unfortunately I don't have the serial cable right now, and logread or dmesg
> > didn't print any logs before the cero router restarted.
> >
> > Attached is the tcpdump capture on lo when client and server both run on
> > cero box.
> > HTH!
> >
> > If you (or anyone) can suggest more diagnostics, I will be glad to provide.
> >
> > On Jan 5, 2013 2:49 AM, "Jerry Chu" wrote:
> >>
> >> +ycheng
> >>
> >>
> >> On Fri, Jan 4, 2013 at 1:11 PM, Dave Taht wrote:
> >>>
> >>> Hmm. I would lean towards there being an issue with the new (freshly
> >>> ported forward to 3.7.1) unaligned checksum code for mips based on
> >>> what you say here. Or an offload...
> >>>
> >>> As for the 239.x multicast issue, hmm... separate issue entirely.
> >>> Probably...
> >>>
> >>> And then there's TFO. I note that in order to use it properly you need
> >>> to turn it on in proc. Last I remember that was
> >>>
> >>> echo 3 > /proc/sys/net/ipv4/tcp_fastopen
> >>
> >>
> >> Correct - to enable the normal use of TFO for both client and server.
> >> There are other flags for advanced usage:
> >> /* Bit Flags for sysctl_tcp_fastopen */
> >> #define TFO_CLIENT_ENABLE 1
> >> #define TFO_SERVER_ENABLE 2
> >> #define TFO_CLIENT_NO_COOKIE 4 /* Send data-in-SYN w/o cookie */
> >>
> >> /* Process SYN data but skip cookie validation */
> >> #define TFO_SERVER_COOKIE_NOT_CHKED 0x100
> >> /* Accept SYN data w/o any cookie option */
> >> #define TFO_SERVER_COOKIE_NOT_REQD 0x200
> >>
> >> /* Force enable TFO on all listeners, i.e., not requiring the
> >>  * TCP_FASTOPEN socket option. SOCKOPT1/2 determine how to set max_qlen.
> >>  */
> >> #define TFO_SERVER_WO_SOCKOPT1 0x400
> >> #define TFO_SERVER_WO_SOCKOPT2 0x800
> >> /* Always create TFO child sockets on a TFO listener even when
> >>  * cookie/data not present. (For testing purpose!)
> >>  */
> >> #define TFO_SERVER_ALWAYS 0x1000
> >>
> >>>
> >>> However that's an old memory and there is this tcp_fastopen_key file I
> >>> don't know anything about yet (this is such bleeding edge stuff!)
> >>>
> >>> ... and with tcp_fastopen disabled things should still work right...
> >>> so I'm thinking something else is busted in the stack.
> >>>
> >>> I've also observed a dns slowdown in what I've been testing but hadn't
> >>> dug into packet dumps. (and was assuming, until now, it was due to me
> >>> fiddling with ULAs inside the network) Thanks for digging this deep!
> >>>
> >>> I never said this first attempt at 3.7 for cero was going to be
> >>> perfect, but we've entered a new age of subtle problems here.
> >>>
> >>> I strongly suggest nobody else try this dev build as a default gw, and
> >>> that the TFO folk ignore the noise for now.
> >>
> >>
> >> SG.
> >>
> >> Jerry
> >>
> >>>
> >>>
> >>> I just got a 3.7.1 box built on x86_64 so as to a/b some captures.
> >>> Regrettably I'm short on time through the weekend...
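
[Editorial note] The sysctl value is a bitmask, so the "echo 3" quoted above amounts to TFO_CLIENT_ENABLE | TFO_SERVER_ENABLE. As a rough illustration only, the Python sketch below decodes a tcp_fastopen value into the flag names from the header excerpt in this message. The helper name decode_tfo_flags is hypothetical, and the bit values are copied from this thread (a 3.7-era kernel); later kernels may define them differently.

```python
# Bit values copied from the header excerpt quoted in this thread
# (3.7-era kernel); treat them as assumptions for any other kernel version.
TFO_FLAGS = {
    0x1: "TFO_CLIENT_ENABLE",
    0x2: "TFO_SERVER_ENABLE",
    0x4: "TFO_CLIENT_NO_COOKIE",
    0x100: "TFO_SERVER_COOKIE_NOT_CHKED",
    0x200: "TFO_SERVER_COOKIE_NOT_REQD",
    0x400: "TFO_SERVER_WO_SOCKOPT1",
    0x800: "TFO_SERVER_WO_SOCKOPT2",
    0x1000: "TFO_SERVER_ALWAYS",
}


def decode_tfo_flags(value):
    """Return the names of the flag bits set in a tcp_fastopen value.

    Hypothetical helper for illustration; names are listed in ascending
    bit order. Bits not present in TFO_FLAGS are silently ignored.
    """
    return [name for bit, name in sorted(TFO_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    # 3 is the value used on both boxes in this thread.
    for value in (1, 2, 3, 0x203):
        names = " | ".join(decode_tfo_flags(value)) or "(none)"
        print(f"{value:#x}: {names}")
```

On a live system the current value could be read from /proc/sys/net/ipv4/tcp_fastopen and passed through int() before decoding.
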
> >>>
> >>> On Fri, Jan 4, 2013 at 12:42 PM, Maciej Soltysiak <maciej@soltysiak.com>
> >>> wrote:
> >>> > I am seeing something strange here, with polipo related to TFO but also
> >>> > DNS.
> >>> > When I just took 3.7.1-1 and set my windows 7 laptop to use gw.home.lan:8123
> >>> > as http proxy it didn't work. What I observed was:
> >>> > a) after quite a while polipo's response to browser was 504 Host
> >>> > www.osnews.com lookup failed: Timeout
> >>> > b) this error in ssh console: Host osnews.com lookup failed: Timeout (131072)
> >>> > c) Disabling TFO by adding option useTCPFastOpen 'false' to config 'polipo'
> >>> > 'general' works around the problem
> >>> > d) Alternatively, you can keep TFO enabled in polipo but change option
> >>> > 'dnsUseGethostbyname' from 'reluctantly' to 'true' (!)
> >>> > This is very weird, because TFO is TCP and the DNS queries fired off by
> >>> > polipo are UDP:
> >>> > root@OpenWrt:/tmp/log# tcpdump -n -v -vv -vvv -x -X -s 1500 -i lo
> >>> > 20:21:56.160245 IP (tos 0x0, ttl 64, id 50129, offset 0, flags [DF], proto UDP (17), length 60)
> >>> >     127.0.0.1.47304 > 127.0.0.1.53: [bad udp cksum 0xfe3b -> 0xd17f!] 55396+ A? www.osnews.com. (32)
> >>> >         0x0000: 4500 003c c3d1 4000 4011 78dd 7f00 0001  E..<..@.@.x.....
> >>> >         0x0010: 7f00 0001 b8c8 0035 0028 fe3b d864 0100  .......5.(.;.d..
> >>> >         0x0020: 0001 0000 0000 0000 0377 7777 066f 736e  .........www.osn
> >>> >         0x0030: 6577 7303 636f 6d00 0001 0001            ews.com.....
> >>> > 20:21:56.160319 IP (tos 0x0, ttl 64, id 50130, offset 0, flags [DF], proto UDP (17), length 60)
> >>> >     127.0.0.1.47304 > 127.0.0.1.53: [bad udp cksum 0xfe3b -> 0xd164!] 55396+ AAAA? www.osnews.com. (32)
> >>> >         0x0000: 4500 003c c3d2 4000 4011 78dc 7f00 0001  E..<..@.@.x.....
> >>> >         0x0010: 7f00 0001 b8c8 0035 0028 fe3b d864 0100  .......5.(.;.d..
> >>> >         0x0020: 0001 0000 0000 0000 0377 7777 066f 736e  .........www.osn
> >>> >         0x0030: 6577 7303 636f 6d00 001c 0001            ews.com.....
> >>> > 20:21:56.169942 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 123)
> >>> >     127.0.0.1.53 > 127.0.0.1.47304: [bad udp cksum 0xfe7a -> 0x5f73!] 55396 q: A? www.osnews.com. 1/2/0 www.osnews.com. [29m3s] A 74.86.31.159 ns: osnews.com. [29m3s] NS ns2.swelter.net., osnews.com. [29m3s] NS ns1.swelter.net. (95)
> >>> >         0x0000: 4500 007b 0000 4000 4011 3c70 7f00 0001  E..{..@.@.<p....
> >>> >         0x0010: 7f00 0001 0035 b8c8 0067 fe7a d864 8180  .....5...g.z.d..
> >>> >         0x0020: 0001 0001 0002 0000 0377 7777 066f 736e  .........www.osn
> >>> >         0x0030: 6577 7303 636f 6d00 0001 0001 c00c 0001  ews.com.........
> >>> >         0x0040: 0001 0000 06cf 0004 4a56 1f9f c010 0002  ........JV......
> >>> >         0x0050: 0001 0000 06cf 0011 036e 7332 0773 7765  .........ns2.swe
> >>> >         0x0060: 6c74 6572 036e 6574 00c0 1000 0200 0100  lter.net........
> >>> >         0x0070: 0006 cf00 0603 6e73 31c0 40              ......ns1.@
> >>> > 20:21:56.173901 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 135)
> >>> >     127.0.0.1.53 > 127.0.0.1.47304: [bad udp cksum 0xfe86 -> 0x8ecb!] 55396 q: AAAA? www.osnews.com. 1/2/0 www.osnews.com. [54m44s] AAAA 2607:f0d0:1002:62::3 ns: osnews.com. [29m3s] NS ns1.swelter.net., osnews.com. [29m3s] NS ns2.swelter.net. (107)
> >>> >         0x0000: 4500 0087 0000 4000 4011 3c64 7f00 0001  E.....@.@.<d....
> >>> >         0x0010: 7f00 0001 0035 b8c8 0073 fe86 d864 8180  .....5...s...d..
> >>> >         0x0020: 0001 0001 0002 0000 0377 7777 066f 736e  .........www.osn
> >>> >         0x0030: 6577 7303 636f 6d00 001c 0001 c00c 001c  ews.com.........
> >>> >         0x0040: 0001 0000 0cd4 0010 2607 f0d0 1002 0062  ........&......b
> >>> >         0x0050: 0000 0000 0000 0003 c010 0002 0001 0000  ................
> >>> >         0x0060: 06cf 0011 036e 7331 0773 7765 6c74 6572  .....ns1.swelter
> >>> >         0x0070: 036e 6574 00c0 1000 0200 0100 0006 cf00  .net............
> >>> >         0x0080: 0603 6e73 32c0 4c                        ..ns2.L
> >>> > This is the only DNS traffic I saw during the attempts. The tcpdumps show
> >>> > bad UDP checksums, but when I disabled TFO in polipo the UDP checksums were
> >>> > still bad and yet the lookups worked.
> >>> > Really weird.
> >>> > p.s. UPNP still works for port forwarding negotiation as it did in 3.6.11-4
> >>> > I still couldn't get the UPNP/SSDP broadcasts (udp to 239.255.255.250) to
> >>> > be forwarded between se00 and sw00/sw10. Last time it worked was ~3.3.8.
> >>> > I'm starting not to question why it doesn't work, I'm starting to wonder why
> >>> > it did work then ;-)
> >>> > Regards,
> >>> > Maciej
> >>> > On Fri, Jan 4, 2013 at 6:33 PM, Dave Taht wrote:
> >>> >>
> >>> >> On Fri, Jan 4, 2013 at 9:27 AM, Eric Dumazet wrote:
> >>> >> > Sorry, could you give us a copy of the panic stack trace ?
> >>> >>
> >>> >> I will get a serial console up on a wndr3800 by sunday. (sorry, just
> >>> >> landed in california, am in disarray)
> >>> >>
> >>> >> The latest dev build of cero for the wndr3800 and wndr3700v2 is at:
> >>> >>
> >>> >> http://snapon.lab.bufferbloat.net/~cero2/cerowrt/wndr/3.7.1-1/
> >>> >>
> >>> >> --
> >>> >> Dave Täht
> >>> >>
> >>> >> Fixing bufferbloat with cerowrt:
> >>> >> http://www.teklibre.com/cerowrt/subscribe.html
> >>> >> _______________________________________________
> >>> >> Cerowrt-devel mailing list
> >>> >> Cerowrt-devel@lists.bufferbloat.net
> >>> >> https://lists.bufferbloat.net/listinfo/cerowrt-devel
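
[Editorial note] The "bad udp cksum 0xfe3b -> 0xd17f!" line for the first DNS query in the capture quoted above can be checked by hand from the hex dump. The Python sketch below recomputes the UDP checksum over the IPv4 pseudo-header plus the UDP segment, and separately sums the pseudo-header alone. The function names are hypothetical and the packet bytes are copied from the dump. One possible reading, which nothing in this thread confirms, is that the stack left only the pseudo-header sum in the checksum field (as happens when checksumming is deferred to an offload path), which would explain why the checksums look bad on lo with or without TFO.

```python
import struct

# First DNS query from the capture quoted above (60 bytes: IPv4 + UDP + DNS).
QUERY_A = bytes.fromhex(
    "4500 003c c3d1 4000 4011 78dd 7f00 0001"
    "7f00 0001 b8c8 0035 0028 fe3b d864 0100"
    "0001 0000 0000 0000 0377 7777 066f 736e"
    "6577 7303 636f 6d00 0001 0001"
)


def ones_complement_sum(data):
    """Sum big-endian 16-bit words, folding carries back in (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(ip_packet):
    """Build the IPv4 pseudo-header: src, dst, zero, protocol 17, UDP length."""
    ihl = (ip_packet[0] & 0x0F) * 4
    udp_len = len(ip_packet) - ihl
    return ip_packet[12:20] + struct.pack("!BBH", 0, 17, udp_len)


def udp_checksum(ip_packet):
    """Compute the checksum a receiver would expect for this IPv4/UDP packet."""
    ihl = (ip_packet[0] & 0x0F) * 4
    udp = bytearray(ip_packet[ihl:])
    udp[6:8] = b"\x00\x00"  # the checksum field counts as zero while summing
    csum = ~ones_complement_sum(pseudo_header(ip_packet) + bytes(udp)) & 0xFFFF
    return csum or 0xFFFF  # a computed zero is sent as 0xffff for UDP


if __name__ == "__main__":
    on_wire = struct.unpack("!H", QUERY_A[26:28])[0]
    print(f"checksum field in packet: {on_wire:#06x}")
    print(f"pseudo-header sum only:   {ones_complement_sum(pseudo_header(QUERY_A)):#06x}")
    print(f"expected UDP checksum:    {udp_checksum(QUERY_A):#06x}")
```

For this packet the expected checksum comes out as 0xd17f, matching tcpdump, and the pseudo-header sum is 0xfe3b, the same value found in the packet's checksum field.
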