On Mon, Jan 28, 2013 at 5:43 AM, Robert Bradley wrote:

> It looks more like data corruption of various forms as opposed to a fault
> in checksumming:
>
> - Truncation of some layer-4 data including headers to 75 octets
> - Some bad TCP packets have stored header lengths of 0 octets
> - I often see lines of incrementing bytes (30 31 32 etc.). For example,
>   packet 962 has a train of values from 0x10 to 0x2f, starting at position
>   0x003a (the TCP timestamps). I think these are meant to be fragments from
>   the ping packets (which contain 8 octets then values 0x10 to 0x37), but
>   these are straying into non-ICMP packets.
> - There are pieces of HTTP in non-HTTP protocols. For example, packet
>   1394 is supposed to be UDP, but looks like it is really TCP traffic with
>   the wrong protocol number. The checksum is still invalid in either case.
> - It is possible to corrupt layer-4 checksums only, leaving the IP layer
>   untouched.
>
>
> On 28 January 2013 07:52, Dave Taht wrote:
>
>> Put up a pic http://snapon.lab.bufferbloat.net/~d/yurt
>>
>> they aren't bad all the time, but when they go bad, bad things happen.
>>
>>
>> On Sun, Jan 27, 2013 at 11:41 PM, Dave Taht wrote:
>>
>>>
>>> I have been debugging some weirdness for a while. You might want to do
>>> some captures on the latest cero and look at checksums.
>>>
>>> An unreasonably high number of checksum issues seem to be happening, but
>>> there doesn't appear to be a whole lot of pattern to it, as yet.
>>>
>>> I will simplify. I pinged locally and 8.8.8.8 and surfed the web, and a
>>> symptom is that some other routers can't ping sometimes nor access much of
>>> the internet beyond the gateway. They can always reach the gateway.
>>>
>>> in the interim, the topology on this capture is
>>>
>>> 172.30.102.17 - laptop via ethernet to
>>> 172.20.102.1 - cerowrt 3.7.4-4 via ethernet to
>>> 172.20.6.1 - ubnt 3.3.8-26 via mesh to
>>> 172.20.142.11 - ubnt 3.7.4-4 via ethernet to
>>> * 192.168.100.1 - cerowrt 3.7.2 capture point (yes, updating that)
>>> 10.0.10.1 - comcast box (yes, double nat, fixing that)
>>>
>>> I took a capture on the se00 interface
>>>
>>> tcpdump -i se00 -w/tmp/yurt.cap host 172.20.102.17
>>>
>>> and stuck that capture there:
>>>
>>> http://snapon.lab.bufferbloat.net/~d/yurt/yurt.cap
>>>
>>> and then looked at it with wireshark with this filter
>>>
>>> ip.checksum_bad == 1
>>>
>>> and scratched my head at the error rate (about 1%) and the pattern (lack
>>> thereof)
>>>
>>> I will simplify in the morning
>>>
>>> --
>>> Dave Täht
>>>
>>> Fixing bufferbloat with cerowrt:
>>> http://www.teklibre.com/cerowrt/subscribe.html
>>
>>
>> --
>> Dave Täht
>>
>> Fixing bufferbloat with cerowrt:
>> http://www.teklibre.com/cerowrt/subscribe.html
>>
>
>
> --
> Robert Bradley
>

Well, it could just be tcpdump_mini blowing up. (That doesn't explain the
problems on the network, though.)

Running tcpdump locally from the testing laptop, I get no bad checksums
anywhere on the path, forward or reverse.

--
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html
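
For anyone who wants to re-check flagged packets by hand rather than trust one capture tool, here is a minimal sketch of the IPv4 header checksum test (the ones'-complement sum from RFC 1071), which is presumably what the ip.checksum_bad filter is evaluating. The function names and the sample header are illustrative assumptions; the header is built in the code, not taken from yurt.cap.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Return the RFC 1071 checksum: complemented ones'-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero octet
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_ok(packet: bytes) -> bool:
    """True if the IPv4 header at the start of `packet` has a valid checksum."""
    if len(packet) < 20:
        return False
    header_len = (packet[0] & 0x0F) * 4  # IHL field is in 32-bit words
    if header_len < 20 or len(packet) < header_len:
        return False
    # Summing a header that includes a correct checksum field yields zero.
    return internet_checksum(packet[:header_len]) == 0


def build_sample_header() -> bytes:
    """Hypothetical 20-byte ICMP header, 172.20.102.17 -> 8.8.8.8 (illustration only)."""
    fields = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 84,      # version/IHL, TOS, total length
        0, 0,             # identification, flags/fragment offset
        64, 1, 0,         # TTL, protocol (1 = ICMP), checksum placeholder
        bytes([172, 20, 102, 17]),
        bytes([8, 8, 8, 8]),
    )
    checksum = internet_checksum(fields)
    return fields[:10] + struct.pack("!H", checksum) + fields[12:]


if __name__ == "__main__":
    good = build_sample_header()
    bad = good[:8] + bytes([good[8] - 1]) + good[9:]  # corrupt the TTL octet
    print(ipv4_header_ok(good), ipv4_header_ok(bad))
```

This covers only the IP header. The TCP and UDP checksums also include a pseudo-header and the payload, so a check like this would not catch the layer-4-only corruption Robert describes.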