[Bug 568616] Re: random silent corruption of TCP data

Bogdan Butnaru bogdanb+launchpad at gmail.com
Wed Apr 28 22:20:25 UTC 2010


apport information

** Tags added: apport-collected

** Description changed:

  Hello! I’m having a very strange problem.
  
  I’m the proud reporter of bug #554749, and I think I found something
  that might explain it. The short of that bug is that I’m using SSHFS to
  mount some shares from my server on my desktop; randomly (a few times
  each day) something goes wrong, and every program using that mount-point
  freezes. (I have to do a complex evil ritual to re-mount it without
  rebooting the computer.) While trying to debug it I discovered some
  occasional “Corrupted MAC on input” errors. I googled a bit for it,
  without much success; anyway, a post somewhere suggested I check for
  network corruption with netcat.
  
  So, I cat’ed together two movie files, obtaining a 1.4 GB file filled
  with mostly random data. And I started shuttling it between the two
  computers, using netcat (via the default TCP). I did a dozen transfers,
  and exactly one of them was corrupted (the second, actually).
  Interestingly, the corruption was exactly 128 bytes long; the replaced
  data doesn’t have any obvious relationship to what was there originally.
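A minimal sketch of how two copies of a transferred file could be compared to locate the damaged region. This is not the tool used in this report; the function name `diff_runs` and the chunk size are assumptions made for illustration.

```python
def diff_runs(path_a, path_b, chunk_size=1 << 20):
    """Return a list of (offset, length) runs where two files differ.

    Only the common length of the two files is compared; a size
    mismatch has to be checked separately by the caller.
    """
    runs = []
    run_start = None  # absolute offset where the current differing run began
    offset = 0        # absolute offset of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break
            if a[:n] == b[:n]:
                # Fast path: identical chunk closes any run still open.
                if run_start is not None:
                    runs.append((run_start, offset - run_start))
                    run_start = None
            else:
                # Slow path: walk the chunk byte by byte.
                for i in range(n):
                    if a[i] != b[i]:
                        if run_start is None:
                            run_start = offset + i
                    elif run_start is not None:
                        runs.append((run_start, offset + i - run_start))
                        run_start = None
            offset += n
            if len(a) != len(b):
                break  # one file ended early; stop at the common length
    if run_start is not None:
        runs.append((run_start, offset - run_start))
    return runs
```

If the replacement data is effectively random, a few bytes inside the damaged region may happen to match the original, so one corrupted block can show up as several short runs that sit close together.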
  
  According to ifconfig,
  
  bogdanb@mabelode:~/tests$ ifconfig eth0 | grep errors
            RX packets:9487952 errors:0 dropped:0 overruns:0 frame:0
            TX packets:6132714 errors:0 dropped:0 overruns:0 carrier:2
  bogdanb@tanelorn:~/tests$ ifconfig eth0 | grep errors
            RX packets:149100044 errors:0 dropped:0 overruns:0 frame:0
            TX packets:135620981 errors:0 dropped:0 overruns:0 carrier:0
  
  there haven’t been any transmission errors, so this being just something
  that randomly passed undetected through the TCP checksum is _really_
  unlikely. There’s also the suspicious length of the error.
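For reference, a sketch of the 16-bit ones' complement checksum that TCP uses, following the algorithm described in RFC 1071. It covers only a payload buffer; the real TCP checksum also includes the TCP header and a pseudo-header, which are left out here. The function name is an assumption made for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over a byte string (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Treat each pair of bytes as one big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example words taken from RFC 1071.
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))  # 0x220d
```

Since the checksum is only 16 bits wide, a block replaced with unrelated data would slip through roughly once in 65536 tries, assuming the replacement is uniformly random. That is a rough estimate, not a measurement from these machines.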
  
  My guess is a small bug in one of the routines that shuttle data between
  the NIC’s buffer and the application’s. I’ve no idea how to debug this
  further; please help!
  
  
  A few more notes:
  *) all this happens via Ethernet; the two computers are both linked to a switch with short cables. Anyway, given the above, it doesn’t look like line errors.
  *) the server runs Karmic, the desktop runs Lucid.
  *) I’ve had similar (but not identical) problems with SSHFS ever since I’ve had these two computers (around Feisty, I think); it’s likely that whatever is causing the corruption has been there since the beginning, but the way SSHFS handles occurrences of the bug changed.
  *) whatever it is, it’s very random. As the test showed, I got a single error after 2 GB, then no other error for the next 15 GB of transferred files. However, the SSHFS error (which I’m pretty sure is caused by this) sometimes happens after 15 minutes, while at other times I have no problems for a full day.
  *) I tried reporting this with ubuntu-bug, but Launchpad timed out on me several times in a row. Please tell me whatever information you think I should add.
+ 
+ 
+ 
+ --- 
+ AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
+ Architecture: amd64
+ AudioDevicesInUse:
+  Cannot stat file /proc/19634/fd/3: Transport endpoint is not connected
+                       USER        PID ACCESS COMMAND
+  /dev/snd/controlC1:  bogdanb    1604 F.... pulseaudio
+  /dev/snd/controlC0:  bogdanb    1604 F.... pulseaudio
+  /dev/snd/pcmC0D0p:   bogdanb    1604 F...m pulseaudio
+ CRDA: Error: [Errno 2] No such file or directory
+ Card0.Amixer.info:
+  Card hw:0 'Intel'/'HDA Intel at 0xf9ff8000 irq 22'
+    Mixer name	: 'Realtek ALC1200'
+    Components	: 'HDA:10ec0888,104382fe,00100101'
+    Controls      : 40
+    Simple ctrls  : 22
+ Card1.Amixer.info:
+  Card hw:1 'Headset'/'Logitech Logitech Wireless Headset at usb-0000:00:1d.0-2, full speed'
+    Mixer name	: 'USB Mixer'
+    Components	: 'USB046d:0a12'
+    Controls      : 4
+    Simple ctrls  : 2
+ DistroRelease: Ubuntu 10.04
+ EcryptfsInUse: Yes
+ Frequency: Once a day.
+ HibernationDevice: RESUME=/dev/sdb2
+ IwConfig:
+  lo        no wireless extensions.
+  
+  eth0      no wireless extensions.
+ MachineType: System manufacturer P5Q-PRO
+ NonfreeKernelModules: nvidia
+ Package: linux (not installed)
+ ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-21-generic root=/dev/sda1 ro nomodeset
+ ProcEnviron:
+  LANGUAGE=en_US:en
+  PATH=(custom, user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcVersionSignature: Ubuntu 2.6.32-21.32-generic 2.6.32.11+drm33.2
+ Regression: Yes
+ RelatedPackageVersions: linux-firmware 1.34
+ Reproducible: No
+ RfKill:
+  
+ Tags: lucid networking regression-potential needs-upstream-testing
+ Uname: Linux 2.6.32-21-generic x86_64
+ UserAsoundrc:
+  # ALSA library configuration file
+  
+  # Include settings that are under the control of asoundconf(1).
+  # (To disable these settings, comment out this line.)
+  </home/bogdanb/.asoundrc.asoundconf>
+ UserGroups: adm admin audio cdrom dialout floppy fuse lpadmin netdev plugdev sambashare scanner staff video
+ WpaSupplicantLog:
+  
+ dmi.bios.date: 11/04/2008
+ dmi.bios.vendor: American Megatrends Inc.
+ dmi.bios.version: 1501
+ dmi.board.asset.tag: To Be Filled By O.E.M.
+ dmi.board.name: P5Q-PRO
+ dmi.board.vendor: ASUSTeK Computer INC.
+ dmi.board.version: Rev 1.xx
+ dmi.chassis.asset.tag: Asset-1234567890
+ dmi.chassis.type: 3
+ dmi.chassis.vendor: Chassis Manufacture
+ dmi.chassis.version: Chassis Version
+ dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1501:bd11/04/2008:svnSystemmanufacturer:pnP5Q-PRO:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP5Q-PRO:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
+ dmi.product.name: P5Q-PRO
+ dmi.product.version: System Version
+ dmi.sys.vendor: System manufacturer

** Attachment added: "AlsaDevices.txt"
   http://launchpadlibrarian.net/46071511/AlsaDevices.txt

-- 
random silent corruption of TCP data
https://bugs.launchpad.net/bugs/568616
You received this bug notification because you are a member of Kernel
Bugs, which is subscribed to linux in ubuntu.