[Bug 589034] Re: nbd-proxy hangs the nbd-connection to server

Stéphane Graber stgraber at stgraber.org
Wed Jul 6 09:30:24 UTC 2011


Marking as Fix Released, since nbd-proxy has been disabled for a while now, both upstream and in more recent Ubuntu releases.
nbd-proxy has since been rewritten, which should fix most of these issues, so we might turn it back on in a later release.

** Changed in: ltsp
       Status: Confirmed => Fix Released

** Changed in: ltsp (Ubuntu)
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ltsp in Ubuntu.
https://bugs.launchpad.net/bugs/589034

Title:
  nbd-proxy hangs the nbd-connection to server

Status in Linux Terminal Server Project:
  Fix Released
Status in “ltsp” package in Ubuntu:
  Fix Released

Bug description:
  I am running an ltsp server on Ubuntu 10.04 (Lucid Lynx), with a
  Primergy TX120 S2 as the ltsp server and an HP Probook 4310s as a
  terminal connecting to it.  The server installation uses the amd64
  architecture, but the terminal image uses i386.  The problem can
  also be reproduced with a kvm virtual machine acting as the server,
  with a similar installation, and has been observed with another type
  of terminal machine as well (XPC shuttle X27D).

  The server contains ltsp-server and ltsp-server-standalone packages,
  in version 5.2.1-0ubuntu9.  The terminal image contains matching
  versions (5.2.1-0ubuntu9) of ltsp-client and ltsp-client-core
  packages.  Kernel version on the server side is 2.6.32-22-server, and
  on the terminal side it is 2.6.32-22-generic.

  I am using dnsmasq as the dhcp-server, and the following settings in
  /var/lib/tftpboot/ltsp/i386/lts.conf:

  [default]
          LDM_DIRECTX = True
          LDM_LANGUAGE = "fi_FI.UTF-8"
          LOCAL_APPS = True
          LOCALDEV = True
          LTSP_FATCLIENT = False
          NBD_SWAP = True
          REMOTE_APPS = True
          SSH_FOLLOW_SYMLINKS = False
          SSH_OVERRIDE_PORT = 222

  On the server side, Linux reports the following about the network
  interface connected to the terminal (dmesg snippets):

  [    1.862987] 0000:30:00.0: eth1: (PCI Express:2.5GB/s:Width x1) 00:15:17:cf:5e:de
  [    1.862989] 0000:30:00.0: eth1: Intel(R) PRO/1000 Network Connection
  [    1.863069] 0000:30:00.0: eth1: MAC: 1, PHY: 4, PBA No: d50858-004
  [   20.324320] ADDRCONF(NETDEV_UP): eth1: link is not ready
  [   22.891005] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
  [   22.892038] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

  On the terminal side, Linux reports the following about the network
  interface connected to the server (dmesg snippets):

  [    1.451527] sky2 eth0: addr 00:26:55:c4:06:95
  [    4.535708] sky2 eth0: enabling interface
  [    4.535949] ADDRCONF(NETDEV_UP): eth0: link is not ready
  [    7.029456] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
  [    7.029693] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

  With this configuration (a direct link), the nbd connection to the
  server works quite well, without any significant problems (it does
  hang, but only rarely).  However, when a switch (ZyXEL Desktop
  Ethernet Switch 10/100Mbps) is put between these computers, the
  network interface state changes on the server:

  [18989.100157] e1000e: eth1 NIC Link is Down
  [18994.101017] e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
  [18994.101023] 0000:30:00.0: eth1: 10/100 speed: disabling TSO

  And on the terminal side:

  [  248.484539] sky2 eth0: Link is down.
  [  254.785883] sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both
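
The negotiated link speed can be pulled out of dmesg output like the
snippets above with a short script.  This is only a sketch: the name
`link_speeds` and the regular expression are my own, written to match
the two message styles shown here (e1000e and sky2).  Other drivers may
word their link messages differently.

```python
import re

# Matches the two link-up message styles quoted in this report:
#   "e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, ..."
#   "sky2 eth0: Link is up at 100 Mbps, full duplex, ..."
# Link-down messages do not match and are ignored.
LINK_UP = re.compile(
    r"(?P<iface>eth\d+):?\s+(?:NIC\s+)?Link is up(?: at)?\s+(?P<mbps>\d+)\s*Mbps",
    re.IGNORECASE,
)


def link_speeds(dmesg_text):
    """Return a list of (interface, Mbps) tuples, one per link-up message."""
    return [
        (m.group("iface"), int(m.group("mbps")))
        for m in LINK_UP.finditer(dmesg_text)
    ]


# Lines copied from the dmesg snippets in this report.
SAMPLE = """\
[18989.100157] e1000e: eth1 NIC Link is Down
[18994.101017] e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
[18994.101023] 0000:30:00.0: eth1: 10/100 speed: disabling TSO
[  248.484539] sky2 eth0: Link is down.
[  254.785883] sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both
"""

if __name__ == "__main__":
    for iface, mbps in link_speeds(SAMPLE):
        print(iface, mbps)
```

Feeding it the output of `dmesg` on each machine shows at a glance
whether a given boot ran over the 1000 Mbps or the 100 Mbps link.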

  On this slower network connection between the server and the terminal,
  the nbd connection frequently hangs.  Loading the kernel and initial
  ramdisk is always reliable, but the nbd connection may stop
  transferring data at some point.  That point appears to vary randomly
  from boot to boot, but it is often before the login screen comes up.
  Note that the nbd connection does remain open: at least on the server
  side, a socket connection to the terminal remains established, but
  nothing is transferred between the machines.
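
A stall of this kind (connection still established, no data moving) can
be spotted by sampling a byte counter and looking for a stretch with no
progress.  The following is a rough sketch, not something that was used
for this report.  The names `read_tx_bytes` and `find_stall` are
hypothetical, and the interface-wide counter under /sys/class/net counts
all traffic on the interface, so it is only meaningful when the
terminal is the sole client on that link.

```python
def read_tx_bytes(iface):
    """Read the transmitted-bytes counter for a Linux network interface.

    Assumption: this is the whole interface's counter, not the counter
    of the nbd connection alone.
    """
    with open(f"/sys/class/net/{iface}/statistics/tx_bytes") as f:
        return int(f.read())


def find_stall(samples, min_stall_seconds):
    """Find the first period in which the byte counter did not change.

    samples: list of (timestamp_seconds, byte_counter), in time order.
    Returns (stall_start, confirmed_at), where confirmed_at is the first
    sample time at which the counter had been unchanged for at least
    min_stall_seconds, or None if no such period exists.
    """
    stall_start = None
    for (t_prev, b_prev), (t_cur, b_cur) in zip(samples, samples[1:]):
        if b_cur == b_prev:
            if stall_start is None:
                stall_start = t_prev
            if t_cur - stall_start >= min_stall_seconds:
                return (stall_start, t_cur)
        else:
            # Data moved again, so any earlier quiet period is over.
            stall_start = None
    return None
```

In use, one would call `read_tx_bytes("eth1")` on the server once per
second during a terminal boot, collect the (time, counter) pairs, and
pass them to `find_stall` with a threshold long enough to rule out
ordinary idle periods.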

  With the switch in place, the success rate of reaching the ldm login
  screen is about 30-40% per boot.  Without the switch in between,
  using a direct gigabit link, the success rate is somewhere between
  90% and 100%.
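
Since these success rates come from a limited number of boots, it may
help to put an uncertainty range on them.  The sketch below computes a
Wilson score interval for a success proportion.  The function name and
the example counts in the comment are hypothetical; they are not the
boot counts behind the percentages above.

```python
import math


def wilson_interval(successes, attempts, z=1.96):
    """Wilson score interval for a success rate.

    z=1.96 corresponds to roughly 95% confidence.
    Returns (lower, upper) as fractions between 0 and 1.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= successes <= attempts:
        raise ValueError("successes must be between 0 and attempts")
    p = successes / attempts
    denom = 1 + z * z / attempts
    centre = (p + z * z / (2 * attempts)) / denom
    half = z * math.sqrt(
        p * (1 - p) / attempts + z * z / (4 * attempts ** 2)
    ) / denom
    return (centre - half, centre + half)


# Hypothetical example: 7 boots out of 20 reached the login screen.
if __name__ == "__main__":
    low, high = wilson_interval(7, 20)
    print(f"success rate 35%, interval {low:.0%} to {high:.0%}")
```

With only a couple of dozen boots per configuration the interval is
wide, so a larger number of test boots gives a more reliable
comparison between the switched and the direct link.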

  This problem seems to be caused by nbd-proxy, because it goes away
  when nbd-proxy is disabled in the initial ramdisk downloaded by the
  terminal.  With nbd-client connecting directly to the server, the
  success rate of reaching the ldm login screen appears to be pretty
  close to 100% per boot.

  I suspect there is a correlation between the terminal CPU speed and
  the network speed that affects this issue.  Perhaps if a terminal
  machine is comparatively slow and the network is fast, this problem
  occurs very rarely?

  This problem can be worked around by disabling nbd-proxy.  This can be
  done by applying the attached patch to the terminal tree (under
  /opt/ltsp/i386 for the i386 architecture), and then rebuilding the
  terminal image with "sudo ltsp-update-image --arch i386".

To manage notifications about this bug go to:
https://bugs.launchpad.net/ltsp/+bug/589034/+subscriptions



