SRU: [PATCH] KVM: add schedule check to napi_enable call

Stefan Bader stefan.bader at canonical.com
Wed Feb 9 09:45:43 UTC 2011


On 02/08/2011 12:50 AM, Ken Stailey wrote:
> --- On Mon, 2/7/11, Stefan Bader <stefan.bader at canonical.com> wrote:
> 
>> From: Stefan Bader <stefan.bader at canonical.com>
>> Subject: Re: SRU: [PATCH] KVM: add schedule check to napi_enable call
>> To: kernel-team at lists.ubuntu.com
>> Date: Monday, February 7, 2011, 7:50 AM
>> On 02/05/2011 09:20 PM, Ken Stailey wrote:
>>> SRU Justification:
>>>
>>> Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.
>>>
>>> Testcase: I left a current Lucid VM running two concurrent "scp -r"
>>> of > 200 GB from NFS read-only source to a physical remote host
>>> overnight.  VM quickly started emitting "page allocation errors" in
>>> the system log.  Next morning when I checked the VM I could still
>>> ping it but could not establish an SSH connection.
>>>
>>> Fix: This patch from Bruce Rogers at Novell
>>>
>>> * [PATCH] KVM: add schedule check to napi_enable call
>>>     - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/579276
>>>
>>>
>>>
>> The patch itself looks reasonable. But this has not made
>> its way upstream. The mail thread seems to be reasonably old, so the
>> question would be why it is still missing.
> 
> I have reason to believe that the absence of this patch in upstream kernels is a critical oversight.
> 
> I used "apt-add-repository ppa:kernel-ppa/ppa" to put the "Natty" kernel on my Lucid test VM
> 
> $ uname -a
> Linux dubnium 2.6.38-2-server #29~lucid1-Ubuntu SMP Mon Feb 7 15:09:10 UTC 2011 x86_64 GNU/Linux
> 
> The stress test crashed the VM's network driver after copying only 63 GB.  
> 
> The test consists of running "scp -r /nfs_read_only/1 remote:/dir/1" concurrently with "scp -r /nfs_read_only/2 remote:/dir/2"
> 
> The NFS mount options on the client are:
> ro,tcp,hard,intr,sloppy,addr=10.1.1.1
> 
>> We need patches upstream before they can be SRUed.
>> Have you tried contacting Bruce or Olaf to ask what happened there?
>>
>> -Stefan
> 
> Who is Olaf?
> 
> Thanks,
> Ken
> 
Hi Ken,

sorry for the late reply. I was referring to Olaf Kirch (the other name in the
Signed-off-by lines). But Bruce has already responded. As you expected, this
was an oversight, and it is good that we are getting it fixed upstream and in
the related stable trees. Thanks again for helping to drive this home.

-Stefan



