Ubuntu Wily / VMWare graphics boot regression

Thomas Hellstrom thellstrom at vmware.com
Thu Feb 25 08:46:33 UTC 2016


Hi!

There is already a fix for this problem upstream. For some reason it
wasn't cc'd to stable.
The commit id is

12617971c443c50750a12a77ea0e08319d161975

and it applies from 3.15 to 4.2 provided the ttm fix is applied.

I'll send a request to stable.

/Thomas


On 02/25/2016 07:13 AM, Thomas Hellstrom wrote:
> Hi!
>
> Ugh. I'll try to reproduce and see if I can provide a fix. 4.3 saw a
> major linux modesetting rewrite, so it is possible that we fixed
> more than one bug and that they previously canceled each other out.
>
> /Thomas
>
>
>
> On 02/25/2016 12:27 AM, Sinclair Yeh wrote:
>> Hi,
>>
>> I was able to reproduce this last night after updating 15.10, and I
>> didn't know what the cause was until your mail.
>>
>> Let me try a 4.2 kernel with lockdep check enabled and see if I can
>> spot anything.
>>
>> Thomas is in a different time zone, so he may also pick this up in
>> his morning.
>>
>> Sinclair
>>
>> On Wed, Feb 24, 2016 at 02:42:10PM -0800, Kamal Mostafa wrote:
>>> Hi Thomas, Sinclair, and my team-
>>>
>>> Here's a weird one.  It appears that this Linux commit which was
>>> recently applied to Ubuntu Wily 15.10 (via 4.2-stable):
>>>
>>>   [mainline] 025af18 drm/ttm: Fixed a read/write lock imbalance
>>>
>>> is the trigger for this rather major Ubuntu/VMWare graphics boot
>>> regression:
>>>
>>>   https://bugs.launchpad.net/ubuntu/wily/+source/linux/+bug/1548587
>>>   Ubuntu 15.10 VMWare guest won't show UI after upgrading to 4.2.0-30
>>>
>>> (In Comment #33 I produced a test kernel with that commit reverted,
>>> which was confirmed as fixing the regression).
>>>
>>>
>>> But the thing is...
>>>
>>> 025af18 (attached) just looks so *obviously* valid, in that the thing
>>> it fixes looks like it was obviously wrong.  I was reluctant to even
>>> try reverting it, and was surprised when multiple testers confirmed
>>> that it fixed the problem.
>>>
>>> Furthermore, backports of 025af18 have been deployed in many other
>>> stable kernels (and of course, mainline) but the reported boot problem
>>> ** only seems to occur with v4.2-based kernels **.  The problem does
>>> occur with 4.2-stable (including the 025af18 backport), but does _not_
>>> occur with a 4.4 kernel (which always contained 025af18).  That commit
>>> has been shipping in pre-4.2 Ubuntu Trusty and Vivid for a cycle or two
>>> with no reports of problems there either.
>>>
>>> So despite the indication that 025af18 is the troublemaker for 4.2-
>>> stable based kernels, I'm not very happy with the idea of just
>>> reverting it from 4.2-stable or from Wily without a better
>>> understanding of why.
>>>
>>> Any thoughts on this topic will be much appreciated.
>>>
>>>  -Kamal
>>> From 025af189fb44250206dd8a32fa4a682392af3301 Mon Sep 17 00:00:00 2001
>>> From: Thomas Hellstrom <thellstrom at vmware.com>
>>> Date: Fri, 20 Nov 2015 11:43:50 -0800
>>> Subject: drm/ttm: Fixed a read/write lock imbalance
>>>
>>> In ttm_write_lock(), the uninterruptible path should call
>>> __ttm_write_lock() not __ttm_read_lock().  This fixes a vmwgfx hang
>>> on F23 start up.
>>>
>>> syeh: Extracted this from one of Thomas' internal patches.
>>>
>>> Cc: <stable at vger.kernel.org>
>>> Signed-off-by: Thomas Hellstrom <thellstrom at vmware.com>
>>> Reviewed-by: Sinclair Yeh <syeh at vmware.com>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_lock.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_lock.c b/drivers/gpu/drm/ttm/ttm_lock.c
>>> index 6a95454..f154fb1 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_lock.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_lock.c
>>> @@ -180,7 +180,7 @@ int ttm_write_lock(struct ttm_lock *lock, bool interruptible)
>>>  			spin_unlock(&lock->lock);
>>>  		}
>>>  	} else
>>> -		wait_event(lock->queue, __ttm_read_lock(lock));
>>> +		wait_event(lock->queue, __ttm_write_lock(lock));
>>>  
>>>  	return ret;
>>>  }
>>> -- 
>>> 2.7.0
>>>
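
For anyone reading the quoted patch without the TTM source at hand, here is a
minimal sketch of why taking a read lock in the write-lock path matters. It is
a simplified, single-threaded user-space model and not the kernel's
struct ttm_lock: the names model_lock, model_try_read_lock,
model_try_write_lock and writer_excludes_readers are made up for illustration,
and the only assumption carried over is that a counter is positive for
readers, -1 for a writer and 0 when the lock is free.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model (hypothetical, not the kernel's struct ttm_lock).
 * rw > 0: number of readers; rw == -1: one writer; rw == 0: free. */
struct model_lock {
	int rw;
};

/* Readers may enter as long as no writer holds the lock. */
static bool model_try_read_lock(struct model_lock *lock)
{
	if (lock->rw >= 0) {
		++lock->rw;
		return true;
	}
	return false;
}

/* A writer may enter only when the lock is completely free. */
static bool model_try_write_lock(struct model_lock *lock)
{
	if (lock->rw == 0) {
		lock->rw = -1;
		return true;
	}
	return false;
}

/* Take the "write" lock the way the uninterruptible path is described
 * in the patch: before the fix (buggy == true) it used the read-lock
 * helper, after the fix (buggy == false) the write-lock helper.
 * Returns true if a later reader is correctly kept out. */
static bool writer_excludes_readers(bool buggy)
{
	struct model_lock lock = { .rw = 0 };
	bool taken = buggy ? model_try_read_lock(&lock)
			   : model_try_write_lock(&lock);

	assert(taken); /* the lock starts free, so either helper succeeds */
	(void)taken;   /* keep the variable used when NDEBUG is defined */
	return !model_try_read_lock(&lock);
}
```

In this model writer_excludes_readers(true) reports that a reader still gets
in while the supposed writer holds the lock, and writer_excludes_readers(false)
reports that the reader is kept out. It says nothing about why the fix exposes
a second problem on 4.2 specifically; that part is covered by the upstream
commit mentioned at the top of this mail.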




