[lucid PATCH] fix smp kvm guests on AMD

Stefan Bader stefan.bader at canonical.com
Tue Mar 8 14:22:34 UTC 2011


On 03/07/2011 08:57 PM, Serge E. Hallyn wrote:
> Quoting Stefan Bader (stefan.bader at canonical.com):
> 
> Thanks for the review, Stefan.  Great catches.
> 
>> I looked through the patches and have a few questions/remarks:
>>
>> Patch #2 seems to work around some changes missing from
>>
>> commit 759379dd68c2885d1fafa433083d4487e710a685
>> Author: Zachary Amsden <zamsden at redhat.com>
>> Date:   Thu Aug 19 22:07:25 2010 -1000
>>
>>     KVM: x86: Add helper functions for time computation
>>
>> that patch replaces
>>  	ktime_get_ts(&ts);
>>  	monotonic_to_bootbased(&ts);
>> 	[timespec_to_ns(&ts);]
>> by
>> 	kernel_ns = get_kernel_ns()
>>
>> which is all done within the section of disabled interrupts, so I probably would
>> move the assignment of kernel_ns up before local_irq_restore().
> 
> Sounds good.
> 
>> Also in patch#2, the upstream patch does this:
>>
>> /* With all the info we got, fill in the values */
>> vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
>>
>> This seems to be missing in the backport.
> 
> Yes, I missed that one, and clearly it's necessary.
> 
>> And finally in patch#3 last_guest_tsc is set right at the beginning but the
>> upstream patch had that later in the section that was assigning last_kernel_ns.
>>
>> @@ -1095,6 +1095,7 @@ static int kvm_write_guest_time(struct kvm_vcpu *v)
>>         vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
>>         vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
>>         vcpu->last_kernel_ns = kernel_ns;
>> +       vcpu->last_guest_tsc = tsc_timestamp;
>>         vcpu->hv_clock.flags = 0;
> 
> Feh, that's how the first version of my patch did it.  Messed up in
> the split.
> 
> Are you planning to just fix these on your end?
> 
I would appreciate if you could update the patchset and re-submit it to this thread.

>> One not directly related note: when checking whether this could have effects on
>> the lucid-ec2 branch, I noted that the xen code also has some scale_delta in its
>> code. It seems separated enough to be unaffected. But then maybe we actually
>> want to look closer there. Maybe there is the same problem in that code (I did
>> not look closer, yet).
> 
> I'd worry about doing that without reports from xen users.
>
True. There are some reports about wrong process times but IIRC those were seen
in ec2 and bare metal and maybe get fixed by that big upstream scheduler change.

-Stefan

> thanks,
> -serge





More information about the kernel-team mailing list