NACK: [xenial][PATCH] UBUNTU: SAUCE: nvme-pci: fix memory barrier at nvme_ext_write_doorbell

Stefan Bader stefan.bader at canonical.com
Thu Sep 13 15:39:23 UTC 2018


On 13.09.2018 16:02, Marcelo Henrique Cerri wrote:
> On Wed, Sep 12, 2018 at 10:26:29AM +0200, Stefan Bader wrote:
>> On 11.09.2018 21:50, Marcelo Henrique Cerri wrote:
>>> BugLink: https://bugs.launchpad.net/bugs/1788222
>>>
>> --- cut --
>>> That's a custom backport for Xenial of the commit 07e860202d18
>>> ("nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event"
>>> from git://git.infradead.org/nvme.git).
>>>
>>> Original description:
>>>
>> -- cut end --
>>> In many architectures loads may be reordered with older stores to
>>> different locations.  In the nvme driver the following two operations
>>> could be reordered:
>>>
>>>     - Write shadow doorbell (dbbuf_db) into memory.
>>>     - Read EventIdx (dbbuf_ei) from memory.
>>>
>>> This can result in a potential race condition between driver and VM host
>>> processing requests (if given virtual NVMe controller has a support for
>>> shadow doorbell).  If that occurs, then the NVMe controller may decide to
>>> wait for MMIO doorbell from guest operating system, and guest driver may
>>> decide not to issue MMIO doorbell on any of subsequent commands.
>>>
>>> This issue is purely timing-dependent one, so there is no easy way to
>>> reproduce it. Currently the easiest known approach is to run "Oracle IO
>>> Numbers" (orion) that is shipped with Oracle DB:
>>>
>>> orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
>>> 	concat -write 40 -duration 120 -matrix row -testname nvme_test
>>>
>>> Where nvme_test is a .lun file that contains a list of NVMe block
>>> devices to run test against. Limiting number of vCPUs assigned to given
>>> VM instance seems to increase chances for this bug to occur. On test
>>> environment with VM that got 4 NVMe drives and 1 vCPU assigned the
>>> virtual NVMe controller hang could be observed within 10-20 minutes.
>>> That correspond to about 400-500k IO operations processed (or about
>>> 100GB of IO read/writes).
>>>
>>> Orion tool was used as a validation and set to run in a loop for 36
>>> hours (equivalent of pushing 550M IO operations). No issues were
>>> observed. That suggest that the patch fixes the issue.
>>>
>>> Fixes: 4a4037f212ae ("UBUNTU: SAUCE: nvme: improve performance for virtual NVMe devices")
>> (backported from commit ??? git://git.infradead.org/nvme.git)
>> [mhcerry: <rough description of what had to be changed>]
> 
> I can add these lines describing the origin and changes.
> 
>>> Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri at canonical.com>
>>> ---
>>
>> I cannot find the commit you refer to via the git web interface. Also, even
>> reading through the comments, I am not sure this really is a backport of some
>> upstream patch (in which case I would expect the commit message to be the same
>> as the source of the backport.
> 
> What commit couldn't you find? That fix was crafted by me because
> xenial carries a version of the original patchset that is completely
> different from upstream.
> 
> I didn't keep the original title because it refers to the upstream
> function that doesn't exist in Xenial.

At the top you wrote:
>>> That's a custom backport for Xenial of the commit 07e860202d18
>>> ("nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event"
>>> from git://git.infradead.org/nvme.git).

So I can only assume that this patch here is somehow related to that and to a
certain degree adjusted. Like the rest of the commit message seems to come from
that source.

As a reviewer I have no history of things. I can looks at the related upstream
change and maybe make an educated guess whether this looks ok. But I just don't
know what to do with this as searching the infradead git did find nothing with
that sha1 or the commit message.

-Stefan

> 
>> But the way it is now, I have no clue where this comes from and what was done to
>> it before submission.
>>
>> -Stefan
>>
>>>  drivers/nvme/host/pci.c | 8 +++++++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>> index 5e5f065bf729..3752052ae20a 100644
>>> --- a/drivers/nvme/host/pci.c
>>> +++ b/drivers/nvme/host/pci.c
>>> @@ -375,7 +375,13 @@ static void nvme_ext_write_doorbell(u16 value, u32 __iomem* q_db,
>>>  	old_value = *db_addr;
>>>  	*db_addr = value;
>>>  
>>> -	rmb();
>>> +	/*
>>> +	 * Ensure that the doorbell is updated before reading the event
>>> +	 * index from memory.  The controller needs to provide similar
>>> +	 * ordering to ensure the envent index is updated before reading
>>> +	 * the doorbell.
>>> +	 */
>>> +	mb();
>>>  	if (!nvme_ext_need_event(*event_idx, value, old_value))
>>>  		goto no_doorbell;
>>>  
>>>
>>
>>
> 
> 
> --
> Regards,
> Marcelo
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20180913/14c34c0e/attachment.sig>


More information about the kernel-team mailing list