ACK/Cmnt: [focal:linux-azure][LP:#1894896][PATCH] hv_netvsc: Fix hibernation for mlx5 VF driver

Stefan Bader stefan.bader at canonical.com
Wed Oct 28 08:43:11 UTC 2020


On 27.10.20 18:18, Marcelo Henrique Cerri wrote:
> From: Dexuan Cui <decui at microsoft.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1894896
> 
> mlx5_suspend()/resume() keep the network interface, so during hibernation
> netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> netvsc_resume() should call netvsc_vf_changed() to switch the data path
> back to the VF after hibernation. Note: after we close and re-open the
> vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
> the data path is implicitly switched to the netvsc NIC. Similarly,
> netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
> can no longer be used after hibernation.
> 
> For mlx4, since the VF network interafce is explicitly destroyed and
> re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
> already explicitly switches the data path from and to the VF automatically
> via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
> this fix. Note: mlx4 can still work with the fix because in
> netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.
> 
> Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
> Signed-off-by: Dexuan Cui <decui at microsoft.com>
> Signed-off-by: Jakub Kicinski <kuba at kernel.org>
> (cherry picked from commit 19162fd4063a3211843b997a454b505edb81d5ce)
> Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri at canonical.com>
Acked-by: Stefan Bader <stefan.bader at canonical.com>
> ---

Just two notes not related to the patch itself. One just for whoever applies it
later. Maybe no longer an issue but I remember some issues with the cutoff hints
when there are multiple of them. I would maybe manually delete things from the
downloaded patch...
Second is about the primary tasks on launchpad. This always relates to the
current development series and is a moving target. Right now, there would be no
source for azure there, so that task should be invalid. The reason for not
leaving around any tasks in an open (new, confirmed, triaged, in progress) state
is that this prevents the bug report from considered done. So it still shows up
in bug searches and is counted as "active" bug. And there is already enough of
those.

-Stefan

> 
> Marcelo's comments:
> 
> I was able to reproduce the problem with the current version from
> focal-updates following the test instructions from the LP bug and I
> was also able to confirm the fix solves the problem with a test
> kernel:
> 
> https://kernel.ubuntu.com/~mhcerri/azure/focal-linux-azure_lp1894896.tgz
> 
> 5.8 already received the fix via an upstream stable update
> (LP:#1894896) and hibernation is not supported in 4.15 linux-azure.
> 
> Besides that the fix is a clean cherry-pick from upstream.
> 
> ---
>  drivers/net/hyperv/netvsc_drv.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 33d9d17ee0e0..be5ede2a6b94 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2551,8 +2551,8 @@ static int netvsc_remove(struct hv_device *dev)
>  static int netvsc_suspend(struct hv_device *dev)
>  {
>  	struct net_device_context *ndev_ctx;
> -	struct net_device *vf_netdev, *net;
>  	struct netvsc_device *nvdev;
> +	struct net_device *net;
>  	int ret;
>  
>  	net = hv_get_drvdata(dev);
> @@ -2568,10 +2568,6 @@ static int netvsc_suspend(struct hv_device *dev)
>  		goto out;
>  	}
>  
> -	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
> -	if (vf_netdev)
> -		netvsc_unregister_vf(vf_netdev);
> -
>  	/* Save the current config info */
>  	ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
>  
> @@ -2587,6 +2583,7 @@ static int netvsc_resume(struct hv_device *dev)
>  	struct net_device *net = hv_get_drvdata(dev);
>  	struct net_device_context *net_device_ctx;
>  	struct netvsc_device_info *device_info;
> +	struct net_device *vf_netdev;
>  	int ret;
>  
>  	rtnl_lock();
> @@ -2599,6 +2596,15 @@ static int netvsc_resume(struct hv_device *dev)
>  	netvsc_devinfo_put(device_info);
>  	net_device_ctx->saved_netvsc_dev_info = NULL;
>  
> +	/* A NIC driver (e.g. mlx5) may keep the VF network interface across
> +	 * hibernation, but here the data path is implicitly switched to the
> +	 * netvsc NIC since the vmbus channel is closed and re-opened, so
> +	 * netvsc_vf_changed() must be used to switch the data path to the VF.
> +	 */
> +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> +		ret = -EINVAL;
> +
>  	rtnl_unlock();
>  
>  	return ret;
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20201028/1ecc0c0d/attachment-0001.sig>


More information about the kernel-team mailing list