[Karmic] [PATCH]: mmc: prevent dangling block device from accessing stale queues

Andy Whitcroft apw at canonical.com
Thu Jun 11 10:46:16 UTC 2009


On Wed, Jun 10, 2009 at 05:30:50PM +0200, Stefan Bader wrote:
> Background: With the karmic kernel it became dangerous to use SD cards  
> (ext3/ext4) as, if they are left mounted on suspend, all data is gone on  
> resume. The problems seems to be hidden in the way mmcblock handles 
> suspend (by throwing away the device) likely mixed with the kernel being 
> quicker on its way down than before.
>
> Fix: The following fix will prevent data loss by making sure the block 
> queue cannot be re-used before the old device is really gone. It has been 
> sent upstream but it became a bit quiet around it. So I am not sure the 
> patch will be the final solution. But given the severe implications I 
> believe we should carry it until then.

I am inclined to agree that we should consider carrying this until it is
resolved upstream.  They do seem preocupied with the upcoming merge
right now.

Do we have an upstream bug filed too?  Sounds like we should so it gets
picked up as an upstream regression.


> From 3f8fa799dea815654381af2b12b0983e440c6c6e Mon Sep 17 00:00:00 2001
> From: Stefan Bader <stefan.bader at canonical.com>
> Date: Wed, 3 Jun 2009 18:17:31 +0000
> Subject: [PATCH] UBUNTU: [Upstream] mmc: prevent dangling block device from accessing stale queues
> 
> BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/383668
> 
> When the mmc subsytem removes the block device (e.g. for suspend), it will
> call mmc_cleanup_queue to release the block device request queue.
> However the gendisk struct still has a pointer to that queue which is not
> accounted for. If the block device is still open, the gendisk struct will
> not get freed and might still use the stale pointer.
> This gets even worse for the fact that (in this case) on resume, a new block
> device is created which gets the same request queue object from the cache.
> Now any stray access to that old block device end up on the new one.
> As the functions to get and put the blk queue are not exported, the fix will
> delay the actual call to blk_cleanup_queue until the last user of the mmc
> block device drops its reference. Until then the blk queue is present but
> will reject any I/O.
> 
> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
> ---
>  drivers/mmc/card/block.c |    6 ++++++
>  drivers/mmc/card/queue.c |    7 ++++++-
>  2 files changed, 12 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
> index 014b271..69d7cec 100644
> --- a/drivers/mmc/card/block.c
> +++ b/drivers/mmc/card/block.c
> @@ -88,6 +88,12 @@ static void mmc_blk_put(struct mmc_blk_data *md)
>  		int devidx = MINOR(disk_devt(md->disk)) >> MMC_SHIFT;
>  		__clear_bit(devidx, dev_use);
>  
> +		/*
> +		 * We are about to drop the last reference to the disk object.
> +		 * Nothing else should now be looking at the queue pointer, so
> +		 * now it won't hurt if we release it.
> +		 */
> +		blk_cleanup_queue(md->disk->queue);
>  		put_disk(md->disk);
>  		kfree(md);
>  	}
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index 4978562..163cc28 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -256,7 +256,12 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
>  		kfree(mq->bounce_buf);
>  	mq->bounce_buf = NULL;
>  
> -	blk_cleanup_queue(mq->queue);
> +	/*
> +	 * Calling blk_cleanup_queue() would be too soon here. As long as
> +	 * the gendisk has a reference to it and is not released we should
> +	 * keep the queue. It has been shutdown and will not accept any new
> +	 * requests, so that should be safe.
> +	 */
>  
>  	mq->card = NULL;
>  }

Patch looks pretty sane to me.  ACK.

-apw




More information about the kernel-team mailing list