[SRU][J][PATCH 1/1] btrfs: zoned: fix use-after-free due to race with dev replace

Vinicius Peixoto vinicius.peixoto at canonical.com
Fri Aug 23 22:34:48 UTC 2024


While loading a zone's info during creation of a block group, we can race
with a device replace operation and then trigger a use-after-free on the
device that was just replaced (source device of the replace operation).

This happens because in btrfs_load_zone_info() we extract a device from
the chunk map into a local variable and then use the device while not
under the protection of the device replace rwsem. If a device replace
operation is in progress when we extract the device, and that device is
the source of the replace operation, the replace can finish and free the
device before we are done using it, which results in a use-after-free.

Fix this by enlarging the critical section under the protection of the
device replace rwsem so that all uses of the device are done inside the
critical section.

CC: stable at vger.kernel.org # 6.1.x: 15c12fcc50a1: btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
CC: stable at vger.kernel.org # 6.1.x: 09a46725cc84: btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
CC: stable at vger.kernel.org # 6.1.x: 9e0e3e74dc69: btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
CC: stable at vger.kernel.org # 6.1.x: 87463f7e0250: btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
CC: stable at vger.kernel.org # 6.1.x
Reviewed-by: Johannes Thumshirn <johannes.thumshirn at wdc.com>
Signed-off-by: Filipe Manana <fdmanana at suse.com>
Reviewed-by: David Sterba <dsterba at suse.com>
Signed-off-by: David Sterba <dsterba at suse.com>
(backported from commit 0090d6e1b210551e63cf43958dc7a1ec942cdde9)
[vpeixoto: upstream commit 09a46725cc84 ("btrfs: zoned: factor out
per-zone logic from btrfs_load_block_group_zone_info") is not present in
this tree. That commit moves the per-zone logic out of
btrfs_load_block_group_zone_info into a separate function,
btrfs_load_zone_info. The fix only enlarges the critical section held
under the device replace rwsem in order to avoid a UAF, and the logic is
essentially the same before and after the missing refactor, so I adapted
the fix manually to the old context.]
CVE-2024-39496
Signed-off-by: Vinicius Peixoto <vinicius.peixoto at canonical.com>
---
 fs/btrfs/zoned.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 8c858f31bdbc0..1c07fbd01bced 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1156,10 +1156,12 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 		struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
 		int dev_replace_is_ongoing = 0;
 
+		down_read(&dev_replace->rwsem);
 		device = map->stripes[i].dev;
 		physical = map->stripes[i].physical;
 
 		if (device->bdev == NULL) {
+			up_read(&dev_replace->rwsem);
 			alloc_offsets[i] = WP_MISSING_DEV;
 			continue;
 		}
@@ -1171,6 +1173,7 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 			num_conventional++;
 
 		if (!is_sequential) {
+			up_read(&dev_replace->rwsem);
 			alloc_offsets[i] = WP_CONVENTIONAL;
 			continue;
 		}
@@ -1181,11 +1184,9 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 		 */
 		btrfs_dev_clear_zone_empty(device, physical);
 
-		down_read(&dev_replace->rwsem);
 		dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
 		if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL)
 			btrfs_dev_clear_zone_empty(dev_replace->tgtdev, physical);
-		up_read(&dev_replace->rwsem);
 
 		/*
 		 * The group is mapped to a sequential zone. Get the zone write
@@ -1196,6 +1197,7 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 		ret = btrfs_get_dev_zone(device, physical, &zone);
 		memalloc_nofs_restore(nofs_flag);
 		if (ret == -EIO || ret == -EOPNOTSUPP) {
+			up_read(&dev_replace->rwsem);
 			ret = 0;
 			alloc_offsets[i] = WP_MISSING_DEV;
 			continue;
@@ -1208,6 +1210,7 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 	"zoned: unexpected conventional zone %llu on device %s (devid %llu)",
 				zone.start << SECTOR_SHIFT,
 				rcu_str_deref(device->name), device->devid);
+			up_read(&dev_replace->rwsem);
 			ret = -EIO;
 			goto out;
 		}
@@ -1233,6 +1236,8 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 					((zone.wp - zone.start) << SECTOR_SHIFT);
 			break;
 		}
+
+		up_read(&dev_replace->rwsem);
 	}
 
 	if (num_sequential > 0)
-- 
2.43.0



