Cmnt: [SRU][F][PATCH v2 0/3] CVE-2023-21400

Koichiro Den koichiro.den at canonical.com
Wed Dec 11 03:20:47 UTC 2024


On Tue, Dec 10, 2024 at 04:08:47PM +0800, Chengen Du wrote:
> On Tue, Dec 10, 2024 at 3:33 PM Koichiro Den <koichiro.den at canonical.com> wrote:
> >
> > On Tue, Dec 10, 2024 at 02:20:51PM +0800, Chengen Du wrote:
> > > On Tue, Dec 10, 2024 at 1:46 PM Koichiro Den <koichiro.den at canonical.com> wrote:
> > > >
> > > > On Tue, Nov 26, 2024 at 12:14:25PM +0800, Chengen Du wrote:
> > > > > CVE-2023-21400
> > > > >
> > > > > BugLink: https://bugs.launchpad.net/bugs/2078659
> > > > >
> > > > > SRU Justification:
> > > > >
> > > > > [Impact]
> > > > > io_commit_cqring() writes the CQ ring tail to make it visible and also triggers any deferred work.
> > > > > When a ring is set up with IOPOLL, it doesn't require locking around the CQ ring updates.
> > > > > However, if there is deferred work that needs processing, io_queue_deferred() assumes that the completion_lock is held.
> > > > > The io_uring subsystem does not properly handle locking for rings with IOPOLL, leading to a double-free vulnerability, which can be exploited as CVE-2023-21400.
> > > > >
> > > > > [Fix]
> > > > > There is a commit that fixed this issue.
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fb348857e7b67eefe365052f1423427b66dedbf3
> > > > >
> > > > > There is no direct upstream commit for this issue, and the patch needs to be reworked to apply to version 5.4.
> > > > >
> > > > > [Test Plan]
> > > > > This is a timing issue that can be triggered by using the liburing library to implement a test program.
> > > > > First, set the io_uring_params flag to IORING_SETUP_IOPOLL and open an XFS file with the O_RDWR | O_DIRECT flags, as the XFS filesystem implements the iopoll function hook.
> > > > > After setting up io_uring, create two threads in the process: one thread will wait for completion queue events, and the other will continuously send readv and writev requests in sequence.
> > > > > The writev requests should include the IOSQE_IO_DRAIN flag to ensure that previous submission queue events are completed first.
> > > > >
> > > > > The issue arises when writev requests add entries into the defer_list, but the io_iopoll_complete function consumes entries from defer_list without holding the appropriate lock.
> > > > >
> > > > > [Where problems could occur]
> > > > > The problematic call path can be triggered under specific usage scenarios and only affects io_uring functionality.
> > > > > If the patch contains any issues, it may lead to a deadlock.
> > > > >
> > > > > Jens Axboe (1):
> > > > >   io_uring: ensure IOPOLL locks around deferred work
> > > > >
> > > > > Pavel Begunkov (2):
> > > > >   io_uring: remove extra check in __io_commit_cqring
> > > > >   io_uring: dont kill fasync under completion_lock
> > > >
> > > > [1/3] and [2/3] seem to be not only unrelated to CVE-2023-21400 but also
> > > > unhelpful in reducing context conflicts for [3/3]. If there is still a
> > > > compelling reason to include those two for CVE-2023-21400, could you let me
> > > > know?
> > >
> > > Here is my initial version of the fix for this CVE:
> > > https://lists.ubuntu.com/archives/kernel-team/2024-November/155144.html
> > >
> > > To address the issue within the current code structure, I have wrapped
> > > io_commit_cqring in the io_iopoll_complete function with
> > > completion_lock.
> > > However, this approach has surfaced an existing issue related to
> > > invoking kill_fasync under completion_lock as highlighted in commit
> > > 4aa84f2ffa81 (io_uring: don't kill fasync under completion_lock).
> > >
> > > While this existing issue is not directly related to the CVE, I am
> > > concerned that the fix could increase the likelihood of encountering
> > > it, especially since io_iopoll_complete may be triggered more
> > > frequently.
> >
> >            CPU0                    CPU1
> >            ----                    ----
> >       lock(&new->fa_lock);
> >                                    local_irq_disable();
> >                                    lock(&ctx->completion_lock);
> >                                    lock(&new->fa_lock);
> >       <Interrupt>
> >         lock(&ctx->completion_lock);
> >
> >      *** DEADLOCK ***
> >
> > I think the existance of io_timeout_fn() can induce this potential lock
> > inversion situation. I'm not sure why io_iopoll_complete()'s additional
> > lock(&completion_lock) can excerbate it, when we wrap only
> > __io_commit_cqring_flush() inside the completion lock.
> >
> > Could you elaborate on a possible scenario? Or, is it more of a safety
> > catch in case some other backports in the future would increase the
> > possibility?
> 
> I apologize for the imprecise wording earlier.
> My concern is that adopting this logic in additional functions may
> increase the likelihood of triggering the issue.
> To wrap only __io_commit_cqring_flush() inside the completion_lock, we
> still need to include the two commits to ensure the calling order
> matches the Linux stable branch.

I got your point. Thank you for your explanation! (Without [1/3] and [2/3],
__io_commit_cqring() diverges from upstream stable trees, and seperating it
from the original io_commit_cqring() with introducing
__io_commit_cqring_flush() for the lock protection would be unsafe for
reasons other than the potential deadlock issue)

> In my view, it would be challenging to split io_commit_cqring() into
> two functions while maintaining the original logic, as
> __io_commit_cqring() performs multiple tasks and resides in the middle
> of the original io_commit_cqring() function.
> 
> My intention is to align the fix closely with the Linux stable branch,
> considering that this is for Focal, which is nearing the end of
> standard support.
> Please share your thoughts on this, as there might be a more efficient
> approach that I am not aware of.
> 
> >
> > >
> > > To avoid exacerbating the situation, I recommend addressing the
> > > kill_fasync logic separately before implementing the fix for the CVE,
> > > as well as backporting the prerequisite commit 0791015837f1 (io_uring:
> > > remove extra check in __io_commit_cqring).



More information about the kernel-team mailing list