Cmnt: [SRU][F][PATCH v2 0/3] CVE-2023-21400

Tue Dec 10 08:08:47 UTC 2024

On Tue, Dec 10, 2024 at 3:33 PM Koichiro Den <koichiro.den at canonical.com> wrote:
>
> On Tue, Dec 10, 2024 at 02:20:51PM +0800, Chengen Du wrote:
> > On Tue, Dec 10, 2024 at 1:46 PM Koichiro Den <koichiro.den at canonical.com> wrote:
> > >
> > > On Tue, Nov 26, 2024 at 12:14:25PM +0800, Chengen Du wrote:
> > > > CVE-2023-21400
> > > >
> > > > BugLink: https://bugs.launchpad.net/bugs/2078659
> > > >
> > > > SRU Justification:
> > > >
> > > > [Impact]
> > > > io_commit_cqring() writes the CQ ring tail to make it visible and also triggers any deferred work.
> > > > When a ring is set up with IOPOLL, it doesn't require locking around the CQ ring updates.
> > > > However, if there is deferred work that needs processing, io_queue_deferred() assumes that the completion_lock is held.
> > > > The io_uring subsystem does not properly handle locking for rings with IOPOLL, leading to a double-free vulnerability, which can be exploited as CVE-2023-21400.
> > > >
> > > > [Fix]
> > > > There is a commit that fixed this issue.
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fb348857e7b67eefe365052f1423427b66dedbf3
> > > >
> > > > There is no direct upstream commit for this issue, and the patch needs to be reworked to apply to version 5.4.
> > > >
> > > > [Test Plan]
> > > > This is a timing issue that can be triggered by using the liburing library to implement a test program.
> > > > First, set the io_uring_params flag to IORING_SETUP_IOPOLL and open an XFS file with the O_RDWR | O_DIRECT flags, as the XFS filesystem implements the iopoll function hook.
> > > > After setting up io_uring, create two threads in the process: one thread will wait for completion queue events, and the other will continuously send readv and writev requests in sequence.
> > > > The writev requests should include the IOSQE_IO_DRAIN flag to ensure that previous submission queue events are completed first.
> > > >
> > > > The issue arises when writev requests add entries into the defer_list, but the io_iopoll_complete function consumes entries from defer_list without holding the appropriate lock.
> > > >
> > > > [Where problems could occur]
> > > > The problematic call path can be triggered under specific usage scenarios and only affects io_uring functionality.
> > > > If the patch contains any issues, it may lead to a deadlock.
> > > >
> > > > Jens Axboe (1):
> > > >   io_uring: ensure IOPOLL locks around deferred work
> > > >
> > > > Pavel Begunkov (2):
> > > >   io_uring: remove extra check in __io_commit_cqring
> > > >   io_uring: dont kill fasync under completion_lock
> > >
> > > [1/3] and [2/3] seem to be not only unrelated to CVE-2023-21400 but also
> > > unhelpful in reducing context conflicts for [3/3]. If there is still a
> > > compelling reason to include those two for CVE-2023-21400, could you let me
> > > know?
> >
> > Here is my initial version of the fix for this CVE:
> > https://lists.ubuntu.com/archives/kernel-team/2024-November/155144.html
> >
> > To address the issue within the current code structure, I have wrapped
> > io_commit_cqring in the io_iopoll_complete function with
> > completion_lock.
> > However, this approach has surfaced an existing issue related to
> > invoking kill_fasync under completion_lock as highlighted in commit
> > 4aa84f2ffa81 (io_uring: don't kill fasync under completion_lock).
> >
> > While this existing issue is not directly related to the CVE, I am
> > concerned that the fix could increase the likelihood of encountering
> > it, especially since io_iopoll_complete may be triggered more
> > frequently.
>
>            CPU0                    CPU1
>            ----                    ----
>       lock(&new->fa_lock);
>                                    local_irq_disable();
>                                    lock(&ctx->completion_lock);
>                                    lock(&new->fa_lock);
>       <Interrupt>
>         lock(&ctx->completion_lock);
>
>      *** DEADLOCK ***
>
> I think the existance of io_timeout_fn() can induce this potential lock
> inversion situation. I'm not sure why io_iopoll_complete()'s additional
> lock(&completion_lock) can excerbate it, when we wrap only
> __io_commit_cqring_flush() inside the completion lock.
>
> Could you elaborate on a possible scenario? Or, is it more of a safety
> catch in case some other backports in the future would increase the
> possibility?

I apologize for the imprecise wording earlier.
My concern is that adopting this logic in additional functions may
increase the likelihood of triggering the issue.
To wrap only __io_commit_cqring_flush() inside the completion_lock, we
still need to include the two commits to ensure the calling order
matches the Linux stable branch.
In my view, it would be challenging to split io_commit_cqring() into
two functions while maintaining the original logic, as
__io_commit_cqring() performs multiple tasks and resides in the middle
of the original io_commit_cqring() function.

My intention is to align the fix closely with the Linux stable branch,
considering that this is for Focal, which is nearing the end of
standard support.
Please share your thoughts on this, as there might be a more efficient
approach that I am not aware of.

>
> >
> > To avoid exacerbating the situation, I recommend addressing the
> > kill_fasync logic separately before implementing the fix for the CVE,
> > as well as backporting the prerequisite commit 0791015837f1 (io_uring:
> > remove extra check in __io_commit_cqring).