Name |
Description |
CVE-2024-53052 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/rw: fix missing NOWAIT check for O_DIRECT start write When io_uring starts a write, it'll call kiocb_start_write() to bump the super block rwsem, preventing any freezes from happening while that write is in-flight. The freeze side will grab that rwsem for writing, excluding any new writers from happening and waiting for existing writes to finish. But io_uring unconditionally uses kiocb_start_write(), which will block if someone is currently attempting to freeze the mount point. This causes a deadlock where freeze is waiting for previous writes to complete, but the previous writes cannot complete, as the task that is supposed to complete them is blocked waiting on starting a new write. This results in the following stuck trace showing that dependency with the write blocked starting a new write: task:fio state:D stack:0 pid:886 tgid:886 ppid:876 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_rwsem_wait+0x1e8/0x3f8 __percpu_down_read+0xe8/0x500 io_write+0xbb8/0xff8 io_issue_sqe+0x10c/0x1020 io_submit_sqes+0x614/0x2110 __arm64_sys_io_uring_enter+0x524/0x1038 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 INFO: task fsfreeze:7364 blocked for more than 15 seconds. Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963 with the attempting freezer stuck trying to grab the rwsem: task:fsfreeze state:D stack:0 pid:7364 tgid:7364 ppid:995 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_down_write+0x2b0/0x680 freeze_super+0x248/0x8a8 do_vfs_ioctl+0x149c/0x1b18 __arm64_sys_ioctl+0xd0/0x1a0 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a blocking grab of the super block rwsem if it isn't set. For normal issue where IOCB_NOWAIT would always be set, this returns -EAGAIN which will have io_uring core issue a blocking attempt of the write. That will in turn also get completions run, ensuring forward progress. Since freezing requires CAP_SYS_ADMIN in the first place, this isn't something that can be triggered by a regular user.
|
CVE-2024-50079 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work When the sqpoll is exiting and cancels pending work items, it may need to run task_work. If this happens from within io_uring_cancel_generic(), then it may be under waiting for the io_uring_task waitqueue. This results in the below splat from the scheduler, as the ring mutex may be attempted grabbed while in a TASK_INTERRUPTIBLE state. Ensure that the task state is set appropriately for that, just like what is done for the other cases in io_run_task_work(). do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000029387fd2>] prepare_to_wait+0x88/0x2fc WARNING: CPU: 6 PID: 59939 at kernel/sched/core.c:8561 __might_sleep+0xf4/0x140 Modules linked in: CPU: 6 UID: 0 PID: 59939 Comm: iou-sqp-59938 Not tainted 6.12.0-rc3-00113-g8d020023b155 #7456 Hardware name: linux,dummy-virt (DT) pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : __might_sleep+0xf4/0x140 lr : __might_sleep+0xf4/0x140 sp : ffff80008c5e7830 x29: ffff80008c5e7830 x28: ffff0000d93088c0 x27: ffff60001c2d7230 x26: dfff800000000000 x25: ffff0000e16b9180 x24: ffff80008c5e7a50 x23: 1ffff000118bcf4a x22: ffff0000e16b9180 x21: ffff0000e16b9180 x20: 000000000000011b x19: ffff80008310fac0 x18: 1ffff000118bcd90 x17: 30303c5b20746120 x16: 74657320313d6574 x15: 0720072007200720 x14: 0720072007200720 x13: 0720072007200720 x12: ffff600036c64f0b x11: 1fffe00036c64f0a x10: ffff600036c64f0a x9 : dfff800000000000 x8 : 00009fffc939b0f6 x7 : ffff0001b6327853 x6 : 0000000000000001 x5 : ffff0001b6327850 x4 : ffff600036c64f0b x3 : ffff8000803c35bc x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000e16b9180 Call trace: __might_sleep+0xf4/0x140 mutex_lock+0x84/0x124 io_handle_tw_list+0xf4/0x260 tctx_task_work_run+0x94/0x340 io_run_task_work+0x1ec/0x3c0 io_uring_cancel_generic+0x364/0x524 io_sq_thread+0x820/0x124c ret_from_fork+0x10/0x20
|
CVE-2024-50060 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: check if we need to reschedule during overflow flush In terms of normal application usage, this list will always be empty. And if an application does overflow a bit, it'll have a few entries. However, nothing obviously prevents syzbot from running a test case that generates a ton of overflow entries, and then flushing them can take quite a while. Check for needing to reschedule while flushing, and drop our locks and do so if necessary. There's no state to maintain here as overflows always prune from head-of-list, hence it's fine to drop and reacquire the locks at the end of the loop.
|
CVE-2024-42254 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix error pbuf checking Syz reports a problem, which boils down to NULL vs IS_ERR inconsistent error handling in io_alloc_pbuf_ring(). KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:__io_remove_buffers+0xac/0x700 io_uring/kbuf.c:341 Call Trace: <TASK> io_put_bl io_uring/kbuf.c:378 [inline] io_destroy_buffers+0x14e/0x490 io_uring/kbuf.c:392 io_ring_ctx_free+0xa00/0x1070 io_uring/io_uring.c:2613 io_ring_exit_work+0x80f/0x8a0 io_uring/io_uring.c:2844 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312 worker_thread+0x86d/0xd40 kernel/workqueue.c:3390 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
|
CVE-2024-41080 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix possible deadlock in io_register_iowq_max_workers() The io_register_iowq_max_workers() function calls io_put_sq_data(), which acquires the sqd->lock without releasing the uring_lock. Similar to the commit 009ad9f0c6ee ("io_uring: drop ctx->uring_lock before acquiring sqd->lock"), this can lead to a potential deadlock situation. To resolve this issue, the uring_lock is released before calling io_put_sq_data(), and then it is re-acquired after the function call. This change ensures that the locks are acquired in the correct order, preventing the possibility of a deadlock.
|
CVE-2024-41001 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/sqpoll: work around a potential audit memory leak kmemleak complains that there's a memory leak related to connect handling: unreferenced object 0xffff0001093bdf00 (size 128): comm "iou-sqp-455", pid 457, jiffies 4294894164 hex dump (first 32 bytes): 02 00 fa ea 7f 00 00 01 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 2e481b1a): [<00000000c0a26af4>] kmemleak_alloc+0x30/0x38 [<000000009c30bb45>] kmalloc_trace+0x228/0x358 [<000000009da9d39f>] __audit_sockaddr+0xd0/0x138 [<0000000089a93e34>] move_addr_to_kernel+0x1a0/0x1f8 [<000000000b4e80e6>] io_connect_prep+0x1ec/0x2d4 [<00000000abfbcd99>] io_submit_sqes+0x588/0x1e48 [<00000000e7c25e07>] io_sq_thread+0x8a4/0x10e4 [<00000000d999b491>] ret_from_fork+0x10/0x20 which can can happen if: 1) The command type does something on the prep side that triggers an audit call. 2) The thread hasn't done any operations before this that triggered an audit call inside ->issue(), where we have audit_uring_entry() and audit_uring_exit(). Work around this by issuing a blanket NOP operation before the SQPOLL does anything.
|
CVE-2024-40922 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/rsrc: don't lock while !TASK_RUNNING There is a report of io_rsrc_ref_quiesce() locking a mutex while not TASK_RUNNING, which is due to forgetting restoring the state back after io_run_task_work_sig() and attempts to break out of the waiting loop. do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff815d2494>] prepare_to_wait+0xa4/0x380 kernel/sched/wait.c:237 WARNING: CPU: 2 PID: 397056 at kernel/sched/core.c:10099 __might_sleep+0x114/0x160 kernel/sched/core.c:10099 RIP: 0010:__might_sleep+0x114/0x160 kernel/sched/core.c:10099 Call Trace: <TASK> __mutex_lock_common kernel/locking/mutex.c:585 [inline] __mutex_lock+0xb4/0x940 kernel/locking/mutex.c:752 io_rsrc_ref_quiesce+0x590/0x940 io_uring/rsrc.c:253 io_sqe_buffers_unregister+0xa2/0x340 io_uring/rsrc.c:799 __io_uring_register io_uring/register.c:424 [inline] __do_sys_io_uring_register+0x5b9/0x2400 io_uring/register.c:613 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd8/0x270 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6f/0x77
|
CVE-2024-39508 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/io-wq: Use set_bit() and test_bit() at worker->flags Utilize set_bit() and test_bit() on worker->flags within io_uring/io-wq to address potential data races. The structure io_worker->flags may be accessed through various data paths, leading to concurrency issues. When KCSAN is enabled, it reveals data races occurring in io_worker_handle_work and io_wq_activate_free_worker functions. BUG: KCSAN: data-race in io_worker_handle_work / io_wq_activate_free_worker write to 0xffff8885c4246404 of 4 bytes by task 49071 on cpu 28: io_worker_handle_work (io_uring/io-wq.c:434 io_uring/io-wq.c:569) io_wq_worker (io_uring/io-wq.c:?) <snip> read to 0xffff8885c4246404 of 4 bytes by task 49024 on cpu 5: io_wq_activate_free_worker (io_uring/io-wq.c:? io_uring/io-wq.c:285) io_wq_enqueue (io_uring/io-wq.c:947) io_queue_iowq (io_uring/io_uring.c:524) io_req_task_submit (io_uring/io_uring.c:1511) io_handle_tw_list (io_uring/io_uring.c:1198) <snip> Line numbers against commit 18daea77cca6 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm"). These races involve writes and reads to the same memory location by different tasks running on different CPUs. To mitigate this, refactor the code to use atomic operations such as set_bit(), test_bit(), and clear_bit() instead of basic "and" and "or" operations. This ensures thread-safe manipulation of worker flags. Also, move `create_index` to avoid holes in the structure.
|
CVE-2024-39371 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: check for non-NULL file pointer in io_file_can_poll() In earlier kernels, it was possible to trigger a NULL pointer dereference off the forced async preparation path, if no file had been assigned. The trace leading to that looks as follows: BUG: kernel NULL pointer dereference, address: 00000000000000b0 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022 RIP: 0010:io_buffer_select+0xc3/0x210 Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246 RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700 RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020 R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8 R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000 FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x14d/0x420 ? do_user_addr_fault+0x61/0x6a0 ? exc_page_fault+0x6c/0x150 ? asm_exc_page_fault+0x22/0x30 ? io_buffer_select+0xc3/0x210 __io_import_iovec+0xb5/0x120 io_readv_prep_async+0x36/0x70 io_queue_sqe_fallback+0x20/0x260 io_submit_sqes+0x314/0x630 __do_sys_io_uring_enter+0x339/0xbc0 ? __do_sys_io_uring_register+0x11b/0xc50 ? vm_mmap_pgoff+0xce/0x160 do_syscall_64+0x5f/0x180 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0x55e0a110a67e Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6 because the request is marked forced ASYNC and has a bad file fd, and hence takes the forced async prep path. Current kernels with the request async prep cleaned up can no longer hit this issue, but for ease of backporting, let's add this safety check in here too as it really doesn't hurt. For both cases, this will inevitably end with a CQE posted with -EBADF.
|
CVE-2024-35880 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/kbuf: hold io_buffer_list reference over mmap If we look up the kbuf, ensure that it doesn't get unregistered until after we're done with it. Since we're inside mmap, we cannot safely use the io_uring lock. Rely on the fact that we can lookup the buffer list under RCU now and grab a reference to it, preventing it from being unregistered until we're done with it. The lookup returns the io_buffer_list directly with it referenced.
|
CVE-2024-35831 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: Fix release of pinned pages when __io_uaddr_map fails Looking at the error path of __io_uaddr_map, if we fail after pinning the pages for any reasons, ret will be set to -EINVAL and the error handler won't properly release the pinned pages. I didn't manage to trigger it without forcing a failure, but it can happen in real life when memory is heavily fragmented.
|
CVE-2024-35827 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/net: fix overflow check in io_recvmsg_mshot_prep() The "controllen" variable is type size_t (unsigned long). Casting it to int could lead to an integer underflow. The check_add_overflow() function considers the type of the destination which is type int. If we add two positive values and the result cannot fit in an integer then that's counted as an overflow. However, if we cast "controllen" to an int and it turns negative, then negative values *can* fit into an int type so there is no overflow. Good: 100 + (unsigned long)-4 = 96 <-- overflow Bad: 100 + (int)-4 = 96 <-- no overflow I deleted the cast of the sizeof() as well. That's not a bug but the cast is unnecessary.
|
CVE-2024-26764 |
In the Linux kernel, the following vulnerability has been resolved: fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the following kernel warning appears: WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8 Call trace: kiocb_set_cancel_fn+0x9c/0xa8 ffs_epfile_read_iter+0x144/0x1d0 io_read+0x19c/0x498 io_issue_sqe+0x118/0x27c io_submit_sqes+0x25c/0x5fc __arm64_sys_io_uring_enter+0x104/0xab0 invoke_syscall+0x58/0x11c el0_svc_common+0xb4/0xf4 do_el0_svc+0x2c/0xb0 el0_svc+0x2c/0xa4 el0t_64_sync_handler+0x68/0xb4 el0t_64_sync+0x1a4/0x1a8 Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is submitted by libaio.
|
CVE-2024-26676 |
In the Linux kernel, the following vulnerability has been resolved: af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC. syzbot reported a warning [0] in __unix_gc() with a repro, which creates a socketpair and sends one socket's fd to itself using the peer. socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\360", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, MSG_OOB|MSG_PROBE|MSG_DONTWAIT|MSG_ZEROCOPY) = 1 This forms a self-cyclic reference that GC should finally untangle but does not due to lack of MSG_OOB handling, resulting in memory leak. Recently, commit 11498715f266 ("af_unix: Remove io_uring code for GC.") removed io_uring's dead code in GC and revealed the problem. The code was executed at the final stage of GC and unconditionally moved all GC candidates from gc_candidates to gc_inflight_list. That papered over the reported problem by always making the following WARN_ON_ONCE(!list_empty(&gc_candidates)) false. The problem has been there since commit 2aab4b969002 ("af_unix: fix struct pid leaks in OOB support") added full scm support for MSG_OOB while fixing another bug. To fix this problem, we must call kfree_skb() for unix_sk(sk)->oob_skb if the socket still exists in gc_candidates after purging collected skb. Then, we need to set NULL to oob_skb before calling kfree_skb() because it calls last fput() and triggers unix_release_sock(), where we call duplicate kfree_skb(u->oob_skb) if not NULL. Note that the leaked socket remained being linked to a global list, so kmemleak also could not detect it. We need to check /proc/net/protocol to notice the unfreed socket. [0]: WARNING: CPU: 0 PID: 2863 at net/unix/garbage.c:345 __unix_gc+0xc74/0xe80 net/unix/garbage.c:345 Modules linked in: CPU: 0 PID: 2863 Comm: kworker/u4:11 Not tainted 6.8.0-rc1-syzkaller-00583-g1701940b1a02 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 Workqueue: events_unbound __unix_gc RIP: 0010:__unix_gc+0xc74/0xe80 net/unix/garbage.c:345 Code: 8b 5c 24 50 e9 86 f8 ff ff e8 f8 e4 22 f8 31 d2 48 c7 c6 30 6a 69 89 4c 89 ef e8 97 ef ff ff e9 80 f9 ff ff e8 dd e4 22 f8 90 <0f> 0b 90 e9 7b fd ff ff 48 89 df e8 5c e7 7c f8 e9 d3 f8 ff ff e8 RSP: 0018:ffffc9000b03fba0 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffc9000b03fc10 RCX: ffffffff816c493e RDX: ffff88802c02d940 RSI: ffffffff896982f3 RDI: ffffc9000b03fb30 RBP: ffffc9000b03fce0 R08: 0000000000000001 R09: fffff52001607f66 R10: 0000000000000003 R11: 0000000000000002 R12: dffffc0000000000 R13: ffffc9000b03fc10 R14: ffffc9000b03fc10 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005559c8677a60 CR3: 000000000d57a000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> process_one_work+0x889/0x15e0 kernel/workqueue.c:2633 process_scheduled_works kernel/workqueue.c:2706 [inline] worker_thread+0x8b9/0x12a0 kernel/workqueue.c:2787 kthread+0x2c6/0x3b0 kernel/kthread.c:388 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242 </TASK>
|
CVE-2024-22017 |
setuid() does not affect libuv's internal io_uring operations if initialized before the call to setuid(). This allows the process to perform privileged operations despite presumably having dropped such privileges through a call to setuid(). This vulnerability affects all users using version greater or equal than Node.js 18.18.0, Node.js 20.4.0 and Node.js 21.
|
CVE-2024-0582 |
A memory leak flaw was found in the Linux kernel’s io_uring functionality in how a user registers a buffer ring with IORING_REGISTER_PBUF_RING, mmap() it, and then frees it. This flaw allows a local user to crash or potentially escalate their privileges on the system.
|
CVE-2023-6560 |
An out-of-bounds memory access flaw was found in the io_uring SQ/CQ rings functionality in the Linux kernel. This issue could allow a local user to crash the system.
|
CVE-2023-52914 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/poll: add hash if ready poll request can't complete inline If we don't, then we may lose access to it completely, leading to a request leak. This will eventually stall the ring exit process as well.
|
CVE-2023-52903 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: lock overflowing for IOPOLL syzbot reports an issue with overflow filling for IOPOLL: WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0 Workqueue: events_unbound io_ring_exit_work Call trace: io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773 io_fill_cqe_req io_uring/io_uring.h:168 [inline] io_do_iopoll+0x474/0x62c io_uring/rw.c:1065 io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513 io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056 io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869 process_one_work+0x2d8/0x504 kernel/workqueue.c:2289 worker_thread+0x340/0x610 kernel/workqueue.c:2436 kthread+0x12c/0x158 kernel/kthread.c:376 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863 There is no real problem for normal IOPOLL as flush is also called with uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL, for which __io_cqring_overflow_flush() happens from the CQ waiting path.
|
CVE-2023-52895 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/poll: don't reissue in case of poll race on multishot request A previous commit fixed a poll race that can occur, but it's only applicable for multishot requests. For a multishot request, we can safely ignore a spurious wakeup, as we never leave the waitqueue to begin with. A blunt reissue of a multishot armed request can cause us to leak a buffer, if they are ring provided. While this seems like a bug in itself, it's not really defined behavior to reissue a multishot request directly. It's less efficient to do so as well, and not required to rearm anything like it is for singleshot poll requests.
|
CVE-2023-52656 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: drop any code related to SCM_RIGHTS This is dead code after we dropped support for passing io_uring fds over SCM_RIGHTS, get rid of it.
|
CVE-2023-52654 |
In the Linux kernel, the following vulnerability has been resolved: io_uring/af_unix: disable sending io_uring over sockets File reference cycles have caused lots of problems for io_uring in the past, and it still doesn't work exactly right and races with unix_stream_read_generic(). The safest fix would be to completely disallow sending io_uring files via sockets via SCM_RIGHT, so there are no possible cycles invloving registered files and thus rendering SCM accounting on the io_uring side unnecessary.
|
CVE-2023-46862 |
An issue was discovered in the Linux kernel through 6.5.9. During a race with SQ thread exit, an io_uring/fdinfo.c io_uring_show_fdinfo NULL pointer dereference can occur.
|
CVE-2023-3389 |
A use-after-free vulnerability in the Linux Kernel io_uring subsystem can be exploited to achieve local privilege escalation. Racing a io_uring cancel poll request with a linked timeout can cause a UAF in a hrtimer. We recommend upgrading past commit ef7dfac51d8ed961b742218f526bd589f3900a59 (4716c73b188566865bdd79c3a6709696a224ac04 for 5.10 stable and 0e388fce7aec40992eadee654193cad345d62663 for 5.15 stable).
|
CVE-2023-2598 |
A flaw was found in the fixed buffer registration code for io_uring (io_sqe_buffer_register in io_uring/rsrc.c) in the Linux kernel that allows out-of-bounds access to physical memory beyond the end of the buffer. This flaw enables full local privilege escalation.
|
CVE-2023-23586 |
Due to a vulnerability in the io_uring subsystem, it is possible to leak kernel memory information to the user process. timens_install calls current_is_single_threaded to determine if the current process is single-threaded, but this call does not consider io_uring's io_worker threads, thus it is possible to insert a time namespace's vvar page to process's memory space via a page fault. When this time namespace is destroyed, the vvar page is also freed, but not removed from the process' memory, and a next page allocated by the kernel will be still available from the user-space process and can leak memory contents via this (read-only) use-after-free vulnerability. We recommend upgrading past version 5.10.161 or commit 788d0824269bef539fe31a785b1517882eafed93 https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/io_uring
|
CVE-2023-2236 |
A use-after-free vulnerability in the Linux Kernel io_uring subsystem can be exploited to achieve local privilege escalation. Both io_install_fixed_file and its callers call fput in a file in case of an error, causing a reference underflow which leads to a use-after-free vulnerability. We recommend upgrading past commit 9d94c04c0db024922e886c9fd429659f22f48ea4.
|
CVE-2023-21400 |
In multiple functions of io_uring.c, there is a possible kernel memory corruption due to improper locking. This could lead to local escalation of privilege in the kernel with System execution privileges needed. User interaction is not needed for exploitation.
|
CVE-2023-1872 |
A use-after-free vulnerability in the Linux Kernel io_uring system can be exploited to achieve local privilege escalation. The io_file_get_fixed function lacks the presence of ctx->uring_lock which can lead to a Use-After-Free vulnerability due a race condition with fixed files getting unregistered. We recommend upgrading past commit da24142b1ef9fd5d36b76e36bab328a5b27523e8.
|
CVE-2023-1583 |
A NULL pointer dereference was found in io_file_bitmap_get in io_uring/filetable.c in the io_uring sub-component in the Linux Kernel. When fixed files are unregistered, some context information (file_alloc_{start,end} and alloc_hint) is not cleared. A subsequent request that has auto index selection enabled via IORING_FILE_INDEX_ALLOC can cause a NULL pointer dereference. An unprivileged user can use the flaw to cause a system crash.
|
CVE-2023-1295 |
A time-of-check to time-of-use issue exists in io_uring subsystem's IORING_OP_CLOSE operation in the Linux kernel's versions 5.6 - 5.11 (inclusive), which allows a local user to elevate their privileges to root. Introduced in b5dba59e0cf7e2cc4d3b3b1ac5fe81ddf21959eb, patched in 9eac1904d3364254d622bf2c771c4f85cd435fc2, backported to stable in 788d0824269bef539fe31a785b1517882eafed93.
|
CVE-2023-1032 |
The Linux kernel io_uring IORING_OP_SOCKET operation contained a double free in function __sys_socket_file() in file net/socket.c. This issue was introduced in da214a475f8bd1d3e9e7a19ddfeb4d1617551bab and fixed in 649c15c7691e9b13cbe9bf6c65c365350e056067.
|
CVE-2023-0469 |
A use-after-free flaw was found in io_uring/filetable.c in io_install_fixed_file in the io_uring subcomponent in the Linux Kernel during call cleanup. This flaw may lead to a denial of service.
|
CVE-2023-0468 |
A use-after-free flaw was found in io_uring/poll.c in io_poll_check_events in the io_uring subcomponent in the Linux Kernel due to a race condition of poll_refs. This flaw may cause a NULL pointer dereference.
|
CVE-2023-0240 |
There is a logic error in io_uring's implementation which can be used to trigger a use-after-free vulnerability leading to privilege escalation. In the io_prep_async_work function the assumption that the last io_grab_identity call cannot return false is not true, and in this case the function will use the init_cred or the previous linked requests identity to do operations instead of using the current identity. This can lead to reference counting issues causing use-after-free. We recommend upgrading past version 5.10.161.
|
CVE-2022-48983 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: Fix a null-ptr-deref in io_tctx_exit_cb() Syzkaller reports a NULL deref bug as follows: BUG: KASAN: null-ptr-deref in io_tctx_exit_cb+0x53/0xd3 Read of size 4 at addr 0000000000000138 by task file1/1955 CPU: 1 PID: 1955 Comm: file1 Not tainted 6.1.0-rc7-00103-gef4d3ea40565 #75 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xcd/0x134 ? io_tctx_exit_cb+0x53/0xd3 kasan_report+0xbb/0x1f0 ? io_tctx_exit_cb+0x53/0xd3 kasan_check_range+0x140/0x190 io_tctx_exit_cb+0x53/0xd3 task_work_run+0x164/0x250 ? task_work_cancel+0x30/0x30 get_signal+0x1c3/0x2440 ? lock_downgrade+0x6e0/0x6e0 ? lock_downgrade+0x6e0/0x6e0 ? exit_signals+0x8b0/0x8b0 ? do_raw_read_unlock+0x3b/0x70 ? do_raw_spin_unlock+0x50/0x230 arch_do_signal_or_restart+0x82/0x2470 ? kmem_cache_free+0x260/0x4b0 ? putname+0xfe/0x140 ? get_sigframe_size+0x10/0x10 ? do_execveat_common.isra.0+0x226/0x710 ? lockdep_hardirqs_on+0x79/0x100 ? putname+0xfe/0x140 ? do_execveat_common.isra.0+0x238/0x710 exit_to_user_mode_prepare+0x15f/0x250 syscall_exit_to_user_mode+0x19/0x50 do_syscall_64+0x42/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0023:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 002b:00000000fffb7790 EFLAGS: 00000200 ORIG_RAX: 000000000000000b RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Kernel panic - not syncing: panic_on_warn set ... This happens because the adding of task_work from io_ring_exit_work() isn't synchronized with canceling all work items from eg exec. The execution of the two are ordered in that they are both run by the task itself, but if io_tctx_exit_cb() is queued while we're canceling all work items off exec AND gets executed when the task exits to userspace rather than in the main loop in io_uring_cancel_generic(), then we can find current->io_uring == NULL and hit the above crash. It's safe to add this NULL check here, because the execution of the two paths are done by the task itself. [axboe: add code comment and also put an explanation in the commit msg]
|
CVE-2022-48937 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: add a schedule point in io_add_buffers() Looping ~65535 times doing kmalloc() calls can trigger soft lockups, especially with DEBUG features (like KASAN). [ 253.536212] watchdog: BUG: soft lockup - CPU#64 stuck for 26s! [b219417889:12575] [ 253.544433] Modules linked in: vfat fat i2c_mux_pca954x i2c_mux spidev cdc_acm xhci_pci xhci_hcd sha3_generic gq(O) [ 253.544451] CPU: 64 PID: 12575 Comm: b219417889 Tainted: G S O 5.17.0-smp-DEV #801 [ 253.544457] RIP: 0010:kernel_text_address (./include/asm-generic/sections.h:192 ./include/linux/kallsyms.h:29 kernel/extable.c:67 kernel/extable.c:98) [ 253.544464] Code: 0f 93 c0 48 c7 c1 e0 63 d7 a4 48 39 cb 0f 92 c1 20 c1 0f b6 c1 5b 5d c3 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 89 fb <48> c7 c0 00 00 80 a0 41 be 01 00 00 00 48 39 c7 72 0c 48 c7 c0 40 [ 253.544468] RSP: 0018:ffff8882d8baf4c0 EFLAGS: 00000246 [ 253.544471] RAX: 1ffff1105b175e00 RBX: ffffffffa13ef09a RCX: 00000000a13ef001 [ 253.544474] RDX: ffffffffa13ef09a RSI: ffff8882d8baf558 RDI: ffffffffa13ef09a [ 253.544476] RBP: ffff8882d8baf4d8 R08: ffff8882d8baf5e0 R09: 0000000000000004 [ 253.544479] R10: ffff8882d8baf5e8 R11: ffffffffa0d59a50 R12: ffff8882eab20380 [ 253.544481] R13: ffffffffa0d59a50 R14: dffffc0000000000 R15: 1ffff1105b175eb0 [ 253.544483] FS: 00000000016d3380(0000) GS:ffff88af48c00000(0000) knlGS:0000000000000000 [ 253.544486] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 253.544488] CR2: 00000000004af0f0 CR3: 00000002eabfa004 CR4: 00000000003706e0 [ 253.544491] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 253.544492] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 253.544494] Call Trace: [ 253.544496] <TASK> [ 253.544498] ? io_queue_sqe (fs/io_uring.c:7143) [ 253.544505] __kernel_text_address (kernel/extable.c:78) [ 253.544508] unwind_get_return_address (arch/x86/kernel/unwind_frame.c:19) [ 253.544514] arch_stack_walk (arch/x86/kernel/stacktrace.c:27) [ 253.544517] ? io_queue_sqe (fs/io_uring.c:7143) [ 253.544521] stack_trace_save (kernel/stacktrace.c:123) [ 253.544527] ____kasan_kmalloc (mm/kasan/common.c:39 mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:515) [ 253.544531] ? ____kasan_kmalloc (mm/kasan/common.c:39 mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:515) [ 253.544533] ? __kasan_kmalloc (mm/kasan/common.c:524) [ 253.544535] ? kmem_cache_alloc_trace (./include/linux/kasan.h:270 mm/slab.c:3567) [ 253.544541] ? io_issue_sqe (fs/io_uring.c:4556 fs/io_uring.c:4589 fs/io_uring.c:6828) [ 253.544544] ? __io_queue_sqe (fs/io_uring.c:?) [ 253.544551] __kasan_kmalloc (mm/kasan/common.c:524) [ 253.544553] kmem_cache_alloc_trace (./include/linux/kasan.h:270 mm/slab.c:3567) [ 253.544556] ? io_issue_sqe (fs/io_uring.c:4556 fs/io_uring.c:4589 fs/io_uring.c:6828) [ 253.544560] io_issue_sqe (fs/io_uring.c:4556 fs/io_uring.c:4589 fs/io_uring.c:6828) [ 253.544564] ? __kasan_slab_alloc (mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469) [ 253.544567] ? __kasan_slab_alloc (mm/kasan/common.c:39 mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469) [ 253.544569] ? kmem_cache_alloc_bulk (mm/slab.h:732 mm/slab.c:3546) [ 253.544573] ? __io_alloc_req_refill (fs/io_uring.c:2078) [ 253.544578] ? io_submit_sqes (fs/io_uring.c:7441) [ 253.544581] ? __se_sys_io_uring_enter (fs/io_uring.c:10154 fs/io_uring.c:10096) [ 253.544584] ? __x64_sys_io_uring_enter (fs/io_uring.c:10096) [ 253.544587] ? do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 253.544590] ? entry_SYSCALL_64_after_hwframe (??:?) [ 253.544596] __io_queue_sqe (fs/io_uring.c:?) [ 253.544600] io_queue_sqe (fs/io_uring.c:7143) [ 253.544603] io_submit_sqe (fs/io_uring.c:?) [ 253.544608] io_submit_sqes (fs/io_uring.c:?) [ 253.544612] __se_sys_io_uring_enter (fs/io_uring.c:10154 fs/io_uri ---truncated---
|
CVE-2022-4696 |
There exists a use-after-free vulnerability in the Linux kernel through io_uring and the IORING_OP_SPLICE operation. If IORING_OP_SPLICE is missing the IO_WQ_WORK_FILES flag, which signals that the operation won't use current->nsproxy, so its reference counter is not increased. This assumption is not always true as calling io_splice on specific files will call the get_uts function which will use current->nsproxy leading to invalidly decreasing its reference counter later causing the use-after-free vulnerability. We recommend upgrading to version 5.10.160 or above
|
CVE-2022-3910 |
Use After Free vulnerability in Linux Kernel allows Privilege Escalation. An improper Update of Reference Count in io_uring leads to Use-After-Free and Local Privilege Escalation. When io_msg_ring was invoked with a fixed file, it called io_fput_file() which improperly decreased its reference count (leading to Use-After-Free and Local Privilege Escalation). Fixed files are permanently registered to the ring, and should not be put separately. We recommend upgrading past commit https://github.com/torvalds/linux/commit/fc7222c3a9f56271fba02aabbfbae999042f1679 https://github.com/torvalds/linux/commit/fc7222c3a9f56271fba02aabbfbae999042f1679
|
CVE-2022-3176 |
There exists a use-after-free in io_uring in the Linux kernel. Signalfd_poll() and binder_poll() use a waitqueue whose lifetime is the current task. It will send a POLLFREE notification to all waiters before the queue is freed. Unfortunately, the io_uring poll doesn't handle POLLFREE. This allows a use-after-free to occur if a signalfd or binder fd is polled with io_uring poll, and the waitqueue gets freed. We recommend upgrading past commit fc78b2fc21f10c4c9c4d5d659a685710ffa63659
|
CVE-2022-3103 |
off-by-one in io_uring module.
|
CVE-2022-29582 |
In the Linux kernel before 5.17.3, fs/io_uring.c has a use-after-free due to a race condition in io_uring timeouts. This can be triggered by a local user who has no access to any user namespace; however, the race condition perhaps can only be exploited infrequently.
|
CVE-2022-2602 |
io_uring UAF, Unix SCM garbage collection
|
CVE-2022-2327 |
io_uring use work_flags to determine which identity need to grab from the calling process to make sure it is consistent with the calling process when executing IORING_OP. Some operations are missing some types, which can lead to incorrect reference counts which can then lead to a double free. We recommend upgrading the kernel past commit df3f3bb5059d20ef094d6b2f0256c4bf4127a859
|
CVE-2022-1786 |
A use-after-free flaw was found in the Linux kernel’s io_uring subsystem in the way a user sets up a ring with IORING_SETUP_IOPOLL with more than one task completing submissions on this ring. This flaw allows a local user to crash or escalate their privileges on the system.
|
CVE-2022-1508 |
An out-of-bounds read flaw was found in the Linux kernel’s io_uring module in the way a user triggers the io_read() function with some special parameters. This flaw allows a local user to read some memory out of bounds.
|
CVE-2022-1116 |
Integer Overflow or Wraparound vulnerability in io_uring of Linux Kernel allows local attacker to cause memory corruption and escalate privileges to root. This issue affects: Linux Kernel versions prior to 5.4.189; version 5.4.24 and later versions.
|
CVE-2022-1043 |
A flaw was found in the Linux kernel’s io_uring implementation. This flaw allows an attacker with a local account to corrupt system memory, crash the system or escalate privileges.
|
CVE-2021-47569 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fail cancellation for EXITING tasks WARNING: CPU: 1 PID: 20 at fs/io_uring.c:6269 io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269 CPU: 1 PID: 20 Comm: kworker/1:0 Not tainted 5.16.0-rc1-syzkaller #0 Workqueue: events io_fallback_req_func RIP: 0010:io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269 Call Trace: <TASK> io_req_task_link_timeout+0x6b/0x1e0 fs/io_uring.c:6886 io_fallback_req_func+0xf9/0x1ae fs/io_uring.c:1334 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> We need original task's context to do cancellations, so if it's dying and the callback is executed in a fallback mode, fail the cancellation attempt.
|
CVE-2021-47504 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: ensure task_work gets run as part of cancelations If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning. While in there, correct a comment that should be IFF, not IIF.
|
CVE-2021-47292 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix memleak in io_init_wq_offload() I got memory leak report when doing fuzz test: BUG: memory leak unreferenced object 0xffff888107310a80 (size 96): comm "syz-executor.6", pid 4610, jiffies 4295140240 (age 20.135s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... backtrace: [<000000001974933b>] kmalloc include/linux/slab.h:591 [inline] [<000000001974933b>] kzalloc include/linux/slab.h:721 [inline] [<000000001974933b>] io_init_wq_offload fs/io_uring.c:7920 [inline] [<000000001974933b>] io_uring_alloc_task_context+0x466/0x640 fs/io_uring.c:7955 [<0000000039d0800d>] __io_uring_add_tctx_node+0x256/0x360 fs/io_uring.c:9016 [<000000008482e78c>] io_uring_add_tctx_node fs/io_uring.c:9052 [inline] [<000000008482e78c>] __do_sys_io_uring_enter fs/io_uring.c:9354 [inline] [<000000008482e78c>] __se_sys_io_uring_enter fs/io_uring.c:9301 [inline] [<000000008482e78c>] __x64_sys_io_uring_enter+0xabc/0xc20 fs/io_uring.c:9301 [<00000000b875f18f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000b875f18f>] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 [<000000006b0a8484>] entry_SYSCALL_64_after_hwframe+0x44/0xae CPU0 CPU1 io_uring_enter io_uring_enter io_uring_add_tctx_node io_uring_add_tctx_node __io_uring_add_tctx_node __io_uring_add_tctx_node io_uring_alloc_task_context io_uring_alloc_task_context io_init_wq_offload io_init_wq_offload hash = kzalloc hash = kzalloc ctx->hash_map = hash ctx->hash_map = hash <- one of the hash is leaked When calling io_uring_enter() in parallel, the 'hash_map' will be leaked, add uring_lock to protect 'hash_map'.
|
CVE-2021-47124 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix link timeout refs WARNING: CPU: 0 PID: 10242 at lib/refcount.c:28 refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28 RIP: 0010:refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] io_put_req fs/io_uring.c:2140 [inline] io_queue_linked_timeout fs/io_uring.c:6300 [inline] __io_queue_sqe+0xbef/0xec0 fs/io_uring.c:6354 io_submit_sqe fs/io_uring.c:6534 [inline] io_submit_sqes+0x2bbd/0x7c50 fs/io_uring.c:6660 __do_sys_io_uring_enter fs/io_uring.c:9240 [inline] __se_sys_io_uring_enter+0x256/0x1d60 fs/io_uring.c:9182 io_link_timeout_fn() should put only one reference of the linked timeout request, however in case of racing with the master request's completion first io_req_complete() puts one and then io_put_req_deferred() is called.
|
CVE-2021-47123 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix ltout double free on completion race Always remove linked timeout on io_link_timeout_fn() from the master request link list, otherwise we may get use-after-free when first io_link_timeout_fn() puts linked timeout in the fail path, and then will be found and put on master's free.
|
CVE-2021-47040 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix overflows checks in provide buffers Colin reported before possible overflow and sign extension problems in io_provide_buffers_prep(). As Linus pointed out previous attempt did nothing useful, see d81269fecb8ce ("io_uring: fix provide_buffers sign extension"). Do that with help of check_<op>_overflow helpers. And fix struct io_provide_buf::len type, as it doesn't make much sense to keep it signed.
|
CVE-2021-46942 |
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix shared sqpoll cancellation hangs [ 736.982891] INFO: task iou-sqp-4294:4295 blocked for more than 122 seconds. [ 736.982897] Call Trace: [ 736.982901] schedule+0x68/0xe0 [ 736.982903] io_uring_cancel_sqpoll+0xdb/0x110 [ 736.982908] io_sqpoll_cancel_cb+0x24/0x30 [ 736.982911] io_run_task_work_head+0x28/0x50 [ 736.982913] io_sq_thread+0x4e3/0x720 We call io_uring_cancel_sqpoll() one by one for each ctx either in sq_thread() itself or via task works, and it's intended to cancel all requests of a specified context. However the function uses per-task counters to track the number of inflight requests, so it counts more requests than available via currect io_uring ctx and goes to sleep for them to appear (e.g. from IRQ), that will never happen. Cancel a bit more than before, i.e. all ctxs that share sqpoll and continue to use shared counters. Don't forget that we should not remove ctx from the list before running that task_work sqpoll-cancel, otherwise the function wouldn't be able to find the context and will hang.
|
CVE-2021-41073 |
loop_rw_iter in fs/io_uring.c in the Linux kernel 5.10 through 5.14.6 allows local users to gain privileges by using IORING_OP_PROVIDE_BUFFERS to trigger a free of a kernel buffer, as demonstrated by using /proc/<pid>/maps for exploitation.
|
CVE-2021-3491 |
The io_uring subsystem in the Linux kernel allowed the MAX_RW_COUNT limit to be bypassed in the PROVIDE_BUFFERS operation, which led to negative values being usedin mem_rw when reading /proc/<PID>/mem. This could be used to create a heap overflow leading to arbitrary code execution in the kernel. It was addressed via commit d1f82808877b ("io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers") (v5.13-rc1) and backported to the stable kernels in v5.12.4, v5.11.21, and v5.10.37. It was introduced in ddf0322db79c ("io_uring: add IORING_OP_PROVIDE_BUFFERS") (v5.7-rc1).
|
CVE-2021-20226 |
A use-after-free flaw was found in the io_uring in Linux kernel, where a local attacker with a user privilege could cause a denial of service problem on the system The issue results from the lack of validating the existence of an object prior to performing operations on the object by not incrementing the file reference counter while in use. The highest threat from this vulnerability is to data integrity, confidentiality and system availability.
|
CVE-2020-29534 |
An issue was discovered in the Linux kernel before 5.9.3. io_uring takes a non-refcounted reference to the files_struct of the process that submitted a request, causing execve() to incorrectly optimize unshare_fd(), aka CID-0f2122045b94.
|
CVE-2019-19241 |
In the Linux kernel before 5.4.2, the io_uring feature leads to requests that inadvertently have UID 0 and full capabilities, aka CID-181e448d8709. This is related to fs/io-wq.c, fs/io_uring.c, and net/socket.c. For example, an attacker can bypass intended restrictions on adding an IPv4 address to the loopback interface. This occurs because IORING_OP_SENDMSG operations, although requested in the context of an unprivileged user, are sometimes performed by a kernel worker thread without considering that context.
|