CVE-2021-46925

Severity CVSS v4.0: Pending analysis
Type: CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition')
Publication date: 27/02/2024
Last modified: 29/10/2024

Description

In the Linux kernel, the following vulnerability has been resolved:

net/smc: fix kernel panic caused by race of smc_sock

A crash occurs when smc_cdc_tx_handler() tries to access smc_sock but smc_release() has already freed it.

[ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
[ 4570.696048] #PF: supervisor write access in kernel mode
[ 4570.696728] #PF: error_code(0x0002) - not-present page
[ 4570.697401] PGD 0 P4D 0
[ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
[ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
[ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30

[ 4570.711446] Call Trace:
[ 4570.711746]  <IRQ>
[ 4570.711992]  smc_cdc_tx_handler+0x41/0xc0
[ 4570.712470]  smc_wr_tx_tasklet_fn+0x213/0x560
[ 4570.712981]  ? smc_cdc_tx_dismisser+0x10/0x10
[ 4570.713489]  tasklet_action_common.isra.17+0x66/0x140
[ 4570.714083]  __do_softirq+0x123/0x2f4
[ 4570.714521]  irq_exit_rcu+0xc4/0xf0
[ 4570.714934]  common_interrupt+0xba/0xe0

Although smc_cdc_tx_handler() checks that the smc connection exists, smc_release() may already have dismissed and released the smc socket before smc_cdc_tx_handler() accesses it again:

smc_cdc_tx_handler()           |smc_release()
if (!conn)                     |
                               |
                               |smc_cdc_tx_dismiss_slots()
                               |      smc_cdc_tx_dismisser()
                               |
                               |sock_put(&smc->sk) <- last sock_put,
                               |                      smc_sock freed
bh_lock_sock(&smc->sk) (panic) |

To make sure we won't receive any CDC messages after we free the smc_sock, add a refcount on the smc_connection for inflight CDC messages (posted to the QP but with no related CQE received yet), and don't release the smc_connection until all the inflight CDC messages have completed, whether they succeeded or failed.

Using a refcount on CDC messages brings another problem: when the link is going to be destroyed, smcr_link_clear() will reset the QP, which then removes all the pending CQEs related to the QP in the CQ. To make sure all the CQEs will always come back so the refcount on the smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced by smc_ib_modify_qp_error(). The timeout in smc_wr_tx_wait_no_pending_sends() is also removed, since we need to wait for all pending WQEs to complete, or we may encounter a use-after-free when handling CQEs.

For the IB device removal routine, we need to wait for all the QPs on that device to be destroyed before we can destroy the CQs on the device, or the refcount on smc_connection won't reach 0 and smc_sock cannot be released.
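
The idea of pinning an object while messages are in flight can be illustrated with a small user-space sketch. This is not the kernel patch: struct demo_conn and every demo_* function are hypothetical names invented for illustration, and the sketch swaps in a simpler technique, where whoever drops the last reference frees the object, whereas the description above has the release path wait until all in-flight messages have completed. It is single-threaded and only replays the problematic ordering (release first, completion afterwards).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connection; NOT the kernel's smc_connection. */
struct demo_conn {
    atomic_int refs;   /* 1 for the owning socket + 1 per in-flight message */
    int completions;   /* completions handled while the object was alive */
    bool *freed;       /* set to true when the object is freed (test hook) */
};

static struct demo_conn *demo_conn_new(bool *freed)
{
    struct demo_conn *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    atomic_init(&c->refs, 1);   /* the owner's reference */
    c->completions = 0;
    c->freed = freed;
    *freed = false;
    return c;
}

/* Drop one reference; the caller that drops the last one frees the object. */
static void demo_conn_put(struct demo_conn *c)
{
    if (atomic_fetch_sub(&c->refs, 1) == 1) {
        *c->freed = true;
        free(c);
    }
}

/* Posting a message pins the connection until its completion arrives. */
static void demo_post_msg(struct demo_conn *c)
{
    atomic_fetch_add(&c->refs, 1);
}

/* Completion handler: runs for successful and failed sends alike. */
static void demo_tx_complete(struct demo_conn *c)
{
    c->completions++;   /* safe: this message still holds a reference */
    demo_conn_put(c);
}

/* Close path: gives up only the owner's reference. */
static void demo_release(struct demo_conn *c)
{
    demo_conn_put(c);
}

/*
 * Replays the ordering from the race: release first, completion afterwards.
 * Returns true if the object was still alive when the late completion ran
 * and was freed once that completion dropped the last reference.
 */
static bool demo_replay_race(void)
{
    bool freed;
    struct demo_conn *c = demo_conn_new(&freed);
    bool alive_before_completion;

    if (!c)
        return false;
    demo_post_msg(c);
    demo_release(c);
    alive_before_completion = !freed;
    demo_tx_complete(c);
    return alive_before_completion && freed;
}
```

Calling demo_replay_race() exercises the sequence shown in the diagram. Without the reference taken in demo_post_msg(), demo_release() would free the object and demo_tx_complete() would then touch freed memory, which is the failure mode the advisory describes.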

Vulnerable products and versions

CPE                                             From                 Up to
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*    4.11.0 (including)   5.10.90 (excluding)
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*    5.11.0 (including)   5.15.13 (excluding)
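
The two version ranges listed above can be checked mechanically. The following is a minimal sketch, assuming a release string of the form "X.Y" or "X.Y.Z" with an optional suffix (for example the output of uname -r); struct kver and the kver_* helpers are hypothetical names introduced here. It only mirrors the upstream ranges in the table: distribution kernels often backport fixes without changing the base version, so a match is a first indication rather than proof that a system is vulnerable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper type: an upstream kernel version triple. */
struct kver {
    int major, minor, patch;
};

/* Three-way comparison: negative, zero or positive, like strcmp(). */
static int kver_cmp(struct kver a, struct kver b)
{
    if (a.major != b.major)
        return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor)
        return a.minor < b.minor ? -1 : 1;
    if (a.patch != b.patch)
        return a.patch < b.patch ? -1 : 1;
    return 0;
}

/* Parse "X.Y" or "X.Y.Z"; any trailing suffix such as "-generic" is ignored. */
static bool kver_parse(const char *s, struct kver *out)
{
    int major = 0, minor = 0, patch = 0;
    int n = sscanf(s, "%d.%d.%d", &major, &minor, &patch);

    if (n < 2 || major < 0 || minor < 0)
        return false;
    if (n == 2 || patch < 0)
        patch = 0;   /* "X.Y" is treated as "X.Y.0" */
    out->major = major;
    out->minor = minor;
    out->patch = patch;
    return true;
}

/* True if the release falls in one of the ranges listed in the table:
 * [4.11.0, 5.10.90) or [5.11.0, 5.15.13). */
static bool kver_in_listed_ranges(const char *release)
{
    static const struct kver lo1 = {4, 11, 0}, hi1 = {5, 10, 90};
    static const struct kver lo2 = {5, 11, 0}, hi2 = {5, 15, 13};
    struct kver v;

    if (!kver_parse(release, &v))
        return false;
    return (kver_cmp(v, lo1) >= 0 && kver_cmp(v, hi1) < 0) ||
           (kver_cmp(v, lo2) >= 0 && kver_cmp(v, hi2) < 0);
}
```

For example, kver_in_listed_ranges("5.10.89-generic") is true, while kver_in_listed_ranges("5.10.90") and kver_in_listed_ranges("5.15.13") are false because the upper bounds are exclusive.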