CVE-2024-46765
Severity CVSS v4.0:
Pending analysis
Type:
CWE-476
NULL Pointer Dereference
Publication date:
18/09/2024
Last modified:
26/09/2024
Description
In the Linux kernel, the following vulnerability has been resolved:

ice: protect XDP configuration with a mutex

The main threat to data consistency in ice_xdp() is a possible asynchronous
PF reset. It can be triggered by a user or by the TX timeout handler.

XDP setup and PF reset code access the same resources in the following
sections:
* ice_vsi_close() in ice_prepare_for_reset() - already rtnl-locked
* ice_vsi_rebuild() for the PF VSI - not protected
* ice_vsi_open() - already rtnl-locked

With unfortunate timing, such accesses can result in a crash such as the
one below:

[ +1.999878] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 14
[ +2.002992] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 18
[Mar15 18:17] ice 0000:b1:00.0 ens801f0np0: NETDEV WATCHDOG: CPU: 38: transmit queue 14 timed out 80692736 ms
[ +0.000093] ice 0000:b1:00.0 ens801f0np0: tx_timeout: VSI_num: 6, Q 14, NTC: 0x0, HW_HEAD: 0x0, NTU: 0x0, INT: 0x4000001
[ +0.000012] ice 0000:b1:00.0 ens801f0np0: tx_timeout recovery level 1, txqueue 14
[ +0.394718] ice 0000:b1:00.0: PTP reset successful
[ +0.006184] BUG: kernel NULL pointer dereference, address: 0000000000000098
[ +0.000045] #PF: supervisor read access in kernel mode
[ +0.000023] #PF: error_code(0x0000) - not-present page
[ +0.000023] PGD 0 P4D 0
[ +0.000018] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000023] CPU: 38 PID: 7540 Comm: kworker/38:1 Not tainted 6.8.0-rc7 #1
[ +0.000031] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
[ +0.000036] Workqueue: ice ice_service_task [ice]
[ +0.000183] RIP: 0010:ice_clean_tx_ring+0xa/0xd0 [ice]
[...]
[ +0.000013] Call Trace:
[ +0.000016] <TASK>
[ +0.000014] ? __die+0x1f/0x70
[ +0.000029] ? page_fault_oops+0x171/0x4f0
[ +0.000029] ? schedule+0x3b/0xd0
[ +0.000027] ? exc_page_fault+0x7b/0x180
[ +0.000022] ? asm_exc_page_fault+0x22/0x30
[ +0.000031] ? ice_clean_tx_ring+0xa/0xd0 [ice]
[ +0.000194] ice_free_tx_ring+0xe/0x60 [ice]
[ +0.000186] ice_destroy_xdp_rings+0x157/0x310 [ice]
[ +0.000151] ice_vsi_decfg+0x53/0xe0 [ice]
[ +0.000180] ice_vsi_rebuild+0x239/0x540 [ice]
[ +0.000186] ice_vsi_rebuild_by_type+0x76/0x180 [ice]
[ +0.000145] ice_rebuild+0x18c/0x840 [ice]
[ +0.000145] ? delay_tsc+0x4a/0xc0
[ +0.000022] ? delay_tsc+0x92/0xc0
[ +0.000020] ice_do_reset+0x140/0x180 [ice]
[ +0.000886] ice_service_task+0x404/0x1030 [ice]
[ +0.000824] process_one_work+0x171/0x340
[ +0.000685] worker_thread+0x277/0x3a0
[ +0.000675] ? preempt_count_add+0x6a/0xa0
[ +0.000677] ? _raw_spin_lock_irqsave+0x23/0x50
[ +0.000679] ? __pfx_worker_thread+0x10/0x10
[ +0.000653] kthread+0xf0/0x120
[ +0.000635] ? __pfx_kthread+0x10/0x10
[ +0.000616] ret_from_fork+0x2d/0x50
[ +0.000612] ? __pfx_kthread+0x10/0x10
[ +0.000604] ret_from_fork_asm+0x1b/0x30
[ +0.000604] </TASK>

The previous way of handling this, returning -EBUSY, is not viable,
particularly when destroying an AF_XDP socket, because the kernel proceeds
with removal anyway.

There is plenty of code between those calls, and there is no need to create
a large critical section that covers all of them, just as there is no need
to protect ice_vsi_rebuild() with rtnl_lock().

Add the xdp_state_lock mutex to protect ice_vsi_rebuild() and ice_xdp().
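
The locking pattern described above can be illustrated with a rough userspace sketch. This is not the driver code: a pthread mutex stands in for the kernel mutex, and every name prefixed with model_ is hypothetical. Only xdp_state_lock, ice_xdp() and ice_vsi_rebuild() are taken from the text. The sketch assumes that both the XDP setup path and the rebuild path take the same lock before touching the ring array.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a VSI; not the driver's structure. */
struct model_vsi {
	pthread_mutex_t xdp_state_lock; /* name taken from the commit text */
	int *xdp_rings;                 /* NULL while no XDP rings exist */
	int num_xdp_rings;
};

/* Stand-in for ice_xdp(): replace the ring array (n == 0 detaches). */
static int model_xdp_setup(struct model_vsi *vsi, int n)
{
	int ret = 0;

	pthread_mutex_lock(&vsi->xdp_state_lock);
	free(vsi->xdp_rings);
	vsi->xdp_rings = NULL;
	vsi->num_xdp_rings = 0;
	if (n > 0) {
		vsi->xdp_rings = calloc((size_t)n, sizeof(*vsi->xdp_rings));
		if (vsi->xdp_rings)
			vsi->num_xdp_rings = n;
		else
			ret = -1;
	}
	pthread_mutex_unlock(&vsi->xdp_state_lock);
	return ret;
}

/* Stand-in for ice_vsi_rebuild(): walk the rings and reset each one. */
static int model_vsi_rebuild(struct model_vsi *vsi)
{
	int cleaned = 0;

	pthread_mutex_lock(&vsi->xdp_state_lock);
	for (int i = 0; i < vsi->num_xdp_rings; i++) {
		/* Count and pointer are read under one lock, so they agree. */
		assert(vsi->xdp_rings != NULL);
		vsi->xdp_rings[i] = 0;
		cleaned++;
	}
	pthread_mutex_unlock(&vsi->xdp_state_lock);
	return cleaned;
}

/* Stand-in for the service task that performs repeated resets. */
static void *model_reset_worker(void *arg)
{
	for (int i = 0; i < 10000; i++)
		model_vsi_rebuild(arg);
	return NULL;
}

/* Toggle XDP while resets run concurrently; returns 0 on success. */
static int model_run_race(void)
{
	struct model_vsi vsi = { .xdp_rings = NULL, .num_xdp_rings = 0 };
	pthread_t worker;
	int ret = 0;

	if (pthread_mutex_init(&vsi.xdp_state_lock, NULL))
		return -1;
	if (pthread_create(&worker, NULL, model_reset_worker, &vsi)) {
		pthread_mutex_destroy(&vsi.xdp_state_lock);
		return -1;
	}
	for (int i = 0; i < 10000 && !ret; i++)
		ret = model_xdp_setup(&vsi, (i % 2) ? 0 : 4);
	pthread_join(worker, NULL);
	model_xdp_setup(&vsi, 0); /* release any remaining rings */
	pthread_mutex_destroy(&vsi.xdp_state_lock);
	return ret;
}
```

Calling model_run_race() exercises both paths concurrently. Without the lock, the worker could observe a ring count that no longer matches the ring pointer, which is loosely analogous to the NULL pointer dereference in the trace.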

Leaving unprotected sections in between would result in two states that
have to be considered:
1. when the VSI is closed, but not yet rebuilt
2. when the VSI is already rebuilt, but not yet open

The latter case is actually already handled through the !netif_running() case;
we just need to adjust flag checking a little. The former one is not as
trivial, because between ice_vsi_close() and ice_vsi_rebuild(), a lot of
hardware interaction happens, which can make adding/deleting rings exit
with an error. Luckily, VSI rebuild is pending and can apply new
configuration for us in a managed fashion.

Therefore, add an additional VSI state flag ICE_VSI_REBUILD_PENDING to
indicate that ice_x
---truncated---
Impact
Base Score 3.x
5.50
Severity 3.x
MEDIUM
Vulnerable products and versions
CPE | From | Up to |
---|---|---|
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* | 5.5 (including) | 6.6.51 (excluding) |
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* | 6.7 (including) | 6.10.10 (excluding) |
cpe:2.3:o:linux:linux_kernel:6.11:rc1:*:*:*:*:*:* | | |
cpe:2.3:o:linux:linux_kernel:6.11:rc2:*:*:*:*:*:* | | |
cpe:2.3:o:linux:linux_kernel:6.11:rc3:*:*:*:*:*:* | | |
cpe:2.3:o:linux:linux_kernel:6.11:rc4:*:*:*:*:*:* | | |
cpe:2.3:o:linux:linux_kernel:6.11:rc5:*:*:*:*:*:* | | |
cpe:2.3:o:linux:linux_kernel:6.11:rc6:*:*:*:*:*:* | | |
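
The two version ranges in the table can be checked mechanically. The following is a minimal sketch, assuming "5.5 (including)" means 5.5.0 and that versions are plain major.minor.patch triples. The kver type and helper names are hypothetical, and the 6.11 release candidates listed individually are not covered. A distribution kernel may carry a backported fix, so a match here is only a first indication.

```c
#include <stdbool.h>

/* Hypothetical helper type: a stable kernel release as major.minor.patch. */
struct kver {
	int major, minor, patch;
};

/* Returns <0, 0 or >0 when a sorts before, equal to, or after b. */
static int kver_cmp(struct kver a, struct kver b)
{
	if (a.major != b.major)
		return a.major - b.major;
	if (a.minor != b.minor)
		return a.minor - b.minor;
	return a.patch - b.patch;
}

/* True when from <= v < to, i.e. "including" start, "excluding" end. */
static bool kver_in_range(struct kver v, struct kver from, struct kver to)
{
	return kver_cmp(v, from) >= 0 && kver_cmp(v, to) < 0;
}

/* Checks v against the two version ranges listed in the table. */
static bool kver_in_listed_ranges(struct kver v)
{
	const struct kver r1_from = { 5, 5, 0 }, r1_to = { 6, 6, 51 };
	const struct kver r2_from = { 6, 7, 0 }, r2_to = { 6, 10, 10 };

	return kver_in_range(v, r1_from, r1_to) ||
	       kver_in_range(v, r2_from, r2_to);
}
```

For example, 6.6.50 falls inside the first range, while 6.6.51 and 6.10.10 fall outside both because the upper bounds are exclusive.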