CVE-2024-27014
Publication date:
01/05/2024
In the Linux kernel, the following vulnerability has been resolved:<br />
<br />
net/mlx5e: Prevent deadlock while disabling aRFS<br />
<br />
When disabling aRFS under the `priv->state_lock`, any scheduled<br />
aRFS works are canceled using the `cancel_work_sync` function,<br />
which waits for the work to end if it has already started.<br />
However, while the canceling thread waits, the work handler tries<br />
to acquire the same `state_lock`, which the canceling thread already holds, so neither can proceed.<br />
<br />
The worker acquires the lock to delete the rules if the state<br />
is down, which is not the worker's responsibility since<br />
disabling aRFS deletes the rules.<br />
<br />
Add an aRFS state variable that indicates whether aRFS is<br />
enabled, and prevent adding rules while aRFS is disabled.<br />
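The pattern described above can be sketched in user space. This is a minimal illustration, not the mlx5 driver code: `struct arfs_sim`, its `enabled` flag, `arfs_work`, `arfs_disable` and `run_demo` are hypothetical names, a pthread stands in for the queued work item, and `pthread_join` stands in for `cancel_work_sync`. The point it tries to show is that the handler only reads a state flag and never takes the lock held by the disabling path, so waiting for the handler under that lock cannot deadlock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real mlx5 structures. */
struct arfs_sim {
    pthread_mutex_t state_lock; /* plays the role of priv->state_lock */
    atomic_bool enabled;        /* assumed aRFS state flag */
    atomic_int rules;           /* number of installed steering rules */
};

/* Work handler: checks the flag only and never acquires state_lock. */
static void *arfs_work(void *arg)
{
    struct arfs_sim *s = arg;

    if (atomic_load(&s->enabled))
        atomic_fetch_add(&s->rules, 1); /* add a rule only while enabled */
    return NULL;
}

/* Disable path: runs under state_lock and waits for the handler to finish. */
static void arfs_disable(struct arfs_sim *s, pthread_t worker)
{
    pthread_mutex_lock(&s->state_lock);
    atomic_store(&s->enabled, false); /* no new rules from now on */
    pthread_join(worker, NULL);       /* analogue of cancel_work_sync */
    atomic_store(&s->rules, 0);       /* the disable path deletes the rules */
    pthread_mutex_unlock(&s->state_lock);
}

/* Returns the number of rules left after disabling, or -1 on setup failure. */
static int run_demo(void)
{
    struct arfs_sim s;
    pthread_t worker;
    int left;

    pthread_mutex_init(&s.state_lock, NULL);
    atomic_init(&s.enabled, true);
    atomic_init(&s.rules, 0);

    if (pthread_create(&worker, NULL, arfs_work, &s) != 0)
        return -1;

    arfs_disable(&s, worker);
    arfs_work(&s); /* a late work item sees the flag cleared and adds nothing */

    left = atomic_load(&s.rules);
    pthread_mutex_destroy(&s.state_lock);
    return left;
}
```

Calling `run_demo()` from a `main` function exercises both paths. If the handler instead locked `state_lock`, as the pre-fix worker did, the `pthread_join` call inside `arfs_disable` could block forever.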
<br />
Kernel log:<br />
<br />
======================================================<br />
WARNING: possible circular locking dependency detected<br />
6.7.0-rc4_net_next_mlx5_5483eb2 #1 Tainted: G I<br />
------------------------------------------------------<br />
ethtool/386089 is trying to acquire lock:<br />
ffff88810f21ce68 ((work_completion)(&rule->arfs_work)){+.+.}-{0:0}, at: __flush_work+0x74/0x4e0<br />
<br />
but task is already holding lock:<br />
ffff8884a1808cc0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_ethtool_set_channels+0x53/0x200 [mlx5_core]<br />
<br />
which lock already depends on the new lock.<br />
<br />
the existing dependency chain (in reverse order) is:<br />
<br />
-> #1 (&priv->state_lock){+.+.}-{3:3}:<br />
__mutex_lock+0x80/0xc90<br />
arfs_handle_work+0x4b/0x3b0 [mlx5_core]<br />
process_one_work+0x1dc/0x4a0<br />
worker_thread+0x1bf/0x3c0<br />
kthread+0xd7/0x100<br />
ret_from_fork+0x2d/0x50<br />
ret_from_fork_asm+0x11/0x20<br />
<br />
-> #0 ((work_completion)(&rule->arfs_work)){+.+.}-{0:0}:<br />
__lock_acquire+0x17b4/0x2c80<br />
lock_acquire+0xd0/0x2b0<br />
__flush_work+0x7a/0x4e0<br />
__cancel_work_timer+0x131/0x1c0<br />
arfs_del_rules+0x143/0x1e0 [mlx5_core]<br />
mlx5e_arfs_disable+0x1b/0x30 [mlx5_core]<br />
mlx5e_ethtool_set_channels+0xcb/0x200 [mlx5_core]<br />
ethnl_set_channels+0x28f/0x3b0<br />
ethnl_default_set_doit+0xec/0x240<br />
genl_family_rcv_msg_doit+0xd0/0x120<br />
genl_rcv_msg+0x188/0x2c0<br />
netlink_rcv_skb+0x54/0x100<br />
genl_rcv+0x24/0x40<br />
netlink_unicast+0x1a1/0x270<br />
netlink_sendmsg+0x214/0x460<br />
__sock_sendmsg+0x38/0x60<br />
__sys_sendto+0x113/0x170<br />
__x64_sys_sendto+0x20/0x30<br />
do_syscall_64+0x40/0xe0<br />
entry_SYSCALL_64_after_hwframe+0x46/0x4e<br />
<br />
other info that might help us debug this:<br />
<br />
Possible unsafe locking scenario:<br />
<br />
       CPU0                    CPU1<br />
       ----                    ----<br />
  lock(&priv->state_lock);<br />
                               lock((work_completion)(&rule->arfs_work));<br />
                               lock(&priv->state_lock);<br />
  lock((work_completion)(&rule->arfs_work));<br />
<br />
*** DEADLOCK ***<br />
<br />
3 locks held by ethtool/386089:<br />
#0: ffffffff82ea7210 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40<br />
#1: ffffffff82e94c88 (rtnl_mutex){+.+.}-{3:3}, at: ethnl_default_set_doit+0xd3/0x240<br />
#2: ffff8884a1808cc0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_ethtool_set_channels+0x53/0x200 [mlx5_core]<br />
<br />
stack backtrace:<br />
CPU: 15 PID: 386089 Comm: ethtool Tainted: G I 6.7.0-rc4_net_next_mlx5_5483eb2 #1<br />
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014<br />
Call Trace:<br />
<br />
dump_stack_lvl+0x60/0xa0<br />
check_noncircular+0x144/0x160<br />
__lock_acquire+0x17b4/0x2c80<br />
lock_acquire+0xd0/0x2b0<br />
? __flush_work+0x74/0x4e0<br />
? save_trace+0x3e/0x360<br />
? __flush_work+0x74/0x4e0<br />
__flush_work+0x7a/0x4e0<br />
? __flush_work+0x74/0x4e0<br />
? __lock_acquire+0xa78/0x2c80<br />
? lock_acquire+0xd0/0x2b0<br />
? mark_held_locks+0x49/0x70<br />
__cancel_work_timer+0x131/0x1c0<br />
? mark_held_locks+0x49/0x70<br />
arfs_del_rules+0x143/0x1e0 [mlx5_core]<br />
mlx5e_arfs_disable+0x1b/0x30 [mlx5_core]<br />
mlx5e_ethtool_set_channels+0xcb/0x200 [mlx5_core]<br />
ethnl_set_channels+0x28f/0x3b0<br />
ethnl_default_set_doit+0xec/0x240<br />
genl_family_rcv_msg_doit+0xd0/0x120<br />
genl_rcv_msg+0x188/0x2c0<br />
? ethn<br />
---truncated---
Severity CVSS v4.0: Pending analysis
Last modification:
04/11/2025