CVE-2024-56592
Publication date: 27/12/2024
In the Linux kernel, the following vulnerability has been resolved:

bpf: Call free_htab_elem() after htab_unlock_bucket()

For an htab of maps, the htab may hold the last reference to a map when
that map is removed from it. In that case bpf_map_fd_put_ptr() invokes
bpf_map_free_id() to free the id of the removed map element. However,
bpf_map_fd_put_ptr() is invoked while holding a bucket lock
(raw_spinlock_t), and bpf_map_free_id() attempts to acquire map_idr_lock
(spinlock_t), triggering the following lockdep warning:

=============================
[ BUG: Invalid wait context ]
6.11.0-rc4+ #49 Not tainted
-----------------------------
test_maps/4881 is trying to lock:
ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70
other info that might help us debug this:
context-{5:5}
2 locks held by test_maps/4881:
#0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270
#1: ffff888149ced148 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80
stack backtrace:
CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
Call Trace:
dump_stack_lvl+0x6e/0xb0
dump_stack+0x10/0x20
__lock_acquire+0x73e/0x36c0
lock_acquire+0x182/0x450
_raw_spin_lock_irqsave+0x43/0x70
bpf_map_free_id.part.0+0x21/0x70
bpf_map_put+0xcf/0x110
bpf_map_fd_put_ptr+0x9a/0xb0
free_htab_elem+0x69/0xe0
htab_map_update_elem+0x50f/0xa80
bpf_fd_htab_map_update_elem+0x131/0x270
htab_map_update_elem+0x50f/0xa80
bpf_fd_htab_map_update_elem+0x131/0x270
bpf_map_update_value+0x266/0x380
__sys_bpf+0x21bb/0x36b0
__x64_sys_bpf+0x45/0x60
x64_sys_call+0x1b2a/0x20d0
do_syscall_64+0x5d/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
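
The {outer:inner} pairs in the report ({1:3}, {2:2}, {3:3}, context-{5:5}) can be read with a small model of the wait-context rule. The sketch below is a user-space simplification, not lockdep's implementation: the enum names, struct lock_class and wait_context_ok() are hypothetical, and the numeric values are simply copied from the annotations in the report above. The rule it models is that a lock may only be acquired if its outer wait type does not exceed the smallest inner wait type among the locks already held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the numbers mirror the {outer:inner} values in the report. */
enum wait_type { WAIT_FREE = 1, WAIT_SPIN = 2, WAIT_CONFIG = 3, WAIT_SLEEP = 4, WAIT_MAX = 5 };

struct lock_class {
    const char *name;
    enum wait_type outer; /* what acquiring this lock demands of the context */
    enum wait_type inner; /* what is still permitted while this lock is held */
};

static const struct lock_class rcu_read = { "rcu_read_lock", WAIT_FREE,   WAIT_CONFIG }; /* {1:3} */
static const struct lock_class bucket   = { "bucket lock",   WAIT_SPIN,   WAIT_SPIN   }; /* {2:2} */
static const struct lock_class map_idr  = { "map_idr_lock",  WAIT_CONFIG, WAIT_CONFIG }; /* {3:3} */

/* Returns true if taking `next` is allowed while the `held` locks are held. */
static bool wait_context_ok(const struct lock_class *const *held, size_t n,
                            const struct lock_class *next)
{
    enum wait_type allowed = WAIT_MAX; /* task context, shown as context-{5:5} */

    assert(next != NULL);
    for (size_t i = 0; i < n; i++)
        if (held[i]->inner < allowed)
            allowed = held[i]->inner; /* the most restrictive held lock wins */
    return next->outer <= allowed;
}

/* The reported case: map_idr_lock is taken under RCU plus the bucket lock. */
static bool report_case_ok(void)
{
    const struct lock_class *const held[] = { &rcu_read, &bucket };
    return wait_context_ok(held, 2, &map_idr);
}

/* With the bucket lock already dropped, only the RCU read lock is held. */
static bool unlocked_case_ok(void)
{
    const struct lock_class *const held[] = { &rcu_read };
    return wait_context_ok(held, 1, &map_idr);
}
```

In this model report_case_ok() is false because 3 > 2, while unlocked_case_ok() is true, which is why moving the put after htab_unlock_bucket() silences the warning.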

One way to fix the lockdep warning is to use raw_spinlock_t for
map_idr_lock as well. However, bpf_map_alloc_id() invokes
idr_alloc_cyclic() after acquiring map_idr_lock, which would trigger a
similar lockdep warning because the slab's lock (s->cpu_slab->lock) is
still a spinlock.

Instead of changing map_idr_lock's type, fix the issue by invoking
htab_put_fd_value() after htab_unlock_bucket(). However, deferring only
the invocation of htab_put_fd_value() is not enough, because the old map
pointers in an htab of maps cannot be saved during batched deletion.
Therefore, also defer the invocation of free_htab_elem(), so that these
to-be-freed elements can be linked together, as is done for the lru map.

There are four callers of ->map_fd_put_ptr:

(1) alloc_htab_elem() (through htab_put_fd_value())
It invokes ->map_fd_put_ptr() under a raw_spinlock_t. The invocation of
htab_put_fd_value() cannot simply be moved after htab_unlock_bucket(),
because the old element has already been stashed in htab->extra_elems.
It may be reused immediately after htab_unlock_bucket(), and an
invocation of htab_put_fd_value() after htab_unlock_bucket() may then
release the newly added element incorrectly. Therefore, save the map
pointer of the old element for an htab of maps before unlocking the
bucket, and release the map_ptr after the unlock. Besides the map pointer
in the old element, the same must be done for the special fields in the
old element as well.
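
The save-then-release idea in case (1) can be sketched in user space as follows. This is only an illustration under stated assumptions: struct toy_bucket, toy_update() and toy_put_ptr() are hypothetical stand-ins, a boolean flag models the raw bucket lock, and a counter records any release that happens while that flag is set. It is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types: `locked` models the raw bucket lock. */
struct toy_bucket {
    bool locked;
    void *map_ptr; /* value slot that may be reused right after unlock */
};

static int puts_total;      /* how many old pointers were released */
static int puts_under_lock; /* releases made in the wrong context */

static void toy_lock(struct toy_bucket *b)   { assert(!b->locked); b->locked = true; }
static void toy_unlock(struct toy_bucket *b) { assert(b->locked);  b->locked = false; }

/* Models ->map_fd_put_ptr(), which may take a lock that must not nest
 * inside the bucket lock. */
static void toy_put_ptr(struct toy_bucket *b, void *ptr)
{
    (void)ptr;
    if (b->locked)
        puts_under_lock++;
    puts_total++;
}

static void toy_update(struct toy_bucket *b, void *new_ptr)
{
    void *old_ptr;

    toy_lock(b);
    old_ptr = b->map_ptr; /* save the old pointer while the slot is stable */
    b->map_ptr = new_ptr; /* after unlock the slot belongs to the new value */
    toy_unlock(b);

    if (old_ptr)
        toy_put_ptr(b, old_ptr); /* release the saved copy, not the slot */
}
```

Releasing the saved copy rather than re-reading the slot after the unlock is the point of the sketch: once the lock is dropped, the slot may already hold a newer value.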

(2) free_htab_elem() (through htab_put_fd_value())
Its callers include __htab_map_lookup_and_delete_elem(),
htab_map_delete_elem() and __htab_map_lookup_and_delete_batch().

For htab_map_delete_elem(), simply invoke free_htab_elem() after
htab_unlock_bucket(). For __htab_map_lookup_and_delete_batch(), just
as for the lru map, link the to-be-freed elements into a node_to_free list
and invoke free_htab_elem() for these elements after the unlock. It is safe
to reuse batch_flink as the link for node_to_free, because these
elements have been removed from the hash llist.
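
The batched case can be sketched the same way: unlink elements under the lock, chain them onto a local list through a link field that is free for reuse once an element has left the bucket, and free them only after the unlock. The sketch below is a user-space illustration with hypothetical names (struct toy_elem, free_link, toy_delete_batch()); the flag again models the bucket lock, and free_link only plays the role that batch_flink plays in the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_elem {
    struct toy_elem *next;      /* link in the bucket's chain */
    struct toy_elem *free_link; /* reused as the to-be-freed link once unlinked */
    int key;
};

struct toy_bucket {
    bool locked; /* models the raw bucket lock */
    struct toy_elem *head;
};

static int frees_under_lock; /* frees made while the lock flag was set */

/* Models free_htab_elem(), which must not run under the bucket lock. */
static void toy_free_elem(struct toy_bucket *b, struct toy_elem *e)
{
    if (b->locked)
        frees_under_lock++;
    free(e);
}

static bool toy_insert(struct toy_bucket *b, int key)
{
    struct toy_elem *e = calloc(1, sizeof(*e));

    if (!e)
        return false;
    e->key = key;
    e->next = b->head;
    b->head = e;
    return true;
}

/* Removes every element from the bucket and returns how many were freed. */
static int toy_delete_batch(struct toy_bucket *b)
{
    struct toy_elem *node_to_free = NULL, *e;
    int freed = 0;

    b->locked = true;
    while ((e = b->head) != NULL) {
        b->head = e->next;           /* unlink from the bucket */
        e->free_link = node_to_free; /* chain onto the local list */
        node_to_free = e;
    }
    b->locked = false;

    while ((e = node_to_free) != NULL) {
        node_to_free = e->free_link; /* read the link before freeing */
        toy_free_elem(b, e);
        freed++;
    }
    return freed;
}
```

Because the list head is a local variable and each element is already unlinked, no other path can reach these elements between the unlock and the free.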

Because htab of maps doesn't support lookup_and_delete operation,
__htab_map_lookup_and_delete_elem() doesn't have the problem, so kept
it as
---truncated---
Severity CVSS v4.0: Pending analysis
Last modification: 08/10/2025