CVE-2025-39886

Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
23/09/2025
Last modified:
23/09/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()

Currently, calling bpf_map_kmalloc_node() from __bpf_async_init() can
cause various locking issues; see the following stack trace (edited for
style) as one example:

...
 [10.011566] do_raw_spin_lock.cold
 [10.011570] try_to_wake_up                 (5) double-acquiring the same
 [10.011575] kick_pool                          rq_lock, causing a hardlockup
 [10.011579] __queue_work
 [10.011582] queue_work_on
 [10.011585] kernfs_notify
 [10.011589] cgroup_file_notify
 [10.011593] try_charge_memcg               (4) memcg accounting raises an
 [10.011597] obj_cgroup_charge_pages            MEMCG_MAX event
 [10.011599] obj_cgroup_charge_account
 [10.011600] __memcg_slab_post_alloc_hook
 [10.011603] __kmalloc_node_noprof
...
 [10.011611] bpf_map_kmalloc_node
 [10.011612] __bpf_async_init
 [10.011615] bpf_timer_init                 (3) BPF calls bpf_timer_init()
 [10.011617] bpf_prog_xxxxxxxxxxxxxxxx_fcg_runnable
 [10.011619] bpf__sched_ext_ops_runnable
 [10.011620] enqueue_task_scx               (2) BPF runs with rq_lock held
 [10.011622] enqueue_task
 [10.011626] ttwu_do_activate
 [10.011629] sched_ttwu_pending             (1) grabs rq_lock
...

The above was reproduced on bpf-next (b338cf849ec8) by modifying
./tools/sched_ext/scx_flatcg.bpf.c to call bpf_timer_init() during
ops.runnable(), and hacking the memcg accounting code a bit to make a
bpf_timer_init() call more likely to raise an MEMCG_MAX event.

We have also run into other similar variants (both internally and on
bpf-next), including double-acquiring cgroup_file_kn_lock, the same
worker_pool::lock, etc.

As suggested by Shakeel, fix this by using __GFP_HIGH instead of
GFP_ATOMIC in __bpf_async_init(), so that e.g. if try_charge_memcg()
raises an MEMCG_MAX event, we call __memcg_memory_event() with
@allow_spinning=false and avoid calling cgroup_file_notify() there.

Depends on mm patch
"memcg: skip cgroup_file_notify if spinning is not allowed":
https://lore.kernel.org/bpf/20250905201606.66198-1-shakeel.butt@linux.dev/

v0 approach (s/bpf_map_kmalloc_node/bpf_mem_alloc/):
https://lore.kernel.org/bpf/20250905061919.439648-1-yepeilin@google.com/
v1 approach:
https://lore.kernel.org/bpf/20250905234547.862249-1-yepeilin@google.com/
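The upstream patch itself is not reproduced on this page. The sketch below only illustrates the kind of one-flag change the commit message describes, assuming the allocation in __bpf_async_init() is an ordinary bpf_map_kmalloc_node() call; the variable names (cb, size) and surrounding context are hypothetical, not the actual kernel source.

/*
 * Illustrative sketch, not the upstream diff. GFP_ATOMIC is defined as
 * (__GFP_HIGH | __GFP_KSWAPD_RECLAIM); per the commit message, dropping
 * the reclaim bit and passing a bare __GFP_HIGH is what tells the memcg
 * charge path that spinning is not allowed, so the notification work that
 * re-takes rq_lock is skipped.
 */

/* Before: charging this allocation under memory pressure can reach
 * cgroup_file_notify() -> kernfs_notify() -> queue_work_on() while the
 * caller already holds rq_lock, re-acquiring the same rq_lock and
 * hardlocking the CPU (see the stack trace above).
 */
cb = bpf_map_kmalloc_node(map, size, GFP_ATOMIC, map->numa_node);

/* After: __GFP_HIGH keeps high-priority, non-sleeping allocation semantics,
 * but try_charge_memcg() now takes the allow_spinning=false path, so an
 * MEMCG_MAX event no longer triggers cgroup_file_notify() from this context.
 */
cb = bpf_map_kmalloc_node(map, size, __GFP_HIGH, map->numa_node);

This only has the intended effect together with the mm patch referenced above ("memcg: skip cgroup_file_notify if spinning is not allowed"), which makes the memcg event code honor the no-spinning case.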

Impact