CVE-2024-56547
Severity CVSS v4.0: Pending analysis
Type: Unavailable / Other
Publication date: 27/12/2024
Last modified: 27/12/2024
Description
In the Linux kernel, the following vulnerability has been resolved:

rcu/nocb: Fix missed RCU barrier on deoffloading

Running the rcutorture test with torture_type=rcu fwd_progress=8
n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60
test_boost=2 currently triggers the following warning:

WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0
RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0
Call Trace:
 ? __warn+0x7e/0x120
 ? rcu_nocb_rdp_deoffload+0x292/0x2a0
 ? report_bug+0x18e/0x1a0
 ? handle_bug+0x3d/0x70
 ? exc_invalid_op+0x18/0x70
 ? asm_exc_invalid_op+0x1a/0x20
 ? rcu_nocb_rdp_deoffload+0x292/0x2a0
 rcu_nocb_cpu_deoffload+0x70/0xa0
 rcu_nocb_toggle+0x136/0x1c0
 ? __pfx_rcu_nocb_toggle+0x10/0x10
 kthread+0xd1/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2f/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30

CPU0                            CPU2                            CPU3
// rcu_nocb_toggle              // nocb_cb_wait                 // rcutorture

// deoffload CPU1               // process CPU1's rdp
rcu_barrier()
  rcu_segcblist_entrain()
  rcu_segcblist_add_len(1);
  // len == 2
  // enqueue barrier
  // callback to CPU1's
  // rdp->cblist
                                rcu_do_batch()
                                  // invoke CPU1's rdp->cblist
                                  // callback
                                  rcu_barrier_callback()
                                                                rcu_barrier()
                                                                  mutex_lock(&rcu_state.barrier_mutex);
                                                                  // still see len == 2
                                                                  // enqueue barrier callback
                                                                  // to CPU1's rdp->cblist
                                                                  rcu_segcblist_entrain()
                                                                  rcu_segcblist_add_len(1);
                                                                  // len == 3
                                  // decrement len
                                  rcu_segcblist_add_len(-2);
                                kthread_parkme()

// CPU1's rdp->cblist len == 1
// Warn because there is
// still a pending barrier
// trigger warning
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
cpus_read_unlock();

                                                                // wait for CPU1 to come online and
                                                                // invoke the barrier callback on
                                                                // CPU1's rdp->cblist
                                                                wait_for_completion(&rcu_state.barrier_completion);
// deoffload CPU4
cpus_read_lock()
  rcu_barrier()
    mutex_lock(&rcu_state.barrier_mutex);
    // blocks on barrier_mutex,
    // waiting for rcu_barrier() on
    // CPU3 to unlock barrier_mutex;
    // but before CPU3 can unlock it,
    // CPU1 must come online, and
    // CPU1 going online blocks on cpus_write_lock

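The callback-count arithmetic in the interleaving above can be replayed in plain user-space C. This is only a sketch: `struct cblist_model`, `model_add_len()` and `replay_missed_barrier()` are hypothetical names, not kernel symbols, and the starting count of 1 is an assumption inferred from the "len == 2" comment that follows the first increment.

```c
#include <assert.h>

/* Hypothetical stand-in for the callback count of CPU1's rdp->cblist. */
struct cblist_model {
    long len;
};

/* Mirrors the effect of rcu_segcblist_add_len(): adjust the count by delta. */
static void model_add_len(struct cblist_model *l, long delta)
{
    l->len += delta;
    assert(l->len >= 0); /* the count can never go negative */
}

/*
 * Replays the three-CPU interleaving in program order and returns the
 * count that the deoffloading CPU sees after the rcuo kthread has parked.
 */
static long replay_missed_barrier(void)
{
    /* Assumption: one callback is already queued before the race starts. */
    struct cblist_model cpu1 = { .len = 1 };
    long invoked;

    model_add_len(&cpu1, 1);        /* CPU0: rcu_barrier() entrains, len == 2 */
    invoked = cpu1.len;             /* CPU2: rcu_do_batch() takes both callbacks */
    model_add_len(&cpu1, 1);        /* CPU3: racing rcu_barrier() entrains, len == 3 */
    model_add_len(&cpu1, -invoked); /* CPU2: accounts for the 2 invoked callbacks */

    /* CPU2 parks here; CPU0 then checks the count and warns if it is nonzero. */
    return cpu1.len;
}
```

Calling `replay_missed_barrier()` returns 1: the barrier callback queued by CPU3 is still pending when the kthread parks, which is the condition the WARN_ON_ONCE() reports.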
The above scenario triggers not only a WARN_ON_ONCE() but also a
deadlock.

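The deadlock can be read as a cycle of waiters: the deoffloading task holds cpus_read_lock() and waits for barrier_mutex, the rcu_barrier() caller holds barrier_mutex and waits for CPU1 to come online, and bringing CPU1 online waits for cpus_write_lock. The sketch below models only that wait-for relation in user-space C. The enum values and function names are hypothetical and chosen for illustration; nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three parties in the scenario. */
enum actor {
    NOBODY = -1,
    CPU0_DEOFFLOAD, /* holds cpus_read_lock(), wants barrier_mutex */
    CPU3_BARRIER,   /* holds barrier_mutex, waits for barrier_completion */
    CPU1_HOTPLUG,   /* onlining CPU1, wants cpus_write_lock */
    NR_ACTORS
};

/* Follows the wait-for chain from start; true if it leads back to start. */
static bool has_wait_cycle(const int waits_for[NR_ACTORS], int start)
{
    int cur = start;

    assert(start >= 0 && start < NR_ACTORS);
    for (int steps = 0; steps < NR_ACTORS; steps++) {
        cur = waits_for[cur];
        if (cur == NOBODY)
            return false; /* the chain ends: someone can make progress */
        if (cur == start)
            return true;  /* the chain closes: nobody can make progress */
    }
    return false;
}

/* Encodes the wait-for edges described in the scenario. */
static bool scenario_deadlocks(void)
{
    const int waits_for[NR_ACTORS] = {
        [CPU0_DEOFFLOAD] = CPU3_BARRIER,   /* needs barrier_mutex */
        [CPU3_BARRIER]   = CPU1_HOTPLUG,   /* needs CPU1's barrier callback */
        [CPU1_HOTPLUG]   = CPU0_DEOFFLOAD, /* needs the read lock released */
    };

    return has_wait_cycle(waits_for, CPU0_DEOFFLOAD);
}
```

`scenario_deadlocks()` returns true for these edges. Removing any one edge, for example by ensuring that no barrier callback is left pending when the kthread parks, breaks the cycle.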
Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU
will either observe the callback counter decremented to 0 and skip
the callback enqueue, or rcuo will observe the new callback and keep
rdp->nocb_cb_sleep set to false.

Therefore, check rdp->nocb_cb_sleep before parking to make sure no
further rcu_barrier() is waiting on the rdp.
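
The parking rule described above can be sketched as a small user-space predicate. This is an illustrative model under stated assumptions, not the kernel patch: `struct rcuo_model` and `may_park()` are hypothetical, and the fields merely stand in for kthread_should_park(), rdp->nocb_cb_sleep and the callback count of rdp->cblist.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what the rcuo callback kthread would inspect. */
struct rcuo_model {
    bool should_park;   /* stands in for kthread_should_park() */
    bool nocb_cb_sleep; /* stands in for rdp->nocb_cb_sleep */
    long n_cbs;         /* stands in for the rdp->cblist callback count */
};

/* Returns true only when parking cannot strand a pending barrier callback. */
static bool may_park(const struct rcuo_model *m)
{
    if (!m->should_park)
        return false;

    /*
     * nocb_cb_sleep == false means a racing rcu_barrier() queued a new
     * callback that the kthread has noticed, so it must keep processing.
     */
    if (!m->nocb_cb_sleep)
        return false;

    /* Model invariant: a kthread allowed to sleep has nothing left queued. */
    assert(m->n_cbs == 0);
    return true;
}
```

In this model a kthread asked to park while `nocb_cb_sleep` is false runs another round of callback processing instead of parking, so the count reaches 0 before the deoffloading side inspects it.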