CVE-2024-27004
Publication date: 01/05/2024
In the Linux kernel, the following vulnerability has been resolved:

clk: Get runtime PM before walking tree during disable_unused

Doug reported [1] the following hung task:

INFO: task swapper/0:1 blocked for more than 122 seconds.
Not tainted 5.15.149-21875-gf795ebc40eb8 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00000008
Call trace:
__switch_to+0xf4/0x1f4
__schedule+0x418/0xb80
schedule+0x5c/0x10c
rpm_resume+0xe0/0x52c
rpm_resume+0x178/0x52c
__pm_runtime_resume+0x58/0x98
clk_pm_runtime_get+0x30/0xb0
clk_disable_unused_subtree+0x58/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused+0x4c/0xe4
do_one_initcall+0xcc/0x2d8
do_initcall_level+0xa4/0x148
do_initcalls+0x5c/0x9c
do_basic_setup+0x24/0x30
kernel_init_freeable+0xec/0x164
kernel_init+0x28/0x120
ret_from_fork+0x10/0x20
INFO: task kworker/u16:0:9 blocked for more than 122 seconds.
Not tainted 5.15.149-21875-gf795ebc40eb8 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:0 state:D stack: 0 pid: 9 ppid: 2 flags:0x00000008
Workqueue: events_unbound deferred_probe_work_func
Call trace:
__switch_to+0xf4/0x1f4
__schedule+0x418/0xb80
schedule+0x5c/0x10c
schedule_preempt_disabled+0x2c/0x48
__mutex_lock+0x238/0x488
__mutex_lock_slowpath+0x1c/0x28
mutex_lock+0x50/0x74
clk_prepare_lock+0x7c/0x9c
clk_core_prepare_lock+0x20/0x44
clk_prepare+0x24/0x30
clk_bulk_prepare+0x40/0xb0
mdss_runtime_resume+0x54/0x1c8
pm_generic_runtime_resume+0x30/0x44
__genpd_runtime_resume+0x68/0x7c
genpd_runtime_resume+0x108/0x1f4
__rpm_callback+0x84/0x144
rpm_callback+0x30/0x88
rpm_resume+0x1f4/0x52c
rpm_resume+0x178/0x52c
__pm_runtime_resume+0x58/0x98
__device_attach+0xe0/0x170
device_initial_probe+0x1c/0x28
bus_probe_device+0x3c/0x9c
device_add+0x644/0x814
mipi_dsi_device_register_full+0xe4/0x170
devm_mipi_dsi_device_register_full+0x28/0x70
ti_sn_bridge_probe+0x1dc/0x2c0
auxiliary_bus_probe+0x4c/0x94
really_probe+0xcc/0x2c8
__driver_probe_device+0xa8/0x130
driver_probe_device+0x48/0x110
__device_attach_driver+0xa4/0xcc
bus_for_each_drv+0x8c/0xd8
__device_attach+0xf8/0x170
device_initial_probe+0x1c/0x28
bus_probe_device+0x3c/0x9c
deferred_probe_work_func+0x9c/0xd8
process_one_work+0x148/0x518
worker_thread+0x138/0x350
kthread+0x138/0x1e0
ret_from_fork+0x10/0x20

The first thread is walking the clk tree and calling
clk_pm_runtime_get() to power on devices required to read the clk
hardware via struct clk_ops::is_enabled(). This thread holds the clk
prepare_lock, and is trying to runtime PM resume a device, when it finds
that the device is in the process of resuming so the thread schedule()s
away waiting for the device to finish resuming before continuing. The
second thread is runtime PM resuming the same device, but the runtime
resume callback is calling clk_prepare(), trying to grab the
prepare_lock waiting on the first thread.
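
The two call traces can be read as a wait cycle: the swapper task owns the prepare_lock and waits for the device resume to finish, while the kworker owns the in-progress resume and waits for the prepare_lock. A minimal userspace sketch of that relationship follows. Every name in it (`toy_res`, `toy_task`, `toy_is_abba`) is hypothetical and exists only for this illustration; none of it is kernel code.

```c
#include <stdbool.h>

/* Hypothetical resource ids, used only in this illustration. */
enum toy_res { TOY_NONE, TOY_PREPARE_LOCK, TOY_DEVICE_RESUME };

/* What one blocked task owns and what it is waiting for. */
struct toy_task {
	enum toy_res holds;
	enum toy_res waits_for;
};

/*
 * Two tasks are deadlocked ABBA-style when each one waits for the
 * resource that the other one currently holds.
 */
static bool toy_is_abba(const struct toy_task *a, const struct toy_task *b)
{
	return a->holds != TOY_NONE && b->holds != TOY_NONE &&
	       a->holds != b->holds &&
	       a->waits_for == b->holds &&
	       b->waits_for == a->holds;
}
```

Modelling the swapper task as `{ TOY_PREPARE_LOCK, TOY_DEVICE_RESUME }` and the kworker as `{ TOY_DEVICE_RESUME, TOY_PREPARE_LOCK }` makes `toy_is_abba()` return true. If the kworker's resume callback did not need the prepare_lock, the cycle would not close.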

This is a classic ABBA deadlock. To properly fix the deadlock, we must
never runtime PM resume or suspend a device with the clk prepare_lock
held. Actually doing that is near impossible today because the global
prepare_lock would have to be dropped in the middle of the tree, the
device runtime PM resumed/suspended, and then the prepare_lock grabbed
again to ensure consistency of the clk tree topology. If anything
changes with the clk tree in the meantime, we've lost and will need to
start the operation all over again.

Luckily, most of the time we're simply incrementing or decrementing the
runtime PM count on an active device, so we don't have the chance to
schedule away with the prepare_lock held. Let's fix this immediate
problem that can be
---truncated---
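
Because the description above is truncated, the following is only a userspace sketch of the ordering that the commit title suggests (take runtime PM references before walking the tree). It is not the actual patch. All names (`toy_dev`, `toy_pm_get`, `toy_walk_lock_first`, `toy_walk_pm_first`) are hypothetical, and the "lock" is a plain flag so that the example stays single-threaded and deterministic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a device with a runtime PM usage counter. */
struct toy_dev {
	int usage_count;
	bool active;
};

static bool toy_prepare_lock_held;  /* models holding the clk prepare_lock */
static int toy_resumes_under_lock;  /* counts the risky case */

static void toy_pm_get(struct toy_dev *dev)
{
	if (dev->usage_count++ == 0) {
		/* First reference: a real resume could sleep here. */
		if (toy_prepare_lock_held)
			toy_resumes_under_lock++;
		dev->active = true;
	}
}

static void toy_pm_put(struct toy_dev *dev)
{
	if (--dev->usage_count == 0)
		dev->active = false;
}

/* Ordering seen in the first trace: lock, then resume while walking. */
static int toy_walk_lock_first(struct toy_dev *devs, size_t n)
{
	toy_resumes_under_lock = 0;
	toy_prepare_lock_held = true;
	for (size_t i = 0; i < n; i++) {
		toy_pm_get(&devs[i]);  /* may resume under the lock */
		toy_pm_put(&devs[i]);
	}
	toy_prepare_lock_held = false;
	return toy_resumes_under_lock;
}

/* Assumed ordering from the title: resume everything, then lock and walk. */
static int toy_walk_pm_first(struct toy_dev *devs, size_t n)
{
	toy_resumes_under_lock = 0;
	for (size_t i = 0; i < n; i++)
		toy_pm_get(&devs[i]);  /* resumes happen with no lock held */
	toy_prepare_lock_held = true;
	for (size_t i = 0; i < n; i++) {
		toy_pm_get(&devs[i]);  /* device already active: count bump only */
		toy_pm_put(&devs[i]);
	}
	toy_prepare_lock_held = false;
	for (size_t i = 0; i < n; i++)
		toy_pm_put(&devs[i]);
	return toy_resumes_under_lock;
}
```

With three idle devices, the lock-first walk reports three resumes under the lock and the PM-first walk reports none. This matches the observation in the text that bumping the count on an already active device gives no chance to schedule away with the prepare_lock held.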
Severity CVSS v4.0: Pending analysis
Last modification: 23/12/2025