CVE-2025-38104
Publication date:
18/04/2025
In the Linux kernel, the following vulnerability has been resolved:<br />
<br />
drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV<br />
<br />
RLCG register access lets virtual functions (VFs) safely access GPU<br />
registers in a virtualized environment, including for TLB flushes and<br />
register reads. When multiple threads or VFs access the same registers<br />
simultaneously, race conditions can result. The driver therefore<br />
serializes access through the RLCG interface so that only one thread<br />
touches the registers at a time, which prevents conflicts and keeps<br />
operations correct. That serialization used a mutex, which has two<br />
problems: a low-priority task holding the mutex can block a<br />
high-priority task that needs it (priority inversion), and a thread<br />
that already holds a spinlock must not acquire a mutex, because<br />
acquiring a mutex may sleep. Register access in amdgpu_virt_rlcg_reg_rw<br />
is critical, especially on fast code paths, so the mutex is replaced<br />
with a spinlock.<br />
<br />
The call stack shows that the function amdgpu_virt_rlcg_reg_rw is being<br />
called, which attempts to acquire the mutex. This function is invoked<br />
from amdgpu_sriov_wreg, which in turn is called from<br />
gmc_v11_0_flush_gpu_tlb.<br />
<br />
The "[ BUG: Invalid wait context ]" report indicates that a thread is<br />
trying to acquire a mutex in a context that does not allow it to sleep<br />
(for example, while holding a spinlock).<br />
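
The rule behind this report can be illustrated with a small toy model. The Python sketch below is not kernel code and does not reproduce how lockdep is implemented; `ToyLock`, `ToyTask`, and `flush_tlb` are hypothetical names, and the wait-type numbers are simply the values printed in this report ({2:2} for the held invalidate_lock, {3:3} for rlcg_reg_lock). It assumes only that a task may not request a lock with a higher wait type than the most restrictive lock it already holds.

```python
from dataclasses import dataclass, field

# Wait-type values as printed in this report's log (illustrative only):
WAIT_SPIN = 2   # lock that can be taken where sleeping is not allowed
WAIT_SLEEP = 3  # lock whose acquisition may sleep, such as a mutex


class InvalidWaitContext(Exception):
    """Raised when a sleeping lock is requested in a non-sleeping context."""


@dataclass
class ToyLock:
    name: str
    wait_type: int


@dataclass
class ToyTask:
    name: str
    held: list = field(default_factory=list)

    def acquire(self, lock):
        # The most restrictive lock already held bounds what may be taken next.
        limit = min((h.wait_type for h in self.held), default=WAIT_SLEEP)
        if lock.wait_type > limit:
            raise InvalidWaitContext(
                f"{self.name} is trying to lock {lock.name} "
                f"(wait type {lock.wait_type}) in context {limit}"
            )
        self.held.append(lock)

    def release(self, lock):
        self.held.remove(lock)


def flush_tlb(task, invalidate_lock, rlcg_reg_lock):
    """Mimic the nesting in the trace: outer lock first, then the RLCG lock."""
    task.acquire(invalidate_lock)
    try:
        task.acquire(rlcg_reg_lock)
        # ... the register write would happen here ...
        task.release(rlcg_reg_lock)
    finally:
        task.release(invalidate_lock)
```

With the RLCG lock modelled as a sleeping lock, `flush_tlb` raises `InvalidWaitContext`; once it is modelled as a spinning lock, as in the fix, the same nesting succeeds.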
<br />
Fixes the following:<br />
<br />
[ 253.013423] =============================<br />
[ 253.013434] [ BUG: Invalid wait context ]<br />
[ 253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE<br />
[ 253.013464] -----------------------------<br />
[ 253.013475] kworker/0:1/10 is trying to lock:<br />
[ 253.013487] ffff9f30542e3cf8 (&adev->virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]<br />
[ 253.013815] other info that might help us debug this:<br />
[ 253.013827] context-{4:4}<br />
[ 253.013835] 3 locks held by kworker/0:1/10:<br />
[ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680<br />
[ 253.013877] #1: ffffb789c008be40 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680<br />
[ 253.013905] #2: ffff9f3054281838 (&adev->gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu]<br />
[ 253.014154] stack backtrace:<br />
[ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14<br />
[ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE<br />
[ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024<br />
[ 253.014224] Workqueue: events work_for_cpu_fn<br />
[ 253.014241] Call Trace:<br />
[ 253.014250] <br />
[ 253.014260] dump_stack_lvl+0x9b/0xf0<br />
[ 253.014275] dump_stack+0x10/0x20<br />
[ 253.014287] __lock_acquire+0xa47/0x2810<br />
[ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5<br />
[ 253.014321] lock_acquire+0xd1/0x300<br />
[ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]<br />
[ 253.014562] ? __lock_acquire+0xa6b/0x2810<br />
[ 253.014578] __mutex_lock+0x85/0xe20<br />
[ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]<br />
[ 253.014782] ? sched_clock_noinstr+0x9/0x10<br />
[ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5<br />
[ 253.014808] ? local_clock_noinstr+0xe/0xc0<br />
[ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]<br />
[ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5<br />
[ 253.015029] mutex_lock_nested+0x1b/0x30<br />
[ 253.015044] ? mutex_lock_nested+0x1b/0x30<br />
[ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]<br />
[ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu]<br />
[ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu]<br />
[ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu]<br />
[ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu]<br />
[ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5<br />
[ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu]<br />
[ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu]<br />
[ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu]<br />
[ 253.0170<br />
---truncated---
Severity CVSS v4.0: Pending analysis
Last modification:
06/02/2026