CVE-2025-21880

Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
27/03/2025
Last modified:
27/03/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

drm/xe/userptr: fix EFAULT handling

Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal.

However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT.

This explains an internal user report of hitting:

[ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe]
[ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738690] Call Trace:
[ 191.738692]
[ 191.738694] ? show_regs+0x69/0x80
[ 191.738698] ? __warn+0x93/0x1a0
[ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738759] ? report_bug+0x18f/0x1a0
[ 191.738764] ? handle_bug+0x63/0xa0
[ 191.738767] ? exc_invalid_op+0x19/0x70
[ 191.738770] ? asm_exc_invalid_op+0x1b/0x20
[ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738834] ? ret_from_fork_asm+0x1a/0x30
[ 191.738849] bind_op_prepare+0x105/0x7b0 [xe]
[ 191.738906] ? dma_resv_reserve_fences+0x301/0x380
[ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe]
[ 191.738966] ? kmemleak_alloc+0x4b/0x80
[ 191.738973] ops_execute+0x188/0x9d0 [xe]
[ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe]
[ 191.739098] ? trace_hardirqs_on+0x4d/0x60
[ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe]

Followed by NPD, when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This is confirmed to fix the user report.

v2 (MattB):
- Move earlier.
v3 (MattB):
- Update the commit message to make it clear that this indeed fixes the issue.

(cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618)
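
The sequence described above (a successful pin, a retry before the rebind step, then EFAULT on re-validation) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_vma, toy_pin_pages(), toy_userptr_pin(), toy_rebind_hits_bug() and toy_scenario() are hypothetical names invented for this illustration, the rebind list is reduced to a single flag, and none of it is the actual xe driver code or the upstream patch. It only shows why a vma that hit the non-fatal EFAULT has to be taken off the rebind list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a userptr vma; not the real xe structure. */
struct toy_vma {
    bool pages_valid;    /* would faulting in the user pages succeed? */
    bool sg_populated;   /* true only after a successful pin */
    bool on_rebind_list; /* models membership of the rebind list */
};

/* Models the page pin: fails with -EFAULT once the user mapping is gone. */
static int toy_pin_pages(struct toy_vma *vma)
{
    if (!vma->pages_valid) {
        vma->sg_populated = false;
        return -EFAULT;
    }
    vma->sg_populated = true;
    return 0;
}

/*
 * Models the pin step. EFAULT is treated as non-fatal in both variants;
 * with apply_fix set, the vma is also dropped from the rebind list.
 */
static int toy_userptr_pin(struct toy_vma *vma, bool apply_fix)
{
    int err = toy_pin_pages(vma);

    if (err == -EFAULT) {
        if (apply_fix)
            vma->on_rebind_list = false; /* skip rebind for this vma */
        return 0; /* non-fatal: assume the user simply unmapped it */
    }
    if (err)
        return err;

    vma->on_rebind_list = true; /* queue for the later rebind step */
    return 0;
}

/* The rebind step goes wrong if it sees a queued vma without a valid sg. */
static bool toy_rebind_hits_bug(const struct toy_vma *vma)
{
    return vma->on_rebind_list && !vma->sg_populated;
}

/* Replays the scenario; returns true if the rebind step would misbehave. */
static bool toy_scenario(bool apply_fix)
{
    struct toy_vma vma = { .pages_valid = true };
    int err;

    err = toy_userptr_pin(&vma, apply_fix); /* first pin succeeds */
    assert(err == 0 && vma.on_rebind_list);

    vma.pages_valid = false;                /* user unmaps before rebind */
    err = toy_userptr_pin(&vma, apply_fix); /* retry now hits EFAULT */
    assert(err == 0);                       /* still reported as success */

    return toy_rebind_hits_bug(&vma);
}
```

In this model, toy_scenario(false) returns true (the stale vma reaches the rebind step with no sg populated), while toy_scenario(true) returns false because the vma was removed from the list when EFAULT was seen.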

Impact