CVE-2024-56552

Severity CVSS v4.0:
Pending analysis
Type:
CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition')
Publication date:
27/12/2024
Last modified:
23/09/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

drm/xe/guc_submit: fix race around suspend_pending

Currently in some testcases we can trigger:

xe 0000:03:00.0: [drm] Assertion `exec_queue_destroyed(q)` failed!
....
WARNING: CPU: 18 PID: 2640 at drivers/gpu/drm/xe/xe_guc_submit.c:1826 xe_guc_sched_done_handler+0xa54/0xef0 [xe]
xe 0000:03:00.0: [drm] *ERROR* GT1: DEREGISTER_DONE: Unexpected engine state 0x00a1, guc_id=57

Looking at a snippet of the corresponding ftrace for this GuC id we can see:

162.673311: xe_sched_msg_add: dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
162.673317: xe_sched_msg_recv: dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
162.673319: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
162.674089: xe_exec_queue_kill: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
162.674108: xe_exec_queue_close: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
162.674488: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
162.678452: xe_exec_queue_deregister: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa1, flags=0x0

It looks like we try to suspend the queue (opcode=3), setting suspend_pending and triggering a disable_scheduling. The user then closes the queue. However, the close also forcefully signals the suspend fence after killing the queue. Later, when the G2H response for disable_scheduling comes back, suspend_pending has already been cleared when signalling the suspend fence, so the disable_scheduling handler now incorrectly tries to also deregister the queue. This leads to warnings, since the queue has not yet even been marked for destruction. We also trigger errors later when trying to double unregister the same queue.

To fix this, tweak the ordering when handling the response to ensure we don't race with a disable_scheduling that didn't actually intend to perform an unregister. The destruction path should now also correctly wait for any pending_disable before marking the queue as destroyed.

(cherry picked from commit f161809b362f027b6d72bd998e47f8f0bad60a2e)

Vulnerable products and versions

CPE From Up to
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* 6.8 (including) 6.12.4 (excluding)