CVE-2023-53087
Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
02/05/2025
Last modified:
05/05/2025
Description
In the Linux kernel, the following vulnerability has been resolved:

drm/i915/active: Fix misuse of non-idle barriers as fence trackers

Users reported oopses caused by list corruption when using i915 perf with a
number of concurrently running graphics applications. Root cause analysis
pointed at an issue in the barrier processing code: a race between perf open /
close, which replaces active barriers with perf requests on the kernel context,
and concurrent barrier preallocate / acquire operations performed during user
context first pin / last unpin.

When adding a request to a composite tracker, we try to reuse an existing
fence tracker, already allocated and registered with that composite. The
tracker we obtain may already track another fence, may be an idle barrier,
or may be an active barrier.

If the tracker we get turns out to be a non-idle barrier, we try to delete that
barrier from the list of barrier tasks it belongs to. However, while doing
that we don't check the return value of the function that performs the
barrier deletion. Should the deletion ever fail, we would end up reusing
a tracker still registered as a barrier task. Since the same structure
field is used for both the fence callback list and the barrier tasks list,
list corruption would likely occur.

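The hazard of sharing one link field between two lists can be sketched in plain C. This is a simplified user-space illustration, not the i915 code: `struct link`, `push()`, `contains()` and the list names are hypothetical, and both lists are modelled as singly linked lists for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical link node; one such field is shared by both lists. */
struct link {
    struct link *next;
};

/* Push a node onto the front of a singly linked list. */
static void push(struct link **head, struct link *n)
{
    n->next = *head;
    *head = n;
}

/* Report whether a node is reachable from the list head. */
static bool contains(const struct link *head, const struct link *n)
{
    for (; head; head = head->next)
        if (head == n)
            return true;
    return false;
}

static bool demo_shared_link_corruption(void)
{
    struct link other_barrier = { NULL }, callback = { NULL }, tracker = { NULL };
    struct link *barrier_tasks = NULL, *fence_callbacks = NULL;

    push(&barrier_tasks, &other_barrier);
    push(&barrier_tasks, &tracker);     /* barrier_tasks: tracker -> other_barrier */
    push(&fence_callbacks, &callback);  /* fence_callbacks: callback */
    assert(contains(barrier_tasks, &other_barrier));

    /* Assume deleting tracker from barrier_tasks failed, the result was
     * ignored, and the same link field is now reused for the callbacks. */
    push(&fence_callbacks, &tracker);

    /* barrier_tasks still starts at tracker, whose next pointer now leads
     * into the callback list: other_barrier is lost, callback is reachable. */
    return !contains(barrier_tasks, &other_barrier) &&
           contains(barrier_tasks, &callback);
}
```

In this sketch `demo_shared_link_corruption()` returns true: after the unchecked reuse, walking the barrier list no longer reaches the other barrier and instead runs into the callback list.
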
Barriers are currently deleted from a barrier tasks list by temporarily removing
the list content, traversing that content while skipping the node to be
deleted, then populating the list back with the modified content. If such
intentionally racy concurrent deletion attempts are not serialized,
one or more of them may fail because the list is temporarily empty.

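The deletion scheme described above, and the window in which a second deleter fails, can be approximated with C11 atomics. This is a minimal sketch under assumptions: `struct lnode`, `struct lhead` and the `list_*` helpers are hypothetical names, and the race is simulated sequentially in one thread rather than with real concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct lnode { struct lnode *next; };
struct lhead { _Atomic(struct lnode *) first; };

/* Atomically detach the whole list, leaving the head empty. */
static struct lnode *list_del_all(struct lhead *h)
{
    return atomic_exchange(&h->first, NULL);
}

/* Lock-free push of a single node onto the head. */
static void list_add(struct lhead *h, struct lnode *n)
{
    struct lnode *old = atomic_load(&h->first);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Detach everything, re-add all nodes except target.
 * Returns true only if target was found in the detached content. */
static bool list_del_node(struct lhead *h, struct lnode *target)
{
    struct lnode *pos = list_del_all(h);
    bool found = false;

    while (pos) {
        struct lnode *next = pos->next;
        if (pos == target)
            found = true;
        else
            list_add(h, pos);
        pos = next;
    }
    return found;
}

static bool demo_second_deleter_fails(void)
{
    struct lhead h;
    struct lnode a, b;

    atomic_init(&h.first, NULL);
    list_add(&h, &a);
    list_add(&h, &b);                       /* list: b -> a */

    /* Deleter 1 (removing a) has detached the list but not refilled it. */
    struct lnode *detached = list_del_all(&h);
    assert(detached == &b);

    /* Deleter 2 runs in that window, sees an empty list, cannot find b. */
    bool second_ok = list_del_node(&h, &b);

    /* Deleter 1 finishes: put back everything except a. */
    for (struct lnode *pos = detached, *next; pos; pos = next) {
        next = pos->next;
        if (pos != &a)
            list_add(&h, pos);
    }

    /* b is back on the list although deleter 2 wanted it removed. */
    return !second_ok && atomic_load(&h.first) == &b;
}
```

The boolean returned by `list_del_node()` is the only signal the second deleter gets that its node is still listed, which is why ignoring it is unsafe once callers are no longer serialized.
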
The code that ignores the result of barrier deletion was initially
introduced in v5.4 by commit d8af05ff38ae ("drm/i915: Allow sharing the
idle-barrier from other kernel requests"). However, all users of the
barrier deletion routine were apparently serialized at that time, so the
issue did not manifest. Results of git bisect, with the help of a newly
developed igt@gem_barrier_race@remote-request IGT test, indicate that list
corruptions might start to appear after commit 311770173fac ("drm/i915/gt:
Schedule request retirement when timeline idles"), introduced in v5.5.

Respect the results of barrier deletion attempts: mark the barrier as idle
only if it was successfully deleted from the list. Then, before proceeding with
setting our fence as the one currently tracked, make sure that the tracker
we've got is not a non-idle barrier. If that check fails, don't use
that tracker but go back and try to acquire a new, usable one.

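The shape of the fix can be sketched as follows. This is a hedged user-space model, not the kernel patch: `struct tracker`, its `deletable` flag, `barrier_del()`, `replace_barrier()` and `acquire_tracker()` are hypothetical stand-ins, and walking a fixed pool stands in for going back to acquire another tracker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum state { IDLE, TRACKING_FENCE, ACTIVE_BARRIER };

struct tracker {
    enum state state;
    bool deletable;  /* hypothetical: would barrier deletion succeed? */
    int fence;       /* hypothetical fence identifier */
};

/* Stand-in for barrier deletion; reports whether the node was removed. */
static bool barrier_del(struct tracker *t)
{
    return t->deletable;
}

/* Mark the barrier idle only if it was really deleted from its list. */
static bool replace_barrier(struct tracker *t)
{
    if (t->state != ACTIVE_BARRIER)
        return false;
    if (!barrier_del(t))
        return false;  /* still registered as a barrier task: hands off */
    t->state = IDLE;
    return true;
}

/* Keep trying candidates until one is not a non-idle barrier. */
static struct tracker *acquire_tracker(struct tracker *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct tracker *t = &pool[i];

        replace_barrier(t);
        if (t->state != ACTIVE_BARRIER)
            return t;  /* safe to start tracking a fence here */
    }
    return NULL;       /* no usable tracker found */
}

static int demo_add_fence(void)
{
    struct tracker pool[2] = {
        { ACTIVE_BARRIER, false, 0 },  /* deletion fails: must be skipped */
        { ACTIVE_BARRIER, true, 0 },   /* deletion succeeds: reusable */
    };
    struct tracker *t = acquire_tracker(pool, 2);

    assert(t != NULL);
    t->state = TRACKING_FENCE;
    t->fence = 42;
    return (int)(t - pool);  /* index of the tracker that was used */
}
```

Here `demo_add_fence()` picks the second tracker and leaves the first one untouched as an active barrier, mirroring the rule that a tracker whose deletion failed must not be reused.
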
v3: use unlikely() to document what outcome we expect (Andi),
- fix bad grammar in commit description.
v2: no code changes,
- blame commit 311770173fac ("drm/i915/gt: Schedule request retirement
when timeline idles"), v5.5, not commit d8af05ff38ae ("drm/i915: Allow
sharing the idle-barrier from other kernel requests"), v5.4,
- reword commit description.

(cherry picked from commit 506006055769b10d1b2b4e22f636f3b45e0e9fc7)
Impact
References to Advisories, Solutions, and Tools
- https://git.kernel.org/stable/c/5c7591b8574c52c56b3994c2fbef1a3a311b5715
- https://git.kernel.org/stable/c/5e784a7d07af42057c0576fb647b482f4cb0dc2c
- https://git.kernel.org/stable/c/6ab7d33617559cced63d467928f478ea5c459021
- https://git.kernel.org/stable/c/9159db27fb19bbf1c91b5c9d5285e66cc96cc5ff
- https://git.kernel.org/stable/c/e0e6b416b25ee14716f3549e0cbec1011b193809