CVE-2026-23435
Publication date:
03/04/2026
In the Linux kernel, the following vulnerability has been resolved:

perf/x86: Move event pointer setup earlier in x86_pmu_enable()

A production AMD EPYC system crashed with a NULL pointer dereference
in the PMU NMI handler:

  BUG: kernel NULL pointer dereference, address: 0000000000000198
  RIP: x86_perf_event_update+0xc/0xa0
  Call Trace:
   amd_pmu_v2_handle_irq+0x1a6/0x390
   perf_event_nmi_handler+0x24/0x40

The faulting instruction is `cmpq $0x0, 0x198(%rdi)` with RDI=0,
corresponding to the `if (unlikely(!hwc->event_base))` check in
x86_perf_event_update() where hwc = &event->hw and event is NULL.
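
The link between the fault address and the NULL event pointer can be
illustrated with a small userspace sketch. The struct names and the
padding below are hypothetical stand-ins; only the total offset 0x198
is taken from the report, and the real struct perf_event layout depends
on kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of struct hw_perf_event: only the field of interest. */
struct hw_perf_event_model {
    unsigned long event_base;
};

/*
 * Hypothetical model of struct perf_event. The padding is chosen purely so
 * that hw.event_base lands at offset 0x198, matching the displacement in
 * the faulting instruction `cmpq $0x0, 0x198(%rdi)`.
 */
struct perf_event_model {
    unsigned char pad[0x198];
    struct hw_perf_event_model hw;
};

static_assert(offsetof(struct perf_event_model, hw) == 0x198,
              "model layout must place hw.event_base at 0x198");

/* Address the CPU reads when evaluating hwc->event_base for a given event. */
static uintptr_t event_base_address(uintptr_t event)
{
    return event + offsetof(struct perf_event_model, hw)
                 + offsetof(struct hw_perf_event_model, event_base);
}
```

Under these assumptions, event_base_address(0) evaluates to 0x198, which
is the address reported in the BUG line: the load is computed from a
NULL event plus the field offset.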

drgn inspection of the vmcore on CPU 106 showed a mismatch between
cpuc->active_mask and cpuc->events[]:

  active_mask: 0x1e (bits 1, 2, 3, 4)
  events[1]:   0xff1100136cbd4f38 (valid)
  events[2]:   0x0 (NULL, but active_mask bit 2 set)
  events[3]:   0xff1100076fd2cf38 (valid)
  events[4]:   0xff1100079e990a90 (valid)

The event that should occupy events[2] was found in event_list[2]
with hw.idx=2 and hw.state=0x0, confirming x86_pmu_start() had run
(which clears hw.state and sets active_mask) but events[2] was
never populated.
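
The invariant violated in this dump (every counter marked in active_mask
must have a non-NULL events[] slot) can be expressed as a short check.
This is a userspace sketch, not kernel code: the function name is
hypothetical, pointers are modelled as 64-bit integers, and events[0] is
assumed to be empty because the report does not list it (its active_mask
bit is clear, so the value does not affect the result).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the index of the first counter whose active_mask bit is set while
 * its events[] slot is NULL (modelled as 0), or -1 if the two agree.
 */
static int first_active_without_event(uint64_t active_mask,
                                      const uint64_t *events, size_t n)
{
    assert(events != NULL && n <= 64);
    for (size_t i = 0; i < n; i++) {
        if (((active_mask >> i) & 1) && events[i] == 0)
            return (int)i;
    }
    return -1;
}

/* Values copied from the vmcore inspection of CPU 106. */
static const uint64_t dump_active_mask = 0x1e;
static const uint64_t dump_events[5] = {
    0, /* events[0]: not shown in the report; assumed empty */
    UINT64_C(0xff1100136cbd4f38),
    0, /* events[2]: NULL although active_mask bit 2 is set */
    UINT64_C(0xff1100076fd2cf38),
    UINT64_C(0xff1100079e990a90),
};
```

Calling first_active_without_event(dump_active_mask, dump_events, 5)
yields 2, the counter on which the NMI handler dereferenced NULL.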

Another event (event_list[0]) had hw.state=0x7 (STOPPED|UPTODATE|ARCH),
showing it was stopped when the PMU rescheduled events, confirming the
throttle-then-reschedule sequence occurred.

The root cause is commit 7e772a93eb61 ("perf/x86: Fix NULL event access
and potential PEBS record loss") which moved the cpuc->events[idx]
assignment out of x86_pmu_start() and into step 2 of x86_pmu_enable(),
after the PERF_HES_ARCH check. This broke any path that calls
pmu->start() without going through x86_pmu_enable() -- specifically
the unthrottle path:

  perf_adjust_freq_unthr_events()
    -> perf_event_unthrottle_group()
      -> perf_event_unthrottle()
        -> event->pmu->start(event, 0)
          -> x86_pmu_start()  // sets active_mask but not events[]

The race sequence is:

  1. A group of perf events overflows, triggering group throttle via
     perf_event_throttle_group(). All events are stopped: active_mask
     bits cleared, events[] preserved (x86_pmu_stop no longer clears
     events[] after commit 7e772a93eb61).

  2. While still throttled (PERF_HES_STOPPED), x86_pmu_enable() runs
     due to other scheduling activity. Stopped events that need to
     move counters get PERF_HES_ARCH set and events[old_idx] cleared.
     In step 2 of x86_pmu_enable(), PERF_HES_ARCH causes these events
     to be skipped -- events[new_idx] is never set.

  3. The timer tick unthrottles the group via pmu->start(). Since
     commit 7e772a93eb61 removed the events[] assignment from
     x86_pmu_start(), active_mask[new_idx] is set but events[new_idx]
     remains NULL.

  4. A PMC overflow NMI fires. The handler iterates active counters,
     finds active_mask[2] set, reads events[2] which is NULL, and
     crashes dereferencing it.

Move the cpuc->events[hwc->idx] assignment in x86_pmu_enable() to
before the PERF_HES_ARCH check, so that events[] is populated even
for events that are not immediately started. This ensures the
unthrottle path via pmu->start() always finds a valid event pointer.
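
The four-step race and the effect of the reordering can be reproduced in
a simplified userspace model. Everything below is an illustrative sketch
under stated assumptions: the model_* names, the HES_* constants, and the
single-event scenario are hypothetical, and the functions only mirror the
behaviour described above rather than the actual kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HES_STOPPED  0x01u
#define HES_UPTODATE 0x02u
#define HES_ARCH     0x04u
#define NUM_COUNTERS 8

struct model_event { int idx; unsigned state; };

struct model_cpuc {
    uint64_t active_mask;
    struct model_event *events[NUM_COUNTERS];
};

/* Models x86_pmu_start() after 7e772a93eb61: touches active_mask only. */
static void model_start(struct model_cpuc *c, struct model_event *e)
{
    assert(e->idx >= 0 && e->idx < NUM_COUNTERS);
    e->state = 0;
    c->active_mask |= 1ULL << e->idx;
}

/* Models x86_pmu_stop(): clears the active bit, leaves events[] alone. */
static void model_stop(struct model_cpuc *c, struct model_event *e)
{
    c->active_mask &= ~(1ULL << e->idx);
    e->state |= HES_STOPPED | HES_UPTODATE;
}

/* Models x86_pmu_enable() moving one event to new_idx. */
static void model_enable_move(struct model_cpuc *c, struct model_event *e,
                              int new_idx, bool fixed)
{
    /* Step 1: a stopped event that changes counters is tagged ARCH. */
    if (e->state & HES_STOPPED)
        e->state |= HES_ARCH;
    c->events[e->idx] = NULL;
    e->idx = new_idx;

    /* Step 2: the fix assigns events[] before the ARCH check. */
    if (fixed)
        c->events[e->idx] = e;
    if (e->state & HES_ARCH)
        return;                 /* skipped: the event stays stopped */
    if (!fixed)
        c->events[e->idx] = e;  /* old ordering: assignment after the check */
    model_start(c, e);
}

/* Models the NMI handler scan: first active counter with a NULL event, or -1. */
static int model_nmi_first_bad_counter(const struct model_cpuc *c)
{
    for (int i = 0; i < NUM_COUNTERS; i++)
        if (((c->active_mask >> i) & 1) && c->events[i] == NULL)
            return i;
    return -1;
}

static int run_scenario(bool fixed)
{
    struct model_cpuc c = {0};
    struct model_event e = { .idx = 1, .state = 0 };

    c.events[1] = &e;
    model_start(&c, &e);                    /* event running on counter 1 */
    model_stop(&c, &e);                     /* 1. group throttle */
    model_enable_move(&c, &e, 2, fixed);    /* 2. reschedule while throttled */
    model_start(&c, &e);                    /* 3. unthrottle via pmu->start() */
    return model_nmi_first_bad_counter(&c); /* 4. overflow NMI scan */
}
```

In this model, run_scenario(false) returns 2 (counter 2 is active with a
NULL event, as in the vmcore), while run_scenario(true) returns -1
because events[2] was populated before the PERF_HES_ARCH check skipped
the start.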
Severity: Pending analysis
Last modified:
03/04/2026