CVE-2025-40230
Severity:
Pending analysis
Type:
Not available / Other type
Publication date:
04/12/2025
Last modified:
04/12/2025
Description
In the Linux kernel, the following vulnerability has been resolved:

mm: prevent poison consumption when splitting THP

When performing memory error injection on a THP (Transparent Huge Page)
mapped to userspace on an x86 server, the kernel panics with the following
trace. The expected behavior is to terminate the affected process instead
of panicking the kernel, as the x86 Machine Check code can recover from an
in-userspace #MC.

mce: [Hardware Error]: CPU 0: Machine Check Exception: f Bank 3: bd80000000070134
mce: [Hardware Error]: RIP 10: {memchr_inv+0x4c/0xf0}
mce: [Hardware Error]: TSC afff7bbff88a ADDR 1d301b000 MISC 80 PPIN 1e741e77539027db
mce: [Hardware Error]: PROCESSOR 0:d06d0 TIME 1758093249 SOCKET 0 APIC 0 microcode 80000320
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
Kernel panic - not syncing: Fatal local machine check

The root cause of this panic is that handling a memory failure triggered
by an in-userspace #MC requires splitting the THP. The splitting
process uses a mechanism, implemented in
try_to_map_unused_to_zeropage(), which reads the pages in the THP to
identify zero-filled pages. However, reading the pages in the THP results
in a second, in-kernel #MC that occurs before the initial memory_failure()
completes, ultimately leading to a kernel panic. The call trace below
shows the two #MCs that lead to the kernel panic.

First Machine Check occurs // [1]
  memory_failure() // [2]
    try_to_split_thp_page()
      split_huge_page()
        split_huge_page_to_list_to_order()
          __folio_split() // [3]
            remap_page()
              remove_migration_ptes()
                remove_migration_pte()
                  try_to_map_unused_to_zeropage() // [4]
                    memchr_inv() // [5]
                      Second Machine Check occurs // [6]
                        Kernel panic

[1] Triggered by accessing a hardware-poisoned THP in userspace, which is
typically recoverable by terminating the affected process.

[2] Call folio_set_has_hwpoisoned() before try_to_split_thp_page().

[3] Pass the RMP_USE_SHARED_ZEROPAGE remap flag to remap_page().

[4] Try to map the unused THP to zeropage.

[5] Re-access pages in the hw-poisoned THP in the kernel.

[6] Triggered in-kernel, leading to a kernel panic.

In Step [2], memory_failure() sets the poisoned flag on the page in the THP
via TestSetPageHWPoison() before calling try_to_split_thp_page().

As suggested by David Hildenbrand, fix this panic by not accessing the
poisoned page in the THP during zeropage identification, while continuing
to scan unaffected pages in the THP for possible zeropage mapping. This
prevents a second in-kernel #MC that would cause a kernel panic in Step [4].
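
The idea behind the fix can be illustrated with a minimal userspace sketch. This is not the kernel patch: struct subpage, SUBPAGE_SIZE, is_zero_filled(), can_map_to_zeropage() and count_zeropage_candidates() are hypothetical names introduced only for illustration, and the byte loop stands in for the kernel's memchr_inv(). The point it shows is the ordering: the poison flag is tested before any byte of the page is read, and the remaining pages are still scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBPAGE_SIZE 4096 /* illustrative stand-in for PAGE_SIZE */

/* Hypothetical userspace model of one subpage of a huge page. */
struct subpage {
    bool hwpoison;                    /* models the per-page HWPoison flag */
    unsigned char data[SUBPAGE_SIZE]; /* page contents */
};

/* Return true if every byte in buf is zero (stand-in for a memchr_inv() scan). */
static bool is_zero_filled(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            return false;
    }
    return true;
}

/*
 * Decide whether a subpage may be remapped to the shared zero page.
 * The poison check comes first, so the contents of a poisoned subpage
 * are never read.
 */
static bool can_map_to_zeropage(const struct subpage *sp)
{
    if (sp->hwpoison)
        return false;
    return is_zero_filled(sp->data, SUBPAGE_SIZE);
}

/* Count zeropage candidates: poisoned subpages are skipped, the rest are scanned. */
static size_t count_zeropage_candidates(const struct subpage *pages, size_t n)
{
    size_t count = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (can_map_to_zeropage(&pages[i]))
            count++;
    }
    return count;
}
```

In this model a poisoned subpage is simply reported as not eligible for zeropage mapping, which mirrors the described behavior of leaving the poisoned page untouched while the other pages in the THP remain candidates.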

Thanks to Andrew Zaborowski for his initial work on fixing this issue.



