CVE-2025-38334

Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
10/07/2025
Last modified:
10/07/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

x86/sgx: Prevent attempts to reclaim poisoned pages

TL;DR: SGX page reclaim touches the page to copy its contents to secondary storage. SGX instructions do not gracefully handle machine checks. Despite this, the existing SGX code will try to reclaim pages that it _knows_ are poisoned. Avoid even trying to reclaim poisoned pages.

The longer story:

Pages used by an enclave only get epc_page->poison set in arch_memory_failure() but they currently stay on sgx_active_page_list until sgx_encl_release(), with the SGX_EPC_PAGE_RECLAIMER_TRACKED flag untouched.

epc_page->poison is not checked in the reclaimer logic meaning that, if other conditions are met, an attempt will be made to reclaim an EPC page that was poisoned. This is bad because 1. we don't want that page to end up added to another enclave and 2. it is likely to cause one core to shut down and the kernel to panic.

Specifically, reclaiming uses microcode operations including "EWB" which accesses the EPC page contents to encrypt and write them out to non-SGX memory. Those operations cannot handle MCEs in their accesses other than by putting the executing core into a special shutdown state (affecting both threads with HT.) The kernel will subsequently panic on the remaining cores seeing the core didn't enter MCE handler(s) in time.

Call sgx_unmark_page_reclaimable() to remove the affected EPC page from sgx_active_page_list on memory error to stop it being considered for reclaiming.

Testing epc_page->poison in sgx_reclaim_pages() would also work but I assume it's better to add code in the less likely paths.

The affected EPC page is not added to &node->sgx_poison_page_list until later in sgx_encl_release()->sgx_free_epc_page() when it is EREMOVEd. Membership on other lists doesn't change to avoid changing any of the lists' semantics except for sgx_active_page_list. There's a "TBD" comment in arch_memory_failure() about pre-emptive actions, the goal here is not to address everything that it may imply.

This also doesn't completely close the time window when a memory error notification will be fatal (for a not previously poisoned EPC page) -- the MCE can happen after sgx_reclaim_pages() has selected its candidates or even *inside* a microcode operation (actually easy to trigger due to the amount of time spent in them.)

The spinlock in sgx_unmark_page_reclaimable() is safe because memory_failure() runs in process context and no spinlocks are held, explicitly noted in a mm/memory-failure.c comment.
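
The list handling described above can be illustrated with a small userspace sketch. This is a toy model, not the kernel patch: every `model_*` name, the `MODEL_RECLAIMER_TRACKED` flag, and the `apply_fix` switch are hypothetical stand-ins chosen to mirror the identifiers in the description, and locking and the real reclaimer's other conditions are left out. It only shows why unlinking a page from the active list at memory-error time keeps a later reclaim scan from ever seeing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; all names are hypothetical, not kernel symbols. */
#define MODEL_RECLAIMER_TRACKED 0x1u

struct model_epc_page {
    unsigned int flags;
    bool poison;
    struct model_epc_page *prev, *next; /* links on the active list */
};

struct model_active_list {
    struct model_epc_page head; /* sentinel node of a circular list */
};

static void model_list_init(struct model_active_list *l)
{
    l->head.prev = l->head.next = &l->head;
}

/* Append a page to the active list and flag it as reclaimer-tracked. */
static void model_mark_page_reclaimable(struct model_active_list *l,
                                        struct model_epc_page *p)
{
    assert(!(p->flags & MODEL_RECLAIMER_TRACKED));
    p->flags |= MODEL_RECLAIMER_TRACKED;
    p->prev = l->head.prev;
    p->next = &l->head;
    l->head.prev->next = p;
    l->head.prev = p;
}

/* Unlink a tracked page; returns false if it was not on the list. */
static bool model_unmark_page_reclaimable(struct model_epc_page *p)
{
    if (!(p->flags & MODEL_RECLAIMER_TRACKED))
        return false;
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->prev = p->next = p;
    p->flags &= ~MODEL_RECLAIMER_TRACKED;
    return true;
}

/* Memory-error path: set poison and, with the fix, unlink the page. */
static void model_memory_failure(struct model_epc_page *p, bool apply_fix)
{
    p->poison = true;
    if (apply_fix)
        model_unmark_page_reclaimable(p);
}

/* Count poisoned pages that a reclaim scan of the list would visit. */
static size_t model_poisoned_reclaim_candidates(const struct model_active_list *l)
{
    size_t n = 0;
    for (const struct model_epc_page *p = l->head.next; p != &l->head; p = p->next)
        if (p->poison)
            n++;
    return n;
}

/* Track three pages, poison the middle one, then scan the list. */
static size_t model_scenario(bool apply_fix)
{
    struct model_active_list list;
    struct model_epc_page pages[3] = {0};

    model_list_init(&list);
    for (size_t i = 0; i < 3; i++)
        model_mark_page_reclaimable(&list, &pages[i]);
    model_memory_failure(&pages[1], apply_fix);
    return model_poisoned_reclaim_candidates(&list);
}
```

In this model, `model_scenario(false)` leaves one poisoned page visible to the scan, while `model_scenario(true)` leaves none. As the description notes, the same effect could be reached by testing the poison flag inside the scan itself; unlinking in the error path keeps the extra work out of the more frequently executed reclaim path.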

Impact