CVE-2025-39989
Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
18/04/2025
Last modified:
02/05/2025
Description
In the Linux kernel, the following vulnerability has been resolved:<br />
<br />
x86/mce: use is_copy_from_user() to determine copy-from-user context<br />
<br />
Patch series "mm/hwpoison: Fix regressions in memory failure handling",<br />
v4.<br />
<br />
## 1. What am I trying to do:<br />
<br />
This patchset resolves two critical regressions related to memory failure<br />
handling that have appeared in the upstream kernel since version 5.17, as<br />
compared to 5.10 LTS.<br />
<br />
- copyin case: poison found in a user page while the kernel is copying from user space<br />
- instr case: poison found while fetching instructions in user space<br />
<br />
## 2. What is the expected outcome and why<br />
<br />
- For copyin case:<br />
<br />
The kernel can recover from poison found while it is executing get_user() or<br />
copy_from_user(), provided those call sites get an error return and the kernel<br />
returns -EFAULT to the process instead of crashing. More specifically, the MCE<br />
handler checks the fixup handler type to decide whether an in-kernel #MC can be<br />
recovered. When EX_TYPE_UACCESS is found, the PC jumps to the recovery code<br />
specified in _ASM_EXTABLE_FAULT() and -EFAULT is returned to user space.<br />
<br />
- For instr case:<br />
<br />
If poison is found while fetching instructions in user space, full recovery<br />
is possible. The user process takes a #PF, and Linux allocates a new page and<br />
fills it by reading from storage.<br />
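
The copyin recovery path described above can be illustrated with a toy model. This is a minimal sketch, not kernel code: `toy_regs`, `extable_entry`, `FIXUP_UACCESS`, `find_fixup()` and `toy_handle_mc()` are hypothetical names invented for illustration, and the real exception table and MCE handler are considerably more involved.

```c
#include <stddef.h>
#include <stdbool.h>
#include <errno.h>

/* Hypothetical, simplified fixup types; the kernel defines many more. */
enum fixup_type { FIXUP_NONE, FIXUP_UACCESS };

/* One table entry: "if the instruction at insn faults, resume at fixup". */
struct extable_entry {
    unsigned long insn;
    unsigned long fixup;
    enum fixup_type type;
};

/* Toy register state: a program counter and a return-value register. */
struct toy_regs {
    unsigned long pc;
    long ret;
};

/* Linear search for the entry matching the faulting PC; NULL if none. */
static const struct extable_entry *
find_fixup(const struct extable_entry *tbl, size_t n, unsigned long pc)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].insn == pc)
            return &tbl[i];
    return NULL;
}

/*
 * Model of the recovery decision: the machine check is treated as
 * recoverable only when the faulting instruction has a user-access fixup.
 * On recovery the PC is redirected and the error becomes -EFAULT.
 */
static bool toy_handle_mc(const struct extable_entry *tbl, size_t n,
                          struct toy_regs *regs)
{
    const struct extable_entry *e = find_fixup(tbl, n, regs->pc);

    if (!e || e->type != FIXUP_UACCESS)
        return false;        /* no safe recovery path in this model */

    regs->pc = e->fixup;     /* jump to the recovery code */
    regs->ret = -EFAULT;     /* the copy fails instead of crashing */
    return true;
}
```

In this model, a fault at an address with a `FIXUP_UACCESS` entry leaves `regs->ret` set to `-EFAULT` and `regs->pc` pointing at the fixup, while a fault at any other address is reported as unrecoverable and leaves the registers untouched.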
<br />
<br />
## 3. What actually happens and why<br />
<br />
- For copyin case: kernel panic since v5.17<br />
<br />
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new<br />
extable fixup type, EX_TYPE_EFAULT_REG, and later patches updated the<br />
extable fixup type for copy-from-user operations, changing it from<br />
EX_TYPE_UACCESS to EX_TYPE_EFAULT_REG. This breaks the previous EX_TYPE_UACCESS<br />
handling when poison is found in get_user() or copy_from_user().<br />
<br />
- For instr case: the user process is killed by a SIGBUS signal due to a race<br />
between #CMCI and #MCE<br />
<br />
When an uncorrected memory error is consumed there is a race between the<br />
CMCI from the memory controller reporting an uncorrected error with a UCNA<br />
signature, and the core reporting an SRAR signature machine check when<br />
the data is about to be consumed.<br />
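
The copyin regression described above can be sketched as a toy comparison between a check keyed on the fixup type and a check keyed on what the faulting instruction was doing. This is a hedged illustration only: the `TOY_` enum values, `toy_fault_site`, `recoverable_by_type()` and `recoverable_by_context()` are hypothetical, and the `reads_user_memory` flag stands in for whatever the is_copy_from_user() helper named in the patch title actually determines.

```c
#include <stdbool.h>

/* Type names follow the commit message; the numeric values are arbitrary. */
enum toy_ex_type {
    TOY_EX_TYPE_NONE,
    TOY_EX_TYPE_UACCESS,
    TOY_EX_TYPE_EFAULT_REG
};

struct toy_fault_site {
    enum toy_ex_type type;   /* fixup type recorded for the instruction */
    bool reads_user_memory;  /* assumption: supplied directly in this model */
};

/* Type-based check: only sites tagged EX_TYPE_UACCESS count as copyin. */
static bool recoverable_by_type(const struct toy_fault_site *s)
{
    return s->type == TOY_EX_TYPE_UACCESS;
}

/* Context-based check: asks what the access did, not how it was tagged. */
static bool recoverable_by_context(const struct toy_fault_site *s)
{
    return s->type != TOY_EX_TYPE_NONE && s->reads_user_memory;
}
```

Under these assumptions, a copy-from-user site retagged as `TOY_EX_TYPE_EFAULT_REG` is rejected by `recoverable_by_type()` but still accepted by `recoverable_by_context()`, which is the kind of mismatch the commit message attributes to the fixup type change.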
<br />
### Background: why *UN*corrected errors are tied to *C*MCI on Intel platforms [1]<br />
<br />
Prior to Icelake, memory controllers reported patrol scrub events that<br />
detected a previously unseen uncorrected error in memory by signaling a<br />
broadcast machine check with an SRAO (Software Recoverable Action<br />
Optional) signature in the machine check bank. This was overkill because<br />
it is not an urgent problem: no core is on the verge of consuming that<br />
bad data. It was also found that multiple SRAO UCEs may cause nested MCE<br />
interrupts and eventually become an IERR.<br />
<br />
Hence, Intel downgraded the machine check bank signature of patrol scrub<br />
from SRAO to UCNA (Uncorrected, No Action required), and changed the signal<br />
to #CMCI. Just to add to the confusion, Linux does take an action (in<br />
uc_decode_notifier()) to try to offline the page despite the UC*NA*<br />
signature name.<br />
<br />
### Background: why #CMCI and #MCE race when poison is being consumed on Intel platforms [1]<br />
<br />
Having decided that CMCI/UCNA is the best action for patrol scrub errors,<br />
the memory controller uses it for reads too. But the memory controller is<br />
executing asynchronously from the core, and can't tell the difference<br />
between a "real" read and a speculative read. So it will do CMCI/UCNA if<br />
an error is found in any read.<br />
<br />
Thus:<br />
<br />
1) Core is clever and thinks address A is needed soon, issues a<br />
speculative read.<br />
<br />
2) Core finds it is going to use address A soon after sending the read<br />
request<br />
<br />
3) The CMCI from the memory controller is in a race with MCE from the<br />
core that will soon try to retire the load from address A.<br />
<br />
Quite often (because speculation has got better) the CMCI from the memory<br />
controller is delivered before the core is committed to the instruction<br />
reading address A, so the interrupt is taken, and Linux offlines the page<br />
(marking it as poison).<br />
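
The ordering dependence in the steps above can be shown with a small simulation. This is a rough sketch under stated assumptions: `toy_event`, `toy_page`, `toy_outcome` and `toy_deliver()` are hypothetical names, and the model records only whether the page was already marked as poison when the machine check arrived, not what the kernel does next.

```c
#include <stdbool.h>

/* The two notifications that race for the same bad address. */
enum toy_event { TOY_CMCI, TOY_MCE };

struct toy_page {
    bool poisoned;               /* set once the page has been offlined */
};

struct toy_outcome {
    bool mce_saw_poisoned_page;  /* page already offlined when the MCE ran */
};

/* Deliver the two events in the given order and record what the MCE saw. */
static struct toy_outcome toy_deliver(const enum toy_event order[2])
{
    struct toy_page page = { false };
    struct toy_outcome out = { false };

    for (int i = 0; i < 2; i++) {
        if (order[i] == TOY_CMCI)
            page.poisoned = true;                       /* CMCI path offlines the page */
        else
            out.mce_saw_poisoned_page = page.poisoned;  /* MCE observes page state */
    }
    return out;
}
```

When the CMCI is delivered first, `mce_saw_poisoned_page` is true; when the MCE wins the race, it is false. The same hardware error can therefore reach the MCE path in two different page states.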
<br />
<br />
## Why the user process is killed in the instr case
<br />
Commit 046545a661af ("mm/hwpoison: fix error page recovered but reported<br />
"not<br />
---truncated---
Impact
References to Advisories, Solutions, and Tools
- https://git.kernel.org/stable/c/0b8388e97ba6a8c033f9a8b5565af41af07f9345
- https://git.kernel.org/stable/c/1a15bb8303b6b104e78028b6c68f76a0d4562134
- https://git.kernel.org/stable/c/3e3d8169c0950a0b3cd5105f6403a78350dcac80
- https://git.kernel.org/stable/c/449413da90a337f343cc5a73070cbd68e92e8a54
- https://git.kernel.org/stable/c/5724654a084f701dc64b08d34a0e800f22f0e6e4