CVE-2022-49124
Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
26/02/2025
Last modified:
26/02/2025
Description
In the Linux kernel, the following vulnerability has been resolved:

x86/mce: Work around an erratum on fast string copy instructions

A rare kernel panic can occur when the following conditions are met,
due to an erratum on fast string copy instructions:

1) An uncorrected error occurs.
2) That error is in the first cache line of a page.
3) The kernel executes copy_page from the page immediately before that
page.

The fast string copy instructions ("REP; MOVS*") could consume an
uncorrectable memory error in the cache line _right after_ the desired
region to copy and raise an MCE.

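To make the geometry of that over-read concrete, here is a minimal sketch in user-space C (not kernel code). It checks whether a poisoned address falls in the cache line immediately following a copy's source region. The 64-byte cache line size and the helper names `cache_line_of` and `poison_is_just_past_copy` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cache line size for illustration. */
#define CACHE_LINE_SIZE UINT64_C(64)

/* Round an address down to the start of its cache line. */
static uint64_t cache_line_of(uint64_t addr)
{
    return addr & ~(CACHE_LINE_SIZE - 1);
}

/*
 * Hypothetical helper: true when poison_addr lies in the cache line
 * right after the last cache line touched by a copy that reads
 * [src, src + len). That line is memory the copy never asked for.
 */
static bool poison_is_just_past_copy(uint64_t src, uint64_t len,
                                     uint64_t poison_addr)
{
    /* The caller must pass a range that does not wrap around. */
    assert(len <= UINT64_MAX - src);

    if (len == 0)
        return false;

    uint64_t last_line = cache_line_of(src + len - 1);

    return cache_line_of(poison_addr) == last_line + CACHE_LINE_SIZE;
}
```

For a full-page copy starting at 0x1000 with an assumed 4 KiB page, the only line that satisfies the check is the first cache line of the next page (0x2000 to 0x203f), which matches conditions 2) and 3) above.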
Bit 0 of MSR_IA32_MISC_ENABLE can be cleared to disable fast string
copy, which avoids such spurious machine checks. However, that is less
preferable due to the permanent performance impact. Since memory
poison is rare, it's desirable to keep fast string copy enabled until an
MCE is seen.

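As a rough illustration of what clearing bit 0 means, the sketch below operates on a plain 64-bit value standing in for the MSR contents. It is user-space C, not kernel code. The macro and function names are hypothetical, and the sample value is made up. Reading or writing the real MSR requires privileged rdmsr/wrmsr access, which is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural address of IA32_MISC_ENABLE (shown for reference only). */
#define IA32_MISC_ENABLE_INDEX   0x1a0u
/* Bit 0: fast string enable. Macro name is local to this sketch. */
#define MISC_ENABLE_FAST_STRING  (UINT64_C(1) << 0)

/* True if the fast string bit is set in the given MSR image. */
static bool fast_string_enabled(uint64_t misc_enable)
{
    return (misc_enable & MISC_ENABLE_FAST_STRING) != 0;
}

/* Return the MSR image with only the fast string bit cleared. */
static uint64_t disable_fast_string(uint64_t misc_enable)
{
    return misc_enable & ~MISC_ENABLE_FAST_STRING;
}

/* Usage: all other bits must be left untouched. */
void fast_string_demo(void)
{
    uint64_t image = UINT64_C(0x850089);   /* made-up MSR contents */

    assert(fast_string_enabled(image));
    image = disable_fast_string(image);
    assert(!fast_string_enabled(image));
}
```

Because the sketch only masks a value, it says nothing about when the kernel performs the write. The description above indicates it is deferred until an MCE is actually seen.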
Intel has confirmed the following:
1. The CPU erratum of fast string copy only applies to Skylake,
Cascade Lake and Cooper Lake generations.

Returning directly from the MCE handler:
2. Will result in complete execution of the "REP; MOVS*" with no data
loss or corruption.
3. Will not result in another MCE firing on the next poisoned cache line
due to "REP; MOVS*".
4. Will resume execution from a correct point in code.
5. Will result in the same instruction that triggered the MCE firing a
second MCE immediately for any other software recoverable data fetch
errors.
6. Is not safe without disabling the fast string copy, as the next fast
string copy of the same buffer on the same CPU would result in a PANIC
MCE.

This should mitigate the erratum completely, with the only caveat that
fast string copy is disabled on the affected hyper thread, which
degrades performance there.

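The decision flow described above can be sketched as a small user-space model in C. This is not the kernel's quirk function. The struct, enum, and function names are hypothetical, and the way the kernel recognises the erratum's MCE signature is not covered in this description, so it is reduced to a boolean input. The check that fast string copy is still enabled is an assumption drawn from point 6.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-hyper-thread state for this model. */
struct cpu_model {
    bool erratum_affected;     /* Skylake, Cascade Lake or Cooper Lake */
    bool fast_string_enabled;  /* models bit 0 of MSR_IA32_MISC_ENABLE */
};

enum mce_action {
    MCE_RETURN_DIRECTLY,   /* treat as spurious, resume execution */
    MCE_NORMAL_HANDLING    /* fall through to regular MCE handling */
};

/*
 * Model of the workaround: on the first MCE that looks like the
 * erratum, disable fast string copy on this CPU and return directly.
 * Once fast string copy is off, every MCE is handled normally.
 */
static enum mce_action handle_mce(struct cpu_model *cpu,
                                  bool looks_like_erratum)
{
    assert(cpu != NULL);

    if (cpu->erratum_affected && cpu->fast_string_enabled &&
        looks_like_erratum) {
        cpu->fast_string_enabled = false;
        return MCE_RETURN_DIRECTLY;
    }

    return MCE_NORMAL_HANDLING;
}
```

In this model the direct return is always paired with disabling fast string copy, which reflects point 6: returning without disabling it would leave the next copy of the same buffer on the same CPU exposed to a panic.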
This is still better than the OS crashing on MCEs raised on an
irrelevant process due to "REP; MOVS*" accesses in a kernel context,
e.g., copy_page.

Testing: errors were injected into the first cache line of 8 anonymous
pages of process 'proc1', and MCE consumption was observed from 'proc2'
with no panic (the handler returned directly).

Without the fix, the host panicked within a few minutes on a
random 'proc2' process due to kernel access from copy_page.

[ bp: Fix comment style + touch ups, zap an unlikely(), improve the
quirk function's readability. ]