CVE-2025-37964

Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
20/05/2025
Last modified:
21/05/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

x86/mm: Eliminate window where TLB flushes may be inadvertently skipped

tl;dr: There is a window in the mm switching code where the new CR3 is
set and the CPU should be getting TLB flushes for the new mm. But
should_flush_tlb() has a bug and suppresses the flush. Fix it by
widening the window where should_flush_tlb() sends an IPI.

Long Version:

=== History ===

There were a few things leading up to this.

First, updating mm_cpumask() was observed to be too expensive, so it was
made lazier. But being lazy caused too many unnecessary IPIs to CPUs
due to the now-lazy mm_cpumask(). So code was added to cull
mm_cpumask() periodically[2]. But that culling was a bit too aggressive
and skipped sending TLB flushes to CPUs that need them. So here we are
again.

=== Problem ===

The too-aggressive code in should_flush_tlb() strikes in this window:

    // Turn on IPIs for this CPU/mm combination, but only
    // if should_flush_tlb() agrees:
    cpumask_set_cpu(cpu, mm_cpumask(next));

    next_tlb_gen = atomic64_read(&next->context.tlb_gen);
    choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
    load_new_mm_cr3(need_flush);
    // ^ After 'need_flush' is set to false, IPIs *MUST*
    // be sent to this CPU and not be ignored.

    this_cpu_write(cpu_tlbstate.loaded_mm, next);
    // ^ Not until this point does should_flush_tlb()
    // become true!

should_flush_tlb() will suppress TLB flushes between load_new_mm_cr3()
and writing to 'loaded_mm', which is a window where they should not be
suppressed. Whoops.

=== Solution ===

Thankfully, the fuzzy "just about to write CR3" window is already marked
with loaded_mm==LOADED_MM_SWITCHING. Simply checking for that state in
should_flush_tlb() is sufficient to ensure that the CPU is targeted with
an IPI.

This will cause more TLB flush IPIs. But the window is relatively small
and I do not expect this to cause any kind of measurable performance
impact.

Update the comment where LOADED_MM_SWITCHING is written since it grew
yet another user.

Peter Z also raised a concern that should_flush_tlb() might not observe
'loaded_mm' and 'is_lazy' in the same order that switch_mm_irqs_off()
writes them. Add a barrier to ensure that they are observed in the
order they are written.
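
The decision described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: `struct mm_model`, `struct cpu_model`, `MODEL_MM_SWITCHING`, and both function names are hypothetical stand-ins, and the real should_flush_tlb() checks more conditions than the two modeled here ('is_lazy' and 'loaded_mm'). The memory barrier mentioned in the description is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's mm_struct (assumption). */
struct mm_model {
    int id;
};

/* Sentinel playing the role of LOADED_MM_SWITCHING (assumption). */
static struct mm_model switching_sentinel;
#define MODEL_MM_SWITCHING (&switching_sentinel)

/* Simplified per-CPU state as read by the CPU deciding on an IPI. */
struct cpu_model {
    struct mm_model *loaded_mm; /* loaded mm, or the switching sentinel */
    bool is_lazy;               /* CPU is in lazy TLB mode */
};

/* Model of the pre-fix check: only an exact mm match triggers a flush. */
bool should_flush_before_fix(const struct cpu_model *cpu,
                             const struct mm_model *target_mm)
{
    if (cpu->is_lazy)
        return false;
    return cpu->loaded_mm == target_mm;
}

/* Model of the post-fix check: the switching window also triggers a flush. */
bool should_flush_after_fix(const struct cpu_model *cpu,
                            const struct mm_model *target_mm)
{
    if (cpu->is_lazy)
        return false;
    /* Mid-switch, the CPU may hold state for the old or new mm: flush. */
    if (cpu->loaded_mm == MODEL_MM_SWITCHING)
        return true;
    return cpu->loaded_mm == target_mm;
}
```

With a CPU whose `loaded_mm` is `MODEL_MM_SWITCHING`, the first function returns false (the flush is skipped, which is the bug), while the second returns true. Once `loaded_mm` equals the target mm, both agree.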

Impact