CVE-2024-50066

Severity CVSS v4.0:
Pending analysis
Type:
CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition')
Publication date:
23/10/2024
Last modified:
07/03/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

mm/mremap: fix move_normal_pmd/retract_page_tables race

In mremap(), move_page_tables() looks at the type of the PMD entry and the specified address range to figure out by which method the next chunk of page table entries should be moved.

At that point, the mmap_lock is held in write mode, but no rmap locks are held yet. For PMD entries that point to page tables and are fully covered by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called, which first takes rmap locks, then does move_normal_pmd(). move_normal_pmd() takes the necessary page table locks at source and destination, then moves an entire page table from the source to the destination.

The problem is: the rmap locks, which protect against concurrent page table removal by retract_page_tables() in the THP code, are only taken after the PMD entry has been read and it has been decided how to move it.
So we can race as follows (with two processes that have mappings of the same tmpfs file that is stored on a tmpfs mount with huge=advise); note that process A accesses page tables through the MM while process B does it through the file rmap:

process A                      process B
=========                      =========
mremap
  mremap_to
    move_vma
      move_page_tables
        get_old_pmd
        alloc_new_pmd
                *** PREEMPT ***
                               madvise(MADV_COLLAPSE)
                                 do_madvise
                                   madvise_walk_vmas
                                     madvise_vma_behavior
                                       madvise_collapse
                                         hpage_collapse_scan_file
                                           collapse_file
                                             retract_page_tables
                                               i_mmap_lock_read(mapping)
                                               pmdp_collapse_flush
                                               i_mmap_unlock_read(mapping)
        move_pgt_entry(NORMAL_PMD, ...)
          take_rmap_locks
          move_normal_pmd
          drop_rmap_locks

When this happens, move_normal_pmd() can end up creating bogus PMD entries in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`. The effect depends on arch-specific and machine-specific details; on x86, you can end up with physical page 0 mapped as a page table, which is likely exploitable for user->kernel privilege escalation.

Fix the race by letting process A recheck that the PMD still points to a page table after the rmap locks have been taken. Otherwise, we bail and let the caller fall back to the PTE-level copying path, which will then bail immediately at the pmd_none() check.

Bug reachability: Reaching this bug requires that you can create shmem/file THP mappings - anonymous THP uses different code that doesn't zap stuff under rmap locks. File THP is gated on an experimental config flag (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need shmem THP to hit this bug.
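The shape of the fix can be illustrated with a minimal user-space sketch. This is not the kernel's actual code: pmd_t, the bit layout (bit 0 = present, bit 7 = leaf, loosely modeled on x86), and move_normal_pmd_recheck() are simplified stand-ins, and all locking is elided. It only shows the recheck pattern: after the rmap and page table locks are taken, re-read the PMD and bail unless it still points to a page table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified PMD model for illustration only:
 * bit 0 = present, bit 7 = leaf (huge page), roughly as on x86. */
typedef struct { uint64_t val; } pmd_t;

static bool pmd_present(pmd_t pmd) { return pmd.val & 0x1; }
static bool pmd_leaf(pmd_t pmd)    { return pmd.val & 0x80; }

/* Sketch of the recheck the fix adds: re-read the PMD under the locks
 * and refuse to proceed if it was concurrently cleared by
 * retract_page_tables() or turned into a huge leaf entry by collapse.
 * On false, the caller falls back to the PTE-level copying path, which
 * then bails immediately at its pmd_none() check. */
bool move_normal_pmd_recheck(const pmd_t *old_pmd)
{
    pmd_t pmd = *old_pmd;   /* re-read now that the locks are held */

    if (!pmd_present(pmd) || pmd_leaf(pmd))
        return false;       /* lost the race: bail out */

    /* ... safe to clear *old_pmd and move the whole page table ... */
    return true;
}
```

With this ordering, the decision of how to move the entry is made from a value read after the rmap locks are held, so retract_page_tables() can no longer invalidate it in between.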
As far as I know, getting shmem THP normally requires that you can mount your own tmpfs with the right mount flags, which would require creating your own user+mount namespace; though I don't know if some distros maybe enable shmem THP by default or something like that.

Bug impact: This issue can likely be used for user->kernel privilege escalation when it is reachable.

Vulnerable products and versions

CPE From Up to
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* 6.6 (including) 6.6.58 (excluding)
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* 6.7 (including) 6.11.5 (excluding)
cpe:2.3:o:linux:linux_kernel:6.12:rc1:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc2:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc3:*:*:*:*:*:*