CVE-2025-37988

Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
20/05/2025
Last modified:
21/05/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

fix a couple of races in MNT_TREE_BENEATH handling by do_move_mount()

Normally do_lock_mount(path, _) is locking a mountpoint pinned by *path and at the time when matching unlock_mount() unlocks that location it is still pinned by the same thing.

Unfortunately, for the 'beneath' case it's no longer that simple - the object being locked is not the one *path points to. It's the mountpoint of path->mnt. The thing is, without sufficient locking ->mnt_parent may change under us and none of the locks are held at that point. The rules are
* mount_lock stabilizes m->mnt_parent for any mount m.
* namespace_sem stabilizes m->mnt_parent, provided that m is mounted.
* if either of the above holds and refcount of m is positive, we are guaranteed the same for refcount of m->mnt_parent.

namespace_sem nests inside inode_lock(), so do_lock_mount() has to take inode_lock() before grabbing namespace_sem. It does recheck that path->mnt is still mounted in the same place after getting namespace_sem, and it does take care to pin the dentry. It is needed, since otherwise we might end up with racing mount --move (or umount) happening while we were getting locks; in that case dentry would no longer be a mountpoint and could've been evicted on memory pressure along with its inode - not something you want when grabbing lock on that inode.

However, pinning a dentry is not enough - the matching mount is also pinned only by the fact that path->mnt is mounted on top of it and at that point we are not holding any locks whatsoever, so the same kind of races could end up with all references to that mount gone just as we are about to enter inode_lock(). If that happens, we are left with filesystem being shut down while we are holding a dentry reference on it; results are not pretty.

What we need to do is grab both dentry and mount at the same time; that makes inode_lock() safe *and* avoids the problem with fs getting shut down under us. After taking namespace_sem we verify that path->mnt is still mounted (which stabilizes its ->mnt_parent) and check that it's still mounted at the same place. From that point on to the matching namespace_unlock() we are guaranteed that mount/dentry pair we'd grabbed are also pinned by being the mountpoint of path->mnt, so we can quietly drop both the dentry reference (as the current code does) and mnt one - it's OK to do under namespace_sem, since we are not dropping the final refs.

That solves the problem on do_lock_mount() side; unlock_mount() also has one, since dentry is guaranteed to stay pinned only until the namespace_unlock(). That's easy to fix - just have inode_unlock() done earlier, while it's still pinned by mp->m_dentry.
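
The ordering described above can be modelled in a small userspace C sketch. Everything in it is hypothetical: the toy_ types, toy_lock_beneath(), toy_unlock_mount() and the race hook are illustrative stand-ins, plain booleans replace real locks, and the code is single-threaded. It is not the kernel patch. It only shows the sequence: pin the parent mount and the mountpoint dentry together, take the inode lock and then namespace_sem, recheck the mount's position, and drop the temporary references.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are real kernel types or APIs. */
struct toy_dentry { int refs; bool inode_locked; };

struct toy_mount {
    int refs;
    bool mounted;
    struct toy_mount *parent;        /* models ->mnt_parent */
    struct toy_dentry *mountpoint;   /* models the mountpoint dentry */
};

bool toy_namespace_sem;              /* models namespace_sem being held */

/* Hypothetical targets for a simulated "mount --move". */
struct toy_mount toy_other_parent = { .refs = 1, .mounted = true };
struct toy_dentry toy_other_dentry = { .refs = 1 };

/* Hook type: simulates a racing operation in the window with no locks held. */
typedef void (*toy_race_fn)(struct toy_mount *m);

void toy_race_move(struct toy_mount *m)
{
    m->parent = &toy_other_parent;
    m->mountpoint = &toy_other_dentry;
}

/* Returns the locked mountpoint dentry, or NULL if the caller should retry. */
struct toy_dentry *toy_lock_beneath(struct toy_mount *m, toy_race_fn race)
{
    /* Step 1: snapshot parent and mountpoint together and pin BOTH
     * (the description says this must happen at the same time). */
    struct toy_mount *parent = m->parent;
    struct toy_dentry *dentry = m->mountpoint;
    parent->refs++;
    dentry->refs++;

    if (race)
        race(m);                     /* nothing is locked here */

    /* Step 2: inode lock first; namespace_sem nests inside it. */
    dentry->inode_locked = true;
    toy_namespace_sem = true;

    /* Step 3: recheck that m is still mounted at the same place. */
    bool stable = m->mounted && m->parent == parent &&
                  m->mountpoint == dentry;
    if (!stable) {
        /* Lost the race: release the locks before dropping the refs. */
        toy_namespace_sem = false;
        dentry->inode_locked = false;
    }

    /* Step 4: drop the temporary refs. On success they are not the
     * final ones, because m being mounted there pins the pair. */
    dentry->refs--;
    parent->refs--;
    return stable ? dentry : NULL;
}

void toy_unlock_mount(struct toy_dentry *dentry)
{
    /* Inode unlock happens first, while the dentry is still pinned. */
    dentry->inode_locked = false;
    toy_namespace_sem = false;
}
```

A caller in this model would retry when toy_lock_beneath() returns NULL. How the real do_lock_mount() handles a failed recheck is not stated in the description, so that part is an assumption of the sketch.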

Impact