Instituto Nacional de Ciberseguridad (INCIBE). INCIBE-CERT section

CVE-2025-39756

Severity:
Pending analysis
Type:
Not available / Other
Publication date:
11/09/2025
Last modified:
03/11/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

fs: Prevent file descriptor table allocations exceeding INT_MAX

When sysctl_nr_open is set to a very high value (for example, 1073741816, as set by systemd), processes attempting to use file descriptors near the limit can trigger massive memory allocation attempts that exceed INT_MAX, resulting in a WARNING in mm/slub.c:

    WARNING: CPU: 0 PID: 44 at mm/slub.c:5027 __kvmalloc_node_noprof+0x21a/0x288

This happens because kvmalloc_array() and kvmalloc() check whether the requested size exceeds INT_MAX and emit a warning when the allocation is not flagged with __GFP_NOWARN.

Specifically, when nr_open is set to 1073741816 (0x3ffffff8) and a process calls dup2(oldfd, 1073741880), the kernel attempts to allocate:
- File descriptor array: 1073741880 * 8 bytes = 8,589,935,040 bytes
- Multiple bitmaps: ~400MB
- Total allocation size: > 8GB (exceeding INT_MAX = 2,147,483,647)

Reproducer:
1. Set /proc/sys/fs/nr_open to 1073741816:

    # echo 1073741816 > /proc/sys/fs/nr_open

2. Run a program that uses a high file descriptor:

    #include <unistd.h>        /* dup2() */
    #include <sys/resource.h>  /* struct rlimit, setrlimit() */

    int main() {
        /* Soft and hard RLIMIT_NOFILE values requested for this process */
        struct rlimit rlim = {1073741824, 1073741824};
        setrlimit(RLIMIT_NOFILE, &rlim);
        dup2(2, 1073741880); // Triggers the warning
        return 0;
    }

3. Observe WARNING in dmesg at mm/slub.c:5027

systemd commit a8b627a introduced automatic bumping of fs.nr_open to the maximum possible value. The rationale was that systems with memory control groups (memcg) no longer need separate file descriptor limits since memory is properly accounted. However, this change overlooked that:

1. The kernel's allocation functions still enforce INT_MAX as a maximum size regardless of memcg accounting
2. Programs and tests that legitimately test file descriptor limits can inadvertently trigger massive allocations
3. The resulting allocations (>8GB) are impractical and will always fail

systemd's algorithm starts with INT_MAX and keeps halving the value until the kernel accepts it. On most systems, this results in nr_open being set to 1073741816 (0x3ffffff8), which is just under 2^30 (roughly 1.07 billion) file descriptors.

While processes rarely use file descriptors near this limit in normal operation, certain selftests (like tools/testing/selftests/core/unshare_test.c) and programs that test file descriptor limits can trigger this issue.

Fix this by adding a check in alloc_fdtable() to ensure the requested allocation size does not exceed INT_MAX. This causes the operation to fail with -EMFILE instead of triggering a kernel warning and avoids the impractical >8GB memory allocation request.
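
The size arithmetic and the general shape of the fix described above can be sketched in user-space C. This is an illustration only, not the kernel's code: the names fd_array_bytes and fdtable_alloc_check are hypothetical, the 8-byte slot size assumes a 64-bit kernel, and the real check in alloc_fdtable() may differ in detail (for example, in how the bitmaps are accounted for).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: 64-bit kernel, so each fd slot holds an 8-byte pointer. */
#define FD_SLOT_BYTES 8ULL

/* Bytes needed for the fd pointer array alone (bitmaps excluded). */
static uint64_t fd_array_bytes(uint64_t nr_fds)
{
    return nr_fds * FD_SLOT_BYTES;
}

/*
 * Hypothetical stand-in for the check described in the fix:
 * refuse with -EMFILE when the fd array would exceed INT_MAX bytes,
 * instead of passing an oversized request on to the allocator.
 */
static int fdtable_alloc_check(uint64_t nr_fds)
{
    if (fd_array_bytes(nr_fds) > (uint64_t)INT_MAX)
        return -EMFILE;
    return 0;
}
```

With these definitions, fdtable_alloc_check(1073741880) returns -EMFILE because the array alone would need 8,589,935,040 bytes, while a table of 1024 descriptors (8,192 bytes) passes the check.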
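
The halving behaviour attributed to systemd above can be simulated as follows. This is a sketch under stated assumptions, not systemd's source: the kernel-side ceiling ASSUMED_NR_OPEN_MAX and the rounding of each candidate down to a multiple of the pointer size are assumptions chosen so that the simulation reproduces the 1073741816 figure quoted in the description, and kernel_accepts and pick_nr_open are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: ceiling enforced for fs.nr_open on a 64-bit kernel
 * (INT_MAX rounded down to a multiple of 64). */
#define ASSUMED_NR_OPEN_MAX (INT_MAX & ~63)

/* Hypothetical stand-in for a write to /proc/sys/fs/nr_open. */
static bool kernel_accepts(int value)
{
    return value > 0 && value <= ASSUMED_NR_OPEN_MAX;
}

/* Start at INT_MAX and halve until the simulated kernel accepts. */
static int pick_nr_open(void)
{
    int v = INT_MAX;

    while (v > 0) {
        /* Assumption: round down to a multiple of the pointer size (8). */
        v &= ~7;
        if (kernel_accepts(v))
            return v;
        v /= 2;
    }
    return -1; /* no candidate was accepted */
}
```

Under these assumptions the first candidate (2147483640) is rejected and the second (1073741816, or 0x3ffffff8) is accepted, matching the value the description reports on most systems.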

Impact