CVE-2024-57976

Severity CVSS v4.0: Pending analysis
Type: Unavailable / Other
Publication date: 27/02/2025
Last modified: 06/07/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

btrfs: do proper folio cleanup when cow_file_range() failed

[BUG]
When testing with COW fixup marked as BUG_ON() (this is involved with the
new pin_user_pages*() change, which should not result in new out-of-band
dirty pages), I hit a crash triggered by the BUG_ON() in the COW fixup
path.

This BUG_ON() happens just after a failed btrfs_run_delalloc_range():

  BTRFS error (device dm-2): failed to run delalloc range, root 348 ino 405 folio 65536 submit_bitmap 6-15 start 90112 len 106496: -28
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/extent_io.c:1444!
  Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
  CPU: 0 UID: 0 PID: 434621 Comm: kworker/u24:8 Tainted: G OE 6.12.0-rc7-custom+ #86
  Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
  Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
  pc : extent_writepage_io+0x2d4/0x308 [btrfs]
  lr : extent_writepage_io+0x2d4/0x308 [btrfs]
  Call trace:
   extent_writepage_io+0x2d4/0x308 [btrfs]
   extent_writepage+0x218/0x330 [btrfs]
   extent_write_cache_pages+0x1d4/0x4b0 [btrfs]
   btrfs_writepages+0x94/0x150 [btrfs]
   do_writepages+0x74/0x190
   filemap_fdatawrite_wbc+0x88/0xc8
   start_delalloc_inodes+0x180/0x3b0 [btrfs]
   btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
   shrink_delalloc+0x114/0x280 [btrfs]
   flush_space+0x250/0x2f8 [btrfs]
   btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
   process_one_work+0x164/0x408
   worker_thread+0x25c/0x388
   kthread+0x100/0x118
   ret_from_fork+0x10/0x20
  Code: aa1403e1 9402f3ef aa1403e0 9402f36f (d4210000)
  ---[ end trace 0000000000000000 ]---

[CAUSE]
That failure is mostly from cow_file_range(), where we can hit -ENOSPC.

Although the -ENOSPC is already a bug related to our space reservation
code, let's just focus on the error handling.

For example, we have the following dirty range [0, 64K) of an inode,
with 4K sector size and 4K page size:

  0        16K       32K       48K       64K
  |///////////////////////////////////////|
  |#######################################|

Where |///| means the pages are still dirty, and |###| means the extent io
tree has the EXTENT_DELALLOC flag.

- Enter extent_writepage() for page 0

- Enter btrfs_run_delalloc_range() for range [0, 64K)

- Enter cow_file_range() for range [0, 64K)

- Function btrfs_reserve_extent() only reserved one 16K extent
  So we created an extent map and an ordered extent for range [0, 16K)

  0        16K       32K       48K       64K
  |////////|//////////////////////////////|
  |        |##############################|

  And range [0, 16K) has its delalloc flag cleared.
  But since we haven't yet submitted any bio, the 4 involved pages are
  still dirty.

- Function btrfs_reserve_extent() returns with -ENOSPC
  Now we have to run error cleanup, which will clear all
  EXTENT_DELALLOC* flags and clear the dirty flags for the remaining
  ranges:

  0        16K       32K       48K       64K
  |////////|                              |
  |        |                              |

  Note that range [0, 16K) still has its pages dirty.

- Some time later, writeback is triggered again for the range [0, 16K)
  since the page range still has dirty flags.

- btrfs_run_delalloc_range() will do nothing because there is no
  EXTENT_DELALLOC flag.

- extent_writepage_io() finds page 0 has no ordered flag
  Which falls into the COW fixup path, triggering the BUG_ON().

Unfortunately this error handling bug dates back to the introduction of
btrfs. Thankfully with the abuse of COW fixup, at least it won't crash
the kernel.

[FIX]
Instead of immediately unlocking the extent and folios, we keep the extent
and folios locked until either erroring out or the whole delalloc range
has finished.

When the whole delalloc range has finished without error, we just unlock
the whole range with PAGE_SET_ORDERED (and PAGE_UNLOCK for !keep_locked
cases)
---truncated---
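
The [CAUSE] walkthrough can be traced with a small toy model. This is only an illustrative sketch, not kernel code: the `Page` class and the functions `failed_delalloc_run()` and `next_writeback()` are hypothetical names invented for this example, and each page carries just three booleans standing in for the dirty flag, the EXTENT_DELALLOC state of its range, and the ordered flag. It assumes, as the description states, that no page has the ordered flag set when writeback runs again. It models only the buggy path, since the fix description is truncated.

```python
from dataclasses import dataclass

PAGE_SIZE = 4 * 1024  # 4K pages and 4K sectors, as in the example


@dataclass
class Page:
    dirty: bool = True      # page cache dirty flag
    delalloc: bool = True   # stands in for EXTENT_DELALLOC on this page's range
    ordered: bool = False   # page "ordered" flag checked at writeback time


def failed_delalloc_run(pages, reserved_bytes):
    """Toy version of the failing cow_file_range() pass (hypothetical helper)."""
    reserved = reserved_bytes // PAGE_SIZE
    # First reservation succeeds: delalloc is cleared for [0, reserved_bytes),
    # but no bio has been submitted, so those pages stay dirty.
    for page in pages[:reserved]:
        page.delalloc = False
    # Second reservation fails with -ENOSPC: the error cleanup clears the
    # delalloc and dirty flags, but only for the remaining range.
    for page in pages[reserved:]:
        page.delalloc = False
        page.dirty = False


def next_writeback(pages):
    """Return indexes of pages that would fall into the COW fixup path."""
    hits = []
    for index, page in enumerate(pages):
        if not page.dirty:
            continue  # writeback skips clean pages
        # No delalloc flag means the delalloc run does nothing for this page,
        # and without the ordered flag the page ends up in COW fixup.
        if not page.delalloc and not page.ordered:
            hits.append(index)
    return hits


pages = [Page() for _ in range(64 * 1024 // PAGE_SIZE)]  # dirty range [0, 64K)
failed_delalloc_run(pages, reserved_bytes=16 * 1024)
fixup_pages = next_writeback(pages)
print(fixup_pages)  # the four pages covering [0, 16K)
```

In this model the stale state is exactly the one the description points at: pages that are dirty but have neither a delalloc range nor the ordered flag.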

Impact