CVE-2022-48920
Publication date: 22/08/2024
In the Linux kernel, the following vulnerability has been resolved:<br />
<br />
btrfs: get rid of warning on transaction commit when using flushoncommit<br />
<br />
When using the flushoncommit mount option, during almost every transaction<br />
commit we trigger a warning from __writeback_inodes_sb_nr():<br />
<br />
$ cat fs/fs-writeback.c:<br />
(...)<br />
static void __writeback_inodes_sb_nr(struct super_block *sb, ...<br />
{<br />
(...)<br />
WARN_ON(!rwsem_is_locked(&sb->s_umount));<br />
(...)<br />
}<br />
(...)<br />
<br />
The trace produced in dmesg looks like the following:<br />
<br />
[947.473890] WARNING: CPU: 5 PID: 930 at fs/fs-writeback.c:2610 __writeback_inodes_sb_nr+0x7e/0xb3<br />
[947.481623] Modules linked in: nfsd nls_cp437 cifs asn1_decoder cifs_arc4 fscache cifs_md4 ipmi_ssif<br />
[947.489571] CPU: 5 PID: 930 Comm: btrfs-transacti Not tainted 95.16.3-srb-asrock-00001-g36437ad63879 #186<br />
[947.497969] RIP: 0010:__writeback_inodes_sb_nr+0x7e/0xb3<br />
[947.502097] Code: 24 10 4c 89 44 24 18 c6 (...)<br />
[947.519760] RSP: 0018:ffffc90000777e10 EFLAGS: 00010246<br />
[947.523818] RAX: 0000000000000000 RBX: 0000000000963300 RCX: 0000000000000000<br />
[947.529765] RDX: 0000000000000000 RSI: 000000000000fa51 RDI: ffffc90000777e50<br />
[947.535740] RBP: ffff888101628a90 R08: ffff888100955800 R09: ffff888100956000<br />
[947.541701] R10: 0000000000000002 R11: 0000000000000001 R12: ffff888100963488<br />
[947.547645] R13: ffff888100963000 R14: ffff888112fb7200 R15: ffff888100963460<br />
[947.553621] FS: 0000000000000000(0000) GS:ffff88841fd40000(0000) knlGS:0000000000000000<br />
[947.560537] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033<br />
[947.565122] CR2: 0000000008be50c4 CR3: 000000000220c000 CR4: 00000000001006e0<br />
[947.571072] Call Trace:<br />
[947.572354] <br />
[947.573266] btrfs_commit_transaction+0x1f1/0x998<br />
[947.576785] ? start_transaction+0x3ab/0x44e<br />
[947.579867] ? schedule_timeout+0x8a/0xdd<br />
[947.582716] transaction_kthread+0xe9/0x156<br />
[947.585721] ? btrfs_cleanup_transaction.isra.0+0x407/0x407<br />
[947.590104] kthread+0x131/0x139<br />
[947.592168] ? set_kthread_struct+0x32/0x32<br />
[947.595174] ret_from_fork+0x22/0x30<br />
[947.597561] <br />
[947.598553] ---[ end trace 644721052755541c ]---<br />
<br />
This is because we started using writeback_inodes_sb() to flush delalloc<br />
when committing a transaction (when using -o flushoncommit), in order to<br />
avoid deadlocks with filesystem freeze operations. This change was made<br />
by commit ce8ea7cc6eb313 ("btrfs: don't call btrfs_start_delalloc_roots<br />
in flushoncommit"). After that change we started producing that warning,<br />
and users report it from time to time: it fires so often that it spams<br />
dmesg/syslog, and users cannot tell whether it reflects a problem that<br />
might compromise the filesystem's reliability.<br />
<br />
We cannot simply lock the sb->s_umount semaphore before calling<br />
writeback_inodes_sb(), because that would at least deadlock with<br />
filesystem freezing: fs/super.c:freeze_super() calls sync_filesystem()<br />
while holding that semaphore in write mode, and that can trigger a<br />
transaction commit, which would then block on the same semaphore. The<br />
unmount path would hit the same type of deadlock. It could possibly also<br />
introduce other locking dependencies that lockdep would report.<br />
<br />
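The deadlock described above can be illustrated with a minimal userspace sketch. It is an analogy, not kernel code: a POSIX rwlock stands in for sb->s_umount, and the helper names (blocking_read_lock_with_timeout, simulate_freeze_then_commit) are hypothetical. A timeout is used only so the demonstration returns instead of hanging; the kernel's blocking lock has no such escape.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: blocking read lock that gives up after timeout_ms.
 * Returns 0 if the lock was taken (and released again), else an error code. */
static int blocking_read_lock_with_timeout(pthread_rwlock_t *lock, long timeout_ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    deadline.tv_sec += timeout_ms / 1000 + deadline.tv_nsec / 1000000000L;
    deadline.tv_nsec %= 1000000000L;

    int rc = pthread_rwlock_timedrdlock(lock, &deadline);
    if (rc == 0)
        pthread_rwlock_unlock(lock);
    return rc;
}

/* Models the freeze path: the lock is held in write mode, and the commit
 * that runs underneath it then tries a blocking read lock on the same lock.
 * Returns the result of that read-lock attempt. */
static int simulate_freeze_then_commit(void)
{
    pthread_rwlock_t s_umount; /* stand-in for sb->s_umount */
    int rc = pthread_rwlock_init(&s_umount, NULL);
    assert(rc == 0);

    pthread_rwlock_wrlock(&s_umount);                    /* freeze takes it for write */
    rc = blocking_read_lock_with_timeout(&s_umount, 50); /* commit path wants it for read */
    pthread_rwlock_unlock(&s_umount);
    pthread_rwlock_destroy(&s_umount);
    return rc;
}
```

In this sketch the read-lock attempt never succeeds while the write lock is held: depending on the C library it fails with a timeout or with an immediate deadlock-detection error. Without the timeout, the caller would wait forever.
<br />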
To fix this, call try_to_writeback_inodes_sb() instead of<br />
writeback_inodes_sb(), because that will try to read lock sb->s_umount<br />
and will only call writeback_inodes_sb() if it was able to lock it.<br />
This is fine because the cases where it can't read lock sb->s_umount<br />
are during a filesystem unmount or during a filesystem freeze. In those<br />
cases sb->s_umount is write locked and sync_filesystem() is called, which<br />
calls writeback_inodes_sb(). In other words, in all cases where we can't<br />
take a read lock on sb->s_umount, writeback is already being triggered<br />
elsewhere.<br />
<br />
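The try-lock pattern the fix relies on can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct fake_sb and the fake_* functions are hypothetical names, and a POSIX rwlock stands in for sb->s_umount.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct super_block. */
struct fake_sb {
    pthread_rwlock_t s_umount; /* models sb->s_umount */
    int writeback_runs;        /* counts simulated writeback passes */
};

static void fake_sb_init(struct fake_sb *sb)
{
    int rc = pthread_rwlock_init(&sb->s_umount, NULL);
    assert(rc == 0);
    (void)rc;
    sb->writeback_runs = 0;
}

/* Models the writeback work; the caller is expected to hold s_umount. */
static void fake_writeback(struct fake_sb *sb)
{
    sb->writeback_runs++;
}

/* Models the try-lock approach: attempt a non-blocking read lock and only
 * start writeback if it succeeds. Returns true if writeback ran, false if
 * it was skipped because the lock is held for writing elsewhere. */
static bool fake_try_to_writeback(struct fake_sb *sb)
{
    if (pthread_rwlock_tryrdlock(&sb->s_umount) != 0)
        return false; /* freeze/unmount in progress: skip, do not wait */

    fake_writeback(sb);
    pthread_rwlock_unlock(&sb->s_umount);
    return true;
}
```

When nothing holds the lock for writing, the call performs writeback under the read lock. While a writer holds the lock, which models freeze or unmount, the call returns immediately without waiting, so it can neither deadlock nor run writeback without the lock held.
<br />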
An alternative would be to call btrfs_start_delalloc_roots() with a<br />
number of pages different from LONG_MAX, for example matching the number<br />
of delalloc bytes we currently have, in <br />
---truncated---
Severity CVSS v4.0: Pending analysis
Last modification: 12/09/2024