CVE-2023-54158
Publication date: 24/12/2025
In the Linux kernel, the following vulnerability has been resolved:

btrfs: don't free qgroup space unless specified

Boris noticed in his simple quotas testing that he was getting a leak
with Sweet Tea's change to subvol create that stopped doing a
transaction commit. That change only exposed the leak; the underlying
bug was already present.

In the delayed inode code we have an optimization that will free extra
reservations if we think we can pack a dir item into an already modified
leaf. Previously this wouldn't be triggered in the subvolume create
case because we'd commit the transaction. It was still possible, but
much harder, to trigger: it could happen if we did a
mkdir && subvol create with qgroups enabled.

This occurs because btrfs_insert_delayed_dir_index(), which gets
called when we're adding the dir item, does the following if we're able
to skip reserving space:

    btrfs_block_rsv_release(fs_info, trans->block_rsv, bytes, NULL);

The problem here is that trans->block_rsv points at the temporary block
rsv for the subvolume create, which carries qgroup reservations.

This is a problem because btrfs_block_rsv_release() will do the
following:

    if (block_rsv->qgroup_rsv_reserved >= block_rsv->qgroup_rsv_size) {
            qgroup_to_release = block_rsv->qgroup_rsv_reserved -
                                block_rsv->qgroup_rsv_size;
            block_rsv->qgroup_rsv_reserved = block_rsv->qgroup_rsv_size;
    }

The temporary block rsv only has ->qgroup_rsv_reserved set;
->qgroup_rsv_size == 0. The release call made by the optimization in
btrfs_insert_delayed_dir_index() therefore resets ->qgroup_rsv_reserved
to 0. Then later on, when we call btrfs_subvolume_release_metadata(),
which has

    btrfs_block_rsv_release(fs_info, rsv, (u64)-1, &qgroup_to_release);
    btrfs_qgroup_convert_reserved_meta(root, qgroup_to_release);

qgroup_to_release is set to 0, and we do not convert the reserved
metadata space.
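
The sequence above can be reproduced in a small userspace model. This is a sketch only: `struct rsv_model`, `model_release_unconditional()` and `model_subvol_create_leaky()` are hypothetical names rather than kernel symbols, the byte accounting and locking of the real function are omitted, and the 16384-byte reservation is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two qgroup fields of a block rsv. */
struct rsv_model {
	uint64_t qgroup_rsv_size;
	uint64_t qgroup_rsv_reserved;
};

/*
 * Simplified model of the pre-fix qgroup handling: the reserved counter
 * is trimmed down to the size whether or not the caller asked to be told
 * how much was released.
 */
static void model_release_unconditional(struct rsv_model *rsv,
					uint64_t *qgroup_to_release)
{
	uint64_t to_release = 0;

	assert(rsv != NULL);
	if (rsv->qgroup_rsv_reserved >= rsv->qgroup_rsv_size) {
		to_release = rsv->qgroup_rsv_reserved - rsv->qgroup_rsv_size;
		rsv->qgroup_rsv_reserved = rsv->qgroup_rsv_size;
	}
	if (qgroup_to_release)
		*qgroup_to_release = to_release;
}

/*
 * Models the subvolume create path: a temporary rsv with only
 * ->qgroup_rsv_reserved set, one release with NULL (the dir index
 * optimization), then the final release that feeds the conversion step.
 * Returns the amount that would be passed on for conversion.
 */
static uint64_t model_subvol_create_leaky(uint64_t qgroup_reserved)
{
	struct rsv_model rsv = {
		.qgroup_rsv_size = 0,
		.qgroup_rsv_reserved = qgroup_reserved,
	};
	uint64_t qgroup_to_release = 0;

	/* Optimization path: the released amount is silently dropped. */
	model_release_unconditional(&rsv, NULL);

	/* Final release: nothing is left to report. */
	model_release_unconditional(&rsv, &qgroup_to_release);

	return qgroup_to_release;
}
```

In this model, `model_subvol_create_leaky(16384)` returns 0: the first call has already discarded the reservation, so nothing remains to hand to the conversion step.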

The root cause is that the block rsv code has been unconditionally
adjusting ->qgroup_rsv_reserved. This went unnoticed because the main
place this is used is delalloc, which always calls
btrfs_block_rsv_release() with qgroup_to_release set, and thus does the
proper accounting.

The subvolume code is the only other code that uses the qgroup
reservation stuff, but it's intermingled with the above optimization,
so its reservation was being freed out from underneath it, leaking the
reserved space.

The solution is to simply not touch the qgroup reservations if we
don't have qgroup_to_release set. This works with the existing code, as
anything that touches the delalloc reservations always has
qgroup_to_release set. This fixes the leak that Boris was observing.
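
The effect of the fix can be sketched in a userspace model that only touches the qgroup counters when the caller supplies an output pointer. The names `struct rsv_sketch`, `sketch_release_guarded()` and `sketch_subvol_create_fixed()` are hypothetical, the real function's byte accounting and locking are left out, and the 16384-byte reservation is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two qgroup fields of a block rsv. */
struct rsv_sketch {
	uint64_t qgroup_rsv_size;
	uint64_t qgroup_rsv_reserved;
};

/*
 * Guarded release: the qgroup counters are left alone unless the caller
 * passes a non-NULL qgroup_to_release and can therefore account for the
 * released amount.
 */
static void sketch_release_guarded(struct rsv_sketch *rsv,
				   uint64_t *qgroup_to_release)
{
	uint64_t to_release = 0;

	assert(rsv != NULL);
	if (qgroup_to_release &&
	    rsv->qgroup_rsv_reserved >= rsv->qgroup_rsv_size) {
		to_release = rsv->qgroup_rsv_reserved - rsv->qgroup_rsv_size;
		rsv->qgroup_rsv_reserved = rsv->qgroup_rsv_size;
	}
	if (qgroup_to_release)
		*qgroup_to_release = to_release;
}

/*
 * Same call sequence as the subvolume create path described above:
 * one release with NULL, then the final release that reports the amount
 * to convert. Returns that reported amount.
 */
static uint64_t sketch_subvol_create_fixed(uint64_t qgroup_reserved)
{
	struct rsv_sketch rsv = {
		.qgroup_rsv_size = 0,
		.qgroup_rsv_reserved = qgroup_reserved,
	};
	uint64_t qgroup_to_release = 0;

	/* Optimization path: the qgroup reservation is now untouched. */
	sketch_release_guarded(&rsv, NULL);

	/* Final release: the full reservation is reported for conversion. */
	sketch_release_guarded(&rsv, &qgroup_to_release);

	return qgroup_to_release;
}
```

Here `sketch_subvol_create_fixed(16384)` returns 16384, so the whole reservation survives the NULL call and reaches the conversion step.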
Severity CVSS v4.0: Pending analysis
Last modification: 24/12/2025