CVE-2023-54271
Severity:
Pending analysis
Type:
Not available / Other type
Publication date:
30/12/2025
Last modified:
30/12/2025
Description
In the Linux kernel, the following vulnerability has been resolved:

blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before init

blk-iocost sometimes causes the following crash:

BUG: kernel NULL pointer dereference, address: 00000000000000e0
...
RIP: 0010:_raw_spin_lock+0x17/0x30
Code: be 01 02 00 00 e8 79 38 39 ff 31 d2 89 d0 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 65 ff 05 48 d0 34 7e b9 01 00 00 00 31 c0 0f b1 0f 75 02 5d c3 89 c6 e8 ea 04 00 00 5d c3 0f 1f 84 00 00
RSP: 0018:ffffc900023b3d40 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 00000000000000e0 RCX: 0000000000000001
RDX: ffffc900023b3d20 RSI: ffffc900023b3cf0 RDI: 00000000000000e0
RBP: ffffc900023b3d40 R08: ffffc900023b3c10 R09: 0000000000000003
R10: 0000000000000064 R11: 000000000000000a R12: ffff888102337000
R13: fffffffffffffff2 R14: ffff88810af408c8 R15: ffff8881070c3600
FS: 00007faaaf364fc0(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e0 CR3: 00000001097b1000 CR4: 0000000000350ea0
Call Trace:

ioc_weight_write+0x13d/0x410
cgroup_file_write+0x7a/0x130
kernfs_fop_write_iter+0xf5/0x170
vfs_write+0x298/0x370
ksys_write+0x5f/0xb0
__x64_sys_write+0x1b/0x20
do_syscall_64+0x3d/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

This happens because iocg->ioc is NULL. The field is initialized by
ioc_pd_init() and never cleared. The NULL deref is caused by
blkcg_activate_policy() installing blkg_policy_data before initializing it.

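The faulting address 0xe0 in the trace is consistent with taking the address of a structure member through a NULL base pointer. A minimal userspace sketch of that arithmetic follows. `struct ioc_model`, its padding, and the 0xe0 offset of `lock` are assumptions chosen to match the trace; they are not the real `struct ioc` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ioc: only the offset of `lock` matters. */
struct ioc_model {
    unsigned char other_fields[0xe0]; /* assumed padding so lock lands at 0xe0 */
    int lock;                         /* stand-in for the spinlock */
};

/*
 * Address that &ioc->lock evaluates to for a given base address.
 * With ioc_base == 0 (iocg->ioc is NULL) the result is just the member
 * offset, which is the address the lock routine would then touch.
 */
uintptr_t lock_address(uintptr_t ioc_base)
{
    return ioc_base + offsetof(struct ioc_model, lock);
}
```

Under these assumptions, `lock_address(0)` yields 0xe0, the same value reported in CR2 and RDI above.
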
blkcg_activate_policy() was doing the following:

1. Allocate pd's for all existing blkg's and install them in blkg->pd[].
2. Initialize all pd's.
3. Online all pd's.

blkcg_activate_policy() only grabs the queue_lock and may release and
re-acquire the lock as allocation may need to sleep. ioc_weight_write()
grabs blkcg->lock and iterates all its blkg's. The two can race and if
ioc_weight_write() runs during #1 or between #1 and #2, it can encounter a
pd which is not initialized yet, leading to a crash.

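The window between steps 1 and 2 can be illustrated with a deterministic, single-threaded userspace model. This is only a sketch: every type and function name below is hypothetical, and the "writer" is reduced to a check of what ioc_weight_write() could observe at a given moment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct ioc_model  { int weight_base; };
struct pd_model   { struct ioc_model *ioc; }; /* ioc stays NULL until init */
struct blkg_model { struct pd_model *pd; };   /* models one blkg->pd[] slot */

/* Step 1 of the old flow: allocate a zeroed pd and publish it right away. */
int old_step1_install(struct blkg_model *blkg)
{
    blkg->pd = calloc(1, sizeof(*blkg->pd));
    return blkg->pd ? 0 : -1;
}

/* Step 2 of the old flow: initialize the already-published pd. */
void old_step2_init(struct blkg_model *blkg, struct ioc_model *ioc)
{
    blkg->pd->ioc = ioc;
}

/*
 * Models what a concurrent writer can observe: returns 1 if a pd is
 * visible but its ioc pointer is still NULL (the crashing case).
 */
int writer_sees_uninitialized_pd(const struct blkg_model *blkg)
{
    return blkg->pd != NULL && blkg->pd->ioc == NULL;
}
```

Calling `writer_sees_uninitialized_pd()` after `old_step1_install()` but before `old_step2_init()` reports the hazardous state; once step 2 has run, it no longer does.
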
The crash can be reproduced with the following script:

#!/bin/bash

echo +io > /sys/fs/cgroup/cgroup.subtree_control
systemd-run --unit touch-sda --scope dd if=/dev/sda of=/dev/null bs=1M count=1 iflag=direct
echo 100 > /sys/fs/cgroup/system.slice/io.weight
bash -c "echo '8:0 enable=1' > /sys/fs/cgroup/io.cost.qos" &
sleep .2
echo 100 > /sys/fs/cgroup/system.slice/io.weight

with the following patch applied:

> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index fc49be622e05..38d671d5e10c 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1553,6 +1553,12 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
>                  pd->online = false;
>          }
>
> +        if (system_state == SYSTEM_RUNNING) {
> +                spin_unlock_irq(&q->queue_lock);
> +                ssleep(1);
> +                spin_lock_irq(&q->queue_lock);
> +        }
> +
>          /* all allocated, init in the same order */
>          if (pol->pd_init_fn)
>                  list_for_each_entry_reverse(blkg, &q->blkg_list, q_node)

I don't see a reason why all pd's should be allocated, initialized and
onlined together. The only ordering requirement is that parent blkgs be
initialized and onlined before their children, which is guaranteed by the
walking order. Let's fix the bug by allocating, initializing and onlining the pd
for each blkg individually and holding blkcg->lock over initialization and
onlining. This ensures that an installed blkg is always fully initialized and
onlined, removing the race window.
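
The per-blkg ordering described above can be sketched in userspace as follows. This is a single-threaded model in the spirit of the fix, not the kernel code: all names are hypothetical, `locked` is a plain flag standing in for blkcg->lock, and the `probe` callback emulates a writer that tries to run in the middle of activation. The sketch covers a single blkg and does not model the parent-before-child walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct ioc_model   { int id; };
struct pd_model    { struct ioc_model *ioc; bool online; };
struct blkcg_model { bool locked; }; /* stand-in for blkcg->lock */
struct blkg_model  { struct blkcg_model *blkcg; struct pd_model *pd; };

enum writer_view { WRITER_BLOCKED, WRITER_NO_PD, WRITER_PD_READY, WRITER_PD_BROKEN };

/* What a writer that first takes blkcg->lock would observe right now. */
enum writer_view writer_probe(const struct blkg_model *blkg)
{
    if (blkg->blkcg->locked)
        return WRITER_BLOCKED; /* a real writer would wait for the lock */
    if (!blkg->pd)
        return WRITER_NO_PD;
    return (blkg->pd->ioc && blkg->pd->online) ? WRITER_PD_READY
                                               : WRITER_PD_BROKEN;
}

/*
 * Per-blkg activation: allocate outside the lock, then install,
 * initialize and online while the lock is held. `probe` (may be NULL)
 * is invoked mid-way to emulate a racing writer.
 */
int activate_one(struct blkg_model *blkg, struct ioc_model *ioc,
                 void (*probe)(const struct blkg_model *, void *), void *arg)
{
    struct pd_model *pd = calloc(1, sizeof(*pd));
    if (!pd)
        return -1;

    blkg->blkcg->locked = true;
    blkg->pd = pd;            /* installed */
    if (probe)
        probe(blkg, arg);     /* a racing writer tries here */
    pd->ioc = ioc;            /* initialized */
    pd->online = true;        /* onlined */
    blkg->blkcg->locked = false;
    return 0;
}

/* Probe helper: stores the writer's view into *arg. */
void record_view(const struct blkg_model *blkg, void *arg)
{
    *(enum writer_view *)arg = writer_probe(blkg);
}
```

In this model a writer probing during activation is held off by the lock, and a writer probing afterwards only ever sees a pd that is both initialized and online.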



