CVE-2022-49700
Publication date: 26/02/2025
In the Linux kernel, the following vulnerability has been resolved:

mm/slub: add missing TID updates on slab deactivation

The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
the TID stays the same. However, two places in __slab_alloc() currently
don't update the TID when deactivating the CPU slab.

If multiple operations race the right way, this could lead to an object
getting lost; or, in an even more unlikely situation, it could even lead to
an object being freed onto the wrong slab's freelist, messing up the
`inuse` counter and eventually causing a page to be freed to the page
allocator while it still contains slab objects.

(I haven't actually tested these cases though, this is just based on
looking at the code. Writing testcases for this stuff seems like it'd be
a pain...)

The race leading to state inconsistency is (all operations on the same CPU
and kmem_cache):

- task A: begin do_slab_free():
  - read TID
  - read pcpu freelist (==NULL)
  - check `slab == c->slab` (true)
- [PREEMPT A->B]
- task B: begin slab_alloc_node():
  - fastpath fails (`c->freelist` is NULL)
  - enter __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
      - take local_lock_irqsave()
      - read c->freelist as NULL
      - get_freelist() returns NULL
      - write `c->slab = NULL`
      - drop local_unlock_irqrestore()
      - goto new_slab
      - slub_percpu_partial() is NULL
      - get_partial() returns NULL
    - slub_put_cpu_ptr() (enables preemption)
- [PREEMPT B->A]
- task A: finish do_slab_free():
  - this_cpu_cmpxchg_double() succeeds
  - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]

From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label,
set c->slab, then jump to load_freelist, which clobbers c->freelist.

But if we instead continue as follows, we get worse corruption:

- task A: run __slab_free() on object from other struct slab:
  - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
- task A: run slab_alloc_node() with NUMA node constraint:
  - fastpath fails (c->slab is NULL)
  - call __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
      - c->slab is NULL: goto new_slab
      - slub_percpu_partial() is non-NULL
      - set c->slab to slub_percpu_partial(c)
      - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
        from slab-2]
      - goto redo
      - node_match() fails
      - goto deactivate_slab
      - existing c->freelist is passed into deactivate_slab()
      - inuse count of slab-1 is decremented to account for object from
        slab-2

At this point, the inuse count of slab-1 is 1 lower than it should be.
This means that if we free all allocated objects in slab-1 except for one,
SLUB will think that slab-1 is completely unused, and may free its page,
leading to use-after-free.
Severity CVSS v4.0: Pending analysis
Last modification: 25/03/2025