CVE-2025-68310
Severity:
Pending analysis
Type:
Not available / Other type
Publication date:
16/12/2025
Last modified:
16/12/2025
Description
In the Linux kernel, the following vulnerability has been resolved:

s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump

Do not block PCI config accesses through pci_cfg_access_lock() when
executing the s390 variant of PCI error recovery: Acquire just
device_lock() instead of pci_dev_lock() as powerpc's EEH and
generic PCI AER processing do.

During error recovery testing a pair of tasks was reported to be hung:

mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
INFO: task kmcheck:72 blocked for more than 122 seconds.
Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
Call Trace:
[] __schedule+0x2a0/0x590
[] schedule+0x36/0xe0
[] schedule_preempt_disabled+0x22/0x30
[] __mutex_lock.constprop.0+0x484/0x8a8
[] mlx5_unload_one+0x34/0x58 [mlx5_core]
[] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
[] zpci_event_attempt_error_recovery+0xf2/0x398
[] __zpci_event_error+0x23a/0x2c0
INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
[] __schedule+0x2a0/0x590
[] schedule+0x36/0xe0
[] pci_wait_cfg+0x80/0xe8
[] pci_cfg_access_lock+0x74/0x88
[] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
[] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
[] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
[] devlink_health_do_dump.part.0+0x82/0x168
[] devlink_health_report+0x19a/0x230
[] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]
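
When triaging a report like the one above, it can help to reduce each hung-task entry to the task name and the functions on its stack. The following is a minimal sketch, not an official tool: the function name `blocked_tasks`, the regular expressions, and the abbreviated `SAMPLE` text (lines copied from the log above) are assumptions made for illustration.

```python
import re

# "INFO: task <name>:<pid> blocked for more than <n> seconds."
# The task name itself may contain colons (e.g. kworker/u1664:6).
TASK_RE = re.compile(r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds")
# "[<address or empty>] symbol+0xoff/0xsize [optional module]"
FRAME_RE = re.compile(r"^\[[^\]]*\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def blocked_tasks(log_text):
    """Return one dict per hung task with its name, pid and stack symbols."""
    tasks = []
    for raw in log_text.splitlines():
        line = raw.strip()
        header = TASK_RE.search(line)
        if header:
            tasks.append({"name": header.group(1),
                          "pid": int(header.group(2)),
                          "frames": []})
            continue
        frame = FRAME_RE.match(line)
        if frame and tasks:
            # Frames belong to the most recently seen task header.
            tasks[-1]["frames"].append(frame.group(1))
    return tasks


# Abbreviated excerpt of the report quoted in this advisory.
SAMPLE = """\
INFO: task kmcheck:72 blocked for more than 122 seconds.
Call Trace:
[] __schedule+0x2a0/0x590
[] schedule+0x36/0xe0
[] schedule_preempt_disabled+0x22/0x30
[] __mutex_lock.constprop.0+0x484/0x8a8
[] mlx5_unload_one+0x34/0x58 [mlx5_core]
INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
Call Trace:
[] __schedule+0x2a0/0x590
[] schedule+0x36/0xe0
[] pci_wait_cfg+0x80/0xe8
[] pci_cfg_access_lock+0x74/0x88
[] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
"""

if __name__ == "__main__":
    for task in blocked_tasks(SAMPLE):
        print(task["name"], task["pid"], " <- ".join(task["frames"]))
```

Read this way, the first task is waiting in a mutex acquisition reached from mlx5_unload_one(), while the second is waiting in pci_cfg_access_lock().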

No kernel log of the exact same error with an upstream kernel is
available - but the very same deadlock situation can be constructed there,
too:

- task: kmcheck
  mlx5_unload_one() tries to acquire devlink lock while the PCI error
  recovery code has set pdev->block_cfg_access by way of
  pci_cfg_access_lock()
- task: kworker
  mlx5_crdump_collect() tries to set block_cfg_access through
  pci_cfg_access_lock() while devlink_health_report() had acquired
  the devlink lock.
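
The two tasks take the same pair of resources in opposite order, which is the classic lock-order inversion. The sketch below is a userspace model only, not kernel code: it treats block_cfg_access as if it were an ordinary lock, and the names `find_lock_order_cycle`, `PRE_PATCH`, and `POST_PATCH` are hypothetical. The post-patch ordering assumes, per the description above, that error recovery holds only device_lock() before calling into the driver.

```python
def find_lock_order_cycle(tasks):
    """Detect a lock-order inversion.

    `tasks` maps a task name to the ordered list of locks it acquires.
    Returns a list of lock names forming a cycle, or None if the
    acquisition orders are consistent.
    """
    # Edge A -> B means: some task acquires B while already holding A.
    edges = {}
    for order in tasks.values():
        for i, held in enumerate(order):
            for wanted in order[i + 1:]:
                edges.setdefault(held, set()).add(wanted)

    def visit(lock, path):
        if lock in path:
            # Reached a lock already on the current path: that is a cycle.
            return path[path.index(lock):] + [lock]
        for nxt in sorted(edges.get(lock, ())):
            cycle = visit(nxt, path + [lock])
            if cycle:
                return cycle
        return None

    for start in sorted(edges):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Before the fix: recovery blocks config access, then the driver callback
# wants the devlink lock; the health worker does it the other way round.
PRE_PATCH = {
    "kmcheck": ["block_cfg_access", "devlink_lock"],
    "kworker": ["devlink_lock", "block_cfg_access"],
}

# After the fix (assumed model): recovery takes only device_lock.
POST_PATCH = {
    "kmcheck": ["device_lock", "devlink_lock"],
    "kworker": ["devlink_lock", "block_cfg_access"],
}

if __name__ == "__main__":
    print("pre-patch cycle: ", find_lock_order_cycle(PRE_PATCH))
    print("post-patch cycle:", find_lock_order_cycle(POST_PATCH))
```

In this model the pre-patch orders form a cycle between block_cfg_access and the devlink lock, while the post-patch orders do not.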

A similar deadlock situation can be reproduced by requesting a
crdump with
> devlink health dump show pci/ reporter fw_fatal

while PCI error recovery is executed on the same physical function
by mlx5_core's pci_error_handlers. On s390 this can be injected with
> zpcictl --reset-fw

Tests with this patch failed to reproduce that second deadlock situation;
the devlink command is rejected with "kernel answers: Permission denied" -
and we get a kernel log message of:

mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5

because the config read of VSC_SEMAPHORE is rejected by the underlying
hardware.
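
The "err -5" in that message follows the kernel convention of returning a negated errno value; errno 5 is EIO (I/O error). As a small sketch, a helper like the hypothetical `decode_kernel_err` below maps such a value to its symbolic name using Python's standard `errno` module.

```python
import errno
import os


def decode_kernel_err(value):
    """Map a negative kernel-style return code to (symbol, description).

    `value` is expected to be a negated errno such as -5. The description
    text comes from the local C library and may differ between systems.
    """
    number = -value
    symbol = errno.errorcode.get(number, "UNKNOWN")
    return symbol, os.strerror(number)


if __name__ == "__main__":
    symbol, text = decode_kernel_err(-5)
    print(symbol, "-", text)
```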

Two prior attempts to address this issue have been discussed and
ultimately rejected [see link], with the primary argument that s390's
implementation of PCI error recovery is imposing restrictions that
neither powerpc's EEH nor PCI AER handling need. Tests show that PCI
error recovery on s390 is running to completion even without blocking
access to PCI config space.
Impact
References to solutions, tools, and information
- https://git.kernel.org/stable/c/0fd20f65df6aa430454a0deed8f43efa91c54835
- https://git.kernel.org/stable/c/3591d56ea9bfd3e7fbbe70f749bdeed689d415f9
- https://git.kernel.org/stable/c/54f938d9f5693af8ed586a08db4af5d9da1f0f2d
- https://git.kernel.org/stable/c/b63c061be622b17b495cbf78a6d5f2d4c3147f8e
- https://git.kernel.org/stable/c/d0df2503bc3c2be385ca2fd96585daad1870c7c5



