CVE-2025-68310

Severity CVSS v4.0:
Pending analysis
Type:
Unavailable / Other
Publication date:
16/12/2025
Last modified:
16/12/2025

Description

In the Linux kernel, the following vulnerability has been resolved:

s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump

Do not block PCI config accesses through pci_cfg_access_lock() when
executing the s390 variant of PCI error recovery: Acquire just
device_lock() instead of pci_dev_lock() as powerpc's EEH and
generic PCI AER processing do.

During error recovery testing a pair of tasks was reported to be hung:

mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
INFO: task kmcheck:72 blocked for more than 122 seconds.
  Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
Call Trace:
  [] __schedule+0x2a0/0x590
  [] schedule+0x36/0xe0
  [] schedule_preempt_disabled+0x22/0x30
  [] __mutex_lock.constprop.0+0x484/0x8a8
  [] mlx5_unload_one+0x34/0x58 [mlx5_core]
  [] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
  [] zpci_event_attempt_error_recovery+0xf2/0x398
  [] __zpci_event_error+0x23a/0x2c0
INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
  Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
  [] __schedule+0x2a0/0x590
  [] schedule+0x36/0xe0
  [] pci_wait_cfg+0x80/0xe8
  [] pci_cfg_access_lock+0x74/0x88
  [] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
  [] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
  [] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
  [] devlink_health_do_dump.part.0+0x82/0x168
  [] devlink_health_report+0x19a/0x230
  [] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]

No kernel log of the exact same error with an upstream kernel is
available - but the very same deadlock situation can be constructed there,
too:

- task: kmcheck
  mlx5_unload_one() tries to acquire devlink lock while the PCI error
  recovery code has set pdev->block_cfg_access by way of
  pci_cfg_access_lock()
- task: kworker
  mlx5_crdump_collect() tries to set block_cfg_access through
  pci_cfg_access_lock() while devlink_health_report() had acquired
  the devlink lock.

A similar deadlock situation can be reproduced by requesting a
crdump with
  > devlink health dump show pci/ reporter fw_fatal

while PCI error recovery is executed on the same physical function
by mlx5_core's pci_error_handlers. On s390 this can be injected with
  > zpcictl --reset-fw 

Tests with this patch failed to reproduce that second deadlock situation,
the devlink command is rejected with "kernel answers: Permission denied" -
and we get a kernel log message of:

mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5

because the config read of VSC_SEMAPHORE is rejected by the underlying
hardware.

Two prior attempts to address this issue have been discussed and
ultimately rejected [see link], with the primary argument that s390's
implementation of PCI error recovery is imposing restrictions that
neither powerpc's EEH nor PCI AER handling need. Tests show that PCI
error recovery on s390 is running to completion even without blocking
access to PCI config space.
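
The two hung tasks described above form a classic lock-order inversion: one task holds the config-access block and waits for the devlink lock, while the other holds the devlink lock and waits for the config-access block. The following is a minimal Python sketch of that reasoning, not kernel code. The labels `device_lock`, `cfg_access` and `devlink` are illustrative stand-ins for device_lock(), the block set by pci_cfg_access_lock(), and the devlink lock. The relative order of the first two entries in the recovery path is an assumption and does not change the result. Treating the config-access block as an ordinary lock is also a simplification.

```python
# Illustrative model only: each task is reduced to the order in which it
# acquires its locks, and a cycle in the resulting "held -> wanted" graph
# means a deadlock is possible.

def lock_order_edges(tasks):
    """Return every (held, wanted) pair implied by the tasks' acquisition orders."""
    edges = set()
    for order in tasks.values():
        for i, held in enumerate(order):
            for wanted in order[i + 1:]:
                edges.add((held, wanted))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed lock-order graph."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, set()).add(wanted)

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: the lock order is circular
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# Before the fix: recovery takes pci_dev_lock() (device lock plus config-access
# block, order assumed), then mlx5_unload_one() needs the devlink lock.
BEFORE_FIX = {
    "kmcheck (error recovery)": ["device_lock", "cfg_access", "devlink"],
    "kworker (crdump)": ["devlink", "cfg_access"],
}

# After the fix: recovery takes only device_lock(), so it never holds the
# config-access block while waiting for the devlink lock.
AFTER_FIX = {
    "kmcheck (error recovery)": ["device_lock", "devlink"],
    "kworker (crdump)": ["devlink", "cfg_access"],
}

print("before fix, deadlock possible:", has_cycle(lock_order_edges(BEFORE_FIX)))
print("after fix, deadlock possible:", has_cycle(lock_order_edges(AFTER_FIX)))
```

In this model the pre-fix ordering contains the cycle `cfg_access` -> `devlink` -> `cfg_access`, and removing `cfg_access` from the recovery path breaks it. This mirrors the change from pci_dev_lock() to device_lock() described in the commit message.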

Impact