CVE-2026-45973 PUBLISHED

RDMA/mlx5: Fix UMR hang in LAG error state unload

Assigner: Linux
Reserved: 13.05.2026 Published: 27.05.2026 Updated: 27.05.2026

In the Linux kernel, the following vulnerability has been resolved:

RDMA/mlx5: Fix UMR hang in LAG error state unload

During firmware reset in LAG mode, a race condition causes the driver to hang indefinitely while waiting for UMR completion during device unload. See [1].

In LAG mode the bond device is registered only on the master, so it never sees sys_error events from the slave. During a firmware reset, the slave can already be dead while the master has not yet entered error state. UMR posts then succeed, but their completions never arrive, so the UMR wait hangs forever on unload.

Fix this by adding a sys_error notifier that gets registered before MLX5_IB_STAGE_IB_REG and stays alive until after ib_unregister_device(). This ensures error events reach the bond device throughout teardown.
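
The ordering requirement can be illustrated with a toy user-space model. This is only a sketch, not the kernel patch: `BondDevice`, `simulate_unload`, and the two-stage list are hypothetical names, and the model assumes that stages are torn down in the reverse of their initialization order and that a UMR wait is abandoned once the device is in error state.

```python
from dataclasses import dataclass


@dataclass
class BondDevice:
    """Toy stand-in for the IB bond device registered on the LAG master."""
    notifier_registered: bool = False
    in_error: bool = False
    umr_result: str = ""

    def on_sys_error(self) -> None:
        # The event only reaches the device while the notifier is registered.
        if self.notifier_registered:
            self.in_error = True

    def wait_for_umr(self) -> str:
        # Assumption: in error state the wait is abandoned; otherwise it
        # blocks on a completion that the dead slave never delivers.
        return "aborted" if self.in_error else "would hang"


def simulate_unload(notifier_before_ib_reg: bool) -> str:
    dev = BondDevice()

    def notifier_init() -> None:
        dev.notifier_registered = True

    def notifier_cleanup() -> None:
        dev.notifier_registered = False

    def ib_reg_init() -> None:
        pass  # stands in for registering the IB device

    def ib_reg_cleanup() -> None:
        # Stands in for ib_unregister_device(): the slave's sys_error fires
        # during teardown, then MR deregistration waits on a UMR.
        dev.on_sys_error()
        dev.umr_result = dev.wait_for_umr()

    notifier = (notifier_init, notifier_cleanup)
    ib_reg = (ib_reg_init, ib_reg_cleanup)
    stages = [notifier, ib_reg] if notifier_before_ib_reg else [ib_reg, notifier]

    for init, _ in stages:               # bring stages up in order
        init()
    for _, cleanup in reversed(stages):  # tear them down in reverse
        cleanup()
    return dev.umr_result
```

In this model, `simulate_unload(True)` keeps the notifier alive across the unregister step, so the error is seen and the wait is abandoned. `simulate_unload(False)` drops the notifier first, so the error is missed and the wait would block.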

[1] Call Trace:
 __schedule+0x2bd/0x760
 schedule+0x37/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.6+0x2b5/0x4a0
 __mlx5_ib_dereg_mr+0x606/0x870 [mlx5_ib]
 ? __xa_erase+0x4a/0xa0
 ? _cond_resched+0x15/0x30
 ? wait_for_completion+0x31/0x100
 ib_dereg_mr_user+0x48/0xc0 [ib_core]
 ? rdmacg_uncharge_hierarchy+0xa0/0x100
 destroy_hw_idr_uobject+0x20/0x50 [ib_uverbs]
 uverbs_destroy_uobject+0x37/0x150 [ib_uverbs]
 __uverbs_cleanup_ufile+0xda/0x140 [ib_uverbs]
 uverbs_destroy_ufile_hw+0x3a/0xf0 [ib_uverbs]
 ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs]
 remove_client_context+0x8b/0xd0 [ib_core]
 disable_device+0x8c/0x130 [ib_core]
 __ib_unregister_device+0x10d/0x180 [ib_core]
 ib_unregister_device+0x21/0x30 [ib_core]
 __mlx5_ib_remove+0x1e4/0x1f0 [mlx5_ib]
 auxiliary_bus_remove+0x1e/0x30
 device_release_driver_internal+0x103/0x1f0
 bus_remove_device+0xf7/0x170
 device_del+0x181/0x410
 mlx5_rescan_drivers_locked.part.10+0xa9/0x1d0 [mlx5_core]
 mlx5_disable_lag+0x253/0x260 [mlx5_core]
 mlx5_lag_disable_change+0x89/0xc0 [mlx5_core]
 mlx5_eswitch_disable+0x67/0xa0 [mlx5_core]
 mlx5_unload+0x15/0xd0 [mlx5_core]
 mlx5_unload_one+0x71/0xc0 [mlx5_core]
 mlx5_sync_reset_reload_work+0x83/0x100 [mlx5_core]
 process_one_work+0x1a7/0x360
 worker_thread+0x30/0x390
 ? create_worker+0x1a0/0x1a0
 kthread+0x116/0x130
 ? kthread_flush_work_fn+0x10/0x10
 ret_from_fork+0x22/0x40

Product Status

Vendor Linux
Product Linux
Versions Default: unaffected
  • affected from 6b0acf6a94c31efa43fce4edc22413a3390f9c05 to c8fb5c965ac7d0104872a8e4f6451f3bc6328199 (excl.)
  • affected from ede132a5cf559f3ab35a4c28bac4f4a6c20334d8 to 6d838873da9cb97551d42316967cc82bf8f8031b (excl.)
  • affected from ede132a5cf559f3ab35a4c28bac4f4a6c20334d8 to 613f5d4139b6ba801ccd93f9a28943be60d903bc (excl.)
  • affected from ede132a5cf559f3ab35a4c28bac4f4a6c20334d8 to ebc2164a4cd4314503f1a0c8e7aaf76d7e5fa211 (excl.)
  • Version 921fcf2971a1e8d3b904ba2c2905b96f4ec3d4ad is affected
  • Version 542bd62b7a7f37182c9ef192c2bd25d118c144e4 is affected
  • affected from 6.12.2 to 6.12.75 (excl.)
  • affected from 6.6.64 to 6.7 (excl.)
  • affected from 6.11.11 to 6.12 (excl.)
Vendor Linux
Product Linux
Versions Default: affected
  • Version 6.13 is affected
  • unaffected from 0 to 6.13 (excl.)
  • unaffected from 6.12.75 to 6.12.* (incl.)
  • unaffected from 6.18.14 to 6.18.* (incl.)
  • unaffected from 6.19.4 to 6.19.* (incl.)
  • unaffected from 7.0 to * (incl.)
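
As a rough aid for reading the release-number ranges above, the following sketch checks a plain dotted kernel version against them. It is one possible interpretation, not an official tool: `parse` and `is_affected` are hypothetical helpers, the commit-hash ranges are ignored, and the precedence used (an explicit "affected" range wins, then "unaffected" ranges, then the "Default: affected" fallback) is an assumption about how the two blocks combine.

```python
INF = float("inf")


def parse(version: str) -> tuple:
    """Turn '6.12.10' into (6, 12, 10), padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


# "affected from X to Y (excl.)": half-open ranges [X, Y)
AFFECTED = [
    (parse("6.12.2"), parse("6.12.75")),
    (parse("6.6.64"), parse("6.7")),
    (parse("6.11.11"), parse("6.12")),
]

# "unaffected from 0 to 6.13 (excl.)": half-open range [0, 6.13)
UNAFFECTED_EXCL = [(parse("0"), parse("6.13"))]

# "unaffected from X to Y.* (incl.)": closed ranges, '*' modelled as infinity
UNAFFECTED_INCL = [
    ((6, 12, 75), (6, 12, INF)),
    ((6, 18, 14), (6, 18, INF)),
    ((6, 19, 4), (6, 19, INF)),
    ((7, 0, 0), (INF, INF, INF)),
]


def is_affected(version: str) -> bool:
    v = parse(version)
    if any(lo <= v < hi for lo, hi in AFFECTED):
        return True
    if any(lo <= v < hi for lo, hi in UNAFFECTED_EXCL):
        return False
    if any(lo <= v <= hi for lo, hi in UNAFFECTED_INCL):
        return False
    return True  # falls back to "Default: affected"
```

For example, under these assumptions `is_affected("6.12.10")` is true and `is_affected("6.12.75")` is false. Distribution kernels with backported fixes need to be checked against the vendor's own advisory instead.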

References