ib_sdp: fix deadlock when sdp_cma_handler is called while socket is being closed
issue: 130280
sdp_close grabs sock_lock. While the socket is closing, sdp_cma_handler can be called
from CMA context with id_priv->qp_mutex held, and it then waits for sock_lock to become available.
sdp_close in turn calls rdma_disconnect, which needs to grab id_priv->qp_mutex --> deadlock!
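
The lock-order inversion described above can be sketched in userspace as follows. This is only a single-threaded model, not kernel code and not the fix carried by this patch: the two pthread mutexes are stand-ins for sock_lock and id_priv->qp_mutex, count_blocked_paths() is a hypothetical helper, and trylock is used so the model reports the cycle instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Userspace stand-ins for the two kernel locks named in this message. */
static pthread_mutex_t sock_lock = PTHREAD_MUTEX_INITIALIZER; /* models lock_sock() */
static pthread_mutex_t qp_mutex = PTHREAD_MUTEX_INITIALIZER;  /* models id_priv->qp_mutex */

/*
 * Hypothetical helper: replays the interleaving from the description and
 * returns how many of the two paths would block forever (0, 1 or 2).
 * A real lock call here would hang, so trylock is used to observe EBUSY.
 */
int count_blocked_paths(void)
{
    int blocked = 0;
    int rc;

    /* Close path (sdp_close): takes sock_lock first. */
    rc = pthread_mutex_lock(&sock_lock);
    assert(rc == 0);

    /* CMA path: already holds qp_mutex when it invokes the handler. */
    rc = pthread_mutex_lock(&qp_mutex);
    assert(rc == 0);

    /* sdp_cma_handler now wants sock_lock, which the close path holds. */
    if (pthread_mutex_trylock(&sock_lock) == EBUSY)
        blocked++;

    /* rdma_disconnect now wants qp_mutex, which the CMA path holds. */
    if (pthread_mutex_trylock(&qp_mutex) == EBUSY)
        blocked++;

    /* Release both locks so the model can be run again. */
    pthread_mutex_unlock(&qp_mutex);
    pthread_mutex_unlock(&sock_lock);
    return blocked;
}
```

In this model both acquisitions fail, which corresponds to each path waiting on a lock that the other one holds.
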
This patch fixes the hang shown in the following call trace:
Call Trace:
[<ffffffff813b4476>] lock_sock_nested+0x86/0xbf
[<ffffffff81077024>] ? autoremove_wake_function+0x0/0x3d
[<ffffffffa03ae65a>] sdp_cma_handler+0xe7/0x1529 [ib_sdp]
[<ffffffffa04ca060>] ? mlx4_free_cmd_mailbox+0x31/0x35 [mlx4_core]
[<ffffffffa04ca060>] ? mlx4_free_cmd_mailbox+0x31/0x35 [mlx4_core]
[<ffffffffa04dece6>] ? __mlx4_qp_modify+0x2c6/0x2eb [mlx4_core]
[<ffffffffa01d8408>] ? rdma_port_link_layer+0x1b/0x42 [ib_core]
[<ffffffffa0234de0>] ? mlx4_ib_modify_qp+0xd22/0xd46 [mlx4_ib]
[<ffffffffa0234df2>] ? mlx4_ib_modify_qp+0xd34/0xd46 [mlx4_ib]
[<ffffffffa038e1de>] cma_qp_set_alt_path+0x2b7/0x32c [rdma_cm]
[<ffffffffa0215792>] ? ib_post_send_mad+0x440/0x50f [ib_mad]
[<ffffffffa0390425>] cma_ib_handler+0x70f/0x9fc [rdma_cm]
[<ffffffffa01dbe60>] ? ib_find_cached_pkey+0xf0/0x105 [ib_core]
[<ffffffffa02a5a07>] cm_process_work+0x53/0x9b [ib_cm]
[<ffffffffa02a7352>] cm_work_handler+0x66e/0xdcd [ib_cm]
[<ffffffffa02a6ce4>] ? cm_work_handler+0x0/0xdcd [ib_cm]
[<ffffffff81072d5e>] worker_thread+0x14d/0x1ed
[<ffffffff81077024>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff81072c11>] ? worker_thread+0x0/0x1ed
[<ffffffff81076c7b>] kthread+0x6e/0x76
[<ffffffff81012dea>] child_rip+0xa/0x20
[<ffffffff81076c0d>] ? kthread+0x0/0x76
[<ffffffff81012de0>] ? child_rip+0x0/0x20
INFO: task rdma_cm:24917 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rdma_cm D 0000000000000000 0 24917 2 0x00000000
ffff8800ce4e7d20 0000000000000046 0000000000000000 000000008104cb48
ffff8800da5e83c0 ffffffff81aae4c0 ffff8800da5e8790 000000010319de96
0000000000000400 0000000000000000 0000000000000000 ffff880107864664
Call Trace:
[<ffffffff81456870>] __mutex_lock_common+0x12f/0x1a1
[<ffffffff81456931>] __mutex_lock_slowpath+0x19/0x1b
[<ffffffff8145699a>] mutex_lock+0x23/0x3a
[<ffffffffa038d03c>] cma_sap_work_handler+0x105/0x245 [rdma_cm]
[<ffffffff810432be>] ? need_resched+0x23/0x2d
[<ffffffff814560ab>] ? thread_return+0x99/0xb0
[<ffffffffa038ee11>] ? cma_work_handler+0x0/0x94 [rdma_cm]
[<ffffffffa038cf37>] ? cma_sap_work_handler+0x0/0x245 [rdma_cm]
[<ffffffff81072d5e>] worker_thread+0x14d/0x1ed
[<ffffffff81077024>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff81072c11>] ? worker_thread+0x0/0x1ed
[<ffffffff81076c7b>] kthread+0x6e/0x76
[<ffffffff81012dea>] child_rip+0xa/0x20
[<ffffffff81076c0d>] ? kthread+0x0/0x76
[<ffffffff81012de0>] ? child_rip+0x0/0x20
INFO: task NPtcp:4326 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
NPtcp D 0000000000000000 0 4326 4319 0x00000000
ffff8800b66dfc78 0000000000000086 ffff8800d3ff03c0 0000000000000005
ffff8800b12b40c0 ffffffff81aae4c0 ffff8800b12b4490 0000000028210680
ffff8800b66dfd70 ffff8800b66dfde8 0000000000000000 ffff88010786461c
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Amir Vadai <amirv@mellanox.com>