net/rds: Fix kernel panic caused by a race between setup/teardown
Running rds-stress with --reset option in a tight loop provokes a
"NULL pointer dereference" in rds_ib_recv_refill().
IP: [<..
fa04fefa1>] rds_ib_recv_refill+0x2a1/0x630 [rds_rdma]
Call Trace:
[<..
fa04fc140>] rds_ib_cm_connect_complete+0x2f0/0x360 [rds_rdma]
[<..
f810ba337>] ? wake_up_process+0x27/0x50
[<..
f810a1b54>] ? wake_up_worker+0x24/0x30
[<..
f810a2902>] ? insert_work+0x62/0xa0
[<..
fa04f4405>] rds_rdma_cm_event_handler_cmn+0x405/0x8c0 [rds_rdma]
[<..
f810a2d82>] ? __queue_delayed_work+0xb2/0x1a0
[<..
fa04f48d0>] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
[<..
fa03bf85f>] cma_ib_handler+0x10f/0x280 [rdma_cm]
[<..
fa03b1f5b>] cm_process_work+0x2b/0x140 [ib_cm]
[<..
fa03b414b>] cm_work_handler+0x99b/0x1630 [ib_cm]
[<..
f810a5375>] process_one_work+0x165/0x470
[<..
f810a5b82>] worker_thread+0x112/0x540
[<..
f810a5a70>] ? rescuer_thread+0x3f0/0x3f0
[<..
f810ab51a>] kthread+0xda/0xf0
[<..
f810ab440>] ? kthread_create_on_node+0x1b0/0x1b0
[<..
f817521b8>] ret_from_fork+0x58/0x90
[<..
f810ab440>] ? kthread_create_on_node+0x1b0/0x1b0
This is due to a race condition between rds_ib_cm_connect_complete()
trying to post a recv (using QP) and rds_ib_conn_path_shutdown
destroying the QP and zeroing it. By waiting and reserving the
RDS_RECV_REFILL bit before calling shutdown we avoid the problem.
Adding path-transition of CONNECTING to DISCONNECTING for
completeness.
Orabug:
28326553
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>