From: Santosh Shilimkar
Date: Thu, 29 Sep 2016 18:07:11 +0000 (-0700)
Subject: RDS: IB: fix panic with handlers running post teardown
X-Git-Tag: v4.1.12-92~57^2~10
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=25f2af0698e9c914054105f014ab2a6c1b4164e1;p=users%2Fjedix%2Flinux-maple.git

RDS: IB: fix panic with handlers running post teardown

The shutdown CQE reaping loop takes care of emptying the CQs before
they are destroyed, and once the tasklets are killed the handlers are
not expected to run. But because of a core tasklet bug, a tasklet
handler can still run after tasklet_kill(), which can lead to a kernel
panic. A fix for the core tasklet code was proposed and accepted
upstream, but it comes with the baggage of fixing quite a few bad users
of it. Also, for receive, we have an additional kthread to take care of.

The bug fix done as part of Orabug 2446085 made an additional
assumption that the reaping code won't reap all the CQEs after the QP
has moved to the error state, which was not correct. The QP is moved to
the error state as part of rdma_disconnect(), and all the CQEs are
reaped by the loop properly. Any handler that runs after that point and
tries to access the qp/cq resources is exposed to race conditions.

This patch fixes the race by making sure that the handlers return
without taking any action post teardown.

Orabug: 24460805
Reviewed-by: Wengang
Signed-off-by: Santosh Shilimkar
---

diff --git a/net/rds/ib.h b/net/rds/ib.h
index e5d84fba0c3ac..aabdcbc9019f0 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -233,6 +233,7 @@ struct rds_ib_connection {
 	spinlock_t		i_rx_lock;
 	unsigned int		i_rx_wait_for_handler;
 	atomic_t		i_worker_has_rx;
+	atomic_t		i_cq_quiesce;
 };
 
 /* This assumes that atomic_t is at least 32 bits */
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 13a3ef4e54d7b..7102520da75fc 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -309,6 +309,7 @@ void rds_ib_cm_connect_complete(struct rds_connection *conn, struct rdma_cm_even
 	}
 
 	ic->i_sl = ic->i_cm_id->route.path_rec->sl;
+	atomic_set(&ic->i_cq_quiesce, 0);
 
 	/*
 	 * Init rings and fill recv. this needs to wait until protocol negotiation
@@ -471,8 +472,8 @@ void rds_ib_tasklet_fn_send(unsigned long data)
 	memset(&ack_state, 0, sizeof(ack_state));
 	rds_ib_stats_inc(s_ib_tasklet_call);
 
-	/* if send cq has been destroyed, ignore incoming cq event */
-	if (!ic->i_scq)
+	/* if cq has been already reaped, ignore incoming cq event */
+	if (atomic_read(&ic->i_cq_quiesce))
 		return;
 
 	poll_cq(ic, ic->i_scq, ic->i_send_wc, &ack_state, 0);
@@ -501,6 +502,10 @@ static void rds_ib_rx(struct rds_ib_connection *ic)
 
 	rds_ib_stats_inc(s_ib_tasklet_call);
 
+	/* if cq has been already reaped, ignore incoming cq event */
+	if (atomic_read(&ic->i_cq_quiesce))
+		return;
+
 	memset(&ack_state, 0, sizeof(ack_state));
 
 	ic->i_rx_poll_cq = 0;
@@ -1185,6 +1190,8 @@ void rds_ib_conn_shutdown(struct rds_connection *conn)
 		tasklet_kill(&ic->i_stasklet);
 		tasklet_kill(&ic->i_rtasklet);
 
+		atomic_set(&ic->i_cq_quiesce, 1);
+
 		/* first destroy the ib state that generates callbacks */
 		if (ic->i_cm_id->qp)
 			rdma_destroy_qp(ic->i_cm_id);
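
For readers who want the flag lifecycle in isolation, the quiesce pattern used by the patch can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: `struct demo_conn` and the `demo_*` functions are hypothetical stand-ins for `struct rds_ib_connection`, `rds_ib_cm_connect_complete()`, the tasklet handlers and `rds_ib_conn_shutdown()`, and C11 `<stdatomic.h>` stands in for the kernel's `atomic_t` helpers. It shows only the order of the flag updates and the early return; it does not model tasklet or kthread concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rds_ib_connection. */
struct demo_conn {
	atomic_int cq_quiesce;	/* plays the role of i_cq_quiesce */
	int *cq;		/* stand-in for a CQ; NULL once "destroyed" */
	int polled;		/* how many times the handler touched the CQ */
};

/* A connection that has never been connected starts out quiesced. */
static void demo_conn_init(struct demo_conn *c)
{
	atomic_init(&c->cq_quiesce, 1);
	c->cq = NULL;
	c->polled = 0;
}

/* Connect-complete path: resources are valid, so clear the flag. */
static void demo_connect_complete(struct demo_conn *c, int *cq)
{
	c->cq = cq;
	atomic_store(&c->cq_quiesce, 0);
}

/*
 * Completion handler: return without any action once the connection is
 * quiesced. Returns true only if it actually touched the CQ stand-in.
 */
static bool demo_handler(struct demo_conn *c)
{
	if (atomic_load(&c->cq_quiesce))
		return false;

	assert(c->cq != NULL);	/* only reachable while resources are valid */
	*c->cq += 1;
	c->polled++;
	return true;
}

/* Shutdown path: set the flag first, then tear the resources down. */
static void demo_shutdown(struct demo_conn *c)
{
	atomic_store(&c->cq_quiesce, 1);
	c->cq = NULL;
}
```

In this sketch a handler invoked before `demo_connect_complete()` or after `demo_shutdown()` returns immediately and never dereferences `cq`, which is the behaviour the patch aims for in handlers that run after teardown.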