From: Sowmini Varadhan
Date: Tue, 12 Apr 2016 00:17:37 +0000 (-0700)
Subject: RDS: TOS fixes in failure paths when RDS-TCP and RDS-RDMA are run together
X-Git-Tag: v4.1.12-92~175^2~1
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=05becdc96f9cd6af9bf0eae93e9b3bfee6e3eb1e;p=users%2Fjedix%2Flinux-maple.git

RDS: TOS fixes in failure paths when RDS-TCP and RDS-RDMA are run together

Orabug 20930687

When errors such as connection hangs or failures are encountered over
RDS-TCP, the sending RDS, in an attempt at HA, will try to reconnect,
and trip up on all sorts of data structures intended for ToS support.
The ToS feature is currently only supported for RDS-IB, and
unplanned/untested usage of these data structures by RDS-TCP causes
deadlocks and panics.

Until we properly design, support, and test the ToS feature for RDS-TCP,
such paths should not be wandered into. Thus this patchset adds
defensive checks to ignore rs_tos settings in rds_sendmsg() for TCP
transports, and prevents the sending of ToS heartbeat pings in
rds_send_hb() for TCP transport.

For reference, the deadlock that can be encountered in the hb ping path is:

 [] do_tcp_setsockopt+0x244/0x710    <-- wants sock_lock
 [] ? __alloc_pages_nodemask+0x12d/0x230
 [] ? rb_insert_color+0x9d/0x160
 [] ? native_sched_clock+0x13/0x80
 [] ? sched_clock+0x9/0x10
 [] ? trace_clock_local+0x9/0x10
 [] ? rb_reserve_next_event+0x67/0x480
 [] ? __alloc_and_insert_iova_range+0x17f/0x1f0
 [] ? get_page_from_freelist+0x1e2/0x550
 [] tcp_setsockopt+0x2a/0x30
 [] sock_common_setsockopt+0x14/0x20
 [] rds_tcp_xmit_prepare+0x5d/0x70 [rds_tcp]
 [] rds_send_xmit+0xe5/0x860 [rds]
 [] rds_send_hb+0xcd/0x130 [rds]
 [] rds_recv_local+0x20b/0x330 [rds]
 [] rds_recv_incoming+0x7d/0x290 [rds]
 [] ? native_sched_clock+0x13/0x80
 [] ? sched_clock+0x9/0x10
 [] rds_tcp_data_recv+0x316/0x440 [rds_tcp]
 [] tcp_read_sock+0xda/0x230
 [] ? rds_tcp_recv+0x60/0x60 [rds_tcp]
 [] ? sock_get_timestamp+0xc0/0xc0
 [] rds_tcp_read_sock+0x4d/0x60 [rds_tcp]
 [] rds_tcp_data_ready+0x7a/0xd0 [rds_tcp]
 [] tcp_data_queue+0x2fe/0xac0
 [] tcp_rcv_established+0x349/0x740
 [] tcp_v4_do_rcv+0x125/0x1f0
 [] tcp_v4_rcv+0x597/0x830    <---- holds sock_lock
 [] ? __tcp_ack_snd_check+0x5e/0xa0
 [] ? tcp_rcv_established+0x36d/0x740
 [] ip_local_deliver_finish+0xdd/0x2a0
 [] ip_local_deliver+0x80/0x90
 [] ip_rcv_finish+0x105/0x370

Signed-off-by: Sowmini Varadhan
Acked-by: Santosh Shilimkar
---

diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index a94317dfdcff..9aa99921ce29 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -228,6 +228,10 @@ static int rds_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		if (get_user(tos, (rds_tos_t __user *)arg))
 			return -EFAULT;
 
+		if (rs->rs_transport &&
+		    rs->rs_transport->t_type == RDS_TRANS_TCP)
+			tos = 0;
+
 		spin_lock_bh(&rds_sock_lock);
 		if (rs->rs_tos || rs->rs_conn) {
 			spin_unlock_bh(&rds_sock_lock);
diff --git a/net/rds/send.c b/net/rds/send.c
index 5274d4e2fd61..5bc6400ae368 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1606,6 +1606,9 @@ rds_send_hb(struct rds_connection *conn, int response)
 	unsigned long flags;
 	int ret = 0;
 
+	if (conn->c_trans->t_type == RDS_TRANS_TCP)
+		return 0;
+
 	rm = rds_message_alloc(0, GFP_ATOMIC);
 	if (!rm)
 		return -ENOMEM;
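
Reviewer note: the two guards in this patch can be modelled in user space roughly as follows. This is only a sketch and not kernel code. Every `sketch_*` and `SKETCH_*` name is a hypothetical stand-in invented for illustration; only the field names `rs_transport`, `rs_tos`, `c_trans`, and `t_type` are borrowed from the diff above, and the `hb_sent` counter is an assumption added so the effect is observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's transport type constants. */
enum sketch_trans_type { SKETCH_TRANS_IB, SKETCH_TRANS_TCP };

struct sketch_transport {
	enum sketch_trans_type t_type;
};

struct sketch_sock {
	const struct sketch_transport *rs_transport; /* NULL until bound */
	unsigned char rs_tos;
};

struct sketch_conn {
	const struct sketch_transport *c_trans;
	int hb_sent; /* assumption: counts pings that would be queued */
};

/* Models the rds_ioctl() hunk: a ToS request on a TCP socket becomes 0. */
static unsigned char sketch_effective_tos(const struct sketch_sock *rs,
					  unsigned char requested)
{
	if (rs->rs_transport && rs->rs_transport->t_type == SKETCH_TRANS_TCP)
		return 0;
	return requested;
}

/* Models the rds_send_hb() hunk: report success without sending on TCP. */
static int sketch_send_hb(struct sketch_conn *conn)
{
	assert(conn != NULL && conn->c_trans != NULL);

	if (conn->c_trans->t_type == SKETCH_TRANS_TCP)
		return 0; /* never reaches the transmit path */

	conn->hb_sent++; /* stand-in for allocating and queueing the ping */
	return 0;
}
```

In this model, calling `sketch_send_hb()` on a connection whose `c_trans` is TCP leaves `hb_sent` unchanged, which corresponds to the heartbeat never reaching the transmit path that, in the trace above, tries to take the socket lock already held by the receive path. An unbound socket (`rs_transport == NULL`) keeps the requested ToS, matching the NULL check in the first hunk.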