From: Håkon Bugge
Date: Wed, 2 Jan 2019 13:59:35 +0000 (+0100)
Subject: rds: ib: Use a delay when reconnecting to the very same IP address
X-Git-Tag: v4.1.12-124.31.3~322
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=779f68d76eda7f34ce694485f0a99d08896c2b52;p=users%2Fjedix%2Flinux-maple.git

rds: ib: Use a delay when reconnecting to the very same IP address

An RDS IB connection may be formed from the very same IB port using
HCA level internal loop-back. If this connection attempt is made
after RDS has cleared the ARP cache entry for that same IP address,
an ARP IB multicast is sent out on the IPoIB interface.

If this happens on IPoIB interfaces that are members of an IB Limited
Partition, the ARP multicast is dropped by the HCA port. A
corresponding PKey Violation is counted and a corresponding PKey
Violation Trap is sent to the OpenSM, subject to rate control.

Now, due to a bug in RDS connection management, which did not
anticipate that the peers of a connection could be the very same port
and have the same IP address, the reconnect attempts happen with zero
delay. This leads to about 7700 connection attempts per second, about
4400 PKey Violations per second, and 8500 ARP multicasts per second.

This commit reduces the reconnect rate to one attempt per second.
This is because RDS uses exponential backoff to calculate the delay,
which quickly reaches rds_sysctl_reconnect_max_jiffies, which by
default is HZ; in other words, a delay of one second after the first
10 reconnects.

Orabug: 29138813

Signed-off-by: Håkon Bugge
Reviewed-by: Ka-cheong Poon
---
v1 -> v2:
   * Amended commit message as per Ka-Cheong's suggestions

Signed-off-by: Brian Maly
---

diff --git a/net/rds/threads.c b/net/rds/threads.c
index 12ac53c360cb..d828f1be63f7 100644
--- a/net/rds/threads.c
+++ b/net/rds/threads.c
@@ -134,7 +134,6 @@ EXPORT_SYMBOL_GPL(rds_connect_complete);
  */
 void rds_queue_reconnect(struct rds_conn_path *cp)
 {
-	unsigned long rand;
 	struct rds_connection *conn = cp->cp_conn;
 	bool is_tcp = conn->c_trans->t_type == RDS_TRANS_TCP;
 
@@ -154,17 +153,16 @@ void rds_queue_reconnect(struct rds_conn_path *cp)
 		return;
 	}
 
-	get_random_bytes(&rand, sizeof(rand));
 	rds_rtd_ptr(RDS_RTD_CM_EXT,
-		    "%lu delay %lu ceil conn %p for %pI6c -> %pI6c tos %d\n",
-		    rand % cp->cp_reconnect_jiffies, cp->cp_reconnect_jiffies,
-		    conn, &conn->c_laddr, &conn->c_faddr, conn->c_tos);
+		    "delay %lu conn %p for %pI6c -> %pI6c tos %d\n",
+		    cp->cp_reconnect_jiffies, conn, &conn->c_laddr,
+		    &conn->c_faddr, conn->c_tos);
 
-	if (rds_addr_cmp(&conn->c_laddr, &conn->c_faddr) >= 0)
+	if (rds_addr_cmp(&conn->c_laddr, &conn->c_faddr) > 0)
 		queue_delayed_work(cp->cp_wq, &cp->cp_conn_w, 0);
 	else
 		queue_delayed_work(cp->cp_wq, &cp->cp_conn_w,
-				   cp->cp_reconnect_jiffies);
+				   cp->cp_reconnect_jiffies);
 
 	cp->cp_reconnect_jiffies = min(cp->cp_reconnect_jiffies * 2,
 				       rds_sysctl_reconnect_max_jiffies);
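
Reviewer note: the backoff arithmetic in the commit message can be checked with a small user-space sketch. This is not kernel code. The sketch_* names are hypothetical, and both the value of HZ (1000) and the starting backoff of one jiffy are assumptions chosen for illustration; only the doubling, the cap at the maximum, and the "> 0" comparison come from the patch.

```c
/* User-space sketch of the reconnect delay logic; NOT kernel code. */

/* Assumed values for illustration only; they do not come from the patch. */
#define SKETCH_HZ          1000UL     /* assumption: 1000 jiffies per second */
#define SKETCH_MIN_JIFFIES 1UL        /* assumption: initial backoff of 1 jiffy */
#define SKETCH_MAX_JIFFIES SKETCH_HZ  /* commit message: the maximum defaults to HZ */

/*
 * Delay for the next reconnect attempt. Mirrors the patched comparison:
 * only a strictly "greater" local address reconnects immediately, so the
 * loop-back case (local address == peer address, addr_cmp == 0) now waits.
 */
static unsigned long sketch_delay(int addr_cmp, unsigned long cur_jiffies)
{
	return addr_cmp > 0 ? 0UL : cur_jiffies;
}

/* Exponential backoff step: double the delay, capped at the maximum. */
static unsigned long sketch_next_backoff(unsigned long cur_jiffies)
{
	unsigned long next = cur_jiffies * 2;

	return next < SKETCH_MAX_JIFFIES ? next : SKETCH_MAX_JIFFIES;
}

/* Count how many backoff steps it takes to reach the cap. */
static int sketch_steps_until_cap(void)
{
	unsigned long cur = SKETCH_MIN_JIFFIES;
	int steps = 0;

	while (cur < SKETCH_MAX_JIFFIES) {
		cur = sketch_next_backoff(cur);
		steps++;
	}
	return steps;
}
```

Under these assumptions the delay sequence is 1, 2, 4, ... 512 jiffies and then stays at the cap, which is consistent with the one-second delay after the first 10 reconnects described in the commit message. A different HZ or starting value would change the step count.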