From: Venkat Venkatsubra
Date: Sat, 10 Nov 2018 16:01:47 +0000 (-0800)
Subject: net/rds: Fix endless RNR situation
X-Git-Tag: v4.1.12-124.31.3~418
X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=06e7043467a0a654045dd1bac79d95982a4de03b;p=users%2Fjedix%2Flinux-maple.git

net/rds: Fix endless RNR situation

Working with the following SRs:

Exadata SR# 3-15640329311
Linux SR#3-15675579325

it was discovered that inserting IB_SEND_SOLICITED at regular
intervals removed the endless RNR Retry situation.

The test was made by inserting IB_SEND_SOLICITED at the same interval
as IB_SEND_SIGNALED was inserted, that is, by default for every 17th
fragment.

This commit introduces the sysctl variable
net.rds.ib.max_unsolicited_wr. A value of zero disables the
functionality of inserting IB_SEND_SOLICITED. A value of N will insert
IB_SEND_SOLICITED for every Nth fragment.

net.rds.ib.max_unsolicited_wr is by default 16, in order to avoid
customization when this fix is applied at the customer site.

This fix also has the nice side-effect that it improves IOPS for 1Q,
1D, 1T cases:

-q 1M -a 256:

Without fix:

tsks   tx/s   rx/s  tx+rx K/s   mbi K/s   mbo K/s  tx us/c   rtt us   cpu %
   1   1161      0 1189243.20      0.00      0.00   203.52   857.34   -1.00  (average)

With fix (with default net.rds.ib.max_unsolicited_wr = 16):

tsks   tx/s   rx/s  tx+rx K/s   mbi K/s   mbo K/s  tx us/c   rtt us   cpu %
   1   1323      0 1355849.36      0.00      0.00   203.76   751.50   -1.00  (average)

-q $[32*1024+256] -a 256:

With fix (net.rds.ib.max_unsolicited_wr = 0, i.e. disabled):

tsks   tx/s   rx/s  tx+rx K/s   mbi K/s   mbo K/s  tx us/c   rtt us   cpu %
   1  15243      0  492547.75      0.00      0.00    10.58    62.01   -1.00  (average)

Ditto with net.rds.ib.max_unsolicited_wr = 4 (two SEND_SOLICITED per ~32K):

tsks   tx/s   rx/s  tx+rx K/s   mbi K/s   mbo K/s  tx us/c   rtt us   cpu %
   1  16422      0  530641.03      0.00      0.00    10.28    57.25   -1.00  (average)

Orabug: 28857027

Reviewed-by: Håkon Bugge
Signed-off-by: Brian Maly
---

diff --git a/net/rds/ib.h b/net/rds/ib.h
index 07daeb4df6c0e..bd8eea05cb859 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -255,6 +255,9 @@ struct rds_ib_connection {
 
 	/* Batched completions */
 	unsigned int		i_unsignaled_wrs;
+
+	/* Wake up receiver once in a while */
+	unsigned int		i_unsolicited_wrs;
 	u8			i_sl;
 
 	atomic_t		i_cache_allocs;
@@ -747,6 +750,7 @@ void rds_ib_sysctl_exit(void);
 extern unsigned long rds_ib_sysctl_max_send_wr;
 extern unsigned long rds_ib_sysctl_max_recv_wr;
 extern unsigned long rds_ib_sysctl_max_unsig_wrs;
+extern unsigned long rds_ib_sysctl_max_unsolicited_wrs;
 extern unsigned long rds_ib_sysctl_max_unsig_bytes;
 extern unsigned long rds_ib_sysctl_max_recv_allocation;
 extern unsigned int rds_ib_sysctl_flow_control;
diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c
index 462d88ee6a4d1..5d1a127de6cf4 100644
--- a/net/rds/ib_send.c
+++ b/net/rds/ib_send.c
@@ -535,9 +535,15 @@ static inline int rds_ib_set_wr_signal_state(struct rds_ib_connection *ic,
 	if (ic->i_unsignaled_wrs-- == 0 || notify) {
 		ic->i_unsignaled_wrs = rds_ib_sysctl_max_unsig_wrs;
 		send->s_wr.send_flags |= IB_SEND_SIGNALED;
-		return 1;
 	}
-	return 0;
+
+	/* To keep the rx pipeline going, add SEND_SOLIICITED once in a while */
+	if (rds_ib_sysctl_max_unsolicited_wrs && --ic->i_unsolicited_wrs == 0) {
+		ic->i_unsolicited_wrs = rds_ib_sysctl_max_unsolicited_wrs;
+		send->s_wr.send_flags |= IB_SEND_SOLICITED;
+	}
+
+	return !!(send->s_wr.send_flags & IB_SEND_SIGNALED);
 }
 
 /*
@@ -642,6 +648,7 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
 			rm->data.op_count = 0;
 		}
 
+		ic->i_unsolicited_wrs = rds_ib_sysctl_max_unsolicited_wrs;
 		rds_message_addref(rm);
 		rm->data.op_dmasg = 0;
 		rm->data.op_dmaoff = 0;
diff --git a/net/rds/ib_sysctl.c b/net/rds/ib_sysctl.c
index 5515ee743acf7..2e08386548b5f 100644
--- a/net/rds/ib_sysctl.c
+++ b/net/rds/ib_sysctl.c
@@ -49,6 +49,16 @@ unsigned long rds_ib_sysctl_max_unsig_wrs = 16;
 static unsigned long rds_ib_sysctl_max_unsig_wr_min = 1;
 static unsigned long rds_ib_sysctl_max_unsig_wr_max = 64;
 
+unsigned long rds_ib_sysctl_max_unsolicited_wrs = 16;
+
+/* Zero means inserting SEND_SOLICITED in the middle of an RDS message
+ * is disabled
+ */
+static unsigned long rds_ib_sysctl_max_unsolicited_wr_min;
+/* Nmbr frags of 1MB + 256B RDBMS hdr */
+static unsigned long rds_ib_sysctl_max_unsolicited_wr_max =
+	(1 * 1024 * 1024 + RDS_FRAG_SIZE) / RDS_FRAG_SIZE;
+
 /*
  * This sysctl does nothing.
  *
@@ -106,6 +116,15 @@ static struct ctl_table rds_ib_sysctl_table[] = {
 		.extra1		= &rds_ib_sysctl_max_unsig_wr_min,
 		.extra2		= &rds_ib_sysctl_max_unsig_wr_max,
 	},
+	{
+		.procname       = "max_unsolicited_wr",
+		.data		= &rds_ib_sysctl_max_unsolicited_wrs,
+		.maxlen         = sizeof(unsigned long),
+		.mode           = 0644,
+		.proc_handler   = &proc_doulongvec_minmax,
+		.extra1		= &rds_ib_sysctl_max_unsolicited_wr_min,
+		.extra2		= &rds_ib_sysctl_max_unsolicited_wr_max,
+	},
 	{
 		.procname       = "max_recv_allocation",
 		.data		= &rds_ib_sysctl_max_recv_allocation,
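
Note on the counter arithmetic: the intervals quoted in the commit message (IB_SEND_SIGNALED on every 17th fragment by default, IB_SEND_SOLICITED on every Nth fragment, two solicited sends for a ~32K message at N = 4) can be checked with a small userspace model of the two counters. This is a sketch, not kernel code. struct wr_counters, struct wr_flags, next_fragment_flags(), frag_count(), count_solicited() and first_signaled() are hypothetical names; the 4096-byte fragment size used in the note that follows is an assumed value of RDS_FRAG_SIZE; the counters are assumed to start at their sysctl values; and only the helper touched by the ib_send.c hunk is modelled, not any other place where the send path may set these flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two per-connection counters. */
struct wr_counters {
	unsigned int unsignaled_wrs;	/* models ic->i_unsignaled_wrs */
	unsigned int unsolicited_wrs;	/* models ic->i_unsolicited_wrs */
};

struct wr_flags {
	bool signaled;
	bool solicited;
};

/* Same counter logic as the patched rds_ib_set_wr_signal_state(). */
static struct wr_flags next_fragment_flags(struct wr_counters *c,
					   unsigned int max_unsig,
					   unsigned int max_unsol,
					   bool notify)
{
	struct wr_flags f = { false, false };

	/* Post-decrement: fires on the call that finds the counter at 0. */
	if (c->unsignaled_wrs-- == 0 || notify) {
		c->unsignaled_wrs = max_unsig;
		f.signaled = true;
	}
	/* Pre-decrement: fires on the call that brings the counter to 0. */
	if (max_unsol && --c->unsolicited_wrs == 0) {
		c->unsolicited_wrs = max_unsol;
		f.solicited = true;
	}
	return f;
}

/* Number of fragments needed for a message of `bytes` bytes (rounds up). */
static unsigned int frag_count(unsigned long bytes, unsigned long frag_size)
{
	assert(frag_size != 0);
	return (unsigned int)((bytes + frag_size - 1) / frag_size);
}

/* How many fragments of one message would carry the solicited flag. */
static unsigned int count_solicited(unsigned int nfrags, unsigned int max_unsol)
{
	/* Solicited counter is reset per message, as in the rds_ib_xmit() hunk. */
	struct wr_counters c = { 16, max_unsol };
	unsigned int i, n = 0;

	for (i = 0; i < nfrags; i++)
		if (next_fragment_flags(&c, 16, max_unsol, false).solicited)
			n++;
	return n;
}

/* 1-based index of the first signaled fragment, or 0 if none in nfrags. */
static unsigned int first_signaled(unsigned int nfrags, unsigned int max_unsig)
{
	struct wr_counters c = { max_unsig, 0 };
	unsigned int i;

	for (i = 1; i <= nfrags; i++)
		if (next_fragment_flags(&c, max_unsig, 0, false).signaled)
			return i;
	return 0;
}
```

Under these assumptions, frag_count(32 * 1024 + 256, 4096) is 9 and count_solicited(9, 4) is 2, which agrees with the "two SEND_SOLICITED per ~32K" remark, and first_signaled(64, 16) is 17. The difference between the two intervals (17 versus N) comes from the post-decrement in the signaled test and the pre-decrement in the solicited test.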