When supplying a module parameter value for the fail-over group as:
ib0,ib1;ib2,ib3;ib4,ib5;ib6,ib7;ib8,ib9
rdmaip / rds_rdma is unable to parse it correctly. The debug prints
show that parsing stops after the third interface:
lab38 kernel: RDS/IB: ib0 is designated group 1
lab38 kernel: RDS/IB: ib1 is designated group 1
lab38 kernel: RDS/IB: ib2 is designated group 2
(no more prints).
This implies that on Xn-8 systems, fail-over is not distributed
correctly. We see:
lab38 kernel: RDS/IP: IP 192.2.23.100 migrated from ib0 to ib1:P01
lab38 kernel: RDS/IP: IP 192.2.23.102 migrated from ib2 to ib1:P03
lab38 kernel: RDS/IP: IP 192.2.23.104 migrated from ib4 to ib3:P05
lab38 kernel: RDS/IP: IP 192.2.23.106 migrated from ib6 to ib3:P07
lab38 kernel: RDS/IP: IP 192.2.23.108 migrated from ib8 to ib3:P09
That is, ib0 and ib2 both fail over to ib1, and ib4, ib6, and ib8 all
fail over to ib3, instead of each port failing over to the other port
in its own group.
This commit fixes the parsing. The following module parameter values
have been tested (in user space):
ib0
ib0,ib1
ib0,ib1,ib2
ib0,ib1,ib2,ib3
ib0;ib1
ib0,ib1;ib2,ib3
ib0,ib1,ib2;ib3,ib4,ib5
ib0,ib1,ib2,ib3;ib4,ib5,ib6,ib7
ib0;ib1;ib2
ib0,ib1;ib2,ib3;ib4,ib5
ib0,ib1,ib2;ib3,ib4,ib5;ib6,ib7,ib8
ib0,ib1,ib2,ib3;ib4,ib5,ib6,ib7;ib8,ib9,ib10,ib11
ib0;ib1;ib2;ib3
ib0,ib1;ib2,ib3;ib4,ib5;ib6,ib7
ib0,ib1,ib2;ib3,ib4,ib5;ib6,ib7,ib8;ib9,ib10,ib11
ib0,ib1,ib2,ib3;ib4,ib5,ib6,ib7;ib8,ib9,ib10,ib11;ib12,ib13,ib14,ib15
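
For illustration, the expected grouping of values like the ones above
can be sketched in user-space C. This is not the code from this commit;
it is a minimal sketch assuming that ';' separates groups, ',' separates
the ports within a group, and groups are numbered from 1 as in the debug
prints. The names parse_failover_groups, struct port_group, MAX_PORTS
and IFNAME_LEN are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PORTS  32   /* hypothetical upper bound on ports */
#define IFNAME_LEN 16   /* hypothetical name buffer size, incl. NUL */

struct port_group {
	char name[IFNAME_LEN];
	int group;              /* 1-based fail-over group number */
};

/*
 * Parse a spec such as "ib0,ib1;ib2,ib3" into ports[].
 * Assumption: ',' separates ports within a group, ';' starts a new group.
 * Returns the number of ports parsed, or -1 on an empty name, a name
 * that does not fit, or too many ports.
 */
static int parse_failover_groups(const char *spec,
				 struct port_group *ports, int max_ports)
{
	const char *p = spec;
	int nports = 0;
	int group = 1;

	while (*p) {
		/* Length of the next name, up to a delimiter or end. */
		size_t len = strcspn(p, ",;");

		if (len == 0 || len >= IFNAME_LEN || nports >= max_ports)
			return -1;

		memcpy(ports[nports].name, p, len);
		ports[nports].name[len] = '\0';
		ports[nports].group = group;
		nports++;

		p += len;
		if (*p == ';')
			group++;        /* next port belongs to a new group */
		if (*p)
			p++;            /* step over the delimiter */
	}
	return nports;
}
```

Under these assumptions, the value at the top of this message yields ten
ports in five groups, with ib0 and ib1 in group 1 and ib8 and ib9 in
group 5.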
Orabug: 28198749
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
(inspired by uek-5-next commit
bf8cd0080482fa23ee859cc6118c964693cb3a72)
Signed-off-by: Brian Maly <brian.maly@oracle.com>