net/rds: prevent RDS connections using stale ARP entries
Active bonding fallback is an asynchronous event in the cluster. Thus,
there is a potential race where a host has sent out a CM REQ, using a
wrong remote gid, before IP migration happens at the remote end. The
side effect of this issue causes RDS connections only use one physical
IB port. When that particular IB port goes down, it causes send
completion error on those RDS connections and eventually increases the
brownout time.
To avoid this phenomenon, RDS has to reject all incoming CM REQ after
RDMA_CM_EVENT_DISCONNECTED or RDMA_CM_EVENT_ADDR_CHANGE events until
route resolution has been performed locally. By doing so, we can
ensure that the gratituos ARPs are sent to the remote end and RDS
connections are established based on the correct ARP entries.
Orabug:
28149101
Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Tested-by: Dib Chatterjee <dib.chatterjee@oracle.com>
(cherry picked from commit
d7f00e2a0d30b0f14a763d96e6c8ae2ddb40a281
repo https://linux-git.us.oracle.com/UEK/linux-wguay-public)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
net/rds/rds.h
net/rds/ib_cm.c
Fixed checkpatch issues.
Signed-off-by: HÃ¥kon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>