When RDS performs a failback, it migrates the IP address to the target
IB device only after a delay of several seconds. During that delay the
target IB port can go down again, so the state of the target port must
be checked before migrating. The following log shows an example.
"
...
May 11 10:37:23 kernel: mlx4_core 0000:03:00.0: mlx4_ib: Port 2 logical link is down
May 11 10:37:23 kernel: RDS/IP: IP 192.168.10.2 migrated from ib1 to ib0:P02
May 11 10:38:18 kernel: mlx4_core 0000:03:00.0: mlx4_ib: Port 2 logical link is up
May 11 10:38:23 kernel: mlx4_core 0000:03:00.0: mlx4_ib: Port 2 logical link is down
May 11 10:38:28 kernel: RDS/IP: IP 192.168.10.2 migrated from ib0:P02 to ib1
...
"
Failback starts when ib1 comes up at 10:38:18, but it takes effect only
after a delay of several seconds. ib1 goes down again at 10:38:23,
within that delay, and at 10:38:28 RDS still migrates the IP address to
ib1, an IB device that is down.
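
As a rough illustration of the check described above, the following
standalone sketch models the decision in user space. The enum and the
function name failback_allowed() are hypothetical stand-ins, not the
kernel's definitions; only the idea of rejecting both the INIT and the
DOWN state comes from this patch.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the RDS IB port states named in the patch. */
enum port_state { PORT_INIT, PORT_UP, PORT_DOWN };

/*
 * Decide whether a delayed failback may proceed.
 * Returns 1 if the target port is usable, 0 if the failback must be
 * skipped because the port is still in INIT or has gone DOWN again.
 */
static int failback_allowed(enum port_state target)
{
    if (target == PORT_INIT || target == PORT_DOWN) {
        fprintf(stderr,
                "failback request with port_state in %s state!\n",
                target == PORT_INIT ? "INIT" : "DOWN");
        return 0;
    }
    return 1;
}
```

In the scenario from the log, the delayed work would see ib1 in the
DOWN state when it finally runs, so failback_allowed(PORT_DOWN) returns
0 and the IP address stays on ib0:P02.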
Orabug: 28097129
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
container_of(_work, struct rds_ib_port_ud_work, work.work);
u8 i, ip_active_port, port = work->port;
- if (ip_config[port].port_state == RDS_IB_PORT_INIT) {
+ if ((ip_config[port].port_state == RDS_IB_PORT_INIT) ||
+ (ip_config[port].port_state == RDS_IB_PORT_DOWN)) {
printk(KERN_ERR "RDS/IB: devname %s failback request "
- "with port_state in INIT state!",
- ip_config[port].dev->name);
+ "with port_state in %s state!",
+ ip_config[port].dev->name,
+ ip_config[port].port_state == RDS_IB_PORT_INIT ?
+ "INIT" : "DOWN");
goto out;
}