From: Vladimir Oltean Date: Wed, 4 Aug 2021 13:54:35 +0000 (+0300) Subject: net: dsa: sja1105: suppress TX packets from looping back in "H" topologies X-Git-Tag: nvme-5.15-2021-09-07~37^2~287^2~1 X-Git-Url: https://www.infradead.org/git/?a=commitdiff_plain;h=0f9b762c097c1816bba072fb44b9018a41e2e65b;p=nvme.git net: dsa: sja1105: suppress TX packets from looping back in "H" topologies H topologies like this one have a problem: eth0 eth1 | | CPU port CPU port | DSA link | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 -------- sw1p4 sw1p3 sw1p2 sw1p1 sw1p0 | | | | | | user user user user user user port port port port port port Basically any packet sent by the eth0 DSA master can be flooded on the interconnecting DSA link sw0p4 <-> sw1p4 and it will be received by the eth1 DSA master too. Basically we are talking to ourselves. In VLAN-unaware mode, these packets are encoded using a tag_8021q TX VLAN, which dsa_8021q_rcv() rightfully cannot decode and complains. Whereas in VLAN-aware mode, the packets are encoded with a bridge VLAN which _can_ be decoded by the tagger running on eth1, so it will attempt to reinject that packet into the network stack (the bridge, if there is any port under eth1 that is under a bridge). In the case where the ports under eth1 are under the same cross-chip bridge as the ports under eth0, the TX packets will even be learned as RX packets. The only thing that will prevent loops with the software bridging path, and therefore disaster, is that the source port and the destination port are in the same hardware domain, and the bridge will receive packets from the driver with skb->offload_fwd_mark = true and will not forward between the two. The proper solution to this problem is to detect H topologies and enforce that all packets are received through the local switch and we do not attempt to receive packets on our CPU port from switches that have their own. This is a viable solution which works thanks to the fact that MAC addresses which should be filtered towards the host are installed by DSA as static MAC addresses towards the CPU port of each switch. TX from a CPU port towards the DSA port continues to be allowed, this is because sja1105 supports bridge TX forwarding offload, and the skb->dev used initially for xmit does not have any direct correlation with where the station that will respond to that packet is connected. It may very well happen that when we send a ping through a br0 interface that spans all switch ports, the xmit packet will exit the system through a DSA switch interface under eth1 (say sw1p2), but the destination station is connected to a switch port under eth0, like sw0p0. So the switch under eth1 needs to communicate on TX with the switch under eth0. The response, however, will not follow the same path, but instead, this patch enforces that the response is sent by the first switch directly to its DSA master which is eth0. Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller --- diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index fffcaef6b148..b3b5ae3ef408 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -474,7 +474,9 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv) { struct sja1105_l2_forwarding_entry *l2fwd; struct dsa_switch *ds = priv->ds; + struct dsa_switch_tree *dst; struct sja1105_table *table; + struct dsa_link *dl; int port, tc; int from, to; @@ -547,6 +549,33 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv) } } + /* In odd topologies ("H" connections where there is a DSA link to + * another switch which also has its own CPU port), TX packets can loop + * back into the system (they are flooded from CPU port 1 to the DSA + * link, and from there to CPU port 2). Prevent this from happening by + * cutting RX from DSA links towards our CPU port, if the remote switch + * has its own CPU port and therefore doesn't need ours for network + * stack termination. + */ + dst = ds->dst; + + list_for_each_entry(dl, &dst->rtable, list) { + if (dl->dp->ds != ds || dl->link_dp->cpu_dp == dl->dp->cpu_dp) + continue; + + from = dl->dp->index; + to = dsa_upstream_port(ds, from); + + dev_warn(ds->dev, + "H topology detected, cutting RX from DSA link %d to CPU port %d to prevent TX packet loops\n", + from, to); + + sja1105_port_allow_traffic(l2fwd, from, to, false); + + l2fwd[from].bc_domain &= ~BIT(to); + l2fwd[from].fl_domain &= ~BIT(to); + } + /* Finally, manage the egress flooding domain. All ports start up with * flooding enabled, including the CPU port and DSA links. */