Eric Dumazet [Sun, 5 Dec 2021 04:21:57 +0000 (20:21 -0800)]
net: add net device refcount tracker infrastructure
net device are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.
The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.
This patch adds dev_hold_track() and dev_put_track().
To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.
It can be hard to track where references are taken and released.
In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.
This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.
This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.
This adds both cpu and memory costs, and thus should probably be
used with care.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Manish Chopra [Thu, 2 Dec 2021 21:01:57 +0000 (13:01 -0800)]
qed*: esl priv flag support through ethtool
ESL(Enhanced System Lockdown) was designed to lock PCI adapter firmware
images and prevent changes to critical non-volatile configuration data
so that uncontrolled, malicious or unintentional modification to the
adapters are avoided, ensuring it's operational state. Once this feature is
enabled, the device is locked, rejecting any modification to non-volatile
images. Once unlocked, the protection is off such that firmware and
non-volatile configurations may be altered.
Driver just reflects the capability and status of this through
the ethtool private flag.
Manish Chopra [Thu, 2 Dec 2021 21:01:56 +0000 (13:01 -0800)]
qed*: enhance tx timeout debug info
This patch add some new qed APIs to query status block
info and report various data to MFW on tx timeout event
Along with that it enhances qede to dump more debug logs
(not just specific to the queue which was reported by stack)
on tx timeout which includes various other basic metadata about
all tx queues and other info (like status block etc.)
Dan Carpenter [Fri, 3 Dec 2021 09:55:31 +0000 (12:55 +0300)]
net: lan966x: fix a IS_ERR() vs NULL check in lan966x_create_targets()
The devm_ioremap() function does not return error pointers. It returns
NULL.
Fixes: db8bcaad5393 ("net: lan966x: add the basic lan966x driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 3 Dec 2021 07:04:18 +0000 (15:04 +0800)]
net: prestera: acl: fix return value check in prestera_acl_rule_entry_find()
rhashtable_lookup_fast() returns NULL pointer not ERR_PTR().
Return rhashtable_lookup_fast() directly to fix this.
Fixes: 47327e198d42 ("net: prestera: acl: migrate to new vTCAM api") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Thu, 2 Dec 2021 21:00:29 +0000 (23:00 +0200)]
net: dsa: vsc73xxx: Get rid of duplicate of_node assignment
GPIO library does copy the of_node from the parent device of
the GPIO chip, there is no need to repeat this in the individual
drivers. Remove assignment here.
For the details one may look into the of_gpio_dev_init() implementation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Mi [Wed, 1 Dec 2021 13:31:53 +0000 (15:31 +0200)]
net/sched: act_ct: Offload only ASSURED connections
Short-lived connections increase the insertion rate requirements,
fill the offload table and provide very limited offload value since
they process a very small amount of packets. The ct ASSURED flag is
designed to filter short-lived connections for early expiration.
Offload connections when they are ESTABLISHED and ASSURED.
Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Fri, 3 Dec 2021 09:20:59 +0000 (17:20 +0800)]
net: hns3: fix hns3 driver header file not self-contained issue
The hns3 driver header file uses the structure of other files, but does
not include corresponding file, which causes a check warning that the
header file is not self-contained.
Therefore, the required header file is included in the header file, and
the structure declaration is added to the header file to avoid cyclic
dependency of the header file.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:58 +0000 (17:20 +0800)]
net: hns3: replace one tab with space in for statement
Replace one tab with space between symbol ')' and '{' in for statement of
function hclge_map_tqp().
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:57 +0000 (17:20 +0800)]
net: hns3: remove rebundant line for hclge_dbg_dump_tm_pg()
Return value judgment should follow the function call, so remove line
between them.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:56 +0000 (17:20 +0800)]
net: hns3: add comments for hclge_dbg_fill_content()
When we use hclge_dbg_fill_content() to fill contents with
specific format according to struct hclge_dbg_item *items,
it may cause content cover due to unreasonable items.
So add comments to explain how to avoid it.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:55 +0000 (17:20 +0800)]
net: hns3: add void before function which don't receive ret
Add void before function which don't receive ret to improve code
readability.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:54 +0000 (17:20 +0800)]
net: hns3: align return value type of atomic_read() with its output
Change output value type of atomic_read() from %u to %d.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:52 +0000 (17:20 +0800)]
net: hns3: Align type of some variables with their print type
The c language has a set of implicit type conversions, when
two variables perform bitwise or arithmetic operations.
For example, variable A (type u16/u8) -1, its output is int type variable.
u16/u8 will convert to int type implicitly before it does arithmetic
operations. So, change 1 to unsigned type.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Guangbin Huang [Fri, 3 Dec 2021 09:20:50 +0000 (17:20 +0800)]
net: hns3: refactor function hclge_set_vlan_filter_hw
Function hclge_set_vlan_filter_hw() is a bit too long, so add a new
function hclge_need_update_port_vlan() to simplify code and improve
code readability.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Fri, 3 Dec 2021 09:20:49 +0000 (17:20 +0800)]
net: hns3: optimize function hclge_cfg_common_loopback()
hclge_cfg_common_loopback() is a bit too long, so
encapsulate hclge_cfg_common_loopback_cmd_send() and
hclge_cfg_common_loopback_wait() two functions to
improve readability.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 3 Dec 2021 03:00:17 +0000 (19:00 -0800)]
Merge tag 'mlx5-updates-2021-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-12-02
Misc updates to mlx5 driver
1) Various code cleanups
2) Error path handling fixes of latest features
3) Print more information on pci error handling
4) Dynamically resize flow counters query buffer
====================
The flow counters bulk query buffer is allocated once during
mlx5_fc_init_stats(). For PFs and VFs this buffer usually takes a little
more than 512KB of memory, which is aligned to the next power of 2, to
1MB. For SFs, this buffer is reduced and takes around 128 Bytes.
The buffer size determines the maximum number of flow counters that
can be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.
There are cases that don't use many flow counters and don't need a big
buffer (e.g. SFs, VFs). Since this size is critical with large scale,
in these cases the buffer size should be reduced.
In order to reduce memory consumption while maintaining query
performance, change the query buffer's allocation scheme to the
following:
- First allocate the buffer with small initial size.
- If the number of counters surpasses the initial size, resize the
buffer to the maximum size.
The buffer only grows and isn't shrank, because users with many flow
counters don't care about the buffer size and we don't want to add
resize overhead if the current number of counters drops.
This solution is preferable to the current one, which is less accurate
and only addresses SFs.
Roi Dayan [Tue, 9 Nov 2021 12:07:49 +0000 (14:07 +0200)]
net/mlx5e: TC, Set flow attr ip_version earlier
Setting flow attr ip_version is not related to parsing tc flow actions.
It needs to be set after parsing flower matches which changes the spec.
So move it outside parse_tc_fdb_actions() and set it in
__mlx5e_add_fdb_flow().
Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
net/mlx5e: Hide function mlx5e_num_channels_changed
No calls for mlx5e_num_channels_changed() out of en_main.c,
turn it static and remove from header.
Keep the wrapper function mlx5e_num_channels_changed_ctx exposed.
Dan Carpenter [Sat, 27 Nov 2021 14:19:53 +0000 (17:19 +0300)]
net/mlx5: SF, silence an uninitialized variable warning
This code sometimes calls mlx5_sf_hw_table_hwc_init() when "ext_base_id"
is uninitialized. It's not used on that path, but it generates a static
checker warning to pass uninitialized variables to another function.
It may also generate runtime UBSan warnings depending on if the
mlx5_sf_hw_table_hwc_init() function is inlined or not.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Arnd Bergmann [Mon, 8 Nov 2021 11:10:33 +0000 (12:10 +0100)]
mlx5: fix mlx5i_grp_sw_update_stats() stack usage
The mlx5e_sw_stats structure has grown to the point of triggering
a warning when put on the stack of a function:
mlx5/core/ipoib/ipoib.c: In function 'mlx5i_grp_sw_update_stats':
mlx5/core/ipoib/ipoib.c:136:1: error: the frame size of 1028 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
In this case, only five of the structure members are actually set,
so it's sufficient to have those as separate local variables.
As en_rep.c uses 'struct rtnl_link_stats64' for this, just use
the same one here for consistency.
Arnd Bergmann [Mon, 8 Nov 2021 11:10:32 +0000 (12:10 +0100)]
mlx5: fix psample_sample_packet link error
When PSAMPLE is a loadable module, built-in drivers cannot use it:
aarch64-linux-ld: drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.o: in function `mlx5e_tc_sample_skb':
sample.c:(.text+0xd68): undefined reference to `psample_sample_packet'
Add the same dependency here that is used for MLXSW
- rds: correct socket tunable error in rds_tcp_tune()
- ipv6: fix memory leak in fib6_rule_suppress
- wireguard: reset peer src endpoint when netns exits
- wireguard: improve resilience to DoS around incoming handshakes
- tcp: fix page frag corruption on page fault which involves TCP
- mpls: fix missing attributes in delete notifications
- mt7915: fix NULL pointer dereference with ad-hoc mode
Misc:
- rt2x00: be more lenient about EPROTO errors during start
- mlx4_en: update reported link modes for 1/10G"
* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: dsa: b53: Add SPI ID table
gro: Fix inconsistent indenting
selftests: net: Correct case name
net/rds: correct socket tunable error in rds_tcp_tune()
mctp: Don't let RTM_DELROUTE delete local routes
net/smc: Keep smc_close_final rc during active close
ibmvnic: drop bad optimization in reuse_tx_pools()
ibmvnic: drop bad optimization in reuse_rx_pools()
net/smc: fix wrong list_del in smc_lgr_cleanup_early
Fix Comment of ETH_P_802_3_MIN
ethernet: aquantia: Try MAC address from device tree
ipv4: convert fib_num_tclassid_users to atomic_t
net: avoid uninit-value from tcp_conn_request
net: annotate data-races on txq->xmit_lock_owner
octeontx2-af: Fix a memleak bug in rvu_mbox_init()
net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
...
Linus Torvalds [Thu, 2 Dec 2021 19:07:41 +0000 (11:07 -0800)]
Merge tag 'trace-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Three tracing fixes:
- Allow compares of strings when using signed and unsigned characters
- Fix kmemleak false positive for histogram entries
- Handle negative numbers for user defined kretprobe data sizes"
* tag 'trace-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
kprobes: Limit max data_size of the kretprobe instances
tracing: Fix a kmemleak false positive in tracing_map
tracing/histograms: String compares should not care about signed values
Linus Torvalds [Thu, 2 Dec 2021 18:56:16 +0000 (10:56 -0800)]
Merge tag 'for-linus-5.16-2' of git://github.com/cminyard/linux-ipmi
Pull IPMI fixes from Corey Minyard:
"Some changes that went in 5.16 had issues. When working on the design
a piece was redesigned and things got missed. And the message type was
not being initialized when it was allocated, resulting in crashes.
In addition, the IPMI driver has had a shutdown issue where it could
still have an item in a system workqueue after it had been shutdown.
Move to a private workqueue to avoid that problem"
* tag 'for-linus-5.16-2' of git://github.com/cminyard/linux-ipmi:
ipmi:ipmb: Fix unknown command response
ipmi: fix IPMI_SMI_MSG_TYPE_IPMB_DIRECT response length checking
ipmi: fix oob access due to uninit smi_msg type
ipmi: msghandler: Make symbol 'remove_work_wq' static
ipmi: Move remove_work to dedicated workqueue
Florian Fainelli [Thu, 2 Dec 2021 04:17:20 +0000 (20:17 -0800)]
net: dsa: b53: Add SPI ID table
Currently autoloading for SPI devices does not use the DT ID table, it
uses SPI modalises. Supporting OF modalises is going to be difficult if
not impractical, an attempt was made but has been reverted, so ensure
that module autoloading works for this driver by adding an id_table
listing the SPI IDs for everything.
Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Thu, 2 Dec 2021 08:15:11 +0000 (09:15 +0100)]
net: lan966x: Fix builds for lan966x driver
The lan966x is using the function 'packing' to create/extract the
information for the IFH, that is used to be added in front of the frames
when they are injected/extracted.
Therefore update the Kconfig to select config option 'PACKING' whenever
lan966x driver is enabled.
Fixes: db8bcaad539314 ("net: lan966x: add the basic lan966x driver") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Thu, 2 Dec 2021 11:05:04 +0000 (12:05 +0100)]
dt-bindings: net: lan966x: Add additional properties for lan966x
This patch updates the dt-bindings for lan966x switch.
It adds the properties 'additionalProperties' and
'unevaluatedProperties' for ethernet-ports and ports nodes. In this way
it is not possible to add more properties to these nodes.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Prabhakar Kushwaha [Thu, 2 Dec 2021 10:41:56 +0000 (10:41 +0000)]
qed: Enhance rammod debug prints to provide pretty details
Instead of printing numbers of protocol IDs and rammod commands,
enhance debug prints to provide names. s_protocol_types[] and
s_ramrod_cmd_ids arrays[] are added to support along with APIs.
Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zhijian [Thu, 2 Dec 2021 02:28:41 +0000 (10:28 +0800)]
selftests: net: Correct case name
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test
would not be run by default due to the wrong case names
Fixes: 34d0302ab861 ("selftests: Add ipv6 address bind tests to fcnal-test") Fixes: 75b2b2b3db4c ("selftests: Add ipv4 address bind tests to fcnal-test") Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Wed, 1 Dec 2021 14:53:51 +0000 (15:53 +0100)]
net: lan966x: Fix duplicate check in frame extraction
The blamed commit generates the following smatch static checker warning:
drivers/net/ethernet/microchip/lan966x/lan966x_main.c:515 lan966x_xtr_irq_handler()
warn: duplicate check 'sz < 0' (previous on line 502)
This patch fixes this issue removing the duplicate check 'sz < 0'
Fixes: d28d6d2e37d10d ("net: lan966x: add port module support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
William Kucharski [Wed, 1 Dec 2021 14:45:22 +0000 (07:45 -0700)]
net/rds: correct socket tunable error in rds_tcp_tune()
Correct an error where setting /proc/sys/net/rds/tcp/rds_tcp_rcvbuf would
instead modify the socket's sk_sndbuf and would leave sk_rcvbuf untouched.
Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Signed-off-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Wed, 1 Dec 2021 08:07:42 +0000 (16:07 +0800)]
mctp: Don't let RTM_DELROUTE delete local routes
We need to test against the existing route type, not
the rtm_type in the netlink request.
Fixes: 83f0a0b7285b ("mctp: Specify route types, require rtm_type in RTM_*ROUTE messages") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lu [Wed, 1 Dec 2021 06:42:16 +0000 (14:42 +0800)]
net/smc: Keep smc_close_final rc during active close
When smc_close_final() returns error, the return code overwrites by
kernel_sock_shutdown() in smc_close_active(). The return code of
smc_close_final() is more important than kernel_sock_shutdown(), and it
will pass to userspace directly.
Fix it by keeping both return codes, if smc_close_final() raises an
error, return it or kernel_sock_shutdown()'s.
Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Suggested-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sukadev Bhattiprolu [Wed, 1 Dec 2021 05:48:36 +0000 (21:48 -0800)]
ibmvnic: drop bad optimization in reuse_tx_pools()
When trying to decide whether or not reuse existing rx/tx pools
we tried to allow a range of values for the pool parameters rather
than exact matches. This was intended to reuse the resources for
instance when switching between two VIO servers with different
default parameters.
But this optimization is incomplete and breaks when we try to
change the number of queues for instance. The optimization needs
to be updated, so drop it for now and simplify the code.
Fixes: bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sukadev Bhattiprolu [Wed, 1 Dec 2021 05:48:35 +0000 (21:48 -0800)]
ibmvnic: drop bad optimization in reuse_rx_pools()
When trying to decide whether or not reuse existing rx/tx pools
we tried to allow a range of values for the pool parameters rather
than exact matches. This was intended to reuse the resources for
instance when switching between two VIO servers with different
default parameters.
But this optimization is incomplete and breaks when we try to
change the number of queues for instance. The optimization needs
to be updated, so drop it for now and simplify the code.
Fixes: 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dust Li [Wed, 1 Dec 2021 03:02:30 +0000 (11:02 +0800)]
net/smc: fix wrong list_del in smc_lgr_cleanup_early
smc_lgr_cleanup_early() meant to delete the link
group from the link group list, but it deleted
the list head by mistake.
This may cause memory corruption since we didn't
remove the real link group from the list and later
memseted the link group structure.
We got a list corruption panic when testing:
Fixes: a0a62ee15a829 ("net/smc: separate locks for SMCD and SMCR link group lists") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xiayu Zhang [Wed, 1 Dec 2021 02:57:13 +0000 (10:57 +0800)]
Fix Comment of ETH_P_802_3_MIN
The description of ETH_P_802_3_MIN is misleading.
The value of EthernetType in Ethernet II frame is more than 0x0600,
the value of Length in 802.3 frame is less than 0x0600.
Signed-off-by: Xiayu Zhang <Xiayu.Zhang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tianhao Chai [Wed, 1 Dec 2021 02:57:06 +0000 (20:57 -0600)]
ethernet: aquantia: Try MAC address from device tree
Apple M1 Mac minis (2020) with 10GE NICs do not have MAC address in the
card, but instead need to obtain MAC addresses from the device tree. In
this case the hardware will report an invalid MAC.
Currently atlantic driver does not query the DT for MAC address and will
randomly assign a MAC if the NIC doesn't have a permanent MAC burnt in.
This patch causes the driver to perfer a valid MAC address from OF (if
present) over HW self-reported MAC and only fall back to a random MAC
address when neither of them is valid.
Signed-off-by: Tianhao Chai <cth451@gmail.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Tue, 30 Nov 2021 21:16:25 +0000 (22:16 +0100)]
dt-bindings: net: dsa: qca8k: improve port definition documentation
Clean and improve port definition for qca8k documentation by referencing
the dsa generic port definition and adding the additional specific port
definition.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 2 Dec 2021 02:26:35 +0000 (18:26 -0800)]
ipv4: convert fib_num_tclassid_users to atomic_t
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
changes to net->ipv4.fib_num_tclassid_users were protected by RTNL.
After the change, this is no longer the case, as free_fib_info_rcu()
runs after rcu grace period, without rtnl being held.
Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Thu, 2 Dec 2021 08:36:03 +0000 (16:36 +0800)]
net: hns3: refactor function hns3_get_vector_ring_chain()
Currently hns3_get_vector_ring_chain() is a bit long. Refactor it by
extracting sub process to improve the readability.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Thu, 2 Dec 2021 08:36:02 +0000 (16:36 +0800)]
net: hns3: refactor function hclge_set_channels()
Currently hclge_set_channels() is a bit long. Refactor it by extracting
sub process to improve the readability.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Thu, 2 Dec 2021 08:36:01 +0000 (16:36 +0800)]
net: hns3: refactor function hclge_configure()
Currently hclge_configure() is a bit long. Refactor it by extracting sub
process to improve the readability.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 2 Dec 2021 08:36:00 +0000 (16:36 +0800)]
net: hns3: split function hclge_update_port_base_vlan_cfg()
Currently the function hclge_update_port_base_vlan_cfg() is a
bit long. Split it to several small functions, to improve the
readability.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Thu, 2 Dec 2021 08:35:59 +0000 (16:35 +0800)]
net: hns3: split function hns3_nic_net_xmit()
Function hns3_nic_net_xmit() is a bit too long. So add a
new function hns3_handle_skb_desc() to simplify code and improve
code readability.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 2 Dec 2021 08:35:58 +0000 (16:35 +0800)]
net: hns3: split function hclge_get_fd_rule_info()
Currently the function hclge_get_fd_rule_info() is a bit long.
Split it to several small functions, to improve readability.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 2 Dec 2021 08:35:57 +0000 (16:35 +0800)]
net: hns3: split function hclge_init_vlan_config()
Currently the function hclge_init_vlan_config() is a bit long.
Split it to several small functions, to simplify code and
improve code readability.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 2 Dec 2021 08:35:56 +0000 (16:35 +0800)]
net: hns3: refactor function hns3_fill_skb_desc to simplify code
The function hns3_fill_skb_desc is hard to read, this patch
extract 2 functions and add new a struct data to simplify the
code and Improve readability.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 2 Dec 2021 08:35:55 +0000 (16:35 +0800)]
net: hns3: extract macro to simplify ring stats update code
As the code to update ring stats is alike for different ring stats
type, this patch extract macro to simplify ring stats update code.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G W 5.16.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Zhou Qingyang [Tue, 30 Nov 2021 16:50:39 +0000 (00:50 +0800)]
octeontx2-af: Fix a memleak bug in rvu_mbox_init()
In rvu_mbox_init(), mbox_regions is not freed or passed out
under the switch-default region, which could lead to a memory leak.
Fix this bug by changing 'return err' to 'goto free_regions'.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_OCTEONTX2_AF=y show no new warnings,
and our static analyzer no longer warns about this code.
The new SNMP variable (TCPSmallQueueFailure) can be incremented
for good reasons, even on a 100Gbit single TCP_STREAM flow.
If we really wanted to ease driver debugging [1], this would
require something more sophisticated.
[1] Usually, if a driver is delaying TX completions too much,
this can lead to stalls in TCP output. Various work arounds
have been used in the past, like skb_orphan() in ndo_start_xmit().
Zhou Qingyang [Tue, 30 Nov 2021 16:44:38 +0000 (00:44 +0800)]
net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and
tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv().
After that mlx4_en_alloc_resources() is called and there is a dereference
of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to
a use after free problem on failure of mlx4_en_copy_priv().
Fix this bug by adding a check of mlx4_en_copy_priv()
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_MLX4_EN=m show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: ec25bc04ed8e ("net/mlx4_en: Add resilience in low memory systems") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stephen Suryaputra [Tue, 30 Nov 2021 16:26:37 +0000 (11:26 -0500)]
vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
IPCB/IP6CB need to be initialized when processing outbound v4 or v6 pkts
in the codepath of vrf device xmit function so that leftover garbage
doesn't cause futher code that uses the CB to incorrectly process the
pkt.
One occasion of the issue might occur when MPLS route uses the vrf
device as the outgoing device such as when the route is added using "ip
-f mpls route add <label> dev <vrf>" command.
The problems seems to exist since day one. Hence I put the day one
commits on the Fixes tags.
Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Fixes: 35402e313663 ("net: Add IPv6 support to VRF device") Cc: stable@vger.kernel.org Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211130162637.3249-1-ssuryaextr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King [Tue, 30 Nov 2021 14:54:05 +0000 (14:54 +0000)]
net: mvneta: program 1ms autonegotiation clock divisor
Program the 1ms autonegotiation clock divisor according to the clocking
rate of neta - without this, the 1ms clock ticks at about 660us on
Armada 38x configured for 250MHz. Bring this into correct specification.
====================
net: dsa: convert two drivers to phylink_generic_validate()
Patches 1 to 3 update core DSA code to allow drivers to be converted,
and patches 4 and 5 convert hellcreek and lantiq to use this (both of
which received reviewed-by from their maintainers.) As the rest have
yet to be reviewed by their maintainers, they are not included here.
Patch 1 had a request to change the formatting of it; I have not done
so as I believe a patch should do one change and one change only -
reformatting it is a separate change that should be in its own patch.
However, as patch 2 gets rid of the reason for reformatting it, it
would be pointless, and pure noise to include such an intermediary
patch.
Instead, I have swapped the order of patches 2 and 3 from the RFC
series.
====================
Russell King (Oracle) [Tue, 30 Nov 2021 13:10:16 +0000 (13:10 +0000)]
net: dsa: lantiq: convert to phylink_generic_validate()
Populate the supported interfaces and MAC capabilities for the Lantiq
DSA switches and remove the old validate implementation to allow DSA to
use phylink_generic_validate() for this switch driver.
The exclusion of Gigabit linkmodes for MII, Reverse MII and Reduced MII
links is handled within phylink_generic_validate() in phylink, so there
is no need to make them conditional on the interface mode in the driver.
Reviewed-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Tue, 30 Nov 2021 13:10:11 +0000 (13:10 +0000)]
net: dsa: hellcreek: convert to phylink_generic_validate()
Populate the supported interfaces and MAC capabilities for the
hellcreek DSA switch and remove the old validate implementation to
allow DSA to use phylink_generic_validate() for this switch driver.
The switch actually only supports MII and RGMII, but as phylib defaults
to GMII, we need to include this interface mode to keep existing DT
working.
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Tue, 30 Nov 2021 13:10:06 +0000 (13:10 +0000)]
net: dsa: support use of phylink_generic_validate()
Support the use of phylink_generic_validate() when there is no
phylink_validate method given in the DSA switch operations and
mac_capabilities have been set in the phylink_config structure by the
DSA switch driver.
This gives DSA switch drivers the option to use this if they provide
the supported_interfaces and mac_capabilities, while still giving them
an option to override the default implementation if necessary.
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Tue, 30 Nov 2021 13:10:01 +0000 (13:10 +0000)]
net: dsa: replace phylink_get_interfaces() with phylink_get_caps()
Phylink needs slightly more information than phylink_get_interfaces()
allows us to get from the DSA drivers - we need the MAC capabilities.
Replace the phylink_get_interfaces() method with phylink_get_caps() to
allow DSA drivers to fill in the phylink_config MAC capabilities field
as well.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Tue, 30 Nov 2021 13:09:55 +0000 (13:09 +0000)]
net: dsa: consolidate phylink creation
The code in port.c and slave.c creating the phylink instance is very
similar - let's consolidate this into a single function.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>