]> www.infradead.org Git - users/dwmw2/linux.git/log
users/dwmw2/linux.git
2 years agonet: cpmac: remove driver to prepare for platform removal
Wolfram Sang [Fri, 22 Sep 2023 06:15:26 +0000 (08:15 +0200)]
net: cpmac: remove driver to prepare for platform removal

AR7 is going to be removed from the Kernel, so remove its networking
support in form of the cpmac driver. This allows us to remove the
platform because this driver includes a platform specific header.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/all/20230922061530.3121-6-wsa+renesas@sang-engineering.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 5 Oct 2023 20:16:31 +0000 (13:16 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts (or adjacent changes of note).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'net-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 5 Oct 2023 18:29:21 +0000 (11:29 -0700)]
Merge tag 'net-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth, netfilter, BPF and WiFi.

  I didn't collect precise data but feels like we've got a lot of 6.5
  fixes here. WiFi fixes are most user-awaited.

  Current release - regressions:

   - Bluetooth: fix hci_link_tx_to RCU lock usage

  Current release - new code bugs:

   - bpf: mprog: fix maximum program check on mprog attachment

   - eth: ti: icssg-prueth: fix signedness bug in prueth_init_tx_chns()

  Previous releases - regressions:

   - ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling

   - vringh: don't use vringh_kiov_advance() in vringh_iov_xfer(), it
     doesn't handle zero length like we expected

   - wifi:
      - cfg80211: fix cqm_config access race, fix crashes with brcmfmac
      - iwlwifi: mvm: handle PS changes in vif_cfg_changed
      - mac80211: fix mesh id corruption on 32 bit systems
      - mt76: mt76x02: fix MT76x0 external LNA gain handling

   - Bluetooth: fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER

   - l2tp: fix handling of transhdrlen in __ip{,6}_append_data()

   - dsa: mv88e6xxx: avoid EEPROM timeout when EEPROM is absent

   - eth: stmmac: fix the incorrect parameter after refactoring

  Previous releases - always broken:

   - net: replace calls to sock->ops->connect() with kernel_connect(),
     prevent address rewrite in kernel_bind(); otherwise BPF hooks may
     modify arguments, unexpectedly to the caller

   - tcp: fix delayed ACKs when reads and writes align with MSS

   - bpf:
      - verifier: unconditionally reset backtrack_state masks on global
        func exit
      - s390: let arch_prepare_bpf_trampoline return program size, fix
        struct_ops offsets
      - sockmap: fix accounting of available bytes in presence of PEEKs
      - sockmap: reject sk_msg egress redirects to non-TCP sockets

   - ipv4/fib: send netlink notify when delete source address routes

   - ethtool: plca: fix width of reads when parsing netlink commands

   - netfilter: nft_payload: rebuild vlan header on h_proto access

   - Bluetooth: hci_codec: fix leaking memory of local_codecs

   - eth: intel: ice: always add legacy 32byte RXDID in supported_rxdids

   - eth: stmmac:
     - dwmac-stm32: fix resume on STM32 MCU
     - remove buggy and unneeded stmmac_poll_controller, depend on NAPI

   - ibmveth: always recompute TCP pseudo-header checksum, fix use of
     the driver with Open vSwitch

   - wifi:
      - rtw88: rtw8723d: fix MAC address offset in EEPROM
      - mt76: fix lock dependency problem for wed_lock
      - mwifiex: sanity check data reported by the device
      - iwlwifi: ensure ack flag is properly cleared
      - iwlwifi: mvm: fix a memory corruption due to bad pointer arithm
      - iwlwifi: mvm: fix incorrect usage of scan API

  Misc:

   - wifi: mac80211: work around Cisco AP 9115 VHT MPDU length"

* tag 'net-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
  MAINTAINERS: update Matthieu's email address
  mptcp: userspace pm allow creating id 0 subflow
  mptcp: fix delegated action races
  net: stmmac: remove unneeded stmmac_poll_controller
  net: lan743x: also select PHYLIB
  net: ethernet: mediatek: disable irq before schedule napi
  net: mana: Fix oversized sge0 for GSO packets
  net: mana: Fix the tso_bytes calculation
  net: mana: Fix TX CQE error handling
  netlink: annotate data-races around sk->sk_err
  sctp: update hb timer immediately after users change hb_interval
  sctp: update transport state when processing a dupcook packet
  tcp: fix delayed ACKs for MSS boundary condition
  tcp: fix quick-ack counting to count actual ACKs of new data
  page_pool: fix documentation typos
  tipc: fix a potential deadlock on &tx->lock
  net: stmmac: dwmac-stm32: fix resume on STM32 MCU
  ipv4: Set offload_failed flag in fibmatch results
  netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
  netfilter: nf_tables: Deduplicate nft_register_obj audit logs
  ...

2 years agoMerge tag 'integrity-v6.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar...
Linus Torvalds [Thu, 5 Oct 2023 18:12:33 +0000 (11:12 -0700)]
Merge tag 'integrity-v6.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
 "Two additional patches to fix the removal of the deprecated
  IMA_TRUSTED_KEYRING Kconfig"

* tag 'integrity-v6.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: rework CONFIG_IMA dependency block
  ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig

2 years agoMerge tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Linus Torvalds [Thu, 5 Oct 2023 18:07:03 +0000 (11:07 -0700)]
Merge tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds

Pull LED fix from Lee Jones:
 "Just the one bug-fix:

   - Fix regression affecting LED_COLOR_ID_MULTI users"

* tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds:
  leds: Drop BUG_ON check for LED_COLOR_ID_MULTI

2 years agoMerge tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Linus Torvalds [Thu, 5 Oct 2023 18:03:20 +0000 (11:03 -0700)]
Merge tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD fixes from Lee Jones:
 "A couple of small fixes:

   - Potential build failure in CS42L43

   - Device Tree bindings clean-up for a superseded patch"

* tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  dt-bindings: mfd: Revert "dt-bindings: mfd: maxim,max77693: Add USB connector"
  mfd: cs42l43: Fix MFD_CS42L43 dependency on REGMAP_IRQ

2 years agoMerge tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overla...
Linus Torvalds [Thu, 5 Oct 2023 17:56:18 +0000 (10:56 -0700)]
Merge tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs fixes from Amir Goldstein:

 - Fix for file reference leak regression

 - Fix for NULL pointer deref regression

 - Fixes for RCU-walk race regressions:

   Two of the fixes were taken from Al's RCU pathwalk race fixes series
   with his consent [1].

   Note that unlike most of Al's series, these two patches are not about
   racing with ->kill_sb() and they are also very recent regressions
   from v6.5, so I think it's worth getting them into v6.5.y.

   There is also a fix for an RCU pathwalk race with ->kill_sb(), which
   may have been solved in vfs generic code as you suggested, but it
   also rids overlayfs from a nasty hack, so I think it's worth anyway.

Link: https://lore.kernel.org/linux-fsdevel/20231003204749.GA800259@ZenIV/
* tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: fix NULL pointer defer when encoding non-decodable lower fid
  ovl: make use of ->layers safe in rcu pathwalk
  ovl: fetch inode once in ovl_dentry_revalidate_common()
  ovl: move freeing ovl_entry past rcu delay
  ovl: fix file reference leak when submitting aio

2 years agoMerge branch 'mptcp-fixes-and-maintainer-email-update-for-v6-6'
Jakub Kicinski [Thu, 5 Oct 2023 16:34:34 +0000 (09:34 -0700)]
Merge branch 'mptcp-fixes-and-maintainer-email-update-for-v6-6'

Mat Martineau says:

====================
mptcp: Fixes and maintainer email update for v6.6

Patch 1 addresses a race condition in MPTCP "delegated actions"
infrastructure. Affects v5.19 and later.

Patch 2 removes an unnecessary restriction that did not allow additional
outgoing subflows using the local address of the initial MPTCP subflow.
v5.16 and later.

Patch 3 updates Matthieu's email address.
====================

Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMAINTAINERS: update Matthieu's email address
Matthieu Baerts [Wed, 4 Oct 2023 20:38:13 +0000 (13:38 -0700)]
MAINTAINERS: update Matthieu's email address

Use my kernel.org account instead.

The other one will bounce by the end of the year.

Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-3-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: userspace pm allow creating id 0 subflow
Geliang Tang [Wed, 4 Oct 2023 20:38:12 +0000 (13:38 -0700)]
mptcp: userspace pm allow creating id 0 subflow

This patch drops id 0 limitation in mptcp_nl_cmd_sf_create() to allow
creating additional subflows with the local addr ID 0.

There is no reason not to allow additional subflows from this local
address: we should be able to create new subflows from the initial
endpoint. This limitation was breaking fullmesh support from userspace.

Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-2-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: fix delegated action races
Paolo Abeni [Wed, 4 Oct 2023 20:38:11 +0000 (13:38 -0700)]
mptcp: fix delegated action races

The delegated action infrastructure is prone to the following
race: different CPUs can try to schedule different delegated
actions on the same subflow at the same time.

Each of them will check different bits via mptcp_subflow_delegate(),
and will try to schedule the action on the related per-cpu napi
instance.

Depending on the timing, both can observe an empty delegated list
node, causing the same entry to be added simultaneously on two different
lists.

The root cause is that the delegated actions infra does not provide
a single synchronization point. Address the issue reserving an additional
bit to mark the subflow as scheduled for delegation. Acquiring such bit
guarantee the caller to own the delegated list node, and being able to
safely schedule the subflow.

Clear such bit only when the subflow scheduling is completed, ensuring
proper barrier in place.

Additionally swap the meaning of the delegated_action bitmask, to allow
the usage of the existing helper to set multiple bit at once.

Fixes: bcd97734318d ("mptcp: use delegate action to schedule 3rd ack retrans")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-1-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: remove unneeded stmmac_poll_controller
Remi Pommarel [Wed, 4 Oct 2023 14:33:56 +0000 (16:33 +0200)]
net: stmmac: remove unneeded stmmac_poll_controller

Using netconsole netpoll_poll_dev could be called from interrupt
context, thus using disable_irq() would cause the following kernel
warning with CONFIG_DEBUG_ATOMIC_SLEEP enabled:

  BUG: sleeping function called from invalid context at kernel/irq/manage.c:137
  in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 10, name: ksoftirqd/0
  CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G        W         5.15.42-00075-g816b502b2298-dirty #117
  Hardware name: aml (r1) (DT)
  Call trace:
   dump_backtrace+0x0/0x270
   show_stack+0x14/0x20
   dump_stack_lvl+0x8c/0xac
   dump_stack+0x18/0x30
   ___might_sleep+0x150/0x194
   __might_sleep+0x64/0xbc
   synchronize_irq+0x8c/0x150
   disable_irq+0x2c/0x40
   stmmac_poll_controller+0x140/0x1a0
   netpoll_poll_dev+0x6c/0x220
   netpoll_send_skb+0x308/0x390
   netpoll_send_udp+0x418/0x760
   write_msg+0x118/0x140 [netconsole]
   console_unlock+0x404/0x500
   vprintk_emit+0x118/0x250
   dev_vprintk_emit+0x19c/0x1cc
   dev_printk_emit+0x90/0xa8
   __dev_printk+0x78/0x9c
   _dev_warn+0xa4/0xbc
   ath10k_warn+0xe8/0xf0 [ath10k_core]
   ath10k_htt_txrx_compl_task+0x790/0x7fc [ath10k_core]
   ath10k_pci_napi_poll+0x98/0x1f4 [ath10k_pci]
   __napi_poll+0x58/0x1f4
   net_rx_action+0x504/0x590
   _stext+0x1b8/0x418
   run_ksoftirqd+0x74/0xa4
   smpboot_thread_fn+0x210/0x3c0
   kthread+0x1fc/0x210
   ret_from_fork+0x10/0x20

Since [0] .ndo_poll_controller is only needed if driver doesn't or
partially use NAPI. Because stmmac does so, stmmac_poll_controller
can be removed fixing the above warning.

[0] commit ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional")

Cc: <stable@vger.kernel.org> # 5.15.x
Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/1c156a6d8c9170bd6a17825f2277115525b4d50f.1696429960.git.repk@triplefau.lt
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan743x: also select PHYLIB
Randy Dunlap [Mon, 2 Oct 2023 19:35:44 +0000 (12:35 -0700)]
net: lan743x: also select PHYLIB

Since FIXED_PHY depends on PHYLIB, PHYLIB needs to be set to avoid
a kconfig warning:

WARNING: unmet direct dependencies detected for FIXED_PHY
  Depends on [n]: NETDEVICES [=y] && PHYLIB [=n]
  Selected by [y]:
  - LAN743X [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && PCI [=y] && PTP_1588_CLOCK_OPTIONAL [=y]

Fixes: 73c4d1b307ae ("net: lan743x: select FIXED_PHY")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: lore.kernel.org/r/202309261802.JPbRHwti-lkp@intel.com
Cc: Bryan Whitehead <bryan.whitehead@microchip.com>
Cc: UNGLinuxDriver@microchip.com
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20231002193544.14529-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mediatek: disable irq before schedule napi
Christian Marangi [Mon, 2 Oct 2023 14:08:05 +0000 (16:08 +0200)]
net: ethernet: mediatek: disable irq before schedule napi

While searching for possible refactor of napi_schedule_prep and
__napi_schedule it was notice that the mtk eth driver disable the
interrupt for rx and tx AFTER napi is scheduled.

While this is a very hard to repro case it might happen to have
situation where the interrupt is disabled and never enabled again as the
napi completes and the interrupt is enabled before.

This is caused by the fact that a napi driven by interrupt expect a
logic with:
1. interrupt received. napi prepared -> interrupt disabled -> napi
   scheduled
2. napi triggered. ring cleared -> interrupt enabled -> wait for new
   interrupt

To prevent this case, disable the interrupt BEFORE the napi is
scheduled.

Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20231002140805.568-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet_sched: sch_fq: add TCA_FQ_WEIGHTS attribute
Eric Dumazet [Mon, 2 Oct 2023 13:17:38 +0000 (13:17 +0000)]
net_sched: sch_fq: add TCA_FQ_WEIGHTS attribute

This attribute can be used to tune the per band weight
and report them in "tc qdisc show" output:

qdisc fq 802f: parent 1:9 limit 100000p flow_limit 500p buckets 1024 orphan_mask 1023
 quantum 8364b initial_quantum 41820b low_rate_threshold 550Kbit
 refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 weights 589824 196608 65536
 Sent 236460814 bytes 792991 pkt (dropped 0, overlimits 0 requeues 0)
 rate 25816bit 10pps backlog 0b 0p requeues 0
  flows 4 (inactive 4 throttled 0)
  gc 0 throttled 19 latency 17.6us fastpath 773882

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet_sched: sch_fq: add 3 bands and WRR scheduling
Eric Dumazet [Mon, 2 Oct 2023 13:17:37 +0000 (13:17 +0000)]
net_sched: sch_fq: add 3 bands and WRR scheduling

Before Google adopted FQ for its production servers,
we had to ensure AF4 packets would get a higher share
than BE1 ones.

As discussed this week in Netconf 2023 in Paris, it is time
to upstream this for public use.

After this patch FQ can replace pfifo_fast, with the following
differences :

- FQ uses WRR instead of strict prio, to avoid starvation of
  low priority packets.

- We make sure each band/prio tracks its own usage against sch->limit.
  This was done to make sure flood of low priority packets would not
  prevent AF4 packets to be queued. Contributed by Willem.

- priomap can be changed, if needed (default value are the ones
  coming from pfifo_fast).

In this patch, we set default band weights so that :

- high prio (band=0) packets get 90% of the bandwidth
  if they compete with low prio (band=2) packets.

- high prio packets get 75% of the bandwidth
  if they compete with medium prio (band=1) packets.

Following patch in this series adds the possibility to tune
the per-band weights.

As we added many fields in 'struct fq_sched_data', we had
to make sure to have the first cache line read-mostly, and
avoid wasting precious cache lines.

More optimizations are possible but will be sent separately.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet_sched: export pfifo_fast prio2band[]
Eric Dumazet [Mon, 2 Oct 2023 13:17:36 +0000 (13:17 +0000)]
net_sched: export pfifo_fast prio2band[]

pfifo_fast prio2band[] is renamed to sch_default_prio2band[]
and exported because we want to share it in FQ.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet_sched: sch_fq: remove q->ktime_cache
Eric Dumazet [Mon, 2 Oct 2023 13:17:35 +0000 (13:17 +0000)]
net_sched: sch_fq: remove q->ktime_cache

Now that both enqueue() and dequeue() need to use ktime_get_ns(),
there is no point wasting 8 bytes in struct fq_sched_data.

This makes room for future fields. ;)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'net-mana-fix-some-tx-processing-bugs'
Paolo Abeni [Thu, 5 Oct 2023 09:45:08 +0000 (11:45 +0200)]
Merge branch 'net-mana-fix-some-tx-processing-bugs'

Haiyang Zhang says:

====================
net: mana: Fix some TX processing bugs

Fix TX processing bugs on error handling, tso_bytes calculation,
and sge0 size.
====================

Link: https://lore.kernel.org/r/1696020147-14989-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mana: Fix oversized sge0 for GSO packets
Haiyang Zhang [Fri, 29 Sep 2023 20:42:27 +0000 (13:42 -0700)]
net: mana: Fix oversized sge0 for GSO packets

Handle the case when GSO SKB linear length is too large.

MANA NIC requires GSO packets to put only the header part to SGE0,
otherwise the TX queue may stop at the HW level.

So, use 2 SGEs for the skb linear part which contains more than the
packet header.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mana: Fix the tso_bytes calculation
Haiyang Zhang [Fri, 29 Sep 2023 20:42:26 +0000 (13:42 -0700)]
net: mana: Fix the tso_bytes calculation

sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove
the subtraction from header size.

Cc: stable@vger.kernel.org
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mana: Fix TX CQE error handling
Haiyang Zhang [Fri, 29 Sep 2023 20:42:25 +0000 (13:42 -0700)]
net: mana: Fix TX CQE error handling

For an unknown TX CQE error type (probably from a newer hardware),
still free the SKB, update the queue tail, etc., otherwise the
accounting will be wrong.

Also, TX errors can be triggered by injecting corrupted packets, so
replace the WARN_ONCE to ratelimited error logging.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot...
Linus Torvalds [Thu, 5 Oct 2023 01:19:55 +0000 (18:19 -0700)]
Merge tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux

Pull rtla fixes from Daniel Bristot de Oliveira:
 "rtla (Real-Time Linux Analysis) tool fixes.

  Timerlat auto-analysis:

   - Timerlat is reporting thread interference time without thread noise
     events occurrence. It was caused because the thread interference
     variable was not reset after the analysis of a timerlat activation
     that did not hit the threshold.

   - The IRQ handler delay is estimated from the delta of the IRQ
     latency reported by timerlat, and the timestamp from IRQ handler
     start event. If the delta is near-zero, the drift from the external
     clock and the trace event and/or the overhead can cause the value
     to be negative. If the value is negative, print a zero-delay.

   - IRQ handlers happening after the timerlat thread event but before
     the stop tracing were being reported as IRQ that happened before
     the *current* IRQ occurrence. Ignore Previous IRQ noise in this
     condition because they are valid only for the *next* timerlat
     activation.

  Timerlat user-space:

   - Timerlat is stopping all user-space thread if a CPU becomes
     offline. Do not stop the entire tool if a CPU is/become offline,
     but only the thread of the unavailable CPU. Stop the tool only, if
     all threads leave because the CPUs become/are offline.

  man-pages:

   - Fix command line example in timerlat hist man page"

* tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux:
  rtla: fix a example in rtla-timerlat-hist.rst
  rtla/timerlat: Do not stop user-space if a cpu is offline
  rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample
  rtla/timerlat_aa: Fix negative IRQ delay
  rtla/timerlat_aa: Zero thread sum after every sample analysis

2 years agoMerge branch 'ynl-makefile-cleanup'
Jakub Kicinski [Thu, 5 Oct 2023 00:33:56 +0000 (17:33 -0700)]
Merge branch 'ynl-makefile-cleanup'

Jakub Kicinski says:

====================
ynl Makefile cleanup

While catching up on recent changes I noticed unexpected
changes to Makefiles in YNL. Indeed they were not working
as intended but the fixes put in place were not what I had
in mind :)
====================

Link: https://lore.kernel.org/r/20231003153416.2479808-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotools: ynl: use uAPI include magic for samples
Jakub Kicinski [Tue, 3 Oct 2023 15:34:16 +0000 (08:34 -0700)]
tools: ynl: use uAPI include magic for samples

Makefile.deps provides direct includes in CFLAGS_$(obj).
We just need to rewrite the rules to make use of the extra
flags, no need to hard-include all of tools/include/uapi.

Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotools: ynl: don't regen on every make
Jakub Kicinski [Tue, 3 Oct 2023 15:34:15 +0000 (08:34 -0700)]
tools: ynl: don't regen on every make

As far as I can tell the normal Makefile dependency tracking
works, generated files get re-generated if the YAML was updated.
Let make do its job, don't force the re-generation.
make hardclean can be used to force regeneration.

Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoynl: netdev: drop unnecessary enum-as-flags
Jakub Kicinski [Tue, 3 Oct 2023 15:34:14 +0000 (08:34 -0700)]
ynl: netdev: drop unnecessary enum-as-flags

enum-as-flags can be used when enum declares bit positions but
we want to carry bitmask in an attribute. If the definition
is already provided as flags there's no need to indicate
the flag-iness of the attribute.

Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonetlink: annotate data-races around sk->sk_err
Eric Dumazet [Tue, 3 Oct 2023 18:34:55 +0000 (18:34 +0000)]
netlink: annotate data-races around sk->sk_err

syzbot caught another data-race in netlink when
setting sk->sk_err.

Annotate all of them for good measure.

BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg

write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0:
netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
__sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
__do_sys_recvfrom net/socket.c:2247 [inline]
__se_sys_recvfrom net/socket.c:2243 [inline]
__x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1:
netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
__sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
__do_sys_recvfrom net/socket.c:2247 [inline]
__se_sys_recvfrom net/socket.c:2243 [inline]
__x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00000000 -> 0x00000016

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231003183455.3410550-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: skb_queue_purge_reason() optimizations
Eric Dumazet [Tue, 3 Oct 2023 18:19:20 +0000 (18:19 +0000)]
net: skb_queue_purge_reason() optimizations

1) Exit early if the list is empty.

2) splice the list into a local list,
   so that we block hard irqs only once.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231003181920.3280453-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosctp: update hb timer immediately after users change hb_interval
Xin Long [Sun, 1 Oct 2023 15:04:20 +0000 (11:04 -0400)]
sctp: update hb timer immediately after users change hb_interval

Currently, when hb_interval is changed by users, it won't take effect
until the next expiry of hb timer. As the default value is 30s, users
have to wait up to 30s to wait its hb_interval update to work.

This becomes pretty bad in containers where a much smaller value is
usually set on hb_interval. This patch improves it by resetting the
hb timer immediately once the value of hb_interval is updated by users.

Note that we don't address the already existing 'problem' when sending
a heartbeat 'on demand' if one hb has just been sent(from the timer)
mentioned in:

  https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg590224.html

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.1696172660.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosctp: update transport state when processing a dupcook packet
Xin Long [Sun, 1 Oct 2023 14:58:45 +0000 (10:58 -0400)]
sctp: update transport state when processing a dupcook packet

During the 4-way handshake, the transport's state is set to ACTIVE in
sctp_process_init() when processing INIT_ACK chunk on client or
COOKIE_ECHO chunk on server.

In the collision scenario below:

  192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
    192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
    192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
    192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
    192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
  192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021]

when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state,
sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook() where it
creates a new association and sets its transport to ACTIVE then updates
to the old association in sctp_assoc_update().

However, in sctp_assoc_update(), it will skip the transport update if it
finds a transport with the same ipaddr already existing in the old asoc,
and this causes the old asoc's transport state not to move to ACTIVE
after the handshake.

This means if DATA retransmission happens at this moment, it won't be able
to enter PF state because of the check 'transport->state == SCTP_ACTIVE'
in sctp_do_8_2_transport_strike().

This patch fixes it by updating the transport in sctp_assoc_update() with
sctp_assoc_add_peer() where it updates the transport state if there is
already a transport with the same ipaddr exists in the old asoc.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.1696172325.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Thu, 5 Oct 2023 00:27:24 +0000 (17:27 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-10-03 (i40e, iavf)

This series contains updates to i40e and iavf drivers.

Yajun Deng aligns reporting of buffer exhaustion statistics to follow
documentation for i40e.

Jake removes undesired 'inline' from functions in iavf.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  iavf: remove "inline" functions from iavf_txrx.c
  i40e: Add rx_missed_errors for buffer exhaustion
====================

Link: https://lore.kernel.org/r/20231003223610.2004976-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'fix-a-couple-recent-instances-of-wincompatible-function-pointer-types...
Jakub Kicinski [Thu, 5 Oct 2023 00:15:06 +0000 (17:15 -0700)]
Merge branch 'fix-a-couple-recent-instances-of-wincompatible-function-pointer-types-strict-from-mode_get-implementations'

Nathan Chancellor says:

====================
Fix a couple recent instances of -Wincompatible-function-pointer-types-strict from ->mode_get() implementations

This series fixes a couple of instances of
-Wincompatible-function-pointer-types-strict that were introduced by a
recent series that added a new type of ops, struct dpll_device_ops,
along with implementations of the callback ->mode_get() that had a
mismatched mode type.

This warning is not currently enabled for any build but I am planning on
submitting a patch to add it to W=1 to prevent new instances of the
warning from popping up while we try and fix the existing instances in
other drivers.

This series is based on current net-next but if they need to go into
individual maintainer trees, please feel free to take the patches
individually.
====================

Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-0-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomlx5: Fix type of mode parameter in mlx5_dpll_device_mode_get()
Nathan Chancellor [Mon, 2 Oct 2023 20:55:21 +0000 (13:55 -0700)]
mlx5: Fix type of mode parameter in mlx5_dpll_device_mode_get()

When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch potential kCFI failures at build time rather
than run time due to incorrect function pointer types, there is a
warning due to a mismatch between the type of the mode parameter in
mlx5_dpll_device_mode_get() vs. what the function pointer prototype for
->mode_get() in 'struct dpll_device_ops' expects.

  drivers/net/ethernet/mellanox/mlx5/core/dpll.c:141:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict]
    141 |         .mode_get = mlx5_dpll_device_mode_get,
        |                     ^~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Change the type of the mode parameter in mlx5_dpll_device_mode_get() to
clear up the warning and avoid kCFI failures at run time.

Fixes: 496fd0a26bbf ("mlx5: Implement SyncE support using DPLL infrastructure")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-2-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoptp: Fix type of mode parameter in ptp_ocp_dpll_mode_get()
Nathan Chancellor [Mon, 2 Oct 2023 20:55:20 +0000 (13:55 -0700)]
ptp: Fix type of mode parameter in ptp_ocp_dpll_mode_get()

When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch potential kCFI failures at build time rather
than run time due to incorrect function pointer types, there is a
warning due to a mismatch between the type of the mode parameter in
ptp_ocp_dpll_mode_get() vs. what the function pointer prototype for
->mode_get() in 'struct dpll_device_ops' expects.

  drivers/ptp/ptp_ocp.c:4353:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict]
   4353 |         .mode_get = ptp_ocp_dpll_mode_get,
        |                     ^~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Change the type of the mode parameter in ptp_ocp_dpll_mode_get() to
clear up the warning and avoid kCFI failures at run time.

Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-1-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'r8152-modify-rx_bottom'
Jakub Kicinski [Wed, 4 Oct 2023 23:51:33 +0000 (16:51 -0700)]
Merge branch 'r8152-modify-rx_bottom'

Hayes Wang says:

====================
r8152: modify rx_bottom

v3:
For patch #1, this patch is replaced. The new patch only break the loop,
and keep that the driver would queue the rx packets.

For patch #2, modify the code depends on patch #1. For work_down < budget,
napi_get_frags() and napi_gro_frags() would be used. For the others,
nothing is changed.

v2:
For patch #1, add comment, update commit message, and add Fixes tag.

v1:
These patches are used to improve rx_bottom().
====================

Link: https://lore.kernel.org/r/20230926111714.9448-432-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agor8152: use napi_gro_frags
Hayes Wang [Tue, 26 Sep 2023 11:17:14 +0000 (19:17 +0800)]
r8152: use napi_gro_frags

Use napi_gro_frags() for the skb of fragments when the work_done is less
than budget.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230926111714.9448-434-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agor8152: break the loop when the budget is exhausted
Hayes Wang [Tue, 26 Sep 2023 11:17:13 +0000 (19:17 +0800)]
r8152: break the loop when the budget is exhausted

A bulk transfer of the USB may contain many packets. And, the total
number of the packets in the bulk transfer may be more than budget.

Originally, only budget packets would be handled by napi_gro_receive(),
and the other packets would be queued in the driver for next schedule.

This patch would break the loop about getting next bulk transfer, when
the budget is exhausted. That is, only the current bulk transfer would
be handled, and the other bulk transfers would be queued for next
schedule. Besides, the packets which are more than the budget in the
current bulk trasnfer would be still queued in the driver, as the
original method.

In addition, a bulk transfer wouldn't contain more than 400 packets, so
the check of queue length is unnecessary. Therefore, I replace it with
WARN_ON_ONCE().

Fixes: cf74eb5a5bc8 ("eth: r8152: try to use a normal budget")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230926111714.9448-433-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'chelsio-annotate-structs-with-__counted_by'
Jakub Kicinski [Wed, 4 Oct 2023 22:37:15 +0000 (15:37 -0700)]
Merge branch 'chelsio-annotate-structs-with-__counted_by'

Kees Cook says:

====================
chelsio: Annotate structs with __counted_by

This annotates several chelsio structures with the coming __counted_by
attribute for bounds checking of flexible arrays at run-time. For more details,
see commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro").
====================

Link: https://lore.kernel.org/r/20230929181042.work.990-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocxgb4: Annotate struct smt_data with __counted_by
Kees Cook [Fri, 29 Sep 2023 18:11:49 +0000 (11:11 -0700)]
cxgb4: Annotate struct smt_data with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct smt_data.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-5-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocxgb4: Annotate struct sched_table with __counted_by
Kees Cook [Fri, 29 Sep 2023 18:11:48 +0000 (11:11 -0700)]
cxgb4: Annotate struct sched_table with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct sched_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-4-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocxgb4: Annotate struct cxgb4_tc_u32_table with __counted_by
Kees Cook [Fri, 29 Sep 2023 18:11:47 +0000 (11:11 -0700)]
cxgb4: Annotate struct cxgb4_tc_u32_table with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct cxgb4_tc_u32_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-3-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocxgb4: Annotate struct clip_tbl with __counted_by
Kees Cook [Fri, 29 Sep 2023 18:11:46 +0000 (11:11 -0700)]
cxgb4: Annotate struct clip_tbl with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct clip_tbl.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-2-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agochelsio/l2t: Annotate struct l2t_data with __counted_by
Kees Cook [Fri, 29 Sep 2023 18:11:45 +0000 (11:11 -0700)]
chelsio/l2t: Annotate struct l2t_data with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct l2t_data.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: fix delayed ACKs for MSS boundary condition
Neal Cardwell [Sun, 1 Oct 2023 15:12:39 +0000 (11:12 -0400)]
tcp: fix delayed ACKs for MSS boundary condition

This commit fixes poor delayed ACK behavior that can cause poor TCP
latency in a particular boundary condition: when an application makes
a TCP socket write that is an exact multiple of the MSS size.

The problem is that there is painful boundary discontinuity in the
current delayed ACK behavior. With the current delayed ACK behavior,
we have:

(1) If an app reads data when > 1*MSS is unacknowledged, then
    tcp_cleanup_rbuf() ACKs immediately because of:

     tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||

(2) If an app reads all received data, and the packets were < 1*MSS,
    and either (a) the app is not ping-pong or (b) we received two
    packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately beecause
    of:

     ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
      ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
       !inet_csk_in_pingpong_mode(sk))) &&

(3) *However*: if an app reads exactly 1*MSS of data,
    tcp_cleanup_rbuf() does not send an immediate ACK. This is true
    even if the app is not ping-pong and the 1*MSS of data had the PSH
    bit set, suggesting the sending application completed an
    application write.

Thus if the app is not ping-pong, we have this painful case where
>1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a
write whose last skb is an exact multiple of 1*MSS can get a 40ms
delayed ACK. This means that any app that transfers data in one
direction and takes care to align write size or packet size with MSS
can suffer this problem. With receive zero copy making 4KB MSS values
more common, it is becoming more common to have application writes
naturally align with MSS, and more applications are likely to
encounter this delayed ACK problem.

The fix in this commit is to refine the delayed ACK heuristics with a
simple check: immediately ACK a received 1*MSS skb with PSH bit set if
the app reads all data. Why? If an skb has a len of exactly 1*MSS and
has the PSH bit set then it is likely the end of an application
write. So more data may not be arriving soon, and yet the data sender
may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we
set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send
an ACK immediately if the app reads all of the data and is not
ping-pong. Note that this logic is also executed for the case where
len > MSS, but in that case this logic does not matter (and does not
hurt) because tcp_cleanup_rbuf() will always ACK immediately if the
app reads data and there is more than an MSS of unACKed data.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Guo <guoxin0309@gmail.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: fix quick-ack counting to count actual ACKs of new data
Neal Cardwell [Sun, 1 Oct 2023 15:12:38 +0000 (11:12 -0400)]
tcp: fix quick-ack counting to count actual ACKs of new data

This commit fixes quick-ack counting so that it only considers that a
quick-ack has been provided if we are sending an ACK that newly
acknowledges data.

The code was erroneously using the number of data segments in outgoing
skbs when deciding how many quick-ack credits to remove. This logic
does not make sense, and could cause poor performance in
request-response workloads, like RPC traffic, where requests or
responses can be multi-segment skbs.

When a TCP connection decides to send N quick-acks, that is to
accelerate the cwnd growth of the congestion control module
controlling the remote endpoint of the TCP connection. That quick-ack
decision is purely about the incoming data and outgoing ACKs. It has
nothing to do with the outgoing data or the size of outgoing data.

And in particular, an ACK only serves the intended purpose of allowing
the remote congestion control to grow the congestion window quickly if
the ACK is ACKing or SACKing new data.

The fix is simple: only count packets as serving the goal of the
quickack mechanism if they are ACKing/SACKing new data. We can tell
whether this is the case by checking inet_csk_ack_scheduled(), since
we schedule an ACK exactly when we are ACKing/SACKing new data.

Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Wed, 4 Oct 2023 21:53:17 +0000 (14:53 -0700)]
Merge tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter patches for net

First patch resolves a regression with vlan header matching, this was
broken since 6.5 release.  From myself.

Second patch fixes an ancient problem with sctp connection tracking in
case INIT_ACK packets are delayed.  This comes with a selftest, both
patches from Xin Long.

Patch 4 extends the existing nftables audit selftest, from
Phil Sutter.

Patch 5, also from Phil, avoids a situation where nftables
would emit an audit record twice. This was broken since 5.13 days.

Patch 6, from myself, avoids spurious insertion failure if we encounter an
overlapping but expired range during element insertion with the
'nft_set_rbtree' backend. This problem exists since 6.2.

* tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
  netfilter: nf_tables: Deduplicate nft_register_obj audit logs
  selftests: netfilter: Extend nft_audit.sh
  selftests: netfilter: test for sctp collision processing in nf_conntrack
  netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
  netfilter: nft_payload: rebuild vlan header on h_proto access
====================

Link: https://lore.kernel.org/r/20231004141405.28749-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfi...
Jakub Kicinski [Wed, 4 Oct 2023 21:25:37 +0000 (14:25 -0700)]
Merge tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter updates for net-next

First patch, from myself, is a bug fix. The issue (connect timeout) is
ancient, so I think its safe to give this more soak time given the esoteric
conditions needed to trigger this.
Also updates the existing selftest to cover this.

Add netlink extacks when an update references a non-existent
table/chain/set.  This allows userspace to provide much better
errors to the user, from Pablo Neira Ayuso.

Last patch adds more policy checks to nf_tables as a better
alternative to the existing runtime checks, from Phil Sutter.

* tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: Utilize NLA_POLICY_NESTED_ARRAY
  netfilter: nf_tables: missing extended netlink error in lookup functions
  selftests: netfilter: test nat source port clash resolution interaction with tcp early demux
  netfilter: nf_nat: undo erroneous tcp edemux lookup after port clash
====================

Link: https://lore.kernel.org/r/20230928144916.18339-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agopage_pool: fix documentation typos
Randy Dunlap [Sun, 1 Oct 2023 00:38:45 +0000 (17:38 -0700)]
page_pool: fix documentation typos

Correct grammar for better readability.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20231001003846.29541-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosctp: Spelling s/preceeding/preceding/g
Geert Uytterhoeven [Thu, 28 Sep 2023 12:17:48 +0000 (14:17 +0200)]
sctp: Spelling s/preceeding/preceding/g

Fix a misspelling of "preceding".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/663b14d07d6d716ddc34482834d6b65a2f714cfb.1695903447.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotipc: fix a potential deadlock on &tx->lock
Chengfeng Ye [Wed, 27 Sep 2023 18:14:14 +0000 (18:14 +0000)]
tipc: fix a potential deadlock on &tx->lock

It seems that tipc_crypto_key_revoke() could be be invoked by
wokequeue tipc_crypto_work_rx() under process context and
timer/rx callback under softirq context, thus the lock acquisition
on &tx->lock seems better use spin_lock_bh() to prevent possible
deadlock.

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.

tipc_crypto_work_rx() <workqueue>
--> tipc_crypto_key_distr()
--> tipc_bcast_xmit()
--> tipc_bcbase_xmit()
--> tipc_bearer_bc_xmit()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock)
<timer interrupt>
   --> tipc_disc_timeout()
   --> tipc_bearer_xmit_skb()
   --> tipc_crypto_xmit()
   --> tipc_ehdr_build()
   --> tipc_crypto_key_revoke()
   --> spin_lock(&tx->lock) <deadlock here>

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: dwmac-stm32: fix resume on STM32 MCU
Ben Wolsieffer [Wed, 27 Sep 2023 17:57:49 +0000 (13:57 -0400)]
net: stmmac: dwmac-stm32: fix resume on STM32 MCU

The STM32MP1 keeps clk_rx enabled during suspend, and therefore the
driver does not enable the clock in stm32_dwmac_init() if the device was
suspended. The problem is that this same code runs on STM32 MCUs, which
do disable clk_rx during suspend, causing the clock to never be
re-enabled on resume.

This patch adds a variant flag to indicate that clk_rx remains enabled
during suspend, and uses this to decide whether to enable the clock in
stm32_dwmac_init() if the device was suspended.

This approach fixes this specific bug with limited opportunity for
unintended side-effects, but I have a follow up patch that will refactor
the clock configuration and hopefully make it less error prone.

Fixes: 6528e02cc9ff ("net: ethernet: stmmac: add adaptation for stm32mp157c.")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230927175749.1419774-1-ben.wolsieffer@hefring.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoptp: ocp: fix error code in probe()
Dan Carpenter [Wed, 27 Sep 2023 12:55:10 +0000 (15:55 +0300)]
ptp: ocp: fix error code in probe()

There is a copy and paste error so this uses a valid pointer instead of
an error pointer.

Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/5c581336-0641-48bd-88f7-51984c3b1f79@moroto.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: mt753x: remove mt753x_phylink_pcs_link_up()
Russell King (Oracle) [Wed, 27 Sep 2023 12:13:56 +0000 (13:13 +0100)]
net: dsa: mt753x: remove mt753x_phylink_pcs_link_up()

Remove the mt753x_phylink_pcs_link_up() function for two reasons:

1) priv->pcs[i].pcs.neg_mode is set true, meaning it doesn't take a
   MLO_AN_FIXED anymore, but one of PHYLINK_PCS_NEG_*. However, this
   is inconsequential due to...
2) priv->pcs[port].pcs.ops is always initialised to point at
   mt7530_pcs_ops, which does not have a pcs_link_up() member.

So, let's remove mt753x_phylink_pcs_link_up() entirely.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1qlTQS-008BWe-Va@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: appletalk: remove cops support
Greg Kroah-Hartman [Wed, 27 Sep 2023 09:00:30 +0000 (11:00 +0200)]
net: appletalk: remove cops support

The COPS Appletalk support is very old, never said to actually work
properly, and the firmware code for the devices are under a very suspect
license.  Remove it all to clear up the license issue, if it is still
needed and actually used by anyone, we can add it back later once the
license is cleared up.

Reported-by: Prarit Bhargava <prarit@redhat.com>
Cc: jschlst@samba.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20230927090029.44704-2-gregkh@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv4: Set offload_failed flag in fibmatch results
Benjamin Poirier [Tue, 26 Sep 2023 18:27:30 +0000 (14:27 -0400)]
ipv4: Set offload_failed flag in fibmatch results

Due to a small omission, the offload_failed flag is missing from ipv4
fibmatch results. Make sure it is set correctly.

The issue can be witnessed using the following commands:
echo "1 1" > /sys/bus/netdevsim/new_device
ip link add dummy1 up type dummy
ip route add 192.0.2.0/24 dev dummy1
echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/fib/fail_route_offload
ip route add 198.51.100.0/24 dev dummy1
ip route
# 192.168.15.0/24 has rt_trap
# 198.51.100.0/24 has rt_offload_failed
ip route get 192.168.15.1 fibmatch
# Result has rt_trap
ip route get 198.51.100.1 fibmatch
# Result differs from the route shown by `ip route`, it is missing
# rt_offload_failed
ip link del dev dummy1
echo 1 > /sys/bus/netdevsim/del_device

Fixes: 36c5100e859d ("IPv4: Add "offload failed" indication to routes")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230926182730.231208-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'linux-kselftest-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Wed, 4 Oct 2023 18:35:23 +0000 (11:35 -0700)]
Merge tag 'linux-kselftest-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
 "One single fix to Makefile to fix the incorrect TARGET name for uevent
  test"

* tag 'linux-kselftest-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: Fix wrong TARGET in kselftest top level Makefile

2 years agoMerge tag 'wireless-2023-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Wed, 4 Oct 2023 18:30:21 +0000 (11:30 -0700)]
Merge tag 'wireless-2023-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================

Quite a collection of fixes this time, really too many
to list individually. Many stack fixes, even rfkill
(found by simulation and the new eevdf scheduler)!

Also a bigger maintainers file cleanup, to remove old
and redundant information.

* tag 'wireless-2023-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (32 commits)
  wifi: iwlwifi: mvm: Fix incorrect usage of scan API
  wifi: mac80211: Create resources for disabled links
  wifi: cfg80211: avoid leaking stack data into trace
  wifi: mac80211: allow transmitting EAPOL frames with tainted key
  wifi: mac80211: work around Cisco AP 9115 VHT MPDU length
  wifi: cfg80211: Fix 6GHz scan configuration
  wifi: mac80211: fix potential key leak
  wifi: mac80211: fix potential key use-after-free
  wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling
  wifi: brcmfmac: Replace 1-element arrays with flexible arrays
  wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet
  wifi: rtw88: rtw8723d: Fix MAC address offset in EEPROM
  rfkill: sync before userspace visibility/changes
  wifi: mac80211: fix mesh id corruption on 32 bit systems
  wifi: cfg80211: add missing kernel-doc for cqm_rssi_work
  wifi: cfg80211: fix cqm_config access race
  wifi: iwlwifi: mvm: Fix a memory corruption issue
  wifi: iwlwifi: Ensure ack flag is properly cleared.
  wifi: iwlwifi: dbg_ini: fix structure packing
  iwlwifi: mvm: handle PS changes in vif_cfg_changed
  ...
====================

Link: https://lore.kernel.org/r/20230927095835.25803-2-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "bnxt_en: Support QOS and TPID settings for the SRIOV VLAN"
Jakub Kicinski [Wed, 4 Oct 2023 18:20:29 +0000 (11:20 -0700)]
Revert "bnxt_en: Support QOS and TPID settings for the SRIOV VLAN"

This reverts commit e76d44fe722761f5480b908e38c5ce1a2c2cb6d6.

We no longer accept drivers extending their use of the legacy
SR-IOV configuration APIs. Users should move to bridge offload.

Link: https://lore.kernel.org/r/20231004112243.41cb6351@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: fec: Add imx8dxl description
Fabio Estevam [Tue, 26 Sep 2023 11:10:17 +0000 (08:10 -0300)]
dt-bindings: net: fec: Add imx8dxl description

The imx8dl FEC has the same programming model as the one on the imx8qxp.

Add the imx8dl compatible string.

Signed-off-by: Fabio Estevam <festevam@denx.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230926111017.320409-1-festevam@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: fix linking when CONFIG_PTP_1588_CLOCK=n
Jacob Keller [Mon, 2 Oct 2023 18:51:32 +0000 (11:51 -0700)]
ice: fix linking when CONFIG_PTP_1588_CLOCK=n

The recent support for DPLL introduced by commit 8a3a565ff210 ("ice: add
admin commands to access cgu configuration") and commit d7999f5ea64b ("ice:
implement dpll interface to control cgu") broke linking the ice driver if
CONFIG_PTP_1588_CLOCK=n:

ld: vmlinux.o: in function `ice_init_feature_support':
(.text+0x8702b8): undefined reference to `ice_is_phy_rclk_present'
ld: (.text+0x8702cd): undefined reference to `ice_is_cgu_present'
ld: (.text+0x8702d9): undefined reference to `ice_is_clock_mux_present_e810t'
ld: vmlinux.o: in function `ice_dpll_init_info_direct_pins':
ice_dpll.c:(.text+0x894167): undefined reference to `ice_cgu_get_pin_freq_supp'
ld: ice_dpll.c:(.text+0x894197): undefined reference to `ice_cgu_get_pin_name'
ld: ice_dpll.c:(.text+0x8941a8): undefined reference to `ice_cgu_get_pin_type'
ld: vmlinux.o: in function `ice_dpll_update_state':
ice_dpll.c:(.text+0x894494): undefined reference to `ice_get_cgu_state'
ld: vmlinux.o: in function `ice_dpll_init':
(.text+0x8953d5): undefined reference to `ice_get_cgu_rclk_pin_info'

The first commit broke things by calling functions in
ice_init_feature_support that are compiled as part of ice_ptp_hw.o,
including:

* ice_is_phy_rclk_present
* ice_is_clock_mux_present_e810t
* ice_is_cgU_present

The second commit continued the break by calling several CGU functions
defined in ice_ptp_hw.c in the DPLL code.
Because the ice_dpll.c file is compiled unconditionally, it will not
link when CONFIG_PTP_1588_CLOCK=n.

It might be possible to break this dependency and expose those functions
without CONFIG_PTP_1588_CLOCK, but that is not clear to me.

For the DPLL case, simply compile ice_dpll.o only when we have
CONFIG_PTP_1588_CLOCK. Add stub no-op implementation of ice_dpll_init() and
ice_dpll_uninit() when CONFIG_PTP_1588_CLOCK=n into ice_dpll.h

The other functions are part of checking the netlist to see if hardware
features are enabled. These checks don't really belong in ice_ptp_hw.c, and
make more sense as part of the ice_common.c file. We already have
ice_is_gps_in_netlist() in ice_common.c which is doing a similar check.

Move the functions into ice_common.c and rename them to have the similar
postfix of "in_netlist()" to be more expressive of what they are actually
checking.

This also makes the ice_find_netlist_node only called from within
ice_common.c, so its safe to mark it static and stop declaring it in the
ice_common.h header as well.

Fixes: 8a3a565ff210 ("ice: add admin commands to access cgu configuration")
Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309191214.TaYEct4H-lkp@intel.com
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20231002185132.1575271-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Wed, 4 Oct 2023 15:28:07 +0000 (08:28 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2023-10-02

We've added 11 non-merge commits during the last 12 day(s) which contain
a total of 12 files changed, 176 insertions(+), 41 deletions(-).

The main changes are:

1) Fix BPF verifier to reset backtrack_state masks on global function
   exit as otherwise subsequent precision tracking would reuse them,
   from Andrii Nakryiko.

2) Several sockmap fixes for available bytes accounting,
   from John Fastabend.

3) Reject sk_msg egress redirects to non-TCP sockets given this
   is only supported for TCP sockets today, from Jakub Sitnicki.

4) Fix a syzkaller splat in bpf_mprog when hitting maximum program
   limits with BPF_F_BEFORE directive, from Daniel Borkmann
   and Nikolay Aleksandrov.

5) Fix BPF memory allocator to use kmalloc_size_roundup() to adjust
   size_index for selecting a bpf_mem_cache, from Hou Tao.

6) Fix arch_prepare_bpf_trampoline return code for s390 JIT,
   from Song Liu.

7) Fix bpf_trampoline_get when CONFIG_BPF_JIT is turned off,
   from Leon Hwang.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Use kmalloc_size_roundup() to adjust size_index
  selftest/bpf: Add various selftests for program limits
  bpf, mprog: Fix maximum program check on mprog attachment
  bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets
  bpf, sockmap: Add tests for MSG_F_PEEK
  bpf, sockmap: Do not inc copied_seq when PEEK flag set
  bpf: tcp_read_skb needs to pop skb regardless of seq
  bpf: unconditionally reset backtrack_state masks on global func exit
  bpf: Fix tr dereferencing
  selftests/bpf: Check bpf_cubic_acked() is called via struct_ops
  s390/bpf: Let arch_prepare_bpf_trampoline return program size
====================

Link: https://lore.kernel.org/r/20231002113417.2309-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonetfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
Florian Westphal [Thu, 28 Sep 2023 13:12:44 +0000 (15:12 +0200)]
netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure

nft_rbtree_gc_elem() walks back and removes the end interval element that
comes before the expired element.

There is a small chance that we've cached this element as 'rbe_ge'.
If this happens, we hold and test a pointer that has been queued for
freeing.

It also causes spurious insertion failures:

$ cat test-testcases-sets-0044interval_overlap_0.1/testout.log
Error: Could not process rule: File exists
add element t s {  0 -  2 }
                   ^^^^^^
Failed to insert  0 -  2 given:
table ip t {
        set s {
                type inet_service
                flags interval,timeout
                timeout 2s
                gc-interval 2s
        }
}

The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.

Reason is that when we try to insert, the tree may hold an expired
element that collides with the range we're adding.
While we do evict/erase this element, we can trip over this check:

if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new))
      return -ENOTEMPTY;

rbe_ge was erased by the synchronous gc, we should not have done this
check.  Next attempt won't find it, so retry results in successful
insertion.

Restart in-kernel to avoid such spurious errors.

Such restart are rare, unless userspace intentionally adds very large
numbers of elements with very short timeouts while setting a huge
gc interval.

Even in this case, this cannot loop forever, on each retry an existing
element has been removed.

As the caller is holding the transaction mutex, its impossible
for a second entity to add more expiring elements to the tree.

After this it also becomes feasible to remove the async gc worker
and perform all garbage collection from the commit path.

Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection")
Signed-off-by: Florian Westphal <fw@strlen.de>
2 years agonetfilter: nf_tables: Deduplicate nft_register_obj audit logs
Phil Sutter [Sat, 23 Sep 2023 01:53:50 +0000 (03:53 +0200)]
netfilter: nf_tables: Deduplicate nft_register_obj audit logs

When adding/updating an object, the transaction handler emits suitable
audit log entries already, the one in nft_obj_notify() is redundant. To
fix that (and retain the audit logging from objects' 'update' callback),
Introduce an "audit log free" variant for internal use.

Fixes: c520292f29b8 ("audit: log nftables configuration change events once per table")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com> (Audit)
Signed-off-by: Florian Westphal <fw@strlen.de>
2 years agoselftests: netfilter: Extend nft_audit.sh
Phil Sutter [Sat, 23 Sep 2023 01:53:49 +0000 (03:53 +0200)]
selftests: netfilter: Extend nft_audit.sh

Add tests for sets and elements and deletion of all kinds. Also
reorder rule reset tests: By moving the bulk rule add command up, the
two 'reset rules' tests become identical.

While at it, fix for a failing bulk rule add test's error status getting
lost due to its use in a pipe. Avoid this by using a temporary file.

Headings in diff output for failing tests contain no useful data, strip
them.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 years agoselftests: netfilter: test for sctp collision processing in nf_conntrack
Xin Long [Tue, 3 Oct 2023 17:17:54 +0000 (13:17 -0400)]
selftests: netfilter: test for sctp collision processing in nf_conntrack

This patch adds a test case to reproduce the SCTP DATA chunk retransmission
timeout issue caused by the improper SCTP collision processing in netfilter
nf_conntrack_proto_sctp.

In this test, client sends a INIT chunk, but the INIT_ACK replied from
server is delayed until the server sends a INIT chunk to start a new
connection from its side. After the connection is complete from server
side, the delayed INIT_ACK arrives in nf_conntrack_proto_sctp.

The delayed INIT_ACK should be dropped in nf_conntrack_proto_sctp instead
of updating the vtag with the out-of-date init_tag, otherwise, the vtag
in DATA chunks later sent by client don't match the vtag in the conntrack
entry and the DATA chunks get dropped.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 years agonetfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
Xin Long [Tue, 3 Oct 2023 17:17:53 +0000 (13:17 -0400)]
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp

In Scenario A and B below, as the delayed INIT_ACK always changes the peer
vtag, SCTP ct with the incorrect vtag may cause packet loss.

Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK

  192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772]
    192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151]
    192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772]
  192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] *
  192.168.1.2 > 192.168.1.1: [COOKIE ECHO]
    192.168.1.1 > 192.168.1.2: [COOKIE ECHO]
    192.168.1.2 > 192.168.1.1: [COOKIE ACK]

Scenario B: INIT_ACK is delayed until the peer completes its own handshake

  192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
    192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
    192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
    192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
    192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
  192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] *

This patch fixes it as below:

In SCTP_CID_INIT processing:
- clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] &&
  ct->proto.sctp.init[!dir]. (Scenario E)
- set ct->proto.sctp.init[dir].

In SCTP_CID_INIT_ACK processing:
- drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] &&
  ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C)
- drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] &&
  ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A)

In SCTP_CID_COOKIE_ACK processing:
- clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir].
  (Scenario D)

Also, it's important to allow the ct state to move forward with cookie_echo
and cookie_ack from the opposite dir for the collision scenarios.

There are also other Scenarios where it should allow the packet through,
addressed by the processing above:

Scenario C: new CT is created by INIT_ACK.

Scenario D: start INIT on the existing ESTABLISHED ct.

Scenario E: start INIT after the old collision on the existing ESTABLISHED
ct.

  192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
  192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
  (both side are stopped, then start new connection again in hours)
  192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742]

Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 years agonetfilter: nft_payload: rebuild vlan header on h_proto access
Florian Westphal [Fri, 29 Sep 2023 08:42:10 +0000 (10:42 +0200)]
netfilter: nft_payload: rebuild vlan header on h_proto access

nft can perform merging of adjacent payload requests.
This means that:

ether saddr 00:11 ... ether type 8021ad ...

is a single payload expression, for 8 bytes, starting at the
ethernet source offset.

Check that offset+length is fully within the source/destination mac
addersses.

This bug prevents 'ether type' from matching the correct h_proto in case
vlan tag got stripped.

Fixes: de6843be3082 ("netfilter: nft_payload: rebuild vlan header when needed")
Reported-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 years agoMerge branch 'bnxt_en-hwmon-SRIOV'
David S. Miller [Wed, 4 Oct 2023 10:23:02 +0000 (11:23 +0100)]
Merge branch 'bnxt_en-hwmon-SRIOV'

Michael Chan says:

====================
bnxt_en: hwmon and SRIOV updates

The first 7 patches are v2 of the hwmon patches posted about 6 weeks ago
on Aug 14.  The last 2 patches are SRIOV related updates.

Link to v1 hwmon patches:
https://lore.kernel.org/netdev/20230815045658.80494-11-michael.chan@broadcom.com/
====================

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Update VNIC resource calculation for VFs
Vikas Gupta [Wed, 27 Sep 2023 03:57:34 +0000 (20:57 -0700)]
bnxt_en: Update VNIC resource calculation for VFs

Newer versions of firmware will pre-reserve 1 VNIC for every possible
PF and VF function.  Update the driver logic to take this into account
when assigning VNICs to the VFs.  These pre-reserved VNICs for the
inactive VFs should be subtracted from the global pool before
assigning them to the active VFs.

Not doing so may cause discrepancies that ultimately may cause some VFs to
have insufficient VNICs to support features such as aRFS.

Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Support QOS and TPID settings for the SRIOV VLAN
Sreekanth Reddy [Wed, 27 Sep 2023 03:57:33 +0000 (20:57 -0700)]
bnxt_en: Support QOS and TPID settings for the SRIOV VLAN

Add these missing settings in the .ndo_set_vf_vlan() method.
Older firmware does not support the TPID setting so check for
proper support.

Remove the unused BNXT_VF_QOS flag.

Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Event handler for Thermal event
Kalesh AP [Wed, 27 Sep 2023 03:57:32 +0000 (20:57 -0700)]
bnxt_en: Event handler for Thermal event

Newer FW will send a new async event when it detects that
the chip's temperature has crossed the configured threshold value.
The driver will now notify hwmon and will log a warning message.

Link: https://lore.kernel.org/netdev/20230815045658.80494-13-michael.chan@broadcom.com/
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Use non-standard attribute to expose shutdown temperature
Kalesh AP [Wed, 27 Sep 2023 03:57:31 +0000 (20:57 -0700)]
bnxt_en: Use non-standard attribute to expose shutdown temperature

Implement the sysfs attributes directly in the driver for
shutdown threshold temperature and pass an extra attribute group
to the hwmon core when registering the hwmon device.

Link: https://lore.kernel.org/netdev/20230815045658.80494-12-michael.chan@broadcom.com/
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Expose threshold temperatures through hwmon
Kalesh AP [Wed, 27 Sep 2023 03:57:30 +0000 (20:57 -0700)]
bnxt_en: Expose threshold temperatures through hwmon

HWRM_TEMP_MONITOR_QUERY response now indicates various
threshold temperatures. Expose these threshold temperatures
through the hwmon sysfs using this mapping:

hwmon_temp_max : bp->warn_thresh_temp
hwmon_temp_crit : bp->crit_thresh_temp
hwmon_temp_emergency : bp->fatal_thresh_temp

hwmon_temp_max_alarm : temp >= bp->warn_thresh_temp
hwmon_temp_crit_alarm : temp >= bp->crit_thresh_temp
hwmon_temp_emergency_alarm : temp >= bp->fatal_thresh_temp

Link: https://lore.kernel.org/netdev/20230815045658.80494-12-michael.chan@broadcom.com/
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Modify the driver to use hwmon_device_register_with_info
Kalesh AP [Wed, 27 Sep 2023 03:57:29 +0000 (20:57 -0700)]
bnxt_en: Modify the driver to use hwmon_device_register_with_info

The use of hwmon_device_register_with_groups() is deprecated.
Modified the driver to use hwmon_device_register_with_info().

Driver currently exports only temp1_input through hwmon sysfs
interface. But FW has been modified to report more threshold
temperatures and driver want to report them through the
hwmon interface.

Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Move hwmon functions into a dedicated file
Kalesh AP [Wed, 27 Sep 2023 03:57:28 +0000 (20:57 -0700)]
bnxt_en: Move hwmon functions into a dedicated file

This is in preparation for upcoming patches in the series.
Driver has to expose more threshold temperatures through the
hwmon sysfs interface. More code will be added and do not
want to overload bnxt.c.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Enhance hwmon temperature reporting
Kalesh AP [Wed, 27 Sep 2023 03:57:27 +0000 (20:57 -0700)]
bnxt_en: Enhance hwmon temperature reporting

Driver currently does hwmon device register and unregister
in open and close() respectively. As a result, user will not
be able to query hwmon temperature when interface is in
ifdown state.

Enhance it by moving the hwmon register/unregister to the
probe/remove functions.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Update firmware interface to 1.10.2.171
Michael Chan [Wed, 27 Sep 2023 03:57:26 +0000 (20:57 -0700)]
bnxt_en: Update firmware interface to 1.10.2.171

The main changes are the additional thermal thresholds in
hwrm_temp_monitor_query_output and the new async event to
report thermal errors.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoibmveth: Remove condition to recompute TCP header checksum.
David Wilder [Tue, 26 Sep 2023 21:42:51 +0000 (16:42 -0500)]
ibmveth: Remove condition to recompute TCP header checksum.

In some OVS environments the TCP pseudo header checksum may need to be
recomputed. Currently this is only done when the interface instance is
configured for "Trunk Mode". We found the issue also occurs in some
Kubernetes environments, these environments do not use "Trunk Mode",
therefor the condition is removed.

Performance tests with this change show only a fractional decrease in
throughput (< 0.2%).

Fixes: 7525de2516fb ("ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Reviewed-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodmaengine: ti: k3-udma-glue: clean up k3_udma_glue_tx_get_irq() return
Dan Carpenter [Tue, 26 Sep 2023 14:06:58 +0000 (17:06 +0300)]
dmaengine: ti: k3-udma-glue: clean up k3_udma_glue_tx_get_irq() return

The k3_udma_glue_tx_get_irq() function currently returns negative error
codes on error, zero on error and positive values for success.  This
complicates life for the callers who need to propagate the error code.
Also GCC will not warn about unsigned comparisons when you check:

if (unsigned_irq <= 0)

All the callers have been fixed now but let's just make this easy going
forward.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ti: icssg-prueth: Fix signedness bug in prueth_init_tx_chns()
Dan Carpenter [Tue, 26 Sep 2023 14:05:59 +0000 (17:05 +0300)]
net: ti: icssg-prueth: Fix signedness bug in prueth_init_tx_chns()

The "tx_chn->irq" variable is unsigned so the error checking does not
work correctly.

Fixes: 128d5874c082 ("net: ti: icssg-prueth: Add ICSSG ethernet driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()
Dan Carpenter [Tue, 26 Sep 2023 14:04:43 +0000 (17:04 +0300)]
net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()

This accidentally returns success, but it should return a negative error
code.

Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agovringh: don't use vringh_kiov_advance() in vringh_iov_xfer()
Stefano Garzarella [Mon, 25 Sep 2023 10:30:57 +0000 (12:30 +0200)]
vringh: don't use vringh_kiov_advance() in vringh_iov_xfer()

In the while loop of vringh_iov_xfer(), `partlen` could be 0 if one of
the `iov` has 0 lenght.
In this case, we should skip the iov and go to the next one.
But calling vringh_kiov_advance() with 0 lenght does not cause the
advancement, since it returns immediately if asked to advance by 0 bytes.

Let's restore the code that was there before commit b8c06ad4d67d
("vringh: implement vringh_kiov_advance()"), avoiding using
vringh_kiov_advance().

Fixes: b8c06ad4d67d ("vringh: implement vringh_kiov_advance()")
Cc: stable@vger.kernel.org
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMAINTAINERS: adjust header file entry in DPLL SUBSYSTEM
Lukas Bulwahn [Mon, 25 Sep 2023 05:43:05 +0000 (07:43 +0200)]
MAINTAINERS: adjust header file entry in DPLL SUBSYSTEM

Commit 9431063ad323 ("dpll: core: Add DPLL framework base functions") adds
the section DPLL SUBSYSTEM in MAINTAINERS and includes a file entry to the
non-existing file 'include/net/dpll.h'.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference. Looking at the file stat of the commit above, this entry
clearly intended to refer to 'include/linux/dpll.h'.

Adjust this header file entry in DPLL SUBSYSTEM.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agorswitch: Fix PHY station management clock setting
Yoshihiro Shimoda [Tue, 26 Sep 2023 12:30:54 +0000 (21:30 +0900)]
rswitch: Fix PHY station management clock setting

Fix the MPIC.PSMCS value following the programming example in the
section 6.4.2 Management Data Clock (MDC) Setting, Ethernet MAC IP,
S4 Hardware User Manual Rev.1.00.

The value is calculated by
    MPIC.PSMCS = clk[MHz] / (MDC frequency[MHz] * 2) - 1
with the input clock frequency from clk_get_rate() and MDC frequency
of 2.5MHz. Otherwise, this driver cannot communicate PHYs on the R-Car
S4 Starter Kit board.

Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Reported-by: Tam Nguyen <tam.nguyen.xa@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230926123054.3976752-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add sysctl to disable rfc4862 5.5.3e lifetime handling
Patrick Rohr [Mon, 25 Sep 2023 21:47:11 +0000 (14:47 -0700)]
net: add sysctl to disable rfc4862 5.5.3e lifetime handling

This change adds a sysctl to opt-out of RFC4862 section 5.5.3e's valid
lifetime derivation mechanism.

RFC4862 section 5.5.3e prescribes that the valid lifetime in a Router
Advertisement PIO shall be ignored if it less than 2 hours and to reset
the lifetime of the corresponding address to 2 hours. An in-progress
6man draft (see draft-ietf-6man-slaac-renum-07 section 4.2) is currently
looking to remove this mechanism. While this draft has not been moving
particularly quickly for other reasons, there is widespread consensus on
section 4.2 which updates RFC4862 section 5.5.3e.

Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Jen Linkova <furry@google.com>
Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230925214711.959704-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoiavf: remove "inline" functions from iavf_txrx.c
Jacob Keller [Thu, 24 Aug 2023 20:44:00 +0000 (14:44 -0600)]
iavf: remove "inline" functions from iavf_txrx.c

The iAVF txrx hotpath code has several functions that are marked as
"static inline" in the iavf_txrx.c file. This use of inline is frowned
upon in the netdev community and explicitly marked as something to avoid
in the Linux coding-style document (section 15).

Even though these functions are only used once, it is expected that GCC
is smart enough to decide when to perform function inlining where
appropriate without the "hint".

./scripts/bloat-o-meter is showing zero difference with this changes.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoi40e: Add rx_missed_errors for buffer exhaustion
Yajun Deng [Wed, 6 Sep 2023 07:27:57 +0000 (15:27 +0800)]
i40e: Add rx_missed_errors for buffer exhaustion

As the comment in struct rtnl_link_stats64, rx_dropped should not
include packets dropped by the device due to buffer exhaustion.
They are counted in rx_missed_errors, procfs folds those two counters
together.

Add rx_missed_errors for buffer exhaustion, rx_missed_errors corresponds
to rx_discards, rx_dropped corresponds to rx_discards_other.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoMerge branch 'documentation-fixes-for-dpll-subsystem'
Jakub Kicinski [Tue, 3 Oct 2023 22:02:19 +0000 (15:02 -0700)]
Merge branch 'documentation-fixes-for-dpll-subsystem'

Bagas Sanjaya says:

====================
Documentation fixes for dpll subsystem

Here is a mini docs fixes for dpll subsystem. The fixes are all code
block-related.

This series is triggered because I was emailed by kernel test robot,
alerting htmldocs warnings (see patch [1/2]).

[1]: https://lore.kernel.org/all/20230918093240.29824-1-bagasdotme@gmail.com/
====================

Link: https://lore.kernel.org/r/20230928052708.44820-1-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: dpll: wrap DPLL_CMD_PIN_GET output in a code block
Bagas Sanjaya [Thu, 28 Sep 2023 05:27:08 +0000 (12:27 +0700)]
Documentation: dpll: wrap DPLL_CMD_PIN_GET output in a code block

DPLL_CMD_PIN_GET netlink command output for mux-type pins looks ugly
with normal paragraph formatting. Format it as a code block instead.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/20230928052708.44820-3-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: dpll: Fix code blocks
Bagas Sanjaya [Thu, 28 Sep 2023 05:27:07 +0000 (12:27 +0700)]
Documentation: dpll: Fix code blocks

kernel test robot and Stephen Rothwell report htmldocs warnings:

Documentation/driver-api/dpll.rst:427: WARNING: Error in "code-block" directive:
maximum 1 argument(s) allowed, 18 supplied.

.. code-block:: c
<snipped>...
Documentation/driver-api/dpll.rst:444: WARNING: Error in "code-block" directive:
maximum 1 argument(s) allowed, 21 supplied.

.. code-block:: c
<snipped>...
Documentation/driver-api/dpll.rst:474: WARNING: Error in "code-block" directive:
maximum 1 argument(s) allowed, 12 supplied.

.. code-block:: c
<snipped>...

Fix these above by adding missing blank line separator after code-block
directive.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309180456.lOhxy9gS-lkp@intel.com/
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/20230918131521.155e9e63@canb.auug.org.au/
Fixes: dbb291f19393b6 ("dpll: documentation on DPLL subsystem interface")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/20230928052708.44820-2-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'introduce-define_flex-macro'
Jakub Kicinski [Tue, 3 Oct 2023 19:17:13 +0000 (12:17 -0700)]
Merge branch 'introduce-define_flex-macro'

Przemek Kitszel says:

====================
introduce DEFINE_FLEX() macro

Add DEFINE_FLEX() macro, that helps on-stack allocation of structures
with trailing flex array member.
Expose __struct_size() macro which reads size of data allocated
by DEFINE_FLEX().

Accompany new macros introduction with actual usage,
in the ice driver - hence targeting for netdev tree.

Obvious benefits include simpler resulting code, less heap usage,
less error checking. Less obvious is the fact that compiler has
more room to optimize, and as a whole, even with more stuff on the stack,
we end up with overall better (smaller) report from bloat-o-meter:
add/remove: 8/6 grow/shrink: 7/18 up/down: 2211/-2270 (-59)
(individual results in each patch).
====================

Link: https://lore.kernel.org/r/20230912115937.1645707-1-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: make use of DEFINE_FLEX() in ice_switch.c
Przemek Kitszel [Tue, 12 Sep 2023 11:59:37 +0000 (07:59 -0400)]
ice: make use of DEFINE_FLEX() in ice_switch.c

Use DEFINE_FLEX() macro for 1-elem flex array members of ice_switch.c

Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-8-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: make use of DEFINE_FLEX() for struct ice_aqc_dis_txq_item
Przemek Kitszel [Tue, 12 Sep 2023 11:59:36 +0000 (07:59 -0400)]
ice: make use of DEFINE_FLEX() for struct ice_aqc_dis_txq_item

Use DEFINE_FLEX() macro for 1-elem flex array use case
of struct ice_aqc_dis_txq_item.

Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-7-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: make use of DEFINE_FLEX() for struct ice_aqc_add_tx_qgrp
Przemek Kitszel [Tue, 12 Sep 2023 11:59:35 +0000 (07:59 -0400)]
ice: make use of DEFINE_FLEX() for struct ice_aqc_add_tx_qgrp

Use DEFINE_FLEX() macro for 1-elem flex array use case
of struct ice_aqc_add_tx_qgrp.

Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-6-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: make use of DEFINE_FLEX() in ice_ddp.c
Przemek Kitszel [Tue, 12 Sep 2023 11:59:34 +0000 (07:59 -0400)]
ice: make use of DEFINE_FLEX() in ice_ddp.c

Use DEFINE_FLEX() macro for constant-num-of-elems (4)
flex array members of ice_ddp.c

Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-5-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: drop two params of ice_aq_move_sched_elems()
Przemek Kitszel [Tue, 12 Sep 2023 11:59:33 +0000 (07:59 -0400)]
ice: drop two params of ice_aq_move_sched_elems()

Remove two arguments of ice_aq_move_sched_elems().
Last of them was always NULL, and @grps_req was always 1.

Assuming @grps_req to be one, allows us to use DEFINE_FLEX() macro,
what removes some need for heap allocations.

Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-4-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: ice_sched_remove_elems: replace 1 elem array param by u32
Przemek Kitszel [Tue, 12 Sep 2023 11:59:32 +0000 (07:59 -0400)]
ice: ice_sched_remove_elems: replace 1 elem array param by u32

Replace array+size params of ice_sched_remove_elems:() by just single u32,
as all callers are using it with "1".

This enables moving from heap-based, to stack-based allocation, what is also
more elegant thanks to DEFINE_FLEX() macro.

Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-3-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agooverflow: add DEFINE_FLEX() for on-stack allocs
Przemek Kitszel [Tue, 12 Sep 2023 11:59:31 +0000 (07:59 -0400)]
overflow: add DEFINE_FLEX() for on-stack allocs

Add DEFINE_FLEX() macro for on-stack allocations of structs with
flexible array member.

Expose __struct_size() macro outside of fortify-string.h, as it could be
used to read size of structs allocated by DEFINE_FLEX().
Move __member_size() alongside it.
-Kees

Using underlying array for on-stack storage lets us to declare
known-at-compile-time structures without kzalloc().

Actual usage for ice driver is in following patches of the series.

Missing __has_builtin() workaround is moved up to serve also assembly
compilation with m68k-linux-gcc, see [1].
Error was (note the .S file extension):
In file included from ../include/linux/linkage.h:5,
                 from ../arch/m68k/fpsp040/skeleton.S:40:
../include/linux/compiler_types.h:331:5: warning: "__has_builtin" is not defined, evaluates to 0 [-Wundef]
  331 | #if __has_builtin(__builtin_dynamic_object_size)
      |     ^~~~~~~~~~~~~
../include/linux/compiler_types.h:331:18: error: missing binary operator before token "("
  331 | #if __has_builtin(__builtin_dynamic_object_size)
      |                  ^

[1] https://lore.kernel.org/netdev/202308112122.OuF0YZqL-lkp@intel.com/
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230912115937.1645707-2-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'nfs-for-6.6-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Tue, 3 Oct 2023 19:13:10 +0000 (12:13 -0700)]
Merge tag 'nfs-for-6.6-3' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
 "Stable fixes:
   - Revert "SUNRPC dont update timeout value on connection reset"
   - NFSv4: Fix a state manager thread deadlock regression

  Fixes:
   - Fix potential NULL pointer dereference in nfs_inode_remove_request()
   - Fix rare NULL pointer dereference in xs_tcp_tls_setup_socket()
   - Fix long delay before failing a TLS mount when server does not
     support TLS
   - Fix various NFS state manager issues"

* tag 'nfs-for-6.6-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  nfs: decrement nrequests counter before releasing the req
  SUNRPC/TLS: Lock the lower_xprt during the tls handshake
  Revert "SUNRPC dont update timeout value on connection reset"
  NFSv4: Fix a state manager thread deadlock regression
  NFSv4: Fix a nfs4_state_manager() race
  SUNRPC: Fail quickly when server does not recognize TLS