tracing: fprobe: Cleanup fprobe hash when module unloading
Cleanup fprobe address hash table on module unloading because the
target symbols will be disappeared when unloading module and not
sure the same symbol is mapped on the same address.
Note that this is at least disables the fprobes if a part of target
symbols on the unloaded modules. Unlike kprobes, fprobe does not
re-enable the probe point by itself. To do that, the caller should
take care register/unregister fprobe when loading/unloading modules.
This simplifies the fprobe state managememt related to the module
loading/unloading.
tracing: fprobe events: Fix possible UAF on modules
Commit ac91052f0ae5 ("tracing: tprobe-events: Fix leakage of module
refcount") moved try_module_get() from __find_tracepoint_module_cb()
to find_tracepoint() caller, but that introduced a possible UAF
because the module can be unloaded before try_module_get(). In this
case, the module object should be freed too. Thus, try_module_get()
does not only fail but may access to the freed object.
To avoid that, try_module_get() in __find_tracepoint_module_cb()
again.
tracing: fprobe: Fix to lock module while registering fprobe
Since register_fprobe() does not get the module reference count while
registering fgraph filter, if the target functions (symbols) are in
modules, those modules can be unloaded when registering fprobe to
fgraph.
To avoid this issue, get the reference counter of module for each
symbol, and put it after register the fprobe.
Linus Torvalds [Sat, 22 Mar 2025 17:45:44 +0000 (10:45 -0700)]
Merge tag 'io_uring-6.14-20250322' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single fix for the commit that went into your tree yesterday,
which exposed an issue with not always clearing notifications. That
could cause them to be used more than once"
* tag 'io_uring-6.14-20250322' of git://git.kernel.dk/linux:
io_uring/net: fix sendzc double notif flush
Before the blamed commit, sendzc relied on io_req_msg_cleanup() to clear
REQ_F_NEED_CLEANUP, so after the following snippet the request will
never hit the core io_uring cleanup path.
io_notif_flush();
io_req_msg_cleanup();
The easiest fix is to null the notification. io_send_zc_cleanup() can
still be called after, but it's tolerated.
David Howells [Wed, 19 Mar 2025 15:57:46 +0000 (15:57 +0000)]
keys: Fix UAF in key_put()
Once a key's reference count has been reduced to 0, the garbage collector
thread may destroy it at any time and so key_put() is not allowed to touch
the key after that point. The most key_put() is normally allowed to do is
to touch key_gc_work as that's a static global variable.
However, in an effort to speed up the reclamation of quota, this is now
done in key_put() once the key's usage is reduced to 0 - but now the code
is looking at the key after the deadline, which is forbidden.
Fix this by using a flag to indicate that a key can be gc'd now rather than
looking at the key's refcount in the garbage collector.
Fixes: 9578e327b2b4 ("keys: update key quotas in key_put()") Reported-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/673b6aec.050a0220.87769.004a.GAE@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Namhyung Kim [Sat, 22 Mar 2025 07:13:01 +0000 (08:13 +0100)]
perf/amd/ibs: Prevent leaking sensitive data to userspace
Although IBS "swfilt" can prevent leaking samples with kernel RIP to the
userspace, there are few subtle cases where a 'data' address and/or a
'branch target' address can fall under kernel address range although RIP
is from userspace. Prevent leaking kernel 'data' addresses by discarding
such samples when {exclude_kernel=1,swfilt=1}.
IBS can now be invoked by unprivileged user with the introduction of
"swfilt". However, this creates a loophole in the interface where an
unprivileged user can get physical address of the userspace virtual
addresses through IBS register raw dump (PERF_SAMPLE_RAW). Prevent this
as well.
This upstream commit fixed the most obvious leak:
65a99264f5e5 perf/x86: Check data address for IBS software filter
Follow that up with a more complete fix.
Fixes: d29e744c7167 ("perf/x86: Relax privilege filter restriction on AMD IBS") Suggested-by: Matteo Rizzo <matteorizzo@google.com> Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250321161251.1033-1-ravi.bangoria@amd.com
Linus Torvalds [Fri, 21 Mar 2025 20:42:55 +0000 (13:42 -0700)]
Merge tag 'regulator-fix-v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"More fixes than I'd like at this point, some of which is due to me
cooking things in -next for a bit and resetting that cooking time as
more fixes came in.
- Christian Eggers fixed some race conditions with the dummy
regulator not being available very early in boot due to the use of
asynchronous probing, both the provider side (ensuring that it's
availalbe) and consumer side (handling things if that goes wrong)
are fixed
- Ludvig Pärsson fixed some lockdep issues with the debugfs
registration for regulators holding more locks than it really needs
causing issues later when looking at the resulting debugfs.boot
- Some device specific fixes for incorrect descriptions of the
RTQ2208 from ChiYuan Huang"
* tag 'regulator-fix-v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: rtq2208: Fix the LDO DVS capability
regulator: rtq2208: Fix incorrect buck converter phase mapping
regulator: check that dummy regulator has been probed before using it
regulator: dummy: force synchronous probing
regulator: core: Fix deadlock in create_regulator()
Linus Torvalds [Fri, 21 Mar 2025 20:02:28 +0000 (13:02 -0700)]
Merge tag 'pinctrl-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fix from Linus Walleij:
- A single patch for Spacemit K1 fixing up the Kconfig to not default
to "y"
* tag 'pinctrl-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: spacemit: PINCTRL_SPACEMIT_K1 should not default to y unconditionally
Linus Torvalds [Fri, 21 Mar 2025 17:30:15 +0000 (10:30 -0700)]
Merge tag 'io_uring-6.14-20250321' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Single fix heading to stable, fixing an issue with io_req_msg_cleanup()
sometimes too eagerly clearing cleanup flags"
* tag 'io_uring-6.14-20250321' of git://git.kernel.dk/linux:
io_uring/net: don't clear REQ_F_NEED_CLEANUP unconditionally
Linus Torvalds [Fri, 21 Mar 2025 15:52:31 +0000 (08:52 -0700)]
Merge tag 'perf-urgent-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf events fixes from Ingo Molnar:
"Two fixes: an RAPL PMU driver error handling fix, and an AMD IBS
software filter fix"
* tag 'perf-urgent-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/rapl: Fix error handling in init_rapl_pmus()
perf/x86: Check data address for IBS software filter
Linus Torvalds [Fri, 21 Mar 2025 15:48:40 +0000 (08:48 -0700)]
Merge tag 'sched-urgent-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Revert a scheduler performance optimization that regressed other
workloads"
* tag 'sched-urgent-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "sched/core: Reduce cost of sched_move_task when config autogroup"
* tag 'drm-fixes-2025-03-21' of https://gitlab.freedesktop.org/drm/kernel: (21 commits)
drm/xe: Fix exporting xe buffers multiple times
gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU
drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2
drm/amd/display: Fix incorrect fw_state address in dmub_srv
drm/amd/display: Use HW lock mgr for PSR1 when only one eDP
drm/amd/display: Fix message for support_edp0_on_dp1
drm/amdkfd: Fix user queue validation on Gfx7/8
drm/amdgpu: Restore uncached behaviour on GFX12
drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini()
drm/amdkfd: Fix instruction hazard in gfx12 trap handler
drm/amdgpu/pm: wire up hwmon fan speed for smu 14.0.2
drm/amd/pm: add unique_id for gfx12
drm/amdgpu: Remove JPEG from vega and carrizo video caps
drm/amdgpu: Fix JPEG video caps max size for navi1x and raven
drm/amdgpu: Fix MPEG2, MPEG4 and VC1 video caps max size
drm/radeon: fix uninitialized size issue in radeon_vce_cs_parse()
accel/qaic: Fix integer overflow in qaic_validate_req()
accel/qaic: Fix possible data corruption in BOs > 2G
drm/v3d: Set job pointer to NULL when the job's fence has an error
drm/v3d: Don't run jobs that have errors flagged in its fence
...
Linus Torvalds [Thu, 20 Mar 2025 23:55:24 +0000 (16:55 -0700)]
Merge tag 'dma-mapping-6.14-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fix from Marek Szyprowski:
- fix missing clear bdr in check_ram_in_range_map() (Baochen Qiang)
* tag 'dma-mapping-6.14-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: fix missing clear bdr in check_ram_in_range_map()
Linus Torvalds [Thu, 20 Mar 2025 21:13:50 +0000 (14:13 -0700)]
Merge tag 'vfs-6.14-final.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"A final set of fixes for this cycle:
VFS:
- Ensure that the stable offset api doesn't return duplicate
directory entries when userspace has to perform the getdents call
multiple times on large directories
afs:
- Prevent invalid pointer dereference during get_link RCU pathwalk
fuse:
- Fix deadlock caused by uninitialized rings when using io_uring with
fuse
- Handle race condition when using io_uring with fuse to prevent NULL
dereference
libnetfs:
- Ensure that invalidate_cache is only called if implemented
- Fix collection of results during pause when collection is
offloaded
- Ensure rolling_buffer_load_from_ra() doesn't clear mark bits
- Make netfs_unbuffered_read() return ssize_t rather than int"
* tag 'vfs-6.14-final.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
libfs: Fix duplicate directory entry in offset_dir_lookup
fuse: fix possible deadlock if rings are never initialized
netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int
netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits
netfs: Call `invalidate_cache` only if implemented
netfs: Fix collection of results during pause when collection offloaded
fuse: fix uring race condition for null dereference of fc
afs: Fix afs_atcell_get_link() to check if ws_cell is unset first
Dhananjay Ugwekar [Thu, 20 Mar 2025 10:06:19 +0000 (10:06 +0000)]
perf/x86/rapl: Fix error handling in init_rapl_pmus()
If init_rapl_pmu() fails while allocating memory for "rapl_pmu" objects,
we miss freeing the "rapl_pmus" object in the error path. Fix that.
Fixes: 9b99d65c0bb4 ("perf/x86/rapl: Move the pmu allocation out of CPU hotplug") Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250320100617.4480-1-dhananjay.ugwekar@amd.com
Linus Torvalds [Thu, 20 Mar 2025 18:34:30 +0000 (11:34 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fix from Paolo Bonzini:
"A lone fix for a s390 regression. An earlier 6.14 commit stopped
taking the pte lock for pages that are being converted to secure, but
it was needed to avoid races.
The patch was in development for a while and is finally ready, but I
wish it was split into 3-4 commits at least"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: pv: fix race when making a page secure
io_req_msg_cleanup() relies on the fact that io_netmsg_recycle() will
always fully recycle, but that may not be the case if the msg cache
was already full. To ensure that normal cleanup always gets run,
let io_netmsg_recycle() deal with clearing the relevant cleanup flags,
as it knows exactly when that should be done.
Tomasz Rusinowicz [Tue, 18 Feb 2025 10:03:53 +0000 (11:03 +0100)]
drm/xe: Fix exporting xe buffers multiple times
The `struct ttm_resource->placement` contains TTM_PL_FLAG_* flags, but
it was incorrectly tested for XE_PL_* flags.
This caused xe_dma_buf_pin() to always fail when invoked for
the second time. Fix this by checking the `mem_type` field instead.
Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
(cherry picked from commit b96dabdba9b95f71ded50a1c094ee244408b2a8e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
- eth: ti: am65-cpsw: fix NAPI registration sequence
Previous releases - regressions:
- ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
- mptcp: fix data stream corruption in the address announcement
- bluetooth: fix connection regression between LE and non-LE adapters
- can:
- flexcan: only change CAN state when link up in system PM
- ucan: fix out of bound read in strscpy() source
Previous releases - always broken:
- lwtunnel: fix reentry loops
- ipv6: fix TCP GSO segmentation with NAT
- xfrm: force software GSO only in tunnel mode
- eth: ti: icssg-prueth: add lock to stats
Misc:
- add Andrea Mayer as a maintainer of SRv6"
* tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6
Revert "gre: Fix IPv6 link-local address generation."
Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."
net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES
tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
mptcp: Fix data stream corruption in the address announcement
selftests: net: test for lwtunnel dst ref loops
net: ipv6: ioam6: fix lwtunnel_output() loop
net: lwtunnel: fix recursion loops
net: ti: icssg-prueth: Add lock to stats
net: atm: fix use after free in lec_send()
xsk: fix an integer overflow in xp_create_and_assign_umem()
net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data
selftests: drv-net: use defer in the ping test
phy: fix xa_alloc_cyclic() error handling
dpll: fix xa_alloc_cyclic() error handling
devlink: fix xa_alloc_cyclic() error handling
ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().
ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
net: ipv6: fix TCP GSO segmentation with NAT
...
Linus Torvalds [Thu, 20 Mar 2025 16:25:25 +0000 (09:25 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Collected driver fixes from the last few weeks, I was surprised how
significant many of them seemed to be.
- Fix rdma-core test failures due to wrong startup ordering in rxe
- Don't crash in bnxt_re if the FW supports more than 64k QPs
- Fix wrong QP table indexing math in bnxt_re
- Calculate the max SRQs for userspace properly in bnxt_re
- Don't try to do math on errno for mlx5's rate calculation
- Properly allow userspace to control the VLAN in the QP state during
INIT->RTR for bnxt_re
- 6 bug fixes for HNS:
- Soft lockup when processing huge MRs, add a cond_resched()
- Fix missed error unwind for doorbell allocation
- Prevent bad send queue parameters from userspace
- Wrong error unwind in qp creation
- Missed xa_destroy during driver shutdown
- Fix reporting to userspace of max_sge_rd, hns doesn't have a
read/write difference"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hns: Fix wrong value of max_sge_rd
RDMA/hns: Fix missing xa_destroy()
RDMA/hns: Fix a missing rollback in error path of hns_roce_create_qp_common()
RDMA/hns: Fix invalid sq params not being blocked
RDMA/hns: Fix unmatched condition in error path of alloc_user_qp_db()
RDMA/hns: Fix soft lockup during bt pages loop
RDMA/bnxt_re: Avoid clearing VLAN_ID mask in modify qp path
RDMA/mlx5: Handle errors returned from mlx5r_ib_rate()
RDMA/bnxt_re: Fix reporting maximum SRQs on P7 chips
RDMA/bnxt_re: Add missing paranthesis in map_qp_id_to_tbl_indx
RDMA/bnxt_re: Fix allocation of QP table
RDMA/rxe: Fix the failure of ibv_query_device() and ibv_query_device_ex() tests
Linus Torvalds [Thu, 20 Mar 2025 16:18:38 +0000 (09:18 -0700)]
Merge tag 'efi-fixes-for-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
"Here's a final batch of EFI fixes for v6.14.
The efivarfs ones are fixes for changes that were made this cycle.
James's fix is somewhat of a band-aid, but it was blessed by the VFS
folks, who are working with James to come up with something better for
the next cycle.
- Avoid physical address 0x0 for random page allocations
- Add correct lockdep annotation when traversing efivarfs on resume
- Avoid NULL mount in kernel_file_open() when traversing efivarfs on
resume"
* tag 'efi-fixes-for-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efivarfs: fix NULL dereference on resume
efivarfs: use I_MUTEX_CHILD nested lock to traverse variables on resume
efi/libstub: Avoid physical address 0x0 when doing random allocation
David Ahern [Wed, 12 Mar 2025 09:22:12 +0000 (10:22 +0100)]
MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6
Andrea has made significant contributions to SRv6 support in Linux.
Acknowledge the work and on-going interest in Srv6 support with a
maintainers entry for these files so hopefully he is included
on patches going forward.
Following Paolo's suggestion, let's revert the IPv6 link-local address
generation fix for GRE devices. The patch introduced regressions in the
upstream CI, which are still under investigation.
Start by reverting the kselftest that depend on that fix (patch 1), then
revert the kernel code itself (patch 2).
====================
This patch broke net/forwarding/ip6gre_custom_multipath_hash.sh in some
circumstances (https://lore.kernel.org/netdev/Z9RIyKZDNoka53EO@mini-arch/).
Let's revert it while the problem is being investigated.
1) Fix tunnel mode TX datapath in packet offload mode
by directly putting it to the xmit path.
From Alexandre Cassen.
2) Force software GSO only in tunnel mode in favor
of potential HW GSO. From Cosmin Ratiu.
ipsec-2025-03-19
* tag 'ipsec-2025-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm_output: Force software GSO only in tunnel mode
xfrm: fix tunnel mode TX datapath in packet offload mode
====================
Paolo Abeni [Thu, 20 Mar 2025 14:29:59 +0000 (15:29 +0100)]
Merge tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here is batman-adv bugfix:
- Ignore own maximum aggregation size during RX, Sven Eckelmann
* tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge:
batman-adv: Ignore own maximum aggregation size during RX
====================
Lin Ma [Sat, 15 Mar 2025 16:51:13 +0000 (00:51 +0800)]
net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES
Previous commit 8b5c171bb3dc ("neigh: new unresolved queue limits")
introduces new netlink attribute NDTPA_QUEUE_LENBYTES to represent
approximative value for deprecated QUEUE_LEN. However, it forgot to add
the associated nla_policy in nl_ntbl_parm_policy array. Fix it with one
simple NLA_U32 type policy.
Arthur Mongodin [Fri, 14 Mar 2025 20:11:31 +0000 (21:11 +0100)]
mptcp: Fix data stream corruption in the address announcement
Because of the size restriction in the TCP options space, the MPTCP
ADD_ADDR option is exclusive and cannot be sent with other MPTCP ones.
For this reason, in the linked mptcp_out_options structure, group of
fields linked to different options are part of the same union.
There is a case where the mptcp_pm_add_addr_signal() function can modify
opts->addr, but not ended up sending an ADD_ADDR. Later on, back in
mptcp_established_options, other options will be sent, but with
unexpected data written in other fields due to the union, e.g. in
opts->ext_copy. This could lead to a data stream corruption in the next
packet.
Using an intermediate variable, prevents from corrupting previously
established DSS option. The assignment of the ADD_ADDR option
parameters is now done once we are sure this ADD_ADDR option can be set
in the packet, e.g. after having dropped other suboptions.
Fixes: 1bff1e43a30e ("mptcp: optimize out option generation") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Arthur Mongodin <amongodin@randorisec.fr> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
[ Matt: the commit message has been updated: long lines splits and some
clarifications. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-1-122dbb249db3@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yang Yingliang [Thu, 3 Nov 2022 12:11:46 +0000 (20:11 +0800)]
i2c: amd-mp2: drop free_irq() of devm_request_irq() allocated irq
irq allocated with devm_request_irq() will be freed in devm_irq_release(),
using free_irq() in ->remove() will causes a dangling pointer, and a
subsequent double free. So remove the free_irq() in the error path and
remove path.
Fixes: 969864efae78 ("i2c: amd-mp2: use msix/msi if the hardware supports") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20221103121146.99836-1-yangyingliang@huawei.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Yongjian Sun [Thu, 20 Mar 2025 03:44:17 +0000 (11:44 +0800)]
libfs: Fix duplicate directory entry in offset_dir_lookup
There is an issue in the kernel:
In tmpfs, when using the "ls" command to list the contents
of a directory with a large number of files, glibc performs
the getdents call in multiple rounds. If a concurrent unlink
occurs between these getdents calls, it may lead to duplicate
directory entries in the ls output. One possible reproduction
scenario is as follows:
Create 1026 files and execute ls and rm concurrently:
for i in {1..1026}; do
echo "This is file $i" > /tmp/dir/file$i
done
ls /tmp/dir rm /tmp/dir/file4
->getdents(file1026-file5)
->unlink(file4)
->getdents(file5,file3,file2,file1)
It is expected that the second getdents call to return file3
through file1, but instead it returns an extra file5.
The root cause of this problem is in the offset_dir_lookup
function. It uses mas_find to determine the starting position
for the current getdents call. Since mas_find locates the first
position that is greater than or equal to mas->index, when file4
is deleted, it ends up returning file5.
It can be fixed by replacing mas_find with mas_find_rev, which
finds the first position that is less than or equal to mas->index.
Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories") Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Link: https://lore.kernel.org/r/20250320034417.555810-1-sunyongjian@huaweicloud.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
When the destination is the same after the transformation, we enter a
lwtunnel loop. This is true for most of lwt users: ioam6, rpl, seg6,
seg6_local, ila_lwt, and lwt_bpf. It can happen in their input() and
output() handlers respectively, where either dst_input() or dst_output()
is called at the end. It can also happen in xmit() handlers.
... until rpl_do_srh() fails, which means skb_cow_head() failed.
This series provides a fix at the core level of lwtunnel to catch such
loops when they're not caught by the respective lwtunnel users, and
handle the loop case in ioam6 which is one of the users. This series
also comes with a new selftest to detect some dst cache reference loops
in lwtunnel users.
====================
Justin Iurman [Fri, 14 Mar 2025 12:00:48 +0000 (13:00 +0100)]
selftests: net: test for lwtunnel dst ref loops
As recently specified by commit 0ea09cbf8350 ("docs: netdev: add a note
on selftest posting") in net-next, the selftest is therefore shipped in
this series. However, this selftest does not really test this series. It
needs this series to avoid crashing the kernel. What it really tests,
thanks to kmemleak, is what was fixed by the following commits:
- commit c71a192976de ("net: ipv6: fix dst refleaks in rpl, seg6 and
ioam6 lwtunnels")
- commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and
ioam6 lwtunnels")
- commit c64a0727f9b1 ("net: ipv6: fix dst ref loop on input in seg6
lwt")
- commit 13e55fbaec17 ("net: ipv6: fix dst ref loop on input in rpl
lwt")
- commit 0e7633d7b95b ("net: ipv6: fix dst ref loop in ila lwtunnel")
- commit 5da15a9c11c1 ("net: ipv6: fix missing dst ref drop in ila
lwtunnel")
Justin Iurman [Fri, 14 Mar 2025 12:00:47 +0000 (13:00 +0100)]
net: ipv6: ioam6: fix lwtunnel_output() loop
Fix the lwtunnel_output() reentry loop in ioam6_iptunnel when the
destination is the same after transformation. Note that a check on the
destination address was already performed, but it was not enough. This
is the example of a lwtunnel user taking care of loops without relying
only on the last resort detection offered by lwtunnel.
Justin Iurman [Fri, 14 Mar 2025 12:00:46 +0000 (13:00 +0100)]
net: lwtunnel: fix recursion loops
This patch acts as a parachute, catch all solution, by detecting
recursion loops in lwtunnel users and taking care of them (e.g., a loop
between routes, a loop within the same route, etc). In general, such
loops are the consequence of pathological configurations. Each lwtunnel
user is still free to catch such loops early and do whatever they want
with them. It will be the case in a separate patch for, e.g., seg6 and
seg6_local, in order to provide drop reasons and update statistics.
Another example of a lwtunnel user taking care of loops is ioam6, which
has valid use cases that include loops (e.g., inline mode), and which is
addressed by the next patch in this series. Overall, this patch acts as
a last resort to catch loops and drop packets, since we don't want to
leak something unintentionally because of a pathological configuration
in lwtunnels.
The solution in this patch reuses dev_xmit_recursion(),
dev_xmit_recursion_inc(), and dev_xmit_recursion_dec(), which seems fine
considering the context.
Closes: https://lore.kernel.org/netdev/2bc9e2079e864a9290561894d2a602d6@akamai.com/ Closes: https://lore.kernel.org/netdev/Z7NKYMY7fJT5cYWu@shredder/ Fixes: ffce41962ef6 ("lwtunnel: support dst output redirect function") Fixes: 2536862311d2 ("lwt: Add support to redirect dst.input") Fixes: 14972cbd34ff ("net: lwtunnel: Handle fragmentation") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-2-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently the API emac_update_hardware_stats() reads different ICSSG
stats without any lock protection.
This API gets called by .ndo_get_stats64() which is only under RCU
protection and nothing else. Add lock to this API so that the reading of
statistics happens during lock.
Gavrilov Ilia [Thu, 13 Mar 2025 08:50:08 +0000 (08:50 +0000)]
xsk: fix an integer overflow in xp_create_and_assign_umem()
Since the i and pool->chunk_size variables are of type 'u32',
their product can wrap around and then be cast to 'u64'.
This can lead to two different XDP buffers pointing to the same
memory area.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.
Paolo Abeni [Wed, 19 Mar 2025 18:44:05 +0000 (19:44 +0100)]
Merge tag 'for-net-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_event: Fix connection regression between LE and non-LE adapters
- Fix error code in chan_alloc_skb_cb()
* tag 'for-net-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_event: Fix connection regression between LE and non-LE adapters
Bluetooth: Fix error code in chan_alloc_skb_cb()
====================
Linus Torvalds [Wed, 19 Mar 2025 18:12:18 +0000 (11:12 -0700)]
Merge tag 'hwmon-fixes-for-v6.14-rc8/6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix an entry in MAINTAINERS to avoid sending hwmon review requests to
the i2c mailing list
- Fix an out-of-bounds access in nct6775 driver
* tag 'hwmon-fixes-for-v6.14-rc8/6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6775-core) Fix out of bounds access for NCT679{8,9}
MAINTAINERS: correct list and scope of LTC4286 HARDWARE MONITOR
Russell King (Oracle) [Wed, 12 Mar 2025 19:43:09 +0000 (19:43 +0000)]
net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data
Everywhere else in the driver uses devm_kzalloc() when allocating the
AXI data, so there is no kfree() of this structure. However,
dwc-qos-eth uses kzalloc(), which leads to this memory being leaked.
Switch to use devm_kzalloc().
Fixes: d8256121a91a ("stmmac: adding new glue driver dwmac-dwc-qos-eth") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tsRyv-0064nU-O9@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jason Gunthorpe [Tue, 4 Feb 2025 19:18:19 +0000 (15:18 -0400)]
gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU
Previously with tegra-smmu, even with CONFIG_IOMMU_DMA, the default domain
could have been left as NULL. The NULL domain is specially recognized by
host1x_iommu_attach() as meaning it is not the DMA domain and
should be replaced with the special shared domain.
This happened prior to the below commit because tegra-smmu was using the
NULL domain to mean IDENTITY.
Now that the domain is properly labled the test in DRM doesn't see NULL.
Check for IDENTITY as well to enable the special domains.
This is the same issue and basic fix as seen in
commit fae6e669cdc5 ("drm/tegra: Do not assume that a NULL domain means no
DMA IOMMU").
Fixes: c8cc2655cc6c ("iommu/tegra-smmu: Implement an IDENTITY domain") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/c6a6f114-3acd-4d56-a13b-b88978e927dc@tecnico.ulisboa.pt/ Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-10dcc8ce3869+3a7-host1x_identity_jgg@nvidia.com
Linus Torvalds [Wed, 19 Mar 2025 14:31:43 +0000 (07:31 -0700)]
Merge tag 'ata-6.14-final' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
- Fix a regression on ATI AHCI controllers, where certain Samsung
drives fails to be detected on a warm boot when LPM is enabled.
LPM on ATI AHCI works fine with other drives. Likewise, the
Samsung drives works fine with LPM with other AHI controllers.
Thus, just like the weirdo ATA_QUIRK_NO_NCQ_ON_ATI quirk, add a
new ATA_QUIRK_NO_LPM_ON_ATI quirk to disable LPM only on ATI
AHCI controllers.
* tag 'ata-6.14-final' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: Add ATA_QUIRK_NO_LPM_ON_ATI for certain Samsung SSDs
Paolo Bonzini [Wed, 19 Mar 2025 13:01:53 +0000 (09:01 -0400)]
Merge tag 'kvm-s390-master-6.14-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
Holding the pte lock for the page that is being converted to secure is
needed to avoid races. A previous commit removed the locking, which
caused issues. Fix by locking the pte again.
Luis Henriques [Thu, 6 Mar 2025 11:12:18 +0000 (11:12 +0000)]
fuse: fix possible deadlock if rings are never initialized
When mounting a user-space filesystem using io_uring, the initialization
of the rings is done separately in the server side. If for some reason
(e.g. a server bug) this step is not performed it will be impossible to
unmount the filesystem if there are already requests waiting.
This issue is easily reproduced with the libfuse passthrough_ll example,
if the queue depth is set to '0' and a request is queued before trying to
unmount the filesystem. When trying to force the unmount, fuse_abort_conn()
will try to wake up all tasks waiting in fc->blocked_waitq, but because the
rings were never initialized, fuse_uring_ready() will never return 'true'.
Fixes: 3393ff964e0f ("fuse: block request allocation until io-uring init is complete") Signed-off-by: Luis Henriques <luis@igalia.com> Link: https://lore.kernel.org/r/20250306111218.13734-1-luis@igalia.com Acked-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Bernd Schubert <bschubert@ddn.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
Miaoqian Lin [Wed, 19 Mar 2025 03:23:04 +0000 (11:23 +0800)]
spi: Fix reference count leak in slave_show()
Fix a reference count leak in slave_show() by properly putting the device
reference obtained from device_find_any_child().
Fixes: 6c364062bfed ("spi: core: Add support for registering SPI slave controllers") Fixes: c21b0837983d ("spi: Use device_find_any_child() instead of custom approach") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250319032305.70340-1-linmq006@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error
from xa_alloc_cyclic() in scheduler code [1]. The same is done in few
other places.
v1 --> v2: [2]
* add fixes tags
* fix also the same usage in dpll and phy
Michal Swiatkowski [Wed, 12 Mar 2025 09:52:51 +0000 (10:52 +0100)]
phy: fix xa_alloc_cyclic() error handling
xa_alloc_cyclic() can return 1, which isn't an error. To prevent
situation when the caller of this function will treat it as no error do
a check only for negative here.
Fixes: 384968786909 ("net: phy: Introduce ethernet link topology representation") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Swiatkowski [Wed, 12 Mar 2025 09:52:50 +0000 (10:52 +0100)]
dpll: fix xa_alloc_cyclic() error handling
In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will
be returned, which will cause IS_ERR() to be false. Which can lead to
dereference not allocated pointer (pin).
Fix it by checking if err is lower than zero.
This wasn't found in real usecase, only noticed. Credit to Pierre.
Fixes: 97f265ef7f5b ("dpll: allocate pin ids in cycle") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Swiatkowski [Wed, 12 Mar 2025 09:52:49 +0000 (10:52 +0100)]
devlink: fix xa_alloc_cyclic() error handling
In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will
be returned, which will cause IS_ERR() to be false. Which can lead to
dereference not allocated pointer (rel).
Fix it by checking if err is lower than zero.
This wasn't found in real usecase, only noticed. Credit to Pierre.
Fixes: c137743bce02 ("devlink: introduce object and nested devlink relationship infra") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Brauner [Wed, 19 Mar 2025 09:04:29 +0000 (10:04 +0100)]
Merge patch series "netfs: Miscellaneous fixes"
David Howells <dhowells@redhat.com> says:
Here are some miscellaneous fixes and changes for netfslib:
(1) Fix the collection of results during a pause in transmission.
(2) Call ->invalidate_cache() only if provided.
(3) Fix the rolling buffer to not hammer atomic bit clears when loading
from readahead.
(4) Fix netfs_unbuffered_read() to return ssize_t.
* patches from https://lore.kernel.org/r/20250314164201.1993231-1-dhowells@redhat.com:
netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int
netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits
netfs: Call `invalidate_cache` only if implemented
netfs: Fix collection of results during pause when collection offloaded
David Howells [Fri, 14 Mar 2025 16:41:58 +0000 (16:41 +0000)]
netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits
rolling_buffer_load_from_ra() looms large in the perf report because it
loops around doing an atomic clear for each of the three mark bits per
folio. However, this is both inefficient (it would be better to build a
mask and atomically AND them out) and unnecessary as they shouldn't be set.
Fix this by removing the loop.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20250314164201.1993231-4-dhowells@redhat.com Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
Max Kellermann [Fri, 14 Mar 2025 16:41:57 +0000 (16:41 +0000)]
netfs: Call `invalidate_cache` only if implemented
Many filesystems such as NFS and Ceph do not implement the
`invalidate_cache` method. On those filesystems, if writing to the
cache (`NETFS_WRITE_TO_CACHE`) fails for some reason, the kernel
crashes like this:
David Howells [Fri, 14 Mar 2025 16:41:56 +0000 (16:41 +0000)]
netfs: Fix collection of results during pause when collection offloaded
A netfs read request can run in one of two modes: for synchronous reads
writes, the app thread does the collection of results and for asynchronous
reads, this is offloaded to a worker thread. This is controlled by the
NETFS_RREQ_OFFLOAD_COLLECTION flag.
Now, if a subrequest incurs an error, the NETFS_RREQ_PAUSE flag is set to
stop the issuing loop temporarily from issuing more subrequests until a
retry is successful or the request is abandoned.
When the issuing loop sees NETFS_RREQ_PAUSE, it jumps to
netfs_wait_for_pause() which will wait for the PAUSE flag to be cleared -
and whilst it is waiting, it will call out to the collector as more results
acrue... But this is the wrong thing to do if OFFLOAD_COLLECTION is set as
we can then end up with both the app thread and the work item collecting
results simultaneously.
This manifests itself occasionally when running the generic/323 xfstest
against multichannel cifs as an oops that's a bit random but frequently
involving io_submit() (the test does lots of simultaneous async DIO reads).
Fix this by only doing the collection in netfs_wait_for_pause() if the
NETFS_RREQ_OFFLOAD_COLLECTION is not set.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Reported-by: Steve French <stfrench@microsoft.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20250314164201.1993231-2-dhowells@redhat.com Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
Joanne Koong [Tue, 18 Mar 2025 00:30:28 +0000 (17:30 -0700)]
fuse: fix uring race condition for null dereference of fc
There is a race condition leading to a kernel crash from a null
dereference when attemping to access fc->lock in
fuse_uring_create_queue(). fc may be NULL in the case where another
thread is creating the uring in fuse_uring_create() and has set
fc->ring but has not yet set ring->fc when fuse_uring_create_queue()
reads ring->fc. There is another race condition as well where in
fuse_uring_register(), ring->nr_queues may still be 0 and not yet set
to the new value when we compare qid against it.
This fix sets fc->ring only after ring->fc and ring->nr_queues have been
set, which guarantees now that ring->fc is a proper pointer when any
queues are created and ring->nr_queues reflects the right number of
queues if ring is not NULL. We must use smp_store_release() and
smp_load_acquire() semantics to ensure the ordering will remain correct
where fc->ring is assigned only after ring->fc and ring->nr_queues have
been assigned.
Tomasz Pakuła [Tue, 11 Mar 2025 21:38:33 +0000 (22:38 +0100)]
drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2
Currently, it seems like the code was carried over from RDNA3 because
it assumes two possible values to set. RDNA4, instead of having:
0: min SCLK
1: max SCLK
only has
0: SCLK offset
This change makes it so it only reports current offset value instead of
showing possible min/max values and their indices. Moreover, it now only
accepts the offset as a value, without the indice index.
Additionally, the lower bound was printed as %u by mistake.
Setting this offset:
Old: "s 1 <offset>"
New: "s <offset>"
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4036 Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1cfeb60e6e8837b1de5eb4e17df7cf31f4442144) Cc: stable@vger.kernel.org # 6.12.x
Lo-an Chen [Mon, 10 Mar 2025 06:52:22 +0000 (14:52 +0800)]
drm/amd/display: Fix incorrect fw_state address in dmub_srv
[WHY]
The fw_state in dmub_srv was assigned with wrong address.
The address was pointed to the firmware region.
[HOW]
Fix the firmware state by using DMUB_DEBUG_FW_STATE_OFFSET
in dmub_cmd.h.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Lo-an Chen <lo-an.chen@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f57b38ac85a01bf03020cc0a9761d63e5c0ce197)
Mario Limonciello [Fri, 7 Mar 2025 21:55:20 +0000 (15:55 -0600)]
drm/amd/display: Use HW lock mgr for PSR1 when only one eDP
[WHY]
DMUB locking is important to make sure that registers aren't accessed
while in PSR. Previously it was enabled but caused a deadlock in
situations with multiple eDP panels.
[HOW]
Detect if multiple eDP panels are in use to decide whether to use
lock. Refactor the function so that the first check is for PSR-SU
and then replay is in use to prevent having to look up number
of eDP panels for those configurations.
Fixes: f245b400a223 ("Revert "drm/amd/display: Use HW lock mgr for PSR1"") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3965 Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ed569e1279a3045d6b974226c814e071fa0193a6) Cc: stable@vger.kernel.org
Philip Yang [Wed, 29 Jan 2025 17:37:30 +0000 (12:37 -0500)]
drm/amdkfd: Fix user queue validation on Gfx7/8
To workaround queue full h/w issue on Gfx7/8, when application create
AQL queue, the ring buffer bo allocate size is queue_size/2 and
map queue_size ring buffer to GPU in 2 pieces using 2 attachments, each
attachment map size is queue_size/2, with same ring_bo backing memory.
For Gfx7/8, user queue buffer validation should use queue_size/2 to
verify ring_bo allocation and mapping size.
Fixes: 68e599db7a54 ("drm/amdkfd: Validate user queue buffers") Suggested-by: Tomáš Trnka <trnka@scm.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e7a477735f1771b9a9346a5fbd09d7ff0641723a) Cc: stable@vger.kernel.org
Wentao Liang [Wed, 12 Mar 2025 06:31:06 +0000 (14:31 +0800)]
drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini()
In gfx_v12_0_cp_gfx_load_me_microcode_rs64(), gfx_v12_0_pfp_fini() is
incorrectly used to free 'me' field of 'gfx', since gfx_v12_0_pfp_fini()
can only release 'pfp' field of 'gfx'. The release function of 'me' field
should be gfx_v12_0_me_fini().
Fixes: 52cb80c12e8a ("drm/amdgpu: Add gfx v12_0 ip block support (v6)") Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ebdc52607a46cda08972888178c6aa9cd6965141) Cc: stable@vger.kernel.org # 6.12.x
Jay Cornwall [Fri, 7 Feb 2025 21:40:34 +0000 (16:40 -0500)]
drm/amdkfd: Fix instruction hazard in gfx12 trap handler
VALU instructions with SGPR source need wait states to avoid hazard
with SALU using different SGPR.
v2: Eliminate some hazards to reduce code explosion
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Lancelot Six <lancelot.six@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7e0459d453b911435673edd7a86eadc600c63238) Cc: stable@vger.kernel.org # 6.12.x
Nikita Zhandarovich [Tue, 11 Mar 2025 11:14:59 +0000 (14:14 +0300)]
drm/radeon: fix uninitialized size issue in radeon_vce_cs_parse()
On the off chance that command stream passed from userspace via
ioctl() call to radeon_vce_cs_parse() is weirdly crafted and
first command to execute is to encode (case 0x03000001), the function
in question will attempt to call radeon_vce_cs_reloc() with size
argument that has not been properly initialized. Specifically, 'size'
will point to 'tmp' variable before the latter had a chance to be
assigned any value.
Play it safe and init 'tmp' with 0, thus ensuring that
radeon_vce_cs_reloc() will catch an early error in cases like these.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: 2fc5703abda2 ("drm/radeon: check VCE relocation buffer range v3") Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2d52de55f9ee7aaee0e09ac443f77855989c6b68) Cc: stable@vger.kernel.org
Kuniyuki Iwashima [Wed, 12 Mar 2025 01:38:48 +0000 (18:38 -0700)]
ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().
While creating a new IPv6, we could get a weird -ENOMEM when
RTA_NH_ID is set and either of the conditions below is true:
1) CONFIG_IPV6_SUBTREES is enabled and rtm_src_len is specified
2) nexthop_get() fails
e.g.)
# strace ip -6 route add fe80::dead:beef:dead:beef nhid 1 from ::
recvmsg(3, {msg_iov=[{iov_base=[...[
{error=-ENOMEM, msg=[... [...]]},
[{nla_len=49, nla_type=NLMSGERR_ATTR_MSG}, "Nexthops can not be used with so"...]
]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 148
Let's set err explicitly after ip_fib_metrics_init() in
ip6_route_info_create().
Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250312013854.61125-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Wed, 12 Mar 2025 01:03:25 +0000 (18:03 -0700)]
ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
fib_check_nh_v6_gw() expects that fib6_nh_init() cleans up everything
when it fails.
Commit 7dd73168e273 ("ipv6: Always allocate pcpu memory in a fib6_nh")
moved fib_nh_common_init() before alloc_percpu_gfp() within fib6_nh_init()
but forgot to add cleanup for fib6_nh->nh_common.nhc_pcpu_rth_output in
case it fails to allocate fib6_nh->rt6i_pcpu, resulting in memleak.
Let's call fib_nh_common_release() and clear nhc_pcpu_rth_output in the
error path.
Note that we can remove the fib6_nh_release() call in nh_create_ipv6()
later in net-next.git.
* tag 'linux-can-fixes-for-6.14-20250314' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: flexcan: disable transceiver during system PM
can: flexcan: only change CAN state when link up in system PM
can: rcar_canfd: Fix page entries in the AFL list
dt-bindings: can: renesas,rcar-canfd: Fix typo in pattern properties for R-Car V4M
can: statistics: use atomic access in hot path
can: ucan: fix out of bound read in strscpy() source
====================
Haiyang Zhang [Tue, 11 Mar 2025 20:12:54 +0000 (13:12 -0700)]
net: mana: Support holes in device list reply msg
According to GDMA protocol, holes (zeros) are allowed at the beginning
or middle of the gdma_list_devices_resp message. The existing code
cannot properly handle this, and may miss some devices in the list.
To fix, scan the entire list until the num_of_devs are found, or until
the end of the list.
Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: Shradha Gupta <shradhagupta@microsoft.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741723974-1534-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vignesh Raghavendra [Tue, 11 Mar 2025 15:42:59 +0000 (21:12 +0530)]
net: ethernet: ti: am65-cpsw: Fix NAPI registration sequence
Registering the interrupts for TX or RX DMA Channels prior to registering
their respective NAPI callbacks can result in a NULL pointer dereference.
This is seen in practice as a random occurrence since it depends on the
randomness associated with the generation of traffic by Linux and the
reception of traffic from the wire.
Fixes: 681eb2beb3ef ("net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path") Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Co-developed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://patch.msgid.link/20250311154259.102865-1-s-vadapalli@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Niklas Cassel [Mon, 17 Mar 2025 17:03:49 +0000 (18:03 +0100)]
ata: libata-core: Add ATA_QUIRK_NO_LPM_ON_ATI for certain Samsung SSDs
Before commit 7627a0edef54 ("ata: ahci: Drop low power policy board type")
the ATI AHCI controllers specified board type 'board_ahci' rather than
board type 'board_ahci'. This means that LPM was historically not enabled
for the ATI AHCI controllers.
By looking at commit 7a8526a5cd51 ("libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI
for Samsung 860 and 870 SSD."), it is clear that, for some unknown reason,
that Samsung SSDs do not play nice with ATI AHCI controllers. (When using
other AHCI controllers, NCQ can be enabled on these Samsung SSDs without
issues.)
In a similar way, from user reports, it is clear the ATI AHCI controllers
can enable LPM on e.g. Maxtor HDDs perfectly fine, but when enabling LPM
on certain Samsung SSDs, things break. (E.g. the SSDs will not get detected
by the ATI AHCI controller even after a COMRESET.)
Yet, when using LPM on these Samsung SSDs with other AHCI controllers, e.g.
Intel AHCI controllers, these Samsung drives appear to work perfectly fine.
Considering that the combination of ATI + Samsung, for some unknown reason,
does not seem to work well, disable LPM when detecting an ATI AHCI
controller with a problematic Samsung SSD.
Apply this new ATA_QUIRK_NO_LPM_ON_ATI quirk for all Samsung SSDs that have
already been reported to not play nice with ATI (ATA_QUIRK_NO_NCQ_ON_ATI).
Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type") Suggested-by: Hans de Goede <hdegoede@redhat.com> Reported-by: Eric <eric.4.debian@grabatoulnz.fr> Closes: https://lore.kernel.org/linux-ide/Z8SBZMBjvVXA7OAK@eldamar.lan/ Tested-by: Eric <eric.4.debian@grabatoulnz.fr> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250317170348.1748671-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
James Bottomley [Tue, 18 Mar 2025 03:06:01 +0000 (23:06 -0400)]
efivarfs: fix NULL dereference on resume
LSMs often inspect the path.mnt of files in the security hooks, and this
causes a NULL deref in efivarfs_pm_notify() because the path is
constructed with a NULL path.mnt.
Fix by obtaining from vfs_kern_mount() instead, and being very careful
to ensure that deactivate_super() (potentially triggered by a racing
userspace umount) is not called directly from the notifier, because it
would deadlock when efivarfs_kill_sb() tried to unregister the notifier
chain.
[ Al notes:
Umm... That's probably safe, but not as a long-term solution -
it's too intimately dependent upon fs/super.c internals. The
reasons why you can't run into ->s_umount deadlock here are
non-trivial... ]
Linus Torvalds [Tue, 18 Mar 2025 05:27:27 +0000 (22:27 -0700)]
Merge tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"15 hotfixes. 7 are cc:stable and the remainder address post-6.13
issues or aren't considered necessary for -stable kernels.
13 are for MM and the other two are for squashfs and procfs.
All are singletons. Please see the individual changelogs for details"
* tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/page_alloc: fix memory accept before watermarks gets initialized
mm: decline to manipulate the refcount on a slab page
memcg: drain obj stock on cpu hotplug teardown
mm/huge_memory: drop beyond-EOF folios with the right number of refs
selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation
mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT
mm: memcontrol: fix swap counter leak from offline cgroup
mm/vma: do not register private-anon mappings with khugepaged during mmap
squashfs: fix invalid pointer dereference in squashfs_cache_delete
mm/migrate: fix shmem xarray update during migration
mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
mm/damon/core: initialize damos->walk_completed in damon_new_scheme()
mm/damon: respect core layer filters' allowance decision on ops layer
filemap: move prefaulting out of hot write path
proc: fix UAF in proc_get_inode()
Namhyung Kim [Mon, 17 Mar 2025 16:37:55 +0000 (09:37 -0700)]
perf/x86: Check data address for IBS software filter
The IBS software filter is filtering kernel samples for regular users in
the PMI handler. It checks the instruction address in the IBS register to
determine if it was in kernel mode or not.
But it turns out that it's possible to report a kernel data address even
if the instruction address belongs to user-space. Matteo Rizzo
found that when an instruction raises an exception, IBS can report some
kernel data addresses like IDT while holding the faulting instruction's
RIP. To prevent an information leak, it should double check if the data
address in PERF_SAMPLE_DATA is in the kernel space as well.
Paulo Alcantara [Mon, 17 Mar 2025 19:39:22 +0000 (16:39 -0300)]
smb: client: don't retry IO on failed negprotos with soft mounts
If @server->tcpStatus is set to CifsNeedReconnect after acquiring
@ses->session_mutex in smb2_reconnect() or cifs_reconnect_tcon(), it
means that a concurrent thread failed to negotiate, in which case the
server is no longer responding to any SMB requests, so there is no
point making the caller retry the IO by returning -EAGAIN.
Fix this by returning -EHOSTDOWN to the callers on soft mounts.
Cc: David Howells <dhowells@redhat.com> Reported-by: Jay Shin <jaeshin@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Mon, 17 Mar 2025 21:40:40 +0000 (14:40 -0700)]
Merge tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The majority of these last fixes are for devicetree files.
These address two important regressions for the Qualcomm SMMU and the
Raspberry Pi 4 USB controller, as well as a larger number of patches
fixing minor mistakes in board specific files for Rockchips, i.MX,
starfive and broadcom.
The non-DT changes are
- A fix for an old boot regression on Renesas shmobile chips
- Another boot time regression for for the Qualcomm PDR SoC driver,
among a few other Qualcomm firmware driver fixes for efivars and
tzmem
- Minor Kconfig fixes for davinci and OMAP1
- Minor code fixes for sparx5 reset controllers, OMAP memory
controller, i.MX SCU, cpufreq and SoC drivers and a Hisilicon SoC
driver
- One more update to the Asahi maintainers, adding Neal Gompa as a
reviewer"
* tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
ARM: davinci: da850: fix selecting ARCH_DAVINCI_DA8XX
soc: hisilicon: kunpeng_hccs: Fix incorrect string assembly
memory: omap-gpmc: drop no compatible check
reset: mchp: sparx5: Fix for lan966x
ARM: shmobile: smp: Enforce shmobile_smp_* alignment
MAINTAINERS: Add myself (Neal Gompa) as a reviewer for ARM Apple support
MAINTAINERS: Add apple-spi driver & binding files
arm64: dts: rockchip: slow down emmc freq for rock 5 itx
ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC3200
ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC5300
ARM: dts: bcm2711: Don't mark timer regs unconfigured
ARM: OMAP1: select CONFIG_GENERIC_IRQ_CHIP
arm64: dts: rockchip: Add missing PCIe supplies to RockPro64 board dtsi
arm64: dts: rockchip: Add avdd HDMI supplies to RockPro64 board dtsi
arm64: dts: rockchip: Remove undocumented sdmmc property from lubancat-1
arm64: dts: rockchip: fix pinmux of UART5 for PX30 Ringneck on Haikou
arm64: dts: rockchip: fix pinmux of UART0 for PX30 Ringneck on Haikou
arm64: dts: rockchip: fix u2phy1_host status for NanoPi R4S
arm64: dts: bcm2712: PL011 UARTs are actually r1p5
ARM: dts: bcm2711: PL011 UARTs are actually r1p5
...