Linus Torvalds [Fri, 6 Dec 2024 21:47:55 +0000 (13:47 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Nothing major, some left-overs from the recent merging window (MTE,
coco) and some newly found issues like the ptrace() ones.
- MTE/hugetlbfs:
- Set VM_MTE_ALLOWED in the arch code and remove it from the core
code for hugetlbfs mappings
- Fix copy_highpage() warning when the source is a huge page but
not MTE tagged, taking the wrong small page path
- drivers/virt/coco:
- Add the pKVM and Arm CCA drivers under the arm64 maintainership
- Fix the pkvm driver to fall back to ioremap() (and warn) if the
MMIO_GUARD hypercall fails
- Keep the Arm CCA driver default 'n' rather than 'm'
- A series of fixes for the arm64 ptrace() implementation,
potentially leading to the kernel consuming uninitialised stack
variables when PTRACE_SETREGSET is invoked with a length of 0
- Fix zone_dma_limit calculation when RAM starts below 4GB and
ZONE_DMA is capped to this limit
- Fix early boot warning with CONFIG_DEBUG_VIRTUAL=y triggered by a
call to page_to_phys() (from patch_map()) which checks pfn_valid()
before vmemmap has been set up
- Do not clobber bits 15:8 of the ASID used for TTBR1_EL1 and TLBI
ops when the kernel assumes 8-bit ASIDs but running under a
hypervisor on a system that implements 16-bit ASIDs (found running
Linux under Parallels on Apple M4)
- ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A as it
is using the same SMMU PMCG as HIP09 and suffers from the same
errata
- Add GCS to cpucap_is_possible(), missed in the recent merge"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: ptrace: fix partial SETREGSET for NT_ARM_GCS
arm64: ptrace: fix partial SETREGSET for NT_ARM_POE
arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMR
arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL
arm64: cpufeature: Add GCS to cpucap_is_possible()
coco: virt: arm64: Do not enable cca guest driver by default
arm64: mte: Fix copy_highpage() warning on hugetlb folios
arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs
ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A
MAINTAINERS: Add CCA and pKVM CoCO guest support to the ARM64 entry
drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD fails
arm64: patching: avoid early page_to_phys()
arm64: mm: Fix zone_dma_limit calculation
arm64: mte: set VM_MTE_ALLOWED for hugetlbfs at correct place
Linus Torvalds [Fri, 6 Dec 2024 21:42:03 +0000 (13:42 -0800)]
Merge tag 'fixes-2024-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fixes from Mike Rapoport:
"Restore check for node validity in arch_numa.
The rework of NUMA initialization in arch_numa dropped a check that
refused to accept configurations with invalid node IDs.
Restore that check to ensure that when firmware passes invalid nodes,
such configuration is rejected and kernel gracefully falls back to
dummy NUMA"
* tag 'fixes-2024-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
arch_numa: Restore nid checks before registering a memblock with a node
memblock: allow zero threshold in validate_numa_converage()
* tag 'drm-fixes-2024-12-06' of https://gitlab.freedesktop.org/drm/kernel:
drm/amdgpu: rework resume handling for display (v2)
drm/amd/pm: fix and simplify workload handling
Revert "drm/amd/pm: correct the workload setting"
drm/amdgpu: fix sriov reinit late orders
drm/amdgpu: Fix ISP hw init issue
drm/amd/display: Add hblank borrowing support
drm/amd/display: Limit VTotal range to max hw cap minus fp
drm/amd/display: Correct prefetch calculation
drm/amd/display: Add option to retrieve detile buffer size
drm/amd/display: Add a left edge pixel if in YCbCr422 or YCbCr420 and odm
drm/amdkfd: hard-code cacheline for gc943,gc944
drm/amdkfd: add MEC version that supports no PCIe atomics for GFX12
drm/amd/display: Fix programming backlight on OLED panels
drm/amd: Sanity check the ACPI EDID
drm/amdgpu/hdp7.0: do a posting read when flushing HDP
drm/amdgpu/hdp6.0: do a posting read when flushing HDP
drm/amdgpu/hdp5.2: do a posting read when flushing HDP
drm/amdgpu/hdp5.0: do a posting read when flushing HDP
drm/amdgpu/hdp4.0: do a posting read when flushing HDP
drm/amdgpu/jpeg1.0: fix idle work handler
Linus Torvalds [Fri, 6 Dec 2024 19:52:15 +0000 (11:52 -0800)]
Merge tag 'drm-fixes-2024-12-07' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Pretty quiet week which is probably expected after US holidays, the
dma-fence and displayport MST message handling fixes make up the bulk
of this, along with a couple of minor xe and other driver fixes.
dma-fence:
- Fix reference leak on fence-merge failure path
- Simplify fence merging with kernel's sort()
- Fix dma_fence_array_signaled() to ensure forward progress
dp_mst:
- Fix MST sideband message body length check
- Fix a bunch of locking/state handling with DP MST msgs
sti:
- Add __iomem for mixer_dbg_mxn()'s parameter
xe:
- Missing init value and 64-bit write-order check
- Fix a memory allocation issue causing lockdep violation
v3d:
- Performance counter fix"
* tag 'drm-fixes-2024-12-07' of https://gitlab.freedesktop.org/drm/kernel:
drm/v3d: Enable Performance Counters before clearing them
drm/dp_mst: Use reset_msg_rx_state() instead of open coding it
drm/dp_mst: Reset message rx state after OOM in drm_dp_mst_handle_up_req()
drm/dp_mst: Ensure mst_primary pointer is valid in drm_dp_mst_handle_up_req()
drm/dp_mst: Fix down request message timeout handling
drm/dp_mst: Simplify error path in drm_dp_mst_handle_down_rep()
drm/dp_mst: Verify request type in the corresponding down message reply
drm/dp_mst: Fix resetting msg rx state after topology removal
drm/xe: Move the coredump registration to the worker thread
drm/xe/guc: Fix missing init value and add register order check
drm/sti: Add __iomem for mixer_dbg_mxn's parameter
drm/dp_mst: Fix MST sideband message body length check
dma-buf: fix dma_fence_array_signaled v4
dma-fence: Use kernel's sort for merging fences
dma-fence: Fix reference leak on fence merge failure path
Linus Torvalds [Fri, 6 Dec 2024 19:46:39 +0000 (11:46 -0800)]
Merge tag 'sound-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes that have been gathered in the week.
- Fix the missing XRUN handling in USB-audio low latency mode
- Fix regression by the previous USB-audio hadening change
- Clean up old SH sound driver to use the standard helpers
- A few further fixes for MIDI 2.0 UMP handling
- Various HD-audio and USB-audio quirks
- Fix jack handling at PM on ASoC Intel AVS
- Misc small fixes for ASoC SOF and Mediatek"
* tag 'sound-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Fix spelling mistake "Firelfy" -> "Firefly"
ASoC: mediatek: mt8188-mt6359: Remove hardcoded dmic codec
ALSA: hda/realtek: fix micmute LEDs don't work on HP Laptops
ALSA: usb-audio: Add extra PID for RME Digiface USB
ALSA: usb-audio: Fix a DMA to stack memory bug
ASoC: SOF: ipc3-topology: fix resource leaks in sof_ipc3_widget_setup_comp_dai()
ALSA: hda/realtek: Add support for Samsung Galaxy Book3 360 (NP730QFG)
ASoC: Intel: avs: da7219: Remove suspend_pre() and resume_post()
ALSA: hda/tas2781: Fix error code tas2781_read_acpi()
ALSA: hda/realtek: Enable mute and micmute LED on HP ProBook 430 G8
ALSA: usb-audio: add mixer mapping for Corsair HS80
ALSA: ump: Shut up truncated string warning
ALSA: sh: Use standard helper for buffer accesses
ALSA: usb-audio: Notify xrun for low-latency mode
ALSA: hda/conexant: fix Z60MR100 startup pop issue
ALSA: ump: Update legacy substream names upon FB info update
ALSA: ump: Indicate the inactive group in legacy substream names
ALSA: ump: Don't open legacy substream for an inactive group
ALSA: seq: ump: Fix seq port updates per FB info notify
Linus Torvalds [Fri, 6 Dec 2024 19:43:22 +0000 (11:43 -0800)]
Merge tag 'regmap-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"A couple of small fixes, fixing an incorrect format specifier in a log
message and adding missing cleanup of the devres data used to support
dev_get_regmap() when a device is unregistered"
* tag 'regmap-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: detach regmap from dev on regmap_exit
regmap: Use correct format specifier for logging range errors
Linus Torvalds [Fri, 6 Dec 2024 19:36:48 +0000 (11:36 -0800)]
Merge tag 'spi-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few small driver specific fixes and device ID updates for SPI.
The Apple change flags the driver as being compatible with the core's
GPIO chip select support, fixing support for some systems"
* tag 'spi-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: omap2-mcspi: Fix the IS_ERR() bug for devm_clk_get_optional_enabled()
spi: intel: Add Panther Lake SPI controller support
spi: apple: Set use_gpio_descriptors to true
spi: mpc52xx: Add cancel_work_sync before module remove
Linus Torvalds [Fri, 6 Dec 2024 19:27:10 +0000 (11:27 -0800)]
Merge tag 'mmc-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"Core:
- Further prevent card detect during shutdown
Host drivers:
- sdhci-pci: Add DMI quirk for missing CD GPIO on Vexia Edu Atla 10
tablet"
* tag 'mmc-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: Further prevent card detect during shutdown
mmc: sdhci-pci: Add DMI quirk for missing CD GPIO on Vexia Edu Atla 10 tablet
Linus Torvalds [Fri, 6 Dec 2024 19:24:00 +0000 (11:24 -0800)]
Merge tag 'pmdomain-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
"Core:
- Fix a couple of memory-leaks during genpd init/remove
Providers:
- imx: Adjust delay for gpcv2 to fix power up handshake
- mediatek: Fix DT bindings by adding another nested power-domain
layer"
* tag 'pmdomain-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: imx: gpcv2: Adjust delay after power up handshake
pmdomain: core: Fix error path in pm_genpd_init() when ida alloc fails
pmdomain: core: Add missing put_device()
dt-bindings: power: mediatek: Add another nested power-domain layer
Linus Torvalds [Thu, 5 Dec 2024 23:11:39 +0000 (15:11 -0800)]
Merge tag 'audit-pr-20241205' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit build problem workaround from Paul Moore:
"A minor audit patch that shuffles some code slightly to workaround a
GCC bug affecting a number of people.
The GCC folks have been able to reproduce the problem and are
discussing solutions (see the bug report link in the commit), but
since the workaround is trivial let's do that in the kernel so we can
unblock people who are hitting this"
* tag 'audit-pr-20241205' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: workaround a GCC bug triggered by task comm changes
Linus Torvalds [Thu, 5 Dec 2024 23:02:20 +0000 (15:02 -0800)]
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
"One bug fix and some documentation updates:
- Correct typos in comments
- Elaborate a comment about how the uAPI works for
IOMMU_HW_INFO_TYPE_ARM_SMMUV3
- Fix a double free on error path and add test coverage for the bug"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommu/arm-smmu-v3: Improve uAPI comment for IOMMU_HW_INFO_TYPE_ARM_SMMUV3
iommufd/selftest: Cover IOMMU_FAULT_QUEUE_ALLOC in iommufd_fail_nth
iommufd: Fix out_fput in iommufd_fault_alloc()
iommufd: Fix typos in kernel-doc comments
Linus Torvalds [Thu, 5 Dec 2024 22:38:49 +0000 (14:38 -0800)]
Merge tag 'v6.13-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Three fixes for potential out of bound accesses in read and write
paths (e.g. when alternate data streams enabled)
- GCC 15 build fix
* tag 'v6.13-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: align aux_payload_buf to avoid OOB reads in cryptographic operations
ksmbd: fix Out-of-Bounds Write in ksmbd_vfs_stream_write
ksmbd: fix Out-of-Bounds Read in ksmbd_vfs_stream_read
smb: server: Fix building with GCC 15
- ipv6:
- avoid possible NULL deref in modify_prefix_route()
- release expired exception dst cached in socket
- smc: fix LGR and link use-after-free issue
- hsr: avoid potential out-of-bound access in fill_frame_info()
- can: hi311x: fix potential use-after-free
- eth: ice: fix VLAN pruning in switchdev mode
Previous releases - always broken:
- netfilter:
- ipset: hold module reference while requesting a module
- nft_inner: incorrect percpu area handling under softirq
- can: j1939: fix skb reference counting
- eth:
- mlxsw: use correct key block on Spectrum-4
- mlx5: fix memory leak in mlx5hws_definer_calc_layout"
* tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
net :mana :Request a V2 response version for MANA_QUERY_GF_STAT
net: avoid potential UAF in default_operstate()
vsock/test: verify socket options after setting them
vsock/test: fix parameter types in SO_VM_SOCKETS_* calls
vsock/test: fix failures due to wrong SO_RCVLOWAT parameter
net/mlx5e: Remove workaround to avoid syndrome for internal port
net/mlx5e: SD, Use correct mdev to build channel param
net/mlx5: E-Switch, Fix switching to switchdev mode in MPV
net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled
net/mlx5: HWS: Properly set bwc queue locks lock classes
net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout
bnxt_en: handle tpa_info in queue API implementation
bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap()
bnxt_en: refactor tpa_info alloc/free into helpers
geneve: do not assume mac header is set in geneve_xmit_skb()
mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4
ethtool: Fix wrong mod state in case of verbose and no_mask bitset
ipmr: tune the ipmr_can_free_table() checks.
netfilter: nft_set_hash: skip duplicated elements pending gc run
netfilter: ipset: Hold module reference while requesting a module
...
Linus Torvalds [Thu, 5 Dec 2024 18:17:55 +0000 (10:17 -0800)]
Merge tag 'trace-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix trace histogram sort function cmp_entries_dup()
The sort function cmp_entries_dup() returns either 1 or 0, and not -1
if parameter "a" is less than "b" by memcmp().
- Fix archs that call trace_hardirqs_off() without RCU watching
Both x86 and arm64 no longer call any tracepoints with RCU not
watching. It was assumed that it was safe to get rid of
trace_*_rcuidle() version of the tracepoint calls. This was needed to
get rid of the SRCU protection and be able to implement features like
faultable traceponits and add rust tracepoints.
Unfortunately, there were a few architectures that still relied on
that logic. There's only one file that has tracepoints that are
called without RCU watching. Add macro logic around the tracepoints
for architectures that do not have CONFIG_ARCH_WANTS_NO_INSTR defined
will check if the code is in the idle path (the only place RCU isn't
watching), and enable RCU around calling the tracepoint, but only do
it if the tracepoint is enabled.
* tag 'trace-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix archs that still call tracepoints without RCU watching
tracing: Fix cmp_entries_dup() to respect sort() comparison rules
Linus Torvalds [Thu, 5 Dec 2024 18:06:47 +0000 (10:06 -0800)]
Merge tag 'hid-for-linus-2024120501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:
- regression fix in suspend/resume for i2c-hid (Kenny Levinsen)
- fix wacom driver assuming a name can not be null (WangYuli)
- a couple of constify changes/fixes (Thomas Weißschuh)
- a couple of selftests/hid fixes (Maximilian Heyne & Benjamin
Tissoires)
* tag 'hid-for-linus-2024120501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
selftests/hid: fix kfunc inclusions with newer bpftool
HID: bpf: drop unneeded casts discarding const
HID: bpf: constify hid_ops
selftests: hid: fix typo and exit code
HID: wacom: fix when get product name maybe null pointer
HID: i2c-hid: Revert to using power commands to wake on resume
Mark Rutland [Thu, 5 Dec 2024 12:16:55 +0000 (12:16 +0000)]
arm64: ptrace: fix partial SETREGSET for NT_ARM_GCS
Currently gcs_set() doesn't initialize the temporary 'user_gcs'
variable, and a SETREGSET call with a length of 0, 8, or 16 will leave
some portion of this uninitialized. Consequently some arbitrary
uninitialized values may be written back to the relevant fields in task
struct, potentially leaking up to 192 bits of memory from the kernel
stack. The read is limited to a specific slot on the stack, and the
issue does not provide a write mechanism.
As gcs_set() rejects cases where user_gcs::features_enabled has bits set
other than PR_SHADOW_STACK_SUPPORTED_STATUS_MASK, a SETREGSET call with
a length of zero will randomly succeed or fail depending on the value of
the uninitialized value, it isn't possible to leak the full 192 bits.
With a length of 8 or 16, user_gcs::features_enabled can be initialized
to an accepted value, making it practical to leak 128 or 64 bits.
Fix this by initializing the temporary value before copying the regset
from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG,
NT_ARM_SYSTEM_CALL). In the case of a zero-length or partial write, the
existing contents of the fields which are not written to will be
retained.
To ensure that the extraction and insertion of fields is consistent
across the GETREGSET and SETREGSET calls, new task_gcs_to_user() and
task_gcs_from_user() helpers are added, matching the style of
pac_address_keys_to_user() and pac_address_keys_from_user().
Fixes: 7ec3b57cb29f ("arm64/ptrace: Expose GCS via ptrace and core files") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Mark Rutland [Thu, 5 Dec 2024 12:16:54 +0000 (12:16 +0000)]
arm64: ptrace: fix partial SETREGSET for NT_ARM_POE
Currently poe_set() doesn't initialize the temporary 'ctrl' variable,
and a SETREGSET call with a length of zero will leave this
uninitialized. Consequently an arbitrary value will be written back to
target->thread.por_el0, potentially leaking up to 64 bits of memory from
the kernel stack. The read is limited to a specific slot on the stack,
and the issue does not provide a write mechanism.
Fix this by initializing the temporary value before copying the regset
from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG,
NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing
contents of POR_EL1 will be retained.
Mark Rutland [Thu, 5 Dec 2024 12:16:53 +0000 (12:16 +0000)]
arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMR
Currently fpmr_set() doesn't initialize the temporary 'fpmr' variable,
and a SETREGSET call with a length of zero will leave this
uninitialized. Consequently an arbitrary value will be written back to
target->thread.uw.fpmr, potentially leaking up to 64 bits of memory from
the kernel stack. The read is limited to a specific slot on the stack,
and the issue does not provide a write mechanism.
Fix this by initializing the temporary value before copying the regset
from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG,
NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing
contents of FPMR will be retained.
Fixes: 4035c22ef7d4 ("arm64/ptrace: Expose FPMR via ptrace") Cc: <stable@vger.kernel.org> # 6.9.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Thu, 5 Dec 2024 18:03:43 +0000 (10:03 -0800)]
Merge tag 'linux-watchdog-6.13-rc1' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
- Add support for exynosautov920 SoC
- Add support for Airoha EN7851 watchdog
- Add support for MT6735 TOPRGU/WDT
- Delete the cpu5wdt driver
- Always print when registering watchdog fails
- Several other small fixes and improvements
* tag 'linux-watchdog-6.13-rc1' of git://www.linux-watchdog.org/linux-watchdog: (36 commits)
watchdog: rti: of: honor timeout-sec property
watchdog: s3c2410_wdt: add support for exynosautov920 SoC
dt-bindings: watchdog: Document ExynosAutoV920 watchdog bindings
watchdog: mediatek: Add support for MT6735 TOPRGU/WDT
watchdog: mediatek: Make sure system reset gets asserted in mtk_wdt_restart()
dt-bindings: watchdog: fsl-imx-wdt: Add missing 'big-endian' property
dt-bindings: watchdog: Document Qualcomm QCS8300
docs: ABI: Fix spelling mistake in pretimeout_avaialable_governors
Revert "watchdog: s3c2410_wdt: use exynos_get_pmu_regmap_by_phandle() for PMU regs"
watchdog: rzg2l_wdt: Power on the watchdog domain in the restart handler
watchdog: Switch back to struct platform_driver::remove()
watchdog: it87_wdt: add PWRGD enable quirk for Qotom QCML04
watchdog: da9063: Remove __maybe_unused notations
watchdog: da9063: Do not use a global variable
watchdog: Delete the cpu5wdt driver
watchdog: Add support for Airoha EN7851 watchdog
dt-bindings: watchdog: airoha: document watchdog for Airoha EN7581
watchdog: sl28cpld_wdt: don't print out if registering watchdog fails
watchdog: rza_wdt: don't print out if registering watchdog fails
watchdog: rti_wdt: don't print out if registering watchdog fails
...
Mark Rutland [Thu, 5 Dec 2024 12:16:52 +0000 (12:16 +0000)]
arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL
Currently tagged_addr_ctrl_set() doesn't initialize the temporary 'ctrl'
variable, and a SETREGSET call with a length of zero will leave this
uninitialized. Consequently tagged_addr_ctrl_set() will consume an
arbitrary value, potentially leaking up to 64 bits of memory from the
kernel stack. The read is limited to a specific slot on the stack, and
the issue does not provide a write mechanism.
As set_tagged_addr_ctrl() only accepts values where bits [63:4] zero and
rejects other values, a partial SETREGSET attempt will randomly succeed
or fail depending on the value of the uninitialized value, and the
exposure is significantly limited.
Fix this by initializing the temporary value before copying the regset
from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG,
NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing
value of the tagged address ctrl will be retained.
The NT_ARM_TAGGED_ADDR_CTRL regset is only visible in the
user_aarch64_view used by a native AArch64 task to manipulate another
native AArch64 task. As get_tagged_addr_ctrl() only returns an error
value when called for a compat task, tagged_addr_ctrl_get() and
tagged_addr_ctrl_set() should never observe an error value from
get_tagged_addr_ctrl(). Add a WARN_ON_ONCE() to both to indicate that
such an error would be unexpected, and error handlnig is not missing in
either case.
Fixes: 2200aa7154cb ("arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Maíra Canal [Wed, 4 Dec 2024 12:28:31 +0000 (09:28 -0300)]
drm/v3d: Enable Performance Counters before clearing them
On the Raspberry Pi 5, performance counters are not being cleared
when `v3d_perfmon_start()` is called, even though we write to the
CLR register. As a result, their values accumulate until they
overflow.
The expected behavior is for performance counters to reset to zero
at the start of a job. When the job finishes and the perfmon is
stopped, the counters should accurately reflect the values for that
specific job.
To ensure this behavior, the performance counters are now enabled
before being cleared. This allows the CLR register to function as
intended, zeroing the counter values when the job begins.
Steven Rostedt [Wed, 4 Dec 2024 15:04:14 +0000 (10:04 -0500)]
tracing: Fix archs that still call tracepoints without RCU watching
Tracepoints require having RCU "watching" as it uses RCU to do updates to
the tracepoints. There are some cases that would call a tracepoint when
RCU was not "watching". This was usually in the idle path where RCU has
"shutdown". For the few locations that had tracepoints without RCU
watching, there was an trace_*_rcuidle() variant that could be used. This
used SRCU for protection.
There are tracepoints that trace when interrupts and preemption are
enabled and disabled. In some architectures, these tracepoints are called
in a path where RCU is not watching. When x86 and arm64 removed these
locations, it was incorrectly assumed that it would be safe to remove the
trace_*_rcuidle() variant and also remove the SRCU logic, as it made the
code more complex and harder to implement new tracepoint features (like
faultable tracepoints and tracepoints in rust).
Instead of bringing back the trace_*_rcuidle(), as it will not be trivial
to do as new code has already been added depending on its removal, add a
workaround to the one file that still requires it (trace_preemptirq.c). If
the architecture does not define CONFIG_ARCH_WANTS_NO_INSTR, then check if
the code is in the idle path, and if so, call ct_irq_enter/exit() which
will enable RCU around the tracepoint.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/20241204100414.4d3e06d0@gandalf.local.home Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: 48bcda684823 ("tracing: Remove definition of trace_*_rcuidle()") Closes: https://lore.kernel.org/all/bddb02de-957a-4df5-8e77-829f55728ea2@roeck-us.net/ Acked-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Imre Deak [Tue, 3 Dec 2024 16:02:22 +0000 (18:02 +0200)]
drm/dp_mst: Reset message rx state after OOM in drm_dp_mst_handle_up_req()
After an out-of-memory error the reception state should be reset, so
that the next attempt receiving a message doesn't fail (due to getting a
start-of-message packet, while the reception state has already the
start-of-message flag set).
Imre Deak [Wed, 4 Dec 2024 13:20:07 +0000 (15:20 +0200)]
drm/dp_mst: Ensure mst_primary pointer is valid in drm_dp_mst_handle_up_req()
While receiving an MST up request message from one thread in
drm_dp_mst_handle_up_req(), the MST topology could be removed from
another thread via drm_dp_mst_topology_mgr_set_mst(false), freeing
mst_primary and setting drm_dp_mst_topology_mgr::mst_primary to NULL.
This could lead to a NULL deref/use-after-free of mst_primary in
drm_dp_mst_handle_up_req().
Avoid the above by holding a reference for mst_primary in
drm_dp_mst_handle_up_req() while it's used.
v2: Fix kfreeing the request if getting an mst_primary reference fails.
Imre Deak [Tue, 3 Dec 2024 17:46:32 +0000 (19:46 +0200)]
drm/dp_mst: Fix down request message timeout handling
If receiving a reply for an MST down request message times out, the
thread receiving the reply in drm_dp_mst_handle_down_rep() could try to
dereference the drm_dp_sideband_msg_tx txmsg request message after the
thread waiting for the reply - calling drm_dp_mst_wait_tx_reply() - has
timed out and freed txmsg, hence leading to a use-after-free in
drm_dp_mst_handle_down_rep().
Prevent the above by holding the drm_dp_mst_topology_mgr::qlock in
drm_dp_mst_handle_down_rep() for the whole duration txmsg is looked up
from the request list and dereferenced.
v2: Fix unlocking mgr->qlock after verify_rx_request_type() fails.
Imre Deak [Tue, 3 Dec 2024 16:02:18 +0000 (18:02 +0200)]
drm/dp_mst: Verify request type in the corresponding down message reply
After receiving the response for an MST down request message, the
response should be accepted/parsed only if the response type matches
that of the request. Ensure this by checking if the request type code
stored both in the request and the reply match, dropping the reply in
case of a mismatch.
This fixes the topology detection for an MST hub, as described in the
Closes link below, where the hub sends an incorrect reply message after
a CLEAR_PAYLOAD_TABLE -> LINK_ADDRESS down request message sequence.
Imre Deak [Tue, 3 Dec 2024 16:02:17 +0000 (18:02 +0200)]
drm/dp_mst: Fix resetting msg rx state after topology removal
If the MST topology is removed during the reception of an MST down reply
or MST up request sideband message, the
drm_dp_mst_topology_mgr::up_req_recv/down_rep_recv states could be reset
from one thread via drm_dp_mst_topology_mgr_set_mst(false), racing with
the reading/parsing of the message from another thread via
drm_dp_mst_handle_down_rep() or drm_dp_mst_handle_up_req(). The race is
possible since the reader/parser doesn't hold any lock while accessing
the reception state. This in turn can lead to a memory corruption in the
reader/parser as described by commit bd2fccac61b4 ("drm/dp_mst: Fix MST
sideband message body length check").
Fix the above by resetting the message reception state if needed before
reading/parsing a message. Another solution would be to hold the
drm_dp_mst_topology_mgr::lock for the whole duration of the message
reception/parsing in drm_dp_mst_handle_down_rep() and
drm_dp_mst_handle_up_req(), however this would require a bigger change.
Since the fix is also needed for stable, opting for the simpler solution
in this patch.
Cc: Lyude Paul <lyude@redhat.com> Cc: <stable@vger.kernel.org> Fixes: 1d082618bbf3 ("drm/display/dp_mst: Fix down/up message handling after sink disconnect") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13056 Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-2-imre.deak@intel.com
Remove hardcoded dmic codec from the UL_SRC dai link to avoid requiring
a dmic codec to be present for the driver to probe, as not every
MT8188-based platform might need a dmic codec. The codec can be assigned
to the dai link through the dai-link property in Devicetree on the
platforms where it is needed.
No Devicetree currently relies on it so it is safe to remove without
worrying about backward compatibility.
Fixes: 9f08dcbddeb3 ("ASoC: mediatek: mt8188-mt6359: support new board with nau88255") Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20241203-mt8188-6359-unhardcode-dmic-v1-1-346e3e5cbe6d@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org>
Catalin Marinas [Wed, 4 Dec 2024 17:50:04 +0000 (17:50 +0000)]
arm64: mte: Fix copy_highpage() warning on hugetlb folios
Commit 25c17c4b55de ("hugetlb: arm64: add mte support") improved the
copy_highpage() function to update the tags in the destination hugetlb
folio. However, when the source folio isn't tagged, the code takes the
non-hugetlb path where try_page_mte_tagging() warns as the destination
is a hugetlb folio:
Change the check for the tagged status of the source folio so that it
does not fall through the non-hugetlb case. In addition, only perform
the copy (for the full folio) if the source page is the folio head and
warn if the destination folio is already tagged, for symmetry with the
non-hugetlb case.
Catalin Marinas [Tue, 3 Dec 2024 15:19:41 +0000 (15:19 +0000)]
arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs
Linux currently sets the TCR_EL1.AS bit unconditionally during CPU
bring-up. On an 8-bit ASID CPU, this is RES0 and ignored, otherwise
16-bit ASIDs are enabled. However, if running in a VM and the hypervisor
reports 8-bit ASIDs (ID_AA64MMFR0_EL1.ASIDBits == 0) on a 16-bit ASIDs
CPU, Linux uses bits 8 to 63 as a generation number for tracking old
process ASIDs. The bottom 8 bits of this generation end up being written
to TTBR1_EL1 and also used for the ASID-based TLBI operations as the
upper 8 bits of the ASID. Following an ASID roll-over event we can have
threads of the same application with the same 8-bit ASID but different
generation numbers running on separate CPUs. Both TLB caching and the
TLBI operations will end up using different actual 16-bit ASIDs for the
same process.
A similar scenario can happen in a big.LITTLE configuration if the boot
CPU only uses 8-bit ASIDs while secondary CPUs have 16-bit ASIDs.
Ensure that the ASID generation is only tracked by bits 16 and up,
leaving bits 15:8 as 0 if the kernel uses 8-bit ASIDs. Note that
clearing TCR_EL1.AS is not sufficient since the architecture requires
that the top 8 bits of the ASID passed to TLBI instructions are 0 rather
than ignored in such configuration.
Cc: stable@vger.kernel.org Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241203151941.353796-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Qinxin Xia [Thu, 5 Dec 2024 01:33:31 +0000 (09:33 +0800)]
ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A
HiSilicon HIP09A platforms using the same SMMU PMCG with HIP09
and thus suffers the same erratum. List them in the PMCG platform
information list without introducing a new SMMU PMCG Model.
Shradha Gupta [Wed, 4 Dec 2024 05:48:20 +0000 (21:48 -0800)]
net :mana :Request a V2 response version for MANA_QUERY_GF_STAT
The current requested response version(V1) for MANA_QUERY_GF_STAT query
results in STATISTICS_FLAGS_TX_ERRORS_GDMA_ERROR value being set to
0 always.
In order to get the correct value for this counter we request the response
version to be V2.
Cc: stable@vger.kernel.org Fixes: e1df5202e879 ("net :mana :Add remaining GDMA stats for MANA to ethtool") Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1733291300-12593-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The buggy address belongs to the object at ffff888043eba000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 432 bytes inside of
freed 2048-byte region [ffff888043eba000, ffff888043eba800)
Fixes: 8c55facecd7a ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down") Reported-by: syzbot+1939f24bdb783e9e43d9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/674f3a18.050a0220.48a03.0041.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241203170933.2449307-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 5 Dec 2024 10:49:14 +0000 (11:49 +0100)]
Merge tag 'nf-24-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix esoteric undefined behaviour due to uninitialized stack access
in ip_vs_protocol_init(), from Jinghao Jia.
2) Fix iptables xt_LED slab-out-of-bounds due to incorrect sanitization
of the led string identifier, reported by syzbot. Patch from
Dmitry Antipov.
3) Remove WARN_ON_ONCE reachable from userspace to check for the maximum
cgroup level, nft_socket cgroup matching is restricted to 255 levels,
but cgroups allow for INT_MAX levels by default. Reported by syzbot.
4) Fix nft_inner incorrect use of percpu area to store tunnel parser
context with softirqs, resulting in inconsistent inner header
offsets that could lead to bogus rule mismatches, reported by syzbot.
5) Grab module reference on ipset core while requesting set type modules,
otherwise kernel crash is possible by removing ipset core module,
patch from Phil Sutter.
6) Fix possible double-free in nft_hash garbage collector due to unstable
walk interator that can provide twice the same element. Use a sequence
number to skip expired/dead elements that have been already scheduled
for removal. Based on patch from Laurent Fasnach
netfilter pull request 24-12-05
* tag 'nf-24-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_hash: skip duplicated elements pending gc run
netfilter: ipset: Hold module reference while requesting a module
netfilter: nft_inner: incorrect percpu area handling under softirq
netfilter: nft_socket: remove WARN_ON_ONCE on maximum cgroup level
netfilter: x_tables: fix LED ID check in led_tg_check()
ipvs: fix UB due to uninitialized stack access in ip_vs_protocol_init()
====================
Parameters were created using wrong C types, which caused them to be of
wrong size on some architectures, causing problems.
The problem with SO_RCVLOWAT was found on s390 (big endian), while x86-64
didn't show it. After the fix, all tests pass on s390.
Then Stefano Garzarella pointed out that SO_VM_SOCKETS_* calls might have
a similar problem, which turned out to be true, hence, the second patch.
Changes for v8:
- Fix whitespace warnings from "checkpatch.pl --strict"
- Add maintainers to Cc:
Changes for v7:
- Rebase on top of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
- Add the "net" tags to the subjects
Changes for v6:
- rework the patch #3 to avoid creating a new file for new functions,
and exclude vsock_perf from calling the new functions.
- add "Reviewed-by:" to the patch #2.
Changes for v5:
- in the patch #2 replace the introduced uint64_t with unsigned long long
to match documentation
- add a patch #3 that verifies every setsockopt() call.
Changes for v4:
- add "Reviewed-by:" to the first patch, and add a second patch fixing
SO_VM_SOCKETS_* calls, which depends on the first one (hence, it's now
a patch series.)
Changes for v3:
- fix the same problem in vsock_perf and update commit message
Changes for v2:
- add "Fixes:" lines to the commit message
====================
Konstantin Shkolnyy [Tue, 3 Dec 2024 15:06:56 +0000 (09:06 -0600)]
vsock/test: verify socket options after setting them
Replace setsockopt() calls with calls to functions that follow
setsockopt() with getsockopt() and check that the returned value and its
size are the same as have been set. (Except in vsock_perf.)
Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Konstantin Shkolnyy [Tue, 3 Dec 2024 15:06:55 +0000 (09:06 -0600)]
vsock/test: fix parameter types in SO_VM_SOCKETS_* calls
Change parameters of SO_VM_SOCKETS_* to unsigned long long as documented
in the vm_sockets.h, because the corresponding kernel code requires them
to be at least 64-bit, no matter what architecture. Otherwise they are
too small on 32-bit machines.
Fixes: 5c338112e48a ("test/vsock: rework message bounds test") Fixes: 685a21c314a8 ("test/vsock: add big message test") Fixes: 542e893fbadc ("vsock/test: two tests to check credit update logic") Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility") Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Konstantin Shkolnyy [Tue, 3 Dec 2024 15:06:54 +0000 (09:06 -0600)]
vsock/test: fix failures due to wrong SO_RCVLOWAT parameter
This happens on 64-bit big-endian machines.
SO_RCVLOWAT requires an int parameter. However, instead of int, the test
uses unsigned long in one place and size_t in another. Both are 8 bytes
long on 64-bit machines. The kernel, having received the 8 bytes, doesn't
test for the exact size of the parameter, it only cares that it's >=
sizeof(int), and casts the 4 lower-addressed bytes to an int, which, on
a big-endian machine, contains 0. 0 doesn't trigger an error, SO_RCVLOWAT
returns with success and the socket stays with the default SO_RCVLOWAT = 1,
which results in vsock_test failures, while vsock_perf doesn't even notice
that it's failed to change it.
Fixes: b1346338fbae ("vsock_test: POLLIN + SO_RCVLOWAT test") Fixes: 542e893fbadc ("vsock/test: two tests to check credit update logic") Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility") Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yafang shao [Thu, 5 Dec 2024 02:11:00 +0000 (10:11 +0800)]
audit: workaround a GCC bug triggered by task comm changes
A build failure has been reported with the following details:
In file included from include/linux/string.h:390,
from include/linux/bitmap.h:13,
from include/linux/cpumask.h:12,
from include/linux/smp.h:13,
from include/linux/lockdep.h:14,
from include/linux/spinlock.h:63,
from include/linux/wait.h:9,
from include/linux/wait_bit.h:8,
from include/linux/fs.h:6,
from kernel/auditsc.c:37:
In function 'sized_strscpy',
inlined from '__audit_ptrace' at kernel/auditsc.c:2732:2:
>> include/linux/fortify-string.h:293:17:
error: call to '__write_overflow' declared with attribute error:
detected write beyond size of object (1st parameter)
293 | __write_overflow();
| ^~~~~~~~~~~~~~~~~~
In function 'sized_strscpy',
inlined from 'audit_signal_info_syscall' at kernel/auditsc.c:2759:3:
>> include/linux/fortify-string.h:293:17:
error: call to '__write_overflow' declared with attribute error:
detected write beyond size of object (1st parameter)
293 | __write_overflow();
| ^~~~~~~~~~~~~~~~~~
The issue appears to be a GCC bug, though the root cause remains
unclear at this time. For now, let's implement a workaround.
A bug report has also been filed with GCC [0].
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912 Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410171420.1V00ICVG-lkp@intel.com/ Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org> Closes: https://lore.kernel.org/all/20241128182435.57a1ea6f@gandalf.local.home/ Reported-by: Zhuo, Qiuxu <qiuxu.zhuo@intel.com> Closes: https://lore.kernel.org/all/CY8PR11MB71348E568DBDA576F17DAFF389362@CY8PR11MB7134.namprd11.prod.outlook.com/ Originally-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/linux-hardening/202410171059.C2C395030@keescook/ Signed-off-by: Yafang shao <laoar.shao@gmail.com> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Tested-by: Yafang Shao <laoar.shao@gmail.com>
[PM: subject tweak, description line wrapping] Signed-off-by: Paul Moore <paul@paul-moore.com>
This series contains updates to ice, idpf, ixgbe, ixgbevf, and igb
drivers.
For ice:
Arkadiusz corrects search for determining whether PHY clock recovery is
supported on the device.
Przemyslaw corrects mask used for PHY timestamps on ETH56G devices.
Wojciech adds missing virtchnl ops which caused NULL pointer
dereference.
Marcin fixes VLAN filter settings for uplink VSI in switchdev mode.
For idpf:
Josh restores setting of completion tag for empty buffers.
For ixgbevf:
Jake removes incorrect initialization/support of IPSEC for mailbox
version 1.5.
For ixgbe:
Jake rewords and downgrades misleading message when negotiation
of VF mailbox version is not supported.
Tore Amundsen corrects value for BASE-BX10 capability.
For igb:
Yuan Can adds proper teardown on failed pci_register_driver() call.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igb: Fix potential invalid memory access in igb_init_module()
ixgbe: Correct BASE-BX10 compliance code
ixgbe: downgrade logging of unsupported VF API version to debug
ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5
idpf: set completion tag for "empty" bufs associated with a packet
ice: Fix VLAN pruning in switchdev mode
ice: Fix NULL pointer dereference in switchdev
ice: fix PHY timestamp extraction for ETH56G
ice: fix PHY Clock Recovery availability check
====================
Jianbo Liu [Tue, 3 Dec 2024 20:49:20 +0000 (22:49 +0200)]
net/mlx5e: Remove workaround to avoid syndrome for internal port
Previously a workaround was added to avoid syndrome 0xcdb051. It is
triggered when offload a rule with tunnel encapsulation, and
forwarding to another table, but not matching on the internal port in
firmware steering mode. The original workaround skips internal tunnel
port logic, which is not correct as not all cases are considered. As
an example, if vlan is configured on the uplink port, traffic can't
pass because vlan header is not added with this workaround. Besides,
there is no such issue for software steering. So, this patch removes
that, and returns error directly if trying to offload such rule for
firmware steering.
Fixes: 06b4eac9c4be ("net/mlx5e: Don't offload internal port if filter device is out device") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Tested-by: Frode Nordahl <frode.nordahl@canonical.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tariq Toukan [Tue, 3 Dec 2024 20:49:19 +0000 (22:49 +0200)]
net/mlx5e: SD, Use correct mdev to build channel param
In a multi-PF netdev, each traffic channel creates its own resources
against a specific PF.
In the cited commit, where this support was added, the channel_param
logic was mistakenly kept unchanged, so it always used the primary PF
which is found at priv->mdev.
In this patch we fix this by moving the logic to be per-channel, and
passing the correct mdev instance.
This bug happened to be usually harmless, as the resulting cparam
structures would be the same for all channels, due to identical FW logic
and decisions.
However, in some use cases, like fwreset, this gets broken.
This could lead to different symptoms. Example:
Error cqe on cqn 0x428, ci 0x0, qn 0x10a9, opcode 0xe, syndrome 0x4,
vendor syndrome 0x32
Fixes: e4f9686bdee7 ("net/mlx5e: Let channels be SD-aware") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Patrisious Haddad [Tue, 3 Dec 2024 20:49:18 +0000 (22:49 +0200)]
net/mlx5: E-Switch, Fix switching to switchdev mode in MPV
Fix the mentioned commit change for MPV mode, since in MPV mode the IB
device is shared between different core devices, so under this change
when moving both devices simultaneously to switchdev mode the IB device
removal and re-addition can race with itself causing unexpected behavior.
In such case do rescan_drivers() only once in order to add the ethernet
representor auxiliary device, and skip adding and removing IB devices.
Patrisious Haddad [Tue, 3 Dec 2024 20:49:17 +0000 (22:49 +0200)]
net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled
In case that IB device is already disabled when moving to switchdev mode,
which can happen when working with LAG, need to do rescan_drivers()
before leaving in order to add ethernet representor auxiliary device.
====================
bnxt_en: support header page pool in queue API
Commit 7ed816be35ab ("eth: bnxt: use page pool for head frags") added a
separate page pool for header frags. Now, frags are allocated from this
header page pool e.g. rxr->tpa_info.data.
The queue API did not properly handle rxr->tpa_info and so using the
queue API to i.e. reset any queues will result in pages being returned
to the incorrect page pool, causing inflight != 0 warnings.
Fix this bug by properly allocating/freeing tpa_info and copying/freeing
head_pool in the queue API implementation.
The 1st patch is a prep patch that refactors helpers out to be used by
the implementation patch later.
The 2nd patch is a drive-by refactor. Happy to take it out and re-send
to net-next if there are any objections.
The 3rd patch is the implementation patch that will properly alloc/free
rxr->tpa_info.
====================
David Wei [Wed, 4 Dec 2024 04:10:22 +0000 (20:10 -0800)]
bnxt_en: handle tpa_info in queue API implementation
Commit 7ed816be35ab ("eth: bnxt: use page pool for head frags") added a
page pool for header frags, which may be distinct from the existing pool
for the aggregation ring. Prior to this change, frags used in the TPA
ring rx_tpa were allocated from system memory e.g. napi_alloc_frag()
meaning their lifetimes were not associated with a page pool. They can
be returned at any time and so the queue API did not alloc or free
rx_tpa.
But now frags come from a separate head_pool which may be different to
page_pool. Without allocating and freeing rx_tpa, frags allocated from
the old head_pool may be returned to a different new head_pool which
causes a mismatch between the pp hold/release count.
Fix this problem by properly freeing and allocating rx_tpa in the queue
API implementation.
Ido Schimmel [Tue, 3 Dec 2024 15:16:05 +0000 (16:16 +0100)]
mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4
The driver is currently using an ACL key block that is not supported by
Spectrum-4. This works because the driver is only using a single field
from this key block which is located in the same offset in the
equivalent Spectrum-4 key block.
The issue was discovered when the firmware started rejecting the use of
the unsupported key block. The change has been reverted to avoid
breaking users that only update their firmware.
Nonetheless, fix the issue by using the correct key block.
Kory Maincent [Mon, 2 Dec 2024 15:33:57 +0000 (16:33 +0100)]
ethtool: Fix wrong mod state in case of verbose and no_mask bitset
A bitset without mask in a _SET request means we want exactly the bits in
the bitset to be set. This works correctly for compact format but when
verbose format is parsed, ethnl_update_bitset32_verbose() only sets the
bits present in the request bitset but does not clear the rest. The commit 6699170376ab ("ethtool: fix application of verbose no_mask bitset") fixes
this issue by clearing the whole target bitmap before we start iterating.
The solution proposed brought an issue with the behavior of the mod
variable. As the bitset is always cleared the old value will always
differ to the new value.
Fix it by adding a new function to compare bitmaps and a temporary variable
which save the state of the old bitmap.
The root cause is a network namespace creation failing after successful
initialization of the ipmr subsystem. Such a case is not currently
matched by the ipmr_can_free_table() helper.
New namespaces are zeroed on allocation and inserted into net ns list
only after successful creation; when deleting an ipmr table, the list
next pointer can be NULL only on netns initialization failure.
Update the ipmr_can_free_table() checks leveraging such condition.
Reported-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+6e8cb445d4b43d006e0c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6e8cb445d4b43d006e0c Fixes: 11b6e701bce9 ("ipmr: add debug check for mr table cleanup") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/8bde975e21bbca9d9c27e36209b2dd4f1d7a3f00.1733212078.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Norbert Szetei [Sat, 30 Nov 2024 15:56:14 +0000 (16:56 +0100)]
ksmbd: align aux_payload_buf to avoid OOB reads in cryptographic operations
The aux_payload_buf allocation in SMB2 read is performed without ensuring
alignment, which could result in out-of-bounds (OOB) reads during
cryptographic operations such as crypto_xor or ghash. This patch aligns
the allocation of aux_payload_buf to prevent these issues.
(Note that to add this patch to stable would require modifications due
to recent patch "ksmbd: use __GFP_RETRY_MAYFAIL")
Signed-off-by: Norbert Szetei <norbert@doyensec.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Pablo Neira Ayuso [Sun, 1 Dec 2024 23:04:49 +0000 (00:04 +0100)]
netfilter: nft_set_hash: skip duplicated elements pending gc run
rhashtable does not provide stable walk, duplicated elements are
possible in case of resizing. I considered that checking for errors when
calling rhashtable_walk_next() was sufficient to detect the resizing.
However, rhashtable_walk_next() returns -EAGAIN only at the end of the
iteration, which is too late, because a gc work containing duplicated
elements could have been already scheduled for removal to the worker.
Add a u32 gc worker sequence number per set, bump it on every workqueue
run. Annotate gc worker sequence number on the expired element. Use it
to skip those already seen in this gc workqueue run.
Note that this new field is never reset in case gc transaction fails, so
next gc worker run on the expired element overrides it. Wraparound of gc
worker sequence number should not be an issue with stale gc worker
sequence number in the element, that would just postpone the element
removal in one gc run.
Note that it is not possible to use flags to annotate that element is
pending gc run to detect duplicates, given that gc transaction can be
invalidated in case of update from the control plane, therefore, not
allowing to clear such flag.
On x86_64, pahole reports no changes in the size of nft_rhash_elem.
Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch> Tested-by: Laurent Fasnacht <laurent.fasnacht@proton.ch> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Wed, 4 Dec 2024 18:31:37 +0000 (10:31 -0800)]
Merge tag 'loongarch-fixes-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix bugs about EFI screen info, hugetlb pte clear and Lockdep-RCU
splat in KVM, plus some trival cleanups"
* tag 'loongarch-fixes-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Protect kvm_io_bus_{read,write}() with SRCU
LoongArch: KVM: Protect kvm_check_requests() with SRCU
LoongArch: BPF: Adjust the parameter of emit_jirl()
LoongArch: Add architecture specific huge_pte_clear()
LoongArch/irq: Use seq_put_decimal_ull_width() for decimal values
LoongArch: Fix reserving screen info memory for above-4G firmware
* tag 'platform-drivers-x86-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-nb-wmi: Ignore unknown event 0xCF
platform/x86: asus-wmi: Ignore return value when writing thermal policy
platform/x86: samsung-laptop: Match MODULE_DESCRIPTION() to functionality
Kuan-Wei Chiu [Tue, 3 Dec 2024 20:22:28 +0000 (04:22 +0800)]
tracing: Fix cmp_entries_dup() to respect sort() comparison rules
The cmp_entries_dup() function used as the comparator for sort()
violated the symmetry and transitivity properties required by the
sorting algorithm. Specifically, it returned 1 whenever memcmp() was
non-zero, which broke the following expectations:
* Symmetry: If x < y, then y > x.
* Transitivity: If x < y and y < z, then x < z.
These violations could lead to incorrect sorting and failure to
correctly identify duplicate elements.
Fix the issue by directly returning the result of memcmp(), which
adheres to the required comparison properties.
Phil Sutter [Fri, 29 Nov 2024 15:30:38 +0000 (16:30 +0100)]
netfilter: ipset: Hold module reference while requesting a module
User space may unload ip_set.ko while it is itself requesting a set type
backend module, leading to a kernel crash. The race condition may be
provoked by inserting an mdelay() right after the nfnl_unlock() call.
Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Aapo Vienamo [Wed, 4 Dec 2024 08:02:08 +0000 (10:02 +0200)]
spi: intel: Add Panther Lake SPI controller support
The Panther Lake SPI controllers are compatible with the Cannon Lake
controllers. Add support for following SPI controller device IDs:
- H-series: 0xe323
- P-series: 0xe423
- U-series: 0xe423
Lion Ackermann [Mon, 2 Dec 2024 16:22:57 +0000 (17:22 +0100)]
net: sched: fix ordering of qlen adjustment
Changes to sch->q.qlen around qdisc_tree_reduce_backlog() need to happen
_before_ a call to said function because otherwise it may fail to notify
parent qdiscs when the child is about to become empty.
Signed-off-by: Lion Ackermann <nnamrec@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 2 Dec 2024 15:21:38 +0000 (10:21 -0500)]
net: sched: fix erspan_opt settings in cls_flower
When matching erspan_opt in cls_flower, only the (version, dir, hwid)
fields are relevant. However, in fl_set_erspan_opt() it initializes
all bits of erspan_opt and its mask to 1. This inadvertently requires
packets to match not only the (version, dir, hwid) fields but also the
other fields that are unexpectedly set to 1.
This patch resolves the issue by ensuring that only the (version, dir,
hwid) fields are configured in fl_set_erspan_opt(), leaving the other
fields to 0 in erspan_opt.
Fixes: 79b1011cb33d ("net: sched: allow flower to match erspan options") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Mon, 2 Dec 2024 16:48:05 +0000 (18:48 +0200)]
ethtool: Fix access to uninitialized fields in set RXNFC command
The check for non-zero ring with RSS is only relevant for
ETHTOOL_SRXCLSRLINS command, in other cases the check tries to access
memory which was not initialized by the userspace tool. Only perform the
check in case of ETHTOOL_SRXCLSRLINS.
Without this patch, filter deletion (for example) could statistically
result in a false error:
# ethtool --config-ntuple eth3 delete 484
rmgr: Cannot delete RX class rule: Invalid argument
Cannot delete classification rule
Fernando Fernandez Mancera [Mon, 2 Dec 2024 15:56:08 +0000 (15:56 +0000)]
Revert "udp: avoid calling sock_def_readable() if possible"
This reverts commit 612b1c0dec5bc7367f90fc508448b8d0d7c05414. On a
scenario with multiple threads blocking on a recvfrom(), we need to call
sock_def_readable() on every __udp_enqueue_schedule_skb() otherwise the
threads won't be woken up as __skb_wait_for_more_packets() is using
prepare_to_wait_exclusive().
Joe Damato [Mon, 2 Dec 2024 18:21:02 +0000 (18:21 +0000)]
net: Make napi_hash_lock irq safe
Make napi_hash_lock IRQ safe. It is used during the control path, and is
taken and released in napi_hash_add and napi_hash_del, which will
typically be called by calls to napi_enable and napi_disable.
This change avoids a deadlock in pcnet32 (and other any other drivers
which follow the same pattern):
CPU 0:
pcnet32_open
spin_lock_irqsave(&lp->lock, ...)
napi_enable
napi_hash_add <- before this executes, CPU 1 proceeds
spin_lock(napi_hash_lock)
[...]
spin_unlock_irqrestore(&lp->lock, flags);
Alex Deucher [Mon, 25 Nov 2024 18:59:09 +0000 (13:59 -0500)]
drm/amdgpu: rework resume handling for display (v2)
Split resume into a 3rd step to handle displays when DCC is
enabled on DCN 4.0.1. Move display after the buffer funcs
have been re-enabled so that the GPU will do the move and
properly set the DCC metadata for DCN.
v2: fix fence irq resume ordering
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x
Pablo Neira Ayuso [Wed, 27 Nov 2024 11:46:54 +0000 (12:46 +0100)]
netfilter: nft_inner: incorrect percpu area handling under softirq
Softirq can interrupt ongoing packet from process context that is
walking over the percpu area that contains inner header offsets.
Disable bh and perform three checks before restoring the percpu inner
header offsets to validate that the percpu area is valid for this
skbuff:
1) If the NFT_PKTINFO_INNER_FULL flag is set on, then this skbuff
has already been parsed before for inner header fetching to
register.
2) Validate that the percpu area refers to this skbuff using the
skbuff pointer as a cookie. If there is a cookie mismatch, then
this skbuff needs to be parsed again.
3) Finally, validate if the percpu area refers to this tunnel type.
Only after these three checks the percpu area is restored to a on-stack
copy and bh is enabled again.
After inner header fetching, the on-stack copy is stored back to the
percpu area.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Reported-by: syzbot+84d0441b9860f0d63285@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Tue, 3 Dec 2024 19:02:17 +0000 (11:02 -0800)]
Merge tag 'for-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- add lockdep annotations for io_uring/encoded read integration, inode
lock is held when returning to userspace
- properly reflect experimental config option to sysfs
- handle NULL root in case the rescue mode accepts invalid/damaged tree
roots (rescue=ibadroot)
- regression fix of a deadlock between transaction and extent locks
- fix pending bio accounting bug in encoded read ioctl
- fix NOWAIT mode when checking references for NOCOW files
- fix use-after-free in a rb-tree cleanup in ref-verify debugging tool
* tag 'for-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix lockdep warnings on io_uring encoded reads
btrfs: ref-verify: fix use-after-free after invalid ref action
btrfs: add a sanity check for btrfs root in btrfs_search_slot()
btrfs: don't loop for nowait writes when checking for cross references
btrfs: sysfs: advertise experimental features only if CONFIG_BTRFS_EXPERIMENTAL=y
btrfs: fix deadlock between transaction commits and extent locks
btrfs: fix use-after-free in btrfs_encoded_read_endio()
Linus Torvalds [Tue, 3 Dec 2024 18:50:22 +0000 (10:50 -0800)]
Merge tag 'fs_for_v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and udf fixes from Jan Kara:
"Two small UDF fixes for better handling of corrupted filesystem and a
quota fix to fix handling of filesystem freezing"
* tag 'fs_for_v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Verify inode link counts before performing rename
udf: Skip parent dir link count update if corrupted
quota: flush quota_release_work upon quota writeback
Linus Torvalds [Tue, 3 Dec 2024 18:46:49 +0000 (10:46 -0800)]
Merge tag 'xfs-fixes-6.13-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
- Use xchg() in xlog_cil_insert_pcp_aggregate()
- Fix ABBA deadlock on a race between mount and log shutdown
- Fix quota softlimit incoherency on delalloc
- Fix sparse inode limits on runt AG
- remove unknown compat feature checks in SB write valdation
- Eliminate a lockdep false positive
* tag 'xfs-fixes-6.13-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't call xfs_bmap_same_rtgroup in xfs_bmap_add_extent_hole_delay
xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
xfs: prevent mount and log shutdown race
xfs: delalloc and quota softlimit timers are incoherent
xfs: fix sparse inode limits on runt AG
xfs: remove unknown compat feature check in superblock write validation
xfs: eliminate lockdep false positives in xfs_attr_shortform_list
Yuan Can [Wed, 23 Oct 2024 12:10:48 +0000 (20:10 +0800)]
igb: Fix potential invalid memory access in igb_init_module()
The pci_register_driver() can fail and when this happened, the dca_notifier
needs to be unregistered, otherwise the dca_notifier can be called when
igb fails to install, resulting to invalid memory access.
Fixes: bbd98fe48a43 ("igb: Fix DCA errors and do not use context index for 82576") Signed-off-by: Yuan Can <yuancan@huawei.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tore Amundsen [Fri, 15 Nov 2024 14:17:36 +0000 (14:17 +0000)]
ixgbe: Correct BASE-BX10 compliance code
SFF-8472 (section 5.4 Transceiver Compliance Codes) defines bit 6 as
BASE-BX10. Bit 6 means a value of 0x40 (decimal 64).
The current value in the source code is 0x64, which appears to be a
mix-up of hex and decimal values. A value of 0x64 (binary 01100100)
incorrectly sets bit 2 (1000BASE-CX) and bit 5 (100BASE-FX) as well.
Fixes: 1b43e0d20f2d ("ixgbe: Add 1000BASE-BX support") Signed-off-by: Tore Amundsen <tore@amundsen.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Ernesto Castellotti <ernesto@castellotti.net> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Fri, 1 Nov 2024 23:05:43 +0000 (16:05 -0700)]
ixgbe: downgrade logging of unsupported VF API version to debug
The ixgbe PF driver logs an info message when a VF attempts to negotiate an
API version which it does not support:
VF 0 requested invalid api version 6
The ixgbevf driver attempts to load with mailbox API v1.5, which is
required for best compatibility with other hosts such as the ESX VMWare PF.
The Linux PF only supports API v1.4, and does not currently have support
for the v1.5 API.
The logged message can confuse users, as the v1.5 API is valid, but just
happens to not currently be supported by the Linux PF.
Downgrade the info message to a debug message, and fix the language to
use 'unsupported' instead of 'invalid' to improve message clarity.
Long term, we should investigate whether the improvements in the v1.5 API
make sense for the Linux PF, and if so implement them properly. This may
require yet another API version to resolve issues with negotiating IPSEC
offload support.
Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reported-by: Yifei Liu <yifei.l.liu@oracle.com> Link: https://lore.kernel.org/intel-wired-lan/20240301235837.3741422-1-yifei.l.liu@oracle.com/ Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Fri, 1 Nov 2024 23:05:42 +0000 (16:05 -0700)]
ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5
Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication
between PF and VF") added support for v1.5 of the PF to VF mailbox
communication API. This commit mistakenly enabled IPSEC offload for API
v1.5.
No implementation of the v1.5 API has support for IPSEC offload. This
offload is only supported by the Linux PF as mailbox API v1.4. In fact, the
v1.5 API is not implemented in any Linux PF.
Attempting to enable IPSEC offload on a PF which supports v1.5 API will not
work. Only the Linux upstream ixgbe and ixgbevf support IPSEC offload, and
only as part of the v1.4 API.
Fix the ixgbevf Linux driver to stop attempting IPSEC offload when
the mailbox API does not support it.
The existing API design choice makes it difficult to support future API
versions, as other non-Linux hosts do not implement IPSEC offload. If we
add support for v1.5 to the Linux PF, then we lose support for IPSEC
offload.
A full solution likely requires a new mailbox API with a proper negotiation
to check that IPSEC is actually supported by the host.
Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Joshua Hay [Mon, 7 Oct 2024 20:24:35 +0000 (13:24 -0700)]
idpf: set completion tag for "empty" bufs associated with a packet
Commit d9028db618a6 ("idpf: convert to libeth Tx buffer completion")
inadvertently removed code that was necessary for the tx buffer cleaning
routine to iterate over all buffers associated with a packet.
When a frag is too large for a single data descriptor, it will be split
across multiple data descriptors. This means the frag will span multiple
buffers in the buffer ring in order to keep the descriptor and buffer
ring indexes aligned. The buffer entries in the ring are technically
empty and no cleaning actions need to be performed. These empty buffers
can precede other frags associated with the same packet. I.e. a single
packet on the buffer ring can look like:
The cleaning routine iterates through these buffers based on a matching
completion tag. If the completion tag is not set for buf2, the loop will
end prematurely. Frag2 will be left uncleaned and next_to_clean will be
left pointing to the end of packet, which will break the cleaning logic
for subsequent cleans. This consequently leads to tx timeouts.
Assign the empty bufs the same completion tag for the packet to ensure
the cleaning routine iterates over all of the buffers associated with
the packet.
Fixes: d9028db618a6 ("idpf: convert to libeth Tx buffer completion") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Madhu chittim <madhu.chittim@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Marcin Szycik [Mon, 4 Nov 2024 18:49:09 +0000 (19:49 +0100)]
ice: Fix VLAN pruning in switchdev mode
In switchdev mode the uplink VSI should receive all unmatched packets,
including VLANs. Therefore, VLAN pruning should be disabled if uplink is
in switchdev mode. It is already being done in ice_eswitch_setup_env(),
however the addition of ice_up() in commit 44ba608db509 ("ice: do
switchdev slow-path Rx using PF VSI") caused VLAN pruning to be
re-enabled after disabling it.
Add a check to ice_set_vlan_filtering_features() to ensure VLAN
filtering will not be enabled if uplink is in switchdev mode. Note that
ice_is_eswitch_mode_switchdev() is being used instead of
ice_is_switchdev_running(), as the latter would only return true after
the whole switchdev setup completes.
Fixes: 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Wojciech Drewek [Tue, 29 Oct 2024 09:42:59 +0000 (10:42 +0100)]
ice: Fix NULL pointer dereference in switchdev
Commit 608a5c05c39b ("virtchnl: support queue rate limit and quanta
size configuration") introduced new virtchnl ops:
- get_qos_caps
- cfg_q_bw
- cfg_q_quanta
New ops were added to ice_virtchnl_dflt_ops, in
commit 015307754a19 ("ice: Support VF queue rate limit and quanta
size configuration"), but not to the ice_virtchnl_repr_ops. Because
of that, if we get one of those messages in switchdev mode we end up
with NULL pointer dereference:
To check if PHY Clock Recovery mechanic is available for a device, there
is a need to verify if given PHY is available within the netlist, but the
netlist node type used for the search is wrong, also the search context
shall be specified.
Modify the search function to allow specifying the context in the
search.
Use the PHY node type instead of CLOCK CONTROLLER type, also use proper
search context which for PHY search is PORT, as defined in E810
Datasheet [1] ('3.3.8.2.4 Node Part Number and Node Options (0x0003)' and
'Table 3-105. Program Topology Device NVM Admin Command').
Will Deacon [Mon, 2 Dec 2024 14:57:30 +0000 (14:57 +0000)]
MAINTAINERS: Add CCA and pKVM CoCO guest support to the ARM64 entry
Commits 7999edc484ca ("virt: arm-cca-guest: TSM_REPORT support for
realm") and a06c3fad49a5 ("drivers/virt: pkvm: Add initial support for
running as a protected guest") added arm64 guest-side support for
running in CCA and pKVM confidential computing environments
respectively.
Unfortunately, these changes were not accompanied by a MAINTAINERS
entry and so aren't automatically picked up by the get_maintainer.pl
script. Since the initial support was merged via the arm64 tree, extend
the ARM64 entry to cover the two new directories.
Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241202145731.6422-3-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Will Deacon [Mon, 2 Dec 2024 14:57:29 +0000 (14:57 +0000)]
drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD fails
Calling the MMIO_GUARD hypercall from guests which have not been
enrolled (e.g. because they are running without pvmfw) results in
-EINVAL being returned. In this case, MMIO_GUARD is not active
and so we can simply proceed with the normal ioremap() routine.
Don't fail ioremap() if MMIO_GUARD fails; instead WARN_ON_ONCE()
to highlight that the pvm environment is slightly wonky.
Fixes: 0f1269495800 ("drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercall") Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241202145731.6422-2-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3e25d5a49f99b75b ("asm-generic: add an optional pfn_valid check to page_to_phys")
... which added a pfn_valid() check into page_to_phys(), and at this
point in boot pfn_valid() will always return false because the vmemmap
has not yet been initialized and there are no valid mem_sections yet.
Before that commit, the arithmetic performed by page_to_phys() would
give the expected physical address, though it is somewhat dubious to use
vmemmap addresses before the vmemmap has been initialized.
Aside from kernel image addresses, all executable code should be
allocated from execmem (where all allocations will fall within the
vmalloc area), and so there's no need for the fallback case when
CONFIG_EXECMEM=n.
Simplify patch_map() accordingly, directly converting kernel image
addresses and removing the redundant fallback case.
Fixes: 3e25d5a49f99 ("asm-generic: add an optional pfn_valid check to page_to_phys") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Thomas Huth <thuth@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241202170359.1475019-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Jason Gunthorpe [Tue, 12 Nov 2024 18:51:36 +0000 (14:51 -0400)]
iommu/arm-smmu-v3: Improve uAPI comment for IOMMU_HW_INFO_TYPE_ARM_SMMUV3
Be specific about what fields should be accessed in the idr result and
give other guidance to the VMM on how it should generate the
vIDR. Discussion on the list, and review of the qemu implementation
understood this needs to be clearer and more detailed.
Yang Shi [Mon, 25 Nov 2024 17:16:50 +0000 (09:16 -0800)]
arm64: mm: Fix zone_dma_limit calculation
Commit ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by
zone_dma_limit") and subsequent patches changed how zone_dma_limit is
calculated to allow a reduced ZONE_DMA even when RAM starts above 4GB.
Commit 122c234ef4e1 ("arm64: mm: keep low RAM dma zone") further fixed
this to ensure ZONE_DMA remains below U32_MAX if RAM starts below 4GB,
especially on platforms that do not have IORT or DT description of the
device DMA ranges. While zone boundaries calculation was fixed by the
latter commit, zone_dma_limit, used to determine the GFP_DMA flag in the
core code, was not updated. This results in excessive use of GFP_DMA and
unnecessary ZONE_DMA allocations on some platforms.
Update zone_dma_limit to match the actual upper bound of ZONE_DMA.
Fixes: ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") Cc: <stable@vger.kernel.org> # 6.12.x Reported-by: Yutang Jiang <jiangyutang@os.amperecomputing.com> Tested-by: Yutang Jiang <jiangyutang@os.amperecomputing.com> Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20241125171650.77424-1-yang@os.amperecomputing.com
[catalin.marinas@arm.com: some tweaking of the commit log] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Masahiro Yamada [Tue, 3 Dec 2024 10:21:07 +0000 (19:21 +0900)]
module: Convert default symbol namespace to string literal
Commit cdd30ebb1b9f ("module: Convert symbol namespace to string
literal") only converted MODULE_IMPORT_NS() and EXPORT_SYMBOL_NS(),
leaving DEFAULT_SYMBOL_NAMESPACE as a macro expansion.
This commit converts DEFAULT_SYMBOL_NAMESPACE in the same way to avoid
annoyance for the default namespace as well.