]> www.infradead.org Git - users/willy/xarray.git/log
users/willy/xarray.git
13 months agoARM: 9414/1: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION
Yuntao Liu [Wed, 21 Aug 2024 06:34:41 +0000 (07:34 +0100)]
ARM: 9414/1: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION

There is a build issue with LD segmentation fault, while
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is not enabled, as bellow.

scripts/link-vmlinux.sh: line 49:  3796 Segmentation fault
 (core dumped) ${ld} ${ldflags} -o ${output} ${wl}--whole-archive
 ${objs} ${wl}--no-whole-archive ${wl}--start-group
 ${libs} ${wl}--end-group ${kallsymso} ${btf_vmlinux_bin_o} ${ldlibs}

The error occurs in older versions of the GNU ld with version earlier
than 2.36. It makes most sense to have a minimum LD version as
a dependency for HAVE_LD_DEAD_CODE_DATA_ELIMINATION and eliminate
the impact of ".reloc  .text, R_ARM_NONE, ." when
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is not enabled.

Fixes: ed0f94102251 ("ARM: 9404/1: arm32: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION")
Reported-by: Harith George <mail2hgg@gmail.com>
Tested-by: Harith George <mail2hgg@gmail.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yuntao Liu <liuyuntao12@huawei.com>
Link: https://lore.kernel.org/all/14e9aefb-88d1-4eee-8288-ef15d4a9b059@gmail.com/
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
14 months agoLinux 6.11-rc2
Linus Torvalds [Sun, 4 Aug 2024 20:50:53 +0000 (13:50 -0700)]
Linux 6.11-rc2

14 months agoprofiling: remove profile=sleep support
Tetsuo Handa [Sun, 4 Aug 2024 09:48:10 +0000 (18:48 +0900)]
profiling: remove profile=sleep support

The kernel sleep profile is no longer working due to a recursive locking
bug introduced by commit 42a20f86dc19 ("sched: Add wrapper for get_wchan()
to keep task blocked")

Booting with the 'profile=sleep' kernel command line option added or
executing

  # echo -n sleep > /sys/kernel/profiling

after boot causes the system to lock up.

Lockdep reports

  kthreadd/3 is trying to acquire lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70

  but task is already holding lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370

with the call trace being

   lock_acquire+0xc8/0x2f0
   get_wchan+0x32/0x70
   __update_stats_enqueue_sleeper+0x151/0x430
   enqueue_entity+0x4b0/0x520
   enqueue_task_fair+0x92/0x6b0
   ttwu_do_activate+0x73/0x140
   try_to_wake_up+0x213/0x370
   swake_up_locked+0x20/0x50
   complete+0x2f/0x40
   kthread+0xfb/0x180

However, since nobody noticed this regression for more than two years,
let's remove 'profile=sleep' support based on the assumption that nobody
needs this functionality.

Fixes: 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked")
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 months agoMerge tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 4 Aug 2024 15:57:08 +0000 (08:57 -0700)]
Merge tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

 - Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver.

   A recent change in the ACPI code which consolidated code pathes moved
   the invocation of init_freq_invariance_cppc() to be moved to a CPU
   hotplug handler. The first invocation on AMD CPUs ends up enabling a
   static branch which dead locks because the static branch enable tries
   to acquire cpu_hotplug_lock but that lock is already held write by
   the hotplug machinery.

   Use static_branch_enable_cpuslocked() instead and take the hotplug
   lock read for the Intel code path which is invoked from the
   architecture code outside of the CPU hotplug operations.

 - Fix the number of reserved bits in the sev_config structure bit field
   so that the bitfield does not exceed 64 bit.

 - Add missing Zen5 model numbers

 - Fix the alignment assumptions of pti_clone_pgtable() and
   clone_entry_text() on 32-bit:

   The code assumes PMD aligned code sections, but on 32-bit the kernel
   entry text is not PMD aligned. So depending on the code size and
   location, which is configuration and compiler dependent, entry text
   can cross a PMD boundary. As the start is not PMD aligned adding PMD
   size to the start address is larger than the end address which
   results in partially mapped entry code for user space. That causes
   endless recursion on the first entry from userspace (usually #PF).

   Cure this by aligning the start address in the addition so it ends up
   at the next PMD start address.

   clone_entry_text() enforces PMD mapping, but on 32-bit the tail might
   eventually be PTE mapped, which causes a map fail because the PMD for
   the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for
   the clone() invocation which resolves to PTE on 32-bit and PMD on
   64-bit.

 - Zero the 8-byte case for get_user() on range check failure on 32-bit

   The recend consolidation of the 8-byte get_user() case broke the
   zeroing in the failure case again. Establish it by clearing ECX
   before the range check and not afterwards as that obvioulsy can't be
   reached when the range check fails

* tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit
  x86/mm: Fix pti_clone_entry_text() for i386
  x86/mm: Fix pti_clone_pgtable() alignment assumption
  x86/setup: Parse the builtin command line before merging
  x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range
  x86/sev: Fix __reserved field in sev_config
  x86/aperfmperf: Fix deadlock on cpu_hotplug_lock

14 months agoMerge tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Aug 2024 15:50:16 +0000 (08:50 -0700)]
Merge tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "Two fixes for the timer/clocksource code:

   - The recent fix to make the take over of the broadcast timer more
     reliable retrieves a per CPU pointer in preemptible context.

     This went unnoticed in testing as some compilers hoist the access
     into the non-preemotible section where the pointer is actually
     used, but obviously compilers can rightfully invoke it where the
     code put it.

     Move it into the non-preemptible section right to the actual usage
     side to cure it.

   - The clocksource watchdog is supposed to emit a warning when the
     retry count is greater than one and the number of retries reaches
     the limit.

     The condition is backwards and warns always when the count is
     greater than one. Fixup the condition to prevent spamming dmesg"

* tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()
  tick/broadcast: Move per CPU pointer access into the atomic section

14 months agoMerge tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Aug 2024 15:46:14 +0000 (08:46 -0700)]
Merge tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:

 - When stime is larger than rtime due to accounting imprecision, then
   utime = rtime - stime becomes negative. As this is unsigned math, the
   result becomes a huge positive number.

   Cure it by resetting stime to rtime in that case, so utime becomes 0.

 - Restore consistent state when sched_cpu_deactivate() fails.

   When offlining a CPU fails in sched_cpu_deactivate() after the SMT
   present counter has been decremented, then the function aborts but
   fails to increment the SMT present counter and leaves it imbalanced.
   Consecutive operations cause it to underflow. Add the missing fixup
   for the error path.

   For SMT accounting the runqueue needs to marked online again in the
   error exit path to restore consistent state.

* tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
  sched/core: Introduce sched_set_rq_on/offline() helper
  sched/smt: Fix unbalance sched_smt_present dec/inc
  sched/smt: Introduce sched_smt_present_inc/dec() helper
  sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime

14 months agoMerge tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 4 Aug 2024 15:42:18 +0000 (08:42 -0700)]
Merge tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Thomas Gleixner:

 - Move the smp_processor_id() invocation back into the non-preemtible
   region, so that the result is valid to use

 - Add the missing package C2 residency counters for Sierra Forest CPUs
   to make the newly added support actually useful

* tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix smp_processor_id()-in-preemptible warnings
  perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest

14 months agoMerge tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 4 Aug 2024 15:36:57 +0000 (08:36 -0700)]
Merge tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A couple of fixes for interrupt chip drivers:

   - Make sure to skip the clear register space in the MBIGEN driver
     when calculating the node register index. Otherwise the clear
     register is clobbered and the wrong node registers are accessed.

   - Fix a signed/unsigned confusion in the loongarch CPU driver which
     converts an error code to a huge "valid" interrupt number.

   - Convert the mesion GPIO interrupt controller lock to a raw spinlock
     so it works on RT.

   - Add a missing static to a internal function in the pic32 EVIC
     driver"

* tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mbigen: Fix mbigen node address layout
  irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t'
  irqchip/irq-pic32-evic: Add missing 'static' to internal function
  irqchip/loongarch-cpu: Fix return value of lpic_gsi_to_irq()

14 months agoMerge tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Aug 2024 15:32:31 +0000 (08:32 -0700)]
Merge tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "Two fixes for locking and jump labels:

   - Ensure that the atomic_cmpxchg() conditions are correct and
     evaluating to true on any non-zero value except 1. The missing
     check of the return value leads to inconsisted state of the jump
     label counter.

   - Add a missing type conversion in the paravirt spinlock code which
     makes loongson build again"

* tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  jump_label: Fix the fix, brown paper bags galore
  locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node()

14 months agoarm: dts: arm: versatile-ab: Fix duplicate clock node name
Rob Herring (Arm) [Tue, 30 Jul 2024 21:00:30 +0000 (15:00 -0600)]
arm: dts: arm: versatile-ab: Fix duplicate clock node name

Commit 04f08ef291d4 ("arm/arm64: dts: arm: Use generic clock and
regulator nodenames") renamed nodes and created 2 "clock-24000000" nodes
(at different paths).

The kernel can't handle these duplicate names even though they are at
different paths.  Fix this by renaming one of the nodes to "clock-pclk".

This name is aligned with other Arm boards (those didn't have a known
frequency to use in the node name).

Fixes: 04f08ef291d4 ("arm/arm64: dts: arm: Use generic clock and regulator nodenames")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 months agoMerge tag '6.11-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 4 Aug 2024 15:18:40 +0000 (08:18 -0700)]
Merge tag '6.11-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - two reparse point fixes

 - minor cleanup

 - additional trace point (to help debug a recent problem)

* tag '6.11-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  smb: client: fix FSCTL_GET_REPARSE_POINT against NetApp
  smb3: add dynamic tracepoints for shutdown ioctl
  cifs: Remove cifs_aio_ctx
  smb: client: handle lack of FSCTL_GET_REPARSE_POINT support

14 months agoMerge tag 'media/v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Sun, 4 Aug 2024 15:12:33 +0000 (08:12 -0700)]
Merge tag 'media/v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - two Kconfig fixes

 - one fix for the UVC driver addressing probing time detection of a UVC
   custom controls

 - one fix related to PDF generation

* tag 'media/v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: v4l: Fix missing tabular column hint for Y14P format
  media: intel/ipu6: select AUXILIARY_BUS in Kconfig
  media: ipu-bridge: fix ipu6 Kconfig dependencies
  media: uvcvideo: Fix custom control mapping probing

14 months agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 3 Aug 2024 22:12:56 +0000 (15:12 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "One core change that reverts the double message print patch in sd.c
  (it was causing regressions on embedded systems).

  The rest are driver fixes in ufs, mpt3sas and mpi3mr"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: exynos: Don't resume FMP when crypto support is disabled
  scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES
  scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONES
  scsi: ufs: core: Do not set link to OFF state while waking up from hibernation
  scsi: Revert "scsi: sd: Do not repeat the starting disk message"
  scsi: ufs: core: Fix deadlock during RTC update
  scsi: ufs: core: Bypass quick recovery if force reset is needed
  scsi: ufs: core: Check LSDBS cap when !mcq

14 months agoMerge tag 'xfs-6.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sat, 3 Aug 2024 16:09:25 +0000 (09:09 -0700)]
Merge tag 'xfs-6.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Chandan Babu:

 - Fix memory leak when corruption is detected during scrubbing parent
   pointers

 - Allow SECURE namespace xattrs to use reserved block pool to in order
   to prevent ENOSPC

 - Save stack space by passing tracepoint's char array to file_path()
   instead of another stack variable

 - Remove unused parameter in macro XFS_DQUOT_LOGRES

 - Replace comma with semicolon in a couple of places

* tag 'xfs-6.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: convert comma to semicolon
  xfs: convert comma to semicolon
  xfs: remove unused parameter in macro XFS_DQUOT_LOGRES
  xfs: fix file_path handling in tracepoints
  xfs: allow SECURE namespace xattrs to use reserved block pool
  xfs: fix a memory leak

14 months agoMerge tag 'parisc-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 3 Aug 2024 16:03:21 +0000 (09:03 -0700)]
Merge tag 'parisc-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:

 - fix unaligned memory accesses when calling BPF functions

 - adjust memory size constants to fix possible DMA corruptions

* tag 'parisc-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: fix a possible DMA corruption
  parisc: fix unaligned accesses in BPF

14 months agoruntime constants: deal with old decrepit linkers
Linus Torvalds [Sat, 3 Aug 2024 01:12:06 +0000 (18:12 -0700)]
runtime constants: deal with old decrepit linkers

The runtime constants linker script depended on documented linker
behavior [1]:

 "If an output section’s name is the same as the input section’s name
  and is representable as a C identifier, then the linker will
  automatically PROVIDE two symbols: __start_SECNAME and __stop_SECNAME,
  where SECNAME is the name of the section. These indicate the start
  address and end address of the output section respectively"

to just automatically define the symbol names for the bounds of the
runtime constant arrays.

It turns out that this isn't actually something we can rely on, with old
linkers not generating these automatic symbols.  It looks to have been
introduced in binutils-2.29 back in 2017, and we still support building
with versions all the way back to binutils-2.25 (from 2015).

And yes, Oleg actually seems to be using such ancient versions of
binutils.

So instead of depending on the implicit symbols from "section names
match and are representable C identifiers", just do this all manually.
It's not like it causes us any extra pain, we already have to do that
for all the other sections that we use that often have special
characters in them.

Reported-and-tested-by: Oleg Nesterov <oleg@redhat.com>
Link: https://sourceware.org/binutils/docs/ld/Input-Section-Example.html
Link: https://lore.kernel.org/all/20240802114518.GA20924@redhat.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 months agoMerge tag 'tags/fixes-media-uvc-20230722' of git://git.kernel.org/pub/scm/linux/kerne...
Hans Verkuil [Sat, 3 Aug 2024 09:01:04 +0000 (11:01 +0200)]
Merge tag 'tags/fixes-media-uvc-20230722' of git://git.kernel.org/pub/scm/linux/kernel/git/pinchartl/linux.git

uvcvideo v6.11 regression fix: fix custom control mapping probing

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
14 months agoMerge tag 'io_uring-6.11-20240802' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 2 Aug 2024 21:18:31 +0000 (14:18 -0700)]
Merge tag 'io_uring-6.11-20240802' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
 "Two minor tweaks for the NAPI handling, both from Olivier:

   - Kill two unused list definitions

   - Ensure that multishot NAPI doesn't age away"

* tag 'io_uring-6.11-20240802' of git://git.kernel.dk/linux:
  io_uring: remove unused local list heads in NAPI functions
  io_uring: keep multishot request NAPI timeout current

14 months agoMerge tag 'thermal-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 2 Aug 2024 21:10:11 +0000 (14:10 -0700)]
Merge tag 'thermal-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These fix a few issues related to the MSI IRQs management in the
  int340x thermal driver, fix a thermal core issue that may lead to
  missing trip point crossing events and update the thermal core
  documentation.

  Specifics:

   - Fix MSI error path cleanup in int340x, allow it to work with a
     subset of thermal MSI IRQs if some of them are not working and make
     it free all MSI IRQs on module exit (Srinivas Pandruvada)

   - Fix a thermal core issue that may lead to missing trip point
     crossing events in some cases when thermal_zone_set_trips() is used
     and update the thermal core documentation (Rafael Wysocki)"

* tag 'thermal-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Update thermal zone registration documentation
  thermal: trip: Avoid skipping trips in thermal_zone_set_trips()
  thermal: intel: int340x: Free MSI IRQ vectors on module exit
  thermal: intel: int340x: Allow limited thermal MSI support
  thermal: intel: int340x: Fix kernel warning during MSI cleanup

14 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 2 Aug 2024 20:46:43 +0000 (13:46 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Expand the speculative SSBS errata workaround to more CPUs

 - Ensure jump label changes are visible to all CPUs with a
   kick_all_cpus_sync() (and also enable jump label batching as part of
   the fix)

 - The shadow call stack sanitiser is currently incompatible with Rust,
   make CONFIG_RUST conditional on !CONFIG_SHADOW_CALL_STACK

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
  rust: SHADOW_CALL_STACK is incompatible with Rust
  arm64: errata: Expand speculative SSBS workaround (again)
  arm64: cputype: Add Cortex-A725 definitions
  arm64: cputype: Add Cortex-X1C definitions

14 months agoMerge tag 'ceph-for-6.11-rc2' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 2 Aug 2024 17:33:06 +0000 (10:33 -0700)]
Merge tag 'ceph-for-6.11-rc2' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A fix for a potential hang in the MDS when cap revocation races with
  the client releasing the caps in question, marked for stable"

* tag 'ceph-for-6.11-rc2' of https://github.com/ceph/ceph-client:
  ceph: force sending a cap update msg back to MDS for revoke op

14 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 2 Aug 2024 17:17:49 +0000 (10:17 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "The bulk of the changes here is a largish change to guest_memfd,
  delaying the clearing and encryption of guest-private pages until they
  are actually added to guest page tables. This started as "let's make
  it impossible to misuse the API" for SEV-SNP; but then it ballooned a
  bit.

  The new logic is generally simpler and more ready for hugepage support
  in guest_memfd.

  Summary:

   - fix latent bug in how usage of large pages is determined for
     confidential VMs

   - fix "underline too short" in docs

   - eliminate log spam from limited APIC timer periods

   - disallow pre-faulting of memory before SEV-SNP VMs are initialized

   - delay clearing and encrypting private memory until it is added to
     guest page tables

   - this change also enables another small cleanup: the checks in
     SNP_LAUNCH_UPDATE that limit it to non-populated, private pages can
     now be moved in the common kvm_gmem_populate() function

   - fix compilation error that the RISC-V merge introduced in selftests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: fix determination of max NPT mapping level for private pages
  KVM: riscv: selftests: Fix compile error
  KVM: guest_memfd: abstract how prepared folios are recorded
  KVM: guest_memfd: let kvm_gmem_populate() operate only on private gfns
  KVM: extend kvm_range_has_memory_attributes() to check subset of attributes
  KVM: cleanup and add shortcuts to kvm_range_has_memory_attributes()
  KVM: guest_memfd: move check for already-populated page to common code
  KVM: remove kvm_arch_gmem_prepare_needed()
  KVM: guest_memfd: make kvm_gmem_prepare_folio() operate on a single struct kvm
  KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is passed to the guest
  KVM: guest_memfd: return locked folio from __kvm_gmem_get_pfn
  KVM: rename CONFIG_HAVE_KVM_GMEM_* to CONFIG_HAVE_KVM_ARCH_GMEM_*
  KVM: guest_memfd: do not go through struct page
  KVM: guest_memfd: delay folio_mark_uptodate() until after successful preparation
  KVM: guest_memfd: return folio from __kvm_gmem_get_pfn()
  KVM: x86: disallow pre-fault for SNP VMs before initialization
  KVM: Documentation: Fix title underline too short warning
  KVM: x86: Eliminate log spam from limited APIC timer periods

14 months agoMerge branch 'kvm-fixes' into HEAD
Paolo Bonzini [Fri, 2 Aug 2024 16:31:48 +0000 (12:31 -0400)]
Merge branch 'kvm-fixes' into HEAD

* fix latent bug in how usage of large pages is determined for
  confidential VMs

* fix "underline too short" in docs

* eliminate log spam from limited APIC timer periods

* disallow pre-faulting of memory before SEV-SNP VMs are initialized

* delay clearing and encrypting private memory until it is added to
  guest page tables

* this change also enables another small cleanup: the checks in
  SNP_LAUNCH_UPDATE that limit it to non-populated, private pages
  can now be moved in the common kvm_gmem_populate() function

14 months agoMerge tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 2 Aug 2024 16:33:35 +0000 (09:33 -0700)]
Merge tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to avoid dropping some of the internal pseudo-extensions, which
   breaks *envcfg dependency parsing

 - The kernel entry address is now aligned in purgatory, which avoids a
   misaligned load that can lead to crash on systems that don't support
   misaligned accesses early in boot

 - The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of
   perf JSON configurations, one of them been updated to
   FW_SFENCE_VMA_ASID_SENT

 - The starfive cache driver is now restricted to 64-bit systems, as it
   isn't 32-bit clean

 - A fix for to avoid aliasing legacy-mode perf counters with software
   perf counters

 - VM_FAULT_SIGSEGV is now handled in the page fault code

 - A fix for stalls during CPU hotplug due to IPIs being disabled

 - A fix for memblock bounds checking. This manifests as a crash on
   systems with discontinuous memory maps that have regions that don't
   fit in the linear map

* tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix linear mapping checks for non-contiguous memory regions
  RISC-V: Enable the IPI before workqueue_online_cpu()
  riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
  perf: riscv: Fix selecting counters in legacy mode
  cache: StarFive: Require a 64-bit system
  perf arch events: Fix duplicate RISC-V SBI firmware event name
  riscv/purgatory: align riscv_kernel_entry
  riscv: cpufeature: Do not drop Linux-internal extensions

14 months agoMerge tag 'kvm-riscv-fixes-6.11-1' of https://github.com/kvm-riscv/linux into HEAD
Paolo Bonzini [Fri, 2 Aug 2024 16:31:29 +0000 (12:31 -0400)]
Merge tag 'kvm-riscv-fixes-6.11-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.11, take #1

- Fix compile error in get-reg-list selftests

14 months agoMerge tag 's390-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 2 Aug 2024 16:29:54 +0000 (09:29 -0700)]
Merge tag 's390-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - remove unused empty CPU alternatives header file

 - fix recently and erroneously removed exception handling when loading
   an invalid floating point register

 - ptdump fixes to reflect the recent changes due to the uncoupling of
   physical vs virtual kernel address spaces

 - changes to avoid the unnecessary splitting of large pages in kernel
   mappings

 - add the missing MODULE_DESCRIPTION for the CIO modules

* tag 's390-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: Keep inittext section writable
  s390/vmlinux.lds.S: Move ro_after_init section behind rodata section
  s390/mm: Get rid of RELOC_HIDE()
  s390/mm/ptdump: Improve sorting of markers
  s390/mm/ptdump: Add support for relocated lowcore mapping
  s390/mm/ptdump: Fix handling of identity mapping area
  s390/cio: Add missing MODULE_DESCRIPTION() macros
  s390/alternatives: Remove unused empty header file
  s390/fpu: Re-add exception handling in load_fpu_state()

14 months agoclocksource: Fix brown-bag boolean thinko in cs_watchdog_read()
Paul E. McKenney [Fri, 2 Aug 2024 15:46:15 +0000 (08:46 -0700)]
clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()

The current "nretries > 1 || nretries >= max_retries" check in
cs_watchdog_read() will always evaluate to true, and thus pr_warn(), if
nretries is greater than 1.  The intent is instead to never warn on the
first try, but otherwise warn if the successful retry was the last retry.

Therefore, change that "||" to "&&".

Fixes: db3a34e17433 ("clocksource: Retry clock read if long delays detected")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240802154618.4149953-2-paulmck@kernel.org
14 months agoMerge tag 'asm-generic-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 2 Aug 2024 16:14:48 +0000 (09:14 -0700)]
Merge tag 'asm-generic-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic fixes from Arnd Bergmann:
 "These are three important bug fixes for the cross-architecture tree,
  fixing a regression with the new syscall.tbl file, the inconsistent
  numbering for the new uretprobe syscall and a bug with iowrite64be on
  alpha"

* tag 'asm-generic-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  syscalls: fix syscall macros for newfstat/newfstatat
  uretprobe: change syscall number, again
  alpha: fix ioread64be()/iowrite64be() helpers

14 months agoMerge tag 'sound-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 2 Aug 2024 16:04:57 +0000 (09:04 -0700)]
Merge tag 'sound-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A small collection of fixes:

   - Revert of FireWire changes that caused a long-time regression

   - Another long-time regression fix for AMD HDMI

   - MIDI2 UMP fixes

   - HD-audio Conexant codec fixes and a quirk"

* tag 'sound-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda: Conditionally use snooping for AMD HDMI
  ALSA: usb-audio: Correct surround channels in UAC1 channel map
  ALSA: seq: ump: Explicitly reset RPN with Null RPN
  ALSA: seq: ump: Transmit RPN/NRPN message at each MSB/LSB data reception
  ALSA: seq: ump: Use the common RPN/bank conversion context
  ALSA: ump: Explicitly reset RPN with Null RPN
  ALSA: ump: Transmit RPN/NRPN message at each MSB/LSB data reception
  Revert "ALSA: firewire-lib: operate for period elapse event in process context"
  Revert "ALSA: firewire-lib: obsolete workqueue for period update"
  ALSA: hda/realtek: Add quirk for Acer Aspire E5-574G
  ALSA: seq: ump: Optimize conversions from SysEx to UMP
  ALSA: hda/conexant: Mute speakers at suspend / shutdown
  ALSA: hda/generic: Add a helper to mute speakers at suspend/shutdown
  ALSA: hda: conexant: Fix headset auto detect fail in the polling mode

14 months agoMerge tag 'drm-fixes-2024-08-02' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 2 Aug 2024 15:59:09 +0000 (08:59 -0700)]
Merge tag 'drm-fixes-2024-08-02' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Regular weekly fixes. This is a bit larger than usual but doesn't seem
  too crazy.

  Most of it is vmwgfx changes that fix a bunch of issues with wayland
  userspaces with dma-buf/external buffers and modesetting fixes.

  Otherwise it's kinda spread out, v3d fixes some new ioctls, nouveau
  has regression revert and fixes, amdgpu, i915 and ast have some small
  fixes, and some core fixes spread about.

  client:
   - fix error code

  atomic:
   - allow damage clips with async flips
   - allow explicit sync with async flips

  kselftests:
   - fix dmabuf-heaps test

  panic:
   - fix schedule_work in panic paths

  panel:
   - fix OrangePi Neo orientation

  gpuvm:
   - fix missing dependency

  amdgpu:
   - SMU 14.x update
   - Fix contiguous VRAM handling for IB parsing
   - GFX 12 fix
   - Regression fix for old APUs

  i915:
   - Static analysis fix for int overflow
   - Fix for HDCP2_STREAM_STATUS macro and removal of PWR_CLK_STATE for gen12

  nouveau:
   - revert busy wait change that caused a resume regression
   - fix buffer placement fault on dynamic pm s/r
   - fix refcount underflow

  ast:
   - fix black screen on resume
   - wake during connector status detect

  v3d:
   - fix issues with perf/timestamp ioctls

  vmwgfx:
   - fix deadlock in dma-buf fence polling
   - fix screen surface refcounting
   - fix dumb buffer handling
   - fix support for external buffers
   - fix overlay with screen targets
   - trigger modeset on screen moves"

* tag 'drm-fixes-2024-08-02' of https://gitlab.freedesktop.org/drm/kernel: (31 commits)
  Revert "nouveau: rip out busy fence waits"
  nouveau: set placement to original placement on uvmm validate.
  drm/atomic: Allow userspace to use damage clips with async flips
  drm/atomic: Allow userspace to use explicit sync with atomic async flips
  drm/i915: Fix possible int overflow in skl_ddi_calculate_wrpll()
  drm/i915/hdcp: Fix HDCP2_STREAM_STATUS macro
  drm/ast: astdp: Wake up during connector status detection
  i915/perf: Remove code to update PWR_CLK_STATE for gen12
  kselftests: dmabuf-heaps: Ensure the driver name is null-terminated
  drm/client: Fix error code in drm_client_buffer_vmap_local()
  drm/amdgpu: Fix APU handling in amdgpu_pm_load_smu_firmware()
  drm/amdgpu: increase mes log buffer size for gfx12
  drm/amdgpu: fix contiguous handling for IB parsing v2
  drm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3
  drm/vmwgfx: Trigger a modeset when the screen moves
  drm/vmwgfx: Fix overlay when using Screen Targets
  drm/vmwgfx: Add basic support for external buffers
  drm/vmwgfx: Fix handling of dumb buffers
  drm/vmwgfx: Make sure the screen surface is ref counted
  drm/vmwgfx: Fix a deadlock in dma buf fence polling
  ...

14 months agocifs: update internal version number
Steve French [Fri, 26 Jul 2024 23:44:16 +0000 (18:44 -0500)]
cifs: update internal version number

To 2.50

Signed-off-by: Steve French <stfrench@microsoft.com>
14 months agosmb: client: fix FSCTL_GET_REPARSE_POINT against NetApp
Paulo Alcantara [Thu, 1 Aug 2024 21:12:39 +0000 (18:12 -0300)]
smb: client: fix FSCTL_GET_REPARSE_POINT against NetApp

NetApp server requires the file to be open with FILE_READ_EA access in
order to support FSCTL_GET_REPARSE_POINT, otherwise it will return
STATUS_INVALID_DEVICE_REQUEST.  It doesn't make any sense because
there's no requirement for FILE_READ_EA bit to be set nor
STATUS_INVALID_DEVICE_REQUEST being used for something other than
"unsupported reparse points" in MS-FSA.

To fix it and improve compatibility, set FILE_READ_EA & SYNCHRONIZE
bits to match what Windows client currently does.

Tested-by: Sebastian Steinbeisser <Sebastian.Steinbeisser@lrz.de>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
14 months agosmb3: add dynamic tracepoints for shutdown ioctl
Steve French [Tue, 30 Jul 2024 05:26:21 +0000 (00:26 -0500)]
smb3: add dynamic tracepoints for shutdown ioctl

For debugging an umount failure in xfstests generic/043 generic/044 in some
configurations, we needed more information on the shutdown ioctl which
was suspected of being related to the cause, so tracepoints are added
in this patch e.g.

  "trace-cmd record -e smb3_shutdown_enter -e smb3_shutdown_done -e smb3_shutdown_err"

Sample output:
  godown-47084   [011] .....  3313.756965: smb3_shutdown_enter: flags=0x1 tid=0x733b3e75
  godown-47084   [011] .....  3313.756968: smb3_shutdown_done: flags=0x1 tid=0x733b3e75

Tested-by: Anthony Nandaa (Microsoft) <profnandaa@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
14 months agocifs: Remove cifs_aio_ctx
David Howells [Wed, 31 Jul 2024 10:30:00 +0000 (11:30 +0100)]
cifs: Remove cifs_aio_ctx

Remove struct cifs_aio_ctx and its associated alloc/release functions as it
is no longer used, the functions being taken over by netfslib.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
14 months agosmb: client: handle lack of FSCTL_GET_REPARSE_POINT support
Paulo Alcantara [Wed, 31 Jul 2024 13:23:39 +0000 (10:23 -0300)]
smb: client: handle lack of FSCTL_GET_REPARSE_POINT support

As per MS-FSA 2.1.5.10.14, support for FSCTL_GET_REPARSE_POINT is
optional and if the server doesn't support it,
STATUS_INVALID_DEVICE_REQUEST must be returned for the operation.

If we find files with reparse points and we can't read them due to
lack of client or server support, just ignore it and then treat them
as regular files or junctions.

Fixes: 5f71ebc41294 ("smb: client: parse reparse point flag in create response")
Reported-by: Sebastian Steinbeisser <Sebastian.Steinbeisser@lrz.de>
Tested-by: Sebastian Steinbeisser <Sebastian.Steinbeisser@lrz.de>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
14 months agoMerge tag 'ata-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata...
Linus Torvalds [Fri, 2 Aug 2024 15:54:16 +0000 (08:54 -0700)]
Merge tag 'ata-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fix from Damien Le Moal:

 - Add missing power-domains property to the device tree bindings for
   the Rockchip Designware AHCI adapter (from Heiko)

* tag 'ata-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  dt-bindings: ata: rockchip-dwc-ahci: add missing power-domains

14 months agoMerge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 2 Aug 2024 15:52:27 +0000 (08:52 -0700)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fix from Al Viro:
 "do_dup2() out-of-bounds array speculation fix"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  protect the fetch of ->fd[fd] in do_dup2() from mispredictions

14 months agoarm64: jump_label: Ensure patched jump_labels are visible to all CPUs
Will Deacon [Wed, 31 Jul 2024 13:36:01 +0000 (14:36 +0100)]
arm64: jump_label: Ensure patched jump_labels are visible to all CPUs

Although the Arm architecture permits concurrent modification and
execution of NOP and branch instructions, it still requires some
synchronisation to ensure that other CPUs consistently execute the newly
written instruction:

 >  When the modified instructions are observable, each PE that is
 >  executing the modified instructions must execute an ISB or perform a
 >  context synchronizing event to ensure execution of the modified
 >  instructions

Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
patching jump labels"), the arm64 jump_label patching machinery
performed synchronisation using stop_machine() after each modification,
however this was problematic when flipping static keys from atomic
contexts (namely, the arm_arch_timer CPU hotplug startup notifier) and
so we switched to the _nosync() patching routines to avoid "scheduling
while atomic" BUG()s during boot.

In hindsight, the analysis of the issue in f6cc0c501649 isn't quite
right: it cites the use of IPIs in the default patching routines as the
cause of the lockup, whereas stop_machine() does not rely on IPIs and
the I-cache invalidation is performed using __flush_icache_range(),
which elides the call to kick_all_cpus_sync(). In fact, the blocking
wait for other CPUs is what triggers the BUG() and the problem remains
even after f6cc0c501649, for example because we could block on the
jump_label_mutex. Eventually, the arm_arch_timer driver was fixed to
avoid the static key entirely in commit a862fc2254bd
("clocksource/arm_arch_timer: Remove use of workaround static key").

This all leaves the jump_label patching code in a funny situation on
arm64 as we do not synchronise with other CPUs to reduce the likelihood
of a bug which no longer exists. Consequently, toggling a static key on
one CPU cannot be assumed to take effect on other CPUs, leading to
potential issues, for example with missing preempt notifiers.

Rather than revert f6cc0c501649 and go back to stop_machine() for each
patch site, implement arch_jump_label_transform_apply() and kick all
the other CPUs with an IPI at the end of patching.

Cc: Alexander Potapenko <glider@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240731133601.3073-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agosyscalls: fix syscall macros for newfstat/newfstatat
Arnd Bergmann [Thu, 1 Aug 2024 12:27:23 +0000 (14:27 +0200)]
syscalls: fix syscall macros for newfstat/newfstatat

The __NR_newfstat and __NR_newfstatat macros accidentally got renamed
in the conversion to the syscall.tbl format, dropping the 'new' portion
of the name.

In an unrelated change, the two syscalls are no longer architecture
specific but are once more defined on all 64-bit architectures, so the
'newstat' ABI keyword can be dropped from the table as a simplification.

Fixes: Fixes: 4fe53bf2ba0a ("syscalls: add generic scripts/syscall.tbl")
Closes: https://lore.kernel.org/lkml/838053e0-b186-4e9f-9668-9a3384a71f23@app.fastmail.com/T/#t
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
14 months agouretprobe: change syscall number, again
Arnd Bergmann [Tue, 30 Jul 2024 15:30:40 +0000 (17:30 +0200)]
uretprobe: change syscall number, again

Despite multiple attempts to get the syscall number assignment right
for the newly added uretprobe syscall, we ended up with a bit of a mess:

 - The number is defined as 467 based on the assumption that the
   xattrat family of syscalls would use 463 through 466, but those
   did not make it into 6.11.

 - The include/uapi/asm-generic/unistd.h file still lists the number
   463, but the new scripts/syscall.tbl that was supposed to have the
   same data lists 467 instead as the number for arc, arm64, csky,
   hexagon, loongarch, nios2, openrisc and riscv. None of these
   architectures actually provide a uretprobe syscall.

 - All the other architectures (powerpc, arm, mips, ...) don't list
   this syscall at all.

There are two ways to make it consistent again: either list it with
the same syscall number on all architectures, or only list it on x86
but not in scripts/syscall.tbl and asm-generic/unistd.h.

Based on the most recent discussion, it seems like we won't need it
anywhere else, so just remove the inconsistent assignment and instead
move the x86 number to the next available one in the architecture
specific range, which is 335.

Fixes: 5c28424e9a34 ("syscalls: Fix to add sys_uretprobe to syscall.tbl")
Fixes: 190fec72df4a ("uprobe: Wire up uretprobe system call")
Fixes: 63ded110979b ("uprobe: Change uretprobe syscall scope and number")
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
14 months agothermal: core: Update thermal zone registration documentation
Rafael J. Wysocki [Thu, 1 Aug 2024 16:39:28 +0000 (18:39 +0200)]
thermal: core: Update thermal zone registration documentation

The thermal sysfs API document is outdated.  One of the problems with
it is that is still documents thermal_zone_device_register() which
does not exit any more and it does not reflect the current thermal
zone operations definition.

Replace the thermal_zone_device_register() description in it with
a thermal_zone_device_register_with_trips() description, including
an update of the thermal zone operations list.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/2767845.mvXUDI8C0e@rjwysocki.net
14 months agoRevert "nouveau: rip out busy fence waits"
Dave Airlie [Fri, 2 Aug 2024 04:38:28 +0000 (14:38 +1000)]
Revert "nouveau: rip out busy fence waits"

This reverts commit d45bb9c5f7a6f7b6e47939856b28cb1da0cdc119.

Just got a report that this causes some suspend/resume issues,
so back it out and I'll investigate it later.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
14 months agoMerge tag 'drm-misc-fixes-2024-08-01' of https://gitlab.freedesktop.org/drm/misc...
Dave Airlie [Fri, 2 Aug 2024 02:14:28 +0000 (12:14 +1000)]
Merge tag 'drm-misc-fixes-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

A couple drm_panic fixes, several v3d fixes to increase the new timestamp API
safety, several fixes for vmwgfx for various modesetting issues, PM fixes
for ast, async flips improvements and two fixes for nouveau to fix
resource refcounting and buffer placement.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801-interesting-antique-bat-2fe4c0@houat
14 months agoMerge tag 'drm-intel-fixes-2024-08-01' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Fri, 2 Aug 2024 01:19:14 +0000 (11:19 +1000)]
Merge tag 'drm-intel-fixes-2024-08-01' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Static analysis fix for int overflow
- Fix for HDCP2_STREAM_STATUS macro and removal of PWR_CLK_STATE for gen12

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZqslBkcZlInYdYgm@jlahtine-mobl.ger.corp.intel.com
14 months agoMerge tag 'amd-drm-fixes-6.11-2024-07-27' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 1 Aug 2024 22:21:34 +0000 (08:21 +1000)]
Merge tag 'amd-drm-fixes-6.11-2024-07-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.11-2024-07-27:

amdgpu:
- SMU 14.x update
- Fix contiguous VRAM handling for IB parsing
- GFX 12 fix
- Regression fix for old APUs

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240728025407.2115881-1-alexander.deucher@amd.com
14 months agoprotect the fetch of ->fd[fd] in do_dup2() from mispredictions
Al Viro [Thu, 1 Aug 2024 19:22:22 +0000 (15:22 -0400)]
protect the fetch of ->fd[fd] in do_dup2() from mispredictions

both callers have verified that fd is not greater than ->max_fds;
however, misprediction might end up with
        tofree = fdt->fd[fd];
being speculatively executed.  That's wrong for the same reasons
why it's wrong in close_fd()/file_close_fd_locked(); the same
solution applies - array_index_nospec(fd, fdt->max_fds) could differ
from fd only in case of speculative execution on mispredicted path.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
14 months agox86/uaccess: Zero the 8-byte get_range case on failure on 32-bit
David Gow [Wed, 31 Jul 2024 07:30:29 +0000 (15:30 +0800)]
x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit

While zeroing the upper 32 bits of an 8-byte getuser on 32-bit x86 was
fixed by commit 8c860ed825cb ("x86/uaccess: Fix missed zeroing of ia32 u64
get_user() range checking") it was broken again in commit 8a2462df1547
("x86/uaccess: Improve the 8-byte getuser() case").

This is because the register which holds the upper 32 bits (%ecx) is being
cleared _after_ the check_range, so if the range check fails, %ecx is never
cleared.

This can be reproduced with:
./tools/testing/kunit/kunit.py run --arch i386 usercopy

Instead, clear %ecx _before_ check_range in the 8-byte case. This
reintroduces a bit of the ugliness we were trying to avoid by adding
another #ifndef CONFIG_X86_64, but at least keeps check_range from needing
a separate bad_get_user_8 jump.

Fixes: 8a2462df1547 ("x86/uaccess: Improve the 8-byte getuser() case")
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/20240731073031.4045579-1-davidgow@google.com
14 months agoriscv: Fix linear mapping checks for non-contiguous memory regions
Stuart Menefy [Sat, 22 Jun 2024 11:42:16 +0000 (12:42 +0100)]
riscv: Fix linear mapping checks for non-contiguous memory regions

The RISC-V kernel already has checks to ensure that memory which would
lie outside of the linear mapping is not used. However those checks
use memory_limit, which is used to implement the mem= kernel command
line option (to limit the total amount of memory, not its address
range). When memory is made up of two or more non-contiguous memory
banks this check is incorrect.

Two changes are made here:
 - add a call in setup_bootmem() to memblock_cap_memory_range() which
   will cause any memory which falls outside the linear mapping to be
   removed from the memory regions.
 - remove the check in create_linear_mapping_page_table() which was
   intended to remove memory which is outside the liner mapping based
   on memory_limit, as it is no longer needed. Note a check for
   mapping more memory than memory_limit (to implement mem=) is
   unnecessary because of the existing call to
   memblock_enforce_memory_limit().

This issue was seen when booting on a SV39 platform with two memory
banks:
  0x00,80000000 1GiB
  0x20,00000000 32GiB
This memory range is 158GiB from top to bottom, but the linear mapping
is limited to 128GiB, so the lower block of RAM will be mapped at
PAGE_OFFSET, and the upper block straddles the top of the linear
mapping.

This causes the following Oops:
[    0.000000] Linux version 6.10.0-rc2-gd3b8dd5b51dd-dirty (stuart.menefy@codasip.com) (riscv64-codasip-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41.0.20231213) #20 SMP Sat Jun 22 11:34:22 BST 2024
[    0.000000] memblock_add: [0x0000000080000000-0x00000000bfffffff] early_init_dt_add_memory_arch+0x4a/0x52
[    0.000000] memblock_add: [0x0000002000000000-0x00000027ffffffff] early_init_dt_add_memory_arch+0x4a/0x52
...
[    0.000000] memblock_alloc_try_nid: 23724 bytes align=0x8 nid=-1 from=0x0000000000000000 max_addr=0x0000000000000000 early_init_dt_alloc_memory_arch+0x1e/0x48
[    0.000000] memblock_reserve: [0x00000027ffff5350-0x00000027ffffaffb] memblock_alloc_range_nid+0xb8/0x132
[    0.000000] Unable to handle kernel paging request at virtual address fffffffe7fff5350
[    0.000000] Oops [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.10.0-rc2-gd3b8dd5b51dd-dirty #20
[    0.000000] Hardware name: codasip,a70x (DT)
[    0.000000] epc : __memset+0x8c/0x104
[    0.000000]  ra : memblock_alloc_try_nid+0x74/0x84
[    0.000000] epc : ffffffff805e88c8 ra : ffffffff806148f6 sp : ffffffff80e03d50
[    0.000000]  gp : ffffffff80ec4158 tp : ffffffff80e0bec0 t0 : fffffffe7fff52f8
[    0.000000]  t1 : 00000027ffffb000 t2 : 5f6b636f6c626d65 s0 : ffffffff80e03d90
[    0.000000]  s1 : 0000000000005cac a0 : fffffffe7fff5350 a1 : 0000000000000000
[    0.000000]  a2 : 0000000000005cac a3 : fffffffe7fffaff8 a4 : 000000000000002c
[    0.000000]  a5 : ffffffff805e88c8 a6 : 0000000000005cac a7 : 0000000000000030
[    0.000000]  s2 : fffffffe7fff5350 s3 : ffffffffffffffff s4 : 0000000000000000
[    0.000000]  s5 : ffffffff8062347e s6 : 0000000000000000 s7 : 0000000000000001
[    0.000000]  s8 : 0000000000002000 s9 : 00000000800226d0 s10: 0000000000000000
[    0.000000]  s11: 0000000000000000 t3 : ffffffff8080a928 t4 : ffffffff8080a928
[    0.000000]  t5 : ffffffff8080a928 t6 : ffffffff8080a940
[    0.000000] status: 0000000200000100 badaddr: fffffffe7fff5350 cause: 000000000000000f
[    0.000000] [<ffffffff805e88c8>] __memset+0x8c/0x104
[    0.000000] [<ffffffff8062349c>] early_init_dt_alloc_memory_arch+0x1e/0x48
[    0.000000] [<ffffffff8043e892>] __unflatten_device_tree+0x52/0x114
[    0.000000] [<ffffffff8062441e>] unflatten_device_tree+0x9e/0xb8
[    0.000000] [<ffffffff806046fe>] setup_arch+0xd4/0x5bc
[    0.000000] [<ffffffff806007aa>] start_kernel+0x76/0x81a
[    0.000000] Code: b823 02b2 bc23 02b2 b023 04b2 b423 04b2 b823 04b2 (bc23) 04b2
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

The problem is that memblock (unaware that some physical memory cannot
be used) has allocated memory from the top of memory but which is
outside the linear mapping region.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Fixes: c99127c45248 ("riscv: Make sure the linear mapping does not use the kernel mapping")
Reviewed-by: David McKay <david.mckay@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240622114217.2158495-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoMerge tag 'pci-v6.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Linus Torvalds [Thu, 1 Aug 2024 18:30:15 +0000 (11:30 -0700)]
Merge tag 'pci-v6.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

 - Fix a pci_intx() regression that caused driver reload to fail with
   "Resources present before probing" (Philipp Stanner)

 - Fix a pciehp regression that clobbered the upper bits of RAID status
   LEDs on NVMe devices behind an Intel VMD (Blazej Kucman)

* tag 'pci-v6.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: pciehp: Retain Power Indicator bits for userspace indicators
  PCI: Fix devres regression in pci_intx()

14 months agoKVM: x86/mmu: fix determination of max NPT mapping level for private pages
Ackerley Tng [Thu, 1 Aug 2024 17:39:55 +0000 (17:39 +0000)]
KVM: x86/mmu: fix determination of max NPT mapping level for private pages

The `if (req_max_level)` test was meant ignore req_max_level if
PG_LEVEL_NONE was returned. Hence, this function should return
max_level instead of the ignored req_max_level.

This is only a latent issue for now, since guest_memfd does not
support large pages.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Message-ID: <20240801173955.1975034-1-ackerleytng@google.com>
Fixes: f32fb32820b1 ("KVM: x86: Add hook for determining max NPT mapping level")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
14 months agoPCI: pciehp: Retain Power Indicator bits for userspace indicators
Blazej Kucman [Mon, 22 Jul 2024 14:14:40 +0000 (16:14 +0200)]
PCI: pciehp: Retain Power Indicator bits for userspace indicators

The sysfs "attention" file normally controls the Slot Control Attention
Indicator with 0 (off), 1 (on), 2 (blink) settings.

576243b3f9ea ("PCI: pciehp: Allow exclusive userspace control of
indicators") added pciehp_set_raw_indicator_status() to allow userspace to
directly control all four bits in both the Attention Indicator and the
Power Indicator fields via the "attention" file.

This is used on Intel VMD bridges so utilities like "ledmon" can use sysfs
"attention" to control up to 16 indicators for NVMe device RAID status.

abaaac4845a0 ("PCI: hotplug: Use FIELD_GET/PREP()") broke this by masking
the sysfs data with PCI_EXP_SLTCTL_AIC, which discards the upper two bits
intended for the Power Indicator Control field (PCI_EXP_SLTCTL_PIC).

For NVMe devices behind an Intel VMD, ledmon settings that use the
PCI_EXP_SLTCTL_PIC bits, i.e., ATTENTION_REBUILD (0x5), ATTENTION_LOCATE
(0x7), ATTENTION_FAILURE (0xD), ATTENTION_OFF (0xF), no longer worked
correctly.

Mask with PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC to retain both the
Attention Indicator and the Power Indicator bits.

Fixes: abaaac4845a0 ("PCI: hotplug: Use FIELD_GET/PREP()")
Link: https://lore.kernel.org/r/20240722141440.7210-1-blazej.kucman@intel.com
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v6.7+
14 months agoPCI: Fix devres regression in pci_intx()
Philipp Stanner [Thu, 25 Jul 2024 12:07:30 +0000 (14:07 +0200)]
PCI: Fix devres regression in pci_intx()

pci_intx() becomes managed if pcim_enable_device() has been called in
advance. Commit 25216afc9db5 ("PCI: Add managed pcim_intx()") changed this
behavior so that pci_intx() always leads to creation of a separate device
resource for itself, whereas earlier, a shared resource was used for all
PCI devres operations.

Unfortunately, pci_intx() seems to be used in some drivers' remove() paths;
in the managed case this causes a device resource to be created on driver
detach, which causes .probe() to fail if the driver is reloaded:

  pci 0000:00:1f.2: Resources present before probing

Fix the regression by only redirecting pci_intx() to its managed twin
pcim_intx() if the pci_command changes.

Link: https://lore.kernel.org/r/20240725120729.59788-2-pstanner@redhat.com
Fixes: 25216afc9db5 ("PCI: Add managed pcim_intx()")
Reported-by: Damien Le Moal <dlemoal@kernel.org>
Closes: https://lore.kernel.org/all/b8f4ba97-84fc-4b7e-ba1a-99de2d9f0118@kernel.org/
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
[bhelgaas: add error message to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
14 months agoMerge tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 1 Aug 2024 16:42:09 +0000 (09:42 -0700)]
Merge tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from wireless, bleutooth, BPF and netfilter.

  Current release - regressions:

   - core: drop bad gso csum_start and offset in virtio_net_hdr

   - wifi: mt76: fix null pointer access in mt792x_mac_link_bss_remove

   - eth: tun: add missing bpf_net_ctx_clear() in do_xdp_generic()

   - phy: aquantia: only poll GLOBAL_CFG regs on aqr113, aqr113c and
     aqr115c

  Current release - new code bugs:

   - smc: prevent UAF in inet_create()

   - bluetooth: btmtk: fix kernel crash when entering btmtk_usb_suspend

   - eth: bnxt: reject unsupported hash functions

  Previous releases - regressions:

   - sched: act_ct: take care of padding in struct zones_ht_key

   - netfilter: fix null-ptr-deref in iptable_nat_table_init().

   - tcp: adjust clamping window for applications specifying SO_RCVBUF

  Previous releases - always broken:

   - ethtool: rss: small fixes to spec and GET

   - mptcp:
      - fix signal endpoint re-add
      - pm: fix backup support in signal endpoints

   - wifi: ath12k: fix soft lockup on suspend

   - eth: bnxt_en: fix RSS logic in __bnxt_reserve_rings()

   - eth: ice: fix AF_XDP ZC timeout and concurrency issues

   - eth: mlx5:
      - fix missing lock on sync reset reload
      - fix error handling in irq_pool_request_irq"

* tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
  mptcp: fix duplicate data handling
  mptcp: fix bad RCVPRUNED mib accounting
  ipv6: fix ndisc_is_useropt() handling for PIO
  igc: Fix double reset adapter triggered from a single taprio cmd
  net: MAINTAINERS: Demote Qualcomm IPA to "maintained"
  net: wan: fsl_qmc_hdlc: Discard received CRC
  net: wan: fsl_qmc_hdlc: Convert carrier_lock spinlock to a mutex
  net/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys
  net/mlx5e: Fix CT entry update leaks of modify header context
  net/mlx5e: Require mlx5 tc classifier action support for IPsec prio capability
  net/mlx5: Fix missing lock on sync reset reload
  net/mlx5: Lag, don't use the hardcoded value of the first port
  net/mlx5: DR, Fix 'stack guard page was hit' error in dr_rule
  net/mlx5: Fix error handling in irq_pool_request_irq
  net/mlx5: Always drain health in shutdown callback
  net: Add skbuff.h to MAINTAINERS
  r8169: don't increment tx_dropped in case of NETDEV_TX_BUSY
  netfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init().
  netfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init().
  net: drop bad gso csum_start and offset in virtio_net_hdr
  ...

14 months agorust: SHADOW_CALL_STACK is incompatible with Rust
Alice Ryhl [Mon, 29 Jul 2024 14:22:49 +0000 (14:22 +0000)]
rust: SHADOW_CALL_STACK is incompatible with Rust

When using the shadow call stack sanitizer, all code must be compiled
with the -ffixed-x18 flag, but this flag is not currently being passed
to Rust. This results in crashes that are extremely difficult to debug.

To ensure that nobody else has to go through the same debugging session
that I had to, prevent configurations that enable both SHADOW_CALL_STACK
and RUST.

It is rather common for people to backport 724a75ac9542 ("arm64: rust:
Enable Rust support for AArch64"), so I recommend applying this fix all
the way back to 6.1.

Cc: stable@vger.kernel.org # 6.1 and later
Fixes: 724a75ac9542 ("arm64: rust: Enable Rust support for AArch64")
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://lore.kernel.org/r/20240729-shadow-call-stack-v4-1-2a664b082ea4@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoarm64: errata: Expand speculative SSBS workaround (again)
Mark Rutland [Thu, 1 Aug 2024 10:18:03 +0000 (11:18 +0100)]
arm64: errata: Expand speculative SSBS workaround (again)

A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
special-purpose register does not affect subsequent speculative
instructions, permitting speculative store bypassing for a window of
time.

We worked around this for a number of CPUs in commits:

7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")

Since then, similar errata have been published for a number of other Arm
Ltd CPUs, for which the same mitigation is sufficient. This is described
in their respective Software Developer Errata Notice (SDEN) documents:

* Cortex-A76 (MP052) SDEN v31.0, erratum 3324349
  https://developer.arm.com/documentation/SDEN-885749/3100/

* Cortex-A77 (MP074) SDEN v19.0, erratum 3324348
  https://developer.arm.com/documentation/SDEN-1152370/1900/

* Cortex-A78 (MP102) SDEN v21.0, erratum 3324344
  https://developer.arm.com/documentation/SDEN-1401784/2100/

* Cortex-A78C (MP138) SDEN v16.0, erratum 3324346
  https://developer.arm.com/documentation/SDEN-1707916/1600/

* Cortex-A78C (MP154) SDEN v10.0, erratum 3324347
  https://developer.arm.com/documentation/SDEN-2004089/1000/

* Cortex-A725 (MP190) SDEN v5.0, erratum 3456106
  https://developer.arm.com/documentation/SDEN-2832921/0500/

* Cortex-X1 (MP077) SDEN v21.0, erratum 3324344
  https://developer.arm.com/documentation/SDEN-1401782/2100/

* Cortex-X1C (MP136) SDEN v16.0, erratum 3324346
  https://developer.arm.com/documentation/SDEN-1707914/1600/

* Neoverse-N1 (MP050) SDEN v32.0, erratum 3324349
  https://developer.arm.com/documentation/SDEN-885747/3200/

* Neoverse-V1 (MP076) SDEN v19.0, erratum 3324341
  https://developer.arm.com/documentation/SDEN-1401781/1900/

Note that due to the manner in which Arm develops IP and tracks errata,
some CPUs share a common erratum number and some CPUs have multiple
erratum numbers for the same HW issue.

On parts without SB, it is necessary to use ISB for the workaround. The
spec_bar() macro used in the mitigation will expand to a "DSB SY; ISB"
sequence in this case, which is sufficient on all affected parts.

Enable the existing mitigation by adding the relevant MIDRs to
erratum_spec_ssbs_list. The list is sorted alphanumerically (involving
moving Neoverse-V3 after Neoverse-V2) so that this is easy to audit and
potentially extend again in future. The Kconfig text is also updated to
clarify the set of affected parts and the mitigation.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240801101803.1982459-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoarm64: cputype: Add Cortex-A725 definitions
Mark Rutland [Thu, 1 Aug 2024 10:18:02 +0000 (11:18 +0100)]
arm64: cputype: Add Cortex-A725 definitions

Add cputype definitions for Cortex-A725. These will be used for errata
detection in subsequent patches.

These values can be found in the Cortex-A725 TRM:

  https://developer.arm.com/documentation/107652/0001/

... in table A-247 ("MIDR_EL1 bit descriptions").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240801101803.1982459-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoarm64: cputype: Add Cortex-X1C definitions
Mark Rutland [Thu, 1 Aug 2024 10:18:01 +0000 (11:18 +0100)]
arm64: cputype: Add Cortex-X1C definitions

Add cputype definitions for Cortex-X1C. These will be used for errata
detection in subsequent patches.

These values can be found in the Cortex-X1C TRM:

  https://developer.arm.com/documentation/101968/0002/

... in section B2.107 ("MIDR_EL1, Main ID Register, EL1").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240801101803.1982459-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoRISC-V: Enable the IPI before workqueue_online_cpu()
Nick Hu [Wed, 17 Jul 2024 03:17:14 +0000 (11:17 +0800)]
RISC-V: Enable the IPI before workqueue_online_cpu()

Sometimes the hotplug cpu stalls at the arch_cpu_idle() for a while after
workqueue_online_cpu(). When cpu stalls at the idle loop, the reschedule
IPI is pending. However the enable bit is not enabled yet so the cpu stalls
at WFI until watchdog timeout. Therefore enable the IPI before the
workqueue_online_cpu() to fix the issue.

Fixes: 63c5484e7495 ("workqueue: Add multiple affinity scopes and interface to select them")
Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240717031714.1946036-1-nick.hu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoriscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
Zhe Qiao [Wed, 31 Jul 2024 08:45:47 +0000 (16:45 +0800)]
riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()

Handle VM_FAULT_SIGSEGV in the page fault path so that we correctly
kill the process and we don't BUG() the kernel.

Fixes: 07037db5d479 ("RISC-V: Paging and MMU")
Signed-off-by: Zhe Qiao <qiaozhe@iscas.ac.cn>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240731084547.85380-1-qiaozhe@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoperf: riscv: Fix selecting counters in legacy mode
Shifrin Dmitry [Mon, 29 Jul 2024 12:58:58 +0000 (15:58 +0300)]
perf: riscv: Fix selecting counters in legacy mode

It is required to check event type before checking event config.
Events with the different types can have the same config.
This check is missed for legacy mode code

For such perf usage:
    sysctl -w kernel.perf_user_access=2
    perf stat -e cycles,L1-dcache-loads --
driver will try to force both events to CYCLE counter.

This commit implements event type check before forcing
events on the special counters.

Signed-off-by: Shifrin Dmitry <dmitry.shifrin@syntacore.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend")
Link: https://lore.kernel.org/r/20240729125858.630653-1-dmitry.shifrin@syntacore.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agocache: StarFive: Require a 64-bit system
Palmer Dabbelt [Mon, 22 Jul 2024 15:45:20 +0000 (08:45 -0700)]
cache: StarFive: Require a 64-bit system

This has a bunch of {read,write}q() calls, so it won't work on 32-bit
systems.  I don't think there's any 32-bit StarFive systems, so for now
just require 64-bit.

Fixes: cabff60ca77d ("cache: Add StarFive StarLink cache management")
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20240722154519.25375-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoperf arch events: Fix duplicate RISC-V SBI firmware event name
Eric Lin [Fri, 19 Jul 2024 11:50:18 +0000 (19:50 +0800)]
perf arch events: Fix duplicate RISC-V SBI firmware event name

Currently, the RISC-V firmware JSON file has duplicate event name
"FW_SFENCE_VMA_RECEIVED". According to the RISC-V SBI PMU extension[1],
the event name should be "FW_SFENCE_VMA_ASID_SENT".

Before this patch:
$ perf list

firmware:
  fw_access_load
       [Load access trap event. Unit: cpu]
  fw_access_store
       [Store access trap event. Unit: cpu]
....
 fw_set_timer
       [Set timer event. Unit: cpu]
  fw_sfence_vma_asid_received
       [Received SFENCE.VMA with ASID request from other HART event. Unit: cpu]
  fw_sfence_vma_received
       [Sent SFENCE.VMA with ASID request to other HART event. Unit: cpu]

After this patch:
$ perf list

firmware:
  fw_access_load
       [Load access trap event. Unit: cpu]
  fw_access_store
       [Store access trap event. Unit: cpu]
.....
  fw_set_timer
       [Set timer event. Unit: cpu]
  fw_sfence_vma_asid_received
       [Received SFENCE.VMA with ASID request from other HART event. Unit: cpu]
  fw_sfence_vma_asid_sent
       [Sent SFENCE.VMA with ASID request to other HART event. Unit: cpu]
  fw_sfence_vma_received
       [Received SFENCE.VMA request from other HART event. Unit: cpu]

Link: https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc#event-firmware-events-type-15
Fixes: 8f0dcb4e7364 ("perf arch events: riscv sbi firmware std event files")
Fixes: c4f769d4093d ("perf vendor events riscv: add Sifive U74 JSON file")
Fixes: acbf6de674ef ("perf vendor events riscv: Add StarFive Dubhe-80 JSON file")
Fixes: 7340c6df49df ("perf vendor events riscv: add T-HEAD C9xx JSON file")
Fixes: f5102e31c209 ("riscv: andes: Support specifying symbolic firmware and hardware raw event")
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Nikita Shubin <n.shubin@yadro.com>
Reviewed-by: Inochi Amaoto <inochiama@outlook.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240719115018.27356-1-eric.lin@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoriscv/purgatory: align riscv_kernel_entry
Daniel Maslowski [Fri, 19 Jul 2024 17:04:37 +0000 (19:04 +0200)]
riscv/purgatory: align riscv_kernel_entry

When alignment handling is delegated to the kernel, everything must be
word-aligned in purgatory, since the trap handler is then set to the
kexec one. Without the alignment, hitting the exception would
ultimately crash. On other occasions, the kernel's handler would take
care of exceptions.
This has been tested on a JH7110 SoC with oreboot and its SBI delegating
unaligned access exceptions and the kernel configured to handle them.

Fixes: 736e30af583fb ("RISC-V: Add purgatory")
Signed-off-by: Daniel Maslowski <cyrevolt@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240719170437.247457-1-cyrevolt@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoalpha: fix ioread64be()/iowrite64be() helpers
Arnd Bergmann [Mon, 29 Jul 2024 21:44:59 +0000 (23:44 +0200)]
alpha: fix ioread64be()/iowrite64be() helpers

Compile-testing the crypto/caam driver on alpha showed a pre-existing
problem on alpha with iowrite64be() missing:

ERROR: modpost: "iowrite64be" [drivers/crypto/caam/caam_jr.ko] undefined!

The prototypes were added a while ago when we started using asm-generic/io.h,
but the implementation was still missing. At some point the ioread64/iowrite64
helpers were added, but the big-endian versions are still missing, and
the generic version (using readq/writeq) is would not work here.

Change it to wrap ioread64()/iowrite64() instead.

Fixes: beba3771d9e0 ("crypto: caam: Make CRYPTO_DEV_FSL_CAAM dependent of COMPILE_TEST")
Fixes: e19d4ebc536d ("alpha: add full ioread64/iowrite64 implementation")
Fixes: 7e772dad9913 ("alpha: Use generic <asm-generic/io.h>")
Closes: https://lore.kernel.org/all/CAHk-=wgEyzSxTs467NDOVfBSzWvUS6ztcwhiy=M3xog==KBmTw@mail.gmail.com/
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
14 months agox86/mm: Fix pti_clone_entry_text() for i386
Peter Zijlstra [Thu, 1 Aug 2024 10:42:25 +0000 (12:42 +0200)]
x86/mm: Fix pti_clone_entry_text() for i386

While x86_64 has PMD aligned text sections, i386 does not have this
luxery. Notably ALIGN_ENTRY_TEXT_END is empty and _etext has PAGE
alignment.

This means that text on i386 can be page granular at the tail end,
which in turn means that the PTI text clones should consistently
account for this.

Make pti_clone_entry_text() consistent with pti_clone_kernel_text().

Fixes: 16a3fe634f6a ("x86/mm/pti: Clone kernel-image on PTE level for 32 bit")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
14 months agox86/mm: Fix pti_clone_pgtable() alignment assumption
Peter Zijlstra [Wed, 31 Jul 2024 16:31:05 +0000 (18:31 +0200)]
x86/mm: Fix pti_clone_pgtable() alignment assumption

Guenter reported dodgy crashes on an i386-nosmp build using GCC-11
that had the form of endless traps until entry stack exhaust and then
#DF from the stack guard.

It turned out that pti_clone_pgtable() had alignment assumptions on
the start address, notably it hard assumes start is PMD aligned. This
is true on x86_64, but very much not true on i386.

These assumptions can cause the end condition to malfunction, leading
to a 'short' clone. Guess what happens when the user mapping has a
short copy of the entry text?

Use the correct increment form for addr to avoid alignment
assumptions.

Fixes: 16a3fe634f6a ("x86/mm/pti: Clone kernel-image on PTE level for 32 bit")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240731163105.GG33588@noisy.programming.kicks-ass.net
14 months agoceph: force sending a cap update msg back to MDS for revoke op
Xiubo Li [Fri, 12 Jul 2024 04:40:19 +0000 (12:40 +0800)]
ceph: force sending a cap update msg back to MDS for revoke op

If a client sends out a cap update dropping caps with the prior 'seq'
just before an incoming cap revoke request, then the client may drop
the revoke because it believes it's already released the requested
capabilities.

This causes the MDS to wait indefinitely for the client to respond
to the revoke. It's therefore always a good idea to ack the cap
revoke request with the bumped up 'seq'.

Currently if the cap->issued equals to the newcaps the check_caps()
will do nothing, we should force flush the caps.

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/61782
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
14 months agoMerge branch 'mptcp-fix-duplicate-data-handling'
Paolo Abeni [Thu, 1 Aug 2024 10:30:15 +0000 (12:30 +0200)]
Merge branch 'mptcp-fix-duplicate-data-handling'

Matthieu Baerts says:

====================
mptcp: fix duplicate data handling

In some cases, the subflow-level's copied_seq counter was incorrectly
increased, leading to an unexpected subflow reset.

Patch 1/2 fixes the RCVPRUNED MIB counter that was attached to the wrong
event since its introduction in v5.14, backported to v5.11.

Patch 2/2 fixes the copied_seq counter issues, is present since v5.10.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================

Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-dup-data-v1-0-bde833fa628a@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agomptcp: fix duplicate data handling
Paolo Abeni [Wed, 31 Jul 2024 10:10:15 +0000 (12:10 +0200)]
mptcp: fix duplicate data handling

When a subflow receives and discards duplicate data, the mptcp
stack assumes that the consumed offset inside the current skb is
zero.

With multiple subflows receiving data simultaneously such assertion
does not held true. As a result the subflow-level copied_seq will
be incorrectly increased and later on the same subflow will observe
a bad mapping, leading to subflow reset.

Address the issue taking into account the skb consumed offset in
mptcp_subflow_discard_data().

Fixes: 04e4cd4f7ca4 ("mptcp: cleanup mptcp_subflow_discard_data()")
Cc: stable@vger.kernel.org
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/501
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agomptcp: fix bad RCVPRUNED mib accounting
Paolo Abeni [Wed, 31 Jul 2024 10:10:14 +0000 (12:10 +0200)]
mptcp: fix bad RCVPRUNED mib accounting

Since its introduction, the mentioned MIB accounted for the wrong
event: wake-up being skipped as not-needed on some edge condition
instead of incoming skb being dropped after landing in the (subflow)
receive queue.

Move the increment in the correct location.

Fixes: ce599c516386 ("mptcp: properly account bulk freed memory")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMerge tag 'nf-24-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni [Thu, 1 Aug 2024 10:08:28 +0000 (12:08 +0200)]
Merge tag 'nf-24-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

Fix a possible null-ptr-deref sometimes triggered by iptables-restore at
boot time. Register iptables {ipv4,ipv6} nat table pernet in first place
to fix this issue. Patch #1 and #2 from Kuniyuki Iwashima.

netfilter pull request 24-07-31

* tag 'nf-24-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init().
  netfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init().
====================

Link: https://patch.msgid.link/20240731213046.6194-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoipv6: fix ndisc_is_useropt() handling for PIO
Maciej Żenczykowski [Tue, 30 Jul 2024 00:17:48 +0000 (17:17 -0700)]
ipv6: fix ndisc_is_useropt() handling for PIO

The current logic only works if the PIO is between two
other ND user options.  This fixes it so that the PIO
can also be either before or after other ND user options
(for example the first or last option in the RA).

side note: there's actually Android tests verifying
a portion of the old broken behaviour, so:
  https://android-review.googlesource.com/c/kernel/tests/+/3196704
fixes those up.

Cc: Jen Linkova <furry@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Patrick Rohr <prohr@google.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Fixes: 048c796beb6e ("ipv6: adjust ndisc_is_useropt() to also return true for PIO")
Link: https://patch.msgid.link/20240730001748.147636-1-maze@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoigc: Fix double reset adapter triggered from a single taprio cmd
Faizal Rahim [Tue, 30 Jul 2024 17:33:02 +0000 (10:33 -0700)]
igc: Fix double reset adapter triggered from a single taprio cmd

Following the implementation of "igc: Add TransmissionOverrun counter"
patch, when a taprio command is triggered by user, igc processes two
commands: TAPRIO_CMD_REPLACE followed by TAPRIO_CMD_STATS. However, both
commands unconditionally pass through igc_tsn_offload_apply() which
evaluates and triggers reset adapter. The double reset causes issues in
the calculation of adapter->qbv_count in igc.

TAPRIO_CMD_REPLACE command is expected to reset the adapter since it
activates qbv. It's unexpected for TAPRIO_CMD_STATS to do the same
because it doesn't configure any driver-specific TSN settings. So, the
evaluation in igc_tsn_offload_apply() isn't needed for TAPRIO_CMD_STATS.

To address this, commands parsing are relocated to
igc_tsn_enable_qbv_scheduling(). Commands that don't require an adapter
reset will exit after processing, thus avoiding igc_tsn_offload_apply().

Fixes: d3750076d464 ("igc: Add TransmissionOverrun counter")
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20240730173304.865479-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: MAINTAINERS: Demote Qualcomm IPA to "maintained"
Krzysztof Kozlowski [Tue, 30 Jul 2024 10:40:16 +0000 (12:40 +0200)]
net: MAINTAINERS: Demote Qualcomm IPA to "maintained"

To the best of my knowledge, Alex Elder is not being paid to support
Qualcomm IPA networking drivers, so drop the status from "supported" to
"maintained".

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Alex Elder <elder@kernel.org>
Link: https://patch.msgid.link/20240730104016.22103-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: wan: fsl_qmc_hdlc: Discard received CRC
Herve Codina [Tue, 30 Jul 2024 06:31:33 +0000 (08:31 +0200)]
net: wan: fsl_qmc_hdlc: Discard received CRC

Received frame from QMC contains the CRC.
Upper layers don't need this CRC and tcpdump mentioned trailing junk
data due to this CRC presence.

As some other HDLC driver, simply discard this CRC.

Fixes: d0f2258e79fd ("net: wan: Add support for QMC HDLC")
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240730063133.179598-1-herve.codina@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: wan: fsl_qmc_hdlc: Convert carrier_lock spinlock to a mutex
Herve Codina [Tue, 30 Jul 2024 06:31:04 +0000 (08:31 +0200)]
net: wan: fsl_qmc_hdlc: Convert carrier_lock spinlock to a mutex

The carrier_lock spinlock protects the carrier detection. While it is
held, framer_get_status() is called which in turn takes a mutex.
This is not correct and can lead to a deadlock.

A run with PROVE_LOCKING enabled detected the issue:
  [ BUG: Invalid wait context ]
  ...
  c204ddbc (&framer->mutex){+.+.}-{3:3}, at: framer_get_status+0x40/0x78
  other info that might help us debug this:
  context-{4:4}
  2 locks held by ifconfig/146:
  #0: c0926a38 (rtnl_mutex){+.+.}-{3:3}, at: devinet_ioctl+0x12c/0x664
  #1: c2006a40 (&qmc_hdlc->carrier_lock){....}-{2:2}, at: qmc_hdlc_framer_set_carrier+0x30/0x98

Avoid the spinlock usage and convert carrier_lock to a mutex.

Fixes: 54762918ca85 ("net: wan: fsl_qmc_hdlc: Add framer support")
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240730063104.179553-1-herve.codina@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'mlx5-misc-fixes-2024-07-30'
Jakub Kicinski [Thu, 1 Aug 2024 01:04:53 +0000 (18:04 -0700)]
Merge branch 'mlx5-misc-fixes-2024-07-30'

Tariq Toukan says:

====================
mlx5 misc fixes 2024-07-30

This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.
====================

Link: https://patch.msgid.link/20240730061638.1831002-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys
Shahar Shitrit [Tue, 30 Jul 2024 06:16:37 +0000 (09:16 +0300)]
net/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys

Since the documentation for mlx5_toggle_port_link states that it should
only be used after setting the port register, we add a check for the
return value from mlx5_port_set_eth_ptys to ensure the register was
successfully set before calling it.

Fixes: 667daedaecd1 ("net/mlx5e: Toggle link only after modifying port parameters")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5e: Fix CT entry update leaks of modify header context
Chris Mi [Tue, 30 Jul 2024 06:16:36 +0000 (09:16 +0300)]
net/mlx5e: Fix CT entry update leaks of modify header context

The cited commit allocates a new modify header to replace the old
one when updating CT entry. But if failed to allocate a new one, eg.
exceed the max number firmware can support, modify header will be
an error pointer that will trigger a panic when deallocating it. And
the old modify header point is copied to old attr. When the old
attr is freed, the old modify header is lost.

Fix it by restoring the old attr to attr when failed to allocate a
new modify header context. So when the CT entry is freed, the right
modify header context will be freed. And the panic of accessing
error pointer is also fixed.

Fixes: 94ceffb48eac ("net/mlx5e: Implement CT entry update")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5e: Require mlx5 tc classifier action support for IPsec prio capability
Rahul Rameshbabu [Tue, 30 Jul 2024 06:16:35 +0000 (09:16 +0300)]
net/mlx5e: Require mlx5 tc classifier action support for IPsec prio capability

Require mlx5 classifier action support when creating IPSec chains in
offload path. MLX5_IPSEC_CAP_PRIO should only be set if CONFIG_MLX5_CLS_ACT
is enabled. If CONFIG_MLX5_CLS_ACT=n and MLX5_IPSEC_CAP_PRIO is set,
configuring IPsec offload will fail due to the mlxx5 ipsec chain rules
failing to be created due to lack of classifier action support.

Fixes: fa5aa2f89073 ("net/mlx5e: Use chains for IPsec policy priority offload")
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: Fix missing lock on sync reset reload
Moshe Shemesh [Tue, 30 Jul 2024 06:16:34 +0000 (09:16 +0300)]
net/mlx5: Fix missing lock on sync reset reload

On sync reset reload work, when remote host updates devlink on reload
actions performed on that host, it misses taking devlink lock before
calling devlink_remote_reload_actions_performed() which results in
triggering lock assert like the following:

WARNING: CPU: 4 PID: 1164 at net/devlink/core.c:261 devl_assert_locked+0x3e/0x50

 CPU: 4 PID: 1164 Comm: kworker/u96:6 Tainted: G S      W          6.10.0-rc2+ #116
 Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0 12/18/2015
 Workqueue: mlx5_fw_reset_events mlx5_sync_reset_reload_work [mlx5_core]
 RIP: 0010:devl_assert_locked+0x3e/0x50

 Call Trace:
  <TASK>
  ? __warn+0xa4/0x210
  ? devl_assert_locked+0x3e/0x50
  ? report_bug+0x160/0x280
  ? handle_bug+0x3f/0x80
  ? exc_invalid_op+0x17/0x40
  ? asm_exc_invalid_op+0x1a/0x20
  ? devl_assert_locked+0x3e/0x50
  devlink_notify+0x88/0x2b0
  ? mlx5_attach_device+0x20c/0x230 [mlx5_core]
  ? __pfx_devlink_notify+0x10/0x10
  ? process_one_work+0x4b6/0xbb0
  process_one_work+0x4b6/0xbb0
[…]

Fixes: 84a433a40d0e ("net/mlx5: Lock mlx5 devlink reload callbacks")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: Lag, don't use the hardcoded value of the first port
Mark Bloch [Tue, 30 Jul 2024 06:16:33 +0000 (09:16 +0300)]
net/mlx5: Lag, don't use the hardcoded value of the first port

The cited commit didn't change the body of the loop as it should.
It shouldn't be using MLX5_LAG_P1.

Fixes: 7e978e7714d6 ("net/mlx5: Lag, use actual number of lag ports")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: DR, Fix 'stack guard page was hit' error in dr_rule
Yevgeny Kliteynik [Tue, 30 Jul 2024 06:16:32 +0000 (09:16 +0300)]
net/mlx5: DR, Fix 'stack guard page was hit' error in dr_rule

This patch reduces the size of hw_ste_arr_optimized array that is
allocated on stack from 640 bytes (5 match STEs + 5 action STES)
to 448 bytes (2 match STEs + 5 action STES).
This fixes the 'stack guard page was hit' issue, while still fitting
majority of the usecases (up to 2 match STEs).

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: Fix error handling in irq_pool_request_irq
Shay Drory [Tue, 30 Jul 2024 06:16:31 +0000 (09:16 +0300)]
net/mlx5: Fix error handling in irq_pool_request_irq

In case mlx5_irq_alloc fails, the previously allocated index remains
in the XArray, which could lead to inconsistencies.

Fix it by adding error handling that erases the allocated index
from the XArray if mlx5_irq_alloc returns an error.

Fixes: c36326d38d93 ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: Always drain health in shutdown callback
Shay Drory [Tue, 30 Jul 2024 06:16:30 +0000 (09:16 +0300)]
net/mlx5: Always drain health in shutdown callback

There is no point in recovery during device shutdown. if health
work started need to wait for it to avoid races and NULL pointer
access.

Hence, drain health WQ on shutdown callback.

Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
Fixes: d2aa060d40fa ("net/mlx5: Cancel health poll before sending panic teardown command")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240730061638.1831002-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: Add skbuff.h to MAINTAINERS
Breno Leitao [Tue, 30 Jul 2024 16:14:03 +0000 (09:14 -0700)]
net: Add skbuff.h to MAINTAINERS

The network maintainers need to be copied if the skbuff.h is touched.

This also helps git-send-email to figure out the proper maintainers when
touching the file.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240730161404.2028175-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agor8169: don't increment tx_dropped in case of NETDEV_TX_BUSY
Heiner Kallweit [Tue, 30 Jul 2024 19:51:52 +0000 (21:51 +0200)]
r8169: don't increment tx_dropped in case of NETDEV_TX_BUSY

The skb isn't consumed in case of NETDEV_TX_BUSY, therefore don't
increment the tx_dropped counter.

Fixes: 188f4af04618 ("r8169: use NETDEV_TX_{BUSY/OK}")
Cc: stable@vger.kernel.org
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/bbba9c48-8bac-4932-9aa1-d2ed63bc9433@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Thu, 1 Aug 2024 00:49:00 +0000 (17:49 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2024-07-31

We've added 2 non-merge commits during the last 2 day(s) which contain
a total of 2 files changed, 2 insertions(+), 2 deletions(-).

The main changes are:

1) Fix BPF selftest build after tree sync with regards to a _GNU_SOURCE
   macro redefined compilation error, from Stanislav Fomichev.

2) Fix a wrong test in the ASSERT_OK() check in uprobe_syscall BPF selftest,
   from Jiri Olsa.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf/selftests: Fix ASSERT_OK condition check in uprobe_syscall test
  selftests/bpf: Filter out _GNU_SOURCE when compiling test_cpp
====================

Link: https://patch.msgid.link/20240731115706.19677-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonouveau: set placement to original placement on uvmm validate.
Dave Airlie [Wed, 15 May 2024 02:55:41 +0000 (12:55 +1000)]
nouveau: set placement to original placement on uvmm validate.

When a buffer is evicted for memory pressure or TTM evict all,
the placement is set to the eviction domain, this means the
buffer never gets revalidated on the next exec to the correct domain.

I think this should be fine to use the initial domain from the
object creation, as least with VM_BIND this won't change after
init so this should be the correct answer.

Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: <stable@vger.kernel.org> # v6.6
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240515025542.2156774-1-airlied@gmail.com
14 months agonetfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init().
Kuniyuki Iwashima [Thu, 25 Jul 2024 19:28:21 +0000 (12:28 -0700)]
netfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init().

ip6table_nat_table_init() accesses net->gen->ptr[ip6table_nat_net_ops.id],
but the function is exposed to user space before the entry is allocated
via register_pernet_subsys().

Let's call register_pernet_subsys() before xt_register_template().

Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
14 months agonetfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init().
Kuniyuki Iwashima [Thu, 25 Jul 2024 19:28:20 +0000 (12:28 -0700)]
netfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init().

We had a report that iptables-restore sometimes triggered null-ptr-deref
at boot time. [0]

The problem is that iptable_nat_table_init() is exposed to user space
before the kernel fully initialises netns.

In the small race window, a user could call iptable_nat_table_init()
that accesses net_generic(net, iptable_nat_net_id), which is available
only after registering iptable_nat_net_ops.

Let's call register_pernet_subsys() before xt_register_template().

[0]:
bpfilter: Loaded bpfilter_umh pid 11702
Started bpfilter
BUG: kernel NULL pointer dereference, address: 0000000000000013
 PF: supervisor write access in kernel mode
 PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
PREEMPT SMP NOPTI
CPU: 2 PID: 11879 Comm: iptables-restor Not tainted 6.1.92-99.174.amzn2023.x86_64 #1
Hardware name: Amazon EC2 c6i.4xlarge/, BIOS 1.0 10/16/2017
RIP: 0010:iptable_nat_table_init (net/ipv4/netfilter/iptable_nat.c:87 net/ipv4/netfilter/iptable_nat.c:121) iptable_nat
Code: 10 4c 89 f6 48 89 ef e8 0b 19 bb ff 41 89 c4 85 c0 75 38 41 83 c7 01 49 83 c6 28 41 83 ff 04 75 dc 48 8b 44 24 08 48 8b 0c 24 <48> 89 08 4c 89 ef e8 a2 3b a2 cf 48 83 c4 10 44 89 e0 5b 5d 41 5c
RSP: 0018:ffffbef902843cd0 EFLAGS: 00010246
RAX: 0000000000000013 RBX: ffff9f4b052caa20 RCX: ffff9f4b20988d80
RDX: 0000000000000000 RSI: 0000000000000064 RDI: ffffffffc04201c0
RBP: ffff9f4b29394000 R08: ffff9f4b07f77258 R09: ffff9f4b07f77240
R10: 0000000000000000 R11: ffff9f4b09635388 R12: 0000000000000000
R13: ffff9f4b1a3c6c00 R14: ffff9f4b20988e20 R15: 0000000000000004
FS:  00007f6284340000(0000) GS:ffff9f51fe280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000013 CR3: 00000001d10a6005 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 ? show_trace_log_lvl (arch/x86/kernel/dumpstack.c:259)
 ? show_trace_log_lvl (arch/x86/kernel/dumpstack.c:259)
 ? xt_find_table_lock (net/netfilter/x_tables.c:1259)
 ? __die_body.cold (arch/x86/kernel/dumpstack.c:478 arch/x86/kernel/dumpstack.c:420)
 ? page_fault_oops (arch/x86/mm/fault.c:727)
 ? exc_page_fault (./arch/x86/include/asm/irqflags.h:40 ./arch/x86/include/asm/irqflags.h:75 arch/x86/mm/fault.c:1470 arch/x86/mm/fault.c:1518)
 ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
 ? iptable_nat_table_init (net/ipv4/netfilter/iptable_nat.c:87 net/ipv4/netfilter/iptable_nat.c:121) iptable_nat
 xt_find_table_lock (net/netfilter/x_tables.c:1259)
 xt_request_find_table_lock (net/netfilter/x_tables.c:1287)
 get_info (net/ipv4/netfilter/ip_tables.c:965)
 ? security_capable (security/security.c:809 (discriminator 13))
 ? ns_capable (kernel/capability.c:376 kernel/capability.c:397)
 ? do_ipt_get_ctl (net/ipv4/netfilter/ip_tables.c:1656)
 ? bpfilter_send_req (net/bpfilter/bpfilter_kern.c:52) bpfilter
 nf_getsockopt (net/netfilter/nf_sockopt.c:116)
 ip_getsockopt (net/ipv4/ip_sockglue.c:1827)
 __sys_getsockopt (net/socket.c:2327)
 __x64_sys_getsockopt (net/socket.c:2342 net/socket.c:2339 net/socket.c:2339)
 do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:81)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
RIP: 0033:0x7f62844685ee
Code: 48 8b 0d 45 28 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 0a c3 66 0f 1f 84 00 00 00 00 00 48 8b 15 09
RSP: 002b:00007ffd1f83d638 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 00007ffd1f83d680 RCX: 00007f62844685ee
RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 0000000000000004 R08: 00007ffd1f83d670 R09: 0000558798ffa2a0
R10: 00007ffd1f83d680 R11: 0000000000000246 R12: 00007ffd1f83e3b2
R13: 00007f628455baa0 R14: 00007ffd1f83d7b0 R15: 00007f628457a008
 </TASK>
Modules linked in: iptable_nat(+) bpfilter rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache veth xt_state xt_connmark xt_nat xt_statistic xt_MASQUERADE xt_mark xt_addrtype ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment nft_compat nf_tables nfnetlink overlay nls_ascii nls_cp437 vfat fat ghash_clmulni_intel aesni_intel ena crypto_simd ptp cryptd i8042 pps_core serio button sunrpc sch_fq_codel configfs loop dm_mod fuse dax dmi_sysfs crc32_pclmul crc32c_intel efivarfs
CR2: 0000000000000013

Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default")
Reported-by: Takahiro Kawahara <takawaha@amazon.co.jp>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
14 months agox86/setup: Parse the builtin command line before merging
Borislav Petkov (AMD) [Tue, 30 Jul 2024 14:15:12 +0000 (16:15 +0200)]
x86/setup: Parse the builtin command line before merging

Commit in Fixes was added as a catch-all for cases where the cmdline is
parsed before being merged with the builtin one.

And promptly one issue appeared, see Link below. The microcode loader
really needs to parse it that early, but the merging happens later.

Reshuffling the early boot nightmare^W code to handle that properly would
be a painful exercise for another day so do the chicken thing and parse the
builtin cmdline too before it has been merged.

Fixes: 0c40b1c7a897 ("x86/setup: Warn when option parsing is done too early")
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240730152108.GAZqkE5Dfi9AuKllRw@fat_crate.local
Link: https://lore.kernel.org/r/20240722152330.GCZp55ck8E_FT4kPnC@fat_crate.local
14 months agodrm/atomic: Allow userspace to use damage clips with async flips
André Almeida [Tue, 2 Jul 2024 21:22:15 +0000 (18:22 -0300)]
drm/atomic: Allow userspace to use damage clips with async flips

Allow userspace to use damage clips with atomic async flips. Damage
clips are useful for partial plane updates, which can be helpful for
clients that want to do flips asynchronously.

Fixes: 0e26cc72c71c ("drm: Refuse to async flip with atomic prop changes")
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patchwork.freedesktop.org/patch/msgid/20240702212215.109696-2-andrealmeid@igalia.com
14 months agodrm/atomic: Allow userspace to use explicit sync with atomic async flips
André Almeida [Tue, 2 Jul 2024 21:22:14 +0000 (18:22 -0300)]
drm/atomic: Allow userspace to use explicit sync with atomic async flips

Allow userspace to use explicit synchronization with atomic async flips.
That means that the flip will wait for some hardware fence, and then
will flip as soon as possible (async) in regard of the vblank.

Fixes: 0e26cc72c71c ("drm: Refuse to async flip with atomic prop changes")
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patchwork.freedesktop.org/patch/msgid/20240702212215.109696-1-andrealmeid@igalia.com
14 months agoALSA: hda: Conditionally use snooping for AMD HDMI
Takashi Iwai [Wed, 31 Jul 2024 17:05:15 +0000 (19:05 +0200)]
ALSA: hda: Conditionally use snooping for AMD HDMI

The recent regression report revealed that the use of WC pages for AMD
HDMI device together with AMD IOMMU leads to unexpected truncation or
noises.  The issue seems triggered by the change in the kernel core
memory allocation that enables IOMMU driver to use always S/G
buffers.  Meanwhile, the use of WC pages has been a workaround for the
similar issue with standard pages in the past.  So, now we need to
apply the workaround conditionally, namely, only when IOMMU isn't in
place.

This patch modifies the workaround code to check the DMA ops at first
and apply the snoop-off only when needed.

Fixes: f5ff79fddf0e ("dma-mapping: remove CONFIG_DMA_REMAP")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219087
Link: https://patch.msgid.link/20240731170521.31714-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
14 months agominmax: fix up min3() and max3() too
Linus Torvalds [Tue, 30 Jul 2024 22:44:16 +0000 (15:44 -0700)]
minmax: fix up min3() and max3() too

David Laight pointed out that we should deal with the min3() and max3()
mess too, which still does excessive expansion.

And our current macros are actually rather broken.

In particular, the macros did this:

  #define min3(x, y, z) min((typeof(x))min(x, y), z)
  #define max3(x, y, z) max((typeof(x))max(x, y), z)

and that not only is a nested expansion of possibly very complex
arguments with all that involves, the typing with that "typeof()" cast
is completely wrong.

For example, imagine what happens in max3() if 'x' happens to be a
'unsigned char', but 'y' and 'z' are 'unsigned long'.  The types are
compatible, and there's no warning - but the result is just random
garbage.

No, I don't think we've ever hit that issue in practice, but since we
now have sane infrastructure for doing this right, let's just use it.
It fixes any excessive expansion, and also avoids these kinds of broken
type issues.

Requested-by: David Laight <David.Laight@aculab.com>
Acked-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 months agoriscv: cpufeature: Do not drop Linux-internal extensions
Samuel Holland [Thu, 18 Jul 2024 21:29:59 +0000 (14:29 -0700)]
riscv: cpufeature: Do not drop Linux-internal extensions

The Linux-internal Xlinuxenvcfg ISA extension is omitted from the
riscv_isa_ext array because it has no DT binding and should not appear
in /proc/cpuinfo. The logic added in commit 625034abd52a ("riscv: add
ISA extensions validation callback") assumes all extensions are included
in riscv_isa_ext, and so riscv_resolve_isa() wrongly drops Xlinuxenvcfg
from the final ISA string. Instead, accept such Linux-internal ISA
extensions as if they have no validation callback.

Fixes: 625034abd52a ("riscv: add ISA extensions validation callback")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240718213011.2600150-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agos390: Keep inittext section writable
Heiko Carstens [Mon, 29 Jul 2024 11:06:44 +0000 (13:06 +0200)]
s390: Keep inittext section writable

There is no added security by making the inittext section non-writable,
however it does split part of the kernel mapping into 4K mappings
instead of 1M mappings:

---[ Kernel Image Start ]---
0x000003ffe0000000-0x000003ffe0e00000        14M PMD RO X
0x000003ffe0e00000-0x000003ffe0ec7000       796K PTE RO X
0x000003ffe0ec7000-0x000003ffe0f00000       228K PTE RO NX
0x000003ffe0f00000-0x000003ffe1300000         4M PMD RO NX
0x000003ffe1300000-0x000003ffe1353000       332K PTE RO NX
0x000003ffe1353000-0x000003ffe1400000       692K PTE RW NX
0x000003ffe1400000-0x000003ffe1500000         1M PMD RW NX
0x000003ffe1500000-0x000003ffe1700000         2M PTE RW NX <---
0x000003ffe1700000-0x000003ffe1800000         1M PMD RW NX
0x000003ffe1800000-0x000003ffe187e000       504K PTE RW NX
---[ Kernel Image End ]---

Keep the inittext writable and enable instruction execution protection
(aka noexec) later to prevent this. This also allows to use the
generic free_initmem() implementation.

---[ Kernel Image Start ]---
0x000003ffe0000000-0x000003ffe0e00000        14M PMD RO X
0x000003ffe0e00000-0x000003ffe0ec7000       796K PTE RO X
0x000003ffe0ec7000-0x000003ffe0f00000       228K PTE RO NX
0x000003ffe0f00000-0x000003ffe1300000         4M PMD RO NX
0x000003ffe1300000-0x000003ffe1353000       332K PTE RO NX
0x000003ffe1353000-0x000003ffe1400000       692K PTE RW NX
0x000003ffe1400000-0x000003ffe1800000         4M PMD RW NX <---
0x000003ffe1800000-0x000003ffe187e000       504K PTE RW NX
---[ Kernel Image End ]---

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
14 months agos390/vmlinux.lds.S: Move ro_after_init section behind rodata section
Heiko Carstens [Mon, 29 Jul 2024 11:06:43 +0000 (13:06 +0200)]
s390/vmlinux.lds.S: Move ro_after_init section behind rodata section

The .data.rel.ro and .got section were added between the rodata and
ro_after_init data section, which adds an RW mapping in between all RO
mapping of the kernel image:

---[ Kernel Image Start ]---
0x000003ffe0000000-0x000003ffe0e00000        14M PMD RO X
0x000003ffe0e00000-0x000003ffe0ec7000       796K PTE RO X
0x000003ffe0ec7000-0x000003ffe0f00000       228K PTE RO NX
0x000003ffe0f00000-0x000003ffe1300000         4M PMD RO NX
0x000003ffe1300000-0x000003ffe1331000       196K PTE RO NX
0x000003ffe1331000-0x000003ffe13b3000       520K PTE RW NX <---
0x000003ffe13b3000-0x000003ffe13d5000       136K PTE RO NX
0x000003ffe13d5000-0x000003ffe1400000       172K PTE RW NX
0x000003ffe1400000-0x000003ffe1500000         1M PMD RW NX
0x000003ffe1500000-0x000003ffe1700000         2M PTE RW NX
0x000003ffe1700000-0x000003ffe1800000         1M PMD RW NX
0x000003ffe1800000-0x000003ffe187e000       504K PTE RW NX
---[ Kernel Image End ]---

Move the ro_after_init data section again right behind the rodata
section to prevent interleaving RO and RW mappings:

---[ Kernel Image Start ]---
0x000003ffe0000000-0x000003ffe0e00000        14M PMD RO X
0x000003ffe0e00000-0x000003ffe0ec7000       796K PTE RO X
0x000003ffe0ec7000-0x000003ffe0f00000       228K PTE RO NX
0x000003ffe0f00000-0x000003ffe1300000         4M PMD RO NX
0x000003ffe1300000-0x000003ffe1353000       332K PTE RO NX
0x000003ffe1353000-0x000003ffe1400000       692K PTE RW NX
0x000003ffe1400000-0x000003ffe1500000         1M PMD RW NX
0x000003ffe1500000-0x000003ffe1700000         2M PTE RW NX
0x000003ffe1700000-0x000003ffe1800000         1M PMD RW NX
0x000003ffe1800000-0x000003ffe187e000       504K PTE RW NX
---[ Kernel Image End ]---

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
14 months agos390/mm: Get rid of RELOC_HIDE()
Heiko Carstens [Mon, 29 Jul 2024 13:45:58 +0000 (15:45 +0200)]
s390/mm: Get rid of RELOC_HIDE()

Since __va(0) does not translate to NULL anymore remove RELOC_HIDE()
which was only added to get rid of a compile warning with clang W=1:

    arch/s390/mm/vmem.c:666:36: warning: performing pointer arithmetic on
     a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
      666 |                 __set_memory_4k(__va(0), __va(0) + ident_map_size);
          |                                          ~~~~~~~ ^

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>