www.infradead.org Git - users/dwmw2/linux.git/log

]> www.infradead.org Git - users/dwmw2/linux.git/log

projects / users / dwmw2 / linux.git / log

Chancel Liu [Thu, 17 Oct 2024 07:15:07 +0000 (16:15 +0900)]

ASoC: fsl_micfil: Add a flag to distinguish with different volume control types

On i.MX8MM the register of volume control has positive and negative
values. It is different from other platforms like i.MX8MP and i.MX93
which only have positive values. Add a volume_sx flag to use SX_TLV
volume control for this kind of platform. Use common TLV volume control
for other platforms.

Fixes: cdfa92eb90f5 ("ASoC: fsl_micfil: Correct the number of steps on SX controls")
Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Link: https://patch.msgid.link/20241017071507.2577786-1-chancel.liu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Alexey Klimov [Wed, 16 Oct 2024 22:10:49 +0000 (23:10 +0100)]

ASoC: codecs: lpass-rx-macro: fix RXn(rx,n) macro for DSM_CTL and SEC7 regs

Turns out some registers of pre-2.5 version of rxmacro codecs are not
located at the expected offsets but 0xc further away in memory. So far
the detected registers are CDC_RX_RX2_RX_PATH_SEC7 and
CDC_RX_RX2_RX_PATH_DSM_CTL.

CDC_RX_RXn_RX_PATH_DSM_CTL(rx, n) macro incorrectly generates the address
0x540 for RX2 but it should be 0x54C and it also overwrites
CDC_RX_RX2_RX_PATH_SEC7 which is located at 0x540.
The same goes for CDC_RX_RXn_RX_PATH_SEC7(rx, n).

Fix this by introducing additional rxn_reg_stride2 offset. For 2.5 version
and above this offset will be equal to 0.
With such change the corresponding RXn() macros will generate the same
values for 2.5 codec version for all RX paths and the same old values
for pre-2.5 version for RX0 and RX1. However for the latter case with
RX2 path it will also add rxn_reg_stride2 on top.

While at this, also remove specific if-check for INTERP_AUX from
rx_macro_digital_mute() and rx_macro_enable_interp_clk(). These if-check
was used to handle such special offset for AUX interpolator but since
CDC_RX_RXn_RX_PATH_SEC7(rx, n) and CDC_RX_RXn_RX_PATH_DSM_CTL(rx, n)
macros will generate the correst addresses of dsm register, they are no
longer needed.

Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://patch.msgid.link/20241016221049.1145101-1-alexey.klimov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Jens Axboe [Fri, 18 Oct 2024 16:58:24 +0000 (10:58 -0600)]

Merge tag 'md-6.12-20241018' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.12

Pull MD fixes from Song.

* tag 'md-6.12-20241018' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md/raid10: fix null ptr dereference in raid10_size()
md: ensure child flush IO does not affect origin bio->bi_status

commit | commitdiff | tree

Linus Torvalds [Fri, 18 Oct 2024 16:50:05 +0000 (09:50 -0700)]

mm: fix follow_pfnmap API lockdep assert

The lockdep asserts for the new follow_pfnmap() API "knows" that a
pfnmap always has a vma->vm_file, since that's the only way to create
such a mapping.

And that's actually true for all the normal cases. But not for the mmap
failure case, where the incomplete mapping is torn down and we have
cleared vma->vm_file because the failure occured before the file was
linked to the vma.

So this codepath does actually need to check for vm_file being NULL.

Reported-by: Jann Horn <jannh@google.com>
Fixes: 6da8e9634bb7 ("mm: new follow_pfnmap API")
Cc: Peter Xu <peterx@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Rafael J. Wysocki [Fri, 18 Oct 2024 16:22:43 +0000 (18:22 +0200)]

Merge branch 'pm-cpufreq'

Merge amd-pstate driver fixes for 6.12-rc4:

- Enable ACPI CPPC in amd_pstate_register_driver() after disabling
   it in amd_pstate_unregister_driver() during driver operation mode
   switch (Dhananjay Ugwekar).

- Make amd-pstate use nominal performance as the maximum performance
   level when boost is disabled (Mario Limonciello).

* pm-cpufreq:
  cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled
  cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems

commit | commitdiff | tree

Linus Torvalds [Fri, 18 Oct 2024 14:13:24 +0000 (07:13 -0700)]

Merge tag 'iommu-fixes-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fixes from Joerg Roedel:
"ARM-SMMU fixes from Will Deacon:

   - Clarify warning message when failing to disable the MMU-500
     prefetcher

   - Fix undefined behaviour in calculation of L1 stream-table index
     when 32-bit StreamIDs are implemented

   - Replace a rogue comma with a semicolon

  Intel VT-d fix from Lu Baolu:

   - Fix incorrect pci_for_each_dma_alias() for non-PCI devices"

* tag 'iommu-fixes-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices
  iommu/arm-smmu-v3: Convert comma to semicolon
  iommu/arm-smmu-v3: Fix last_sid_idx calculation for sid_bits==32
  iommu/arm-smmu: Clarify MMU-500 CPRE workaround

commit | commitdiff | tree

Linus Torvalds [Fri, 18 Oct 2024 14:07:13 +0000 (07:07 -0700)]

Merge tag 'powerpc-6.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Madhavan Srinivasan:

- To prevent possible memory leak, free "name" on error in
opal_event_init()

Thanks to Michael Ellerman and 2639161967.

* tag 'powerpc-6.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Free name on error in opal_event_init()

commit | commitdiff | tree

Linus Torvalds [Fri, 18 Oct 2024 14:01:59 +0000 (07:01 -0700)]

Merge tag 's390-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

- Fix PCI error recovery by handling error events correctly

- Fix CCA crypto card behavior within protected execution environment

- Two KVM commits which fix virtual vs physical address handling bugs
   in KVM pfault handling

- Fix return code handling in pckmo_key2protkey()

- Deactivate sclp console as late as possible so that outstanding
   messages appear on the console instead of being dropped on reboot

- Convert newlines to CRLF instead of LFCR for the sclp vt220 driver,
   as required by the vt220 specification

- Initialize also psw mask in perf_arch_fetch_caller_regs() to make
   sure that user_mode(regs) will return false

- Update defconfigs

* tag 's390-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: Update defconfigs
  s390: Initialize psw mask in perf_arch_fetch_caller_regs()
  s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
  s390/sclp: Deactivate sclp after all its users
  s390/pkey_pckmo: Return with success for valid protected key types
  KVM: s390: Change virtual to physical address access in diag 0x258 handler
  KVM: s390: gaccess: Check if guest address is in memslot
  s390/ap: Fix CCA crypto card behavior within protected execution environment
  s390/pci: Handle PCI error codes other than 0x3a

commit | commitdiff | tree

Yo-Jung (Leo) Lin [Thu, 17 Oct 2024 14:47:38 +0000 (22:47 +0800)]

misc: rtsx: list supported models in Kconfig help

rts5228, rts5261, rts5264 are supported by the rtsx_pci driver, but
they are not mentioned in the Kconfig help when the code was added.
List those models in the Kconfig help accordingly.

Signed-off-by: Yo-Jung Lin (Leo) <0xff07@gmail.com>
Link: https://lore.kernel.org/r/20241017144747.15966-1-0xff07@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit | commitdiff | tree

Greg Kroah-Hartman [Fri, 18 Oct 2024 11:31:34 +0000 (13:31 +0200)]

MAINTAINERS: Remove some entries due to various compliance requirements.

Remove some entries due to various compliance requirements. They can come
back in the future if sufficient documentation is provided.

Link: https://lore.kernel.org/r/2024101835-tiptop-blip-09ed@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit | commitdiff | tree

Thorsten Blum [Fri, 18 Oct 2024 10:27:03 +0000 (12:27 +0200)]

proc: Fix W=1 build kernel-doc warning

Building the kernel with W=1 generates the following warning:

fs/proc/fd.c:81: warning: This comment starts with '/**',
but isn't a kernel-doc comment.

Use a normal comment for the helper function proc_fdinfo_permission().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20241018102705.92237-2-thorsten.blum@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>

commit | commitdiff | tree

Hans de Goede [Thu, 10 Oct 2024 09:45:12 +0000 (11:45 +0200)]

HID: lenovo: Add support for Thinkpad X1 Tablet Gen 3 keyboard

The Thinkpad X1 Tablet Gen 3 keyboard has the same Lenovo specific quirks
as the original Thinkpad X1 Tablet keyboard.

Add the PID for the "Thinkpad X1 Tablet Gen 3 keyboard" to the hid-lenovo
driver to fix the FnLock, Mute and media buttons not working.

Suggested-by: Izhar Firdaus <izhar@fedoraproject.org>
Closes https://bugzilla.redhat.com/show_bug.cgi?id=2315395
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

commit | commitdiff | tree

Kenneth Albanowski [Fri, 4 Oct 2024 17:24:29 +0000 (10:24 -0700)]

HID: multitouch: Add quirk for Logitech Bolt receiver w/ Casa touchpad

The Logitech Casa Touchpad does not reliably send touch release signals
when communicating through the Logitech Bolt wireless-to-USB receiver.

Adjusting the device class to add MT_QUIRK_NOT_SEEN_MEANS_UP to make
sure that no touches become stuck, MT_QUIRK_FORCE_MULTI_INPUT is not
needed, but harmless.

Linux does not have information on which devices are connected to the
Bolt receiver, so we have to enable this for the entire device.

Signed-off-by: Kenneth Albanowski <kenalba@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

commit | commitdiff | tree

Bartłomiej Maryńczak [Mon, 7 Oct 2024 22:25:42 +0000 (00:25 +0200)]

HID: i2c-hid: Delayed i2c resume wakeup for 0x0d42 Goodix touchpad

Patch for Goodix 27c6:0d42 touchpads found in Inspiron 5515 laptops.

After resume from suspend, one can communicate with this device just fine.
We can read data from it or request a reset,
but for some reason the interrupt line will not go up
when new events are available.
(it can correctly respond to a reset with an interrupt tho)

The only way I found to wake this device up
is to send anything to it after ~1.5s mark,
for example a simple read request, or power mode change.

In this patch, I simply delay the resume steps with msleep,
this will cause the set_power request to happen after
the ~1.5s barrier causing the device to resume its event interrupts.

Sleep was used rather than delayed_work
to make this workaround as non-invasive as possible.

[jkosina@suse.com: shortlog update]
Signed-off-by: Bartłomiej Maryńczak <marynczakbartlomiej@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

commit | commitdiff | tree

Greg Kroah-Hartman [Fri, 18 Oct 2024 10:11:28 +0000 (12:11 +0200)]

Merge tag 'usb-serial-6.12-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial device ids for 6.12-rc4

Here are some new modem device ids.

Everything has been in linux-next over night with no reported issues.

* tag 'usb-serial-6.12-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Telit FN920C04 MBIM compositions
USB: serial: option: add support for Quectel EG916Q-GL

commit | commitdiff | tree

Jiqian Chen [Sat, 12 Oct 2024 08:45:37 +0000 (16:45 +0800)]

xen: Remove dependency between pciback and privcmd

Commit 2fae6bb7be32 ("xen/privcmd: Add new syscall to get gsi from dev")
adds a weak reverse dependency to the config XEN_PRIVCMD definition, that
dependency causes xen-privcmd can't be loaded on domU, because dependent
xen-pciback isn't always be loaded successfully on domU.

To solve above problem, remove that dependency, and do not call
pcistub_get_gsi_from_sbdf() directly, instead add a hook in
drivers/xen/apci.c, xen-pciback register the real call function, then in
privcmd_ioctl_pcidev_get_gsi call that hook.

Fixes: 2fae6bb7be32 ("xen/privcmd: Add new syscall to get gsi from dev")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Message-ID: <20241012084537.1543059-1-Jiqian.Chen@amd.com>
Signed-off-by: Juergen Gross <jgross@suse.com>

commit | commitdiff | tree

Kent Overstreet [Fri, 18 Oct 2024 04:22:09 +0000 (00:22 -0400)]

bcachefs: fsck: Improve hash_check_key()

hash_check_key() checks and repairs the hash table btrees: dirents and
xattrs are open addressing hash tables.

We recently had a corruption reported where the hash type on an inode
somehow got flipped, which made the existing dirents invisible and
allowed new ones to be created with the same name.

Now, hash_check_key() can repair duplicates: it will delete one of them,
if it has an xattr or dangling dirent, but if it has two valid dirents
one of them gets renamed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Fri, 18 Oct 2024 04:19:12 +0000 (00:19 -0400)]

bcachefs: bch2_hash_set_or_get_in_snapshot()

Add a variant of bch2_hash_set_in_snapshot() that returns the existing
key on -EEXIST.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Fri, 18 Oct 2024 03:00:14 +0000 (23:00 -0400)]

bcachefs: Repair mismatches in inode hash seed, type

Different versions of the same inode (same inode number, different
snapshot ID) must have the same hash seed and type - lookups require
this, since they see keys from different snapshots simultaneously.

To repair we only need to make the inodes consistent, hash_check_key()
will do the rest.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Sat, 28 Sep 2024 18:44:06 +0000 (14:44 -0400)]

bcachefs: Add hash seed, type to inode_to_text()

This helped with discovering some filesystem corruption fsck has having
trouble with: the str_hash type had gotten flipped on one snapshot's
version of an inode.

All versions of a given inode number have the same hash seed and hash
type, since lookups will be done with a single hash/seed and type and
see dirents/xattrs from multiple snapshots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Fri, 18 Oct 2024 02:55:59 +0000 (22:55 -0400)]

bcachefs: INODE_STR_HASH() for bch_inode_unpacked

Trivial cleanup - add a normal BITMASK() helper for bch_inode_unpacked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Fri, 18 Oct 2024 02:29:23 +0000 (22:29 -0400)]

bcachefs: Run in-kernel offline fsck without ratelimit errors

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Hongbo Li [Thu, 10 Oct 2024 04:01:48 +0000 (12:01 +0800)]

bcachefs: skip mount option handle for empty string.

The options parse in get_tree will split the options buffer, it will
get the empty string for last one by strsep(). After commit
ea0eeb89b1d5 ("bcachefs: reject unknown mount options") is merged,
unknown mount options is not allowed (here is empty string), and this
causes this errors. This can be reproduced just by the following steps:

bcachefs format /dev/loop
mount -t bcachefs -o metadata_target=loop1 /dev/loop1 /mnt/bcachefs/

Fixes: ea0eeb89b1d5 ("bcachefs: reject unknown mount options")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Hongbo Li [Thu, 26 Sep 2024 02:00:01 +0000 (10:00 +0800)]

bcachefs: fix incorrect show_options results

When call show_options in bcachefs, the options buffer is appeneded
to the seq variable. In fact, it requires an additional comma to be
appended first. This will affect the remount process when reading
existing mount options.

Fixes: 9305cf91d05e ("bcachefs: bch2_opts_to_text()")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 17 Oct 2024 05:10:49 +0000 (01:10 -0400)]

bcachefs: Fix data corruption on -ENOSPC in buffered write path

Found by generic/299: When we have to truncate a write due to -ENOSPC,
we may have to read in the folio we're writing to if we're now no longer
doing a complete write to a !uptodate folio.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Thu, 17 Oct 2024 05:05:17 +0000 (01:05 -0400)]

bcachefs: bch2_folio_reservation_get_partial() is now better behaved

bch2_folio_reservation_get_partial(), on partial success, will now
return a reservation that's aligned to the filesystem blocksize.

This is a partial fix for fstests generic/299 - fio verify is badly
behaved in the presence of short writes that aren't aligned to its
blocksize.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Mon, 14 Oct 2024 21:55:48 +0000 (17:55 -0400)]

bcachefs: fix disk reservation accounting in bch2_folio_reservation_get()

bch2_disk_reservation_put() zeroes out the reservation - oops.

This fixes a disk reservation leak when getting a quota reservation
returned an error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 16 Oct 2024 10:32:12 +0000 (06:32 -0400)]

bcachefS: ec: fix data type on stripe deletion

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Mon, 14 Oct 2024 01:53:26 +0000 (21:53 -0400)]

bcachefs: Don't use commit_do() unnecessarily

Using commit_do() to call alloc_sectors_start_trans() breaks when we're
randomly injecting transaction restarts - the restart in the commit
causes us to leak the lock that alloc_sectorS_start_trans() takes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 15 Oct 2024 03:58:45 +0000 (23:58 -0400)]

bcachefs: handle restarts in bch2_bucket_io_time_reset()

bch2_bucket_io_time_reset() doesn't need to succeed, which is why it
didn't previously retry on transaction restart - but we're now treating
these as errors.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 16 Oct 2024 08:11:15 +0000 (04:11 -0400)]

bcachefs: fix restart handling in __bch2_resume_logged_op_finsert()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Wed, 16 Oct 2024 07:36:40 +0000 (03:36 -0400)]

bcachefs: fix restart handling in bch2_alloc_write_key()

This is ugly:

We may discover in alloc_write_key that the data type we calculated is
wrong, because BCH_DATA_need_discard is checked/set elsewhere, and the
disk accounting counters we calculated need to be updated.

But bch2_alloc_key_to_dev_counters(..., BTREE_TRIGGER_gc) is not safe
w.r.t. transaction restarts, so we need to propagate the fixup back to
our gc state in case we take a transaction restart.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 15 Oct 2024 06:13:22 +0000 (02:13 -0400)]

bcachefs: fix restart handling in bch2_do_invalidates_work()

this one is fairly harmless since the invalidate worker will just run
again later if it needs to, but still worth fixing

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 15 Oct 2024 03:52:38 +0000 (23:52 -0400)]

bcachefs: fix missing restart handling in bch2_read_retry_nodecode()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 15 Oct 2024 03:32:23 +0000 (23:32 -0400)]

bcachefs: fix restart handling in bch2_fiemap()

We were leaking transaction restart errors to userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 15 Oct 2024 02:40:20 +0000 (22:40 -0400)]

bcachefs: fix bch2_hash_delete() error path

we were exiting an iterator that hadn't been initialized

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Kent Overstreet [Tue, 15 Oct 2024 02:18:12 +0000 (22:18 -0400)]

bcachefs: fix restart handling in bch2_rename2()

This should be impossible to hit in practice; the first lookup within a
transaction won't return a restart due to lock ordering, but we're
adding fault injection for transaction restarts and shaking out bugs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

commit | commitdiff | tree

Dave Airlie [Fri, 18 Oct 2024 03:53:34 +0000 (13:53 +1000)]

Merge tag 'drm-xe-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usag (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/jkldrex5733ldxrla75b4ayvhujjhw2kccmasl5rotoufoacj4@pkvlrrv4orc7

commit | commitdiff | tree

Linus Torvalds [Fri, 18 Oct 2024 02:12:38 +0000 (19:12 -0700)]

Merge tag 'x86_bugs_post_ibpb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 IBPB fixes from Borislav Petkov:
"This fixes the IBPB implementation of older AMDs (< gen4) that do not
  flush the RSB (Return Address Stack) so you can still do some leaking
  when using a "=ibpb" mitigation for Retbleed or SRSO. Fix it by doing
  the flushing in software on those generations.

  IBPB is not the default setting so this is not likely to affect
  anybody in practice"

* tag 'x86_bugs_post_ibpb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bugs: Do not use UNTRAIN_RET with IBPB on entry
  x86/bugs: Skip RSB fill at VMEXIT
  x86/entry: Have entry_ibpb() invalidate return predictions
  x86/cpufeatures: Add a IBPB_NO_RET BUG flag
  x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET

commit | commitdiff | tree

Josh Poimboeuf [Thu, 17 Oct 2024 22:09:02 +0000 (15:09 -0700)]

cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed()

The barrier_nospec() after the array bounds check is overkill and
painfully slow for arches which implement it.

Furthermore, most arches don't implement it, so they remain exposed to
Spectre v1 (which can affect pretty much any CPU with branch
prediction).

Instead, clamp the user pointer to a valid range so it's guaranteed to
be a valid array index even when the bounds check mispredicts.

Fixes: 8270cb10c068 ("cdrom: Fix spectre-v1 gadget")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/1d86f4d9d8fba68e5ca64cdeac2451b95a8bf872.1729202937.git.jpoimboe@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Linus Torvalds [Thu, 17 Oct 2024 23:33:06 +0000 (16:33 -0700)]

Merge tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"28 hotfixes. 13 are cc:stable. 23 are MM.

  It is the usual shower of unrelated singletons - please see the
  individual changelogs for details"

* tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits)
  maple_tree: add regression test for spanning store bug
  maple_tree: correct tree corruption on spanning store
  mm/mglru: only clear kswapd_failures if reclaimable
  mm/swapfile: skip HugeTLB pages for unuse_vma
  selftests: mm: fix the incorrect usage() info of khugepaged
  MAINTAINERS: add Jann as memory mapping/VMA reviewer
  mm: swap: prevent possible data-race in __try_to_reclaim_swap
  mm: khugepaged: fix the incorrect statistics when collapsing large file folios
  MAINTAINERS: kasan, kcov: add bugzilla links
  mm: don't install PMD mappings when THPs are disabled by the hw/process/vma
  mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()
  Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs
  Docs/damon/maintainer-profile: add missing '_' suffixes for external web links
  maple_tree: check for MA_STATE_BULK on setting wr_rebalance
  mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point
  mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets()
  mm: remove unused stub for can_swapin_thp()
  mailmap: add an entry for Andy Chiu
  MAINTAINERS: add memory mapping/VMA co-maintainers
  fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization
  ...

commit | commitdiff | tree

Linus Torvalds [Thu, 17 Oct 2024 23:24:42 +0000 (16:24 -0700)]

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"Two clk driver fixes and a unit test fix:

   - Terminate the of_device_id table in the Samsung exynosautov920 clk
     driver so that device matching logic doesn't run off the end of the
     array into other memory and break matching for any kernel with this
     driver loaded

   - Properly limit the max clk ID in the Rockchip clk driver

   - Use clk kunit helpers in the clk tests so that memory isn't leaked
     after the test concludes"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: test: Fix some memory leaks
  clk: rockchip: fix finding of maximum clock ID
  clk: samsung: Fix out-of-bound access of of_match_node()

commit | commitdiff | tree

Dave Airlie [Thu, 17 Oct 2024 20:43:11 +0000 (06:43 +1000)]

Merge tag 'drm-misc-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

ast:
- Clear EDID on unplugged connectors

host1x:
- Fix boot on Tegra186
- Set DMA parameters

mgag200:
- Revert VBLANK support

panel:
- himax-hx83192: Adjust power and gamma

qaic:
- Sgtable loop fixes

vmwgfx:
- Limit display layout allocatino size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017115516.GA196624@linux.fritz.box

commit | commitdiff | tree

Dave Airlie [Thu, 17 Oct 2024 20:41:12 +0000 (06:41 +1000)]

Merge tag 'drm-intel-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Two DP bandwidth related MST fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZxDLdML9Dwqkb1AW@jlahtine-mobl.ger.corp.intel.com

commit | commitdiff | tree

Dave Airlie [Thu, 17 Oct 2024 20:13:13 +0000 (06:13 +1000)]

Merge tag 'amd-drm-fixes-6.12-2024-10-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.12-2024-10-16:

amdgpu:
- SR-IOV fix
- CS chunk handling fix
- MES fixes
- SMU13 fixes

amdkfd:
- VRAM usage reporting fix

radeon:
- Fix possible_clones handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241016200514.3520286-1-alexander.deucher@amd.com

commit | commitdiff | tree

Sebastian Andrzej Siewior [Tue, 15 Oct 2024 15:11:32 +0000 (17:11 +0200)]

MAINTAINERS: Add an entry for PREEMPT_RT.

Add a maintainers entry now that the PREEMPT_RT bits are merged. Steven
volunteered and asked for the list.

There are no files associated with this entry since it is spread over the
kernel. It serves as entry for people knowing what they look for. There is
a keyword added so if PREEMPT_RT is mentioned somewhere, then the entry
will be picked up.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Pavel Machek <pavel@denx.de>
Link: https://lore.kernel.org/all/20241015151132.Erx81G9f@linutronix.de

commit | commitdiff | tree

Andrii Nakryiko [Thu, 17 Oct 2024 17:47:13 +0000 (10:47 -0700)]

lib/buildid: Handle memfd_secret() files in build_id_parse()

>From memfd_secret(2) manpage:

  The memory areas backing the file created with memfd_secret(2) are
  visible only to the processes that have access to the file descriptor.
  The memory region is removed from the kernel page tables and only the
  page tables of the processes holding the file descriptor map the
  corresponding physical memory. (Thus, the pages in the region can't be
  accessed by the kernel itself, so that, for example, pointers to the
  region can't be passed to system calls.)

We need to handle this special case gracefully in build ID fetching
code. Return -EFAULT whenever secretmem file is passed to build_id_parse()
family of APIs. Original report and repro can be found in [0].

  [0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/

Fixes: de3ec364c3c3 ("lib/buildid: add single folio-based file reader abstraction")
Reported-by: Yi Lai <yi1.lai@intel.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Link: https://lore.kernel.org/bpf/20241017175431.6183-A-hca@linux.ibm.com
Link: https://lore.kernel.org/bpf/20241017174713.2157873-1-andrii@kernel.org

commit | commitdiff | tree

Jens Axboe [Thu, 17 Oct 2024 18:49:27 +0000 (12:49 -0600)]

Merge tag 'nvme-6.12-2024-10-18' of git://git.infradead.org/nvme into block-6.12

Pull NVMe fixes from Keith:

"nvme fixes for Linux 6.12

- Fix target passthrough identifier (Nilay)
- Fix tcp locking (Hannes)
- Replace list with sbitmap for tracking RDMA rsp tags (Guixen)
- Remove unnecessary fallthrough statements (Tokunori)
- Remove ready-without-media support (Greg)
- Fix multipath partition scan deadlock (Keith)
- Fix concurrent PCI reset and remove queue mapping (Maurizio)
- Fabrics shutdown fixes (Nilay)"

* tag 'nvme-6.12-2024-10-18' of git://git.infradead.org/nvme:
  nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function
  nvme: make keep-alive synchronous operation
  nvme-loop: flush off pending I/O while shutting down loop controller
  nvme-pci: fix race condition between reset and nvme_dev_disable()
  nvme-multipath: defer partition scanning
  nvme: disable CC.CRIME (NVME_CC_CRIME)
  nvme: delete unnecessary fallthru comment
  nvmet-rdma: use sbitmap to replace rsp free list
  nvme: tcp: avoid race between queue_lock lock and destroy
  nvmet-passthru: clear EUID/NGUID/UUID while using loop target
  block: fix blk_rq_map_integrity_sg kernel-doc

commit | commitdiff | tree

Luca Boccassi [Sun, 15 Sep 2024 09:11:19 +0000 (11:11 +0200)]

ipe: allow secondary and platform keyrings to install/update policies

The current policy management makes it impossible to use IPE
in a general purpose distribution. In such cases the users are not
building the kernel, the distribution is, and access to the private
key included in the trusted keyring is, for obvious reason, not
available.
This means that users have no way to enable IPE, since there will
be no built-in generic policy, and no access to the key to sign
updates validated by the trusted keyring.

Just as we do for dm-verity, kernel modules and more, allow the
secondary and platform keyrings to also validate policies. This
allows users enrolling their own keys in UEFI db or MOK to also
sign policies, and enroll them. This makes it sensible to enable
IPE in general purpose distributions, as it becomes usable by
any user wishing to do so. Keys in these keyrings can already
load kernels and kernel modules, so there is no security
downgrade.

Add a kconfig each, like dm-verity does, but default to enabled if
the dependencies are available.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
[FW: fixed some style issues]
Signed-off-by: Fan Wu <wufan@kernel.org>

commit | commitdiff | tree

Yu Kuai [Wed, 9 Oct 2024 01:49:14 +0000 (09:49 +0800)]

md/raid10: fix null ptr dereference in raid10_size()

In raid10_run() if raid10_set_queue_limits() succeed, the return value
is set to zero, and if following procedures failed raid10_run() will
return zero while mddev->private is still NULL, causing null ptr
dereference in raid10_size().

Fix the problem by only overwrite the return value if
raid10_set_queue_limits() failed.

Fixes: 3d8466ba68d4 ("md/raid10: use the atomic queue limit update APIs")
Cc: stable@vger.kernel.org
Reported-and-tested-by: ValdikSS <iam@valdikss.org.ru>
Closes: https://lore.kernel.org/all/0dd96820-fe52-4841-bc58-dbf14d6bfcc8@valdikss.org.ru/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241009014914.1682037-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu <song@kernel.org>

commit | commitdiff | tree

Luca Boccassi [Wed, 25 Sep 2024 21:01:34 +0000 (23:01 +0200)]

ipe: also reject policy updates with the same version

Currently IPE accepts an update that has the same version as the policy
being updated, but it doesn't make it a no-op nor it checks that the
old and new policyes are the same. So it is possible to change the
content of a policy, without changing its version. This is very
confusing from userspace when managing policies.
Instead change the update logic to reject updates that have the same
version with ESTALE, as that is much clearer and intuitive behaviour.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Fan Wu <wufan@kernel.org>

commit | commitdiff | tree

Luca Boccassi [Wed, 25 Sep 2024 21:01:33 +0000 (23:01 +0200)]

ipe: return -ESTALE instead of -EINVAL on update when new policy has a lower version

When loading policies in userspace we want a recognizable error when an
update attempts to use an old policy, as that is an error that needs
to be treated differently from an invalid policy. Use -ESTALE as it is
clear enough for an update mechanism.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Fan Wu <wufan@kernel.org>

commit | commitdiff | tree

Li Nan [Thu, 19 Sep 2024 06:30:48 +0000 (14:30 +0800)]

md: ensure child flush IO does not affect origin bio->bi_status

When a flush is issued to an RAID array, a child flush IO is created and
issued for each member disk in the RAID array. Since commit b75197e86e6d
("md: Remove flush handling"), each child flush IO has been chained with
the original bio. As a result, the failure of any child IO could modify
the bi_status of the original bio, potentially impacting the upper-layer
filesystem.

Fix the issue by preventing child flush IO from altering the original
bio->bi_status as before. However, this design introduces a known
issue: in the event of a power failure, if a flush IO on a member
disk fails, the upper layers may not be informed. This issue is not easy
to fix and will not be addressed for the time being in this issue.

Fixes: b75197e86e6d ("md: Remove flush handling")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20240919063048.2887579-1-linan666@huaweicloud.com
Signed-off-by: Song Liu <song@kernel.org>

commit | commitdiff | tree

Nilay Shroff [Wed, 16 Oct 2024 03:03:16 +0000 (08:33 +0530)]

nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function

We no more need acquiring ctrl->lock before accessing the
NVMe controller state and instead we can now use the helper
nvme_ctrl_state. So replace the use of ctrl->lock from
nvme_keep_alive_finish function with nvme_ctrl_state call.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>

commit | commitdiff | tree

Nilay Shroff [Wed, 16 Oct 2024 03:03:15 +0000 (08:33 +0530)]

nvme: make keep-alive synchronous operation

The nvme keep-alive operation, which executes at a periodic interval,
could potentially sneak in while shutting down a fabric controller.
This may lead to a race between the fabric controller admin queue
destroy code path (invoked while shutting down controller) and hw/hctx
queue dispatcher called from the nvme keep-alive async request queuing
operation. This race could lead to the kernel crash shown below:

Call Trace:
    autoremove_wake_function+0x0/0xbc (unreliable)
    __blk_mq_sched_dispatch_requests+0x114/0x24c
    blk_mq_sched_dispatch_requests+0x44/0x84
    blk_mq_run_hw_queue+0x140/0x220
    nvme_keep_alive_work+0xc8/0x19c [nvme_core]
    process_one_work+0x200/0x4e0
    worker_thread+0x340/0x504
    kthread+0x138/0x140
    start_kernel_thread+0x14/0x18

While shutting down fabric controller, if nvme keep-alive request sneaks
in then it would be flushed off. The nvme_keep_alive_end_io function is
then invoked to handle the end of the keep-alive operation which
decrements the admin->q_usage_counter and assuming this is the last/only
request in the admin queue then the admin->q_usage_counter becomes zero.
If that happens then blk-mq destroy queue operation (blk_mq_destroy_
queue()) which could be potentially running simultaneously on another
cpu (as this is the controller shutdown code path) would forward
progress and deletes the admin queue. So, now from this point onward
we are not supposed to access the admin queue resources. However the
issue here's that the nvme keep-alive thread running hw/hctx queue
dispatch operation hasn't yet finished its work and so it could still
potentially access the admin queue resource while the admin queue had
been already deleted and that causes the above crash.

This fix helps avoid the observed crash by implementing keep-alive as a
synchronous operation so that we decrement admin->q_usage_counter only
after keep-alive command finished its execution and returns the command
status back up to its caller (blk_execute_rq()). This would ensure that
fabric shutdown code path doesn't destroy the fabric admin queue until
keep-alive request finished execution and also keep-alive thread is not
running hw/hctx queue dispatch operation.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>

commit | commitdiff | tree

Nilay Shroff [Wed, 16 Oct 2024 03:03:14 +0000 (08:33 +0530)]

nvme-loop: flush off pending I/O while shutting down loop controller

While shutting down loop controller, we first quiesce the admin/IO queue,
delete the admin/IO tag-set and then at last destroy the admin/IO queue.
However it's quite possible that during the window between quiescing and
destroying of the admin/IO queue, some admin/IO request might sneak in
and if that happens then we could potentially encounter a hung task
because shutdown operation can't forward progress until any pending I/O
is flushed off.

This commit helps ensure that before destroying the admin/IO queue, we
unquiesce the admin/IO queue so that any outstanding requests, which are
added after the admin/IO queue is quiesced, are now flushed to its
completion.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>

commit | commitdiff | tree

Daniel Borkmann [Wed, 16 Oct 2024 13:49:13 +0000 (15:49 +0200)]

selftests/bpf: Add test case for delta propagation

Add a small BPF verifier test case to ensure that alu32 additions to
registers are not subject to linked scalar delta tracking.

  # ./vmtest.sh -- ./test_progs -t verifier_linked_scalars
  [...]
  ./test_progs -t verifier_linked_scalars
  [    1.413138] tsc: Refined TSC clocksource calibration: 3407.993 MHz
  [    1.413524] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcd52370, max_idle_ns: 440795242006 ns
  [    1.414223] clocksource: Switched to clocksource tsc
  [    1.419640] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.420025] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #500/1   verifier_linked_scalars/scalars: find linked scalars:OK
  #500     verifier_linked_scalars:OK
  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
  [    1.590858] ACPI: PM: Preparing to enter system sleep state S5
  [    1.591402] reboot: Power down
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20241016134913.32249-3-daniel@iogearbox.net

commit | commitdiff | tree

Daniel Borkmann [Wed, 16 Oct 2024 13:49:12 +0000 (15:49 +0200)]

bpf: Fix print_reg_state's constant scalar dump

print_reg_state() should not consider adding reg->off to reg->var_off.value
when dumping scalars. Scalars can be produced with reg->off != 0 through
BPF_ADD_CONST, and thus as-is this can skew the register log dump.

Fixes: 98d7ca374ba4 ("bpf: Track delta between "linked" registers.")
Reported-by: Nathaniel Theis <nathaniel.theis@nccgroup.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241016134913.32249-2-daniel@iogearbox.net

commit | commitdiff | tree

Daniel Borkmann [Wed, 16 Oct 2024 13:49:11 +0000 (15:49 +0200)]

bpf: Fix incorrect delta propagation between linked registers

Nathaniel reported a bug in the linked scalar delta tracking, which can lead
to accepting a program with OOB access. The specific code is related to the
sync_linked_regs() function and the BPF_ADD_CONST flag, which signifies a
constant offset between two scalar registers tracked by the same register id.

The verifier attempts to track "similar" scalars in order to propagate bounds
information learned about one scalar to others. For instance, if r1 and r2
are known to contain the same value, then upon encountering 'if (r1 != 0x1234)
goto xyz', not only does it know that r1 is equal to 0x1234 on the path where
that conditional jump is not taken, it also knows that r2 is.

Additionally, with env->bpf_capable set, the verifier will track scalars
which should be a constant delta apart (if r1 is known to be one greater than
r2, then if r1 is known to be equal to 0x1234, r2 must be equal to 0x1233.)
The code path for the latter in adjust_reg_min_max_vals() is reached when
processing both 32 and 64-bit addition operations. While adjust_reg_min_max_vals()
knows whether dst_reg was produced by a 32 or a 64-bit addition (based on the
alu32 bool), the only information saved in dst_reg is the id of the source
register (reg->id, or'ed by BPF_ADD_CONST) and the value of the constant
offset (reg->off).

Later, the function sync_linked_regs() will attempt to use this information
to propagate bounds information from one register (known_reg) to others,
meaning, for all R in linked_regs, it copies known_reg range (and possibly
adjusting delta) into R for the case of R->id == known_reg->id.

For the delta adjustment, meaning, matching reg->id with BPF_ADD_CONST, the
verifier adjusts the register as reg = known_reg; reg += delta where delta
is computed as (s32)reg->off - (s32)known_reg->off and placed as a scalar
into a fake_reg to then simulate the addition of reg += fake_reg. This is
only correct, however, if the value in reg was created by a 64-bit addition.
When reg contains the result of a 32-bit addition operation, its upper 32
bits will always be zero. sync_linked_regs() on the other hand, may cause
the verifier to believe that the addition between fake_reg and reg overflows
into those upper bits. For example, if reg was generated by adding the
constant 1 to known_reg using a 32-bit alu operation, then reg->off is 1
and known_reg->off is 0. If known_reg is known to be the constant 0xFFFFFFFF,
sync_linked_regs() will tell the verifier that reg is equal to the constant
0x100000000. This is incorrect as the actual value of reg will be 0, as the
32-bit addition will wrap around.

Example:

  0: (b7) r0 = 0;             R0_w=0
  1: (18) r1 = 0x80000001;    R1_w=0x80000001
  3: (37) r1 /= 1;            R1_w=scalar()
  4: (bf) r2 = r1;            R1_w=scalar(id=1) R2_w=scalar(id=1)
  5: (bf) r4 = r1;            R1_w=scalar(id=1) R4_w=scalar(id=1)
  6: (04) w2 += 2147483647;   R2_w=scalar(id=1+2147483647,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  7: (04) w4 += 0 ;           R4_w=scalar(id=1+0,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  8: (15) if r2 == 0x0 goto pc+1
10: R0=0 R1=0xffffffff80000001 R2=0x7fffffff R4=0xffffffff80000001 R10=fp0

What can be seen here is that r1 is copied to r2 and r4, such that {r1,r2,r4}.id
are all the same which later lets sync_linked_regs() to be invoked. Then, in
a next step constants are added with alu32 to r2 and r4, setting their ->off,
as well as id |= BPF_ADD_CONST. Next, the conditional will bind r2 and
propagate ranges to its linked registers. The verifier now believes the upper
32 bits of r4 are r4=0xffffffff80000001, while actually r4=r1=0x80000001.

One approach for a simple fix suitable also for stable is to limit the constant
delta tracking to only 64-bit alu addition. If necessary at some later point,
BPF_ADD_CONST could be split into BPF_ADD_CONST64 and BPF_ADD_CONST32 to avoid
mixing the two under the tradeoff to further complicate sync_linked_regs().
However, none of the added tests from dedf56d775c0 ("selftests/bpf: Add tests
for add_const") make this necessary at this point, meaning, BPF CI also passes
with just limiting tracking to 64-bit alu addition.

Fixes: 98d7ca374ba4 ("bpf: Track delta between "linked" registers.")
Reported-by: Nathaniel Theis <nathaniel.theis@nccgroup.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20241016134913.32249-1-daniel@iogearbox.net

commit | commitdiff | tree

Jordan Rome [Wed, 16 Oct 2024 21:00:48 +0000 (14:00 -0700)]

bpf: Properly test iter/task tid filtering

Previously test_task_tid was setting `linfo.task.tid`
to `getpid()` which is the same as `gettid()` for the
parent process. Instead create a new child thread
and set `linfo.task.tid` to `gettid()` to make sure
the tid filtering logic is working as expected.

Signed-off-by: Jordan Rome <linux@jordanrome.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241016210048.1213935-2-linux@jordanrome.com

commit | commitdiff | tree

Jordan Rome [Wed, 16 Oct 2024 21:00:47 +0000 (14:00 -0700)]

bpf: Fix iter/task tid filtering

In userspace, you can add a tid filter by setting
the "task.tid" field for "bpf_iter_link_info".
However, `get_pid_task` when called for the
`BPF_TASK_ITER_TID` type should have been using
`PIDTYPE_PID` (tid) instead of `PIDTYPE_TGID` (pid).

Fixes: f0d74c4da1f0 ("bpf: Parameterize task iterators.")
Signed-off-by: Jordan Rome <linux@jordanrome.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241016210048.1213935-1-linux@jordanrome.com

commit | commitdiff | tree

Maurizio Lombardi [Tue, 15 Oct 2024 11:21:00 +0000 (13:21 +0200)]

nvme-pci: fix race condition between reset and nvme_dev_disable()

nvme_dev_disable() modifies the dev->online_queues field, therefore
nvme_pci_update_nr_queues() should avoid racing against it, otherwise
we could end up passing invalid values to blk_mq_update_nr_hw_queues().

WARNING: CPU: 39 PID: 61303 at drivers/pci/msi/api.c:347
          pci_irq_get_affinity+0x187/0x210
Workqueue: nvme-reset-wq nvme_reset_work [nvme]
RIP: 0010:pci_irq_get_affinity+0x187/0x210
Call Trace:
  <TASK>
  ? blk_mq_pci_map_queues+0x87/0x3c0
  ? pci_irq_get_affinity+0x187/0x210
  blk_mq_pci_map_queues+0x87/0x3c0
  nvme_pci_map_queues+0x189/0x460 [nvme]
  blk_mq_update_nr_hw_queues+0x2a/0x40
  nvme_reset_work+0x1be/0x2a0 [nvme]

Fix the bug by locking the shutdown_lock mutex before using
dev->online_queues. Give up if nvme_dev_disable() is running or if
it has been executed already.

Fixes: 949928c1c731 ("NVMe: Fix possible queue use after freed")
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>

commit | commitdiff | tree

Linus Torvalds [Thu, 17 Oct 2024 16:51:03 +0000 (09:51 -0700)]

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

- Disable software tag-based KASAN when compiling with GCC, as
   functions are incorrectly instrumented leading to a crash early
   during boot

- Fix pkey configuration for kernel threads when POE is enabled

- Fix invalid memory accesses in uprobes when targetting load-literal
   instructions

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  kasan: Disable Software Tag-Based KASAN with GCC
  Documentation/protection-keys: add AArch64 to documentation
  arm64: set POR_EL0 for kernel threads
  arm64: probes: Fix uprobes for big-endian kernels
  arm64: probes: Fix simulate_ldr*_literal()
  arm64: probes: Remove broken LDR (literal) uprobe support

commit | commitdiff | tree

Linus Torvalds [Thu, 17 Oct 2024 16:43:36 +0000 (09:43 -0700)]

Merge tag 'arm-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"Most of the fixes this time are for platform specific drivers,
  addressing issues found through build testing on freescale, ep93xx,
  starfive, and npcm platforms, as as well as the ffa firmware.

  The fixes for the scmi firmware driver address compatibility problems
  found on broadcom machines.

  There are only two devicetree fixes, addressing incorrect in
  configuration on broadcom and marvell machines.

  The changes to the Documentation and MAINTAINERS files are for
  clarification only"

* tag 'arm-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  firmware: arm_ffa: Avoid string-fortify warning caused by memcpy()
  firmware: arm_scmi: Queue in scmi layer for mailbox implementation
  firmware: arm_ffa: Avoid string-fortify warning in export_uuid()
  firmware: arm_scmi: Give SMC transport precedence over mailbox
  firmware: arm_scmi: Fix the double free in scmi_debugfs_common_setup()
  Documentation/process: maintainer-soc: clarify submitting patches
  dmaengine: cirrus: check that output may be truncated
  dmaengine: cirrus: ERR_CAST() ioremap error
  MAINTAINERS: use the canonical soc mailing list address and mark it as L:
  ARM: dts: bcm2837-rpi-cm3-io3: Fix HDMI hpd-gpio pin
  arm64: dts: marvell: cn9130-sr-som: fix cp0 mdio pin numbers
  soc: fsl: cpm1: qmc: Fix unused data compilation warning
  soc: fsl: cpm1: qmc: Do not use IS_ERR_VALUE() on error pointers
  reset: starfive: jh71x0: Fix accessing the empty member on JH7110 SoC
  reset: npcm: convert comma to semicolon

commit | commitdiff | tree

Linus Torvalds [Thu, 17 Oct 2024 16:36:59 +0000 (09:36 -0700)]

Merge tag 'sound-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of small fixes, nothing really stands out:

   - Usual HD-audio quirks / device-specific fixes

   - Kconfig dependency fix for UM

   - A series of minor fixes for SoundWire

   - Updates of USB-audio LINE6 contact address"

* tag 'sound-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
  ALSA/hda: intel-sdw-acpi: add support for sdw-manager-list property read
  ALSA/hda: intel-sdw-acpi: simplify sdw-master-count property read
  ALSA/hda: intel-sdw-acpi: fetch fwnode once in sdw_intel_scan_controller()
  ALSA/hda: intel-sdw-acpi: cleanup sdw_intel_scan_controller
  ALSA: hda/tas2781: Add new quirk for Lenovo, ASUS, Dell projects
  ALSA: scarlett2: Add error check after retrieving PEQ filter values
  ALSA: hda/cs8409: Fix possible NULL dereference
  sound: Make CONFIG_SND depend on INDIRECT_IOMEM instead of UML
  ALSA: line6: update contact information
  ALSA: usb-audio: Fix NULL pointer deref in snd_usb_power_domain_set()
  ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
  ALSA: hda: Sound support for HP Spectre x360 16 inch model 2024

commit | commitdiff | tree

Linus Torvalds [Thu, 17 Oct 2024 16:31:18 +0000 (09:31 -0700)]

Merge tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Current release - new code bugs:

   - eth: mlx5: HWS, don't destroy more bwc queue locks than allocated

  Previous releases - regressions:

   - ipv4: give an IPv4 dev to blackhole_netdev

   - udp: compute L4 checksum as usual when not segmenting the skb

   - tcp/dccp: don't use timer_pending() in reqsk_queue_unlink().

   - eth: mlx5e: don't call cleanup on profile rollback failure

   - eth: microchip: vcap api: fix memory leaks in
     vcap_api_encode_rule_test()

   - eth: enetc: disable Tx BD rings after they are empty

   - eth: macb: avoid 20s boot delay by skipping MDIO bus registration
     for fixed-link PHY

  Previous releases - always broken:

   - posix-clock: fix missing timespec64 check in pc_clock_settime()

   - genetlink: hold RCU in genlmsg_mcast()

   - mptcp: prevent MPC handshake on port-based signal endpoints

   - eth: vmxnet3: fix packet corruption in vmxnet3_xdp_xmit_frame

   - eth: stmmac: dwmac-tegra: fix link bring-up sequence

   - eth: bcmasp: fix potential memory leak in bcmasp_xmit()

  Misc:

   - add Andrew Lunn as a co-maintainer of all networking drivers"

* tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  net/mlx5e: Don't call cleanup on profile rollback failure
  net/mlx5: Unregister notifier on eswitch init failure
  net/mlx5: Fix command bitmask initialization
  net/mlx5: Check for invalid vector index on EQ creation
  net/mlx5: HWS, use lock classes for bwc locks
  net/mlx5: HWS, don't destroy more bwc queue locks than allocated
  net/mlx5: HWS, fixed double free in error flow of definer layout
  net/mlx5: HWS, removed wrong access to a number of rules variable
  mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
  net: ethernet: mtk_eth_soc: fix memory corruption during fq dma init
  vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame
  net: dsa: vsc73xx: fix reception from VLAN-unaware bridges
  net: ravb: Only advertise Rx/Tx timestamps if hardware supports it
  net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test()
  net: phy: mdio-bcm-unimac: Add BCM6846 support
  dt-bindings: net: brcm,unimac-mdio: Add bcm6846-mdio
  udp: Compute L4 checksum as usual when not segmenting the skb
  genetlink: hold RCU in genlmsg_mcast()
  net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361
  tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().
  ...

commit | commitdiff | tree

Sean Anderson [Thu, 17 Oct 2024 15:10:55 +0000 (11:10 -0400)]

dma-mapping: fix tracing dma_alloc/free with vmalloc'd memory

Not all virtual addresses have physical addresses, such as if they were
vmalloc'd. Just trace the virtual address instead of trying to trace a
physical address. This aligns with the API, and is good enough to
associate dma_alloc with dma_free.

Fixes: 038eb433dc14 ("dma-mapping: add tracing for dma-mapping API calls")
Reported-by: syzbot+b4bfacdec173efaa8567@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/670ebde5.050a0220.d9b66.0154.GAE@google.com/
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Lorenzo Stoakes [Mon, 7 Oct 2024 15:28:33 +0000 (16:28 +0100)]

maple_tree: add regression test for spanning store bug

Add a regression test to assert that, when performing a spanning store
which consumes the entirety of the rightmost right leaf node does not
result in maple tree corruption when doing so.

This achieves this by building a test tree of 3 levels and establishing a
store which ultimately results in a spanned store of this nature.

Link: https://lkml.kernel.org/r/30cdc101a700d16e03ba2f9aa5d83f2efa894168.1728314403.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Bert Karwatzki <spasswolf@web.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

commit | commitdiff | tree

Lorenzo Stoakes [Mon, 7 Oct 2024 15:28:32 +0000 (16:28 +0100)]

maple_tree: correct tree corruption on spanning store

Patch series "maple_tree: correct tree corruption on spanning store", v3.

There has been a nasty yet subtle maple tree corruption bug that appears
to have been in existence since the inception of the algorithm.

This bug seems far more likely to happen since commit f8d112a4e657
("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
at which reports started to be submitted concerning this bug.

We were made definitely aware of the bug thanks to the kind efforts of
Bert Karwatzki who helped enormously in my being able to track this down
and identify the cause of it.

The bug arises when an attempt is made to perform a spanning store across
two leaf nodes, where the right leaf node is the rightmost child of the
shared parent, AND the store completely consumes the right-mode node.

This results in mas_wr_spanning_store() mitakenly duplicating the new and
existing entries at the maximum pivot within the range, and thus maple
tree corruption.

The fix patch corrects this by detecting this scenario and disallowing the
mistaken duplicate copy.

The fix patch commit message goes into great detail as to how this occurs.

This series also includes a test which reliably reproduces the issue, and
asserts that the fix works correctly.

Bert has kindly tested the fix and confirmed it resolved his issues.  Also
Mikhail Gavrilov kindly reported what appears to be precisely the same
bug, which this fix should also resolve.

This patch (of 2):

There has been a subtle bug present in the maple tree implementation from
its inception.

This arises from how stores are performed - when a store occurs, it will
overwrite overlapping ranges and adjust the tree as necessary to
accommodate this.

A range may always ultimately span two leaf nodes.  In this instance we
walk the two leaf nodes, determine which elements are not overwritten to
the left and to the right of the start and end of the ranges respectively
and then rebalance the tree to contain these entries and the newly
inserted one.

This kind of store is dubbed a 'spanning store' and is implemented by
mas_wr_spanning_store().

In order to reach this stage, mas_store_gfp() invokes
mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
walk the tree and update the object (mas) to traverse to the location
where the write should be performed, determining its store type.

When a spanning store is required, this function returns false stopping at
the parent node which contains the target range, and mas_wr_store_type()
marks the mas->store_type as wr_spanning_store to denote this fact.

When we go to perform the store in mas_wr_spanning_store(), we first
determine the elements AFTER the END of the range we wish to store (that
is, to the right of the entry to be inserted) - we do this by walking to
the NEXT pivot in the tree (i.e.  r_mas.last + 1), starting at the node we
have just determined contains the range over which we intend to write.

We then turn our attention to the entries to the left of the entry we are
inserting, whose state is represented by l_mas, and copy these into a 'big
node', which is a special node which contains enough slots to contain two
leaf node's worth of data.

We then copy the entry we wish to store immediately after this - the copy
and the insertion of the new entry is performed by mas_store_b_node().

After this we copy the elements to the right of the end of the range which
we are inserting, if we have not exceeded the length of the node (i.e.
r_mas.offset <= r_mas.end).

Herein lies the bug - under very specific circumstances, this logic can
break and corrupt the maple tree.

Consider the following tree:

Height
  0                             Root Node
                                 /      \
                 pivot = 0xffff /        \ pivot = ULONG_MAX
                               /          \
  1                       A [-----]       ...
                             /   \
             pivot = 0x4fff /     \ pivot = 0xffff
                           /       \
  2 (LEAVES)          B [-----]  [-----] C
                                      ^--- Last pivot 0xffff.

Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
that all ranges expressed in maple tree code are inclusive):

1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
   determines that this is a spanning store across nodes B and C. The mas
   state is set such that the current node from which we traverse further
   is node A.

2. In mas_wr_spanning_store() we try to find elements to the right of pivot
   0xffff by searching for an index of 0x10000:

    - mas_wr_walk_index() invokes mas_wr_walk_descend() and
      mas_wr_node_walk() in turn.

        - mas_wr_node_walk() loops over entries in node A until EITHER it
          finds an entry whose pivot equals or exceeds 0x10000 OR it
          reaches the final entry.

        - Since no entry has a pivot equal to or exceeding 0x10000, pivot
          0xffff is selected, leading to node C.

    - mas_wr_walk_traverse() resets the mas state to traverse node C. We
      loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
      in turn once again.

         - Again, we reach the last entry in node C, which has a pivot of
           0xffff.

3. We then copy the elements to the left of 0x4000 in node B to the big
   node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
   too.

4. We determine whether we have any entries to copy from the right of the
   end of the range via - and with r_mas set up at the entry at pivot
   0xffff, r_mas.offset <= r_mas.end, and then we DUPLICATE the entry at
   pivot 0xffff.

5. BUG! The maple tree is corrupted with a duplicate entry.

This requires a very specific set of circumstances - we must be spanning
the last element in a leaf node, which is the last element in the parent
node.

spanning store across two leaf nodes with a range that ends at that shared
pivot.

A potential solution to this problem would simply be to reset the walk
each time we traverse r_mas, however given the rarity of this situation it
seems that would be rather inefficient.

Instead, this patch detects if the right hand node is populated, i.e.  has
anything we need to copy.

We do so by only copying elements from the right of the entry being
inserted when the maximum value present exceeds the last, rather than
basing this on offset position.

The patch also updates some comments and eliminates the unused bool return
value in mas_wr_walk_index().

The work performed in commit f8d112a4e657 ("mm/mmap: avoid zeroing vma
tree in mmap_region()") seems to have made the probability of this event
much more likely, which is the point at which reports started to be
submitted concerning this bug.

The motivation for this change arose from Bert Karwatzki's report of
encountering mm instability after the release of kernel v6.12-rc1 which,
after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
options, was identified as maple tree corruption.

After Bert very generously provided his time and ability to reproduce this
event consistently, I was able to finally identify that the issue
discussed in this commit message was occurring for him.

Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.1728314403.git.lorenzo.stoakes@oracle.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf@web.de>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7-+Dg@mail.gmail.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

commit | commitdiff | tree

Andrea Parri [Thu, 17 Oct 2024 14:36:28 +0000 (17:36 +0300)]

riscv, bpf: Make BPF_CMPXCHG fully ordered

According to the prototype formal BPF memory consistency model
discussed e.g. in [1] and following the ordering properties of
the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
with the BPF_CMPXCHG modifier is fully ordered.  However, the
current RISC-V JIT lowerings fail to meet such memory ordering
property.  This is illustrated by the following litmus test:

BPF BPF__MP+success_cmpxchg+fence
{
0:r1=x; 0:r3=y; 0:r5=1;
1:r2=y; 1:r4=f; 1:r7=x;
}
P0                               | P1                                         ;
*(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
                                  | r6 = *(u64 *)(r7 + 0)                      ;
exists (1:r1=1 /\ 1:r6=0)

whose "exists" clause is not satisfiable according to the BPF
memory model.  Using the current RISC-V JIT lowerings, the test
can be mapped to the following RISC-V litmus test:

RISCV RISCV__MP+success_cmpxchg+fence
{
0:x1=x; 0:x3=y; 0:x5=1;
1:x2=y; 1:x4=f; 1:x7=x;
}
P0                 | P1                          ;
sd x5, 0(x1)       | ld x1, 0(x2)                ;
L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
bne x2, x4, L01    |                             ;
sc.d x6, x5, 0(x3) |                             ;
bne x6, x4, L00    |                             ;
fence rw, rw       |                             ;
L01:               |                             ;
exists (1:x1=1 /\ 1:x6=0)

where the two stores in P0 can be reordered.  Update the RISC-V
JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
RELEASE ("rl") annotation in order to meet the expected memory
ordering guarantees.  The resulting RISC-V JIT lowerings of
BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().

Other lowerings were fixed via 20a759df3bba ("riscv, bpf: make
some atomic operations fully ordered").

Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf
Link: https://lore.kernel.org/bpf/20241017143628.2673894-1-parri.andrea@gmail.com

commit | commitdiff | tree

Jens Axboe [Thu, 17 Oct 2024 14:31:56 +0000 (08:31 -0600)]

io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work

When the sqpoll is exiting and cancels pending work items, it may need
to run task_work. If this happens from within io_uring_cancel_generic(),
then it may be under waiting for the io_uring_task waitqueue. This
results in the below splat from the scheduler, as the ring mutex may be
attempted grabbed while in a TASK_INTERRUPTIBLE state.

Ensure that the task state is set appropriately for that, just like what
is done for the other cases in io_run_task_work().

do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000029387fd2>] prepare_to_wait+0x88/0x2fc
WARNING: CPU: 6 PID: 59939 at kernel/sched/core.c:8561 __might_sleep+0xf4/0x140
Modules linked in:
CPU: 6 UID: 0 PID: 59939 Comm: iou-sqp-59938 Not tainted 6.12.0-rc3-00113-g8d020023b155 #7456
Hardware name: linux,dummy-virt (DT)
pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : __might_sleep+0xf4/0x140
lr : __might_sleep+0xf4/0x140
sp : ffff80008c5e7830
x29: ffff80008c5e7830 x28: ffff0000d93088c0 x27: ffff60001c2d7230
x26: dfff800000000000 x25: ffff0000e16b9180 x24: ffff80008c5e7a50
x23: 1ffff000118bcf4a x22: ffff0000e16b9180 x21: ffff0000e16b9180
x20: 000000000000011b x19: ffff80008310fac0 x18: 1ffff000118bcd90
x17: 30303c5b20746120 x16: 74657320313d6574 x15: 0720072007200720
x14: 0720072007200720 x13: 0720072007200720 x12: ffff600036c64f0b
x11: 1fffe00036c64f0a x10: ffff600036c64f0a x9 : dfff800000000000
x8 : 00009fffc939b0f6 x7 : ffff0001b6327853 x6 : 0000000000000001
x5 : ffff0001b6327850 x4 : ffff600036c64f0b x3 : ffff8000803c35bc
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000e16b9180
Call trace:
__might_sleep+0xf4/0x140
mutex_lock+0x84/0x124
io_handle_tw_list+0xf4/0x260
tctx_task_work_run+0x94/0x340
io_run_task_work+0x1ec/0x3c0
io_uring_cancel_generic+0x364/0x524
io_sq_thread+0x820/0x124c
ret_from_fork+0x10/0x20

Cc: stable@vger.kernel.org
Fixes: af5d68f8892f ("io_uring/sqpoll: manage task_work privately")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Daniele Palmas [Thu, 3 Oct 2024 09:38:08 +0000 (11:38 +0200)]

USB: serial: option: add Telit FN920C04 MBIM compositions

Add the following Telit FN920C04 compositions:

0x10a2: MBIM + tty (AT/NMEA) + tty (AT) + tty (diag)
T:  Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=10a2 Rev=05.15
S:  Manufacturer=Telit Cinterion
S:  Product=FN920
S:  SerialNumber=92c4c4d8
C:  #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

0x10a7: MBIM + tty (AT) + tty (AT) + tty (diag)
T:  Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 18 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=10a7 Rev=05.15
S:  Manufacturer=Telit Cinterion
S:  Product=FN920
S:  SerialNumber=92c4c4d8
C:  #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

0x10aa: MBIM + tty (AT) + tty (diag) + DPL (data packet logging) + adb
T:  Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 15 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=10aa Rev=05.15
S:  Manufacturer=Telit Cinterion
S:  Product=FN920
S:  SerialNumber=92c4c4d8
C:  #Ifs= 6 Cfg#= 1 Atr=e0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 4 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>

commit | commitdiff | tree

Benjamin B. Frost [Wed, 11 Sep 2024 08:54:05 +0000 (10:54 +0200)]

USB: serial: option: add support for Quectel EG916Q-GL

Add Quectel EM916Q-GL with product ID 0x6007

T:  Bus=01 Lev=02 Prnt=02 Port=01 Cnt=01 Dev#=  3 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=6007 Rev= 2.00
S:  Manufacturer=Quectel
S:  Product=EG916Q-GL
C:* #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=200mA
A:  FirstIf#= 4 IfCount= 2 Cls=02(comm.) Sub=06 Prot=00
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=82(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=84(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=86(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether
E:  Ad=88(I) Atr=03(Int.) MxPS=  32 Ivl=32ms
I:  If#= 5 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
I:* If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

MI_00 Quectel USB Diag Port
MI_01 Quectel USB NMEA Port
MI_02 Quectel USB AT Port
MI_03 Quectel USB Modem Port
MI_04 Quectel USB Net Port

Signed-off-by: Benjamin B. Frost <benjamin@geanix.com>
Reviewed-by: Lars Melin <larsm17@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>

commit | commitdiff | tree

Kalle Valo [Thu, 17 Oct 2024 14:25:37 +0000 (17:25 +0300)]

Merge tag 'ath-current-20241016' of git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath

ath.git patches for v6.12-rc4

Fix two instances of memory leaks, one in ath10k and one in ath11k.

commit | commitdiff | tree

Bitterblue Smith [Thu, 10 Oct 2024 15:34:43 +0000 (18:34 +0300)]

wifi: rtlwifi: rtl8192du: Don't claim USB ID 0bda:8171

This ID appears to be RTL8188SU, not RTL8192DU. This is the wrong driver
for RTL8188SU. The r8712u driver from staging handles this ID.

I think this ID comes from the original rtl8192du driver from Realtek.
I don't know if they added it by mistake, or it was actually used for
two different chips.

RTL8188SU with this ID exists in the wild. RTL8192DU with this ID
probably doesn't.

Fixes: b5dc8873b6ff ("wifi: rtlwifi: Add rtl8192du/sw.c")
Cc: stable@vger.kernel.org # v6.11
Closes: https://github.com/lwfinger/rtl8192du/issues/105
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/40245564-41fe-4a5e-881f-cd517255b20a@gmail.com

commit | commitdiff | tree

Bitterblue Smith [Tue, 8 Oct 2024 18:44:02 +0000 (21:44 +0300)]

wifi: rtw88: Fix the RX aggregation in USB 3 mode

RTL8822CU, RTL8822BU, and RTL8821CU don't need BIT_EN_PRE_CALC.
In fact, RTL8822BU in USB 3 mode doesn't pass all the frames to the
driver, resulting in much lower download speed than normal:

$ iperf3 -c 192.168.0.1 -R
Connecting to host 192.168.0.1, port 5201
Reverse mode, remote host 192.168.0.1 is sending
[  5] local 192.168.0.50 port 43062 connected to 192.168.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  26.9 MBytes   225 Mbits/sec
[  5]   1.00-2.00   sec  7.50 MBytes  62.9 Mbits/sec
[  5]   2.00-3.00   sec  8.50 MBytes  71.3 Mbits/sec
[  5]   3.00-4.00   sec  8.38 MBytes  70.3 Mbits/sec
[  5]   4.00-5.00   sec  7.75 MBytes  65.0 Mbits/sec
[  5]   5.00-6.00   sec  8.00 MBytes  67.1 Mbits/sec
[  5]   6.00-7.00   sec  8.00 MBytes  67.1 Mbits/sec
[  5]   7.00-8.00   sec  7.75 MBytes  65.0 Mbits/sec
[  5]   8.00-9.00   sec  7.88 MBytes  66.1 Mbits/sec
[  5]   9.00-10.00  sec  7.88 MBytes  66.1 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.02  sec   102 MBytes  85.1 Mbits/sec  224             sender
[  5]   0.00-10.00  sec  98.6 MBytes  82.7 Mbits/sec                  receiver

Don't set BIT_EN_PRE_CALC. Then the speed is much better:

% iperf3 -c 192.168.0.1 -R
Connecting to host 192.168.0.1, port 5201
Reverse mode, remote host 192.168.0.1 is sending
[  5] local 192.168.0.50 port 39000 connected to 192.168.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  52.8 MBytes   442 Mbits/sec
[  5]   1.00-2.00   sec  71.9 MBytes   603 Mbits/sec
[  5]   2.00-3.00   sec  74.8 MBytes   627 Mbits/sec
[  5]   3.00-4.00   sec  75.9 MBytes   636 Mbits/sec
[  5]   4.00-5.00   sec  76.0 MBytes   638 Mbits/sec
[  5]   5.00-6.00   sec  74.1 MBytes   622 Mbits/sec
[  5]   6.00-7.00   sec  74.0 MBytes   621 Mbits/sec
[  5]   7.00-8.00   sec  76.0 MBytes   638 Mbits/sec
[  5]   8.00-9.00   sec  74.4 MBytes   624 Mbits/sec
[  5]   9.00-10.00  sec  63.9 MBytes   536 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   717 MBytes   601 Mbits/sec   24             sender
[  5]   0.00-10.00  sec   714 MBytes   599 Mbits/sec                  receiver

Fixes: 002a5db9a52a ("wifi: rtw88: Enable USB RX aggregation for 8822c/8822b/8821c")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/afb94a82-3d18-459e-97fc-1a217608cdf0@gmail.com

commit | commitdiff | tree

Geert Uytterhoeven [Tue, 24 Sep 2024 12:09:32 +0000 (14:09 +0200)]

wifi: brcm80211: BRCM_TRACING should depend on TRACING

When tracing is disabled, there is no point in asking the user about
enabling Broadcom wireless device tracing.

Fixes: f5c4f10852d42012 ("brcm80211: Allow trace support to be enabled separately from debug")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/81a29b15eaacc1ac1fb421bdace9ac0c3385f40f.1727179742.git.geert@linux-m68k.org

commit | commitdiff | tree

Ping-Ke Shih [Tue, 24 Sep 2024 02:16:33 +0000 (10:16 +0800)]

wifi: rtw89: pci: early chips only enable 36-bit DMA on specific PCI hosts

The early chips including RTL8852A, RTL8851B, RTL8852B and RTL8852BT have
interoperability problems of 36-bit DMA with some PCI hosts. Rollback
to 32-bit DMA by default, and only enable 36-bit DMA for tested platforms.

Since all Intel platforms we have can work correctly, add the vendor ID to
white list. Otherwise, list vendor/device ID of bridge we have tested.

Fixes: 1fd4b3fe52ef ("wifi: rtw89: pci: support 36-bit PCI DMA address")
Reported-by: Marcel Weißenbach <mweissenbach@ignaz.org>
Closes: https://lore.kernel.org/linux-wireless/20240918073237.Horde.VLueh0_KaiDw-9asEEcdM84@ignaz.org/T/#m07c5694df1acb173a42e1a0bab7ac22bd231a2b8
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Tested-by: Marcel Weißenbach <mweissenbach@ignaz.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20240924021633.19861-1-pkshih@realtek.com

commit | commitdiff | tree

Naohiro Aota [Tue, 1 Oct 2024 08:03:32 +0000 (17:03 +0900)]

btrfs: zoned: fix zone unusable accounting for freed reserved extent

When btrfs reserves an extent and does not use it (e.g, by an error), it
calls btrfs_free_reserved_extent() to free the reserved extent. In the
process, it calls btrfs_add_free_space() and then it accounts the region
bytes as block_group->zone_unusable.

However, it leaves the space_info->bytes_zone_unusable side not updated. As
a result, ENOSPC can happen while a space_info reservation succeeded. The
reservation is fine because the freed region is not added in
space_info->bytes_zone_unusable, leaving that space as "free". OTOH,
corresponding block group counts it as zone_unusable and its allocation
pointer is not rewound, we cannot allocate an extent from that block group.
That will also negate space_info's async/sync reclaim process, and cause an
ENOSPC error from the extent allocation process.

Fix that by returning the space to space_info->bytes_zone_unusable.
Ideally, since a bio is not submitted for this reserved region, we should
return the space to free space and rewind the allocation pointer. But, it
needs rework on extent allocation handling, so let it work in this way for
now.

Fixes: 169e0da91a21 ("btrfs: zoned: track unusable bytes for zones")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Fri, 11 Oct 2024 19:10:01 +0000 (16:10 -0300)]

perf trace: The return from 'write' isn't a pid

When adding a explicit beautifier for the 'write' syscall when the BPF
based buffer collector was introduced there was a cut'n'paste error that
carried the syscall_fmt->errpid setting from a nearby syscall (waitid)
that returns a pid.

So the write return was being suppressed by the return pretty printer,
remove that field, reverting it back to the default return handler, that
prints positive numbers as-is and interpret negative values as errnos.

I actually introduced the problem while making Howard's original patch
work just with the 'write' syscall, as we couldn't just look for any
buffers, the ones that are filled in by the kernel couldn't use the same
sys_enter BPF collector.

Fixes: b257fac12f38d7f5 ("perf trace: Pretty print buffer data")
Reported-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/lkml/bcf50648-3c7e-4513-8717-0d14492c53b9@linaro.org
Link: https://lore.kernel.org/all/Zt8jTfzDYgBPvFCd@x1/#t
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Fri, 11 Oct 2024 18:20:17 +0000 (15:20 -0300)]

tools headers UAPI: Sync linux/const.h with the kernel headers

To pick up the changes in:

  947697c6f0f75f98 ("uapi: Define GENMASK_U128")

That causes no changes in tooling, just addresses this perf build
warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/lkml/ZwltGNJwujKu1Fgn@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

commit | commitdiff | tree

David Howells [Thu, 17 Oct 2024 10:58:52 +0000 (11:58 +0100)]

afs: Fix lock recursion

afs_wake_up_async_call() can incur lock recursion.  The problem is that it
is called from AF_RXRPC whilst holding the ->notify_lock, but it tries to
take a ref on the afs_call struct in order to pass it to a work queue - but
if the afs_call is already queued, we then have an extraneous ref that must
be put... calling afs_put_call() may call back down into AF_RXRPC through
rxrpc_kernel_shutdown_call(), however, which might try taking the
->notify_lock again.

This case isn't very common, however, so defer it to a workqueue.  The oops
looks something like:

  BUG: spinlock recursion on CPU#0, krxrpcio/7001/1646
   lock: 0xffff888141399b30, .magic: dead4ead, .owner: krxrpcio/7001/1646, .owner_cpu: 0
  CPU: 0 UID: 0 PID: 1646 Comm: krxrpcio/7001 Not tainted 6.12.0-rc2-build3+ #4351
  Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x47/0x70
   do_raw_spin_lock+0x3c/0x90
   rxrpc_kernel_shutdown_call+0x83/0xb0
   afs_put_call+0xd7/0x180
   rxrpc_notify_socket+0xa0/0x190
   rxrpc_input_split_jumbo+0x198/0x1d0
   rxrpc_input_data+0x14b/0x1e0
   ? rxrpc_input_call_packet+0xc2/0x1f0
   rxrpc_input_call_event+0xad/0x6b0
   rxrpc_input_packet_on_conn+0x1e1/0x210
   rxrpc_input_packet+0x3f2/0x4d0
   rxrpc_io_thread+0x243/0x410
   ? __pfx_rxrpc_io_thread+0x10/0x10
   kthread+0xcf/0xe0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x24/0x40
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
   </TASK>

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/1394602.1729162732@warthog.procyon.org.uk
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

commit | commitdiff | tree

Alessandro Zanni [Thu, 17 Oct 2024 12:05:51 +0000 (14:05 +0200)]

fs: Fix uninitialized value issue in from_kuid and from_kgid

ocfs2_setattr() uses attr->ia_mode, attr->ia_uid and attr->ia_gid in
a trace point even though ATTR_MODE, ATTR_UID and ATTR_GID aren't set.

Initialize all fields of newattrs to avoid uninitialized variables, by
checking if ATTR_MODE, ATTR_UID, ATTR_GID are initialized, otherwise 0.

Reported-by: syzbot+6c55f725d1bdc8c52058@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6c55f725d1bdc8c52058
Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Link: https://lore.kernel.org/r/20241017120553.55331-1-alessandro.zanni87@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>

commit | commitdiff | tree

Christian Brauner [Wed, 16 Oct 2024 17:49:48 +0000 (19:49 +0200)]

fs: don't try and remove empty rbtree node

When copying a namespace we won't have added the new copy into the
namespace rbtree until after the copy succeeded. Calling free_mnt_ns()
will try to remove the copy from the rbtree which is invalid. Simply
free the namespace skeleton directly.

Link: https://lore.kernel.org/r/20241016-adapter-seilwinde-83c508a7bde1@brauner
Fixes: 1901c92497bd ("fs: keep an index of current mount namespaces")
Tested-by: Brad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org # v6.11+
Reported-by: Brad Spengler <spender@grsecurity.net>
Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>

commit | commitdiff | tree

David Howells [Wed, 16 Oct 2024 16:28:33 +0000 (17:28 +0100)]

netfs: Downgrade i_rwsem for a buffered write

In the I/O locking code borrowed from NFS into netfslib, i_rwsem is held
locked across a buffered write - but this causes a performance regression
in cifs as it excludes buffered reads for the duration (cifs didn't use any
locking for buffered reads).

Mitigate this somewhat by downgrading the i_rwsem to a read lock across the
buffered write. This at least allows parallel reads to occur whilst
excluding other writes, DIO, truncate and setattr.

Note that this shouldn't be a problem for a buffered write as a read
through an mmap can circumvent i_rwsem anyway.

Also note that we might want to make this change in NFS also.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/1317958.1729096113@warthog.procyon.org.uk
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Trond Myklebust <trondmy@kernel.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

commit | commitdiff | tree

Florian Westphal [Thu, 10 Oct 2024 16:34:05 +0000 (18:34 +0200)]

netfilter: bpf: must hold reference on net namespace

BUG: KASAN: slab-use-after-free in __nf_unregister_net_hook+0x640/0x6b0
Read of size 8 at addr ffff8880106fe400 by task repro/72=
bpf_nf_link_release+0xda/0x1e0
bpf_link_free+0x139/0x2d0
bpf_link_release+0x68/0x80
__fput+0x414/0xb60

Eric says:
It seems that bpf was able to defer the __nf_unregister_net_hook()
after exit()/close() time.
Perhaps a netns reference is missing, because the netns has been
dismantled/freed already.
bpf_nf_link_attach() does :
link->net = net;
But I do not see a reference being taken on net.

Add such a reference and release it after hook unreg.
Note that I was unable to get syzbot reproducer to work, so I
do not know if this resolves this splat.

Fixes: 84601d6ee68a ("bpf: add bpf_link support for BPF_NETFILTER programs")
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Reported-by: Lai, Yi <yi1.lai@linux.intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

Kirill Marinushkin [Wed, 16 Oct 2024 21:58:10 +0000 (23:58 +0200)]

ASoC: Change my e-mail to gmail

Change my contact e-mail in pcm3060 driver and MAINTAINERS

Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com>
Cc: Kirill Marinushkin <kmarinushkin@birdec.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-sound@vger.kernel.org
Link: https://patch.msgid.link/20241016215810.1544222-1-k.marinushkin@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Derek Fang [Wed, 16 Oct 2024 03:07:03 +0000 (11:07 +0800)]

ASoC: Intel: soc-acpi: lnl: Add match entry for TM2 laptops

Add a new match table entry on Lunarlake for the TM2 laptops
with rt713 and rt1318.

Signed-off-by: Derek Fang <derek.fang@realtek.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20241016030703.13669-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Ilya Dudikov [Wed, 16 Oct 2024 03:40:37 +0000 (10:40 +0700)]

ASoC: amd: yc: Fix non-functional mic on ASUS E1404FA

ASUS Vivobook E1404FA needs a quirks-table entry for the internal microphone to function properly.

Signed-off-by: Ilya Dudikov <ilyadud@mail.ru>
Link: https://patch.msgid.link/20241016034038.13481-1-ilyadud25@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Ranjani Sridharan [Wed, 16 Oct 2024 03:29:10 +0000 (11:29 +0800)]

ASoC: SOF: Intel: hda: Always clean up link DMA during stop

This is required to reset the DMA read/write pointers when the stream is
prepared and restarted after a call to snd_pcm_drain()/snd_pcm_drop().
Also, now that the stream is reset during stop, do not save LLP registers
in the case of STOP/suspend to avoid erroneous delay reporting.

Link: https://github.com/thesofproject/sof/issues/9502
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
All: stable@vger.kernel.org # 6.10.x 6.11.x
Link: https://patch.msgid.link/20241016032910.14601-5-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Ranjani Sridharan [Wed, 16 Oct 2024 03:29:09 +0000 (11:29 +0800)]

soundwire: intel_ace2x: Send PDI stream number during prepare

In the case of a prepare callback after an xrun or when the PCM is
restarted after a call to snd_pcm_drain/snd_pcm_drop, avoid
reprogramming the SHIM registers but send the PDI stream number so that
the link DMA data can be set. This is needed for the case that the DMA
data is cleared when the PCM is stopped and restarted without being
closed.

Link: https://github.com/thesofproject/sof/issues/9502
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
All: stable@vger.kernel.org # 6.10.x 6.11.x
Link: https://patch.msgid.link/20241016032910.14601-4-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Ranjani Sridharan [Wed, 16 Oct 2024 03:29:08 +0000 (11:29 +0800)]

ASoC: SOF: Intel: hda: Handle prepare without close for non-HDA DAI's

When a PCM is restarted after a snd_pcm_drain/snd_pcm_drop(), the prepare
callback will be invoked and the hw_params will be set again. For the
HDA DAI's, the hw_params function handles this case already but not for
the non-HDA DAI's. So, add the check for link_prepared to verify if the
hw_params should be done again or not. Additionally, for SDW DAI's reset
the PCMSyCM registers as would be done in the case of a start after a
hw_free.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
All: stable@vger.kernel.org # 6.10.x 6.11.x
Link: https://patch.msgid.link/20241016032910.14601-3-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Ranjani Sridharan [Wed, 16 Oct 2024 03:29:07 +0000 (11:29 +0800)]

ASoC: SOF: ipc4-topology: Do not set ALH node_id for aggregated DAIs

For aggregated DAIs, the node ID is set to the group_id during the DAI
widget's ipc_prepare op. With the current logic, setting the dai_index
for node_id in the dai_config is redundant as it will be overwritten
with the group_id anyway. Removing it will also prevent any accidental
clearing/resetting of the group_id for aggregated DAIs due to the
dai_config calls could that happen before the allocated group_id is
freed.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
All: stable@vger.kernel.org # 6.10.x 6.11.x
Link: https://patch.msgid.link/20241016032910.14601-2-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

commit | commitdiff | tree

Michal Luczaj [Sun, 13 Oct 2024 16:26:42 +0000 (18:26 +0200)]

bpf, vsock: Drop static vsock_bpf_prot initialization

vsock_bpf_prot is set up at runtime. Remove the superfluous init.

No functional change intended.

Fixes: 634f1a7110b4 ("vsock: support sockmap")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-4-d6577bbfe742@rbox.co

commit | commitdiff | tree

Michal Luczaj [Sun, 13 Oct 2024 16:26:41 +0000 (18:26 +0200)]

vsock: Update msg_count on read_skb()

Dequeuing via vsock_transport::read_skb() left msg_count outdated, which
then confused SOCK_SEQPACKET recv(). Decrease the counter.

Fixes: 634f1a7110b4 ("vsock: support sockmap")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-3-d6577bbfe742@rbox.co

commit | commitdiff | tree

Michal Luczaj [Sun, 13 Oct 2024 16:26:40 +0000 (18:26 +0200)]

vsock: Update rx_bytes on read_skb()

Make sure virtio_transport_inc_rx_pkt() and virtio_transport_dec_rx_pkt()
calls are balanced (i.e. virtio_vsock_sock::rx_bytes doesn't lie) after
vsock_transport::read_skb().

While here, also inform the peer that we've freed up space and it has more
credit.

Failing to update rx_bytes after packet is dequeued leads to a warning on
SOCK_STREAM recv():

[ 233.396654] rx_queue is empty, but rx_bytes is non-zero
[ 233.396702] WARNING: CPU: 11 PID: 40601 at net/vmw_vsock/virtio_transport_common.c:589

Fixes: 634f1a7110b4 ("vsock: support sockmap")
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-2-d6577bbfe742@rbox.co

commit | commitdiff | tree

Michal Luczaj [Sun, 13 Oct 2024 16:26:39 +0000 (18:26 +0200)]

bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock

Don't mislead the callers of bpf_{sk,msg}_redirect_{map,hash}(): make sure
to immediately and visibly fail the forwarding of unsupported af_vsock
packets.

Fixes: 634f1a7110b4 ("vsock: support sockmap")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-1-d6577bbfe742@rbox.co

commit | commitdiff | tree

Paolo Abeni [Thu, 17 Oct 2024 10:14:10 +0000 (12:14 +0200)]

Merge branch 'mlx5-misc-fixes-2024-10-15'

Tariq Toukan says:

====================
mlx5 misc fixes 2024-10-15

This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.

Series generated against:
commit 174714f0e505 ("selftests: drivers: net: fix name not defined")
====================

Link: https://patch.msgid.link/20241015093208.197603-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

commit | commitdiff | tree

Cosmin Ratiu [Tue, 15 Oct 2024 09:32:08 +0000 (12:32 +0300)]

net/mlx5e: Don't call cleanup on profile rollback failure

When profile rollback fails in mlx5e_netdev_change_profile, the netdev
profile var is left set to NULL. Avoid a crash when unloading the driver
by not calling profile->cleanup in such a case.

This was encountered while testing, with the original trigger that
the wq rescuer thread creation got interrupted (presumably due to
Ctrl+C-ing modprobe), which gets converted to ENOMEM (-12) by
mlx5e_priv_init, the profile rollback also fails for the same reason
(signal still active) so the profile is left as NULL, leading to a crash
later in _mlx5e_remove.

[  732.473932] mlx5_core 0000:08:00.1: E-Switch: Unload vfs: mode(OFFLOADS), nvfs(2), necvfs(0), active vports(2)
[  734.525513] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
[  734.557372] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12
[  734.559187] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: new profile init failed, -12
[  734.560153] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
[  734.589378] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12
[  734.591136] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12
[  745.537492] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  745.538222] #PF: supervisor read access in kernel mode
<snipped>
[  745.551290] Call Trace:
[  745.551590]  <TASK>
[  745.551866]  ? __die+0x20/0x60
[  745.552218]  ? page_fault_oops+0x150/0x400
[  745.555307]  ? exc_page_fault+0x79/0x240
[  745.555729]  ? asm_exc_page_fault+0x22/0x30
[  745.556166]  ? mlx5e_remove+0x6b/0xb0 [mlx5_core]
[  745.556698]  auxiliary_bus_remove+0x18/0x30
[  745.557134]  device_release_driver_internal+0x1df/0x240
[  745.557654]  bus_remove_device+0xd7/0x140
[  745.558075]  device_del+0x15b/0x3c0
[  745.558456]  mlx5_rescan_drivers_locked.part.0+0xb1/0x2f0 [mlx5_core]
[  745.559112]  mlx5_unregister_device+0x34/0x50 [mlx5_core]
[  745.559686]  mlx5_uninit_one+0x46/0xf0 [mlx5_core]
[  745.560203]  remove_one+0x4e/0xd0 [mlx5_core]
[  745.560694]  pci_device_remove+0x39/0xa0
[  745.561112]  device_release_driver_internal+0x1df/0x240
[  745.561631]  driver_detach+0x47/0x90
[  745.562022]  bus_remove_driver+0x84/0x100
[  745.562444]  pci_unregister_driver+0x3b/0x90
[  745.562890]  mlx5_cleanup+0xc/0x1b [mlx5_core]
[  745.563415]  __x64_sys_delete_module+0x14d/0x2f0
[  745.563886]  ? kmem_cache_free+0x1b0/0x460
[  745.564313]  ? lockdep_hardirqs_on_prepare+0xe2/0x190
[  745.564825]  do_syscall_64+0x6d/0x140
[  745.565223]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  745.565725] RIP: 0033:0x7f1579b1288b

Fixes: 3ef14e463f6e ("net/mlx5e: Separate between netdev objects and mlx5e profiles initialization")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

commit | commitdiff | tree

Cosmin Ratiu [Tue, 15 Oct 2024 09:32:07 +0000 (12:32 +0300)]

net/mlx5: Unregister notifier on eswitch init failure

It otherwise remains registered and a subsequent attempt at eswitch
enabling might trigger warnings of the sort:

[  682.589148] ------------[ cut here ]------------
[  682.590204] notifier callback eswitch_vport_event [mlx5_core] already registered
[  682.590256] WARNING: CPU: 13 PID: 2660 at kernel/notifier.c:31 notifier_chain_register+0x3e/0x90
[...snipped]
[  682.610052] Call Trace:
[  682.610369]  <TASK>
[  682.610663]  ? __warn+0x7c/0x110
[  682.611050]  ? notifier_chain_register+0x3e/0x90
[  682.611556]  ? report_bug+0x148/0x170
[  682.611977]  ? handle_bug+0x36/0x70
[  682.612384]  ? exc_invalid_op+0x13/0x60
[  682.612817]  ? asm_exc_invalid_op+0x16/0x20
[  682.613284]  ? notifier_chain_register+0x3e/0x90
[  682.613789]  atomic_notifier_chain_register+0x25/0x40
[  682.614322]  mlx5_eswitch_enable_locked+0x1d4/0x3b0 [mlx5_core]
[  682.614965]  mlx5_eswitch_enable+0xc9/0x100 [mlx5_core]
[  682.615551]  mlx5_device_enable_sriov+0x25/0x340 [mlx5_core]
[  682.616170]  mlx5_core_sriov_configure+0x50/0x170 [mlx5_core]
[  682.616789]  sriov_numvfs_store+0xb0/0x1b0
[  682.617248]  kernfs_fop_write_iter+0x117/0x1a0
[  682.617734]  vfs_write+0x231/0x3f0
[  682.618138]  ksys_write+0x63/0xe0
[  682.618536]  do_syscall_64+0x4c/0x100
[  682.618958]  entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: 7624e58a8b3a ("net/mlx5: E-switch, register event handler before arming the event")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Unnamed repository; edit this file 'description' to name the repository.

RSS Atom