]> www.infradead.org Git - linux.git/log
linux.git
2 years agohabanalabs/gaudi2: don't enable entries in the MSIX_GW table
Tomer Tayar [Thu, 20 Oct 2022 09:05:09 +0000 (12:05 +0300)]
habanalabs/gaudi2: don't enable entries in the MSIX_GW table

User should use the virtual MSI-X doorbell to generate interrupts from
the device, so there is no need to enable entries in the MSIX_GW table.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: remove redundant firmware version check
farah kassabri [Tue, 8 Nov 2022 11:23:17 +0000 (13:23 +0200)]
habanalabs/gaudi2: remove redundant firmware version check

Firmware 1.7 is the first official firmware, so no need to check
if we are running a version below it.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi: fix print for firmware-alive event
Tomer Tayar [Mon, 7 Nov 2022 14:34:32 +0000 (16:34 +0200)]
habanalabs/gaudi: fix print for firmware-alive event

Add missing le{32,64}_to_cpu conversions.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: fix print for out-of-sync and pkt-failure events
Tomer Tayar [Mon, 7 Nov 2022 14:20:03 +0000 (16:20 +0200)]
habanalabs: fix print for out-of-sync and pkt-failure events

Add missing le32_to_cpu() conversions, and use %d for the value
returned from atomic_read().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: add page fault notify event
Dani Liberman [Mon, 31 Oct 2022 21:04:14 +0000 (23:04 +0200)]
habanalabs/gaudi2: add page fault notify event

Each time page fault happens, besides capturing its data, also notify
the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: classify power/thermal events as info
Ofir Bitton [Sun, 6 Nov 2022 10:07:03 +0000 (12:07 +0200)]
habanalabs/gaudi2: classify power/thermal events as info

As power and thermal envelope events are pure informative and not
indicating an error, we reduce the print level to info only.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: skip events info ioctl if not supported
Ohad Sharabi [Sun, 6 Nov 2022 07:26:01 +0000 (09:26 +0200)]
habanalabs: skip events info ioctl if not supported

Some ASICs haven't yet implemented this functionality and so the
ioctl call should fail and the user should be notified of the reason.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: fix firmware descriptor copy operation
farah kassabri [Thu, 22 Sep 2022 11:24:35 +0000 (14:24 +0300)]
habanalabs: fix firmware descriptor copy operation

This is needed to allow adding more data to the lkd_fw_comms_desc
structure.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: add razwi notify event
Dani Liberman [Sun, 30 Oct 2022 12:46:19 +0000 (14:46 +0200)]
habanalabs/gaudi2: add razwi notify event

Each time razwi (read-only zero, write ignored) event happens, besides
capturing its data, also notify the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: implement fp32 not supported event
Ofir Bitton [Sun, 30 Oct 2022 13:10:13 +0000 (15:10 +0200)]
habanalabs/gaudi2: implement fp32 not supported event

Due to binning, Gaudi2 does not always support fp32.
We add support for such an event in case fp32 is used by the user
in such a device.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi: add page fault notify event
Dani Liberman [Mon, 31 Oct 2022 09:44:45 +0000 (11:44 +0200)]
habanalabs/gaudi: add page fault notify event

Each time page fault happens, besides capturing its data, also notify
the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: use single threaded WQ for event handling
Dani Liberman [Thu, 27 Oct 2022 17:38:26 +0000 (20:38 +0300)]
habanalabs: use single threaded WQ for event handling

Creating event queue workqueue using alloc_workqueue made it run in
multi threaded mode, which caused parallel dumping of events as well as
parallel events notifying to user, causing logs with multiple
events to be out of order.

Fixed by creating event queue workqueue as single threaded work queue.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi: add razwi notify event
Dani Liberman [Sun, 30 Oct 2022 11:08:37 +0000 (13:08 +0200)]
habanalabs/gaudi: add razwi notify event

Each time razwi (read-only zero, write ignore) happens, besides
capturing its data, also notify the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: add PCI revision 2 support
Ofir Bitton [Wed, 26 Oct 2022 13:20:45 +0000 (16:20 +0300)]
habanalabs/gaudi2: add PCI revision 2 support

Add support for Gaudi2 Device with PCI revision 2.
Functionality is exactly the same as revision 1, the only difference
is device name exposed to user.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: remove redundant gaudi2_sec asic type
Ofir Bitton [Wed, 26 Oct 2022 15:20:49 +0000 (18:20 +0300)]
habanalabs: remove redundant gaudi2_sec asic type

As Gaudi2 has a single PCI id, the secured asic type is redundant.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: add warning print upon a PCI error
Ofir Bitton [Wed, 19 Oct 2022 11:05:18 +0000 (14:05 +0300)]
habanalabs: add warning print upon a PCI error

In order to know if driver catches PCI errors correctly, we need to
print a warning per each error.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: fix PCIe access to SRAM via debugfs
Tomer Tayar [Sun, 23 Oct 2022 22:14:18 +0000 (01:14 +0300)]
habanalabs: fix PCIe access to SRAM via debugfs

hl_access_sram_dram_region() uses a region base which is set within the
hl_set_dram_bar() function. However, for SRAM access this function is
not called, and we end up with a wrong value of region base and with a
bad calculated address.
Fix it by initializing the region base value independently of whether
hl_set_dram_bar() is called or not.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: zero ts registration buff when allocated
farah kassabri [Tue, 20 Sep 2022 08:48:40 +0000 (11:48 +0300)]
habanalabs: zero ts registration buff when allocated

To avoid memory corruption in kernel memory while using timestamp
registration nodes, zero the kernel buff memory when its allocated.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: no consecutive err when user context is enabled
Tal Cohen [Tue, 18 Oct 2022 14:35:06 +0000 (17:35 +0300)]
habanalabs: no consecutive err when user context is enabled

Consecutive error protects a device reset loop from being triggered
due to h/w issues and enters the device into an unavailable state.
When user may cause the error, an unavailable state
will prevent the user from running its workloads.

The commit prevents entering consecutive state when a user context
is enabled.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: use graceful hard reset for CS timeouts
Tomer Tayar [Fri, 30 Sep 2022 14:02:19 +0000 (17:02 +0300)]
habanalabs: use graceful hard reset for CS timeouts

Use graceful hard reset when detecting a CS timeout that requires a
device reset.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: use graceful hard reset for F/W events
Tomer Tayar [Fri, 30 Sep 2022 13:57:54 +0000 (16:57 +0300)]
habanalabs/gaudi2: use graceful hard reset for F/W events

Use graceful hard reset for F/W events on Gaudi2 device that require a
device reset.

While at it, do a small refactor of the checks and function calls,
to simplify it and to avoid code duplication.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi: use graceful hard reset for F/W events
Tomer Tayar [Fri, 30 Sep 2022 13:43:47 +0000 (16:43 +0300)]
habanalabs/gaudi: use graceful hard reset for F/W events

Use graceful hard reset for F/W events on Gaudi device that require a
device reset.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: add an option to control watchdog timeout via debugfs
Tomer Tayar [Fri, 30 Sep 2022 13:37:41 +0000 (16:37 +0300)]
habanalabs: add an option to control watchdog timeout via debugfs

Add an option to control the timeout value for the driver's watchdog
of the reset process. The timeout represents the amount of the user
has to close his process once he gets a device reset notification from
the driver.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: add support for graceful hard reset
Tomer Tayar [Fri, 30 Sep 2022 12:08:13 +0000 (15:08 +0300)]
habanalabs: add support for graceful hard reset

Calling hl_device_reset() for a hard reset will lead to a quite
immediate device reset and to killing user process.
For resets that follow errors, it disables the option to debug the
errors on both the device side and the user application side.

This patch adds a 'graceful hard reset' option and a new
hl_device_cond_reset() function.
Under some conditions, mainly if there is no user process or if he is
not registered to driver notifications, this function will execute hard
reset as usual.
Otherwise, the reset will be postponed and a notification will be sent
to user, to let him perform post-error actions and then to release the
device, after which reset will take place.

If device is not released by user in some defined time, a watchdog work
will execute the reset in any case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: avoid divide by zero in device utilization
Ohad Sharabi [Sun, 23 Oct 2022 11:46:08 +0000 (14:46 +0300)]
habanalabs: avoid divide by zero in device utilization

Currently there is no verification whether the divisor is legal.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: fix user mappings calculation in case of page fault
Dani Liberman [Wed, 19 Oct 2022 17:24:55 +0000 (20:24 +0300)]
habanalabs: fix user mappings calculation in case of page fault

As there are 2 types of user mappings, pmmu and hmmu, calculate
only the relevant mappings for the requested type.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: remove configurations to access the MSI-X doorbell
Tomer Tayar [Thu, 20 Oct 2022 08:29:03 +0000 (11:29 +0300)]
habanalabs/gaudi2: remove configurations to access the MSI-X doorbell

The virtual MSI-X doorbell is supported now in F/W, so all
configurations to access the PCIE_DBI MSI-X doorbell can be removed.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: allow setting HBM BAR to other regions
Ohad Sharabi [Wed, 14 Sep 2022 05:53:29 +0000 (08:53 +0300)]
habanalabs: allow setting HBM BAR to other regions

Up until now the use-case in the driver was that the HBM is accessed
using the HBM BAR, yet the BAR sometimes cannot cover the whole HBM and
so we needed to set the BAR to other HBM offset.
Now we are facing the need to access other PCI memory regions that can
be covered by the HBM BAR.
To answer that we are allowing the caller to determine if the HBM BAR
need to be set or not regardless of the PCI memory region.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: fix using freed pointer
Ohad Sharabi [Tue, 18 Oct 2022 05:51:33 +0000 (08:51 +0300)]
habanalabs: fix using freed pointer

The code uses the pointer for trace purpose (without actually
dereference it) but still get static analysis warning.
This patch eliminate the warning.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: unsecure CBU_EARLY_BRESP registers
Dilip Puri [Wed, 12 Oct 2022 08:06:48 +0000 (11:06 +0300)]
habanalabs/gaudi2: unsecure CBU_EARLY_BRESP registers

NIC ARCs need to have access to CBU_EARLY_BRESP, hence we unsecure
those registers.

Signed-off-by: Dilip Puri <dilipp@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: verify no zero event is sent
Tal Cohen [Mon, 3 Oct 2022 10:55:50 +0000 (13:55 +0300)]
habanalabs: verify no zero event is sent

The event notifier mechanism should not raise an empty
event (event equals zero).

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: capture page fault data
Dani Liberman [Thu, 29 Sep 2022 07:28:36 +0000 (10:28 +0300)]
habanalabs/gaudi2: capture page fault data

Capture page fault data when it happens.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: capture RAZWI information
Dani Liberman [Wed, 28 Sep 2022 19:14:55 +0000 (22:14 +0300)]
habanalabs/gaudi2: capture RAZWI information

Added function to calculate possible engines which caused
RAZWI (read-only zero, write ignored), from a given router id or
module index.

When getting RAZWI via PSOC IP, first the router id is calculated
and then the possible engines that caused the RAZWI are calculated.

There is a possibility that the RAZWI initiator is not an engine. In
that case, it will not be included in possible engines as it
doesn't have an engine id.

RAZWI information is captured when receiving event from engine or via
PSOC IP.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: handle HBM MMU when capturing page fault data
Dani Liberman [Thu, 29 Sep 2022 07:21:28 +0000 (10:21 +0300)]
habanalabs: handle HBM MMU when capturing page fault data

In case of HBM MMU page fault, capture its relevant mappings.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: move reset workqueue to be under hl_device
Tomer Tayar [Fri, 30 Sep 2022 11:36:27 +0000 (14:36 +0300)]
habanalabs: move reset workqueue to be under hl_device

'struct hl_device_reset_work' is used as a wrapper for the reset work
and its parameters, including the reset workqueue on which it runs.
In a future commit, another reset related work with similar parameters
is going to be added, but it won't use the reset workqueue.

As in any case there is a single reset workqueue, and to allow the resue
of this structure, move the reset workqueue to 'struct hl_device'.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: allow unregistering eventfd when device non-operational
Tomer Tayar [Fri, 30 Sep 2022 11:19:21 +0000 (14:19 +0300)]
habanalabs: allow unregistering eventfd when device non-operational

Unregistering eventfd is for releasing host resources and doesn't
involve an access to the device. As such, there is no reason to disallow
it when device isn't operational.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: skip idle status check if reset on device release
Tomer Tayar [Fri, 30 Sep 2022 11:09:32 +0000 (14:09 +0300)]
habanalabs: skip idle status check if reset on device release

If reset upon device release is enabled, there is no need to check the
device idle status in hpriv_release(), because device is going to be
reset in any case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: add device unavailable notification
Tal Cohen [Wed, 28 Sep 2022 15:33:19 +0000 (18:33 +0300)]
habanalabs/gaudi2: add device unavailable notification

Device unavailable notifies the user that there isn't an option to
retrieve debug information from the device.
When a critical device error occurs and the f/w performs the device
reset, a device unavailable notification shall be sent to the user
process.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: remove privileged MME clock configuration
Koby Elbaz [Wed, 28 Sep 2022 12:56:13 +0000 (15:56 +0300)]
habanalabs/gaudi2: remove privileged MME clock configuration

Privileged MME clock configuration is removed as it is done by the f/w.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: replace 'pf' to 'prefetch'
Dafna Hirschfeld [Wed, 28 Sep 2022 08:38:00 +0000 (11:38 +0300)]
habanalabs: replace 'pf' to 'prefetch'

pf was an abbreviation for prefetch but because pf already stands
for 'physical function', we decided to change it to 'prefetch'.

Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: add page fault info uapi
Dani Liberman [Sun, 18 Sep 2022 18:37:31 +0000 (21:37 +0300)]
habanalabs: add page fault info uapi

Only the first page fault will be saved.
Besides the address which caused the page fault, the driver captures
all of the mmu user mappings.
User can retrieve this data via the new uapi (new opcode in INFO ioctl).

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs/gaudi2: fix module ID for RAZWI handling
Tomer Tayar [Thu, 22 Sep 2022 12:25:46 +0000 (15:25 +0300)]
habanalabs/gaudi2: fix module ID for RAZWI handling

RAZWI is optionally handled as part of the generic QM SEI error
handling, but it always uses PDMA as the module ID.
Fix it to use the suitable module ID according to the specific event.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: use lower_32_bits()
Bharat Jauhari [Tue, 27 Sep 2022 11:38:38 +0000 (14:38 +0300)]
habanalabs: use lower_32_bits()

This fixes sparse warning on doing cast to 32-bits

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: refactor razwi event notification
Dani Liberman [Mon, 19 Sep 2022 15:51:59 +0000 (18:51 +0300)]
habanalabs: refactor razwi event notification

This event notification was compatible only with gaudi, where razwi
and page fault happens together.

To make it compatible with all ASICs, this refactor contains:

1. Razwi notification will only notify about razwi info.
   New notification will be added in future patch, to retrieve data
   about page fault error.

2. Changed razwi info structure to support all ASICs.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: Use simplified API for p2p dist calc
Oded Gabbay [Thu, 22 Sep 2022 09:30:32 +0000 (12:30 +0300)]
habanalabs: Use simplified API for p2p dist calc

Use the simplified API that calculates distance between two devices.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: allow control device open during reset
Ofir Bitton [Tue, 23 Aug 2022 12:14:14 +0000 (15:14 +0300)]
habanalabs: allow control device open during reset

Monitoring apps would like to query device state at any time so we
should allow it also during reset because it doesn't involve
accessing the h/w.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agohabanalabs: fix return value check in hl_fw_get_sec_attest_data()
Yang Yingliang [Fri, 23 Sep 2022 14:39:13 +0000 (22:39 +0800)]
habanalabs: fix return value check in hl_fw_get_sec_attest_data()

If hl_cpu_accessible_dma_pool_alloc() fails, we should check
'req_cpu_addr', fix it.

Fixes: 0c88760f8f5e ("habanalabs/gaudi2: add secured attestation info uapi")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2 years agoMerge 6.1-rc6 into char-misc-next
Greg Kroah-Hartman [Mon, 21 Nov 2022 09:05:34 +0000 (10:05 +0100)]
Merge 6.1-rc6 into char-misc-next

We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoLinux 6.1-rc6 v6.1-rc6
Linus Torvalds [Mon, 21 Nov 2022 00:02:16 +0000 (16:02 -0800)]
Linux 6.1-rc6

2 years agoMerge tag 'trace-probes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 20 Nov 2022 23:31:20 +0000 (15:31 -0800)]
Merge tag 'trace-probes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing/probes fixes from Steven Rostedt:

 - Fix possible NULL pointer dereference on trace_event_file in
   kprobe_event_gen_test_exit()

 - Fix NULL pointer dereference for trace_array in
   kprobe_event_gen_test_exit()

 - Fix memory leak of filter string for eprobes

 - Fix a possible memory leak in rethook_alloc()

 - Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case which
   can cause a possible use-after-free

 - Fix warning in eprobe filter creation

 - Fix eprobe filter creation as it picked the wrong event for the
   fields

* tag 'trace-probes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/eprobe: Fix eprobe filter to make a filter correctly
  tracing/eprobe: Fix warning in filter creation
  kprobes: Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case
  rethook: fix a potential memleak in rethook_alloc()
  tracing/eprobe: Fix memory leak of filter string
  tracing: kprobe: Fix potential null-ptr-deref on trace_array in kprobe_event_gen_test_exit()
  tracing: kprobe: Fix potential null-ptr-deref on trace_event_file in kprobe_event_gen_test_exit()

2 years agoMerge tag 'trace-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 20 Nov 2022 23:25:32 +0000 (15:25 -0800)]
Merge tag 'trace-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix polling to block on watermark like the reads do, as user space
   applications get confused when the select says read is available, and
   then the read blocks

 - Fix accounting of ring buffer dropped pages as it is what is used to
   determine if the buffer is empty or not

 - Fix memory leak in tracing_read_pipe()

 - Fix struct trace_array warning about being declared in parameters

 - Fix accounting of ftrace pages used in output at start up.

 - Fix allocation of dyn_ftrace pages by subtracting one from order
   instead of diving it by 2

 - Static analyzer found a case were a pointer being used outside of a
   NULL check (rb_head_page_deactivate())

 - Fix possible NULL pointer dereference if kstrdup() fails in
   ftrace_add_mod()

 - Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event()

 - Fix bad pointer dereference in register_synth_event() on error path

 - Remove unused __bad_type_size() method

 - Fix possible NULL pointer dereference of entry in list 'tr->err_log'

 - Fix NULL pointer deference race if eprobe is called before the event
   setup

* tag 'trace-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix race where eprobes can be called before the event
  tracing: Fix potential null-pointer-access of entry in list 'tr->err_log'
  tracing: Remove unused __bad_type_size() method
  tracing: Fix wild-memory-access in register_synth_event()
  tracing: Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event()
  ftrace: Fix null pointer dereference in ftrace_add_mod()
  ring_buffer: Do not deactivate non-existant pages
  ftrace: Optimize the allocation for mcount entries
  ftrace: Fix the possible incorrect kernel message
  tracing: Fix warning on variable 'struct trace_array'
  tracing: Fix memory leak in tracing_read_pipe()
  ring-buffer: Include dropped pages in counting dirty patches
  tracing/ring-buffer: Have polling block on watermark

2 years agotracing: Fix race where eprobes can be called before the event
Steven Rostedt (Google) [Fri, 18 Nov 2022 02:42:49 +0000 (21:42 -0500)]
tracing: Fix race where eprobes can be called before the event

The flag that tells the event to call its triggers after reading the event
is set for eprobes after the eprobe is enabled. This leads to a race where
the eprobe may be triggered at the beginning of the event where the record
information is NULL. The eprobe then dereferences the NULL record causing
a NULL kernel pointer bug.

Test for a NULL record to keep this from happening.

Link: https://lore.kernel.org/linux-trace-kernel/20221116192552.1066630-1-rafaelmendsr@gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20221117214249.2addbe10@gandalf.local.home
Cc: Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>
Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 7491e2c442781 ("tracing: Add a probe that attaches to trace events")
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agoMerge tag 'x86_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:47:39 +0000 (10:47 -0800)]
Merge tag 'x86_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Do not hold fpregs lock when inheriting FPU permissions because the
   fpregs lock disables preemption on RT but fpu_inherit_perms() does
   spin_lock_irq(), which, on RT, uses rtmutexes and they need to be
   preemptible.

 - Check the page offset and the length of the data supplied by
   userspace for overflow when specifying a set of pages to add to an
   SGX enclave

* tag 'x86_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Drop fpregs lock before inheriting FPU permissions
  x86/sgx: Add overflow check in sgx_validate_offset_length()

2 years agoMerge tag 'sched_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:43:52 +0000 (10:43 -0800)]
Merge tag 'sched_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

 - Fix a small race on the task's exit path where there's a
   misunderstanding whether the task holds rq->lock or not

 - Prevent processes from getting killed when using deprecated or
   unknown rseq ABI flags in order to be able to fuzz the rseq() syscall
   with syzkaller

* tag 'sched_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix race in task_call_func()
  rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered

2 years agoMerge tag 'perf_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:41:14 +0000 (10:41 -0800)]
Merge tag 'perf_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Fix an intel PT erratum where CPUs do not support single range output
   for more than 4K

 - Fix a NULL ptr dereference which can happen after an NMI interferes
   with the event enabling dance in amd_pmu_enable_all()

 - Free the events array too when freeing uncore contexts on CPU online,
   thereby fixing a memory leak

 - Improve the pending SIGTRAP check

* tag 'perf_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix sampling using single range output
  perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
  perf/x86/amd/uncore: Fix memory leak for events array
  perf: Improve missing SIGTRAP checking

2 years agoMerge tag 'locking_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:39:45 +0000 (10:39 -0800)]
Merge tag 'locking_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:

 - Fix a build error with clang 11

* tag 'locking_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking: Fix qspinlock/x86 inline asm error

2 years agoMerge tag 'powerpc-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 20 Nov 2022 17:47:33 +0000 (09:47 -0800)]
Merge tag 'powerpc-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:

 - Fix writable sections being moved into the rodata region.

Thanks to Nicholas Piggin and Christophe Leroy.

* tag 'powerpc-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix writable sections being moved into the rodata region

2 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 19 Nov 2022 23:51:22 +0000 (15:51 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five small fixes, all in drivers.

  Most of these are error leg freeing issues, with the only really user
  visible one being the zfcp fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: iscsi: Fix possible memory leak when device_register() failed
  scsi: zfcp: Fix double free of FSF request when qdio send fails
  scsi: scsi_debug: Fix possible UAF in sdebug_add_host_helper()
  scsi: target: tcm_loop: Fix possible name leak in tcm_loop_setup_hba_bus()
  scsi: mpi3mr: Suppress command reply debug prints

2 years agoMerge tag 'iommu-fixes-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 19 Nov 2022 17:08:57 +0000 (09:08 -0800)]
Merge tag 'iommu-fixes-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Preset accessed bits in Intel VT-d page-directory entries to avoid
   hardware error

 - Set supervisor bit only when Intel IOMMU has the SRS capability

* tag 'iommu-fixes-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Set SRE bit only when hardware has SRS cap
  iommu/vt-d: Preset Access bit for IOVA in FL non-leaf paging entries

2 years agoMerge tag 'kbuild-fixes-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 19 Nov 2022 17:03:20 +0000 (09:03 -0800)]
Merge tag 'kbuild-fixes-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Update MAINTAINERS with Nathan and Nicolas as new Kbuild reviewers

 - Increment the debian revision for deb-pkg builds

* tag 'kbuild-fixes-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Restore .version auto-increment behaviour for Debian packages
  MAINTAINERS: Add linux-kbuild's patchwork
  MAINTAINERS: Remove Michal Marek from Kbuild maintainers
  MAINTAINERS: Add Nathan and Nicolas to Kbuild reviewers

2 years agoMerge tag '6.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 19 Nov 2022 16:58:58 +0000 (08:58 -0800)]
Merge tag '6.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:

 - two missing and one incorrect return value checks

 - fix leak on tlink mount failure

* tag '6.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: add check for returning value of SMB2_set_info_init
  cifs: Fix wrong return value checking when GETFLAGS
  cifs: add check for returning value of SMB2_close_init
  cifs: Fix connections leak when tlink setup failed

2 years agoiommu/vt-d: Set SRE bit only when hardware has SRS cap
Tina Zhang [Wed, 16 Nov 2022 05:15:44 +0000 (13:15 +0800)]
iommu/vt-d: Set SRE bit only when hardware has SRS cap

SRS cap is the hardware cap telling if the hardware IOMMU can support
requests seeking supervisor privilege or not. SRE bit in scalable-mode
PASID table entry is treated as Reserved(0) for implementation not
supporting SRS cap.

Checking SRS cap before setting SRE bit can avoid the non-recoverable
fault of "Non-zero reserved field set in PASID Table Entry" caused by
setting SRE bit while there is no SRS cap support. The fault messages
look like below:

 DMAR: DRHD: handling fault status reg 2
 DMAR: [DMA Read NO_PASID] Request device [00:0d.0] fault addr 0x1154e1000
       [fault reason 0x5a]
       SM: Non-zero reserved field set in PASID Table Entry

Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
Cc: stable@vger.kernel.org
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20221115070346.1112273-1-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20221116051544.26540-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2 years agoiommu/vt-d: Preset Access bit for IOVA in FL non-leaf paging entries
Tina Zhang [Wed, 16 Nov 2022 05:15:43 +0000 (13:15 +0800)]
iommu/vt-d: Preset Access bit for IOVA in FL non-leaf paging entries

The A/D bits are preseted for IOVA over first level(FL) usage for both
kernel DMA (i.e, domain typs is IOMMU_DOMAIN_DMA) and user space DMA
usage (i.e., domain type is IOMMU_DOMAIN_UNMANAGED).

Presetting A bit in FL requires to preset the bit in every related paging
entries, including the non-leaf ones. Otherwise, hardware may treat this
as an error. For example, in a case of ECAP_REG.SMPWC==0, DMA faults might
occur with below DMAR fault messages (wrapped for line length) dumped.

 DMAR: DRHD: handling fault status reg 2
 DMAR: [DMA Read NO_PASID] Request device [aa:00.0] fault addr 0x10c3a6000
    [fault reason 0x90]
    SM: A/D bit update needed in first-level entry when set up in no snoop

Fixes: 289b3b005cb9 ("iommu/vt-d: Preset A/D bits for user space DMA usage")
Cc: stable@vger.kernel.org
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20221113010324.1094483-1-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20221116051544.26540-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2 years agoMerge tag 'input-for-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor...
Linus Torvalds [Sat, 19 Nov 2022 01:56:29 +0000 (17:56 -0800)]
Merge tag 'input-for-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - a fix for 8042 to stop leaking platform device on unload

 - a fix for Goodix touchscreens on devices like Nanote UMPC-01 where we
   need to reset controller to load config from firmware

 - a workaround for Acer Switch to avoid interrupt storm from home and
   power buttons

 - a workaround for more ASUS ZenBook models to detect keyboard
   controller

 - a fix for iforce driver to properly handle communication errors

 - touchpad on HP Laptop 15-da3001TU switched to RMI mode

* tag 'input-for-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: i8042 - fix leaking of platform device on module removal
  Input: i8042 - apply probe defer to more ASUS ZenBook models
  Input: soc_button_array - add Acer Switch V 10 to dmi_use_low_level_irq[]
  Input: soc_button_array - add use_low_level_irq module parameter
  Input: iforce - invert valid length check when fetching device IDs
  Input: goodix - try resetting the controller when no config is set
  dt-bindings: input: touchscreen: Add compatible for Goodix GT7986U chip
  Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode

2 years agoMerge tag 'zonefs-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sat, 19 Nov 2022 01:17:42 +0000 (17:17 -0800)]
Merge tag 'zonefs-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs

Pull zonefs fixes from Damien Le Moal:

 - Fix the IO error recovery path for failures happening in the last
   zone of device, and that zone is a "runt" zone (smaller than the
   other zone). The current code was failing to properly obtain a zone
   report in that case.

 - Remove the unused to_attr() function as it is unused, causing
   compilation warnings with clang.

* tag 'zonefs-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: Remove to_attr() helper function
  zonefs: fix zone report size in __zonefs_io_error()

2 years agoInput: i8042 - fix leaking of platform device on module removal
Chen Jun [Fri, 18 Nov 2022 23:40:03 +0000 (15:40 -0800)]
Input: i8042 - fix leaking of platform device on module removal

Avoid resetting the module-wide i8042_platform_device pointer in
i8042_probe() or i8042_remove(), so that the device can be properly
destroyed by i8042_exit() on module unload.

Fixes: 9222ba68c3f4 ("Input: i8042 - add deferred probe support")
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Link: https://lore.kernel.org/r/20221109034148.23821-1-chenjun102@huawei.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2 years agoMerge tag 'io_uring-6.1-2022-11-18' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 18 Nov 2022 22:59:53 +0000 (14:59 -0800)]
Merge tag 'io_uring-6.1-2022-11-18' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
 "This is mostly fixing issues around the poll rework, but also two
  tweaks for the multishot handling for accept and receive.

  All stable material"

* tag 'io_uring-6.1-2022-11-18' of git://git.kernel.dk/linux:
  io_uring: disallow self-propelled ring polling
  io_uring: fix multishot recv request leaks
  io_uring: fix multishot accept request leaks
  io_uring: fix tw losing poll events
  io_uring: update res mask in io_poll_check_events

2 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 18 Nov 2022 22:31:03 +0000 (14:31 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix a build error with CONFIG_CFI_CLANG + CONFIG_FTRACE when
   CONFIG_FUNCTION_GRAPH_TRACER is not enabled.

 - Fix a BUG_ON triggered by the page table checker due to incorrect
   file_map_count for non-leaf pmd/pud (the arm64
   pmd_user_accessible_page() not checking whether it's a leaf entry).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
  arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER

2 years agoMerge tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 18 Nov 2022 21:59:45 +0000 (13:59 -0800)]
Merge tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Christoph:
      - Two more bogus nid quirks (Bean Huo, Tiago Dias Ferreira)
      - Memory leak fix in nvmet (Sagi Grimberg)

 - Regression fix for block cgroups pinning the wrong blkcg, causing
   leaks of cgroups and blkcgs (Chris)

 - UAF fix for drbd setup error handling (Dan)

 - Fix DMA alignment propagation in DM (Keith)

* tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux:
  dm-log-writes: set dma_alignment limit in io_hints
  dm-integrity: set dma_alignment limit in io_hints
  block: make blk_set_default_limits() private
  dm-crypt: provide dma_alignment limit in io_hints
  block: make dma_alignment a stacking queue_limit
  nvmet: fix a memory leak in nvmet_auth_set_key
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000
  drbd: use after free in drbd_create_device()
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron Nitro
  blk-cgroup: properly pin the parent in blkcg_css_online

2 years agoMerge tag 'drm-fixes-2022-11-19' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 18 Nov 2022 21:31:40 +0000 (13:31 -0800)]
Merge tag 'drm-fixes-2022-11-19' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "I guess the main question is are things settling down, and I'd say
  kinda, these are all pretty small fixes, nothing big stands out
  really, just seems to be quite a few of them.

  Mostly amdgpu and core fixes, with some i915, tegra, vc4, panel bits.

  core:
   - Fix potential memory leak in drm_dev_init()
   - Fix potential null-ptr-deref in drm_vblank_destroy_worker()
   - Revert hiding unregistered connectors from userspace, as it breaks
     on DP-MST
   - Add workaround for DP++ dual mode adaptors that don't support i2c
     subaddressing

  i915:
   - Fix uaf with lmem_userfault_list handling

  amdgpu:
   - gang submit fixes
   - Fix a possible memory leak in ganng submit error path
   - DP tunneling fixes
   - DCN 3.1 page flip fix
   - DCN 3.2.x fixes
   - DCN 3.1.4 fixes
   - Don't expose degamma on hardware that doesn't support it
   - BACO fixes for SMU 11.x
   - BACO fixes for SMU 13.x
   - Virtual display fix for devices with no display hardware

  amdkfd:
   - Memory limit regression fix

  tegra:
   - tegra20 GART fix

  vc4:
   - Fix error handling in vc4_atomic_commit_tail()

  lima:
   - Set lima's clkname corrrectly when regulator is missing

  panel:
   - Set bpc for logictechno panels"

* tag 'drm-fixes-2022-11-19' of git://anongit.freedesktop.org/drm/drm: (28 commits)
  gpu: host1x: Avoid trying to use GART on Tegra20
  drm/display: Don't assume dual mode adaptors support i2c sub-addressing
  drm/amd/pm: fix SMU13 runpm hang due to unintentional workaround
  drm/amd/pm: enable runpm support over BACO for SMU13.0.7
  drm/amd/pm: enable runpm support over BACO for SMU13.0.0
  drm/amdgpu: there is no vbios fb on devices with no display hw (v2)
  drm/amdkfd: Fix a memory limit issue
  drm/amdgpu: disable BACO support on more cards
  drm/amd/display: don't enable DRM CRTC degamma property for DCE
  drm/amd/display: Set max for prefetch lines on dcn32
  drm/amd/display: use uclk pstate latency for fw assisted mclk validation dcn32
  drm/amd/display: Fix prefetch calculations for dcn32
  drm/amd/display: Fix optc2_configure warning on dcn314
  drm/amd/display: Fix calculation for cursor CAB allocation
  Revert "drm: hide unregistered connectors from GETCONNECTOR IOCTL"
  drm/amd/display: Support parsing VRAM info v3.0 from VBIOS
  drm/amd/display: Fix invalid DPIA AUX reply causing system hang
  drm/amdgpu: Add psp_13_0_10_ta firmware to modinfo
  drm/amd/display: Add HUBP surface flip interrupt handler
  drm/amd/display: Fix access timeout to DPIA AUX at boot time
  ...

2 years agoMerge tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 18 Nov 2022 20:30:23 +0000 (12:30 -0800)]
Merge tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Fix deadlock in discontiguous saved segments (DCSS) block device
   driver. When adding a disk and scanning partitions the scan would not
   break out early without a missed flag.

 - Avoid using global register variable for current_stack_pointer due to
   an old bug in gcc versions prior to gcc-8.4. Due to this bug a broken
   code is generated, which leads to stack corruptions.

* tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: avoid using global register for current_stack_pointer
  s390/dcssblk: fix deadlock when adding a DCSS

2 years agoMerge tag 'for-6.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/devic...
Linus Torvalds [Fri, 18 Nov 2022 20:23:35 +0000 (12:23 -0800)]
Merge tag 'for-6.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix misbehavior if list_versions DM ioctl races with module loading

 - Fix missing decrement of no_sleep_enabled if dm_bufio_client_create
   failed

 - Allow DM integrity devices to be activated in read-only mode

* tag 'for-6.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm integrity: clear the journal on suspend
  dm integrity: flush the journal on suspend
  dm bufio: Fix missing decrement of no_sleep_enabled if dm_bufio_client_create failed
  dm ioctl: fix misbehavior if list_versions races with module loading

2 years agoMerge tag 'drm/tegra/for-6.1-rc6' of https://gitlab.freedesktop.org/drm/tegra into...
Dave Airlie [Fri, 18 Nov 2022 20:15:20 +0000 (06:15 +1000)]
Merge tag 'drm/tegra/for-6.1-rc6' of https://gitlab.freedesktop.org/drm/tegra into drm-fixes

drm/tegra: Fixes for v6.1-rc6

This contains a single fix that avoids using the GART on Tegra20 because
it doesn't work well with the way the Tegra DRM driver tries to use it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118121614.3511110-1-thierry.reding@gmail.com
2 years agoMerge tag 'usb-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 18 Nov 2022 20:08:24 +0000 (12:08 -0800)]
Merge tag 'usb-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are a number of USB driver fixes and new device ids for 6.1-rc6.
  Included in here are:

   - new usb-serial device ids

   - dwc3 driver fixes for reported problems

   - cdns3 driver fixes

   - new USB device quirks

   - typec driver fixes

   - extcon USB typec driver fix

  All of these have been in linux-next with no reported issues"

* tag 'usb-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: option: add u-blox LARA-L6 modem
  USB: serial: option: add u-blox LARA-R6 00B modem
  USB: serial: option: remove old LARA-R6 PID
  USB: serial: option: add Fibocom FM160 0x0111 composition
  usb: add NO_LPM quirk for Realforce 87U Keyboard
  usb: cdns3: host: fix endless superspeed hub port reset
  usb: chipidea: fix deadlock in ci_otg_del_timer
  usb: dwc3: Do not get extcon device when usb-role-switch is used
  usb: typec: tipd: Prevent uninitialized event{1,2} in IRQ handler
  usb: typec: mux: Enter safe mode only when pins need to be reconfigured
  extcon: usbc-tusb320: Call the Type-C IRQ handler only if a port is registered
  Revert "usb: dwc3: disable USB core PHY management"
  usb: dwc3: gadget: Return -ESHUTDOWN on ep disable
  USB: bcma: Make GPIO explicitly optional
  USB: serial: option: add Sierra Wireless EM9191

2 years agoMerge tag 'staging-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 18 Nov 2022 20:02:38 +0000 (12:02 -0800)]
Merge tag 'staging-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix for 6.1-rc6.

  It resolves a bogus signed character test as pointed out, and fixed
  by, Jason in the rtl8192e driver

  It has been in linux-next for a few weeks now with no reported
  problems"

* tag 'staging-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8192e: remove bogus ssid character sign test

2 years agoarm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
Liu Shixin [Thu, 17 Nov 2022 07:56:01 +0000 (15:56 +0800)]
arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud

The page table check trigger BUG_ON() unexpectedly when collapse hugepage:

 ------------[ cut here ]------------
 kernel BUG at mm/page_table_check.c:82!
 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 6 PID: 68 Comm: khugepaged Not tainted 6.1.0-rc3+ #750
 Hardware name: linux,dummy-virt (DT)
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : page_table_check_clear.isra.0+0x258/0x3f0
 lr : page_table_check_clear.isra.0+0x240/0x3f0
[...]
 Call trace:
  page_table_check_clear.isra.0+0x258/0x3f0
  __page_table_check_pmd_clear+0xbc/0x108
  pmdp_collapse_flush+0xb0/0x160
  collapse_huge_page+0xa08/0x1080
  hpage_collapse_scan_pmd+0xf30/0x1590
  khugepaged_scan_mm_slot.constprop.0+0x52c/0xac8
  khugepaged+0x338/0x518
  kthread+0x278/0x2f8
  ret_from_fork+0x10/0x20
[...]

Since pmd_user_accessible_page() doesn't check if a pmd is leaf, it
decrease file_map_count for a non-leaf pmd comes from collapse_huge_page().
and so trigger BUG_ON() unexpectedly.

Fix this problem by using pmd_leaf() insteal of pmd_present() in
pmd_user_accessible_page(). Moreover, use pud_leaf() for
pud_user_accessible_page() too.

Fixes: 42b2547137f5 ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK")
Reported-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221117075602.2904324-2-liushixin2@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2 years agoMerge tag 'tty-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Fri, 18 Nov 2022 18:59:52 +0000 (10:59 -0800)]
Merge tag 'tty-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are a number of small tty and serial driver fixes for 6.1-rc6.
  They all resolve reported problems:

   - kernel doc build problems with the -rc1 serial driver documentation
     update

   - n_gsm reported problems

   - imx serial driver missing callback

   - lots of tiny 8250 driver fixes for reported issues.

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'tty-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  docs/driver-api/miscellaneous: Remove kernel-doc of serial_core.c
  serial: 8250: Flush DMA Rx on RLSI
  serial: 8250_lpss: Use 16B DMA burst with Elkhart Lake
  serial: 8250_lpss: Configure DMA also w/o DMA filter
  serial: 8250: Fall back to non-DMA Rx if IIR_RDI occurs
  tty: n_gsm: fix sleep-in-atomic-context bug in gsm_control_send
  Revert "tty: n_gsm: replace kicktimer with delayed_work"
  Revert "tty: n_gsm: avoid call of sleeping functions from atomic context"
  serial: imx: Add missing .thaw_noirq hook
  tty: serial: fsl_lpuart: don't break the on-going transfer when global reset
  serial: 8250: omap: Flush PM QOS work on remove
  serial: 8250: omap: Fix unpaired pm_runtime_put_sync() in omap8250_remove()
  serial: 8250_omap: remove wait loop from Errata i202 workaround
  serial: 8250: omap: Fix missing PM runtime calls for omap8250_set_mctrl()
  serial: 8250: 8250_omap: Avoid RS485 RTS glitch on ->set_termios()

2 years agoMerge tag 'driver-core-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 18 Nov 2022 18:49:53 +0000 (10:49 -0800)]
Merge tag 'driver-core-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two small driver core fixes for 6.1-rc6:

   - utsname fix, this one should already be in your tree as it came
     from a different tree earlier.

   - kernfs bugfix for a much reported syzbot report that seems to keep
     getting triggered.

  Both of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kernfs: Fix spurious lockdep warning in kernfs_find_and_get_node_by_id()
  kernel/utsname_sysctl.c: Add missing enum uts_proc value

2 years agoMerge tag 'char-misc-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 18 Nov 2022 18:29:25 +0000 (10:29 -0800)]
Merge tag 'char-misc-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc and other driver fixes for 6.1-rc6 to
  resolve some reported problems. Included in here are:

   - iio driver fixes

   - binder driver fix

   - nvmem driver fix

   - vme_vmci information leak fix

   - parport fix

   - slimbus configuration fix

   - coreboot firmware bugfix

   - speakup build fix and crash fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (22 commits)
  firmware: coreboot: Register bus in module init
  nvmem: u-boot-env: fix crc32_data_offset on redundant u-boot-env
  slimbus: qcom-ngd: Fix build error when CONFIG_SLIM_QCOM_NGD_CTRL=y && CONFIG_QCOM_RPROC_COMMON=m
  docs: update mediator contact information in CoC doc
  slimbus: stream: correct presence rate frequencies
  nvmem: lan9662-otp: Fix compatible string
  binder: validate alloc->mm in ->mmap() handler
  parport_pc: Avoid FIFO port location truncation
  siox: fix possible memory leak in siox_device_add()
  misc/vmw_vmci: fix an infoleak in vmci_host_do_receive_datagram()
  speakup: replace utils' u_char with unsigned char
  speakup: fix a segfault caused by switching consoles
  tools: iio: iio_generic_buffer: Fix read size
  iio: imu: bno055: uninitialized variable bug in bno055_trigger_handler()
  iio: adc: at91_adc: fix possible memory leak in at91_adc_allocate_trigger()
  iio: adc: mp2629: fix potential array out of bound access
  iio: adc: mp2629: fix wrong comparison of channel
  iio: pressure: ms5611: changed hardcoded SPI speed to value limited
  iio: pressure: ms5611: fixed value compensation bug
  iio: accel: bma400: Ensure VDDIO is enable defore reading the chip ID.
  ...

2 years agoMerge tag 'sound-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 18 Nov 2022 17:52:10 +0000 (09:52 -0800)]
Merge tag 'sound-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A fair amount of commits at this time due to ASoC PR merge, but all
  look small and easy, mostly device-specific fixes spanned in various
  drivers. Hopefully this should be the last big chunk for 6.1"

* tag 'sound-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360
  ALSA: hda/realtek: fix speakers for Samsung Galaxy Book Pro
  ALSA: usb-audio: Drop snd_BUG_ON() from snd_usbmidi_output_open()
  ASoC: stm32: dfsdm: manage cb buffers cleanup
  ASoC: sof_es8336: reduce pop noise on speaker
  ASoC: SOF: topology: No need to assign core ID if token parsing failed
  ASoC: soc-utils: Remove __exit for snd_soc_util_exit()
  ASoC: rt5677: fix legacy dai naming
  ASoC: rt5514: fix legacy dai naming
  ASoC: SOF: ipc3-topology: use old pipeline teardown flow with SOF2.1 and older
  ASoC: hda: intel-dsp-config: add ES83x6 quirk for IceLake
  ASoC: Intel: soc-acpi: add ES83x6 support to IceLake
  ASoC: tas2780: Fix set_tdm_slot in case of single slot
  ASoC: tas2764: Fix set_tdm_slot in case of single slot
  ASoC: tas2770: Fix set_tdm_slot in case of single slot
  ASoC: fsl_asrc fsl_esai fsl_sai: allow CONFIG_PM=N
  ASoC: core: Fix use-after-free in snd_soc_exit()
  MAINTAINERS: update Tzung-Bi's email address
  ASoC: Intel: bytcht_es8316: Add quirk for the Nanote UMPC-01
  ASoC: amd: yc: Add Alienware m17 R5 AMD into DMI table
  ...

2 years agoMerge tag 'mmc-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 18 Nov 2022 17:43:30 +0000 (09:43 -0800)]
Merge tag 'mmc-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Fixup VDD/VMMC voltage-range negotiation

  MMC host:
   - sdhci-pci: Fix memory leak by adding a missing pci_dev_put()
   - sdhci-pci-o2micro: Fix card detect by tuning the debounce timeout"

* tag 'mmc-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-pci: Fix possible memory leak caused by missing pci_dev_put()
  mmc: sdhci-pci-o2micro: fix card detect fail issue caused by CD# debounce timeout
  mmc: core: properly select voltage range without power cycle

2 years agoio_uring: disallow self-propelled ring polling
Pavel Begunkov [Fri, 18 Nov 2022 15:41:41 +0000 (15:41 +0000)]
io_uring: disallow self-propelled ring polling

When we post a CQE we wake all ring pollers as it normally should be.
However, if a CQE was generated by a multishot poll request targeting
its own ring, it'll wake that request up, which will make it to post
a new CQE, which will wake the request and so on until it exhausts all
CQ entries.

Don't allow multishot polling io_uring files but downgrade them to
oneshots, which was always stated as a correct behaviour that the
userspace should check for.

Cc: stable@vger.kernel.org
Fixes: aa43477b04025 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3124038c0e7474d427538c2d915335ec28c92d21.1668785722.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agodm integrity: clear the journal on suspend
Mikulas Patocka [Tue, 15 Nov 2022 17:51:50 +0000 (12:51 -0500)]
dm integrity: clear the journal on suspend

There was a problem that a user burned a dm-integrity image on CDROM
and could not activate it because it had a non-empty journal.

Fix this problem by flushing the journal (done by the previous commit)
and clearing the journal (done by this commit). Once the journal is
cleared, dm-integrity won't attempt to replay it on the next
activation.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2 years agodm integrity: flush the journal on suspend
Mikulas Patocka [Tue, 15 Nov 2022 17:48:26 +0000 (12:48 -0500)]
dm integrity: flush the journal on suspend

This commit flushes the journal on suspend. It is prerequisite for the
next commit that enables activating dm integrity devices in read-only mode.

Note that we deliberately didn't flush the journal on suspend, so that the
journal replay code would be tested. However, the dm-integrity code is 5
years old now, so that journal replay is well-tested, and we can make this
change now.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2 years agodm bufio: Fix missing decrement of no_sleep_enabled if dm_bufio_client_create failed
Zhihao Cheng [Fri, 11 Nov 2022 12:10:27 +0000 (20:10 +0800)]
dm bufio: Fix missing decrement of no_sleep_enabled if dm_bufio_client_create failed

The 'no_sleep_enabled' should be decreased in error handling path
in dm_bufio_client_create() when the DM_BUFIO_CLIENT_NO_SLEEP flag
is set, otherwise static_branch_unlikely() will always return true
even if no dm_bufio_client instances have DM_BUFIO_CLIENT_NO_SLEEP
flag set.

Cc: stable@vger.kernel.org
Fixes: 3c1c875d0586 ("dm bufio: conditionally enable branching for DM_BUFIO_CLIENT_NO_SLEEP")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2 years agodm ioctl: fix misbehavior if list_versions races with module loading
Mikulas Patocka [Tue, 1 Nov 2022 20:53:35 +0000 (16:53 -0400)]
dm ioctl: fix misbehavior if list_versions races with module loading

__list_versions will first estimate the required space using the
"dm_target_iterate(list_version_get_needed, &needed)" call and then will
fill the space using the "dm_target_iterate(list_version_get_info,
&iter_info)" call. Each of these calls locks the targets using the
"down_read(&_lock)" and "up_read(&_lock)" calls, however between the first
and second "dm_target_iterate" there is no lock held and the target
modules can be loaded at this point, so the second "dm_target_iterate"
call may need more space than what was the first "dm_target_iterate"
returned.

The code tries to handle this overflow (see the beginning of
list_version_get_info), however this handling is incorrect.

The code sets "param->data_size = param->data_start + needed" and
"iter_info.end = (char *)vers+len" - "needed" is the size returned by the
first dm_target_iterate call; "len" is the size of the buffer allocated by
userspace.

"len" may be greater than "needed"; in this case, the code will write up
to "len" bytes into the buffer, however param->data_size is set to
"needed", so it may write data past the param->data_size value. The ioctl
interface copies only up to param->data_size into userspace, thus part of
the result will be truncated.

Fix this bug by setting "iter_info.end = (char *)vers + needed;" - this
guarantees that the second "dm_target_iterate" call will write only up to
the "needed" buffer and it will exit with "DM_BUFFER_FULL_FLAG" if it
overflows the "needed" space - in this case, userspace will allocate a
larger buffer and retry.

Note that there is also a bug in list_version_get_needed - we need to add
"strlen(tt->name) + 1" to the needed size, not "strlen(tt->name)".

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2 years agoMerge tag 'nvme-6.1-2022-11-18' of git://git.infradead.org/nvme into block-6.1
Jens Axboe [Fri, 18 Nov 2022 14:47:54 +0000 (07:47 -0700)]
Merge tag 'nvme-6.1-2022-11-18' of git://git.infradead.org/nvme into block-6.1

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 6.1

 - two more bogus nid quirks (Bean Huo, Tiago Dias Ferreira)
 - memory leak fix in nvmet (Sagi Grimberg)"

* tag 'nvme-6.1-2022-11-18' of git://git.infradead.org/nvme:
  nvmet: fix a memory leak in nvmet_auth_set_key
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron Nitro

2 years agogpu: host1x: Avoid trying to use GART on Tegra20
Robin Murphy [Thu, 20 Oct 2022 14:23:40 +0000 (15:23 +0100)]
gpu: host1x: Avoid trying to use GART on Tegra20

Since commit c7e3ca515e78 ("iommu/tegra: gart: Do not register with
bus") quite some time ago, the GART driver has effectively disabled
itself to avoid issues with the GPU driver expecting it to work in ways
that it doesn't. As of commit 57365a04c921 ("iommu: Move bus setup to
IOMMU device registration") that bodge no longer works, but really the
GPU driver should be responsible for its own behaviour anyway. Make the
workaround explicit.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Suggested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2 years agotracing: Fix potential null-pointer-access of entry in list 'tr->err_log'
Zheng Yejian [Mon, 14 Nov 2022 10:46:32 +0000 (18:46 +0800)]
tracing: Fix potential null-pointer-access of entry in list 'tr->err_log'

Entries in list 'tr->err_log' will be reused after entry number
exceed TRACING_LOG_ERRS_MAX.

The cmd string of the to be reused entry will be freed first then
allocated a new one. If the allocation failed, then the entry will
still be in list 'tr->err_log' but its 'cmd' field is set to be NULL,
later access of 'cmd' is risky.

Currently above problem can cause the loss of 'cmd' information of first
entry in 'tr->err_log'. When execute `cat /sys/kernel/tracing/error_log`,
reproduce logs like:
  [   37.495100] trace_kprobe: error: Maxactive is not for kprobe(null) ^
  [   38.412517] trace_kprobe: error: Maxactive is not for kprobe
    Command: p4:myprobe2 do_sys_openat2
            ^

Link: https://lore.kernel.org/linux-trace-kernel/20221114104632.3547266-1-zhengyejian1@huawei.com
Fixes: 1581a884b7ca ("tracing: Remove size restriction on tracing_log_err cmd strings")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agotracing: Remove unused __bad_type_size() method
Qiujun Huang [Thu, 17 Nov 2022 16:44:35 +0000 (00:44 +0800)]
tracing: Remove unused __bad_type_size() method

__bad_type_size() is unused after
commit 04ae87a52074("ftrace: Rework event_create_dir()").
So, remove it.

Link: https://lkml.kernel.org/r/D062EC2E-7DB7-4402-A67E-33C3577F551E@gmail.com
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agotracing/eprobe: Fix eprobe filter to make a filter correctly
Masami Hiramatsu (Google) [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
tracing/eprobe: Fix eprobe filter to make a filter correctly

Since the eprobe filter was defined based on the eprobe's trace event
itself, it doesn't work correctly. Use the original trace event of
the eprobe when making the filter so that the filter works correctly.

Without this fix:

 # echo 'e syscalls/sys_enter_openat \
flags_rename=$flags:u32 if flags < 1000' >> dynamic_events
 # echo 1 > events/eprobes/sys_enter_openat/enable
[  114.551550] event trace: Could not enable event sys_enter_openat
-bash: echo: write error: Invalid argument

With this fix:
 # echo 'e syscalls/sys_enter_openat \
flags_rename=$flags:u32 if flags < 1000' >> dynamic_events
 # echo 1 > events/eprobes/sys_enter_openat/enable
 # tail trace
cat-241     [000] ...1.   266.498449: sys_enter_openat: (syscalls.sys_enter_openat) flags_rename=0
cat-242     [000] ...1.   266.977640: sys_enter_openat: (syscalls.sys_enter_openat) flags_rename=0

Link: https://lore.kernel.org/all/166823166395.1385292.8931770640212414483.stgit@devnote3/
Fixes: 752be5c5c910 ("tracing/eprobe: Add eprobe filter support")
Reported-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Tested-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agotracing/eprobe: Fix warning in filter creation
Rafael Mendonca [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
tracing/eprobe: Fix warning in filter creation

The filter pointer (filterp) passed to create_filter() function must be a
pointer that references a NULL pointer, otherwise, we get a warning when
adding a filter option to the event probe:

root@localhost:/sys/kernel/tracing# echo 'e:egroup/stat_runtime_4core sched/sched_stat_runtime \
        runtime=$runtime:u32 if cpu < 4' >> dynamic_events
[ 5034.340439] ------------[ cut here ]------------
[ 5034.341258] WARNING: CPU: 0 PID: 223 at kernel/trace/trace_events_filter.c:1939 create_filter+0x1db/0x250
[...] stripped
[ 5034.345518] RIP: 0010:create_filter+0x1db/0x250
[...] stripped
[ 5034.351604] Call Trace:
[ 5034.351803]  <TASK>
[ 5034.351959]  ? process_preds+0x1b40/0x1b40
[ 5034.352241]  ? rcu_read_lock_bh_held+0xd0/0xd0
[ 5034.352604]  ? kasan_set_track+0x29/0x40
[ 5034.352904]  ? kasan_save_alloc_info+0x1f/0x30
[ 5034.353264]  create_event_filter+0x38/0x50
[ 5034.353573]  __trace_eprobe_create+0x16f4/0x1d20
[ 5034.353964]  ? eprobe_dyn_event_release+0x360/0x360
[ 5034.354363]  ? mark_held_locks+0xa6/0xf0
[ 5034.354684]  ? _raw_spin_unlock_irqrestore+0x35/0x60
[ 5034.355105]  ? trace_hardirqs_on+0x41/0x120
[ 5034.355417]  ? _raw_spin_unlock_irqrestore+0x35/0x60
[ 5034.355751]  ? __create_object+0x5b7/0xcf0
[ 5034.356027]  ? lock_is_held_type+0xaf/0x120
[ 5034.356362]  ? rcu_read_lock_bh_held+0xb0/0xd0
[ 5034.356716]  ? rcu_read_lock_bh_held+0xd0/0xd0
[ 5034.357084]  ? kasan_set_track+0x29/0x40
[ 5034.357411]  ? kasan_save_alloc_info+0x1f/0x30
[ 5034.357715]  ? __kasan_kmalloc+0xb8/0xc0
[ 5034.357985]  ? write_comp_data+0x2f/0x90
[ 5034.358302]  ? __sanitizer_cov_trace_pc+0x25/0x60
[ 5034.358691]  ? argv_split+0x381/0x460
[ 5034.358949]  ? write_comp_data+0x2f/0x90
[ 5034.359240]  ? eprobe_dyn_event_release+0x360/0x360
[ 5034.359620]  trace_probe_create+0xf6/0x110
[ 5034.359940]  ? trace_probe_match_command_args+0x240/0x240
[ 5034.360376]  eprobe_dyn_event_create+0x21/0x30
[ 5034.360709]  create_dyn_event+0xf3/0x1a0
[ 5034.360983]  trace_parse_run_command+0x1a9/0x2e0
[ 5034.361297]  ? dyn_event_release+0x500/0x500
[ 5034.361591]  dyn_event_write+0x39/0x50
[ 5034.361851]  vfs_write+0x311/0xe50
[ 5034.362091]  ? dyn_event_seq_next+0x40/0x40
[ 5034.362376]  ? kernel_write+0x5b0/0x5b0
[ 5034.362637]  ? write_comp_data+0x2f/0x90
[ 5034.362937]  ? __sanitizer_cov_trace_pc+0x25/0x60
[ 5034.363258]  ? ftrace_syscall_enter+0x544/0x840
[ 5034.363563]  ? write_comp_data+0x2f/0x90
[ 5034.363837]  ? __sanitizer_cov_trace_pc+0x25/0x60
[ 5034.364156]  ? write_comp_data+0x2f/0x90
[ 5034.364468]  ? write_comp_data+0x2f/0x90
[ 5034.364770]  ksys_write+0x158/0x2a0
[ 5034.365022]  ? __ia32_sys_read+0xc0/0xc0
[ 5034.365344]  __x64_sys_write+0x7c/0xc0
[ 5034.365669]  ? syscall_enter_from_user_mode+0x53/0x70
[ 5034.366084]  do_syscall_64+0x60/0x90
[ 5034.366356]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 5034.366767] RIP: 0033:0x7ff0b43938f3
[...] stripped
[ 5034.371892]  </TASK>
[ 5034.374720] ---[ end trace 0000000000000000 ]---

Link: https://lore.kernel.org/all/20221108202148.1020111-1-rafaelmendsr@gmail.com/
Fixes: 752be5c5c910 ("tracing/eprobe: Add eprobe filter support")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agokprobes: Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case
Li Huafei [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
kprobes: Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case

In __unregister_kprobe_top(), if the currently unregistered probe has
post_handler but other child probes of the aggrprobe do not have
post_handler, the post_handler of the aggrprobe is cleared. If this is
a ftrace-based probe, there is a problem. In later calls to
disarm_kprobe(), we will use kprobe_ftrace_ops because post_handler is
NULL. But we're armed with kprobe_ipmodify_ops. This triggers a WARN in
__disarm_kprobe_ftrace() and may even cause use-after-free:

  Failed to disarm kprobe-ftrace at kernel_clone+0x0/0x3c0 (error -2)
  WARNING: CPU: 5 PID: 137 at kernel/kprobes.c:1135 __disarm_kprobe_ftrace.isra.21+0xcf/0xe0
  Modules linked in: testKprobe_007(-)
  CPU: 5 PID: 137 Comm: rmmod Not tainted 6.1.0-rc4-dirty #18
  [...]
  Call Trace:
   <TASK>
   __disable_kprobe+0xcd/0xe0
   __unregister_kprobe_top+0x12/0x150
   ? mutex_lock+0xe/0x30
   unregister_kprobes.part.23+0x31/0xa0
   unregister_kprobe+0x32/0x40
   __x64_sys_delete_module+0x15e/0x260
   ? do_user_addr_fault+0x2cd/0x6b0
   do_syscall_64+0x3a/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
   [...]

For the kprobe-on-ftrace case, we keep the post_handler setting to
identify this aggrprobe armed with kprobe_ipmodify_ops. This way we
can disarm it correctly.

Link: https://lore.kernel.org/all/20221112070000.35299-1-lihuafei1@huawei.com/
Fixes: 0bc11ed5ab60 ("kprobes: Allow kprobes coexist with livepatch")
Reported-by: Zhao Gongyi <zhaogongyi@huawei.com>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agorethook: fix a potential memleak in rethook_alloc()
Yi Yang [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
rethook: fix a potential memleak in rethook_alloc()

In rethook_alloc(), the variable rh is not freed or passed out
if handler is NULL, which could lead to a memleak, fix it.

Link: https://lore.kernel.org/all/20221110104438.88099-1-yiyang13@huawei.com/
[Masami: Add "rethook:" tag to the title.]

Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
Cc: stable@vger.kernel.org
Signed-off-by: Yi Yang <yiyang13@huawei.com>
Acke-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agotracing/eprobe: Fix memory leak of filter string
Rafael Mendonca [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
tracing/eprobe: Fix memory leak of filter string

The filter string doesn't get freed when a dynamic event is deleted. If a
filter is set, then memory is leaked:

root@localhost:/sys/kernel/tracing# echo 'e:egroup/stat_runtime_4core \
        sched/sched_stat_runtime runtime=$runtime:u32 if cpu < 4' >> dynamic_events
root@localhost:/sys/kernel/tracing# echo "-:egroup/stat_runtime_4core"  >> dynamic_events
root@localhost:/sys/kernel/tracing# echo scan > /sys/kernel/debug/kmemleak
[  224.416373] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
root@localhost:/sys/kernel/tracing# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff88810156f1b8 (size 8):
  comm "bash", pid 224, jiffies 4294935612 (age 55.800s)
  hex dump (first 8 bytes):
    63 70 75 20 3c 20 34 00                          cpu < 4.
  backtrace:
    [<000000009f880725>] __kmem_cache_alloc_node+0x18e/0x720
    [<0000000042492946>] __kmalloc+0x57/0x240
    [<0000000034ea7995>] __trace_eprobe_create+0x1214/0x1d30
    [<00000000d70ef730>] trace_probe_create+0xf6/0x110
    [<00000000915c7b16>] eprobe_dyn_event_create+0x21/0x30
    [<000000000d894386>] create_dyn_event+0xf3/0x1a0
    [<00000000e9af57d5>] trace_parse_run_command+0x1a9/0x2e0
    [<0000000080777f18>] dyn_event_write+0x39/0x50
    [<0000000089f0ec73>] vfs_write+0x311/0xe50
    [<000000003da1bdda>] ksys_write+0x158/0x2a0
    [<00000000bb1e616e>] __x64_sys_write+0x7c/0xc0
    [<00000000e8aef1f7>] do_syscall_64+0x60/0x90
    [<00000000fe7fe8ba>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Additionally, in __trace_eprobe_create() function, if an error occurs after
the call to trace_eprobe_parse_filter(), which allocates the filter string,
then memory is also leaked. That can be reproduced by creating the same
event probe twice:

root@localhost:/sys/kernel/tracing# echo 'e:egroup/stat_runtime_4core \
        sched/sched_stat_runtime runtime=$runtime:u32 if cpu < 4' >> dynamic_events
root@localhost:/sys/kernel/tracing# echo 'e:egroup/stat_runtime_4core \
        sched/sched_stat_runtime runtime=$runtime:u32 if cpu < 4' >> dynamic_events
-bash: echo: write error: File exists
root@localhost:/sys/kernel/tracing# echo scan > /sys/kernel/debug/kmemleak
[  207.871584] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
root@localhost:/sys/kernel/tracing# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8881020d17a8 (size 8):
  comm "bash", pid 223, jiffies 4294938308 (age 31.000s)
  hex dump (first 8 bytes):
    63 70 75 20 3c 20 34 00                          cpu < 4.
  backtrace:
    [<000000000e4f5f31>] __kmem_cache_alloc_node+0x18e/0x720
    [<0000000024f0534b>] __kmalloc+0x57/0x240
    [<000000002930a28e>] __trace_eprobe_create+0x1214/0x1d30
    [<0000000028387903>] trace_probe_create+0xf6/0x110
    [<00000000a80d6a9f>] eprobe_dyn_event_create+0x21/0x30
    [<000000007168698c>] create_dyn_event+0xf3/0x1a0
    [<00000000f036bf6a>] trace_parse_run_command+0x1a9/0x2e0
    [<00000000014bde8b>] dyn_event_write+0x39/0x50
    [<0000000078a097f7>] vfs_write+0x311/0xe50
    [<00000000996cb208>] ksys_write+0x158/0x2a0
    [<00000000a3c2acb0>] __x64_sys_write+0x7c/0xc0
    [<0000000006b5d698>] do_syscall_64+0x60/0x90
    [<00000000780e8ecf>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fix both issues by releasing the filter string in
trace_event_probe_cleanup().

Link: https://lore.kernel.org/all/20221108235738.1021467-1-rafaelmendsr@gmail.com/
Fixes: 752be5c5c910 ("tracing/eprobe: Add eprobe filter support")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agotracing: kprobe: Fix potential null-ptr-deref on trace_array in kprobe_event_gen_test...
Shang XiaoJing [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
tracing: kprobe: Fix potential null-ptr-deref on trace_array in kprobe_event_gen_test_exit()

When test_gen_kprobe_cmd() failed after kprobe_event_gen_cmd_end(), it
will goto delete, which will call kprobe_event_delete() and release the
corresponding resource. However, the trace_array in gen_kretprobe_test
will point to the invalid resource. Set gen_kretprobe_test to NULL
after called kprobe_event_delete() to prevent null-ptr-deref.

BUG: kernel NULL pointer dereference, address: 0000000000000070
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 246 Comm: modprobe Tainted: G        W
6.1.0-rc1-00174-g9522dc5c87da-dirty #248
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:__ftrace_set_clr_event_nolock+0x53/0x1b0
Code: e8 82 26 fc ff 49 8b 1e c7 44 24 0c ea ff ff ff 49 39 de 0f 84 3c
01 00 00 c7 44 24 18 00 00 00 00 e8 61 26 fc ff 48 8b 6b 10 <44> 8b 65
70 4c 8b 6d 18 41 f7 c4 00 02 00 00 75 2f
RSP: 0018:ffffc9000159fe00 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88810971d268 RCX: 0000000000000000
RDX: ffff8881080be600 RSI: ffffffff811b48ff RDI: ffff88810971d058
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
R10: ffffc9000159fe58 R11: 0000000000000001 R12: ffffffffa0001064
R13: ffffffffa000106c R14: ffff88810971d238 R15: 0000000000000000
FS:  00007f89eeff6540(0000) GS:ffff88813b600000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 000000010599e004 CR4: 0000000000330ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __ftrace_set_clr_event+0x3e/0x60
 trace_array_set_clr_event+0x35/0x50
 ? 0xffffffffa0000000
 kprobe_event_gen_test_exit+0xcd/0x10b [kprobe_event_gen_test]
 __x64_sys_delete_module+0x206/0x380
 ? lockdep_hardirqs_on_prepare+0xd8/0x190
 ? syscall_enter_from_user_mode+0x1c/0x50
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f89eeb061b7

Link: https://lore.kernel.org/all/20221108015130.28326-3-shangxiaojing@huawei.com/
Fixes: 64836248dda2 ("tracing: Add kprobe event command generation test module")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agotracing: kprobe: Fix potential null-ptr-deref on trace_event_file in kprobe_event_gen...
Shang XiaoJing [Fri, 18 Nov 2022 01:15:33 +0000 (10:15 +0900)]
tracing: kprobe: Fix potential null-ptr-deref on trace_event_file in kprobe_event_gen_test_exit()

When trace_get_event_file() failed, gen_kretprobe_test will be assigned
as the error code. If module kprobe_event_gen_test is removed now, the
null pointer dereference will happen in kprobe_event_gen_test_exit().
Check if gen_kprobe_test or gen_kretprobe_test is error code or NULL
before dereference them.

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 2210 Comm: modprobe Not tainted
6.1.0-rc1-00171-g2159299a3b74-dirty #217
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:kprobe_event_gen_test_exit+0x1c/0xb5 [kprobe_event_gen_test]
Code: Unable to access opcode bytes at 0xffffffff9ffffff2.
RSP: 0018:ffffc900015bfeb8 EFLAGS: 00010246
RAX: ffffffffffffffea RBX: ffffffffa0002080 RCX: 0000000000000000
RDX: ffffffffa0001054 RSI: ffffffffa0001064 RDI: ffffffffdfc6349c
RBP: ffffffffa0000000 R08: 0000000000000004 R09: 00000000001e95c0
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000800
R13: ffffffffa0002420 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f56b75be540(0000) GS:ffff88813bc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff9ffffff2 CR3: 000000010874a006 CR4: 0000000000330ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __x64_sys_delete_module+0x206/0x380
 ? lockdep_hardirqs_on_prepare+0xd8/0x190
 ? syscall_enter_from_user_mode+0x1c/0x50
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Link: https://lore.kernel.org/all/20221108015130.28326-2-shangxiaojing@huawei.com/
Fixes: 64836248dda2 ("tracing: Add kprobe event command generation test module")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2 years agoMerge tag 'amd-drm-fixes-6.1-2022-11-16' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 18 Nov 2022 01:09:04 +0000 (11:09 +1000)]
Merge tag 'amd-drm-fixes-6.1-2022-11-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.1-2022-11-16:

amdgpu:
- Fix a possible memory leak in ganng submit error path
- DP tunneling fixes
- DCN 3.1 page flip fix
- DCN 3.2.x fixes
- DCN 3.1.4 fixes
- Don't expose degamma on hardware that doesn't support it
- BACO fixes for SMU 11.x
- BACO fixes for SMU 13.x
- Virtual display fix for devices with no display hardware

amdkfd:
- Memory limit regression fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221117040416.6100-1-alexander.deucher@amd.com
2 years agoMerge tag 'drm-intel-fixes-2022-11-17' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 18 Nov 2022 01:02:53 +0000 (11:02 +1000)]
Merge tag 'drm-intel-fixes-2022-11-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix uaf with lmem_userfault_list handling (Matthew Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y3X2bNJ/4GR1BAiG@tursulin-desk
2 years agotracing: Fix wild-memory-access in register_synth_event()
Shang XiaoJing [Thu, 17 Nov 2022 01:23:46 +0000 (09:23 +0800)]
tracing: Fix wild-memory-access in register_synth_event()

In register_synth_event(), if set_synth_event_print_fmt() failed, then
both trace_remove_event_call() and unregister_trace_event() will be
called, which means the trace_event_call will call
__unregister_trace_event() twice. As the result, the second unregister
will causes the wild-memory-access.

register_synth_event
    set_synth_event_print_fmt failed
    trace_remove_event_call
        event_remove
            if call->event.funcs then
            __unregister_trace_event (first call)
    unregister_trace_event
        __unregister_trace_event (second call)

Fix the bug by avoiding to call the second __unregister_trace_event() by
checking if the first one is called.

general protection fault, probably for non-canonical address
0xfbd59c0000000024: 0000 [#1] SMP KASAN PTI
KASAN: maybe wild-memory-access in range
[0xdead000000000120-0xdead000000000127]
CPU: 0 PID: 3807 Comm: modprobe Not tainted
6.1.0-rc1-00186-g76f33a7eedb4 #299
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_trace_event+0x6e/0x280
Code: 00 fc ff df 4c 89 ea 48 c1 ea 03 80 3c 02 00 0f 85 0e 02 00 00 48
b8 00 00 00 00 00 fc ff df 4c 8b 63 08 4c 89 e2 48 c1 ea 03 <80> 3c 02
00 0f 85 e2 01 00 00 49 89 2c 24 48 85 ed 74 28 e8 7a 9b
RSP: 0018:ffff88810413f370 EFLAGS: 00010a06
RAX: dffffc0000000000 RBX: ffff888105d050b0 RCX: 0000000000000000
RDX: 1bd5a00000000024 RSI: ffff888119e276e0 RDI: ffffffff835a8b20
RBP: dead000000000100 R08: 0000000000000000 R09: fffffbfff0913481
R10: ffffffff8489a407 R11: fffffbfff0913480 R12: dead000000000122
R13: ffff888105d050b8 R14: 0000000000000000 R15: ffff888105d05028
FS:  00007f7823e8d540(0000) GS:ffff888119e00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7823e7ebec CR3: 000000010a058002 CR4: 0000000000330ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __create_synth_event+0x1e37/0x1eb0
 create_or_delete_synth_event+0x110/0x250
 synth_event_run_command+0x2f/0x110
 test_gen_synth_cmd+0x170/0x2eb [synth_event_gen_test]
 synth_event_gen_test_init+0x76/0x9bc [synth_event_gen_test]
 do_one_initcall+0xdb/0x480
 do_init_module+0x1cf/0x680
 load_module+0x6a50/0x70a0
 __do_sys_finit_module+0x12f/0x1c0
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Link: https://lkml.kernel.org/r/20221117012346.22647-3-shangxiaojing@huawei.com
Fixes: 4b147936fa50 ("tracing: Add support for 'synthetic' events")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Cc: stable@vger.kernel.org
Cc: <mhiramat@kernel.org>
Cc: <zanussi@kernel.org>
Cc: <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>