]> www.infradead.org Git - linux.git/log
linux.git
6 months agodrm/amdkfd: Add rec SDMA engines support with limited XGMI
Shane Xiao [Thu, 10 Apr 2025 04:35:15 +0000 (12:35 +0800)]
drm/amdkfd: Add rec SDMA engines support with limited XGMI

This patch adds recommended SDMA engines with limited XGMI SDMA engines.
It will help improve overall performance for device to device copies
with this optimization.

v2: Update the formatting issues and data type

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Suggested-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Enhance Cleaner Shader Handling in GFX v9.0 Architecture v2
Srinivasan Shanmugam [Fri, 11 Apr 2025 16:02:08 +0000 (21:32 +0530)]
drm/amdgpu: Enhance Cleaner Shader Handling in GFX v9.0 Architecture v2

This commit modifies the gfx_v9_0_ring_emit_cleaner_shader function
to use a switch statement for cleaner shader emission based on the
specific GFX IP version.

The function now distinguishes between different IP versions, using
PACKET3_RUN_CLEANER_SHADER_9_0 for the versions 9.0.1, 9.1.0,
9.2.1, 9.2.2, 9.3.0, and 9.4.0, while retaining
PACKET3_RUN_CLEANER_SHADER for version 9.4.2.

v2: Simplify logic (Alex).

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER_9_0 for Cleaner Shader execution
Srinivasan Shanmugam [Fri, 11 Apr 2025 15:45:41 +0000 (21:15 +0530)]
drm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER_9_0 for Cleaner Shader execution

This commit introduces the PACKET3_RUN_CLEANER_SHADER_9_0 definition,
which is a command packet utilized to instruct the GPU to execute the
cleaner shader for the GFX9.0 graphics architecture.

The cleaner shader is a piece of GPU code that is responsible for
clearing or initializing essential GPU resources, such as Local Data
Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar
General Purpose Registers (SGPRs). Properly clearing these resources  is
vital for ensuring data isolation and security between different
workloads executed on the GPU.

When the GPU receives this packet, it fetches and runs the cleaner
shader instructions from the specified location in the packet.  Thus by
preventing data leaks and ensuring that previous job states do not
interfere with subsequent workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/amdgpu: Fix out of bounds warning in amdgpu_hw_ip_info
Jesse Zhang [Fri, 11 Apr 2025 08:22:27 +0000 (16:22 +0800)]
drm/amd/amdgpu: Fix out of bounds warning in amdgpu_hw_ip_info

Fix an array index out of bounds warning in the DMA IP case of
amdgpu_hw_ip_info() where it was incorrectly checking
adev->gfx.gfx_ring[i].no_user_submission instead of
adev->sdma.instance[i].ring.no_user_submission.

The mismatch caused UBSAN to report an array bounds violation since
it was accessing the GFX ring array with SDMA instance indices.

Fixes: 4310acd4464b ("drm/amdgpu: add ring flag for no user submissions")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdkfd: add smi events for process start and end
Eric Huang [Mon, 7 Apr 2025 19:32:33 +0000 (15:32 -0400)]
drm/amdkfd: add smi events for process start and end

rocm-smi will be able to show the events for KFD process
start/end, it is the implementation of this feature.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Use the right function for hdp flush
Lijo Lazar [Fri, 11 Apr 2025 12:10:26 +0000 (17:40 +0530)]
drm/amdgpu: Use the right function for hdp flush

There are a few prechecks made before HDP flush like a flush is not
required on APU bare metal. Using hdp callback directly bypasses those
checks. Use amdgpu_device_flush_hdp which takes care of prechecks.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Direct ret in ras_reset_err_cnt on VF
Ellen Pan [Fri, 11 Apr 2025 02:12:24 +0000 (22:12 -0400)]
drm/amdgpu: Direct ret in ras_reset_err_cnt on VF

With adding sriov_vf check, we directly return EOPNOTSUPP in
ras_reset_error_count as we should not do anything on VF to reset RAS error
count.

This also fixes the issue that loading guest driver causes register
violations.

Reviewed-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Use generic hdp flush function
Lijo Lazar [Fri, 11 Apr 2025 11:15:46 +0000 (16:45 +0530)]
drm/amdgpu: Use generic hdp flush function

Except HDP v5.2 all use a common logic for HDP flush. Use a generic
function. HDP v5.2 forces NO_KIQ logic, revisit it later.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Set RAS EEPROM table version to v3 for umc v12_5
Candice Li [Fri, 11 Apr 2025 05:12:06 +0000 (13:12 +0800)]
drm/amdgpu: Set RAS EEPROM table version to v3 for umc v12_5

Set RAS EEPROM table version to v3 for umc v12_5.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Enable per-queue reset for SDMA v4.4.2 on IP v9.5.0
Jesse.zhang@amd.com [Tue, 8 Apr 2025 06:23:45 +0000 (14:23 +0800)]
drm/amdgpu: Enable per-queue reset for SDMA v4.4.2 on IP v9.5.0

Add support for per-queue reset on SDMA v4.4.2 when running with:
1. MEC firmware version 17 or later
2. DPM indicates SDMA reset is supported

v2: Fixed supported firmware versions  (Lijo)

Signed-off-by: Jesse.Zhang <Jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.2/11.5.3 GPUs
Srinivasan Shanmugam [Thu, 10 Apr 2025 14:07:06 +0000 (19:37 +0530)]
drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.2/11.5.3 GPUs

Enable the cleaner shader for additional GFX11.5.2/11.5.3 series GPUs to
ensure data isolation among GPU tasks. The cleaner shader is tasked with
clearing the Local Data Store (LDS), Vector General Purpose Registers
(VGPRs), and Scalar General Purpose Registers (SGPRs), which helps avoid
data leakage and guarantees the accuracy of computational results.

This update extends cleaner shader support to GFX11.5.2/11.5.3 GPUs,
previously available for GFX11.0.3. It enhances security by clearing GPU
memory between processes and maintains a consistent GPU state across KGD
and KFD workloads.

Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display/dml2: use vzalloc rather than kzalloc
Alex Deucher [Wed, 9 Apr 2025 01:27:15 +0000 (21:27 -0400)]
drm/amd/display/dml2: use vzalloc rather than kzalloc

The structures are large and they do not require contiguous
memory so use vzalloc.

Fixes: 70839da63605 ("drm/amd/display: Add new DCN401 sources")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4126
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agoDocumentation/amdgpu: Add Ryzen AI 350 series processors
Mario Limonciello [Thu, 10 Apr 2025 17:23:06 +0000 (12:23 -0500)]
Documentation/amdgpu: Add Ryzen AI 350 series processors

These have been announced so add them to the table.

Link: https://www.amd.com/en/products/processors/laptop/ryzen/ai-300-series/amd-ryzen-ai-7-350.html
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Add htmldocs description for fused_io interface
Roman Li [Wed, 9 Apr 2025 16:03:00 +0000 (12:03 -0400)]
drm/amd/display: Add htmldocs description for fused_io interface

[Why]
htmldocs build warning: "Function parameter or struct member 'fused_io'
not described in 'amdgpu_display_manager'".

[How]
Add missing description.

Fixes: ce801e5d6c1b ("drm/amd/display: HDCP Locality check using DMUB Fused IO")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: adjust enforce_isolation handling
Alex Deucher [Tue, 8 Apr 2025 14:39:06 +0000 (10:39 -0400)]
drm/amdgpu: adjust enforce_isolation handling

Switch from a bool to an enum and allow more options
for enforce isolation.  There are now 3 modes of operation:
- Disabled (0)
- Enabled (serialization and cleaner shader) (1)
- Enabled in legacy mode (no serialization or cleaner shader) (2)
This provides better flexibility for more use cases.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes12: use the device value for enforce isolation
Alex Deucher [Tue, 8 Apr 2025 14:47:24 +0000 (10:47 -0400)]
drm/amdgpu/mes12: use the device value for enforce isolation

Use the local setting rather than the global parameter.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes11: use the device value for enforce isolation
Alex Deucher [Tue, 8 Apr 2025 14:45:52 +0000 (10:45 -0400)]
drm/amdgpu/mes11: use the device value for enforce isolation

Use the local setting rather than the global parameter.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add back JPEG to video caps for carrizo and newer
David Rosca [Mon, 7 Apr 2025 11:12:11 +0000 (13:12 +0200)]
drm/amdgpu: Add back JPEG to video caps for carrizo and newer

JPEG is not supported on Vega only.

Fixes: 0a6e7b06bdbe ("drm/amdgpu: Remove JPEG from vega and carrizo video caps")
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: Implement the GFX12 KCQ pipe reset
Prike Liang [Fri, 21 Feb 2025 12:34:32 +0000 (20:34 +0800)]
drm/amdgpu/gfx12: Implement the GFX12 KCQ pipe reset

Implement the GFX12 KCQ pipe reset, and disable the GFX12
kernel compute queue until the CPFW fully supports it.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset
Ce Sun [Wed, 9 Apr 2025 11:53:11 +0000 (19:53 +0800)]
drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset

Checking hive is more readable.

The following smatch warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset()
warn: iterator used outside loop: 'tmp_adev'

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix warning of drm_mm_clean
ZhenGuo Yin [Tue, 8 Apr 2025 08:18:28 +0000 (16:18 +0800)]
drm/amdgpu: fix warning of drm_mm_clean

Kernel doorbell BOs needs to be freed before ttm_fini.

Fixes: 54c30d2a8def ("drm/amdgpu: create kernel doorbell pages")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/amdgpu: disable ASPM in some situations
Kenneth Feng [Tue, 1 Apr 2025 08:04:41 +0000 (16:04 +0800)]
drm/amd/amdgpu: disable ASPM in some situations

disable ASPM with some ASICs on some specific platforms.
required from PCIe controller owner.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: remove the duplicated mes queue active state setting
Prike Liang [Fri, 28 Mar 2025 10:33:09 +0000 (18:33 +0800)]
drm/amdgpu: remove the duplicated mes queue active state setting

The MES queue deactivation and active status are already set in
mes_userq_unmap|map(), so the caller needn't set the queue_active
bit again.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agoamd/amdgpu: Implement VCN queue reset for vcn 4.0.3
Ruili Ji [Tue, 1 Apr 2025 00:55:09 +0000 (20:55 -0400)]
amd/amdgpu: Implement VCN queue reset for vcn 4.0.3

Add function for vcn queue reset to make driver to
do fine-grained reset instead of the whole gpu reset.

Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Move read of snoop register from guest to host
Masha Grinman [Thu, 3 Apr 2025 19:08:17 +0000 (14:08 -0500)]
drm/amdgpu: Move read of snoop register from guest to host

Guest is reading/writing to snoop register which is a security violation
We moved the code to the host driver
And also added a validation on the guest side to check if it's guest

Signed-off-by: Masha Grinman <Masha.Grinman@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd: Forbid suspending into non-default suspend states
Mario Limonciello [Tue, 8 Apr 2025 18:09:57 +0000 (13:09 -0500)]
drm/amd: Forbid suspending into non-default suspend states

On systems that default to 'deep' some userspace software likes
to try to suspend in 'deep' first.  If there is a failure for any
reason (such as -ENOMEM) the failure is ignored and then it will
try to use 's2idle' as a fallback. This fails, but more importantly
it leads to graphical problems.

Forbid this behavior and only allow suspending in the last state
supported by the system.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4093
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250408180957.4027643-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4
Christian König [Fri, 28 Mar 2025 17:58:17 +0000 (18:58 +0100)]
drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4

Otherwise triggering sysfs multiple times without other submissions in
between only runs the shader once.

v2: add some comment
v3: re-add missing cast
v4: squash in semicolon fix

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma7: add support for disable_kq
Alex Deucher [Tue, 18 Feb 2025 18:23:55 +0000 (13:23 -0500)]
drm/amdgpu/sdma7: add support for disable_kq

When the parameter is set, disable user submissions
to kernel queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma6: add support for disable_kq
Alex Deucher [Tue, 18 Feb 2025 18:22:49 +0000 (13:22 -0500)]
drm/amdgpu/sdma6: add support for disable_kq

When the parameter is set, disable user submissions
to kernel queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma: add flag for tracking disable_kq
Alex Deucher [Tue, 18 Feb 2025 18:09:55 +0000 (13:09 -0500)]
drm/amdgpu/sdma: add flag for tracking disable_kq

For SDMA, we still need kernel queues for paging so
they need to be initialized, but we no not want to
accept submissions from userspace when disable_kq
is set.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: add support for disable_kq
Alex Deucher [Tue, 18 Feb 2025 17:37:51 +0000 (12:37 -0500)]
drm/amdgpu/gfx12: add support for disable_kq

Plumb in support for disabling kernel queues.

v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
v4: fix MEC interrupt offset (Sunil)
v5: clean up after removing extra sched.ready settings

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: add support for disable_kq
Alex Deucher [Tue, 18 Feb 2025 17:16:26 +0000 (12:16 -0500)]
drm/amdgpu/gfx11: add support for disable_kq

Plumb in support for disabling kernel queues in
GFX11.  We have to bring up a GFX queue briefly in
order to initialize the clear state.  After that
we can disable it.

v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
v4: fix MEC interrupt offset (Sunil)

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes: make more vmids available when disable_kq=1
Alex Deucher [Wed, 26 Feb 2025 17:51:30 +0000 (12:51 -0500)]
drm/amdgpu/mes: make more vmids available when disable_kq=1

If we don't have kernel queues, the vmids can be used by
the MES for user queues.

Acked-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes: update hqd masks when disable_kq is set
Alex Deucher [Wed, 26 Feb 2025 17:40:30 +0000 (12:40 -0500)]
drm/amdgpu/mes: update hqd masks when disable_kq is set

Make all resources available to user queues.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Suggested-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx: add generic handling for disable_kq
Alex Deucher [Tue, 18 Feb 2025 17:07:48 +0000 (12:07 -0500)]
drm/amdgpu/gfx: add generic handling for disable_kq

Add proper checks for disable_kq functionality in
gfx helper functions.  Add special logic for families
that require the clear state setup.

v2: use ring count as per Felix suggestion
v3: fix num_gfx_rings handling in amdgpu_gfx_graphics_queue_acquire()
v4: fix error code (Alex)

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add ring flag for no user submissions
Alex Deucher [Tue, 18 Feb 2025 18:06:19 +0000 (13:06 -0500)]
drm/amdgpu: add ring flag for no user submissions

This would be set by IPs which only accept submissions
from the kernel, not userspace, such as when kernel
queues are disabled. Don't expose the rings to userspace
and reject any submissions in the CS IOCTL.

v2: fix error code (Alex)

Reviewed-by: Sunil Khatri<sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add parameter to disable kernel queues
Alex Deucher [Tue, 18 Feb 2025 15:33:53 +0000 (10:33 -0500)]
drm/amdgpu: add parameter to disable kernel queues

On chips that support user queues, setting this option
will disable kernel queues to be used to validate
user queues without kernel queues.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: prevent runtime pm when userqs are active
Alex Deucher [Thu, 20 Mar 2025 16:49:30 +0000 (12:49 -0400)]
drm/amdgpu/userq: prevent runtime pm when userqs are active

Similar to KFD, prevent runtime pm while user queues are active.

Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: store userq_managers in a list in adev
Alex Deucher [Thu, 20 Feb 2025 20:56:24 +0000 (15:56 -0500)]
drm/amdgpu: store userq_managers in a list in adev

So we can iterate across them when we need to manage
all user queues.

v2: add uq_mgr to adev list in amdgpu_userq_mgr_init

Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: bump version for user queue IP support query
Alex Deucher [Mon, 24 Mar 2025 20:29:03 +0000 (16:29 -0400)]
drm/amdgpu: bump version for user queue IP support query

Add the user queue IP support query to the drm_amdgpu_info_device
query.

Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add UAPI to query if user queues are supported
Alex Deucher [Mon, 24 Mar 2025 20:26:00 +0000 (16:26 -0400)]
drm/amdgpu: add UAPI to query if user queues are supported

Add an INFO query to check if user queues are supported.

v2: switch to a mask of IPs (Marek)
v3: move to drm_amdgpu_info_device (Marek)

Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: split userq setup to a separate switch
Alex Deucher [Wed, 26 Mar 2025 16:21:22 +0000 (12:21 -0400)]
drm/amdgpu/gfx12: split userq setup to a separate switch

Add a separate switch statement for the userq callback
assignment so that we can assign the callbacks for each
asic as the firmware becomes available.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: clean up and consolidate sw_init
Alex Deucher [Wed, 26 Mar 2025 16:09:12 +0000 (12:09 -0400)]
drm/amdgpu/gfx11: clean up and consolidate sw_init

With the ME details fixed, we can now consolidate
this state.  Also split out the userq setup into a separate
switch statement so that we can set them per IP version
when the firmwares are ready.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix display freezing issue when resizing apps
Arvind Yadav [Tue, 18 Mar 2025 13:15:40 +0000 (18:45 +0530)]
drm/amdgpu: Fix display freezing issue when resizing apps

The display is freezing because the amdgpu_userq_wait_ioctl()
is waiting for a non-user queue fence(specifically, the PT update fence).

RootCause:
The resume_work is initiated by both amdgpu_userq_suspend and
amdgpu_userqueue_ensure_ev_fence at same time. The amdgpu_userq_suspend
signals a dma-fence and subsequently triggers the resume_work, which is
intended to replace the existing fence by creating new dma-fence. However,
following this, the amdgpu_userqueue_ensure_ev_fence schedules another
resume_work that generates a new dma-fence, thereby replacing the one
created by amdgpu_userq_suspend. Consequently, the original fence will
never be signaled.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes: warn on unexpected pipe numbers
Alex Deucher [Thu, 20 Mar 2025 14:18:58 +0000 (10:18 -0400)]
drm/amdgpu/mes: warn on unexpected pipe numbers

Warn if the number of pipes exceeds what the MES supports.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes: centralize gfx_hqd mask management
Alex Deucher [Wed, 26 Feb 2025 17:31:46 +0000 (12:31 -0500)]
drm/amdgpu/mes: centralize gfx_hqd mask management

Move it to amdgpu_mes to align with the compute and
sdma hqd masks. No functional change.

v2: rebase on new changes
v3: misc optimizations

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri<sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: remove is_mes_queue flag
Alex Deucher [Wed, 12 Mar 2025 17:47:33 +0000 (13:47 -0400)]
drm/amdgpu: remove is_mes_queue flag

This was leftover from MES bring up when we had MES
user queues in the kernel.  It's no longer used so
remove it.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes: remove unused functions
Alex Deucher [Wed, 26 Feb 2025 20:39:02 +0000 (15:39 -0500)]
drm/amdgpu/mes: remove unused functions

Leftover from the MES self tests that were removed previously.

Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: validate user queue parameters
Alex Deucher [Wed, 26 Feb 2025 21:31:57 +0000 (16:31 -0500)]
drm/amdgpu: validate user queue parameters

Make sure these are set properly to ensure compatibility if
we ever update the IOCTL interface.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix the memleak caused by fence not released
Arvind Yadav [Tue, 18 Feb 2025 13:26:25 +0000 (18:56 +0530)]
drm/amdgpu: fix the memleak caused by fence not released

Encountering a taint issue during the unloading of gpu_sched
due to the fence not being released/put. In this context,
amdgpu_vm_clear_freed is responsible for creating a job to
update the page table (PT). It allocates kmem_cache for
drm_sched_fence and returns the finished fence associated
with job->base.s_fence. In case of Usermode queue this finished
fence is added to the timeline sync object through
amdgpu_gem_update_bo_mapping, which is utilized by user
space to ensure the completion of the PT update.

[  508.900587] =============================================================================
[  508.900605] BUG drm_sched_fence (Tainted: G                 N): Objects remaining in drm_sched_fence on __kmem_cache_shutdown()
[  508.900617] -----------------------------------------------------------------------------

[  508.900627] Slab 0xffffe0cc04548780 objects=32 used=2 fp=0xffff8ea81521f000 flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
[  508.900645] CPU: 3 UID: 0 PID: 2337 Comm: rmmod Tainted: G                 N 6.12.0+ #1
[  508.900651] Tainted: [N]=TEST
[  508.900653] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
[  508.900656] Call Trace:
[  508.900659]  <TASK>
[  508.900665]  dump_stack_lvl+0x70/0x90
[  508.900674]  dump_stack+0x14/0x20
[  508.900678]  slab_err+0xcb/0x110
[  508.900687]  ? srso_return_thunk+0x5/0x5f
[  508.900692]  ? try_to_grab_pending+0xd3/0x1d0
[  508.900697]  ? srso_return_thunk+0x5/0x5f
[  508.900701]  ? mutex_lock+0x17/0x50
[  508.900708]  __kmem_cache_shutdown+0x144/0x2d0
[  508.900713]  ? flush_rcu_work+0x50/0x60
[  508.900719]  kmem_cache_destroy+0x46/0x1f0
[  508.900728]  drm_sched_fence_slab_fini+0x19/0x970 [gpu_sched]
[  508.900736]  __do_sys_delete_module.constprop.0+0x184/0x320
[  508.900744]  ? srso_return_thunk+0x5/0x5f
[  508.900747]  ? debug_smp_processor_id+0x1b/0x30
[  508.900754]  __x64_sys_delete_module+0x16/0x20
[  508.900758]  x64_sys_call+0xdf/0x20d0
[  508.900763]  do_syscall_64+0x51/0x120
[  508.900769]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

v2: call dma_fence_put in amdgpu_gem_va_update_vm
v3: Addressed review comments from Christian.
    - calling amdgpu_gem_update_timeline_node before switch.
    - puting a dma_fence in case of error or !timeline_syncobj.
v4: Addressed review comments from Christian.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: move the header to amdgpu directory
Alex Deucher [Fri, 28 Feb 2025 19:55:57 +0000 (14:55 -0500)]
drm/amdgpu/userq: move the header to amdgpu directory

To align with other headers.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: remove BROKEN from config
Alex Deucher [Wed, 19 Feb 2025 21:46:52 +0000 (16:46 -0500)]
drm/amdgpu/userq: remove BROKEN from config

This can be enabled now.  We have the firmware checks
in place.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add userq firmware version checks
Alex Deucher [Fri, 28 Feb 2025 19:50:11 +0000 (14:50 -0500)]
drm/amdgpu: add userq firmware version checks

Currently disabled until the firmwares are officially
released.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: fix config guard
Alex Deucher [Fri, 28 Feb 2025 19:45:37 +0000 (14:45 -0500)]
drm/amdgpu/gfx11: fix config guard

s/CONFIG_DRM_AMD_USERQ_GFX/CONFIG_DRM_AMDGPU_NAVI3X_USERQ/

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/Kconfig: fix wording of DRM_AMDGPU_NAVI3X_USERQ
Alex Deucher [Fri, 28 Feb 2025 19:37:31 +0000 (14:37 -0500)]
drm/amdgpu/Kconfig: fix wording of DRM_AMDGPU_NAVI3X_USERQ

The feature is not navi3x specific at this point.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: return an error in the userq IOCTL when DRM_AMDGPU_NAVI3X_USERQ=n
Alex Deucher [Fri, 28 Feb 2025 19:14:35 +0000 (14:14 -0500)]
drm/amdgpu: return an error in the userq IOCTL when DRM_AMDGPU_NAVI3X_USERQ=n

I'd swear this was already fixed, but I guess the patch never
landed.  Add it now.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: handle runtime pm
Alex Deucher [Thu, 20 Feb 2025 14:44:39 +0000 (09:44 -0500)]
drm/amdgpu/userq: handle runtime pm

Take a reference when we create a queue and drop it
when we destroy the queue.  We need to keep the device
active while user queues are active.

v2: squash in fix from Sunil
v3: squash in fix from Prike

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: fix hardcoded uq functions
Alex Deucher [Thu, 20 Feb 2025 21:08:02 +0000 (16:08 -0500)]
drm/amdgpu/userq: fix hardcoded uq functions

Use the IP type to look up the userq functions rather
than hardcoding it.

Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix display freeze lockup error
Arvind Yadav [Mon, 27 Jan 2025 12:52:01 +0000 (18:22 +0530)]
drm/amdgpu: Fix display freeze lockup error

A deadlock situation has arised between the userq
signal ioctl and the eviction fence. In this scenario,
the function amdgpu_userq_signal_ioctl() has acquired a reservation
lock on the read/write buffer object (BO) through drm_exec.
Subsequently, it calls amdgpu_userqueue_ensure_ev_fence(),
which is in a waiting for the userq resume work.
Meanwhile, the userq suspend worker has initiated the userq resume
work(amdgpu_userqueue_resume_worker). This userq resume work attempts
to validate the vm->done BO, leading to amdgpu_userqueue_validate_bos
also attempting to reservation lock the same write BO that is already
locked by amdgpu_userq_signal_ioctl.
As a result, the resume work becomes stalled, causing
amdgpu_userqueue_ensure_ev_fence to remain in a waiting state.

Call Trace:
[  242.836469] INFO: task gnome-shel:cs0:1288 blocked for more than 120 seconds.
[  242.836486]       Tainted: G           OE      6.12.0-rc2rebased-oct-24+ #4
[  242.836491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.836494] task:gnome-shel:cs0  state:D stack:0     pid:1288  tgid:1282  ppid:1180   flags:0x00000002
[  242.836503] Call Trace:
[  242.836508]  <TASK>
[  242.836517]  __schedule+0x3e0/0xb10
[  242.836530]  ? srso_return_thunk+0x5/0x5f
[  242.836541]  schedule+0x31/0x120
[  242.836546]  schedule_timeout+0x150/0x160
[  242.836551]  ? srso_return_thunk+0x5/0x5f
[  242.836555]  ? sysvec_call_function+0x69/0xd0
[  242.836562]  ? srso_return_thunk+0x5/0x5f
[  242.836567]  ? preempt_count_add+0x7f/0xd0
[  242.836577]  __wait_for_common+0x91/0x180
[  242.836582]  ? __pfx_schedule_timeout+0x10/0x10
[  242.836590]  wait_for_completion+0x28/0x30
[  242.836595]  __flush_work+0x16c/0x290
[  242.836602]  ? __pfx_wq_barrier_func+0x10/0x10
[  242.836611]  flush_delayed_work+0x3a/0x60
[  242.836621]  amdgpu_userqueue_ensure_ev_fence+0x2d/0xb0 [amdgpu]
[  242.836966]  amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[  242.837171]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.837365]  drm_ioctl_kernel+0xae/0x100 [drm]
[  242.837398]  drm_ioctl+0x2a1/0x500 [drm]
[  242.837420]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.837622]  ? srso_return_thunk+0x5/0x5f
[  242.837627]  ? srso_return_thunk+0x5/0x5f
[  242.837630]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[  242.837635]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[  242.837811]  __x64_sys_ioctl+0x99/0xd0
[  242.837820]  x64_sys_call+0x1209/0x20d0
[  242.837825]  do_syscall_64+0x51/0x120
[  242.837830]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  242.837835] RIP: 0033:0x7f2f33f1a94f
[  242.837838] RSP: 002b:00007f2f24ffea30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  242.837842] RAX: ffffffffffffffda RBX: 00007f2f24ffebd0 RCX: 00007f2f33f1a94f
[  242.837845] RDX: 00007f2f24ffebd0 RSI: 00000000c0306457 RDI: 000000000000000d
[  242.837847] RBP: 00007f2f24ffeab0 R08: 0000000000000000 R09: 0000000000000000
[  242.837849] R10: 00007f2f24ffecd0 R11: 0000000000000246 R12: 00007f2f25000640
[  242.837851] R13: 00000000c0306457 R14: 000000000000000d R15: 00007fff3b39c1e0
[  242.837858]  </TASK>
[  242.837865] INFO: task Xwayland:cs0:1517 blocked for more than 120 seconds.
[  242.837869]       Tainted: G           OE      6.12.0-rc2rebased-oct-24+ #4
[  242.837872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.837874] task:Xwayland:cs0    state:D stack:0     pid:1517  tgid:1338  ppid:1282   flags:0x00004002
[  242.837878] Call Trace:
[  242.837880]  <TASK>
[  242.837883]  __schedule+0x3e0/0xb10
[  242.837890]  schedule+0x31/0x120
[  242.837894]  schedule_preempt_disabled+0x1c/0x30
[  242.837897]  __mutex_lock.constprop.0+0x386/0x6e0
[  242.837902]  ? srso_return_thunk+0x5/0x5f
[  242.837905]  ? __timer_delete_sync+0x81/0xe0
[  242.837911]  __mutex_lock_slowpath+0x13/0x20
[  242.837915]  mutex_lock+0x3b/0x50
[  242.837919]  amdgpu_userqueue_ensure_ev_fence+0x35/0xb0 [amdgpu]
[  242.838138]  amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[  242.838340]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.838531]  drm_ioctl_kernel+0xae/0x100 [drm]
[  242.838559]  drm_ioctl+0x2a1/0x500 [drm]
[  242.838580]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.838778]  ? srso_return_thunk+0x5/0x5f
[  242.838783]  ? srso_return_thunk+0x5/0x5f
[  242.838786]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[  242.838791]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[  242.838967]  __x64_sys_ioctl+0x99/0xd0
[  242.838972]  x64_sys_call+0x1209/0x20d0
[  242.838975]  do_syscall_64+0x51/0x120
[  242.838979]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  242.838982] RIP: 0033:0x7f9118b1a94f
[  242.838985] RSP: 002b:00007f910cdff760 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  242.838989] RAX: ffffffffffffffda RBX: 00007f910cdff910 RCX: 00007f9118b1a94f
[  242.838991] RDX: 00007f910cdff910 RSI: 00000000c0306457 RDI: 000000000000000c
[  242.838993] RBP: 00007f910cdff7e0 R08: 0000000000000000 R09: 0000000000000001
[  242.838995] R10: 00007f910cdff9d4 R11: 0000000000000246 R12: 00007f910ce00640
[  242.838997] R13: 00000000c0306457 R14: 000000000000000c R15: 00007fff9dd11d10
[  242.839004]  </TASK>

v2: Addressed review comemnts from Christian.
v3/v4: Addressed review comemnts from Christian.
   - Move drm_exec drm_exec loop after userq fence create.
   - cleanup the newly created userq fence in case of error.
v5 - Addressed review comemnts from Christian.
   - Create a new amdgpu_userq_fence_alloc() function for allocation.
   - Calling dma_fence_put for cleanup procedure.
   - make amdgpu_userq_fence_create() function static.
   - drm_exec_init is called after mutex_unlock.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Modify the seq64 VM cache policy
Arunpravin Paneer Selvam [Mon, 10 Feb 2025 16:47:28 +0000 (22:17 +0530)]
drm/amdgpu: Modify the seq64 VM cache policy

The seq64 VM cache policy should be set to UC (Uncached) to
match with userqueue fence address kernel mapped memory's
cache settings.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix out-of-bounds issue in user fence
Arunpravin Paneer Selvam [Wed, 1 Jan 2025 08:52:29 +0000 (14:22 +0530)]
drm/amdgpu: Fix out-of-bounds issue in user fence

Fix out-of-bounds issue in userq fence create when
accessing the userq xa structure. Added a lock to
protect the race condition.

v2:(Christian)
  - Allocate memory with GFP_ATOMIC.

v3:
  - Moved to 2 xa approach.

v4:(Christian)
  - Lock the xa_for_each blocks and memory allocation part
    as well to make sure that xa is not modified in between
    the 2 xa_for_each blocks.

BUG: KASAN: slab-out-of-bounds in amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000006] Call Trace:
[  +0.000005]  <TASK>
[  +0.000005]  dump_stack_lvl+0x6c/0x90
[  +0.000011]  print_report+0xc4/0x5e0
[  +0.000009]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  ? kasan_complete_mode_report_info+0x26/0x1d0
[  +0.000007]  ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000405]  kasan_report+0xdf/0x120
[  +0.000009]  ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000405]  __asan_report_store8_noabort+0x17/0x20
[  +0.000007]  amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000406]  ? __pfx_amdgpu_userq_fence_create+0x10/0x10 [amdgpu]
[  +0.000408]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  ? ttm_resource_move_to_lru_tail+0x235/0x4f0 [ttm]
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  amdgpu_userq_signal_ioctl+0xd29/0x1c70 [amdgpu]
[  +0.000412]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  +0.000404]  ? try_to_wake_up+0x165/0x1840
[  +0.000010]  ? __pfx_futex_wake_mark+0x10/0x10
[  +0.000011]  drm_ioctl_kernel+0x178/0x2f0 [drm]
[  +0.000050]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  +0.000404]  ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[  +0.000043]  ? __kasan_check_read+0x11/0x20
[  +0.000007]  ? srso_return_thunk+0x5/0x5f
[  +0.000007]  ? __kasan_check_write+0x14/0x20
[  +0.000008]  drm_ioctl+0x513/0xd20 [drm]
[  +0.000040]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  +0.000407]  ? __pfx_drm_ioctl+0x10/0x10 [drm]
[  +0.000044]  ? srso_return_thunk+0x5/0x5f
[  +0.000007]  ? _raw_spin_lock_irqsave+0x99/0x100
[  +0.000007]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  +0.000006]  ? __rseq_handle_notify_resume+0x188/0xc30
[  +0.000008]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  ? srso_return_thunk+0x5/0x5f
[  +0.000006]  ? _raw_spin_unlock_irqrestore+0x27/0x50
[  +0.000010]  amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[  +0.000388]  __x64_sys_ioctl+0x135/0x1b0
[  +0.000009]  x64_sys_call+0x1205/0x20d0
[  +0.000007]  do_syscall_64+0x4d/0x120
[  +0.000008]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000007] RIP: 0033:0x7f7c3d31a94f

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add db size and offset range for VCN and VPE
Saleemkhan Jamadar [Mon, 6 Jan 2025 07:20:50 +0000 (12:50 +0530)]
drm/amdgpu: add db size and offset range for VCN and VPE

VCN and VPE have different offset range, update the doorbell
offset range repsectively.
Doorbell size for VCN and VPE is 32bit.

v1 : add gfx switch case and fix checkpatch warnings (Shashank)

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: map doorbell for the requested userq
Saleemkhan Jamadar [Fri, 3 Jan 2025 13:32:59 +0000 (19:02 +0530)]
drm/amdgpu: map doorbell for the requested userq

Introduce db_info structure to the populate the doorbell
information that is required to be mapped.

Made changes to the doorbell mapping func more generic,
by taking parameters that vary based on IPs and/or usecase
into db_info structure.

v2 - Fix space alignment and checkpatch warnings(Shashank)

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix call to amdgpu_eviction_fence_detach
Christian König [Fri, 20 Dec 2024 12:44:23 +0000 (13:44 +0100)]
drm/amdgpu: fix call to amdgpu_eviction_fence_detach

That needs to be done after grabbing the lock, not before.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix Illegal opcode in command stream Error
Arvind Yadav [Thu, 19 Dec 2024 14:13:54 +0000 (19:43 +0530)]
drm/amdgpu: Fix Illegal opcode in command stream Error

When applications closes, it triggers the drm_file_free
function which subsequently releases all allocated buffer
objects. Concurrently, the resume_worker thread will attempt
to map the usermode queue. However, since the wptr buffer
object has already been deallocated, this will result in
an Illegal opcode error being raised in the command stream.

Now replacing drm_release() with a new function
amdgpu_drm_release(). This function will set the flag to
prevent the scheduling of any new queue resume/map, stop
all queues and then call drm_release().

V2:
  - Replace drm_release with amdgpu_drm_release(Christian).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Apply sign extension to seq64
Arunpravin Paneer Selvam [Thu, 12 Dec 2024 14:06:16 +0000 (19:36 +0530)]
drm/amdgpu: Apply sign extension to seq64

Apply sign extension to seq64 va address.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Modify the MES process va end limit
Christian König [Mon, 9 Dec 2024 17:40:48 +0000 (23:10 +0530)]
drm/amdgpu: Modify the MES process va end limit

Modify the MES process va end limit to max pfn.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix the use-after-free issue in wait IOCTL
Arunpravin Paneer Selvam [Mon, 9 Dec 2024 17:34:34 +0000 (23:04 +0530)]
drm/amdgpu: Fix the use-after-free issue in wait IOCTL

The xarray pointer which has the userqueue xarray structure
reference should be cleared when the userqueue gets
destroyed. Otherwise, we may access the freed xa memory and
see the below warnings.

warning 1:
BUG: KASAN: slab-use-after-free in _raw_spin_lock+0x7a/0xe0
[  +0.000044] Call Trace:
[  +0.000017]  <TASK>
[  +0.000016]  dump_stack_lvl+0x6c/0x90
[  +0.000025]  print_report+0xc4/0x5e0
[  +0.000025]  ? srso_return_thunk+0x5/0x5f
[  +0.000024]  ? kasan_complete_mode_report_info+0x60/0x1d0
[  +0.000030]  ? _raw_spin_lock+0x7a/0xe0
[  +0.000023]  kasan_report+0xdf/0x120
[  +0.000023]  ? _raw_spin_lock+0x7a/0xe0
[  +0.000025]  kasan_check_range+0xf7/0x1b0
[  +0.000025]  __kasan_check_write+0x14/0x20
[  +0.000024]  _raw_spin_lock+0x7a/0xe0
[  +0.000023]  ? __pfx__raw_spin_lock+0x10/0x10
[  +0.000024]  ? amdgpu_userq_wait_ioctl+0xac0/0x1f30 [amdgpu]
[  +0.000442]  amdgpu_userq_wait_ioctl+0x18fc/0x1f30 [amdgpu]
[  +0.000428]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  +0.000424]  ? __pfx_idr_alloc_u32+0x10/0x10
[  +0.000027]  ? srso_return_thunk+0x5/0x5f
[  +0.000024]  ? __kasan_check_write+0x14/0x20
[  +0.000025]  ? srso_return_thunk+0x5/0x5f
[  +0.000024]  ? idr_alloc+0x72/0xc0
[  +0.000023]  ? srso_return_thunk+0x5/0x5f
[  +0.000023]  ? fput+0x1c/0x2f0
[  +0.000025]  drm_ioctl_kernel+0x178/0x2f0 [drm]
[  +0.000065]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  +0.000425]  ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[  +0.000064]  ? srso_return_thunk+0x5/0x5f
[  +0.000023]  ? __kasan_check_write+0x14/0x20
[  +0.000025]  drm_ioctl+0x513/0xd20 [drm]
[  +0.000058]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  +0.000428]  ? __pfx_drm_ioctl+0x10/0x10 [drm]
[  +0.000061]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  +0.000027]  ? __count_memcg_events+0x11f/0x3a0
[  +0.000027]  ? srso_return_thunk+0x5/0x5f
[  +0.001040]  ? srso_return_thunk+0x5/0x5f
[  +0.000969]  ? _raw_spin_unlock_irqrestore+0x27/0x50
[  +0.000966]  amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[  +0.001352]  __x64_sys_ioctl+0x135/0x1b0
[  +0.000966]  x64_sys_call+0x1205/0x20d0
[  +0.000968]  do_syscall_64+0x4d/0x120
[  +0.000960]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000962] RIP: 0033:0x7f42af11a94f

warning 2:
WARNING: at lib/xarray.c:1849 __xa_alloc+0x13a/0x150
[  366.491409] RIP: 0010:__xa_alloc+0x13a/0x150
[  366.491434] Call Trace:
[  366.491437]  <TASK>
[  366.491440]  ? show_regs+0x6d/0x80
[  366.491445]  ? __warn+0x91/0x140
[  366.491450]  ? __xa_alloc+0x13a/0x150
[  366.491453]  ? report_bug+0x1c9/0x1e0
[  366.491459]  ? handle_bug+0x63/0xa0
[  366.491463]  ? exc_invalid_op+0x1d/0x80
[  366.491467]  ? asm_exc_invalid_op+0x1f/0x30
[  366.491476]  ? __xa_alloc+0x13a/0x150
[  366.491484]  amdgpu_userq_wait_ioctl+0xe0e/0xfe0 [amdgpu]
[  366.491743]  ? idr_alloc_u32+0x97/0xd0
[  366.491749]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  366.491912]  drm_ioctl_kernel+0xae/0x100 [drm]
[  366.491942]  drm_ioctl+0x2a1/0x500 [drm]
[  366.491961]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  366.492127]  ? srso_return_thunk+0x5/0x5f
[  366.492132]  ? srso_return_thunk+0x5/0x5f
[  366.492135]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[  366.492139]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[  366.492288]  __x64_sys_ioctl+0x99/0xd0
[  366.492295]  x64_sys_call+0x1209/0x20d0
[  366.492299]  do_syscall_64+0x51/0x120
[  366.492303]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  366.492418] RIP: 0033:0x7f86f3b1a94f

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix NULL ptr dereference issue for non userq fences
Arunpravin Paneer Selvam [Mon, 9 Dec 2024 17:32:28 +0000 (23:02 +0530)]
drm/amdgpu: Fix NULL ptr dereference issue for non userq fences

Add the correct fences count variable [num_fences] in the fences
array iteration to handle the userq / non-userq fences.

v2:(Christian)
  - All fences in the array either come from some reservation object
    or drm_syncobj. If any of those are NULL then there is a bug
    somewhere else.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add mqd for userq compute queue
Arunpravin Paneer Selvam [Mon, 9 Dec 2024 13:50:56 +0000 (19:20 +0530)]
drm/amdgpu: Add mqd for userq compute queue

Add mqd for userq compute queue for gfx11/gfx12

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: enable eviction fence
Shashank Sharma [Wed, 20 Nov 2024 17:45:33 +0000 (18:45 +0100)]
drm/amdgpu: enable eviction fence

This patch enables attachment and detachment of eviction fences.
This is just a fork of eviction fence enabling code from the first
patch of the series so that the CI testing can happen on fully
fledged code.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: simplify eviction fence suspend/resume
Shashank Sharma [Wed, 11 Dec 2024 11:09:00 +0000 (12:09 +0100)]
drm/amdgpu: simplify eviction fence suspend/resume

The basic idea in this redesign is to add an eviction fence only in UQ
resume path. When userqueue is not present, keep ev_fence as NULL

Main changes are:
 - do not create the eviction fence during evf_mgr_init, keeping
   evf_mgr->ev_fence=NULL until UQ get active.
 - do not replace the ev_fence in evf_resume path, but replace it only in
   uq_resume path, so remove all the unnecessary code from ev_fence_resume.
 - add a new helper function (amdgpu_userqueue_ensure_ev_fence) which
   will do the following:
   - flush any pending uq_resume work, so that it could create an
     eviction_fence
   - if there is no pending uq_resume_work, add a uq_resume work and
     wait for it to execute so that we always have a valid ev_fence
 - call this helper function from two places, to ensure we have a valid
   ev_fence:
   - when a new uq is created
   - when a new uq completion fence is created

v2: Worked on review comments by Christian.
v3: Addressed few more review comments by Christian.
v4: Move mutex lock outside of the amdgpu_userqueue_suspend()
    function (Christian).
v5: squash in build fix (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: enable userqueue secure sem for GFX 12
Arunpravin Paneer Selvam [Tue, 26 Nov 2024 14:51:08 +0000 (15:51 +0100)]
drm/amdgpu: enable userqueue secure sem for GFX 12

- Add a field in struct amdgpu_mqd_prop for userqueue
  secure sem fence address since now we have a generic
  file for mes_userqueue.c
- Add secure sem fence address mqd support to gfx12 into
  their corresponding init functions.
- Enable secure semaphore IRQ handling

V2: Address review comment from Alex:
    Use fence_address instead of fenceaddress (Shashank)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: enable userqueue support for GFX12
Somalapuram Amaranath [Thu, 10 Oct 2024 18:08:06 +0000 (20:08 +0200)]
drm/amdgpu: enable userqueue support for GFX12

This patch enables Usermode queue support across GFX, Compute
and SDMA IPs on GFX12/SDMA7. It typically reuses Navi3X userqueue
IP functions to create and destroy MQDs.

v2: rebase on proposed changes (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/uq: make MES UQ setup generic
Alex Deucher [Tue, 26 Nov 2024 14:45:19 +0000 (15:45 +0100)]
drm/amdgpu/uq: make MES UQ setup generic

Now that all of the IP specific code has been moved into
the IP specific functions, we can make this code generic.

V2: Fixed build errors and porting logics (Shashank)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/uq: remove gfx11 specifics from UQ setup
Alex Deucher [Mon, 21 Oct 2024 13:16:12 +0000 (15:16 +0200)]
drm/amdgpu/uq: remove gfx11 specifics from UQ setup

This can all be handled by in the IP specific mpd init
code.

V2: Removed setting of gds_va, which was removed during UAPI
    review (Shashank)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma7: update mqd init for UQ
Alex Deucher [Fri, 18 Oct 2024 18:15:51 +0000 (14:15 -0400)]
drm/amdgpu/sdma7: update mqd init for UQ

Set the addresses for the UQ metadata.

V2: Fix lower offset mask (Shashank)
V2: Use lower_32_bits for mqd objects(Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma6: update mqd init for UQ
Alex Deucher [Fri, 18 Oct 2024 18:15:31 +0000 (14:15 -0400)]
drm/amdgpu/sdma6: update mqd init for UQ

Set the addresses for the UQ metadata.

V2: Fix lower address mask (Shashank)
V3: Use lower_32_bits for MQD objects (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: update mqd init for UQ
Alex Deucher [Fri, 18 Oct 2024 18:15:12 +0000 (14:15 -0400)]
drm/amdgpu/gfx12: update mqd init for UQ

Set the addresses for the UQ metadata.

V2: Fix lower address mask (Shashank)
V3: Use lower_32_bits() for MQD objects (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix IGT CI regression with eviction fence
Amaranath Somalapuram [Wed, 27 Nov 2024 16:06:45 +0000 (17:06 +0100)]
drm/amdgpu: fix IGT CI regression with eviction fence

This patch fixes one of the regressions in eviction fence code with
IGT tests.

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Amaranath Somalapuram <amaranath.somalapuram@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: update mqd init for UQ
Alex Deucher [Fri, 18 Oct 2024 18:14:34 +0000 (14:14 -0400)]
drm/amdgpu/gfx11: update mqd init for UQ

Set the addresses for the UQ metadata.

V2: Fix lower address (Shashank)
V3: Restore lower_32_bits() for MQD addresses (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add some additional members to amdgpu_mqd_prop
Alex Deucher [Fri, 18 Oct 2024 17:58:23 +0000 (13:58 -0400)]
drm/amdgpu: add some additional members to amdgpu_mqd_prop

These are needed to make userqueue infrastructure generic.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: handle eviction fence race
Shashank Sharma [Wed, 20 Nov 2024 17:04:33 +0000 (18:04 +0100)]
drm/amdgpu: handle eviction fence race

The eviction process can get into a race condition between the eviction
fence suspend work (which replaces the old fence with new) and kms_close
(which destroys the fence and doesn't expect a new one).

This patch:
- adds a flag to indicate that fd is closing, so fence replacement is
  not required (evf_mgr->fd_closing)
- adds a flush_work() during the ev_fence_destroy routine

V2: Addressed review comments from Christian:
    - Do not use mutex to sync
    - Use flush_work and wait for suspend_work to be done

V3: Fixed state machine for queue->active, which adds into race between
    suspend/resume and queue ops

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: resume gfx userqueues
Shashank Sharma [Wed, 20 Nov 2024 18:02:26 +0000 (19:02 +0100)]
drm/amdgpu: resume gfx userqueues

This patch adds support for userqueue resume. What it typically does is
this:
- adds a new delayed work for resuming all the queues.
- schedules this delayed work from the suspend work.
- validates the BOs and replaces the eviction fence before resuming all
  the queues running under this instance of userq manager.

V2: Addressed Christian's review comments:
    - declare local variables like ret at the bottom.
    - lock all the object first, then start attaching the new fence.
    - dont replace old eviction fence, just attach new eviction fence.
    - no error logs for drm_exec_lock failures
    - no need to reserve bos after drm_exec_locked
    - schedule the resume worker immediately (not after 100 ms)
    - check for NULL BO (Arvind)

V5: Rebased wrt changes in suspend patch
    - moved amdgpu_userqueue_validate_vm_bo in this patch
    - initialized ret in resume_all

V6: Rebase
V7: Addressed review comments from Christian
    - Do not use list_for_each_safe() with vm->invalidated, its not
      correct way
V8: Fixed the race condition between suspend/close/fence

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: suspend gfx userqueues
Shashank Sharma [Wed, 20 Nov 2024 17:59:49 +0000 (18:59 +0100)]
drm/amdgpu: suspend gfx userqueues

This patch adds suspend support for gfx userqueues. It typically does
the following:
- adds an enable_signaling function for the eviction fence, so that it
  can trigger the userqueue suspend,
- adds a delayed work to handle suspending of the eviction_fence
- adds a suspend function to handle suspending of userqueues which
  suspends all the queues under this userq manager and signals the
  eviction fence,
- adds a function to replace the old eviction fence with a new one and
  attach it to each of the objects,
- adds reference of userq manager in the eviction fence container so
  that it can be used in the suspend function.

V2: Addressed Christian's review comments:
    - schedule suspend work immediately

V4: Addressed Christian's review comments:
    - wait for pending uq fences before starting suspend, added
      queue->last_fence for the same
    - accommodate ev_fence_mgr into existing code
    - some bug fixes and NULL checks

V5: Addressed Christian's review comments (gitlab)
    - Wait for eviction fence to get signaled in destroy,
      don't signal it
    - Wait for eviction fence to get signaled in replace fence,
      don't signal it

V6: Addressed Christian's review comments
    - Do not destroy the old eviction fence until we have it replaced
    - Change the sequence of fence replacement sub-tasks
    - reusing the ev_fence delayed work for userqueue suspend as well
      (Shashank).

V7: Addressed Christian's review comments
    - give evf_mgr as argument (instead of fpriv) to replace_fence()
    - save ptr to evf_mgr in ev_fence (instead of uq_mgr)
    - modify suspend_all_queues logic to reflect error properly
    - remove the garbage drm_exec_lock section in wait_for_signal
    - grab the userqueue mutex before starting the wait for fence
    - remove the unrelated gobj check from signal_ioctl

V8: Added race condition fixes

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add userqueue suspend/resume functions
Shashank Sharma [Mon, 3 Jun 2024 09:13:03 +0000 (11:13 +0200)]
drm/amdgpu: add userqueue suspend/resume functions

This patch adds userqueue suspend/resume functions at
core MES V11 IP level.

V2: use true/false for queue_active status (Christian)
    added Christian's R-B

V3: reset/set queue status in mqd.create and mqd.destroy

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add gfx eviction fence helpers
Shashank Sharma [Tue, 27 Aug 2024 10:14:43 +0000 (15:44 +0530)]
drm/amdgpu: add gfx eviction fence helpers

This patch adds basic eviction fence framework for the gfx buffers.
The idea is to:
- One eviction fence is created per gfx process, at kms_open.
- This fence is attached to all the gem buffers created
  by this process.
- This fence is detached to all the gem buffers at postclose_kms.

This framework will be further used for usermode queues.

V2: Addressed review comments from Christian
    - keep fence_ctx and fence_seq directly in fpriv
    - evcition_fence should be dynamically allocated
    - do not save eviction fence instance in BO, there could be many
      such fences attached to one BO
    - use dma_resv_replace_fence() in detach

V3: Addressed review comments from Christian
    - eviction fence create and destroy functions should be called
      only once from fpriv create/destroy
    - use dma_fence_put() in eviction_fence_destroy

V4: Addressed review comments from Christian:
    - create a separate ev_fence_mgr structure
    - cleanup fence init part
    - do not add a domain for fence owner KGD

V5: Addressed review comments from Christian:
    - drop the dma_fence_is_signaled check
    - use a local variable to access evf_mgr->ev_fence under the
      spin_lock() multiple places
    - remove the vm->is_compute_ctx check to attach gfx eviction fence,
      in gem_object_open

V6: Addressed review comments from Christian:
    - drop the return value from eviction_fence_signal
    - reserve_fence should be the first thing inside the
      attach_eviction_fence function, also keep the resv_add_fence inside
      the lock
    - remove the unwanted ev_fence check inside detach function
    - fix wrong variable check in eviction_fence_init function
    - return the error value of eviction_fence_init to the caller, dont
      keep it void.
    - fail gem_object_open if attaching of eviction_fence fails
    - detach the eviction fence only when amdgpu_vm_is_bo_always_valid
      is not true.

V7: Addressed review comments from Christian:
    - Do not add a uq_mgr ptr in ev_fence, rather add evf_mgr

V8: Move eviction fence enabling into separate patch for CI

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add the argument description for gpu_addr
Sunil Khatri [Wed, 13 Nov 2024 08:10:33 +0000 (13:40 +0530)]
drm/amdgpu: add the argument description for gpu_addr

Add argument description for the input argument
gpu_addr for amdgpu_seq64_alloc.

Fixes the warning raised by the compiler:
drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c:168:
warning: Function parameter or struct member 'gpu_addr' not described in 'amdgpu_seq64_alloc

Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add new AMDGPU_INFO subquery for userq objects
Shashank Sharma [Wed, 30 Oct 2024 14:39:42 +0000 (15:39 +0100)]
drm/amdgpu: add new AMDGPU_INFO subquery for userq objects

This patch adds a new subquery (AMDGPU_INFO_UQ_FW_AREAS) in
AMDGPU_INFO_IOCTL to get the size and alignment of shadow
and csa objects from the FW setup. This information is
required for the userqueue consumers.

V2: Added Alex's suggestions and addressed review comments:
- make this query IP specific (GFX/SDMA etc)
- give a better title (AMDGPU_INFO_UQ_METADATA)
- restructured the code as per sample code shared by Alex

V3: Split the UAPI patch from shadow_size_fn modifications
V4: Addressed review comments from UAPI review (Marek/Pierre-Eric)
    - Change the query name to AMDGPU_INFO_UQ_FW_AREAS
    - remove unused inpur parameter for AMDGPU_HW_IP*

link: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/400/
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add get_gfx_shadow_info callback for gfx12
Shashank Sharma [Thu, 24 Oct 2024 15:07:42 +0000 (17:07 +0200)]
drm/amdgpu: add get_gfx_shadow_info callback for gfx12

This callback gets the size and alignment requirements
for the gfx shadow buffer for preemption.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Modify userq signal/wait struct field names
Arunpravin Paneer Selvam [Mon, 11 Nov 2024 07:13:07 +0000 (12:43 +0530)]
drm/amdgpu: Modify userq signal/wait struct field names

Modify kernel UAPI userq signal/wait struct field names and
description corresponding to the libdrm UAPI review comments.

libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: bypass SRIOV check for shadow size info
Shashank Sharma [Wed, 30 Oct 2024 14:32:27 +0000 (15:32 +0100)]
drm/amdgpu: bypass SRIOV check for shadow size info

Currently, the shadow FW space size and alignment information is
protected under a flag (adev->gfx.cp_gfx_shadow) which gets set
only in case of SRIOV setups.
if (amdgpu_sriov_vf(adev))
    adev->gfx.cp_gfx_shadow = true;

But we need this information for GFX Userqueues, so that user can
create these objects while creating userqueue. This patch series
creates a method to get this information bypassing the dependency
on this check.

This patch:
- adds a new input parameter flag to the gfx.funcs->get_gfx_shadow_info
fptr definition, so that it can accommodate the information without the
check (adev->gfx.cp_gfx_shadow) on request.
- updates the existing definition of amdgpu_gfx_get_gfx_shadow_info to
adjust with this new flag.

Next patch in the series is adding a UAPI which will consume this info.

V2: split this patch from the new UAPI patch

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix userqueue UAPI comments
Shashank Sharma [Mon, 11 Nov 2024 11:34:30 +0000 (12:34 +0100)]
drm/amdgpu: fix userqueue UAPI comments

This patch fixes some of the pending UAPI review comments
from the libDRM/UAPI review process.

- It updates some outdated comments in the userqueue UAPI header
  highlighted during the libdrm UAPI review.
- It removes the GDS BO support which was found unused.
- It also removes the unused flags parameter from the UAPI.
- It also adds a padding variables in userqueue in/out structures.

(Pierre-Eric and Marek)
  - clarify comments on top of drm_amdgpu_userq_in
  - clarify comment for queue_id (in)
  - clarify comment for mqd
  - clarify comment for compute MQD size
  - clarify comment for queue_id (out)
  - remove GDB object from BO object list
  - remove the unused flags parameter

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agoRevert "drm/amdgpu: don't allow userspace to create a doorbell BO"
Shashank Sharma [Tue, 28 Nov 2023 14:22:30 +0000 (19:52 +0530)]
Revert "drm/amdgpu: don't allow userspace to create a doorbell BO"

This reverts commit 6be2ad4f0073c541146caa66c5ae936c955a8224.

This patch was to block userspace to use doorbell manager UAPI
until usermode queue UAPI gets approved. UQ UAPI got approved in the
following MR:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add input fence to sync bo map/unmap
Arvind Yadav [Wed, 25 Sep 2024 16:10:41 +0000 (18:10 +0200)]
drm/amdgpu: Add input fence to sync bo map/unmap

This patch adds input fences to VM_IOCTL for buffer object.
The kernel will map/unmap the BO only when the fence is signaled.
The UAPI for the same has been approved here:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

V2: Bug fix (Arvind)
V3: Bug fix (Arvind)
V4: Rename UAPI objects as per UAPI review (Marek)
V5: Addressed review comemnts from Christian
     - function should return error.
     - Add 'TODO' comment
     - The input fence should be independent of the operation.
V6: Addressed review comemnts from Christian
    - Release the memory allocated by memdup_user().
V7: Addressed review comemnts from Christian
    - Drop the debug print and add "return r;" for the error handling.

V11: Rebase
v12: Fix 32-bit holes issue in sturct drm_amdgpu_gem_va.
v13: Fix deadlock issue.
v14: Fix merge conflict.
v15: Fix review comment by renaming syncobj handles.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add userq specific kernel config for fence ioctls
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:56:37 +0000 (11:26 +0530)]
drm/amdgpu: add userq specific kernel config for fence ioctls

Keep the user queue fence signal and wait IOCTLs in the
kernel config CONFIG_DRM_AMDGPU_NAVI3X_USERQ.

v2(Christian):
  - Remove the userq specific config added for kernel queues fence init
    function.

v3(Alex):
  - It will be better to return an error(-ENOTSUPP) in these cases.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add gpu_addr support to seq64 allocation
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:49:26 +0000 (11:19 +0530)]
drm/amdgpu: Add gpu_addr support to seq64 allocation

Add gpu address support to seq64 alloc function.

v1:(Christian)
  - Add the user of this new interface change to the same
    patch.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add separate array of read and write for BO handles
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:29:04 +0000 (10:59 +0530)]
drm/amdgpu: Add separate array of read and write for BO handles

Drop AMDGPU_USERQ_BO_WRITE as this should not be a global option
of the IOCTL, It should be option per buffer. Hence adding separate
array for read and write BO handles.

v2(Marek):
  - Internal kernel details shouldn't be here. This file should only
    document the observed behavior, not the implementation .

v3:
  - Fix DAL CI clang issue.

v4:
  - Added Alex RB to merge the kernel UAPI changes since he has
    already approved the amdgpu_drm.h changes.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Suggested-by: Marek Olšák <marek.olsak@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add vm root BO lock before accessing the vm
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:26:57 +0000 (10:56 +0530)]
drm/amdgpu: add vm root BO lock before accessing the vm

Add a vm root BO lock before accessing the userqueue VM.

v1:(Christian)
   - Keep the VM locked until you are done with the mapping.
   - Grab a temporary BO reference, drop the VM lock and acquire the BO.
     When you are done with everything just drop the BO lock and
     then the temporary BO reference.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add the missing error handling for xa_store() call
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:25:22 +0000 (10:55 +0530)]
drm/amdgpu: Add the missing error handling for xa_store() call

Add the missing error handling for xa_store() call in the function
amdgpu_userq_fence_driver_alloc().

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>