]> www.infradead.org Git - users/willy/xarray.git/log
users/willy/xarray.git
5 months agodrm/amdgpu/gfx: Introduce helpers handling CSB manipulation
Rodrigo Siqueira [Mon, 21 Apr 2025 22:12:18 +0000 (16:12 -0600)]
drm/amdgpu/gfx: Introduce helpers handling CSB manipulation

From GFX6 to GFX11, there is a function for getting the CSB buffer to be
put into the hardware. Three common parts are duplicated in all of these
GFX functions:

1. Prepare the CSB preamble.
2. Parser the CS data.
3. End the CSB preamble.

This commit creates helpers to be used from GFX6 to GFX11.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 months agodrm/amdgpu: Fix spelling mistake "rounter" -> "router"
Colin Ian King [Tue, 22 Apr 2025 10:26:26 +0000 (11:26 +0100)]
drm/amdgpu: Fix spelling mistake "rounter" -> "router"

There is a spelling mistake with the array utcl2_rounter_str, it
appears it should be utcl2_router_str. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 months agodrm/amdgpu/atom: Work around vbios NULL offset false positive
Kees Cook [Mon, 21 Apr 2025 21:58:34 +0000 (14:58 -0700)]
drm/amdgpu/atom: Work around vbios NULL offset false positive

GCC really does not want to consider NULL (or near-NULL) addresses as
valid, so calculations based off of NULL end up getting range-tracked into
being an offset wthin a 0 byte array. It gets especially mad about this:

                if (vbios_str == NULL)
                        vbios_str += sizeof(BIOS_ATOM_PREFIX) - 1;
...
        if (vbios_str != NULL && *vbios_str == 0)
                vbios_str++;

It sees this as being "sizeof(BIOS_ATOM_PREFIX) - 1" byte offset from
NULL, when building with -Warray-bounds (and the coming
-fdiagnostic-details flag):

In function 'atom_get_vbios_pn',
    inlined from 'amdgpu_atom_parse' at drivers/gpu/drm/amd/amdgpu/atom.c:1553:2:
drivers/gpu/drm/amd/amdgpu/atom.c:1447:34: error: array subscript 0 is outside array bounds of 'unsigned char[0]' [-Werror=array-bounds=]
 1447 |         if (vbios_str != NULL && *vbios_str == 0)
      |                                  ^~~~~~~~~~
  'amdgpu_atom_parse': events 1-2
 1444 |                 if (vbios_str == NULL)
      |                    ^
      |                    |
      |                    (1) when the condition is evaluated to true
......
 1447 |         if (vbios_str != NULL && *vbios_str == 0)
      |                                  ~~~~~~~~~~
      |                                  |
      |                                  (2) out of array bounds here
In function 'amdgpu_atom_parse':
cc1: note: source object is likely at address zero

As there isn't a sane way to convince it otherwise, hide vbios_str from
GCC's optimizer to avoid the warning so we can get closer to enabling
-Warray-bounds globally.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 months agodrm/amd/display: Fix slab-use-after-free in hdcp
Chris Bainbridge [Thu, 17 Apr 2025 21:50:05 +0000 (16:50 -0500)]
drm/amd/display: Fix slab-use-after-free in hdcp

The HDCP code in amdgpu_dm_hdcp.c copies pointers to amdgpu_dm_connector
objects without incrementing the kref reference counts. When using a
USB-C dock, and the dock is unplugged, the corresponding
amdgpu_dm_connector objects are freed, creating dangling pointers in the
HDCP code. When the dock is plugged back, the dangling pointers are
dereferenced, resulting in a slab-use-after-free:

[   66.775837] BUG: KASAN: slab-use-after-free in event_property_validate+0x42f/0x6c0 [amdgpu]
[   66.776171] Read of size 4 at addr ffff888127804120 by task kworker/0:1/10

[   66.776179] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.14.0-rc7-00180-g54505f727a38-dirty #233
[   66.776183] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
[   66.776186] Workqueue: events event_property_validate [amdgpu]
[   66.776494] Call Trace:
[   66.776496]  <TASK>
[   66.776497]  dump_stack_lvl+0x70/0xa0
[   66.776504]  print_report+0x175/0x555
[   66.776507]  ? __virt_addr_valid+0x243/0x450
[   66.776510]  ? kasan_complete_mode_report_info+0x66/0x1c0
[   66.776515]  kasan_report+0xeb/0x1c0
[   66.776518]  ? event_property_validate+0x42f/0x6c0 [amdgpu]
[   66.776819]  ? event_property_validate+0x42f/0x6c0 [amdgpu]
[   66.777121]  __asan_report_load4_noabort+0x14/0x20
[   66.777124]  event_property_validate+0x42f/0x6c0 [amdgpu]
[   66.777342]  ? __lock_acquire+0x6b40/0x6b40
[   66.777347]  ? enable_assr+0x250/0x250 [amdgpu]
[   66.777571]  process_one_work+0x86b/0x1510
[   66.777575]  ? pwq_dec_nr_in_flight+0xcf0/0xcf0
[   66.777578]  ? assign_work+0x16b/0x280
[   66.777580]  ? lock_is_held_type+0xa3/0x130
[   66.777583]  worker_thread+0x5c0/0xfa0
[   66.777587]  ? process_one_work+0x1510/0x1510
[   66.777588]  kthread+0x3a2/0x840
[   66.777591]  ? kthread_is_per_cpu+0xd0/0xd0
[   66.777594]  ? trace_hardirqs_on+0x4f/0x60
[   66.777597]  ? _raw_spin_unlock_irq+0x27/0x60
[   66.777599]  ? calculate_sigpending+0x77/0xa0
[   66.777602]  ? kthread_is_per_cpu+0xd0/0xd0
[   66.777605]  ret_from_fork+0x40/0x90
[   66.777607]  ? kthread_is_per_cpu+0xd0/0xd0
[   66.777609]  ret_from_fork_asm+0x11/0x20
[   66.777614]  </TASK>

[   66.777643] Allocated by task 10:
[   66.777646]  kasan_save_stack+0x39/0x60
[   66.777649]  kasan_save_track+0x14/0x40
[   66.777652]  kasan_save_alloc_info+0x37/0x50
[   66.777655]  __kasan_kmalloc+0xbb/0xc0
[   66.777658]  __kmalloc_cache_noprof+0x1c8/0x4b0
[   66.777661]  dm_dp_add_mst_connector+0xdd/0x5c0 [amdgpu]
[   66.777880]  drm_dp_mst_port_add_connector+0x47e/0x770 [drm_display_helper]
[   66.777892]  drm_dp_send_link_address+0x1554/0x2bf0 [drm_display_helper]
[   66.777901]  drm_dp_check_and_send_link_address+0x187/0x1f0 [drm_display_helper]
[   66.777909]  drm_dp_mst_link_probe_work+0x2b8/0x410 [drm_display_helper]
[   66.777917]  process_one_work+0x86b/0x1510
[   66.777919]  worker_thread+0x5c0/0xfa0
[   66.777922]  kthread+0x3a2/0x840
[   66.777925]  ret_from_fork+0x40/0x90
[   66.777927]  ret_from_fork_asm+0x11/0x20

[   66.777932] Freed by task 1713:
[   66.777935]  kasan_save_stack+0x39/0x60
[   66.777938]  kasan_save_track+0x14/0x40
[   66.777940]  kasan_save_free_info+0x3b/0x60
[   66.777944]  __kasan_slab_free+0x52/0x70
[   66.777946]  kfree+0x13f/0x4b0
[   66.777949]  dm_dp_mst_connector_destroy+0xfa/0x150 [amdgpu]
[   66.778179]  drm_connector_free+0x7d/0xb0
[   66.778184]  drm_mode_object_put.part.0+0xee/0x160
[   66.778188]  drm_mode_object_put+0x37/0x50
[   66.778191]  drm_atomic_state_default_clear+0x220/0xd60
[   66.778194]  __drm_atomic_state_free+0x16e/0x2a0
[   66.778197]  drm_mode_atomic_ioctl+0x15ed/0x2ba0
[   66.778200]  drm_ioctl_kernel+0x17a/0x310
[   66.778203]  drm_ioctl+0x584/0xd10
[   66.778206]  amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu]
[   66.778375]  __x64_sys_ioctl+0x139/0x1a0
[   66.778378]  x64_sys_call+0xee7/0xfb0
[   66.778381]  do_syscall_64+0x87/0x140
[   66.778385]  entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fix this by properly incrementing and decrementing the reference counts
when making and deleting copies of the amdgpu_dm_connector pointers.

(Mario: rebase on current code and update fixes tag)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Fixes: da3fd7ac0bcf3 ("drm/amd/display: Update CP property based on HW query")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250417215005.37964-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 months agodrm/amdgpu: Disallow partition query during reset
Lijo Lazar [Wed, 16 Apr 2025 06:53:44 +0000 (12:23 +0530)]
drm/amdgpu: Disallow partition query during reset

Reject queries to get current partition modes during reset. Also, don't
accept sysfs interface requests to switch compute partition mode while
in reset.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 months agodrm/amd/display: Fix NULL pointer dereferences in dm_update_crtc_state() v2
Srinivasan Shanmugam [Mon, 21 Apr 2025 03:08:33 +0000 (08:38 +0530)]
drm/amd/display: Fix NULL pointer dereferences in dm_update_crtc_state() v2

Added checks for NULL values after retrieving drm_new_conn_state
to prevent dereferencing NULL pointers.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:10751 dm_update_crtc_state()
warn: 'drm_new_conn_state' can also be NULL

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
    10672 static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
    10673                          struct drm_atomic_state *state,
    10674                          struct drm_crtc *crtc,
    10675                          struct drm_crtc_state *old_crtc_state,
    10676                          struct drm_crtc_state *new_crtc_state,
    10677                          bool enable,
    10678                          bool *lock_and_validation_needed)
    10679 {
    10680         struct dm_atomic_state *dm_state = NULL;
    10681         struct dm_crtc_state *dm_old_crtc_state, *dm_new_crtc_state;
    10682         struct dc_stream_state *new_stream;
    10683         int ret = 0;
    10684
    ...
    10703
    10704         /* TODO This hack should go away */
    10705         if (connector && enable) {
    10706                 /* Make sure fake sink is created in plug-in scenario */
    10707                 drm_new_conn_state = drm_atomic_get_new_connector_state(state,
    10708                                                                         connector);

drm_atomic_get_new_connector_state() can't return error pointers, only NULL.

    10709                 drm_old_conn_state = drm_atomic_get_old_connector_state(state,
    10710                                                                         connector);
    10711
    10712                 if (IS_ERR(drm_new_conn_state)) {
                                     ^^^^^^^^^^^^^^^^^^

    10713                         ret = PTR_ERR_OR_ZERO(drm_new_conn_state);

Calling PTR_ERR_OR_ZERO() doesn't make sense.  It can't be success.

    10714                         goto fail;
    10715                 }
    10716
    ...
    10748
    10749                 dm_new_crtc_state->abm_level = dm_new_conn_state->abm_level;
    10750
--> 10751                 ret = fill_hdr_info_packet(drm_new_conn_state,
                                                     ^^^^^^^^^^^^^^^^^^ Unchecked dereference

    10752                                            &new_stream->hdr_static_metadata);
    10753                 if (ret)
    10754                         goto fail;
    10755

v2: Modified the NULL pointer check for drm_new_conn_state in the
    dm_update_crtc_state function to  include a warning via WARN_ON and
    return -EINVAL to indicate an invalid state when the pointer is NULL.

Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: update fence ptr with context:seqno
Sunil Khatri [Mon, 21 Apr 2025 12:15:49 +0000 (17:45 +0530)]
drm/amdgpu: update fence ptr with context:seqno

log context:seqno of the fence during timeout rather
than logging fence pointer.

Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: Add fw minimum version check for usermode queue
Arvind Yadav [Wed, 9 Apr 2025 10:15:34 +0000 (15:45 +0530)]
drm/amdgpu/gfx12: Add fw minimum version check for usermode queue

This patch is load usermode queue based on FW support for gfx12.
CP Ucode FW Vesion: [PFP = 2840, ME = 2780, MEC = 3050, MES = 123]

v2: Addressed review comments from Alex
   - Just check the firmware versions directly.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: Add fw minimum version check for usermode queue
Arvind Yadav [Wed, 9 Apr 2025 10:11:59 +0000 (15:41 +0530)]
drm/amdgpu/gfx11: Add fw minimum version check for usermode queue

This patch is load usermode queue based on FW support for gfx11.
CP Ucode FW version: [PFP = 2530, ME = 2390, MEC = 2600, MES = 120]

v2: Addressed review comments from Alex.
    - Just check the firmware versions directly.
v3: Firmware version checks only for Navi3x(by Alex).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Add NULL pointer checks in dm_force_atomic_commit()
Srinivasan Shanmugam [Thu, 17 Apr 2025 19:27:19 +0000 (00:57 +0530)]
drm/amd/display: Add NULL pointer checks in dm_force_atomic_commit()

This commit updates the dm_force_atomic_commit function to replace the
usage of PTR_ERR_OR_ZERO with IS_ERR for checking error states after
retrieving the Connector (drm_atomic_get_connector_state), CRTC
(drm_atomic_get_crtc_state), and Plane (drm_atomic_get_plane_state)
states.

The function utilized PTR_ERR_OR_ZERO for error checking. However, this
approach is inappropriate in this context because the respective
functions do not return NULL; they return pointers that encode errors.

This change ensures that error pointers are properly checked using
IS_ERR before attempting to dereference.

Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: use consistent function naming
Alex Deucher [Wed, 16 Apr 2025 21:49:45 +0000 (17:49 -0400)]
drm/amdgpu/userq: use consistent function naming

s/userqueue/userq/

1. remove the mix of amdgpu_userqueue and amdgpu_userq
2. to be consistent with other amdgpu_userq_fence.c
3. it's shorter

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: rename eviction helpers
Alex Deucher [Wed, 16 Apr 2025 21:27:41 +0000 (17:27 -0400)]
drm/amdgpu/userq: rename eviction helpers

suspend/resume -> evict/restore

Rename to avoid confusion with the system suspend
and resume helpers.

v2: update error messages

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: move waiting for last fence before umap
Alex Deucher [Wed, 16 Apr 2025 18:40:25 +0000 (14:40 -0400)]
drm/amdgpu/userq: move waiting for last fence before umap

Need to wait for the last fence before unmapping.  This
also fixes a memory leak in amdgpu_userqueue_cleanup()
when the fence isn't signalled.

Fixes: b0db33c8c50f ("drm/amdgpu/userq: rework front end call sequence")
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: unmap queues amdgpu_userq_mgr_fini()
Alex Deucher [Wed, 16 Apr 2025 18:35:05 +0000 (14:35 -0400)]
drm/amdgpu/userq: unmap queues amdgpu_userq_mgr_fini()

This was missed when the map and unmap were split out
of the mqd create and destroy functions.

Fixes: b0db33c8c50f ("drm/amdgpu/userq: rework front end call sequence")
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: switch from queue_active to queue state
Alex Deucher [Sat, 12 Apr 2025 16:59:38 +0000 (12:59 -0400)]
drm/amdgpu: switch from queue_active to queue state

Track the state of the queue rather than simple active vs
not.  This is needed for other states (hung, preempted, etc.).
While we are at it, move the state tracking into the user
queue front end code.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: optimize enforce isolation and s/r
Alex Deucher [Wed, 16 Apr 2025 20:59:08 +0000 (16:59 -0400)]
drm/amdgpu/userq: optimize enforce isolation and s/r

If user queues are disabled for all IPs in the case
of suspend and resume and for gfx/compute in the case
of enforce isolation, we can return early.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Remove unused *vbios_smu_set_dprefclk
Dr. David Alan Gilbert [Fri, 18 Apr 2025 00:21:17 +0000 (01:21 +0100)]
drm/amd/display: Remove unused *vbios_smu_set_dprefclk

rn_vbios_smu_set_dprefclk() was added in 2019 by
commit 4edb6fc91878 ("drm/amd/display: Add Renoir clock manager")
rv1_vbios_smu_set_dprefclk() was also added in 2019 by
commit dc88b4a684d2 ("drm/amd/display: make clk mgr soc specific")

neither have been used.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/radeon: Remove unused radeon_fence_wait_any
Dr. David Alan Gilbert [Fri, 18 Apr 2025 00:21:16 +0000 (01:21 +0100)]
drm/radeon: Remove unused radeon_fence_wait_any

radeon_fence_wait_any() last use was removed in 2023's
commit 254986e324ad ("drm/radeon: Use the drm suballocation manager
implementation.")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/radeon/radeon_audio: Remove unused r600_hdmi_audio_workaround
Dr. David Alan Gilbert [Fri, 18 Apr 2025 00:21:14 +0000 (01:21 +0100)]
drm/radeon/radeon_audio: Remove unused r600_hdmi_audio_workaround

The last use of r600_hdmi_audio_workaround() was removed by 2014's
commit 6e72376dcc66 ("radeon/audio: consolidate audio_mode_set()
functions")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Print kernel message when error logged by scrub
Xiang Liu [Fri, 18 Apr 2025 07:13:44 +0000 (15:13 +0800)]
drm/amdgpu: Print kernel message when error logged by scrub

Print a kernel message when the scrub bit of status register is set to
indicate that errors are being logged by the scrub.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: do not copy invalid CRTC timing info
Gergo Koteles [Wed, 2 Apr 2025 17:03:31 +0000 (19:03 +0200)]
drm/amd/display: do not copy invalid CRTC timing info

Since b255ce4388e0, it is possible that the CRTC timing
information for the preferred mode has not yet been
calculated while amdgpu_dm_connector_mode_valid() is running.

In this case use the CRTC timing information of the actual mode.

Fixes: b255ce4388e0 ("drm/amdgpu: don't change mode in amdgpu_dm_connector_mode_valid()")
Closes: https://lore.kernel.org/all/ed09edb167e74167a694f4854102a3de6d2f1433.camel@irl.hu/
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4085
Signed-off-by: Gergo Koteles <soyer@irl.hu>
Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Correct prefetch calculation
TungYu Lu [Fri, 11 Apr 2025 02:41:48 +0000 (10:41 +0800)]
drm/amd/display: Correct prefetch calculation

[Why]
The minimum value of the dst_y_prefetch_equ was not correct
in prefetch calculation whice causes OPTC underflow.

[How]
Add the min operation of dst_y_prefetch_equ in prefetch calculation
for legacy DML.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: TungYu Lu <tungyu.lu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Refactor SubVP cursor limiting logic
Dillon Varone [Tue, 1 Apr 2025 16:06:35 +0000 (12:06 -0400)]
drm/amd/display: Refactor SubVP cursor limiting logic

[WHY]
There are several gaps that can result in SubVP being enabled with
incompatible HW cursor sizes, and unjust restrictions to cursor size due
to wrong predictions on future usage of SubVP

[HOW]
- remove "prediction" logic in favor of tagging based on previous SubVP
  usage
- block SubVP if current HW cursor settings are incompatible
- provide interface for DM to determine if HW cursor should be disabled
  due to an attempt to enable SubVP

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add a helper to check which IPs are enabled
Alex Deucher [Wed, 16 Apr 2025 20:43:52 +0000 (16:43 -0400)]
drm/amdgpu/userq: add a helper to check which IPs are enabled

Add a helper to get a mask of IPs which support user queues.
Use this in the INFO IOCTL to get the IP mask to replace
the current code.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add queue id support to the user queue wait IOCTL
Arunpravin Paneer Selvam [Fri, 11 Apr 2025 09:38:30 +0000 (15:08 +0530)]
drm/amdgpu: Add queue id support to the user queue wait IOCTL

Add queue id support to the user queue wait IOCTL
drm_amdgpu_userq_wait structure.

This is required to retrieve the wait user queue and maintain
the fence driver references in it so that the user queue in
the same context releases their reference to the fence drivers
at some point before queue destruction.

Otherwise, we would gather those references until we
don't have any more space left and crash.

v2: Modify the UAPI comment as per the mesa and libdrm UAPI comment.

Libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/408
Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34493

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: enable support for secure queues
Alex Deucher [Thu, 27 Feb 2025 02:33:01 +0000 (21:33 -0500)]
drm/amdgpu/userq: enable support for secure queues

Enable users to create secure GFX/compute queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Tested-by: Jesse.Zhang <Jesse.zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq/mes: pass the secure flag to mqd init
Alex Deucher [Thu, 27 Feb 2025 02:25:22 +0000 (21:25 -0500)]
drm/amdgpu/userq/mes: pass the secure flag to mqd init

So that we initialize the MQD as a secure queue.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Fix pixel rate divider policy for 1 pixel per cycle config
Meenakshikumar Somasundaram [Tue, 8 Apr 2025 15:37:58 +0000 (11:37 -0400)]
drm/amd/display: Fix pixel rate divider policy for 1 pixel per cycle config

[Why]
Pixel rate dividor was not programmed correctly for 1 pixel per cycle
configuration for empty tu case.

[How]
Included check for empty tu when pixel rate dividor values were selected.

Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF
Leo Li [Tue, 18 Mar 2025 22:05:05 +0000 (18:05 -0400)]
drm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF

[Why]

Recent findings show negligible power savings between IPS2 and RCG
during static desktop. In fact, DCN related clocks are higher
when IPS2 is enabled vs RCG.

RCG_IN_ACTIVE is also the default policy for another OS supported by
DC, and it has faster entry/exit.

[How]

Remove previous logic that checked for IPS2 support, and just default
to `DMUB_IPS_RCG_IN_ACTIVE_IPS2_IN_OFF`.

Fixes: 199888aa25b3 ("drm/amd/display: Update IPS default mode for DCN35/DCN351")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Revert "not disable dtb as dto src at dpms off"
Charlene Liu [Tue, 8 Apr 2025 20:35:38 +0000 (16:35 -0400)]
drm/amd/display: Revert "not disable dtb as dto src at dpms off"

[why]
not all the asic using the same code path.
need to revisit and limit the impact.

This reverts commit 32be4e39f459f3ac9c191569ae8e3731cb82f7ab.

Reviewed-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Use 16ms AUX read interval for LTTPR with old sinks
George Shen [Mon, 7 Apr 2025 16:35:57 +0000 (12:35 -0400)]
drm/amd/display: Use 16ms AUX read interval for LTTPR with old sinks

[Why/How]
LTTPR are required to program DPCD 0000Eh to 0x4 (16ms) upon AUX read
reply to this register. Since old Sinks witih DPCD rev 1.1 and earlier
may not support this register, assume the mandatory value is programmed
by the LTTPR to avoid AUX timeout issues.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Fix ACPI edid parsing on some Lenovo systems
Mario Limonciello [Fri, 4 Apr 2025 14:34:52 +0000 (09:34 -0500)]
drm/amd/display: Fix ACPI edid parsing on some Lenovo systems

[Why]
The ACPI EDID in the BIOS of a Lenovo laptop includes 3 blocks, but
dm_helpers_probe_acpi_edid() has a start that is 'char'.  The 3rd
block index starts after 255, so it can't be indexed properly.
This leads to problems with the display when the EDID is parsed.

[How]
Change the variable type to 'short' so that larger values can be indexed.

Cc: Renjith Pananchikkal <renjith.pananchikkal@amd.com>
Reported-by: Mark Pearson <mpearson@lenovo.com>
Suggested-by: David Ober <dober@lenovo.com>
Fixes: c6a837088bed ("drm/amd/display: Fetch the EDID from _DDC if available for eDP")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Promote DC to 3.2.329
Taimur Hassan [Mon, 7 Apr 2025 04:24:52 +0000 (23:24 -0500)]
drm/amd/display: Promote DC to 3.2.329

Summary:

* Implement HDMI Read request
* RMCM and MCM 3DLUT support
* Enable urgent latency adjustment on DCN35
* Enable phy-ssc reduction by default

Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Allow P2P access through XGMI
Felix Kuehling [Wed, 16 Apr 2025 04:19:13 +0000 (00:19 -0400)]
drm/amdgpu: Allow P2P access through XGMI

If peer memory is accessible through XGMI, allow leaving it in VRAM
rather than forcing its migration to GTT on DMABuf attachment.

Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: enable phy-ssc reduction by default
Roman Li [Thu, 3 Apr 2025 17:49:03 +0000 (13:49 -0400)]
drm/amd/display: enable phy-ssc reduction by default

[Why]
Reduction of phy-ssc is needed to support DP2 high pixel clock on dcn35x/36.
There's a special flag to enable it in dmub hw params.

[How]
Set hbr3_phy_ssc to true for dcn35, dcn351 and dcn36.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Enable urgent latency adjustment on DCN35
Nicholas Susanto [Wed, 2 Apr 2025 19:04:08 +0000 (15:04 -0400)]
drm/amd/display: Enable urgent latency adjustment on DCN35

[Why]

Urgent latency adjustment was disabled on DCN35 due to issues with P0
enablement on some platforms. Without urgent latency, underflows occur
when doing certain high timing configurations. After testing, we found
that reenabling urgent latency didn't reintroduce p0 support on multiple
platforms.

[How]

renable urgent latency on DCN35 and setting it to 3000 Mhz.

This reverts commit 3412860cc4c0c484f53f91b371483e6e4440c3e5.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Nicholas Susanto <nsusanto@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: DCN42 RMCM and MCM 3DLUT support
Yihan Zhu [Tue, 1 Apr 2025 14:28:36 +0000 (10:28 -0400)]
drm/amd/display: DCN42 RMCM and MCM 3DLUT support

[WHY & HOW]
Providing hardware programming for the RMCM and MCM IPs for 3DLUT in DCN42.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: DCN32 null data check
Yihan Zhu [Mon, 31 Mar 2025 19:59:51 +0000 (15:59 -0400)]
drm/amd/display: DCN32 null data check

[WHY & HOW]
Avoid null curve data structure used in the cm block for the potential issue.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Force full update in gpu reset
Roman Li [Wed, 26 Mar 2025 14:33:51 +0000 (10:33 -0400)]
drm/amd/display: Force full update in gpu reset

[Why]
While system undergoing gpu reset always do full update
to sync the dc state before and after reset.

[How]
Return true in should_reset_plane() if gpu reset detected

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Fix gpu reset in multidisplay config
Roman Li [Tue, 1 Apr 2025 21:05:10 +0000 (17:05 -0400)]
drm/amd/display: Fix gpu reset in multidisplay config

[Why]
The indexing of stream_status in dm_gpureset_commit_state() is incorrect.
That leads to asserts in multi-display configuration after gpu reset.

[How]
Adjust the indexing logic to align stream_status with surface_updates.

Fixes: cdaae8371aa9 ("drm/amd/display: Handle GPU reset for DC block")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3808
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Move Mode Support Prefetch Checks To Its Own Function
Austin Zheng [Tue, 1 Apr 2025 18:38:31 +0000 (14:38 -0400)]
drm/amd/display: Move Mode Support Prefetch Checks To Its Own Function

[Why]
Large stack size observed in DCN4 mode support when compiling with clang.
Additional instrumentation added by compiler adds to stack size.
dml_core_mode_support ends up going over the stack size limit
due to the size of the function.

[How]
Move checks and calculations for prefetch to its own function.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY
Felix Kuehling [Wed, 16 Apr 2025 03:58:28 +0000 (23:58 -0400)]
drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY

Pinning of VRAM is for peer devices that don't support dynamic attachment
and move notifiers. But it requires that all such peer devices are able to
access VRAM via PCIe P2P. Any device without P2P access requires migration
to GTT, which fails if the memory is already pinned for another peer
device.

Sharing between GPUs should not require pinning in VRAM. However, if
DMABUF_MOVE_NOTIFY is disabled in the kernel build, even DMABufs shared
between GPUs must be pinned, which can lead to failures and functional
regressions on systems where some peer GPUs are not P2P accessible.

Disable VRAM pinning if move notifiers are disabled in the kernel build
to fix regressions when sharing BOs between GPUs.

Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Move desync error counter operation up.
Jack Chang [Thu, 19 Dec 2024 08:10:18 +0000 (16:10 +0800)]
drm/amd/display: Move desync error counter operation up.

[Why & How]
Move desync error counter operation up to prevent
it from being skipped by force disable desync
error.

Reviewed-by: Robin Chen <robin.chen@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Jack Chang <jack.chang@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Avoid divide by zero by initializing dummy pitch to 1
Mario Limonciello [Tue, 21 Jan 2025 22:03:52 +0000 (16:03 -0600)]
drm/amd/display: Avoid divide by zero by initializing dummy pitch to 1

[Why]
If the dummy values in `populate_dummy_dml_surface_cfg()` aren't updated
then they can lead to a divide by zero in downstream callers like
CalculateVMAndRowBytes()

[How]
Initialize dummy value to a value to avoid divide by zero.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Implement HDMI Read Request
Chris Park [Mon, 31 Mar 2025 20:04:55 +0000 (16:04 -0400)]
drm/amd/display: Implement HDMI Read Request

[Why]
Read Request provides alterative method to polling to
the HDMI sinks that support it.

[How]
Implement Read Request where interrupt can be generated
by the sink.

Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: To apply the adjusted DP ref clock for DP devices
Yiling Chen [Tue, 4 Mar 2025 08:52:16 +0000 (16:52 +0800)]
drm/amd/display: To apply the adjusted DP ref clock for DP devices

[Why]
For some pixel clock margin sensitive external monitor,
we could not keep original DP ref clock for the ASICs
supported SSC DP ref clock.

[How]
From slicon design team's comment,
we have to apply the adjusted DP ref clock for
DP devices.
DP 128b (DP2) signals uses the DTBCLK not DP ref.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: add support for TMZ queues to mqd_init
Alex Deucher [Thu, 27 Feb 2025 02:13:02 +0000 (21:13 -0500)]
drm/amdgpu/gfx12: add support for TMZ queues to mqd_init

Set up TMZ for queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: add support for TMZ queues to mqd_init
Alex Deucher [Thu, 27 Feb 2025 02:08:56 +0000 (21:08 -0500)]
drm/amdgpu/gfx11: add support for TMZ queues to mqd_init

Set up TMZ for queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Use allowed_domains for pinning dmabufs
Felix Kuehling [Thu, 17 Apr 2025 14:23:15 +0000 (10:23 -0400)]
drm/amdgpu: Use allowed_domains for pinning dmabufs

When determining the domains for pinning DMABufs, filter allowed_domains
and fail with a warning if VRAM is forbidden and GTT is not an allowed
domain.

Fixes: f5e7fabd1f5c ("drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P")
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add tmz queue parameter to mqd props
Alex Deucher [Thu, 27 Feb 2025 01:58:42 +0000 (20:58 -0500)]
drm/amdgpu: add tmz queue parameter to mqd props

Use this to track the whether we want TMZ for queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add UAPI for setting up secure queues
Alex Deucher [Wed, 26 Feb 2025 22:12:58 +0000 (17:12 -0500)]
drm/amdgpu/userq: add UAPI for setting up secure queues

If the queues needs to access TMZ surfaces, it must
be set up as secure.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Refine Cleaner Shader MEC firmware version for GFX10.1.x GPUs
Srinivasan Shanmugam [Thu, 17 Apr 2025 15:35:35 +0000 (21:05 +0530)]
drm/amdgpu: Refine Cleaner Shader MEC firmware version for GFX10.1.x GPUs

Update the minimum firmware version for the Cleaner Shader in the
gfx_v10_0_sw_init function.

This change adjusts the minimum required firmware version for the MEC
firmware from 152 to 151, allowing for broader compatibility with
GFX10.1 GPUs.

Fixes: 25961bad9212 ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu:remove old sdma reset callback mechanism
Jesse.zhang@amd.com [Fri, 11 Apr 2025 07:30:23 +0000 (15:30 +0800)]
drm/amdgpu:remove old sdma reset callback mechanism

This patch removes the deprecated SDMA reset callback mechanism, which was previously used to register pre-reset and post-reset callbacks for SDMA engine resets.
 The callback mechanism has been replaced with a more direct and efficient approach using `stop_queue` and `start_queue` functions in the ring's function table.

The SDMA reset callback mechanism allowed KFD and AMDGPU to register pre-reset and post-reset functions for handling SDMA engine resets.
However, this approach added unnecessary complexity and was no longer needed after the introduction of the `stop_queue` and `start_queue` functions in the ring's function table.

1. **Remove Callback Mechanism**:
   - Removed the `amdgpu_sdma_register_on_reset_callbacks` function and its associated data structures (`sdma_on_reset_funcs`).
   - Removed the callback registration logic from the SDMA v4.4.2 initialization code.

2. **Clean Up Related Code**:
   - Removed the `sdma_v4_4_2_set_engine_reset_funcs` function, which was used to register the callbacks.
   - Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained the pre-reset and post-reset callback functions.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/radeon: fix the warning for radeon_cs_parser_fini
Sunil Khatri [Thu, 17 Apr 2025 09:30:44 +0000 (15:00 +0530)]
drm/radeon: fix the warning for radeon_cs_parser_fini

Fix the below warning message.
radeon/radeon_cs.c:418: warning: Excess function parameter 'backoff' description in 'radeon_cs_parser_fini'

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add context and seqno of the fence
Sunil Khatri [Thu, 17 Apr 2025 10:27:26 +0000 (15:57 +0530)]
drm/amdgpu/userq: add context and seqno of the fence

Add context and seqno of the fence in error logging
rather than printing fence ptr.

Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: optimize queue reset and stop logic for sdma_v5_2
Jesse.zhang@amd.com [Wed, 16 Apr 2025 06:13:30 +0000 (14:13 +0800)]
drm/amdgpu: optimize queue reset and stop logic for sdma_v5_2

This patch refactors the SDMA v5.2 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:

1. **Generalized `sdma_v5_2_gfx_stop` Function**:
   - Added an `inst_mask` parameter to allow stopping specific SDMA instances
     instead of all instances. This is useful for resetting individual queues.

2. **Simplified `sdma_v5_2_reset_queue` Function**:
   - Removed redundant loops and checks by directly using the `ring->me` field
     to identify the SDMA instance.
   - Reused the `sdma_v5_2_gfx_stop` function to stop the queue, reducing code
     duplication.

v1: The general coding style is to declare variables like "i" or "r" last. E.g. longest lines first and short lasts. (Chritian)

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: optimize queue reset and stop logic for sdma_v5_0
Jesse.zhang@amd.com [Wed, 16 Apr 2025 06:03:02 +0000 (14:03 +0800)]
drm/amdgpu: optimize queue reset and stop logic for sdma_v5_0

This patch refactors the SDMA v5.0 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:

1. **Generalized `sdma_v5_0_gfx_stop` Function**:
   - Added an `inst_mask` parameter to allow stopping specific SDMA instances
     instead of all instances. This is useful for resetting individual queues.

2. **Simplified `sdma_v5_0_reset_queue` Function**:
   - Removed redundant loops and checks by directly using the `ring->me` field
     to identify the SDMA instance.
   - Reused the `sdma_v5_0_gfx_stop` function to stop the queue, reducing code
     duplication.

v1: The general coding style is to declare variables like "i" or "r" last. E.g. longest lines first and short lasts. (Chritian)

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Register the new sdma function pointers for sdma_v5_2
Jesse.zhang@amd.com [Mon, 14 Apr 2025 08:06:51 +0000 (16:06 +0800)]
drm/amdgpu: Register the new sdma function pointers for sdma_v5_2

Register stop/start/soft_reset queue functions for SDMA IP versions v5.2.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/: drm/amdgpu: Register the new sdma function pointers for sdma_v5_0
Jesse.zhang@amd.com [Mon, 14 Apr 2025 08:05:33 +0000 (16:05 +0800)]
drm/amdgpu/: drm/amdgpu: Register the new sdma function pointers for sdma_v5_0

Register stop/start/soft_reset queue functions for SDMA IP versions v5.0.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Implement SDMA soft reset directly for v5.x
Jesse.zhang@amd.com [Fri, 11 Apr 2025 07:26:18 +0000 (15:26 +0800)]
drm/amdgpu: Implement SDMA soft reset directly for v5.x

This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly,
rather than relying on the DPM interface.

1. **New `amdgpu_sdma_soft_reset` Function**:
   - Implements a soft reset for SDMA engines by directly writing to the hardware registers.
   - Handles SDMA versions 4.x and 5.x separately:
     - For SDMA 4.x, the existing `amdgpu_dpm_reset_sdma` function is used for backward compatibility.
     - For SDMA 5.x, the driver directly manipulates the `GRBM_SOFT_RESET` register to reset the specified SDMA instance.

2. **Integration into `amdgpu_sdma_reset_engine`**:
   - The `amdgpu_sdma_soft_reset` function is called during the SDMA reset process, replacing the previous call to `amdgpu_dpm_reset_sdma`.

v2: r should default to an error (Alex)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: switch amdgpu_sdma_reset_engine to use the new sdma function pointers
Jesse.zhang@amd.com [Fri, 11 Apr 2025 06:48:52 +0000 (14:48 +0800)]
drm/amdgpu: switch amdgpu_sdma_reset_engine to use the new sdma function pointers

Replace old callback mechanism with direct calls to stop/start functions.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: enable support for queue priorities
Alex Deucher [Thu, 27 Feb 2025 03:56:26 +0000 (22:56 -0500)]
drm/amdgpu/userq: enable support for queue priorities

Enable users to create queues at different priority levels.
The highest level is restricted to drm master.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq/mes: handle user queue priority
Alex Deucher [Thu, 27 Feb 2025 03:47:11 +0000 (22:47 -0500)]
drm/amdgpu/userq/mes: handle user queue priority

Handle the queue priority set by the user.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add priorty to user queue structure
Alex Deucher [Thu, 27 Feb 2025 03:38:08 +0000 (22:38 -0500)]
drm/amdgpu/userq: add priorty to user queue structure

So we can track this when we create user queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes12: add conversion for priority levels
Alex Deucher [Thu, 27 Feb 2025 03:29:15 +0000 (22:29 -0500)]
drm/amdgpu/mes12: add conversion for priority levels

Convert driver priority levels to MES11 priority levels.
At the moment they are the same, but they may not always
be.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/mes11: add conversion for priority levels
Alex Deucher [Thu, 27 Feb 2025 03:27:47 +0000 (22:27 -0500)]
drm/amdgpu/mes11: add conversion for priority levels

Convert driver priority levels to MES11 priority levels.
At the moment they are the same, but they may not always
be.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add UAPI for setting queue priority
Alex Deucher [Wed, 26 Feb 2025 21:55:36 +0000 (16:55 -0500)]
drm/amdgpu/userq: add UAPI for setting queue priority

Allow the user to set a queue priority levels:
0 - normal low - most apps (maps to MES AMD_PRIORITY_LEVEL_NORMAL)
1 - low - background jobs (maps to MES AMD_PRIORITY_LEVEL_LOW)
2 - normal high - apps that need relative high (maps to MES AMD_PRIORITY_LEVEL_MEDIUM)
3 - high (admin only - for compositors) (maps to MES AMD_PRIORITY_LEVEL_HIGH)

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: convert userq UAPI _pad to flags
Alex Deucher [Wed, 26 Feb 2025 21:54:27 +0000 (16:54 -0500)]
drm/amdgpu: convert userq UAPI _pad to flags

Reuse the _pad field for flags.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/display: Add error check for avi and vendor infoframe setup function
Wentao Liang [Mon, 14 Apr 2025 03:14:39 +0000 (11:14 +0800)]
drm/amd/display: Add error check for avi and vendor infoframe setup function

The function fill_stream_properties_from_drm_display_mode() calls the
function drm_hdmi_avi_infoframe_from_display_mode() and the
function drm_hdmi_vendor_infoframe_from_display_mode(), but does
not check its return value. Log the error messages to prevent silent
failure if either function fails.

Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: integrate with enforce isolation
Alex Deucher [Mon, 14 Apr 2025 17:11:10 +0000 (13:11 -0400)]
drm/amdgpu/userq: integrate with enforce isolation

Enforce isolation serializes access to the GFX IP.  User
queues are isolated in the MES scheduler, but we still
need to serialize between kernel queues and user queues.
For enforce isolation, group KGD user queues with KFD user
queues.

v2: split out variable renaming, add config guards
v3: use new function names

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: rename enforce isolation variables
Alex Deucher [Fri, 21 Feb 2025 20:20:45 +0000 (15:20 -0500)]
drm/amdgpu: rename enforce isolation variables

Since they will be used for both KFD and KGD user queues,
rename them from kfd to userq.  No intended functional
change.

Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add helpers to start/stop scheduling
Alex Deucher [Thu, 10 Apr 2025 17:26:43 +0000 (13:26 -0400)]
drm/amdgpu/userq: add helpers to start/stop scheduling

This will be used to stop/start user queue scheduling for
example when switching between kernel and user queues when
enforce isolation is enabled.

v2: use idx
v3: only stop compute/gfx queues

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: track the xcp_id associated with the queue
Alex Deucher [Fri, 11 Apr 2025 18:16:41 +0000 (14:16 -0400)]
drm/amdgpu/userq: track the xcp_id associated with the queue

Track this to align with KFD for enforce isolation
handling.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Clear overflow for SRIOV
Emily Deng [Tue, 8 Apr 2025 12:25:43 +0000 (20:25 +0800)]
drm/amdgpu: Clear overflow for SRIOV

For VF, it doesn't have the permission to clear overflow, clear the bit
by reset.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: rework driver parameter
Alex Deucher [Mon, 14 Apr 2025 18:18:03 +0000 (14:18 -0400)]
drm/amdgpu/userq: rework driver parameter

Replace disable_kq parameter with user_queue parameter.
The parameter has the following logic:
 -1 = auto (ASIC specific default)
  0 = user queues disabled
  1 = user queues enabled and kernel queues enabled (if supported)
  2 = user queues enabled and kernel queues disabled

The default behavior (-1) is currently the same as 0 for current
ASICs.  To enable user queues (in addition to kernel queues) set
user_queue=1. To enable user queues and disable kernel queues
(to make all resources available to user queues), set user_queue=2.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/pm: Enable host limit metrics support
Asad Kamal [Sat, 12 Apr 2025 03:14:32 +0000 (11:14 +0800)]
drm/amd/pm: Enable host limit metrics support

Enable host limit metrics support for smuv_13_0_12

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma7: properly reference trap interrupts for userqs
Alex Deucher [Sun, 13 Apr 2025 14:32:23 +0000 (10:32 -0400)]
drm/amdgpu/sdma7: properly reference trap interrupts for userqs

We need to take a reference to the interrupts to make
sure they stay enabled even if the kernel queues have
disabled them.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma6: properly reference trap interrupts for userqs
Alex Deucher [Sun, 13 Apr 2025 14:31:08 +0000 (10:31 -0400)]
drm/amdgpu/sdma6: properly reference trap interrupts for userqs

We need to take a reference to the interrupts to make
sure they stay enabled even if the kernel queues have
disabled them.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amd/pm: Enable host limit metrics support
Asad Kamal [Sat, 12 Apr 2025 03:08:01 +0000 (11:08 +0800)]
drm/amd/pm: Enable host limit metrics support

Enable host limit metrics support for smuv_13_0_6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Enable doorbell for JPEG5_0_1
Sathishkumar S [Thu, 10 Apr 2025 06:28:16 +0000 (11:58 +0530)]
drm/amdgpu: Enable doorbell for JPEG5_0_1

Enable doorbell for JPEG5_0_1 and adjust index for VCN5_0_1.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Update vcn doorbell range in NBIO 7.9
Shiwu Zhang [Thu, 10 Apr 2025 06:26:47 +0000 (11:56 +0530)]
drm/amdgpu: Update vcn doorbell range in NBIO 7.9

Increase vcn doorbell range for gfx950 to 11.

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx12: properly reference EOP interrupts for userqs
Alex Deucher [Sun, 13 Apr 2025 14:19:24 +0000 (10:19 -0400)]
drm/amdgpu/gfx12: properly reference EOP interrupts for userqs

Regardless of whether we disable kernel queues, we need
to take an extra reference to the pipe interrupts for
user queues to make sure they stay enabled in case we
disable them for kernel queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/gfx11: properly reference EOP interrupts for userqs
Alex Deucher [Sun, 13 Apr 2025 14:16:58 +0000 (10:16 -0400)]
drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

Regardless of whether we disable kernel queues, we need
to take an extra reference to the pipe interrupts for
user queues to make sure they stay enabled in case we
disable them for kernel queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdkfd: fix a bug of smi event for superuser
Eric Huang [Mon, 14 Apr 2025 15:19:01 +0000 (11:19 -0400)]
drm/amdkfd: fix a bug of smi event for superuser

rocm-smi with superuser permission doesn't show some
of smi events, i.e. page fault/migration, because the
condition of "(events & all)" is false. Superuser
should be able to detect all events, the condiiton of
"(events & all)" seems redundant, so removing it will
fix the issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add missing DCE6 to dce_version_to_string()
Alexandre Demers [Sun, 13 Apr 2025 20:51:21 +0000 (16:51 -0400)]
drm/amdgpu: add missing DCE6 to dce_version_to_string()

Missing DCE 6.0 6.1 and 6.4 are identified as UNKNOWN. Fix this.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix typo in bios_parser.c
Alexandre Demers [Tue, 8 Apr 2025 02:11:00 +0000 (22:11 -0400)]
drm/amdgpu: fix typo in bios_parser.c

Probably a cut and paste error from using get_integrated_info_v8's comment.
This has to be get_integrated_info_v9

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix duplicated value setting in dce100_resource_construct()
Alexandre Demers [Tue, 8 Apr 2025 02:10:59 +0000 (22:10 -0400)]
drm/amdgpu: fix duplicated value setting in dce100_resource_construct()

i2c_speed_in_khz was set twice with the same values. Looking at other DCE
versions, we probably wanted to set the value for i2c_speed_in_khz_hdcp.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/radeon: fix typo in atombios.h
Alexandre Demers [Tue, 8 Apr 2025 02:10:58 +0000 (22:10 -0400)]
drm/radeon: fix typo in atombios.h

"aligned" not "aligend"

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: fix typo in atombios.h
Alexandre Demers [Tue, 8 Apr 2025 02:10:57 +0000 (22:10 -0400)]
drm/amdgpu: fix typo in atombios.h

"aligned" not "aligend"

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: add missing parameter name in dce110_clk_src_construct() declaration
Alexandre Demers [Tue, 8 Apr 2025 02:10:56 +0000 (22:10 -0400)]
drm/amdgpu: add missing parameter name in dce110_clk_src_construct() declaration

While not needed per speaking, all the other parameters have names but
this one.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: rename function to follow naming convention in dce110
Alexandre Demers [Tue, 8 Apr 2025 02:10:55 +0000 (22:10 -0400)]
drm/amdgpu: rename function to follow naming convention in dce110

The prefix dce110 is used on all functions, but init_pipes() and
init_hw(). Under DCN, these sames functions are prefixed.

Let's keep thing coherent.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Clean up error handling in amdgpu_userq_fence_driver_alloc()
Dan Carpenter [Sat, 12 Apr 2025 14:39:43 +0000 (17:39 +0300)]
drm/amdgpu: Clean up error handling in amdgpu_userq_fence_driver_alloc()

1) Checkpatch complains if we print an error message for kzalloc()
   failure.  The kzalloc() failure already has it's own error messages
   built in.  Also this allocation is small enough that it is guaranteed
   to succeed.
2) Return directly instead of doing a goto free_fence_drv.  The
   "fence_drv" is already NULL so no cleanup is necessary.

Reviewed-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Fix double free in amdgpu_userq_fence_driver_alloc()
Dan Carpenter [Sat, 12 Apr 2025 14:39:32 +0000 (17:39 +0300)]
drm/amdgpu: Fix double free in amdgpu_userq_fence_driver_alloc()

The goto frees "fence_drv" so this is a double free bug.  There is no
need to call amdgpu_seq64_free(adev, fence_drv->va) since the seq64
allocation failed so change the goto to goto free_fence_drv.  Also
propagate the error code from amdgpu_seq64_alloc() instead of hard coding
it to -ENOMEM.

Fixes: e7cf21fbb277 ("drm/amdgpu: Few optimization and fixes for userq fence driver")
Reviewed-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: move runpm handling into core userq code
Alex Deucher [Fri, 11 Apr 2025 19:49:51 +0000 (15:49 -0400)]
drm/amdgpu/userq: move runpm handling into core userq code

Pull it out of the MES code and into the generic code.
It's not MES specific and needs to be applied to all user
queues regardless of the backend.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdkfd: fix NULL check mistake for process smi event
Eric Huang [Mon, 14 Apr 2025 14:45:12 +0000 (10:45 -0400)]
drm/amdkfd: fix NULL check mistake for process smi event

The mistake will lead to NULL kernel oops, so fix it.

Fixes: 4172b556fd5b ("drm/amdkfd: add smi events for process start and end")
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/sdma_v4: Register the new sdma function pointers
Jesse.zhang@amd.com [Mon, 14 Apr 2025 07:25:16 +0000 (15:25 +0800)]
drm/amdgpu/sdma_v4: Register the new sdma function pointers

Register stop/start/soft_reset queue functions for sdma v4_4_2.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: Add the new sdma function pointers for amdgpu_sdma.h
Jesse.zhang@amd.com [Fri, 11 Apr 2025 05:01:19 +0000 (13:01 +0800)]
drm/amdgpu: Add the new sdma function pointers for amdgpu_sdma.h

This patch introduces new function pointers in the amdgpu_sdma structure
to handle queue stop, start and soft reset operations. These will replace
the older callback mechanism.

The new functions are:
- stop_kernel_queue: Stops a specific SDMA queue
- start_kernel_queue: Starts/Restores a specific SDMA queue
- soft_reset_kernel_queue: Performs soft reset on a specific SDMA queue

v2: Update stop_queue/start_queue function paramters to use ring pointer instead of device/instance(Chritian)
v3: move stop_queue/start_queue to struct amdgpu_sdma_instance and rename them. (Alex)
v4: rework the ordering a bit (Alex)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu: don't swallow errors in amdgpu_userqueue_resume_all()
Alex Deucher [Thu, 10 Apr 2025 17:17:08 +0000 (13:17 -0400)]
drm/amdgpu: don't swallow errors in amdgpu_userqueue_resume_all()

since we loop through the queues |= the errors.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: handle system suspend and resume
Alex Deucher [Fri, 21 Feb 2025 17:16:34 +0000 (12:16 -0500)]
drm/amdgpu/userq: handle system suspend and resume

Unmap user queues on suspend and map them on resume.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 months agodrm/amdgpu/userq: add suspend and resume helpers
Alex Deucher [Thu, 20 Feb 2025 21:31:40 +0000 (16:31 -0500)]
drm/amdgpu/userq: add suspend and resume helpers

Add helpers to unmap and map user queues on suspend and
resume.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>