]> www.infradead.org Git - users/hch/uuid.git/log
users/hch/uuid.git
20 months agodrm/amd/display: Increase eval/entry delay for DCN35
Nicholas Kazlauskas [Tue, 23 Jan 2024 17:20:06 +0000 (12:20 -0500)]
drm/amd/display: Increase eval/entry delay for DCN35

[Why]
To match firmware measurements and avoid hanging when accessing HW
that's in idle.

[How]
Increase the delays to what we've measured.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Disable timeout in more places for dc_dmub_srv
Nicholas Kazlauskas [Tue, 23 Jan 2024 17:14:42 +0000 (12:14 -0500)]
drm/amd/display: Disable timeout in more places for dc_dmub_srv

[Why]
We're still missing a few and we'd like to avoid continuining when
a hang occurs for debug purposes.

[How]
Add the loop anywhere we try to wait on rptr == wptr in dc_dmub_srv.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Fix potential out-of-bounds access in 'amdgpu_discovery_reg_base_init()'
Srinivasan Shanmugam [Thu, 1 Feb 2024 17:17:15 +0000 (22:47 +0530)]
drm/amdgpu: Fix potential out-of-bounds access in 'amdgpu_discovery_reg_base_init()'

The issue arises when the array 'adev->vcn.vcn_config' is accessed
before checking if the index 'adev->vcn.num_vcn_inst' is within the
bounds of the array.

The fix involves moving the bounds check before the array access. This
ensures that 'adev->vcn.num_vcn_inst' is within the bounds of the array
before it is used as an index.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1289 amdgpu_discovery_reg_base_init() error: testing array offset 'adev->vcn.num_vcn_inst' after use.

Fixes: a0ccc717c4ab ("drm/amdgpu/discovery: validate VCN and SDMA instances")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/pm: Retrieve UMC ODECC error count from aca bank
Candice Li [Fri, 2 Feb 2024 10:27:32 +0000 (18:27 +0800)]
drm/amd/pm: Retrieve UMC ODECC error count from aca bank

Instead of software managed counters.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Add more checks for exiting idle in DC
Nicholas Kazlauskas [Tue, 16 Jan 2024 14:52:58 +0000 (09:52 -0500)]
drm/amd/display: Add more checks for exiting idle in DC

[Why]
Any interface that touches registers needs to wake up the system.

[How]
Add a new interface dc_exit_ips_for_hw_access that wraps the check
for IPS support and insert it into the public DC interfaces that
touch registers.

We don't re-enter, since we expect that the enter/exit to have been done
on the DM side.

Cc: stable@vger.kernel.org # 6.1+
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: correct static screen event mask
Allen Pan [Tue, 23 Jan 2024 20:14:40 +0000 (15:14 -0500)]
drm/amd/display: correct static screen event mask

[Why]
Hardware register definition changed

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Allen Pan <allen.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
Li Ma [Thu, 1 Feb 2024 11:31:06 +0000 (19:31 +0800)]
drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend

A supplement to commit: 615dd56ac5379f4239940be69139a33e79e59c67
There is an irq warning of jpeg during resume in s2idle process. No irq enabled in jpeg 4.0.5 resume.

Fixes: 615dd56ac537 ("drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend")
Signed-off-by: Li Ma <li.ma@amd.com>
Acked-By: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: reset gpu for s3 suspend abort case
Prike Liang [Wed, 17 Jan 2024 05:39:37 +0000 (13:39 +0800)]
drm/amdgpu: reset gpu for s3 suspend abort case

In the s3 suspend abort case some type of gfx9 power
rail not turn off from FCH side and this will put the
GPU in an unknown power status, so let's reset the gpu
to a known good power state before reinitialize gpu
device.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: skip to program GFXDEC registers for suspend abort
Prike Liang [Tue, 16 Jan 2024 11:10:45 +0000 (19:10 +0800)]
drm/amdgpu: skip to program GFXDEC registers for suspend abort

In the suspend abort cases, the gfx power rail doesn't turn off so
some GFXDEC registers/CSB can't reset to default value and at this
moment reinitialize GFXDEC/CSB will result in an unexpected error.
So let skip those program sequence for the suspend abort case.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: set odm_combine_policy based on context in dcn32 resource
Wenjing Liu [Thu, 18 Jan 2024 23:12:15 +0000 (18:12 -0500)]
drm/amd/display: set odm_combine_policy based on context in dcn32 resource

[why]
When populating dml pipes, odm combine policy should be assigned based
on the pipe topology of the context passed in. DML pipes could be
repopulated multiple times during single validate bandwidth attempt. We
need to make sure that whenever we repopulate the dml pipes it is always
aligned with the updated context. There is a case where DML pipes get
repopulated during FPO optimization after ODM combine policy is changed.
Since in the current code we reinitlaize ODM combine policy, even though
the current context has ODM combine enabled, we overwrite it despite the
pipes are already split. This causes DML to think that MPC combine is
used so we mistakenly enable MPC combine because we apply pipe split
with ODM combine policy reset. This issue doesn't impact non windowed
MPO with ODM case because the legacy policy has restricted use cases. We
don't encounter the case where both ODM and FPO optimizations are
enabled together. So we decide to leave it as is because it is about to
be replaced anyway.

Cc: stable@vger.kernel.org # 6.6+
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Don't perform rate toggle on DP2-capable FIXED_VS retimers
Michael Strauss [Tue, 16 Jan 2024 18:46:53 +0000 (13:46 -0500)]
drm/amd/display: Don't perform rate toggle on DP2-capable FIXED_VS retimers

[WHY]
Only required if FIXED_VS retimer does not support DP2-capable.

[HOW]
Gate link rate toggle with DP 128b/132b LTTPR channel coding cap check.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdkfd: Add cache line sizes to KFD topology
Joseph Greathouse [Sat, 20 Jan 2024 02:01:50 +0000 (20:01 -0600)]
drm/amdkfd: Add cache line sizes to KFD topology

The KFD topology includes cache line size, but we have not been
filling that information out unless we are parsing a CRAT table.
Fill in this information for the devices where we have cache
information structs, and pipe this information to the topology
sysfs files.

v2: squash in fix from Joe (Alex)

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Remove Legacy FIXED_VS Transparent LT Sequence
Michael Strauss [Fri, 5 Jan 2024 16:20:48 +0000 (11:20 -0500)]
drm/amd/display: Remove Legacy FIXED_VS Transparent LT Sequence

The New sequence has been in use in DCN314 with no regressions
introduced. Therefore, it is safe to enable this sequence for all
devices using FIXED_VS retimers. So, remove the legacy codepath and its
associated config flag.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: add panel_power_savings sysfs entry to eDP connectors
Hamza Mahfooz [Fri, 26 Jan 2024 21:27:10 +0000 (16:27 -0500)]
drm/amd/display: add panel_power_savings sysfs entry to eDP connectors

We want programs besides the compositor to be able to enable or disable
panel power saving features. However, since they are currently only
configurable through DRM properties, that isn't possible. So, to remedy
that issue introduce a new "panel_power_savings" sysfs attribute.

v2: squash in fix from Hamza (Alex)

Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization
Qiang Ma [Wed, 31 Jan 2024 07:57:03 +0000 (15:57 +0800)]
drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization

Problem:
The computer in the bios initialization process, unplug the HDMI display,
wait until the system up, plug in the HDMI display, did not enter the
hotplug interrupt function, the display is not bright.

Fix:
After the above problem occurs, and the hpd ack interrupt bit is 1,
the interrupt should be cleared during hpd_init initialization so that
when the driver is ready, it can respond to the hpd interrupt normally.

Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: fix typo in parameter description
Alex Deucher [Thu, 11 Jan 2024 15:56:33 +0000 (10:56 -0500)]
drm/amdgpu: fix typo in parameter description

Missing space.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Only create mes event log debugfs when mes is enabled
shaoyunl [Wed, 31 Jan 2024 14:20:07 +0000 (09:20 -0500)]
drm/amdgpu: Only create mes event log debugfs when mes is enabled

Skip the debugfs file creation for mes event log if the GPU
doesn't use MES. This to prevent potential kernel oops when
user try to read the event log in debugfs on a GPU without MES

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Add NULL test for 'timing generator' in 'dcn21_set_pipe()'
Srinivasan Shanmugam [Wed, 31 Jan 2024 03:19:41 +0000 (08:49 +0530)]
drm/amd/display: Add NULL test for 'timing generator' in 'dcn21_set_pipe()'

In "u32 otg_inst = pipe_ctx->stream_res.tg->inst;"
pipe_ctx->stream_res.tg could be NULL, it is relying on the caller to
ensure the tg is not NULL.

Fixes: 474ac4a875ca ("drm/amd/display: Implement some asic specific abm call backs.")
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Fix 'panel_cntl' could be null in 'dcn21_set_backlight_level()'
Srinivasan Shanmugam [Sat, 27 Jan 2024 13:04:01 +0000 (18:34 +0530)]
drm/amd/display: Fix 'panel_cntl' could be null in 'dcn21_set_backlight_level()'

'panel_cntl' structure used to control the display panel could be null,
dereferencing it could lead to a null pointer access.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn21/dcn21_hwseq.c:269 dcn21_set_backlight_level() error: we previously assumed 'panel_cntl' could be null (see line 250)

Fixes: 474ac4a875ca ("drm/amd/display: Implement some asic specific abm call backs.")
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu/pm: Use inline function for IP version check
Ma Jun [Wed, 31 Jan 2024 02:19:20 +0000 (10:19 +0800)]
drm/amdgpu/pm: Use inline function for IP version check

Use existing inline function for IP version check.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Reset IH OVERFLOW_CLEAR bit
Friedrich Vock [Tue, 23 Jan 2024 11:52:03 +0000 (12:52 +0100)]
drm/amdgpu: Reset IH OVERFLOW_CLEAR bit

Allows us to detect subsequent IH ring buffer overflows as well.

Cc: Joshua Ashton <joshua@froggi.es>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend
Yifan Zhang [Tue, 30 Jan 2024 13:01:42 +0000 (21:01 +0800)]
drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend

There is no irq enabled in vcn 4.0.5 resume, causing wrong amdgpu_irq_src status.
Beside, current set function callbacks are empty with no real effect.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: use PSP address query command
Tao Zhou [Fri, 26 Jan 2024 10:45:38 +0000 (18:45 +0800)]
drm/amdgpu: use PSP address query command

Get UMC physical address from PSP in RAS error address coversion.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: add PSP RAS address query command
Tao Zhou [Fri, 26 Jan 2024 10:44:43 +0000 (18:44 +0800)]
drm/amdgpu: add PSP RAS address query command

Convert mca address to physical address or vice versa via RAS TA.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: drm/amdgpu: remove golden setting for gfx 11.5.0
Yifan Zhang [Tue, 30 Jan 2024 05:14:39 +0000 (13:14 +0800)]
drm/amdgpu: drm/amdgpu: remove golden setting for gfx 11.5.0

No need to set GC golden settings in driver from gfx 11.5.0 onwards.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdkfd: reserve the BO before validating it
Lang Yu [Thu, 11 Jan 2024 04:27:07 +0000 (12:27 +0800)]
drm/amdkfd: reserve the BO before validating it

Fix a warning.

v2: Avoid unmapping attachment repeatedly when ERESTARTSYS.

v3: Lock the BO before accessing ttm->sg to avoid race conditions.(Felix)

[   41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.708989] Call Trace:
[   41.708992]  <TASK>
[   41.708996]  ? show_regs+0x6c/0x80
[   41.709000]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709008]  ? __warn+0x93/0x190
[   41.709014]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709024]  ? report_bug+0x1f9/0x210
[   41.709035]  ? handle_bug+0x46/0x80
[   41.709041]  ? exc_invalid_op+0x1d/0x80
[   41.709048]  ? asm_exc_invalid_op+0x1f/0x30
[   41.709057]  ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu]
[   41.709185]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709197]  ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu]
[   41.709337]  ? srso_alias_return_thunk+0x5/0x7f
[   41.709346]  kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu]
[   41.709467]  amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu]
[   41.709586]  kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu]
[   41.709710]  kfd_ioctl+0x1ec/0x650 [amdgpu]
[   41.709822]  ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu]
[   41.709945]  ? srso_alias_return_thunk+0x5/0x7f
[   41.709949]  ? tomoyo_file_ioctl+0x20/0x30
[   41.709959]  __x64_sys_ioctl+0x9c/0xd0
[   41.709967]  do_syscall_64+0x3f/0x90
[   41.709973]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Fixes: 101b8104307e ("drm/amdkfd: Move dma unmapping after TLB flush")
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Fix buffer overflow in 'get_host_router_total_dp_tunnel_bw()'
Srinivasan Shanmugam [Mon, 29 Jan 2024 15:47:09 +0000 (21:17 +0530)]
drm/amd/display: Fix buffer overflow in 'get_host_router_total_dp_tunnel_bw()'

The error message buffer overflow 'dc->links' 12 <= 12 suggests that the
code is trying to access an element of the dc->links array that is
beyond its bounds. In C, arrays are zero-indexed, so an array with 12
elements has valid indices from 0 to 11. Trying to access dc->links[12]
would be an attempt to access the 13th element of a 12-element array,
which is a buffer overflow.

To fix this, ensure that the loop does not go beyond the last valid
index when accessing dc->links[i + 1] by subtracting 1 from the loop
condition.

This would ensure that i + 1 is always a valid index in the array.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:208 get_host_router_total_dp_tunnel_bw() error: buffer overflow 'dc->links' 12 <= 12

Fixes: 59f1622a5f05 ("drm/amd/display: Add dpia display mode validation logic")
Cc: PeiChen Huang <peichen.huang@amd.com>
Cc: Aric Cyr <aric.cyr@amd.com>
Cc: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Add NULL check for kzalloc in 'amdgpu_dm_atomic_commit_tail()'
Srinivasan Shanmugam [Tue, 30 Jan 2024 08:36:43 +0000 (14:06 +0530)]
drm/amd/display: Add NULL check for kzalloc in 'amdgpu_dm_atomic_commit_tail()'

Add a NULL check for the kzalloc call that allocates memory for
dummy_updates in the amdgpu_dm_atomic_commit_tail function. Previously,
if kzalloc failed to allocate memory and returned NULL, the code would
attempt to use the NULL pointer.

The fix is to check if kzalloc returns NULL, and if so, log an error
message and skip the rest of the current loop iteration with the
continue statement.  This prevents the code from attempting to use the
NULL pointer.

Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202401300629.ICnCt983-lkp@intel.com/
Fixes: 135fd1b35690 ("drm/amd/display: Reduce stack size")
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()'
Srinivasan Shanmugam [Tue, 30 Jan 2024 06:40:38 +0000 (12:10 +0530)]
drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()'

Return 0 for success scenairos in 'gmc_v6/7/8/9_0_hw_init()'

Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:920 gmc_v6_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1104 gmc_v7_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1224 gmc_v8_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2347 gmc_v9_0_hw_init() warn: missing error code? 'r'

Fixes: fac4ebd79fed ("drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Need to resume ras during gpu reset for gfx v9_4_3 sriov
YiPeng Chai [Tue, 30 Jan 2024 12:03:39 +0000 (20:03 +0800)]
drm/amdgpu: Need to resume ras during gpu reset for gfx v9_4_3 sriov

Need to resume ras during gpu reset for
gfx v9_4_3 sriov

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: disable RAS feature when fini
Tao Zhou [Fri, 26 Jan 2024 11:43:37 +0000 (19:43 +0800)]
drm/amdgpu: disable RAS feature when fini

Send RAS disable feature command in fini.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Update boot time errors polling sequence
Hawking Zhang [Mon, 29 Jan 2024 12:29:08 +0000 (20:29 +0800)]
drm/amdgpu: Update boot time errors polling sequence

Update boot time errors polling sequence to align with
the latest firmware change.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd: Don't init MEC2 firmware when it fails to load
David McFarland [Mon, 29 Jan 2024 22:18:22 +0000 (18:18 -0400)]
drm/amd: Don't init MEC2 firmware when it fails to load

The same calls are made directly above, but conditional on the firmware
loading and validating successfully.

Cc: stable@vger.kernel.org
Fixes: 9931b67690cf ("drm/amd: Load GFX10 microcode during early_init")
Signed-off-by: David McFarland <corngood@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: Fix the warning info in mode1 reset
Ma Jun [Fri, 5 Jan 2024 06:05:25 +0000 (14:05 +0800)]
drm/amdgpu: Fix the warning info in mode1 reset

Fix the warning info below during mode1 reset.
[  +0.000004] Call Trace:
[  +0.000004]  <TASK>
[  +0.000006]  ? show_regs+0x6e/0x80
[  +0.000011]  ? __flush_work.isra.0+0x2e8/0x390
[  +0.000005]  ? __warn+0x91/0x150
[  +0.000009]  ? __flush_work.isra.0+0x2e8/0x390
[  +0.000006]  ? report_bug+0x19d/0x1b0
[  +0.000013]  ? handle_bug+0x46/0x80
[  +0.000012]  ? exc_invalid_op+0x1d/0x80
[  +0.000011]  ? asm_exc_invalid_op+0x1f/0x30
[  +0.000014]  ? __flush_work.isra.0+0x2e8/0x390
[  +0.000007]  ? __flush_work.isra.0+0x208/0x390
[  +0.000007]  ? _prb_read_valid+0x216/0x290
[  +0.000008]  __cancel_work_timer+0x11d/0x1a0
[  +0.000007]  ? try_to_grab_pending+0xe8/0x190
[  +0.000012]  cancel_work_sync+0x14/0x20
[  +0.000008]  amddrm_sched_stop+0x3c/0x1d0 [amd_sched]
[  +0.000032]  amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu]

This warning info was printed after applying the patch
"drm/sched: Convert drm scheduler to use a work queue rather than kthread".
The root cause is that amdgpu driver tries to use the uninitialized
work_struct in the struct drm_gpu_scheduler

v2:
 - Rename the function to amdgpu_ring_sched_ready and move it to
amdgpu_ring.c (Alex)
v3:
- Fix a few more checks based on Vitaly's patch (Alex)
v4:
- squash in fix noticed by Bert in
https://gitlab.freedesktop.org/drm/amd/-/issues/3139

Fixes: 11b3b9f461c5 ("drm/sched: Check scheduler ready before calling timeout handling")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: use helper macro HW_ERR instead of Hardware error string
Yang Wang [Mon, 29 Jan 2024 09:06:38 +0000 (17:06 +0800)]
drm/amdgpu: use helper macro HW_ERR instead of Hardware error string

use helper macro HW_ERR to instead of Hardware error string.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu/pm: Use macro definitions in the smu IH process function
Ma Jun [Thu, 18 Jan 2024 08:29:09 +0000 (16:29 +0800)]
drm/amdgpu/pm: Use macro definitions in the smu IH process function

Replace the hard-coded numbers with macro definition

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: 3.2.270
Aric Cyr [Mon, 22 Jan 2024 03:38:45 +0000 (22:38 -0500)]
drm/amd/display: 3.2.270

- Add control flag for IPS residency profiling
- Populate invalid split index to be 0xF
- Fix dcn35 8k30 Underflow/Corruption Issue
- Fix DP audio settings
- Use correct phantom pipe when populating subvp pipe info
- Fix incorrect mpc_combine array size
- Fix DPSTREAM CLK on and off sequence
- Fix USB-C flag update after enc10 feature init
- Add debugfs disallow edp psr
- Unify optimize_required flags and VRR adjustments
- Increased min_dcfclk_mhz and min_fclk_mhz
- Fix static screen event mask definition change

Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: [FW Promotion] Release 0.0.202.0
Anthony Koo [Sun, 21 Jan 2024 01:54:33 +0000 (20:54 -0500)]
drm/amd/display: [FW Promotion] Release 0.0.202.0

 - Add control flag for IPS residency profiling

Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Anthony Koo <anthony.koo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Populate invalid split index to be 0xF
Alvin Lee [Fri, 19 Jan 2024 22:20:21 +0000 (17:20 -0500)]
drm/amd/display: Populate invalid split index to be 0xF

[why]
There exists scenarios where the split index for subvp can be
pipe index 0. The assumption in FW is that the split index
won't be 0 but this is incorrect.

[how]
Instead populate non-split cases to be 0xF to differentiate
between split and non-split.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Fix dcn35 8k30 Underflow/Corruption Issue
Fangzhi Zuo [Thu, 11 Jan 2024 19:46:01 +0000 (14:46 -0500)]
drm/amd/display: Fix dcn35 8k30 Underflow/Corruption Issue

[why]
odm calculation is missing for pipe split policy determination
and cause Underflow/Corruption issue.

[how]
Add the odm calculation.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: clkmgr unittest with removal of warn & rename DCN35 ips handshake...
Mounika Adhuri [Thu, 18 Jan 2024 10:00:48 +0000 (15:30 +0530)]
drm/amd/display: clkmgr unittest with removal of warn & rename DCN35 ips handshake for idle

[why]
To Remove warnings of clk_mgr.

[How]
Added code to remove warnings by resolving redefinations.

Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Mounika Adhuri <moadhuri@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: fix DP audio settings
Charlene Liu [Thu, 18 Jan 2024 23:25:23 +0000 (18:25 -0500)]
drm/amd/display: fix DP audio settings

[why]
Audio channel layout for 5.1ch is not correct

[how]
Add the audio layout for 5.1ch (channel_count = 6).
Add divided by zero check.

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Underflow workaround by increasing SR exit latency
Nicholas Susanto [Thu, 18 Jan 2024 18:34:40 +0000 (13:34 -0500)]
drm/amd/display: Underflow workaround by increasing SR exit latency

[Why]
On 14us for exit latency time causes underflow for 8K monitor with HDR on.
Increasing the latency to 28us fixes the underflow.

[How]
Increase the latency to 28us. This workaround should be sufficient
before we figure out why SR exit so long.

Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: use correct phantom pipe when populating subvp pipe info
Wenjing Liu [Thu, 18 Jan 2024 20:30:01 +0000 (15:30 -0500)]
drm/amd/display: use correct phantom pipe when populating subvp pipe info

[why]
In current code, we recognize a pipe as a phantom pipe if it references
the same phantom stream. However it can also a phantom split pipe.
If the phantom split pipe has a smaller pipe index than the phantom pipe
we will mistakenly use the phantom split pipe as the phantom pipe. This
causes an incorrect subvp configuration where the first half of the
screen is flashing solid white image.

[how]
Add additional check that the pipe needs to be an OTG master pipe.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: fix incorrect mpc_combine array size
Wenjing Liu [Thu, 18 Jan 2024 20:14:15 +0000 (15:14 -0500)]
drm/amd/display: fix incorrect mpc_combine array size

[why]
MAX_SURFACES is per stream, while MAX_PLANES is per asic. The
mpc_combine is an array that records all the planes per asic. Therefore
MAX_PLANES should be used as the array size. Using MAX_SURFACES causes
array overflow when there are more than 3 planes.

[how]
Use the MAX_PLANES for the mpc_combine array size.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Fix DPSTREAM CLK on and off sequence
Dmytro Laktyushkin [Wed, 17 Jan 2024 21:46:02 +0000 (16:46 -0500)]
drm/amd/display: Fix DPSTREAM CLK on and off sequence

[Why]
Secondary DP2 display fails to light up in some instances

[How]
Clock needs to be on when DPSTREAMCLK*_EN =1. This change
moves dtbclk_p enable/disable point to make sure this is
the case

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Daniel Miess <daniel.miess@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: fix invalid reg access on DCN35 FPGA
Eric Yang [Tue, 16 Jan 2024 19:50:55 +0000 (14:50 -0500)]
drm/amd/display: fix invalid reg access on DCN35 FPGA

[Why]
Unguarded SMU and CLK IP access cause issue on FPGA

[How]
Guard them for FPGA environment

Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Eric Yang <eric.yang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: refine code for dmcub inbox1 ring buffer debug
Fudongwang [Wed, 17 Jan 2024 11:13:46 +0000 (19:13 +0800)]
drm/amd/display: refine code for dmcub inbox1 ring buffer debug

[Why]
1. To watch dmcub inbox1 ring buffer cmd type without tools
2. dmub_cmd_PLAT_54186_wa 66 bytes

[How]
Added dmcub cmd type enum: unsigned char for debug use only,
also fixed 66 bytes issue by using unsigned int in bit
define instead of unsigned char.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Fudongwang <fudong.wang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Send DTBCLK disable message on first commit
Taimur Hassan [Tue, 16 Jan 2024 23:10:54 +0000 (18:10 -0500)]
drm/amd/display: Send DTBCLK disable message on first commit

[Why]
Previous patch to allow DTBCLK disable didn't address boot case. Driver
thinks DTBCLK is disabled by default, so we don't send disable message to
PMFW. DTBCLK is then enabled at idle desktop on boot, burning power.

[How]
Set dtbclk_en to true on boot so that disable message is sent during first
commit.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <syed.hassan@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: fix USB-C flag update after enc10 feature init
Charlene Liu [Wed, 17 Jan 2024 00:54:15 +0000 (19:54 -0500)]
drm/amd/display: fix USB-C flag update after enc10 feature init

[why]
BIOS's integration info table not following the original order
which is phy instance is ext_displaypath's array index.

[how]
Move them to follow the original order.

Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: add debugfs disallow edp psr
Hersen Wu [Wed, 3 Jan 2024 21:27:39 +0000 (16:27 -0500)]
drm/amd/display: add debugfs disallow edp psr

[Why]
fix reading edp rx crc timeout failure. after
bootup, kernel setup psr with dpcd 0x170 = 5. this
notify rx psr enable and let rx fw start checking crc
for fw internal logic. rx fw may not update crc read
count within dpcd 0x246. read count is always 0. this
will lead tx crc reading timeout.

[How]
add debugfs to let test app to disbable rx crc
checking for rx internal logic. then test app can read
rx crc dpcd 0x246 successfully.
expected app sequence is as below:
1. disable eDP PHY and notify eDP rx with dpcd 0x600 = 2.
2. echo 0x1 /sys/kernel/debug/dri/0/eDP-X/disallow_edp_enter_psr
3. enable eDP PHY and notify eDP rx with dpcd 0x600 = 1 but
   without dpcd 0x170 = 5.
4. read crc from rx dpcd 0x270, 0x246, etc.
5. echo 0x0 /sys/kernel/debug/dri/0/eDP-X/disallow_edp_enter_psr.
   this will let eDP back to normal with psr setup dpcd 0x170 = 5.

Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: For FPO and SubVP/DRR configs program vmin/max sel
Alvin Lee [Tue, 12 Dec 2023 17:17:31 +0000 (12:17 -0500)]
drm/amd/display: For FPO and SubVP/DRR configs program vmin/max sel

[Why & How]
For FPO and SubVP/DRR cases we need to ensure to program
OTG_V_TOTAL_MIN/MAX_SEL, otherwise stretching the vblank
in FPO / SubVP / DRR cases will not have any effect
and we could hit underflow / corruption.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Unify optimize_required flags and VRR adjustments
Aric Cyr [Thu, 30 Nov 2023 23:54:48 +0000 (18:54 -0500)]
drm/amd/display: Unify optimize_required flags and VRR adjustments

[why]
There is only a single call to dc_post_update_surfaces_to_stream
so there is no need to have two flags to control it. Unifying
this to a single flag allows dc_stream_adjust_vmin_vmax to skip
actual programming when there is no change required.

[how]
Remove wm_optimze_required flag and set only optimize_required in its
place.  Then in dc_stream_adjust_vmin_vmax, check that the stream timing
range matches the requested one and skip programming if they are equal.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: increased min_dcfclk_mhz and min_fclk_mhz
Sohaib Nadeem [Tue, 16 Jan 2024 16:00:00 +0000 (11:00 -0500)]
drm/amd/display: increased min_dcfclk_mhz and min_fclk_mhz

[why]
Originally, PMFW said min FCLK is 300Mhz, but min DCFCLK can be increased
to 400Mhz because min FCLK is now 600Mhz so FCLK >= 1.5 * DCFCLK hardware
requirement will still be satisfied. Increasing min DCFCLK addresses
underflow issues (underflow occurs when phantom pipe is turned on for some
Sub-Viewport configs).

[how]
Increasing DCFCLK by raising the min_dcfclk_mhz

Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Sohaib Nadeem <sohaib.nadeem@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Wait for mailbox ready when powering up DMCUB
Nicholas Kazlauskas [Fri, 12 Jan 2024 15:06:15 +0000 (10:06 -0500)]
drm/amd/display: Wait for mailbox ready when powering up DMCUB

[Why]
Otherwise we can send commands too early and they don't execute until
the next command is sent.

[How]
Check the extra status bit when polling for HW powered up.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Wait before sending idle allow and after idle disallow
Nicholas Kazlauskas [Mon, 15 Jan 2024 22:24:26 +0000 (17:24 -0500)]
drm/amd/display: Wait before sending idle allow and after idle disallow

[Why]
We want acknowledgment of the driver idle disallow from DMCUB before
continuing with any further programming.

For idle allow we want to minimize the chance of DMCUB actively
interacing with other firmware components on the system (eg. PMFW)
at the same time.

[How]
Ensure that DMCUB isn't in the middle of processing other command
submissions prior to allowing idle and after disallowing idle by
inserting a wait before the allow and by changing the wait type for
the idle disallow.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoRevert "drm/amd/display: initialize all the dpm level's stutter latency"
Charlene Liu [Tue, 9 Jan 2024 16:31:20 +0000 (11:31 -0500)]
Revert "drm/amd/display: initialize all the dpm level's stutter latency"

Revert commit 885c71ad791c
("drm/amd/display: initialize all the dpm level's stutter latency")

Because it causes some regression

Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Fix static screen event mask definition change
Yiling Chen [Mon, 15 Jan 2024 03:28:12 +0000 (11:28 +0800)]
drm/amd/display: Fix static screen event mask definition change

[why]
The static screen event mask definition is different
betwnn DCN31 after and before.

[how]
Rename DCN30_set_static_screen_control to DCN31.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Introduce a simple contribution list for display code
Rodrigo Siqueira [Mon, 22 Jan 2024 21:25:01 +0000 (14:25 -0700)]
Documentation/gpu: Introduce a simple contribution list for display code

This commit adds a contribution list for display under the kernel
documentation with some first suggestions. It also drops an old TODO
list from the display folder.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add an explanation about the DC weekly patches
Rodrigo Siqueira [Mon, 22 Jan 2024 21:25:00 +0000 (14:25 -0700)]
Documentation/gpu: Add an explanation about the DC weekly patches

This commit introduces some explanation about the display team
validation.

Changes since V1:
- Remove unprecise information about the DC process for now, can be
  added later on.
- Jani: Fix bullets

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add entry for the DIO component
Rodrigo Siqueira [Mon, 22 Jan 2024 21:24:59 +0000 (14:24 -0700)]
Documentation/gpu: Add entry for the DIO component

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add entry for OPP in the kernel doc
Rodrigo Siqueira [Mon, 22 Jan 2024 21:24:58 +0000 (14:24 -0700)]
Documentation/gpu: Add entry for OPP in the kernel doc

Introduce OPP as part of the kernel documentation.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add kernel doc entry for MPC
Rodrigo Siqueira [Mon, 22 Jan 2024 21:24:57 +0000 (14:24 -0700)]
Documentation/gpu: Add kernel doc entry for MPC

This commit adds a kernel-doc entry for the MPC block. Since it enabled
the kernel-doc to parse some of the documentation in the mpc.h file,
fixing some of the comments was required.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add kernel doc entry for DPP
Rodrigo Siqueira [Mon, 22 Jan 2024 21:24:56 +0000 (14:24 -0700)]
Documentation/gpu: Add kernel doc entry for DPP

This commit introduces basic DPP information and the struct scan for
code documentation.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdkfd: Use correct drm device for cgroup permission check
Mukul Joshi [Fri, 26 Jan 2024 20:14:50 +0000 (15:14 -0500)]
drm/amdkfd: Use correct drm device for cgroup permission check

On GFX 9.4.3, for a given KFD node, fetch the correct drm device from
XCP manager when checking for cgroup permissions.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdkfd: Use S_ENDPGM_SAVED in trap handler
Jay Cornwall [Fri, 5 Jan 2024 14:49:06 +0000 (08:49 -0600)]
drm/amdkfd: Use S_ENDPGM_SAVED in trap handler

This instruction has no functional difference to S_ENDPGM
but allows performance counters to track save events correctly.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Laurent Morichetti <laurent.morichetti@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdkfd: Correct partial migration virtual addr
Philip Yang [Mon, 15 Jan 2024 18:37:47 +0000 (13:37 -0500)]
drm/amdkfd: Correct partial migration virtual addr

Partial migration to system memory should use migrate.addr, not
prange->start as virtual address to allocate system memory page.

Fixes: a546a2768440 ("drm/amdkfd: Use partial migrations/mapping for GPU/CPU page faults in SVM")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: move the drm client creation behind drm device registration
Le Ma [Thu, 25 Jan 2024 04:00:34 +0000 (12:00 +0800)]
drm/amdgpu: move the drm client creation behind drm device registration

This patch is to eliminate interrupt warning below:

  "[drm] Fence fallback timer expired on ring sdma0.0".

An early vm pt clearing job is sent to SDMA ahead of interrupt enabled.
And re-locating the drm client creation following after drm_dev_register
looks like a more proper flow.

v2: wrap the drm client creation

Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles")
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/display: Fix potential NULL pointer dereferences in 'dcn10_set_output_transfe...
Srinivasan Shanmugam [Thu, 25 Jan 2024 15:46:04 +0000 (21:16 +0530)]
drm/amd/display: Fix potential NULL pointer dereferences in 'dcn10_set_output_transfer_func()'

The 'stream' pointer is used in dcn10_set_output_transfer_func() before
the check if 'stream' is NULL.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn10/dcn10_hwseq.c:1892 dcn10_set_output_transfer_func() warn: variable dereferenced before check 'stream' (see line 1875)

Fixes: ddef02de0d71 ("drm/amd/display: add null checks before logging")
Cc: Wyatt Wood <wyatt.wood@amd.com>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amd/include: Add missing registers/mask for DCN316 and 350
Rodrigo Siqueira [Thu, 11 Jan 2024 21:55:57 +0000 (14:55 -0700)]
drm/amd/include: Add missing registers/mask for DCN316 and 350

Cc: Jun Lei <Jun.Lei@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add simple doc page for DCHUBBUB
Rodrigo Siqueira [Mon, 22 Jan 2024 21:24:55 +0000 (14:24 -0700)]
Documentation/gpu: Add simple doc page for DCHUBBUB

Enable the documentation to extract code documentation from dchubbub.h
file.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agoDocumentation/gpu: Add basic page for HUBP
Rodrigo Siqueira [Mon, 22 Jan 2024 21:24:54 +0000 (14:24 -0700)]
Documentation/gpu: Add basic page for HUBP

Create the HUBP documentation page and add the doc references to extract
the HUBP code documentation.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
20 months agodrm/amdgpu: update documentation on new chips
Alex Deucher [Thu, 18 Jan 2024 18:52:28 +0000 (13:52 -0500)]
drm/amdgpu: update documentation on new chips

These have been released now, so add them to the documentation.

Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agoamdgpu/drm: Use vram manager for virtualization page retirement
Victor Skvortsov [Mon, 22 Jan 2024 17:45:09 +0000 (12:45 -0500)]
amdgpu/drm: Use vram manager for virtualization page retirement

In runtime, use vram manager for virtualization page retirement.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Add RAS_POISON_READY host response message
Victor Skvortsov [Sun, 21 Jan 2024 15:25:24 +0000 (10:25 -0500)]
drm/amdgpu: Add RAS_POISON_READY host response message

In a non-FLR page avoidance scenario, the host driver will
provide the bad pages in the pf2vf exchange region.

Adding a new host response message to indicate when the
pf2vf exchange region has been updated.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Support passing poison consumption ras block to SRIOV
YiPeng Chai [Tue, 23 Jan 2024 08:08:11 +0000 (16:08 +0800)]
drm/amdgpu: Support passing poison consumption ras block to SRIOV

Support passing poison consumption ras blocks
to SRIOV.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: "Enable IPS by default"
Roman Li [Tue, 23 Jan 2024 20:18:24 +0000 (15:18 -0500)]
drm/amd/display: "Enable IPS by default"

[Why]
IPS was temporary disabled due to instability.
It was fixed in dmub firmware and with:
- "drm/amd/display: Add IPS checks before dcn register access"
- "drm/amd/display: Disable ips before dc interrupt setting"

[How]
Enable IPS by default.
Disable IPS if 0x800 bit set in amdgpu.dcdebugmask module params

Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd: Add a DC debug mask for IPS
Roman Li [Tue, 23 Jan 2024 20:14:28 +0000 (15:14 -0500)]
drm/amd: Add a DC debug mask for IPS

For debugging IPS-related issues, expose a new debug mask
that allows to disable IPS.
Usage:
amdgpu.dcdebugmask=0x800

Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Disable ips before dc interrupt setting
Roman Li [Mon, 22 Jan 2024 22:45:41 +0000 (17:45 -0500)]
drm/amd/display: Disable ips before dc interrupt setting

[Why]
While in IPS2 an access to dcn registers is not allowed.
If interrupt results in dc call, we should disable IPS.

[How]
Safeguard register access in IPS2 by disabling idle optimization
before calling dc interrupt setting api.

Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: adjust aca init/fini sequence to match gpu reset
Yang Wang [Wed, 24 Jan 2024 02:15:10 +0000 (10:15 +0800)]
drm/amdgpu: adjust aca init/fini sequence to match gpu reset

- move aca init/fini function into ras init/fini to adapt gpu reset
  sequence.
- add new function amdgpu_aca_reset()

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add aca sysfs remove support
Yang Wang [Tue, 23 Jan 2024 06:31:26 +0000 (14:31 +0800)]
drm/amdgpu: add aca sysfs remove support

add aca sysfs remove support.

Fixes: 37973b69eab4 ("drm/amdgpu: add aca sysfs support")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Fix a potential buffer overflow in 'dp_dsc_clock_en_read()'
Srinivasan Shanmugam [Tue, 23 Jan 2024 14:48:07 +0000 (20:18 +0530)]
drm/amd/display: Fix a potential buffer overflow in 'dp_dsc_clock_en_read()'

Tell snprintf() to store at most 10 bytes in the output buffer
instead of 30.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_debugfs.c:1508 dp_dsc_clock_en_read() error: snprintf() is printing too much 30 vs 10

Fixes: c06e09b76639 ("drm/amd/display: Add DSC parameters logging to debugfs")
Cc: Alex Hung <alex.hung@amd.com>
Cc: Qingqing Zhuo <qingqing.zhuo@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Fix module unload hang with RAS enabled
Mukul Joshi [Wed, 24 Jan 2024 02:14:51 +0000 (10:14 +0800)]
drm/amdgpu: Fix module unload hang with RAS enabled

The driver unload hangs because the page retirement
kthread cannot be stopped as it is sleeping and waiting
on page retirement event to occur. Add kthread_should_stop()
to the event condition to wake up the kthread when kthread
stop is called during driver unload.

Fixes: 3fdcd0a31d7a ("drm/amdgpu: Prepare for asynchronous processing of umc page retirement")
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu/pm: Add default case for smu IH process func
Ma Jun [Mon, 22 Jan 2024 06:21:11 +0000 (14:21 +0800)]
drm/amdgpu/pm: Add default case for smu IH process func

Add default case for smu IH process func.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: convert some variable sized arrays to [] style
Alex Deucher [Wed, 17 Jan 2024 23:09:00 +0000 (18:09 -0500)]
drm/amdgpu: convert some variable sized arrays to [] style

Replace [1] with [].  Silences UBSAN warnings.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3107
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs
Alex Deucher [Fri, 19 Jan 2024 17:32:59 +0000 (12:32 -0500)]
drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs

This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer.  On GC 9.x and older, this needs
to be set to 0. This can lead to hangs in some mixed
graphics and compute workloads. Updated firmware is also
required for AQL.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
21 months agodrm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs
Alex Deucher [Fri, 19 Jan 2024 17:23:55 +0000 (12:23 -0500)]
drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs

This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer.  On GC 9.x and older, this needs
to be set to 0.  This can lead to hangs in some mixed
graphics and compute workloads.  Updated firmware is also
required for AQL.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
21 months agodrm/amd/amdgpu: Assign GART pages to AMD device mapping
Tom St Denis [Wed, 17 Jan 2024 17:47:37 +0000 (12:47 -0500)]
drm/amd/amdgpu: Assign GART pages to AMD device mapping

This allows kernel mapped pages like the PDB and PTB to be
read via the iomem debugfs when there is no vram in the system.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Fix return type in 'aca_bank_hwip_is_matched()'
Srinivasan Shanmugam [Tue, 23 Jan 2024 07:16:36 +0000 (12:46 +0530)]
drm/amdgpu: Fix return type in 'aca_bank_hwip_is_matched()'

Change the return type of "if (!bank || type == ACA_HWIP_TYPE_UNKNOW)"
to be bool instead of int.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c:185 aca_bank_hwip_is_matched() warn: signedness bug returning '(-22)'

Fixes: f5e4cc8461c4 ("drm/amdgpu: implement RAS ACA driver framework")
Cc: Yang Wang <kevinyang.wang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/pm: Fetch current power limit from FW
Lijo Lazar [Thu, 18 Jan 2024 08:55:35 +0000 (14:25 +0530)]
drm/amd/pm: Fetch current power limit from FW

Power limit of SMUv13.0.6 SOCs can be updated by out-of-band ways. Fetch
the limit from firmware instead of using cached values.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu/pptable: convert some variable sized arrays to [] style
Alex Deucher [Mon, 22 Jan 2024 15:48:39 +0000 (10:48 -0500)]
drm/amdgpu/pptable: convert some variable sized arrays to [] style

Replace [1] with [].  Silences UBSAN warnings.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agoRevert "drm/amd/pm: fix the high voltage and temperature issue"
Mario Limonciello [Fri, 19 Jan 2024 09:08:37 +0000 (03:08 -0600)]
Revert "drm/amd/pm: fix the high voltage and temperature issue"

This reverts commit 5f38ac54e60562323ea4abb1bfb37d043ee23357.
This causes issues with rebooting and the 7800XT.

Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: stable@vger.kernel.org
Fixes: 5f38ac54e605 ("drm/amd/pm: fix the high voltage and temperature issue")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3062
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: hook up DCN20 color blocks data to DTN log
Melissa Wen [Tue, 28 Nov 2023 17:52:57 +0000 (16:52 -0100)]
drm/amd/display: hook up DCN20 color blocks data to DTN log

Color caps changed between HW versions, which caused the DCN10 color
state sections in the DTN log to no longer match DCN2+ state. Create a
color state log specific to DCN2.0 and hook it up to DCN2 family
drivers. Instead of reading gamut remap reg values, display gamut remap
matrix data in fixed 31.32.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: add DPP and MPC color caps to DTN log
Melissa Wen [Tue, 28 Nov 2023 17:52:56 +0000 (16:52 -0100)]
drm/amd/display: add DPP and MPC color caps to DTN log

Add color caps information for DPP and MPC block to show HW color caps.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Address kdoc for eDP Panel Replay feature in 'amdgpu_dm_crtc_set_pan...
Srinivasan Shanmugam [Mon, 22 Jan 2024 15:17:32 +0000 (20:47 +0530)]
drm/amd/display: Address kdoc for eDP Panel Replay feature in 'amdgpu_dm_crtc_set_panel_sr_feature()'

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_crtc.c:100: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * The DRM vblank counter enable/disable action is used as the trigger
   to enable

Cc: Sun peng Li <sunpeng.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: hook up DCN30 color blocks data to DTN log
Melissa Wen [Tue, 28 Nov 2023 17:52:55 +0000 (16:52 -0100)]
drm/amd/display: hook up DCN30 color blocks data to DTN log

Color caps changed between HW versions, which caused the DCN10 color
state sections in the DTN log to no longer match DCN3+ state. Create a
color state log specific to DCN3.0 and hook it up to DCN3.0+ and DCN3.1+
drivers.

rfc-v2:
- detail RAM mode for gamcor and blnd gamma blocks
- add MPC gamut remap matrix log

v3:
- read MPC gamut remap matrix in fixed 31.32 format
- extend to DCN3.0+ and DCN3.1+ drivers (Harry)

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Fix null pointer dereference
Hawking Zhang [Mon, 22 Jan 2024 09:38:23 +0000 (17:38 +0800)]
drm/amdgpu: Fix null pointer dereference

amdgpu_reg_state_sysfs_fini could be invoked at the
time when asic_func is even not initialized, i.e.,
amdgpu_discovery_init fails for some reason.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: skip call ras_late_init if ras block is not supported
Yang Wang [Mon, 22 Jan 2024 04:09:31 +0000 (12:09 +0800)]
drm/amdgpu: skip call ras_late_init if ras block is not supported

skip call ras_late_init callback if ras block is not supported.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Show vram vendor only if available
Lijo Lazar [Sat, 20 Jan 2024 08:18:09 +0000 (13:48 +0530)]
drm/amdgpu: Show vram vendor only if available

Ony if vram vendor info is available, show in sysfs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Avoid fetching vram vendor information
Lijo Lazar [Sat, 20 Jan 2024 08:02:51 +0000 (13:32 +0530)]
drm/amdgpu: Avoid fetching vram vendor information

For GFX 9.4.3 APUs, the current method of fetching vram vendor
information is not reliable. Avoid fetching the information.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>