]> www.infradead.org Git - users/hch/dma-mapping.git/log
users/hch/dma-mapping.git
21 months agodrm/amdgpu: Fix the null pointer when load rlc firmware
Ma Jun [Fri, 12 Jan 2024 05:33:24 +0000 (13:33 +0800)]
drm/amdgpu: Fix the null pointer when load rlc firmware

If the RLC firmware is invalid because of wrong header size,
the pointer to the rlc firmware is released in function
amdgpu_ucode_request. There will be a null pointer error
in subsequent use. So skip validation to fix it.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Centralize ras cap query to amdgpu_ras_check_supported
Hawking Zhang [Fri, 29 Dec 2023 06:37:45 +0000 (14:37 +0800)]
drm/amdgpu: Centralize ras cap query to amdgpu_ras_check_supported

Move ras capablity check to amdgpu_ras_check_supported.
Driver will query ras capablity through psp interace, or
vbios interface, or specific ip callbacks.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Query ras capablity from psp v2
Hawking Zhang [Tue, 2 Jan 2024 14:13:20 +0000 (22:13 +0800)]
drm/amdgpu: Query ras capablity from psp v2

Instead of traditional atomfirmware interfaces for RAS
capability, host driver can query ras capability from
psp starting from psp v13_0_6.

v2: drop redundant local variable from get_ras_capability.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: 3.2.267
Martin Leung [Tue, 2 Jan 2024 13:58:03 +0000 (08:58 -0500)]
drm/amd/display: 3.2.267

- Align the returned error code with legacy DP
- Allow Z8 for multiplane configurations on DCN35
- Set default Z8 minimum residency for DCN35
- Rework DC Z10 restore
- Enable Panel Replay for static screen use case
- Add DP audio BW validation
- Fix dml2 assigned pipe search
- Ensure populate uclk in bb construction
- Update P010 scaling cap
- Reenable windowed mpo odm support
- Fix DML2 watermark calculation
- Clear OPTC mem select on disable
- Floor to mhz when requesting dpp disp clock changes to SMU
- Port DENTIST hang and TDR fixes to OTG disable W/A
- Add logging resource checks
- Add Replay IPS register for DMUB command table
- Init link enc resources in dc_state only if res_pool presents
- Allow IPS2 during Replay

Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Martin Leung <martin.leung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Align the returned error code with legacy DP
Wayne Lin [Tue, 2 Jan 2024 06:20:37 +0000 (14:20 +0800)]
drm/amd/display: Align the returned error code with legacy DP

[Why]
For usb4 connector, AUX transaction is handled by dmub utilizing a differnt
code path comparing to legacy DP connector. If the usb4 DP connector is
disconnected, AUX access will report EBUSY and cause igt@kms_dp_aux_dev
fail.

[How]
Align the error code with the one reported by legacy DP as EIO.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Allow Z8 for multiplane configurations on DCN35
Nicholas Kazlauskas [Thu, 4 Jan 2024 22:56:15 +0000 (17:56 -0500)]
drm/amd/display: Allow Z8 for multiplane configurations on DCN35

[Why]
Power improvement over DCN314, but also addresses a functional issue
where plane_state remains uncleared on pipes that aren't actually
active.

[How]
Update the check to allow for zero streams to be treated as z8 allow.
Update the check to remove plane count on the active stream case.

Z8 will still be blocked based on stutter duration, which is likely to
be the case for most multi plane configurations.

Reviewed-by: Gabe Teeger <gabe.teeger@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Set default Z8 minimum residency for DCN35
Nicholas Kazlauskas [Thu, 4 Jan 2024 22:55:19 +0000 (17:55 -0500)]
drm/amd/display: Set default Z8 minimum residency for DCN35

[Why & How]
Match DCN314's policy.

Reviewed-by: Gabe Teeger <gabe.teeger@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Rework DC Z10 restore
Nicholas Kazlauskas [Tue, 19 Dec 2023 17:35:44 +0000 (12:35 -0500)]
drm/amd/display: Rework DC Z10 restore

[Why]
The call currently does two things:
1. Exits DMCUB from idle optimization if it was in
2. Checks DMCUB scratch register to determine if we need to call
   DMCUB to do deferred HW restore and then sends the command if it's
   ready for it.

By doing (1) we prevent driver idle from being renotified in the cases
where driver had previously allowed DC level idle optimizations via
dc_allow_idle_optimizations since it thinks:

allow == dc->idle_optimizations_allowed

...and that the operation is a no-op.

We want driver idle to be resent at the next opprotunity to do so
for video playback cases.

[How]
Migrate all usecases of dc_z10_restore to only perform (2).

Add extra calls to dc_allow_idle_optimizations to handle (1) and also
keep SW state matching with when we requested enter/exit of DMCUB
idle optimizations.

Ensure cursor idle optimizations false always get called when IPS
is supported.

Further rework/redesign is needed to decide whether we need a separate
level of DM allow vs DC allow and when to attempt re-entry.

Reviewed-by: Yihan Zhu <yihan.zhu@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Enable Panel Replay for static screen use case
Tom Chung [Wed, 6 Dec 2023 14:07:51 +0000 (22:07 +0800)]
drm/amd/display: Enable Panel Replay for static screen use case

[Why]
Enable the Panel Replay if eDP panel and ASIC support.
(prioritize Panel Replay over PSR)

[How]
- Setup the Panel Replay config during the device init
  (prioritize Panel Replay over PSR).
- Separate the Replay init function into two functions
  amdgpu_dm_link_setup_replay() and amdgpu_dm_set_replay_caps()
  to fix the issue in the earlier commit that cause PSR and Replay
  enabled at the same time.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Add DP audio BW validation
George Shen [Tue, 21 Nov 2023 23:32:52 +0000 (18:32 -0500)]
drm/amd/display: Add DP audio BW validation

[Why]
Timings with small HBlank (such as CVT RBv2) can result in insufficient
HBlank bandwidth for audio SDP transmission when DSC is active. This
will cause some higher bandwidth audio modes to fail.

The combination of CVT RBv2 timings + DSC can commonly be encountered
in MST scenarios.

[How]
Add DP audio bandwidth validation for 8b/10b MST and 128b/132b SST/MST
cases and filter out modes that cannot be supported with the current
timing config.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Fix dml2 assigned pipe search
Dmytro Laktyushkin [Thu, 4 Jan 2024 14:14:18 +0000 (09:14 -0500)]
drm/amd/display: Fix dml2 assigned pipe search

[Why & How]
DML2 currently finds assigned pipes in array order rather than the
existing linked list order. This results in rearranging pipe order
on flip and more importantly otg inst and pipe idx mismatch.

This change preserves the order of existing pipes and guarantees
the head pipe will have matching otg inst and pipe idx.

Reviewed-by: Gabe Teeger <gabe.teeger@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Ensure populate uclk in bb construction
Alvin Lee [Thu, 4 Jan 2024 18:40:04 +0000 (13:40 -0500)]
drm/amd/display: Ensure populate uclk in bb construction

[Description]
- For some SKUs, the optimal DCFCLK for each UCLK is less than the
  smallest DCFCLK STA target due to low memory bandwidth. There is
  an assumption that the DCFCLK STA targets will always be less
  than one of the optimal DCFCLK values, but this is not true for
  SKUs that have low memory bandwidth. In this case we need to
  populate the optimal UCLK for each DCFCLK STA targets as the max
  UCLK freq.
- Also fix a bug in DML where start_state is not assigned and used
  correctly.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Update P010 scaling cap
Charlene Liu [Wed, 3 Jan 2024 22:09:30 +0000 (17:09 -0500)]
drm/amd/display: Update P010 scaling cap

[Why]
Keep the same as previous APU and also insert clock dump

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Fix DML2 watermark calculation
Ovidiu Bunea [Tue, 19 Dec 2023 02:40:45 +0000 (21:40 -0500)]
drm/amd/display: Fix DML2 watermark calculation

[Why]
core_mode_programming in DML2 should output watermark calculations
to locals, but it incorrectly uses mode_lib

[How]
update code to match HW DML2

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Clear OPTC mem select on disable
Ilya Bakoulin [Wed, 3 Jan 2024 14:42:04 +0000 (09:42 -0500)]
drm/amd/display: Clear OPTC mem select on disable

[Why]
Not clearing the memory select bits prior to OPTC disable can cause DSC
corruption issues when attempting to reuse a memory instance for another
OPTC that enables ODM.

[How]
Clear the memory select bits prior to disabling an OPTC.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ilya Bakoulin <ilya.bakoulin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Floor to mhz when requesting dpp disp clock changes to SMU
Wenjing Liu [Tue, 2 Jan 2024 21:06:35 +0000 (16:06 -0500)]
drm/amd/display: Floor to mhz when requesting dpp disp clock changes to SMU

[Why]
SMU uses discrete dpp and disp clock levels. When we submit SMU request
for clock changes in Mhz we need to floor the requested value from Khz so
SMU will choose the next higher clock level in Khz to set. If we ceil to
Mhz, SMU will have to choose the next higher clock level after the ceil,
which could result in unnecessarily jumpping to the next level.

For example, we request 1911,111Khz which is exactly one of the SMU preset
level. If we pass 1912Mhz, SMU will choose 2150,000 khz. If we pass
1911Mhz, SMU will choose 1911,111kHz, which is the expected value.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
Nicholas Kazlauskas [Fri, 15 Dec 2023 16:01:42 +0000 (11:01 -0500)]
drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A

[Why]
We can experience DENTIST hangs during optimize_bandwidth or TDRs if
FIFO is toggled and hangs.

[How]
Port the DCN35 fixes to DCN314.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Add logging resource checks
Charlene Liu [Thu, 28 Dec 2023 18:19:33 +0000 (13:19 -0500)]
drm/amd/display: Add logging resource checks

[Why]
When mapping resources, resources could be unavailable.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Clean up errors in amdgpu_rlc.c
chenxuebing [Thu, 11 Jan 2024 02:10:02 +0000 (02:10 +0000)]
drm/amdgpu: Clean up errors in amdgpu_rlc.c

Fix the following errors reported by checkpatch:

ERROR: space prohibited before that '++' (ctx:WxB)

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd: Clean up errors in sdma_v2_4.c
chenxuebing [Thu, 11 Jan 2024 02:07:05 +0000 (02:07 +0000)]
drm/amd: Clean up errors in sdma_v2_4.c

Fix the following errors reported by checkpatch:

ERROR: that open brace { should be on the previous line
ERROR: trailing statements should be on next line

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/amdgpu: Clean up errors in amdgpu_umr.h
chenxuebing [Thu, 11 Jan 2024 02:03:33 +0000 (02:03 +0000)]
drm/amd/amdgpu: Clean up errors in amdgpu_umr.h

Fix the following errors reported by checkpatch:

spaces required around that '=' (ctx:VxV)

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Clean up errors in amdgpu_atomfirmware.h
chenxuebing [Thu, 11 Jan 2024 02:01:23 +0000 (02:01 +0000)]
drm/amdgpu: Clean up errors in amdgpu_atomfirmware.h

Fix the following errors reported by checkpatch:

ERROR: "foo* bar" should be "foo *bar"

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Clean up errors in clearstate_gfx9.h
chenxuebing [Thu, 11 Jan 2024 01:59:03 +0000 (01:59 +0000)]
drm/amdgpu: Clean up errors in clearstate_gfx9.h

Fix the following errors reported by checkpatch:

ERROR: that open brace { should be on the previous line

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Clean up errors in navi10_ih.c
chenxuebing [Thu, 11 Jan 2024 01:54:42 +0000 (01:54 +0000)]
drm/amdgpu: Clean up errors in navi10_ih.c

Fix the following errors reported by checkpatch:

ERROR: that open brace { should be on the previous line

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/radeon: check PS, WS index
Alexander Richards [Thu, 11 Jan 2024 15:04:49 +0000 (16:04 +0100)]
drm/radeon: check PS, WS index

Theoretically, it would be possible for a buggy or malicious VBIOS to
overwrite past the bounds of the passed parameters (or its own
workspace); add bounds checking to prevent this from happening.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093
Signed-off-by: Alexander Richards <electrodeyt@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: check PS, WS index
Alexander Richards [Thu, 11 Jan 2024 15:04:48 +0000 (16:04 +0100)]
drm/amdgpu: check PS, WS index

Theoretically, it would be possible for a buggy or malicious VBIOS to
overwrite past the bounds of the passed parameters (or its own
workspace); add bounds checking to prevent this from happening.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093
Signed-off-by: Alexander Richards <electrodeyt@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Add Replay IPS register for DMUB command table
Alvin Lee [Tue, 2 Jan 2024 17:02:39 +0000 (12:02 -0500)]
drm/amd/display: Add Replay IPS register for DMUB command table

- Introduce a new Replay mode for DMUB version 0.0.199.0

Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Init link enc resources in dc_state only if res_pool presents
Dillon Varone [Fri, 29 Dec 2023 02:36:39 +0000 (21:36 -0500)]
drm/amd/display: Init link enc resources in dc_state only if res_pool presents

[Why & How]
res_pool is not initialized in all situations such as virtual
environments, and therefore link encoder resources should not be
initialized if res_pool is NULL.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Allow IPS2 during Replay
Nicholas Kazlauskas [Thu, 7 Dec 2023 19:12:03 +0000 (14:12 -0500)]
drm/amd/display: Allow IPS2 during Replay

[Why & How]
Add regkey to block video playback in IPS2 by default

Allow idle optimizations in the same spot we allow Replay for
video playback usecases.

Avoid sending it when there's an external display connected by
modifying the allow idle checks to check for active non-eDP screens.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
Srinivasan Shanmugam [Wed, 10 Jan 2024 15:28:35 +0000 (20:58 +0530)]
drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'

In link_set_dsc_pps_packet(), 'struct display_stream_compressor *dsc'
was dereferenced in a DC_LOGGER_INIT(dsc->ctx->logger); before the 'dsc'
NULL pointer check.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:905 link_set_dsc_pps_packet() warn: variable dereferenced before check 'dsc' (see line 903)

Cc: stable@vger.kernel.org
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Wenjing Liu <wenjing.liu@amd.com>
Cc: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdkfd: Fix variable dereferenced before NULL check in 'kfd_dbg_trap_device_snaps...
Srinivasan Shanmugam [Fri, 5 Jan 2024 05:57:48 +0000 (11:27 +0530)]
drm/amdkfd: Fix variable dereferenced before NULL check in 'kfd_dbg_trap_device_snapshot()'

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c:1024 kfd_dbg_trap_device_snapshot() warn: variable dereferenced before check 'entry_size' (see line 1021)

Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Align ras block enum with firmware
Hawking Zhang [Fri, 29 Dec 2023 03:35:34 +0000 (11:35 +0800)]
drm/amdgpu: Align ras block enum with firmware

Driver and firmware share the same ras block enum.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: replace MCA macro with ACA for XGMI
Yang Wang [Fri, 24 Nov 2023 08:46:25 +0000 (16:46 +0800)]
drm/amdgpu: replace MCA macro with ACA for XGMI

use new ACA macro to instead of MCA

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Log deferred error separately
Candice Li [Wed, 3 Jan 2024 07:19:19 +0000 (15:19 +0800)]
drm/amdgpu: Log deferred error separately

Separate deferred error from UE and CE and log it
individually.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Do bad page retirement for deferred errors
Candice Li [Wed, 10 Jan 2024 07:30:31 +0000 (15:30 +0800)]
drm/amdgpu: Do bad page retirement for deferred errors

Needs to do bad page retirement for deferred errors.

v2: Drop unused dev_info.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add xgmi v6.4.0 ACA support
Yang Wang [Mon, 20 Nov 2023 05:47:52 +0000 (13:47 +0800)]
drm/amdgpu: add xgmi v6.4.0 ACA support

add xgmi v6.4.0 ACA driver support

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add mmhub v1.8 ACA support
Yang Wang [Fri, 24 Nov 2023 07:15:32 +0000 (15:15 +0800)]
drm/amdgpu: add mmhub v1.8 ACA support

v1:
add mmhub v1.8 ACA driver support

v2:
use macro to define smn address value.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add sdma v4.4.2 ACA support
Yang Wang [Fri, 24 Nov 2023 07:09:48 +0000 (15:09 +0800)]
drm/amdgpu: add sdma v4.4.2 ACA support

v1:
add sdma v4.4.2 ACA driver support

v2:
use macro to define smn address value.
v3: squash in fix for unbalanced irqs

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add gfx v9.4.3 ACA support
Yang Wang [Thu, 16 Nov 2023 06:10:32 +0000 (14:10 +0800)]
drm/amdgpu: add gfx v9.4.3 ACA support

v1:
add gfx v9.4.3 ACA driver support

v2:
use macro to define smn address value.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Check extended configuration space register when system uses large bar
Ma Jun [Mon, 18 Dec 2023 03:32:06 +0000 (11:32 +0800)]
drm/amdgpu: Check extended configuration space register when system uses large bar

Some customer platforms do not enable mmconfig for various reasons,
such as bios bug, and therefore cannot access the GPU extend configuration
space through mmio.

When the system enters the d3cold state and resumes, the amdgpu driver
fails to resume because the extend configuration space registers of
GPU can't be restored. At this point, Usually we only see some failure
dmesg log printed by amdgpu driver, it is difficult to find the root
cause.

Therefor print a warnning message if the system can't access the
extended configuration space register when using large bar.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add umc v12.0 ACA support
Yang Wang [Tue, 14 Nov 2023 02:46:42 +0000 (10:46 +0800)]
drm/amdgpu: add umc v12.0 ACA support

add umc v12.0 ACA driver support

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add aca sysfs support
Yang Wang [Tue, 2 Jan 2024 02:43:23 +0000 (10:43 +0800)]
drm/amdgpu: add aca sysfs support

add aca sysfs node support

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add amdgpu ras aca query interface
Yang Wang [Fri, 24 Nov 2023 06:41:06 +0000 (14:41 +0800)]
drm/amdgpu: add amdgpu ras aca query interface

v1:
add ACA error query interface

v2:
Add a new helper function to determine whether to use ACA or MCA.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/pm: add aca smu backend support for smu v13.0.6
Yang Wang [Tue, 14 Nov 2023 11:43:34 +0000 (19:43 +0800)]
drm/amd/pm: add aca smu backend support for smu v13.0.6

add aca smu backend support for smu v13.0.6.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c
Alex Deucher [Thu, 14 Dec 2023 17:18:45 +0000 (12:18 -0500)]
drm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c

It's used for more than just SR-IOV now, so move it to
amdgpu_gmc.c and rename it to better match the functionality and
update the comments in the code paths to better document
when each path is used and why.  No functional change.

Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun.Liu@amd.com
Cc: Christian.Koenig@amd.com
21 months agodrm/amdgpu: add new INFO IOCTL query for input power
Alex Deucher [Mon, 2 Oct 2023 18:33:15 +0000 (14:33 -0400)]
drm/amdgpu: add new INFO IOCTL query for input power

Some chips provide both average and input power.  Previously
we just exposed average power, add a new query for input
power.

Example userspace:
https://github.com/Umio-Yasuno/libdrm-amdgpu-sys-rs/tree/input_power

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Query boot status if boot failed
Hawking Zhang [Sun, 31 Dec 2023 12:04:05 +0000 (20:04 +0800)]
drm/amdgpu: Query boot status if boot failed

Check and report firmware boot status if it doesn't
reach steady status.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add ACA bank dump debugfs support
Yang Wang [Fri, 24 Nov 2023 08:07:56 +0000 (16:07 +0800)]
drm/amdgpu: add ACA bank dump debugfs support

add ACA bank dump debugfs support

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add ACA kernel hardware error log support
Yang Wang [Fri, 24 Nov 2023 07:43:20 +0000 (15:43 +0800)]
drm/amdgpu: add ACA kernel hardware error log support

add ACA kernel hardware error log support.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Query boot status if discovery failed
Hawking Zhang [Thu, 28 Dec 2023 08:17:31 +0000 (16:17 +0800)]
drm/amdgpu: Query boot status if discovery failed

Check and report boot status if discovery failed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Add ras helper to query boot errors v2
Hawking Zhang [Tue, 2 Jan 2024 11:32:56 +0000 (19:32 +0800)]
drm/amdgpu: Add ras helper to query boot errors v2

Add ras helper function to query boot time gpu
errors.
v2: use aqua_vanjaram smn addressing pattern

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: implement RAS ACA driver framework
Yang Wang [Mon, 13 Nov 2023 06:33:56 +0000 (14:33 +0800)]
drm/amdgpu: implement RAS ACA driver framework

v1:
implement new RAS ACA driver code framework.

v2:
- rename aca_bank_set to aca_banks.
- rename aca_source_xxx to aca_handle_xxx.

v3:
Optimize some function implementation details. (from Hawking's suggestion)

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Init pcie_index/data address as fallback (v2)
Hawking Zhang [Sun, 7 Jan 2024 15:36:17 +0000 (23:36 +0800)]
drm/amdgpu: Init pcie_index/data address as fallback (v2)

To allow using this helper for indirect access when
nbio funcs is not available. For instance, in ip
discovery phase.

v2: define macro for pcie_index/data/index_hi fallback.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: drop psp v13 query_boot_status implementation
Hawking Zhang [Thu, 28 Dec 2023 08:04:02 +0000 (16:04 +0800)]
drm/amdgpu: drop psp v13 query_boot_status implementation

Will replace it with new implementation to cover
boot fails in ip discovery phase.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Replace DRM_* with dev_* in amdgpu_psp.c
Hawking Zhang [Sun, 31 Dec 2023 10:14:23 +0000 (18:14 +0800)]
drm/amdgpu: Replace DRM_* with dev_* in amdgpu_psp.c

So kernel message has the device pcie bdf information,
which helps issue debugging especially in multiple GPU
system.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdkfd: Bump KFD ioctl version
Felix Kuehling [Wed, 3 Jan 2024 23:13:39 +0000 (18:13 -0500)]
drm/amdkfd: Bump KFD ioctl version

This is not strictly a change in the IOCTL API. This version bump is meant
to indicate to user mode the presence of a number of changes and fixes
that enable the management of VA mappings in compute VMs using the GEM_VA
ioctl for DMABufs exported from KFD.

Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Auto-validate DMABuf imports in compute VMs
Felix Kuehling [Wed, 3 Jan 2024 23:13:22 +0000 (18:13 -0500)]
drm/amdgpu: Auto-validate DMABuf imports in compute VMs

DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the
process_info->kfd_bo_list. There is no explicit KFD API call to validate
them or add eviction fences to them.

This patch automatically validates and fences dymanic DMABuf imports when
they are added to a compute VM. Revalidation after evictions is handled
in the VM code.

v2:
* Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate
* Eliminated evicted_user state, use evicted state for VM BOs and user BOs
* Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs
* Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into
  amdgpu_vm_validate, outside the vm->status_lock
* Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds
  without KFD

v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the
reservation with its fences is shared with the export, as long as all
imports are from KFD, with the exports already reserved, validated and
fenced by the KFD restore worker.

v5: Reintroduced separate evicted_user state to simplify the state machine
and CS error handling when amdgpu_vm_validate is called without a ticket.

Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Avoid enum conversion warning
Nathan Chancellor [Wed, 10 Jan 2024 20:46:47 +0000 (13:46 -0700)]
drm/amd/display: Avoid enum conversion warning

Clang warns (or errors with CONFIG_WERROR=y) when performing arithmetic
with different enumerated types, which is usually a bug:

    drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:548:24: error: arithmetic between different enumeration types ('const enum dc_link_rate' and 'const enum dc_lane_count') [-Werror,-Wenum-enum-conversion]
      548 |                         link_cap->link_rate * link_cap->lane_count * LINK_RATE_REF_FREQ_IN_KHZ * 8;
          |                         ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~
    1 error generated.

In this case, there is not a problem because the enumerated types are
basically treated as '#define' values. Add an explicit cast to an
integral type to silence the warning.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1976
Fixes: 5f3bce13266e ("drm/amd/display: Request usb4 bw for mst streams")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/pm: Fix smuv13.0.6 current clock reporting
Lijo Lazar [Thu, 11 Jan 2024 09:58:53 +0000 (15:28 +0530)]
drm/amd/pm: Fix smuv13.0.6 current clock reporting

When current clock is equal to max dpm level clock, the level is not
indicated correctly with *. Fix by comparing current clock against dpm
level value.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
21 months agodrm/amd/pm: Add error log for smu v13.0.6 reset
Lijo Lazar [Thu, 11 Jan 2024 04:17:33 +0000 (09:47 +0530)]
drm/amd/pm: Add error log for smu v13.0.6 reset

For all mode-2 reset fail cases, add error log.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
21 months agodrm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
Srinivasan Shanmugam [Tue, 9 Jan 2024 11:27:26 +0000 (16:57 +0530)]
drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'

Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next
return value still needs NULL check, thus modified from "node" to "rb_node".

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 svm_range_get_range_boundaries() warn: can 'node' even be NULL?

Suggested-by: Philip Yang <Philip.Yang@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: drop exp hw support check for GC 9.4.3
Alex Deucher [Tue, 9 Jan 2024 15:45:42 +0000 (10:45 -0500)]
drm/amdgpu: drop exp hw support check for GC 9.4.3

No longer needed.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
21 months agodrm/amdgpu: move debug options init prior to amdgpu device init
Le Ma [Tue, 9 Jan 2024 10:06:35 +0000 (18:06 +0800)]
drm/amdgpu: move debug options init prior to amdgpu device init

To bring debug options into effect in early initialization phase

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: add debug flag to place fw bo on vram for frontdoor loading
Le Ma [Tue, 9 Jan 2024 09:44:39 +0000 (17:44 +0800)]
drm/amdgpu: add debug flag to place fw bo on vram for frontdoor loading

Use debug_mask=0x8 param to help isolating data path issues
on new systems in early phase.

v2: rename the flag for explicitness (lijo)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agoRevert "drm/amdgpu: add param to specify fw bo location for front-door loading"
Le Ma [Tue, 9 Jan 2024 04:06:25 +0000 (12:06 +0800)]
Revert "drm/amdgpu: add param to specify fw bo location for front-door loading"

This reverts commit c572abffe9f50c8ba33060865449313b3f588c35.

Will use debug module param instead of independent module param.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Fix variable deferencing before NULL check in edp_setup_replay()
Srinivasan Shanmugam [Mon, 8 Jan 2024 15:50:28 +0000 (21:20 +0530)]
drm/amd/display: Fix variable deferencing before NULL check in edp_setup_replay()

In edp_setup_replay(), 'struct dc *dc' & 'struct dmub_replay *replay'
was dereferenced before the pointer 'link' & 'replay' NULL check.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_edp_panel_control.c:947 edp_setup_replay() warn: variable dereferenced before check 'link' (see line 933)

Cc: stable@vger.kernel.org
Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: update regGL2C_CTRL4 value in golden setting
Yifan Zhang [Tue, 9 Jan 2024 01:19:22 +0000 (09:19 +0800)]
drm/amdgpu: update regGL2C_CTRL4 value in golden setting

This patch to update regGL2C_CTRL4 in golden setting.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
21 months agodrm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'
Srinivasan Shanmugam [Thu, 21 Dec 2023 12:43:11 +0000 (18:13 +0530)]
drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu <Monk.Liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Fix unsigned comparison with less than zero in vpe_u1_8_from_fraction()
Srinivasan Shanmugam [Fri, 5 Jan 2024 04:38:58 +0000 (10:08 +0530)]
drm/amdgpu: Fix unsigned comparison with less than zero in vpe_u1_8_from_fraction()

The variables 'numerator' and 'denominator', are unsigned 16-bit integer
types, that can never be less than 0.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:62 vpe_u1_8_from_fraction() warn: unsigned 'numerator' is never less than zero.
drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:63 vpe_u1_8_from_fraction() warn: unsigned 'denominator' is never less than zero.

Cc: Peyton Lee <peytolee@amd.com>
Cc: Lang Yu <lang.yu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Peyton Lee <peyton.lee@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'
Srinivasan Shanmugam [Thu, 4 Jan 2024 09:56:42 +0000 (15:26 +0530)]
drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'

The amdgpu_gmc_vram_checking() function in emulation checks whether
all of the memory range of shared system memory could be accessed by
GPU, from this aspect, -EIO is returned for error scenarios.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:919 gmc_v6_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1103 gmc_v7_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1223 gmc_v8_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2344 gmc_v9_0_hw_init() warn: missing error code? 'r'

Cc: Xiaojian Du <Xiaojian.Du@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Do not program VM_L2_CNTL under SRIOV
Victor Lu [Tue, 19 Dec 2023 16:55:09 +0000 (11:55 -0500)]
drm/amdgpu: Do not program VM_L2_CNTL under SRIOV

VM_L2_CNTL* should not be programmed on driver unload under SRIOV.
These regs are skipped during SRIOV driver init.

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Vignesh Chander <Vignesh.Chander@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_powe...
Srinivasan Shanmugam [Fri, 5 Jan 2024 06:35:09 +0000 (12:05 +0530)]
drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'

In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect
structure type is passed to sizeof() in kzalloc, larger structure types
were used, thus using correct type 'struct phm_ppm_table' fixes the
below:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table vs _ATOM_Tonga_PPM_Table'

Cc: Eric Huang <JinHuiEric.Huang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: update ATHUB_MISC_CNTL offset for athub v3.3
Yifan Zhang [Mon, 8 Jan 2024 08:05:27 +0000 (16:05 +0800)]
drm/amdgpu: update ATHUB_MISC_CNTL offset for athub v3.3

This patch to update ATHUB_MISC_CNTL offset for athub v3.3

v2: correct a typo (Tim)
v3: correct patch title (Lang)

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: update headers for nbio v7.11
Yifan Zhang [Mon, 8 Jan 2024 07:46:12 +0000 (15:46 +0800)]
drm/amdgpu: update headers for nbio v7.11

This patch is to update headers for nbio v7.11.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu/pm: clarify debugfs pm output
Alex Deucher [Thu, 7 Dec 2023 15:36:56 +0000 (10:36 -0500)]
drm/amdgpu/pm: clarify debugfs pm output

On APUs power is SoC power, not just GPU.
Clarify that for UVD/VCE/VCN the IP is powered down,
not disabled which can confusing and lead to concerns
that the IP is actually not available.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: fall back to INPUT power for AVG power via INFO IOCTL
Alex Deucher [Mon, 2 Oct 2023 18:43:06 +0000 (14:43 -0400)]
drm/amdgpu: fall back to INPUT power for AVG power via INFO IOCTL

For backwards compatibility with userspace.

Fixes: 47f1724db4fe ("drm/amd: Introduce `AMDGPU_PP_SENSOR_GPU_INPUT_POWER`")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2897
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: fix avg vs input power reporting on smu7
Alex Deucher [Mon, 2 Oct 2023 18:27:13 +0000 (14:27 -0400)]
drm/amdgpu: fix avg vs input power reporting on smu7

Hawaii, Bonaire, Fiji, and Tonga support average power, the others
support current power.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdkfd: fixes for HMM mem allocation
Dafna Hirschfeld [Sun, 7 Jan 2024 13:07:00 +0000 (15:07 +0200)]
drm/amdkfd: fixes for HMM mem allocation

Fix err return value and reset pgmap->type after checking it.

Fixes: c83dee9b6394 ("drm/amdkfd: add SPM support for SVM")
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: cleanup inconsistent indenting in amdgpu_dm_color
Melissa Wen [Fri, 5 Jan 2024 22:02:09 +0000 (21:02 -0100)]
drm/amd/display: cleanup inconsistent indenting in amdgpu_dm_color

smatch warnings:
amdgpu_dm_update_plane_color_mgmt() warn: inconsistent indenting

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401051643.PPdbmG1U-lkp@intel.com/
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: make a correction on comment
James Zhu [Tue, 2 Jan 2024 20:53:01 +0000 (15:53 -0500)]
drm/amdgpu: make a correction on comment

Use a generic comment for AMDGPU_VM_RESERVED_VRAM size.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agoRevert "drm/amd/display: fix bandwidth validation failure on DCN 2.1"
Ivan Lipski [Sat, 6 Jan 2024 00:40:50 +0000 (19:40 -0500)]
Revert "drm/amd/display: fix bandwidth validation failure on DCN 2.1"

This commit causes dmesg-warn on several IGT tests on DCN 3.1.6: *ERROR*
link_enc_cfg_validate: Invalid link encoder assignments - 0x1c

Affected IGT tests include:
- amdgpu/[amd_assr|amd_plane|amd_hotplug]
- kms_atomic
- kms_color
- kms_flip
- kms_properties
- kms_universal_plane

and some other tests

This reverts commit 3a0fa3bc245ef92838a8296e0055569b8dff94c4.

Cc: Melissa Wen <mwen@igalia.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Ivan Lipski <ivlipski@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdkfd: Fix sparse __rcu annotation warnings
Felix Kuehling [Wed, 3 Jan 2024 23:13:04 +0000 (18:13 -0500)]
drm/amdkfd: Fix sparse __rcu annotation warnings

Properly mark kfd_process->ef as __rcu and consistently use the right
accessor functions.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdkfd: Fix lock dependency warning with srcu
Philip Yang [Fri, 29 Dec 2023 20:19:25 +0000 (15:19 -0500)]
drm/amdkfd: Fix lock dependency warning with srcu

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-yangp #2289 Not tainted
------------------------------------------------------
kworker/0:2/996 is trying to acquire lock:
        (srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0

but task is already holding lock:
        ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}, at:
process_one_work+0x211/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}:
        __flush_work+0x88/0x4f0
        svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu]
        svm_range_set_attr+0xd6/0x14c0 [amdgpu]
        kfd_ioctl+0x1d1/0x630 [amdgpu]
        __x64_sys_ioctl+0x88/0xc0

-> #2 (&info->lock#2){+.+.}-{3:3}:
        __mutex_lock+0x99/0xc70
        amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu]
        restore_process_helper+0x22/0x80 [amdgpu]
        restore_process_worker+0x2d/0xa0 [amdgpu]
        process_one_work+0x29b/0x560
        worker_thread+0x3d/0x3d0

-> #1 ((work_completion)(&(&process->restore_work)->work)){+.+.}-{0:0}:
        __flush_work+0x88/0x4f0
        __cancel_work_timer+0x12c/0x1c0
        kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu]
        __mmu_notifier_release+0xad/0x240
        exit_mmap+0x6a/0x3a0
        mmput+0x6a/0x120
        do_exit+0x322/0xb90
        do_group_exit+0x37/0xa0
        __x64_sys_exit_group+0x18/0x20
        do_syscall_64+0x38/0x80

-> #0 (srcu){.+.+}-{0:0}:
        __lock_acquire+0x1521/0x2510
        lock_sync+0x5f/0x90
        __synchronize_srcu+0x4f/0x1a0
        __mmu_notifier_release+0x128/0x240
        exit_mmap+0x6a/0x3a0
        mmput+0x6a/0x120
        svm_range_deferred_list_work+0x19f/0x350 [amdgpu]
        process_one_work+0x29b/0x560
        worker_thread+0x3d/0x3d0

other info that might help us debug this:
Chain exists of:
  srcu --> &info->lock#2 --> (work_completion)(&svms->deferred_list_work)

Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
        lock((work_completion)(&svms->deferred_list_work));
                        lock(&info->lock#2);
lock((work_completion)(&svms->deferred_list_work));
        sync(srcu);

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Packed socket_id to ras feature mask
Hawking Zhang [Fri, 29 Dec 2023 06:54:35 +0000 (14:54 +0800)]
drm/amdgpu: Packed socket_id to ras feature mask

Initialize RAS feature mask bit[31:29] with socket_id.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: Support poison error injection via ras_ctrl debugfs
Candice Li [Thu, 4 Jan 2024 02:27:10 +0000 (10:27 +0800)]
drm/amdgpu: Support poison error injection via ras_ctrl debugfs

Support poison error injection.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdgpu: correct the cu count for gfx v11
Likun Gao [Fri, 5 Jan 2024 09:33:34 +0000 (17:33 +0800)]
drm/amdgpu: correct the cu count for gfx v11

Correct the algorithm of active CU to skip disabled
sa for gfx v11.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
21 months agodrm/amdgpu: Drop unnecessary sentences about CE and deferred error.
Candice Li [Thu, 4 Jan 2024 01:34:25 +0000 (09:34 +0800)]
drm/amdgpu: Drop unnecessary sentences about CE and deferred error.

Remove "no user action is needed" for correctable and deferred error
to avoid confusion.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: 3.2.266
Aric Cyr [Mon, 18 Dec 2023 00:36:16 +0000 (19:36 -0500)]
drm/amd/display: 3.2.266

This version brings along following fixes:

- Improve z8/z10 support.
- Revert some of the VRR optimization.
- Improve usb4 when using MST.

Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Dpia hpd status not in sync after S4
Meenakshikumar Somasundaram [Tue, 19 Dec 2023 20:51:24 +0000 (15:51 -0500)]
drm/amd/display: Dpia hpd status not in sync after S4

[Why]
Dpia hpd status not in sync causing driver not enabling BW Alloc after
S4.

[How]
Update hpd_status of the link when querying hpd state from dmub in
dpia_query_hpd_status().

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Update z8 latency
Charlene Liu [Thu, 21 Dec 2023 14:42:28 +0000 (09:42 -0500)]
drm/amd/display: Update z8 latency

Adjust z8 latency for performance.

Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agoRevert "drm/amd/display: Fix conversions between bytes and KB"
Daniel Miess [Wed, 20 Dec 2023 15:34:32 +0000 (10:34 -0500)]
Revert "drm/amd/display: Fix conversions between bytes and KB"

This reverts commit d0f639c5869399bf6dde4d694d5f8c0ab8c0ec46.

The previous commit causes failure to light up for 1080p
eDP + 8k HDMI panel combo.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Daniel Miess <daniel.miess@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Request usb4 bw for mst streams
Peichen Huang [Thu, 14 Dec 2023 15:16:34 +0000 (23:16 +0800)]
drm/amd/display: Request usb4 bw for mst streams

[WHY]
When usb4 bandwidth allocation mode is enabled, driver need to request
bandwidth from connection manager. For mst link,  the requested
bandwidth should be big enough for all remote streams.

[HOW]
- If mst link, the requested bandwidth should be the sum of all mst
  streams bandwidth added with dp MTPH overhead.
- Allocate/deallcate usb4 bandwidth when setting dpms on/off.
- When doing display mode validation, driver also need to consider total
  bandwidth of all mst streams for mst link.

Reviewed-by: Cruise Hung <cruise.hung@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Peichen Huang <peichen.huang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: revert "Optimize VRR updates to only necessary ones"
Martin Leung [Wed, 20 Dec 2023 21:56:11 +0000 (16:56 -0500)]
drm/amd/display: revert "Optimize VRR updates to only necessary ones"

This reverts commit 6e4337f695c25162f0296934152506ad596fcebf.

The original commit causes regression in corner case with HDMI at
specific timings. Reverting from staging to get the full suite to
retest.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Martin Leung <martin.leung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: revert "for FPO & SubVP/DRR config program vmin/max"
Martin Leung [Wed, 20 Dec 2023 21:54:05 +0000 (16:54 -0500)]
drm/amd/display: revert "for FPO & SubVP/DRR config program vmin/max"

This reverts commit 6b2b782ad6a25734ae847d1659bea3f613dbb563.

The original commit causes issues with certain features when DRR is
disabled, need to revisit this change later after resolving issues with
new DRR policy.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Martin Leung <martin.leung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: Disconnect phantom pipe OPP from OPTC being disabled
George Shen [Sun, 17 Dec 2023 22:17:57 +0000 (17:17 -0500)]
drm/amd/display: Disconnect phantom pipe OPP from OPTC being disabled

[Why]
If an OPP is used for a different OPTC without first being disconnected
from the previous OPTC, unexpected behaviour can occur. This also
applies to phantom pipes, which is what the current logic missed.

[How]
Disconnect OPPs from OPTC for phantom pipes before disabling OTG master.

Also move the disconnection to before the OTG master disable, since the
register is double buffered.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: To adjust dprefclk by down spread percentage
Martin Tsai [Mon, 18 Dec 2023 08:36:44 +0000 (16:36 +0800)]
drm/amd/display: To adjust dprefclk by down spread percentage

[Why]
Panels show corruption with high refresh rate timings when ssc is
enabled.

[How]
Read down-spread percentage from lut to adjust dprefclk. Issues come
from S0i3 with this commit has been fixed by SMU.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Martin Tsai <martin.tsai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amdkfd: Fix lock dependency warning
Felix Kuehling [Tue, 2 Jan 2024 20:07:44 +0000 (15:07 -0500)]
drm/amdkfd: Fix lock dependency warning

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------
kworker/8:2/2676 is trying to acquire lock:
ffff9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550

but task is already holding lock:
ffff9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}:
       __mutex_lock+0x97/0xd30
       kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
       kfd_ioctl+0x1b2/0x5d0 [amdgpu]
       __x64_sys_ioctl+0x86/0xc0
       do_syscall_64+0x39/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}:
       down_read+0x42/0x160
       svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}:
       __lock_acquire+0x1426/0x2200
       lock_acquire+0xc1/0x2b0
       __flush_work+0x80/0x550
       __cancel_work_timer+0x109/0x190
       svm_range_bo_release+0xdc/0x1c0 [amdgpu]
       svm_range_free+0x175/0x180 [amdgpu]
       svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&svms->lock);
                               lock(&mm->mmap_lock);
                               lock(&svms->lock);
  lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also
removed redundant checks that are already done in
amdkfd_fence_enable_signaling.

Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agoMerge tag 'amd-drm-next-6.8-2024-01-05' of https://gitlab.freedesktop.org/agd5f/linux...
Dave Airlie [Mon, 8 Jan 2024 23:07:42 +0000 (09:07 +1000)]
Merge tag 'amd-drm-next-6.8-2024-01-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.8-2024-01-05:

amdgpu:
- VRR fixes
- PSR-SU fixes
- SubVP fixes
- DCN 3.5 fixes
- Documentation updates
- DMCUB fixes
- DML2 fixes
- UMC 12.0 updates
- GPUVM fix
- Misc code cleanups and whitespace cleanups
- DP MST fix
- Let KFD sync with GPUVM fences
- GFX11 reset fix
- SMU 13.0.6 fixes
- VSC fix for DP/eDP
- Navi12 display fix
- RN/CZN system aperture fix
- DCN 2.1 bandwidth validation fix
- DCN INIT cleanup

amdkfd:
- SVM fixes
- Revert TBA/TMA location change

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240105220522.4976-1-alexander.deucher@amd.com
21 months agodrm/amd/display: Allow z8/z10 from driver
Charlene Liu [Sat, 16 Dec 2023 00:41:57 +0000 (19:41 -0500)]
drm/amd/display: Allow z8/z10 from driver

Copy StutterPeriod from DML2 into DML1 StutterPeriod parameter.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
21 months agodrm/amd/display: fix bandwidth validation failure on DCN 2.1
Melissa Wen [Fri, 29 Dec 2023 16:25:00 +0000 (15:25 -0100)]
drm/amd/display: fix bandwidth validation failure on DCN 2.1

IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of
the test execution, during atomic check, because DC rejects the
bandwidth state for a fb sizing 64x64. The test was previously working
with the deprecated dc_commit_state(). Now using
dc_validate_with_context() approach, the atomic check needs to perform a
full state validation. Therefore, set fast_validation to false in the
dc_validate_global_state call for atomic check.

Cc: stable@vger.kernel.org
Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>