]> www.infradead.org Git - users/willy/linux.git/log
users/willy/linux.git
4 years agodrm/amdkfd: dqm fence memory corruption
Qu Huang [Thu, 28 Jan 2021 12:14:25 +0000 (20:14 +0800)]
drm/amdkfd: dqm fence memory corruption

Amdgpu driver uses 4-byte data type as DQM fence memory,
and transmits GPU address of fence memory to microcode
through query status PM4 message. However, query status
PM4 message definition and microcode processing are all
processed according to 8 bytes. Fence memory only allocates
4 bytes of memory, but microcode does write 8 bytes of memory,
so there is a memory corruption.

Changes since v1:
  * Change dqm->fence_addr as a u64 pointer to fix this issue,
also fix up query_status and amdkfd_fence_wait_timeout function
uses 64 bit fence value to make them consistent.

Signed-off-by: Qu Huang <jinsdb@126.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()
Nirmoy Das [Fri, 26 Mar 2021 15:08:10 +0000 (16:08 +0100)]
drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()

Offset calculation wasn't correct as start addresses are in pfn
not in bytes.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu: Add CP_IB1_BASE_* to gc_10_3_0 headers
Tom St Denis [Fri, 26 Mar 2021 11:07:25 +0000 (07:07 -0400)]
drm/amd/amdgpu: Add CP_IB1_BASE_* to gc_10_3_0 headers

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: Fix DPM level count on aldebaran
Lijo Lazar [Fri, 26 Mar 2021 05:43:14 +0000 (13:43 +0800)]
drm/amd/pm: Fix DPM level count on aldebaran

Firmware returns zero-based max level, increment by one to get
total levels. This fixes the issue of not showing all levels and current
frequency when frequency is at max DPM level.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: unify the interface for gfx state setting
Evan Quan [Thu, 25 Mar 2021 05:16:48 +0000 (13:16 +0800)]
drm/amd/pm: unify the interface for gfx state setting

No need to have special handling for swSMU supported ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: unify the interface for power gating
Evan Quan [Thu, 25 Mar 2021 05:01:09 +0000 (13:01 +0800)]
drm/amd/pm: unify the interface for power gating

No need to have special handling for swSMU supported ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: fix missing static declarations
Evan Quan [Thu, 25 Mar 2021 03:34:31 +0000 (11:34 +0800)]
drm/amd/pm: fix missing static declarations

Add "static" declarations for those APIs used internally.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: unify the interface for loading SMU microcode
Evan Quan [Wed, 24 Mar 2021 08:51:52 +0000 (16:51 +0800)]
drm/amd/pm: unify the interface for loading SMU microcode

No need to have special handling for swSMU supported ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: no need to force MCLK to highest when no display connected
Evan Quan [Tue, 23 Mar 2021 08:30:38 +0000 (16:30 +0800)]
drm/amd/pm: no need to force MCLK to highest when no display connected

Correct the check for vblank short.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Fix build warnings
Lijo Lazar [Wed, 24 Mar 2021 04:56:28 +0000 (12:56 +0800)]
drm/amdgpu: Fix build warnings

Fix header guard and make internal functions static. Fixes the below warnings:

drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_reset.h:24:9: warning: '__AMDUGPU_RESET_H__' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
drivers/gpu/drm/amd/amdgpu/aldebaran.c:110:6: warning: no previous prototype for function 'aldebaran_async_reset' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c:1435:5: warning: no previous prototype for function 'aldebaran_mode2_reset' [-Wmissing-prototypes]

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Enable recovery on aldebaran
Lijo Lazar [Tue, 23 Mar 2021 10:50:50 +0000 (18:50 +0800)]
drm/amdgpu: Enable recovery on aldebaran

Add aldebaran to devices which support recovery

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Add mode2 reset support for aldebaran
Lijo Lazar [Tue, 16 Mar 2021 13:36:28 +0000 (21:36 +0800)]
drm/amdgpu: Add mode2 reset support for aldebaran

v1: Aldebaran uses reset control to support mode2 reset. The sequences to
reset and restore hardware context are specific to a particular
configuration.

v2: Clear bus mastering before reset.
Fix coding style issues, drop unwanted variables and info log.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Make set PG/CG state functions public
Lijo Lazar [Tue, 16 Mar 2021 13:14:40 +0000 (21:14 +0800)]
drm/amdgpu: Make set PG/CG state functions public

Expose PG/CG set states functions for other clients

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Add PSP public function to load a list of FWs
Lijo Lazar [Tue, 16 Mar 2021 12:56:43 +0000 (20:56 +0800)]
drm/amdgpu: Add PSP public function to load a list of FWs

v1: Adds a function to load a list of FWs as passed by the caller. This is
needed as only a select need to loaded for some use cases.

v2: Omit unrelated change, remove info log, fix return value when count is 0

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Add reset control handling to reset workflow
Lijo Lazar [Tue, 16 Mar 2021 12:31:51 +0000 (20:31 +0800)]
drm/amdgpu: Add reset control handling to reset workflow

This prefers reset control based handling if it's implemented
for a particular ASIC. If not, it takes the legacy path. It uses
the legacy method of preparing environment (job, scheduler tasks)
and restoring environment.

v2: remove unused variable (Alex)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Add reset control to amdgpu_device
Lijo Lazar [Tue, 16 Mar 2021 12:19:06 +0000 (20:19 +0800)]
drm/amdgpu: Add reset control to amdgpu_device

v1: Add generic amdgpu_reset_control to handle different types of resets. It
may be added at device, hive or ip level. Each reset control has a list
of handlers associated with it to handle different types of reset. Reset
control is responsible for choosing the right handler given a particular
reset context.

Handler objects may implement a set of functions on how to handle a
particular type of reset.

prepare_env = Prepare environment/software context (not used currently).
prepare_hwcontext = Prepare hardware context for the reset.
perform_reset = Perform the type of reset.
restore_hwcontext = Restore the hw context after reset.
restore_env = Restore the environment after reset (not used currently).

Reset context carries the context of reset, as of now this is based on
the parameters used for current set of resets.

v2: Fix coding style

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: Add support for reset completion on aldebaran
Lijo Lazar [Tue, 16 Mar 2021 11:47:51 +0000 (19:47 +0800)]
drm/amd/pm: Add support for reset completion on aldebaran

v1: On aldebaran, after hardware context restore, another handshake
needs to happen with PMFW so that reset recovery is complete from
PMFW side. Treat this as RESET_COMPLETE event for aldebaran.

v2: Cleanup coding style, info logs

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: Add function to wait for smu events
Lijo Lazar [Tue, 16 Mar 2021 11:34:38 +0000 (19:34 +0800)]
drm/amd/pm: Add function to wait for smu events

v1: Add function to wait for specific event/states from PMFW

v2: Add mutex lock, simplify sequence

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: Modify mode2 msg sequence on aldebaran
Lijo Lazar [Tue, 16 Mar 2021 11:19:09 +0000 (19:19 +0800)]
drm/amd/pm: Modify mode2 msg sequence on aldebaran

v1: During mode2 reset, PCI space is lost after message is sent.
Restore PCI space before waiting for response from firmware.

v2: Move mode2 sequence to aldebaran and update PMFW version.
Handle generic sequence in smu13 without PMFW version check.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu implement tdr advanced mode
Jack Zhang [Mon, 8 Mar 2021 04:41:27 +0000 (12:41 +0800)]
drm/amd/amdgpu implement tdr advanced mode

[Why]
Previous tdr design treats the first job in job_timeout as the bad job.
But sometimes a later bad compute job can block a good gfx job and
cause an unexpected gfx job timeout because gfx and compute ring share
internal GC HW mutually.

[How]
This patch implements an advanced tdr mode.It involves an additinal
synchronous pre-resubmit step(Step0 Resubmit) before normal resubmit
step in order to find the real bad job.

1. At Step0 Resubmit stage, it synchronously submits and pends for the
first job being signaled. If it gets timeout, we identify it as guilty
and do hw reset. After that, we would do the normal resubmit step to
resubmit left jobs.

2. For whole gpu reset(vram lost), do resubmit as the old way.

v2: squash in build fix (Alex)

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: make BO type check less restrictive
Nirmoy Das [Mon, 15 Mar 2021 15:02:37 +0000 (16:02 +0100)]
drm/amdgpu: make BO type check less restrictive

BO with ttm_bo_type_sg type can also have tiling_flag and metadata.
So so BO type check for only ttm_bo_type_kernel.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reported-by: Tom StDenis <Tom.StDenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use amdgpu_bo_user bo for metadata and tiling flag
Nirmoy Das [Mon, 8 Mar 2021 13:03:35 +0000 (14:03 +0100)]
drm/amdgpu: use amdgpu_bo_user bo for metadata and tiling flag

Tiling flag and metadata are only needed for BOs created by
amdgpu_gem_object_create(), so we can remove those from the
base class.

v2: * squash tiling_flags and metadata relared patches into one
    * use BUG_ON for non ttm_bo_type_device type when accessing
    tiling_flags and metadata._
v3: *include to_amdgpu_bo_user

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use amdgpu_bo_create_user() for when possible
Nirmoy Das [Tue, 9 Mar 2021 07:31:25 +0000 (08:31 +0100)]
drm/amdgpu: use amdgpu_bo_create_user() for when possible

Use amdgpu_bo_create_user() for all the BO allocations for
ttm_bo_type_device type.

v2: include amdgpu_amdkfd_alloc_gws() as well it calls amdgpu_bo_create()
    for  ttm_bo_type_device

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: introduce struct amdgpu_bo_user
Nirmoy Das [Fri, 5 Mar 2021 12:00:22 +0000 (13:00 +0100)]
drm/amdgpu: introduce struct amdgpu_bo_user

Implement a new struct amdgpu_bo_user as subclass of
struct amdgpu_bo and a function to created amdgpu_bo_user
bo with a flag to identify the owner.

v2: amdgpu_bo_to_amdgpu_bo_user -> to_amdgpu_bo_user()

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: allow variable BO struct creation
Nirmoy Das [Mon, 8 Mar 2021 13:00:06 +0000 (14:00 +0100)]
drm/amdgpu: allow variable BO struct creation

Allow allocating BO structures with different structure size
than struct amdgpu_bo.

v2: Check bo_ptr_size in all amdgpu_bo_create() caller.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: load balance VCN3 decode as well v8
Christian König [Wed, 10 Feb 2021 12:55:13 +0000 (13:55 +0100)]
drm/amdgpu: load balance VCN3 decode as well v8

Add VCN3 IB parsing to figure out to which instance we can send the
stream for decode.

v2: remove VCN instance limit as well, fix amdgpu_cs_find_mapping,
    check supported formats instead of unsupported.
v3: fix typo and error handling
v4: make sure the message BO is CPU accessible
v5: fix addr calculation once more
v6: only check message buffers
v7: fix constant and use defines
v8: fix create msg calculation

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: share scheduler score on VCN3 instances
Christian König [Tue, 2 Feb 2021 12:13:29 +0000 (13:13 +0100)]
drm/amdgpu: share scheduler score on VCN3 instances

The VCN3 instances can do both decode as well as encode.

Share the scheduler load balancing score and remove fixing encode to
only the second instance.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add the sched_score to amdgpu_ring_init
Christian König [Tue, 2 Feb 2021 12:05:49 +0000 (13:05 +0100)]
drm/amdgpu: add the sched_score to amdgpu_ring_init

Allow separate ring to share the same scheduler score.

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: check fb of primary plane
Sefa Eyeoglu [Tue, 16 Mar 2021 21:50:06 +0000 (22:50 +0100)]
drm/amd/display: check fb of primary plane

Sometimes the primary plane might not be initialized (yet), which
causes dm_check_crtc_cursor to divide by zero.
Apparently a weird state before a S3-suspend causes the aforementioned
divide-by-zero error when resuming from S3.  This was explained in
bug 212293 on Bugzilla.

To avoid this divide-by-zero error we check if the primary plane's fb
isn't NULL.  If it's NULL the src_w and src_h attributes will be 0,
which would cause a divide-by-zero.

This fixes Bugzilla report 212293
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212293

Fixes: 12f4849a1cfd69f3 ("drm/amd/display: check cursor scaling")
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Sefa Eyeoglu <contact@scrumplex.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Allow idle optimization based on vblank.
Bindu Ramamurthy [Tue, 16 Mar 2021 21:08:47 +0000 (17:08 -0400)]
drm/amd/display: Allow idle optimization based on vblank.

[Why]
idle optimization was being disabled after commit.

[How]
check vblank count for display off and enable idle optimization based on this count.
Also,check added to ensure vblank count does not decrement, when count reaches 0.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd: Fix a typo in two different sentences
Bhaskar Chowdhury [Mon, 22 Mar 2021 21:06:12 +0000 (02:36 +0530)]
drm/amd: Fix a typo in two different sentences

s/defintion/definition/ .....two different places.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu/gfx_v7_0: Trivial typo fixes
Bhaskar Chowdhury [Thu, 25 Mar 2021 08:53:24 +0000 (14:23 +0530)]
drm/amd/amdgpu/gfx_v7_0: Trivial typo fixes

s/acccess/access/
s/inferface/interface/
s/sequnce/sequence/  .....two different places.
s/retrive/retrieve/
s/sheduling/scheduling/
s/independant/independent/
s/wether/whether/ ......two different places.
s/emmit/emit/
s/synce/sync/

Reviewed-by: Nirmoy Das<nirmoy.das@amd.com>
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon/r600_cs: Few typo fixes
Bhaskar Chowdhury [Wed, 24 Mar 2021 23:29:41 +0000 (04:59 +0530)]
drm/radeon/r600_cs: Few typo fixes

s/miror/mirror/
s/needind/needing/
s/informations/information/

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamdgpu: securedisplay: simplify i2c hexdump output
Arnd Bergmann [Wed, 24 Mar 2021 13:36:52 +0000 (14:36 +0100)]
amdgpu: securedisplay: simplify i2c hexdump output

A previous fix I did left a rather complicated loop in
amdgpu_securedisplay_debugfs_write() for what could be expressed in a
simple sprintf, as Rasmus pointed out.

This drops the leading 0x for each byte, but is otherwise
much nicer.

Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Ensure that the modifier requested is supported by plane.
Mark Yacoub [Wed, 24 Mar 2021 20:16:24 +0000 (16:16 -0400)]
drm/amdgpu: Ensure that the modifier requested is supported by plane.

On initializing the framebuffer, call drm_any_plane_has_format to do a
check if the modifier is supported. drm_any_plane_has_format calls
dm_plane_format_mod_supported which is extended to validate that the
modifier is on the list of the plane's supported modifiers.

The bug was caught using igt-gpu-tools test: kms_addfb_basic.addfb25-bad-modifier

Tested on ChromeOS Zork by turning on the display, running an overlay
test, and running a YT video.

=== Changes from v1 ===
Explicitly handle DRM_FORMAT_MOD_INVALID modifier.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Mark Yacoub <markyacoub@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: Convert sysfs sprintf/snprintf family to sysfs_emit
Tian Tao [Wed, 24 Mar 2021 09:17:41 +0000 (17:17 +0800)]
drm/amd/pm: Convert sysfs sprintf/snprintf family to sysfs_emit

Fix the following coccicheck warning:
drivers/gpu/drm/amd/pm/amdgpu_pm.c:1940:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:1978:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2022:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:294:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:154:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:496:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:512:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:1740:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:1667:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2074:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2047:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2768:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2738:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2442:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3246:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3253:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2458:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3047:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3133:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3209:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3216:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2410:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2496:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2470:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2426:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2965:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2972:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3006:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu/drm/amd/pm/amdgpu_pm.c:3013:8-16: WARNING:
use scnprintf or sprintf

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Convert sysfs sprintf/snprintf family to sysfs_emit
Tian Tao [Wed, 24 Mar 2021 09:17:40 +0000 (17:17 +0800)]
drm/amdgpu: Convert sysfs sprintf/snprintf family to sysfs_emit

Fix the following coccicheck warning:
drivers/gpu//drm/amd/amdgpu/amdgpu_ras.c:434:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:220:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:249:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/df_v3_6.c:208:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_psp.c:2973:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:75:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:112:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:58:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:93:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:125:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:52:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:71:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:140:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:164:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:186:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:208:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_atombios.c:1916:8-16: WARNING:
use scnprintf or sprintf

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon/radeon_pm: Convert sysfs sprintf/snprintf family to sysfs_emit
Tian Tao [Wed, 24 Mar 2021 06:47:55 +0000 (14:47 +0800)]
drm/radeon/radeon_pm: Convert sysfs sprintf/snprintf family to sysfs_emit

Fix the following coccicheck warning:
drivers/gpu//drm/radeon/radeon_pm.c:521:9-17: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:475:8-16: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:418:8-16: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:363:8-16: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:734:8-16: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:688:8-16: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:704:8-16: WARNING: use scnprintf or
sprintf
drivers/gpu//drm/radeon/radeon_pm.c:755:8-16: WARNING: use scnprintf or
sprintf

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/pm: bail on sysfs/debugfs queries during platform suspend
Alex Deucher [Wed, 24 Mar 2021 21:09:41 +0000 (17:09 -0400)]
drm/amdgpu/pm: bail on sysfs/debugfs queries during platform suspend

The GPU is in the process of being shutdown.  Spurious queries during
suspend and resume can put the SMU into a bad state.  Runtime PM is
handled dynamically so we check if we are in non-runtime suspend.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/pm: mark pcie link/speed arrays as const
Alex Deucher [Wed, 24 Mar 2021 03:48:49 +0000 (23:48 -0400)]
drm/amdgpu/pm: mark pcie link/speed arrays as const

They are read only.

Noticed-by: Dave Airlie <airlied@linux.ie>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: remove irq_src->data handling
Christian König [Fri, 19 Mar 2021 12:24:03 +0000 (13:24 +0100)]
drm/amdgpu: remove irq_src->data handling

That is unused for quite some time now.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Removing unused code from dmub_cmd.h
Anson Jacob [Tue, 23 Mar 2021 20:43:42 +0000 (16:43 -0400)]
drm/amd/display: Removing unused code from dmub_cmd.h

Removing code that is not used at the moment.

Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Fix check for RAS support
Luben Tuikov [Fri, 12 Mar 2021 00:11:01 +0000 (19:11 -0500)]
drm/amdgpu: Fix check for RAS support

Use positive logic to check for RAS
support. Rename the function to actually indicate
what it is testing for. Essentially, make the
function a predicate with the correct name.

Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Use appropriate DRM_DEBUG_... level
Luben Tuikov [Sat, 20 Mar 2021 03:49:38 +0000 (23:49 -0400)]
drm/amd/display: Use appropriate DRM_DEBUG_... level

Convert IRQ-based prints from DRM_DEBUG_DRIVER to
the appropriate DRM log type, since IRQ-based
prints drown out the rest of the driver's
DRM_DEBUG_DRIVER messages.

v2: Update as per feedback to fine-tune for each
type of DRM log level.

Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Set amdgpu.noretry=1 for Arcturus
Philip Cox [Wed, 24 Mar 2021 13:15:45 +0000 (09:15 -0400)]
drm/amdgpu: Set amdgpu.noretry=1 for Arcturus

Setting amdgpu.noretry=1 as default for Arcturus.

Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: added support for dynamic GECC
John Clements [Wed, 24 Mar 2021 13:13:17 +0000 (21:13 +0800)]
drm/amdgpu: added support for dynamic GECC

updated host to send boot config to psp to enable GECC for sienna cichlid

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: update host to psp interface
John Clements [Wed, 24 Mar 2021 13:12:06 +0000 (21:12 +0800)]
drm/amdgpu: update host to psp interface

added interface support for setting boot config

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: Update aldebaran pmfw interface
Lijo Lazar [Tue, 23 Mar 2021 12:53:12 +0000 (20:53 +0800)]
drm/amd/pm: Update aldebaran pmfw interface

Update aldebaran PMFW interfaces to version 0x6

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: move vram recover into sriov full access
Horace Chen [Tue, 23 Mar 2021 06:22:22 +0000 (14:22 +0800)]
drm/amdgpu: move vram recover into sriov full access

[what]
currently driver recover vram after full access, which may hit
a corner case that meanwhile another whole gpu reset may be
triggered by another VF, which will cause vram recover fail
then fail the whole device reset.

[how]
move the recover vram into full access. So another bad VF will
not disturb the recover sequence for this vf.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: drop redundant and unneeded BACO APIs V2
Evan Quan [Fri, 19 Mar 2021 08:22:17 +0000 (16:22 +0800)]
drm/amd/pm: drop redundant and unneeded BACO APIs V2

Use other APIs which are with the same functionality but much
more clean.

V2: drop mediate unneeded interface

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: label these APIs used internally as static
Evan Quan [Fri, 19 Mar 2021 06:00:15 +0000 (14:00 +0800)]
drm/amd/pm: label these APIs used internally as static

Also drop unnecessary header file and declarations.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: make DAL communicate with SMU through unified interfaces
Evan Quan [Fri, 19 Mar 2021 04:15:47 +0000 (12:15 +0800)]
drm/amd/pm: make DAL communicate with SMU through unified interfaces

No need to have special handlings for swSMU supported ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/display: fix memory leak for dimgrey cavefish
Alex Deucher [Tue, 23 Mar 2021 16:59:26 +0000 (12:59 -0400)]
drm/amdgpu/display: fix memory leak for dimgrey cavefish

We need to clean up the dcn3 clk_mgr.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamdgpu: fix gcc -Wrestrict warning
Arnd Bergmann [Tue, 23 Mar 2021 13:04:20 +0000 (14:04 +0100)]
amdgpu: fix gcc -Wrestrict warning

gcc warns about an sprintf() that uses the same buffer as source
and destination, which is undefined behavior in C99:

drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c: In function 'amdgpu_securedisplay_debugfs_write':
drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:141:6: error: 'sprintf' argument 3 overlaps destination object 'i2c_output' [-Werror=restrict]
  141 |      sprintf(i2c_output, "%s 0x%X", i2c_output,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  142 |       securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:97:7: note: destination object referenced by 'restrict'-qualified argument 1 was declared here
   97 |  char i2c_output[256];
      |       ^~~~~~~~~~

Rewrite it to remember the current offset into the buffer instead.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamdgpu: avoid incorrect %hu format string
Arnd Bergmann [Mon, 22 Mar 2021 11:54:42 +0000 (12:54 +0100)]
amdgpu: avoid incorrect %hu format string

clang points out that the %hu format string does not match the type
of the variables here:

drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                  version_major, version_minor);
                                  ^~~~~~~~~~~~~
include/drm/drm_print.h:498:19: note: expanded from macro 'DRM_ERROR'
        __drm_err(fmt, ##__VA_ARGS__)
                  ~~~    ^~~~~~~~~~~

Change it to a regular %u, the same way a previous patch did for
another instance of the same warning.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrivers: gpu: Remove duplicate include of amdgpu_hdp.h
Wan Jiabing [Mon, 22 Mar 2021 12:02:25 +0000 (20:02 +0800)]
drivers: gpu: Remove duplicate include of amdgpu_hdp.h

amdgpu_hdp.h has been included at line 91, so remove
the duplicate include.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug
Qu Huang [Sun, 21 Mar 2021 08:28:18 +0000 (16:28 +0800)]
drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug

Here is the system crash log:
[ 1272.884438] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 1272.884444] IP: [<          (null)>]           (null)
[ 1272.884447] PGD 825b09067 PUD 8267c8067 PMD 0
[ 1272.884452] Oops: 0010 [#1] SMP
[ 1272.884509] CPU: 13 PID: 3485 Comm: cat Kdump: loaded Tainted: G
[ 1272.884515] task: ffff9a38dbd4d140 ti: ffff9a37cd3b8000 task.ti:
ffff9a37cd3b8000
[ 1272.884517] RIP: 0010:[<0000000000000000>]  [<          (null)>]
(null)
[ 1272.884520] RSP: 0018:ffff9a37cd3bbe68  EFLAGS: 00010203
[ 1272.884522] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
0000000000014d5f
[ 1272.884524] RDX: fffffffffffffff4 RSI: 0000000000000001 RDI:
ffff9a38aca4d200
[ 1272.884526] RBP: ffff9a37cd3bbed0 R08: ffff9a38dcd5f1a0 R09:
ffff9a31ffc07300
[ 1272.884527] R10: ffff9a31ffc07300 R11: ffffffffaddd5e9d R12:
ffff9a38b4e0fb00
[ 1272.884529] R13: 0000000000000001 R14: ffff9a37cd3bbf18 R15:
ffff9a38aca4d200
[ 1272.884532] FS:  00007feccaa67740(0000) GS:ffff9a38dcd40000(0000)
knlGS:0000000000000000
[ 1272.884534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1272.884536] CR2: 0000000000000000 CR3: 00000008267c0000 CR4:
00000000003407e0
[ 1272.884537] Call Trace:
[ 1272.884544]  [<ffffffffade68940>] ? seq_read+0x130/0x440
[ 1272.884548]  [<ffffffffade40f8f>] vfs_read+0x9f/0x170
[ 1272.884552]  [<ffffffffade41e4f>] SyS_read+0x7f/0xf0
[ 1272.884557]  [<ffffffffae374ddb>] system_call_fastpath+0x22/0x27
[ 1272.884558] Code:  Bad RIP value.
[ 1272.884562] RIP  [<          (null)>]           (null)
[ 1272.884564]  RSP <ffff9a37cd3bbe68>
[ 1272.884566] CR2: 0000000000000000

Signed-off-by: Qu Huang <jinsdb@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/atomic: Couple of typo fixes
Bhaskar Chowdhury [Sat, 20 Mar 2021 18:36:42 +0000 (00:06 +0530)]
drm/atomic: Couple of typo fixes

s/seralization/serialization/
s/parallism/parallelism/

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: don't evict if not initialized
Tong Zhang [Sun, 21 Mar 2021 15:19:07 +0000 (11:19 -0400)]
drm/radeon: don't evict if not initialized

TTM_PL_VRAM may not initialized at all when calling
radeon_bo_evict_vram(). We need to check before doing eviction.

[    2.160837] BUG: kernel NULL pointer dereference, address: 0000000000000020
[    2.161212] #PF: supervisor read access in kernel mode
[    2.161490] #PF: error_code(0x0000) - not-present page
[    2.161767] PGD 0 P4D 0
[    2.163088] RIP: 0010:ttm_resource_manager_evict_all+0x70/0x1c0 [ttm]
[    2.168506] Call Trace:
[    2.168641]  radeon_bo_evict_vram+0x1c/0x20 [radeon]
[    2.168936]  radeon_device_fini+0x28/0xf9 [radeon]
[    2.169224]  radeon_driver_unload_kms+0x44/0xa0 [radeon]
[    2.169534]  radeon_driver_load_kms+0x174/0x210 [radeon]
[    2.169843]  drm_dev_register+0xd9/0x1c0 [drm]
[    2.170104]  radeon_pci_probe+0x117/0x1a0 [radeon]

Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: replace per_device_list by array
Alex Sierra [Wed, 1 Apr 2020 21:35:06 +0000 (16:35 -0500)]
drm/amdgpu: replace per_device_list by array

Remove per_device_list from kfd_process and replace it with a
kfd_process_device pointers array of MAX_GPU_INSTANCES size. This helps
to manage the kfd_process_devices binded to a specific kfd_process.
Also, functions used by kfd_chardev to iterate over the list were
removed, since they are not valid anymore. Instead, it was replaced by a
local loop iterating the array.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: 3.2.128
Aric Cyr [Mon, 15 Mar 2021 03:59:22 +0000 (23:59 -0400)]
drm/amd/display: 3.2.128

This version brings along following fixes:

- Populate socclk entries for dcn2.1
- hide VGH asic specific structs
- Add kernel doc to crc_rd_wrk field
- revert max lb lines change
- Log DMCUB trace buffer events
- Fix debugfs link_settings entry
- revert max lb use by default for n10
- Deallocate IRQ handlers on amdgpu_dm_irq_fini
- Fixed Clock Recovery Sequence
- Fix UBSAN: shift-out-of-bounds warning
- [FW Promotion] Release 0.0.57
- Change input parameter for set_drr
- Use pwrseq instance to determine eDP instance

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Use pwrseq instance to determine eDP instance
Jake Wang [Wed, 17 Mar 2021 18:26:55 +0000 (14:26 -0400)]
drm/amd/display: Use pwrseq instance to determine eDP instance

[Why & How]
Link index doesn't always correspond to the appropriate eDP instance.
We can assume lower link index is a lower eDP instance and set panel
control instance accordingly.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Change input parameter for set_drr
Alvin Lee [Mon, 20 Apr 2020 14:45:27 +0000 (10:45 -0400)]
drm/amd/display: Change input parameter for set_drr

[Why]
Change set_drr to pass in the entire dc_crtc_timing_adjust
structure instead of passing in the parameters individually.
This is to more easily pass in required parameters in the
adjust structure when it gets updated.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: [FW Promotion] Release 0.0.57
Anthony Koo [Fri, 12 Mar 2021 21:48:57 +0000 (16:48 -0500)]
drm/amd/display: [FW Promotion] Release 0.0.57

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Fix UBSAN: shift-out-of-bounds warning
Anson Jacob [Mon, 1 Mar 2021 19:25:44 +0000 (14:25 -0500)]
drm/amd/display: Fix UBSAN: shift-out-of-bounds warning

[Why]
On NAVI14 CONFIG_UBSAN reported shift-out-of-bounds at
display_rq_dlg_calc_20v2.c:304:38

rq_param->misc.rq_c.blk256_height is 0 when chroma(*_c) is invalid.
dml_log2 returns -1023 for log2(0), although log2(0) is undefined.

Which ended up as:
rq_param->dlg.rq_c.swath_height = 1 << -1023

[How]
Fix applied on all dml versions.
1. Ensure dml_log2 is only called if the argument is greater than 0.
2. Subtract req128_l/req128_c from log2_swath_height_l/log2_swath_height_c
   only when it is greater than 0.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Fixed Clock Recovery Sequence
David Galiffi [Thu, 11 Mar 2021 23:13:12 +0000 (18:13 -0500)]
drm/amd/display: Fixed Clock Recovery Sequence

[Why]
When performing clock recovery, if a pre-emphasis adjustment is
requested, but voltage swing remains constant, the the retry counter
will not be reset. This can lead to prematurely failing link training.

[How]
Reset the clock recovery retry counter if an adjustment is requested
for either voltage swing or pre-emphasis.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Calvin Hou <Calvin.Hou@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Deallocate IRQ handlers on amdgpu_dm_irq_fini
Victor Lu [Fri, 5 Mar 2021 16:24:37 +0000 (11:24 -0500)]
drm/amd/display: Deallocate IRQ handlers on amdgpu_dm_irq_fini

[why]
The amdgpu_dm IRQ handlers are not freed during the IRQ teardown.

[how]
Add function to deallocate IRQ handlers on amdgpu_dm_irq_fini step.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: revert max lb use by default for n10
Dmytro Laktyushkin [Thu, 11 Mar 2021 23:25:34 +0000 (18:25 -0500)]
drm/amd/display: revert max lb use by default for n10

This is causing a pstate change underflow regression for
unknown reason

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Fix debugfs link_settings entry
Fangzhi Zuo [Tue, 9 Mar 2021 16:22:36 +0000 (11:22 -0500)]
drm/amd/display: Fix debugfs link_settings entry

1. Catch invalid link_rate and link_count settings
2. Call dc interface to overwrite preferred link settings, and wait
until next stream update to apply the new settings.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Log DMCUB trace buffer events
Leo (Hanghong) Ma [Fri, 19 Feb 2021 21:22:58 +0000 (16:22 -0500)]
drm/amd/display: Log DMCUB trace buffer events

[Why]
We want to log DMCUB trace buffer events as Linux kernel traces.

[How]
Register an IRQ handler for DMCUB outbox0 interrupt in amdgpu_dm,
and log the messages in the DMCUB tracebuffer to a new DMCUB
TRACE_EVENT as soon as we receive the outbox0 IRQ from DMCUB FW.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: revert max lb lines change
Dmytro Laktyushkin [Thu, 11 Mar 2021 14:19:34 +0000 (09:19 -0500)]
drm/amd/display: revert max lb lines change

Some hardware revisions do have a max number of lines limitation
not honouring which can cause pstate switch underflow.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Add kernel doc to crc_rd_wrk field
Wayne Lin [Wed, 10 Mar 2021 08:29:51 +0000 (16:29 +0800)]
drm/amd/display: Add kernel doc to crc_rd_wrk field

[Why]
Receive warning message below:

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:380: warning: Function
parameter or member 'crc_rd_wrk' not described in 'amdgpu_display_manager'

[How]
Add documentation for crc_rd_wrk.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: hide VGH asic specific structs
Dmytro Laktyushkin [Tue, 9 Mar 2021 20:58:18 +0000 (15:58 -0500)]
drm/amd/display: hide VGH asic specific structs

The pmfw structs are specific to the asic and should not be
present in base clk_mgr struct

v2: squash in SI fix (Alex)

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Populate socclk entries for dcn2.1
Roman Li [Wed, 24 Feb 2021 02:28:25 +0000 (21:28 -0500)]
drm/amd/display: Populate socclk entries for dcn2.1

[Why]
Dcn2.1 socclk entries in bandwidth params are not initialized.
They are not used now, but will be needed for dml validation.

[How]
Populate socclk bw params from dpm clock table

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Use correct size when access vram
xinhui pan [Mon, 22 Mar 2021 00:48:54 +0000 (08:48 +0800)]
drm/amdgpu: Use correct size when access vram

To make size is 4 byte aligned. Use &~0x3ULL instead of &3ULL.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: fix gpu reset failure by MP1 state setting
Guchun Chen [Tue, 23 Mar 2021 05:41:00 +0000 (13:41 +0800)]
drm/amd/pm: fix gpu reset failure by MP1 state setting

Instead of blocking varied unsupported MP1 state in upper level,
defer and skip such MP1 state handling in specific ASIC.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: fix MP1 state setting failure in s3 test
Guchun Chen [Mon, 22 Mar 2021 06:07:38 +0000 (14:07 +0800)]
drm/amd/pm: fix MP1 state setting failure in s3 test

Skip PP_MP1_STATE_NONE in MP1 state setting, otherwise, it will
break S3 sequence.

[   50.188269] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* SMC failed to set mp1 state 0, -22
[   50.969901] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[   50.970024] sd 0:0:0:0: [sda] Starting disk
[   50.979723] serial 00:02: activated
[   51.353644] ata4: SATA link down (SStatus 4 SControl 300)
[   51.353669] ata3: SATA link down (SStatus 4 SControl 300)
[   51.353747] ata6: SATA link down (SStatus 4 SControl 300)
[   51.357694] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   51.357711] ata5: SATA link down (SStatus 4 SControl 300)
[   51.357729] ata2: SATA link down (SStatus 4 SControl 300)
[   51.358005] ata1.00: supports DRM functions and may not be fully accessible
[   51.360491] ata1.00: supports DRM functions and may not be fully accessible
[   51.362573] ata1.00: configured for UDMA/133
[   51.362610] ahci 0000:00:17.0: port does not support device sleep
[   51.362946] ata1.00: Enabling discard_zeroes_data
[   52.566438] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
[   54.126316] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
[   54.126317] amdgpu 0000:03:00.0: amdgpu: Failed to SetDriverDramAddr!
[   54.126318] amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!
[   54.126319] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
[   54.126398] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
[   54.126399] PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -62
[   54.126403] PM: Device 0000:03:00.0 failed to resume async: error -62

Fixes: 1689fca0d62aa7 ("drm/amd/pm: fix Navi1x runtime resume failure V2")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/powerplay/smu10: refactor AMDGPU_PP_SENSOR_GPU_LOAD
Shirish S [Fri, 19 Mar 2021 06:58:40 +0000 (12:28 +0530)]
drm/amdgpu/powerplay/smu10: refactor AMDGPU_PP_SENSOR_GPU_LOAD

refactor AMDGPU_PP_SENSOR_GPU_LOAD to ensure code consistency with other
commands

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix amdgpu_res_first()
Nirmoy Das [Fri, 19 Mar 2021 18:54:31 +0000 (19:54 +0100)]
drm/amdgpu: fix amdgpu_res_first()

Fix size comparison in the resource cursor.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/display: restore AUX_DPHY_TX_CONTROL for DCN2.x
Alex Deucher [Tue, 16 Feb 2021 17:22:40 +0000 (12:22 -0500)]
drm/amdgpu/display: restore AUX_DPHY_TX_CONTROL for DCN2.x

Commit 098214999c8f added fetching of the AUX_DPHY register
values from the vbios, but it also changed the default values
in the case when there are no values in the vbios.  This causes
problems with displays with high refresh rates.  To fix this,
switch back to the original default value for AUX_DPHY_TX_CONTROL.

Fixes: 098214999c8f ("drm/amd/display: Read VBIOS Golden Settings Tbl")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1426
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Igor Kravchenko <Igor.Kravchenko@amd.com>
Cc: Aric Cyr <Aric.Cyr@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
4 years agodrm/amd/display: use GFP_ATOMIC in dcn20_resource_construct
Nirmoy Das [Wed, 17 Mar 2021 10:38:11 +0000 (11:38 +0100)]
drm/amd/display: use GFP_ATOMIC in dcn20_resource_construct

Replace GFP_KERNEL with GFP_ATOMIC as dcn20_resource_construct()
can't sleep.

Partially fixes: https://bugzilla.kernel.org/show_bug.cgi?id=212311
as dcn20_resource_construct() also calls into SMU functions which does
mutex_lock().

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display/dc/calcs/dce_calcs: Remove some large variables from the stack
Lee Jones [Fri, 19 Mar 2021 08:24:16 +0000 (08:24 +0000)]
drm/amd/display/dc/calcs/dce_calcs: Remove some large variables from the stack

Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c: In function ‘bw_calcs_init’:
 drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:2726:1: warning: the frame size of 1336 bytes is larger than 1024 bytes [-Wframe-larger-than=]

v2: squash in sizeof fix

Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display/dc/calcs/dce_calcs: Move some large variables from the stack to the...
Lee Jones [Fri, 19 Mar 2021 08:24:15 +0000 (08:24 +0000)]
drm/amd/display/dc/calcs/dce_calcs: Move some large variables from the stack to the heap

Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c: In function ‘calculate_bandwidth’:
 drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:2016:1: warning: the frame size of 1216 bytes is larger than 1024 bytes [-Wframe-larger-than=]

Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display/dc/dce80/dce80_resource: Make local functions static
Lee Jones [Fri, 19 Mar 2021 08:24:17 +0000 (08:24 +0000)]
drm/amd/display/dc/dce80/dce80_resource: Make local functions static

Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:527:17: warning: no previous prototype for ‘dce80_aux_engine_create’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:565:20: warning: no previous prototype for ‘dce80_i2c_hw_create’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:581:20: warning: no previous prototype for ‘dce80_i2c_sw_create’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:715:22: warning: no previous prototype for ‘dce80_link_encoder_create’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:754:22: warning: no previous prototype for ‘dce80_clock_source_create’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:778:6: warning: no previous prototype for ‘dce80_clock_source_destroy’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:868:6: warning: no previous prototype for ‘dce80_validate_bandwidth’ [-Wmissing-prototypes]
 drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_resource.c:913:16: warning: no previous prototype for ‘dce80_validate_global’ [-Wmissing-prototypes]

Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/pm: fix Navi1x runtime resume failure V2
Evan Quan [Wed, 17 Mar 2021 04:10:15 +0000 (12:10 +0800)]
drm/amd/pm: fix Navi1x runtime resume failure V2

The RLC was put into a wrong state on runtime suspend. Thus the RLC
autoload will fail on the succeeding runtime resume. By adding an
intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved,
designed for PnP), we can bring RLC back into the desired state.

V2: integrate INTERRUPTS_ENABLED flag clearing into current
    mp1 state set routines

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Enable VCN/JPEG CG on aldebaran
Lijo Lazar [Tue, 2 Mar 2021 10:26:04 +0000 (18:26 +0800)]
drm/amdgpu: Enable VCN/JPEG CG on aldebaran

Enable clockgating for VCN and JPEG blocks on aldebaran

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Fix a typo
Bhaskar Chowdhury [Thu, 18 Mar 2021 23:18:15 +0000 (04:48 +0530)]
drm/amdgpu: Fix a typo

s/proces/process/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Fix a typo
Bhaskar Chowdhury [Thu, 18 Mar 2021 20:24:14 +0000 (01:54 +0530)]
drm/amdgpu: Fix a typo

s/traing/training/

...Plus the entire sentence construction for better readability.

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon/ttm: Fix memory leak userptr pages
Daniel Gomez [Thu, 18 Mar 2021 08:32:36 +0000 (09:32 +0100)]
drm/radeon/ttm: Fix memory leak userptr pages

If userptr pages have been pinned but not bounded,
they remain uncleared.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Gomez <daniel@qtec.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/ttm: Fix memory leak userptr pages
Daniel Gomez [Wed, 17 Mar 2021 16:08:37 +0000 (17:08 +0100)]
drm/amdgpu/ttm: Fix memory leak userptr pages

If userptr pages have been pinned but not bounded,
they remain uncleared.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Gomez <daniel@qtec.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: skip kfd suspend/resume for S0ix
Alex Deucher [Wed, 17 Mar 2021 02:02:14 +0000 (22:02 -0400)]
drm/amdgpu: skip kfd suspend/resume for S0ix

GFX is in gfxoff mode during s0ix so we shouldn't need to
actually tear anything down and restore it.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop S0ix checks around CG/PG in suspend
Alex Deucher [Tue, 16 Mar 2021 18:18:30 +0000 (14:18 -0400)]
drm/amdgpu: drop S0ix checks around CG/PG in suspend

We handle it properly within the CG/PG functions directly
now.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: skip CG/PG for gfx during S0ix
Pratik Vishwakarma [Tue, 16 Mar 2021 18:15:44 +0000 (14:15 -0400)]
drm/amdgpu: skip CG/PG for gfx during S0ix

Not needed as the device is in gfxoff state so the CG/PG state
is handled just like it would be for gfxoff during runtime gfxoff.

This should also prevent delays on resume.

Reworked from Pratik's original patch (Alex)

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
4 years agodrm/amdgpu: update comments about s0ix suspend/resume
Alex Deucher [Tue, 16 Mar 2021 18:00:36 +0000 (14:00 -0400)]
drm/amdgpu: update comments about s0ix suspend/resume

Provide and explanation as to why we skip GFX and PSP for
S0ix.  GFX goes into gfxoff, same as runtime, so no need
to tear down and re-init.  PSP is part of the always on
state, so no need to touch it.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/swsmu: skip gfx cgpg on s0ix suspend
Alex Deucher [Fri, 12 Mar 2021 21:00:21 +0000 (16:00 -0500)]
drm/amdgpu/swsmu: skip gfx cgpg on s0ix suspend

The SMU expects CGPG to be enabled when entering S0ix.
with this we can re-enable SMU suspend.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: re-enable suspend phase 2 for S0ix
Alex Deucher [Fri, 12 Mar 2021 20:36:04 +0000 (15:36 -0500)]
drm/amdgpu: re-enable suspend phase 2 for S0ix

This really needs to be done to properly tear down
the device.  SMC, PSP, and GFX are still problematic,
need to dig deeper into what aspect of them that is
problematic.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: move s0ix check into amdgpu_device_ip_suspend_phase2 (v3)
Alex Deucher [Fri, 12 Mar 2021 20:33:46 +0000 (15:33 -0500)]
drm/amdgpu: move s0ix check into amdgpu_device_ip_suspend_phase2 (v3)

No functional change.

v2: use correct dev
v3: rework

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: clean up non-DC suspend/resume handling
Alex Deucher [Fri, 19 Mar 2021 20:34:45 +0000 (16:34 -0400)]
drm/amdgpu: clean up non-DC suspend/resume handling

Move the non-DC specific code into the DCE IP blocks similar
to how we handle DC.  This cleans up the common suspend
and resume pathes.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: don't evict vram on APUs for suspend to ram (v4)
Alex Deucher [Wed, 10 Mar 2021 04:20:46 +0000 (23:20 -0500)]
drm/amdgpu: don't evict vram on APUs for suspend to ram (v4)

Vram is system memory, so no need to evict.

v2: use PM_EVENT messages
v3: use correct dev
v4: use driver flags

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: rework S3/S4/S0ix state handling
Alex Deucher [Fri, 12 Mar 2021 20:22:36 +0000 (15:22 -0500)]
drm/amdgpu: rework S3/S4/S0ix state handling

Set flags at the top level pmops callbacks to track
state.  This cleans up the current set of flags and
properly handles S4 on S0ix capable systems.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>