Changfeng [Fri, 14 May 2021 07:28:25 +0000 (15:28 +0800)]
drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
There is problem with 3DCGCG firmware and it will cause compute test
hang on picasso/raven1. It needs to disable 3DCGCG in driver to avoid
compute hang.
Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Refine the error report when flush tlb.
there are 2 hubs to flush in the gmc, to make it easier
to debug when hub flush failed, refine the logs.
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Skip the program of GRBM_CAM* in SRIOV
KMD should not the program these registers,
so skip them in the SRIOV environment.
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yi Li [Fri, 14 May 2021 06:40:39 +0000 (14:40 +0800)]
drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE
When PAGE_SIZE is larger than AMDGPU_PAGE_SIZE, the number of GPU TLB
entries which need to update in amdgpu_map_buffer() should be multiplied
by AMDGPU_GPU_PAGES_IN_CPU_PAGE (PAGE_SIZE / AMDGPU_PAGE_SIZE).
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yi Li <liyi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 13 May 2021 16:30:48 +0000 (12:30 -0400)]
drm/amdkfd: heavy-weight flush TLB after unmap
Need do a heavy-weight TLB flush to make sure we have no more dirty data
in the cache for the unmapped pages.
Define enum TLB_FLUSH_TYPE, add flush_type parameter to
amdgpu_amdkfd_flush_gpu_tlb_pasid.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After "drm/amdgpu: flush TLB if valid PDE turns into PTE" is checked
in, this workaround is not needed.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 12 May 2021 12:02:46 +0000 (08:02 -0400)]
drm/amdgpu: flush TLB if valid PDE turns into PTE
Mapping huge page, 2MB aligned address with 2MB size, uses PDE0 as PTE.
If previously valid PDE0, PDE0.V=1 and PDE0.P=0 turns into PTE, this
requires TLB flush, otherwise page table walker will not read updated
PDE0.
Change page table update mapping to return table_freed flag to indicate
the previously valid PDE may have turned into a PTE if page table is
freed.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Wed, 12 May 2021 08:36:43 +0000 (10:36 +0200)]
drm/radeon: use the dummy page for GART if needed
Imported BOs don't have a pagelist any more.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Fixes: 0575ff3d33cd ("drm/radeon: stop using pages with drm_prime_sg_to_page_addr_arrays v2")
Christian König [Tue, 27 Apr 2021 09:40:40 +0000 (11:40 +0200)]
drm/amdgpu: move struct amdgpu_vram_reservation into vram mgr
Not used outside of that file.
Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 27 Apr 2021 09:36:10 +0000 (11:36 +0200)]
drm/amdgpu: check contiguous flags instead of mm_node
Drop the last user of drm_mm_node.
Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 27 Apr 2021 09:17:59 +0000 (11:17 +0200)]
drm/amdgpu: set the contiguous flag if possible
This avoids reallocating scanout BOs on first pin in a lot of cases.
Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 27 Apr 2021 08:29:10 +0000 (10:29 +0200)]
drm/amdgpu: use cursor functions in amdgpu_bo_in_cpu_visible_vram
One of the last remaining uses of drm_mm_node.
Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Thu, 13 May 2021 13:46:24 +0000 (21:46 +0800)]
drm/amdgpu: add atomfirmware helper function to query fw cap
Fimware capability was changed from 16 bits to 32 bits
for atomfirmware. add helper funciton to query firmware
capability and cache the value at early stage.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Mon, 10 May 2021 22:50:11 +0000 (18:50 -0400)]
drm/amdgpu: Albebaran: MTYPE_NC for coarse-grain remote memory
MTYPE UC was used for a specific use case that ended up not being
implemented. Use NC for better performance for coarse-grained memory where
cache coherence during shader execution is not required.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Mon, 10 May 2021 22:37:56 +0000 (18:37 -0400)]
drm/amdgpu: Arcturus: MTYPE_NC for coarse-grain remote memory
MTYPE UC was used for a specific use case that ended up not being
implemented. Use NC for better performance for coarse-grained memory where
cache coherence during shader execution is not required.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Tue, 11 May 2021 07:35:49 +0000 (15:35 +0800)]
drm/amdkfd: refine the poison data consumption handling
The user applications maybe register the KFD_EVENT_TYPE_HW_EXCEPTION and
KFD_EVENT_TYPE_MEMORY events, driver could notify them when poison data
consumed. Beside that, some applications maybe register SIGBUS signal
hander. These applications will handle poison data by themselves, exit
or re-create context to re-dispatch works.
Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected
Screen flickers rapidly when two 4K 60Hz monitors are in use. This issue
doesn't happen when one monitor is 4K 60Hz (pixelclock 594MHz) and
another one is 4K 30Hz (pixelclock 297MHz).
The issue is gone after setting "power_dpm_force_performance_level" to
"high". Following the indication, we found that the issue occurs when
sclk is too low.
So resolve the issue by disabling sclk switching when there are two
monitors requires high pixelclock (> 297MHz).
v2:
- Only apply the fix to Oland. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 5 May 2021 14:32:27 +0000 (10:32 -0400)]
drm/amdkfd: new range accessible by all GPUs
If xnack is on, new range is created to recover retry vm fault or
created by SVM API calls, set all GPUs have access to the range.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Fri, 7 May 2021 05:56:46 +0000 (13:56 +0800)]
drm/amdgpu: update the method for harvest IP for specific SKU
Update the method of disabling VCN IP for specific SKU for navi1x ASIC,
it will judge whether should add the related IP at the function of
amdgpu_device_ip_block_add().
Dennis Li [Sat, 8 May 2021 09:10:24 +0000 (17:10 +0800)]
drm/amdgpu: add synchronization among waves in the same threadgroup
It is possible that the previous waves have exited before others are
created, so the other waves maybe reuse pyhsical resouces left by
previous ones. Therefore add barrier instruction to synchronize waves within
the same threadgroup.
Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 0dd79532340568 ("drm/amdgpu/display: Implement functions to let DC allocate GPU memory") Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 7 May 2021 20:33:28 +0000 (16:33 -0400)]
drm/amdgpu/display: fix build when CONFIG_DRM_AMD_DC_DCN is not defined
Fixes:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function ‘amdgpu_dm_initialize_drm_device’:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:3726:7: error: implicit declaration of function ‘register_outbox_irq_handlers’; did you mean ‘register_hpd_handlers’? [-Werror=implicit-function-declaration]
3726 | if (register_outbox_irq_handlers(dm->adev)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| register_hpd_handlers
Fixes: 81927e2808be ("drm/amd/display: Support for DMUB AUX") Reviewed-by: Jude Shih <shenshih@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Jude Shish <Jude.Shih@amd.com>
Alex Deucher [Fri, 7 May 2021 20:29:33 +0000 (16:29 -0400)]
drm/amdgpu/display: fix warning when CONFIG_DRM_AMD_DC_DCN is not defined
Fixes:
At top level:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:633:13: warning: ‘dm_dmub_outbox1_low_irq’ defined but not used [-Wunused-function]
633 | static void dm_dmub_outbox1_low_irq(void *interrupt_params)
| ^~~~~~~~~~~~~~~~~~~~~~~
Fixes: 81927e2808be ("drm/amd/display: Support for DMUB AUX") Reviewed-by: Jude Shih <shenshih@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Jude Shih <Jude.Shih@amd.com>
David Ward [Mon, 10 May 2021 09:30:39 +0000 (05:30 -0400)]
drm/amd/display: Initialize attribute for hdcp_srm sysfs file
It is stored in dynamically allocated memory, so sysfs_bin_attr_init() must
be called to initialize it. (Note: "initialization" only sets the .attr.key
member in this struct; it does not change the value of any other members.)
Otherwise, when CONFIG_DEBUG_LOCK_ALLOC=y this message appears during boot:
Gustavo A. R. Silva [Mon, 10 May 2021 20:46:18 +0000 (15:46 -0500)]
drm/amd/pm: Fix out-of-bounds bug
Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
and ACPIState.levels are never actually used as flexible arrays. Those
arrays can be used as simple objects of type
SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
Currently, the code fails because flexible array _levels_ in
struct SISLANDS_SMC_SWSTATE doesn't allow for code that accesses
the first element of initialState.levels and ACPIState.levels
arrays:
because such element cannot be accessed without previously allocating
enough dynamic memory for it to exist (which never actually happens).
So, there is an out-of-bounds bug in this case.
That's why struct SISLANDS_SMC_SWSTATE should only be used as type
for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
created as type for objects initialState, ACPIState and ULVState.
Also, with the change from one-element array to flexible-array member
in commit 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with
flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of
dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be
SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
Fixes: 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Gustavo A. R. Silva [Sun, 9 May 2021 22:55:25 +0000 (17:55 -0500)]
drm/radeon/si_dpm: Fix SMU power state load
Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
and ACPIState.levels are never actually used as flexible arrays. Those
arrays can be used as simple objects of type
SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
Currently, the code fails because flexible array _levels_ in
struct SISLANDS_SMC_SWSTATE doesn't allow for code that access
the first element of initialState.levels and ACPIState.levels
arrays:
because such element cannot exist without previously allocating
any dynamic memory for it (which never actually happens).
That's why struct SISLANDS_SMC_SWSTATE should only be used as type
for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
created as type for objects initialState, ACPIState and ULVState.
Also, with the change from one-element array to flexible-array member
in commit 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array
with flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of
dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be
SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1583 Fixes: 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Gustavo A. R. Silva [Sun, 9 May 2021 22:49:26 +0000 (17:49 -0500)]
drm/radeon/ni_dpm: Fix booting bug
Create new structure NISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
and ACPIState.levels are never actually used as flexible arrays. Those
arrays can be used as simple objects of type
NISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
Currently, the code fails because flexible array _levels_ in
struct NISLANDS_SMC_SWSTATE doesn't allow for code that access
the first element of initialState.levels and ACPIState.levels
arrays:
because such element cannot exist without previously allocating
any dynamic memory for it (which never actually happens).
That's why struct NISLANDS_SMC_SWSTATE should only be used as type
for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
created as type for objects initialState, ACPIState and ULVState.
Also, with the change from one-element array to flexible-array member
in commit 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element
array with flexible-array member in struct NISLANDS_SMC_SWSTATE"), the
size of dpmLevels in struct NISLANDS_SMC_STATETABLE should be fixed to
be NISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
NISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
Bug: https://lore.kernel.org/dri-devel/3eedbe78-1fbd-4763-a7f3-ac5665e76a4a@xenosoft.de/ Fixes: 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element array with flexible-array member in struct NISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: https://lore.kernel.org/dri-devel/9bb5fcbd-daf5-1669-b3e7-b8624b3c36f9@xenosoft.de/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dwaipayan Ray [Sun, 9 May 2021 14:49:23 +0000 (20:19 +0530)]
drm/amd/amdgpu: Fix errors in function documentation
Fix a couple of syntax errors and removed one excess
parameter in the function documentations which lead
to kernel docs build warning.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rouven Czerwinski [Sat, 8 May 2021 18:19:51 +0000 (20:19 +0200)]
drm/amd/display: remove unused function dc_link_perform_link_training
This function is not used anywhere, remove it. It was added in 40dd6bd376a4 ("drm/amd/display: Linux Set/Read link rate and lane count
through debugfs") and moved in fe798de53a7a ("drm/amd/display: Move link
functions from dc to dc_link"), but a user is missing.
Dennis Li [Mon, 10 May 2021 11:08:11 +0000 (19:08 +0800)]
drm/amdgpu: add function to clear MMEA error status for aldebaran
For aldebaran, hardware will not clear error status automatically when
reading error status register, insteadly driver should set clear bit of
the error status register explicitly to clear error status.
Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Mon, 10 May 2021 07:57:04 +0000 (15:57 +0800)]
drm/amdgpu: correct the funtion to clear GCEA error status
The bit 11 of GCEA_ERR_STATUS register is used to clear GCEA error
status.
Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Sun, 2 May 2021 00:16:03 +0000 (20:16 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.65
- Implement INBOX0 messaging for HW lock
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Wang [Fri, 30 Apr 2021 13:09:02 +0000 (09:09 -0400)]
drm/amd/display: Handle potential dpp_inst mismatch with pipe_idx
[Why]
In some pipe harvesting configs, we will select the incorrect
dpp_inst when programming DTO. This is because when any intermediate
pipe is fused, resource instances are no longer in 1:1
correspondence with pipe index.
[How]
When looping through pipes to program DTO, get the dpp_inst
associated with each pipe from res_pool.
Signed-off-by: Anthony Wang <anthony1.wang@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Some DSC tests fail because stream pixel encoding does not change
its value according to the type requested in the DPCD test params.
[How]
Set stream pixel encoding before updating DSC config and configuring
the test pattern.
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Reviewed-by: Hanghong Ma <Hanghong.Ma@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Currently, the code that fills the clock table can miss filling
information about some of the higher voltage states advertised
by the SMU. This, in turn, may cause some of the higher pixel clock
modes (e.g. 8k60) to fail validation.
[How]
Fill the table with one entry per DCFCLK level instead of one entry
per FCLK level. This is needed because the maximum FCLK does not
necessarily need maximum voltage, whereas DCFCLK values from SMU
cover the full voltage range.
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenjing Liu [Tue, 13 Apr 2021 22:44:40 +0000 (18:44 -0400)]
drm/amd/display: minor dp link training refactor
[how]
The change includes some dp link training refactors:
1. break down is_ch_eq_done to checking individual conditions in
its own function.
2. update dpcd_set_training_pattern to take in dc_dp_training_pattern
as input.
3. moving lttpr mode struct definition into link_service_types.h
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Reviewed-by: George Shen <George.Shen@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: DETBufferSizeInKbyte variable type modifications
[Why]
DETBufferSizeInKByte is not expected to be sub-dividable, hence
unsigned int is a better suited data-type. Change it to an array
as well to satisfy current requirements.
[How]
Change the data-type of DETBufferSizeInKByte to an unsigned int
array. Modify the all the variables like DETBufferSizeY,
DETBufferSizeC that are involved in DETBufferSizeInKByte calculations
to unsigned int in all the display_mode_vba_xx files.
Signed-off-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jimmy Kizito [Sat, 10 Apr 2021 02:23:53 +0000 (22:23 -0400)]
drm/amd/display: Expand DP module training API.
[Why & How]
Add functionality useful for DP link training to public interface.
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jimmy Kizito [Wed, 7 Apr 2021 22:56:19 +0000 (18:56 -0400)]
drm/amd/display: Add fallback and abort paths for DP link training.
[Why]
When enabling a DisplayPort stream:
- Optionally reducing link bandwidth between failed link training
attempts should progressively relax training requirements.
- Abandoning link training altogether if a sink is unplugged should
avoid unnecessary training attempts.
[How]
- Add fallback parameter to DP link training function and reduce link
bandwidth between failed training attempts as long as stream bandwidth
requirements are met.
- Add training status for sink unplug and abort training when this
status is reported.
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jimmy Kizito [Mon, 5 Apr 2021 22:05:09 +0000 (18:05 -0400)]
drm/amd/display: Update setting of DP training parameters.
[Why]
Some links are dynamically assigned link encoders on stream enablement.
[How]
Update DisplayPort training parameter determination stage that assumes
link encoder statically assigned to link.
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jimmy Kizito [Thu, 1 Apr 2021 16:59:54 +0000 (12:59 -0400)]
drm/amd/display: Update DPRX detection.
[Why]
Some extra provisions are required during DPRX detection for links which
lack physical HPD and AUX/DDC pins.
[How]
Avoid attempting to access nonexistent physical pins during DPRX
detection.
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Mon, 10 May 2021 03:04:59 +0000 (11:04 +0800)]
drm/amdgpu: covert ras status to kernel errno
The original codes use ras status and kernl errno together in the same
function, which is a wrong code style.
Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Thu, 6 May 2021 16:24:17 +0000 (11:24 -0500)]
drm/amdgpu: Quit RAS initialization earlier if RAS is disabled
If RAS is disabled through amdgpu_ras_enable kernel parameter,
we should quit the RAS initialization eariler to avoid initialization
of some RAS data structure such as sysfs etc.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 28 Apr 2021 22:57:57 +0000 (18:57 -0400)]
drm/amdkfd: handle errors returned by svm_migrate_copy_to_vram/ram
If migration copy failed because process is killed, or out of VRAM or
system memory, pass error code back to caller to handle error
gracefully.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 4 May 2021 06:32:20 +0000 (02:32 -0400)]
drm/amdgpu: Export ras_*_enabled to debugfs
Export the runtime-set "ras_hw_enabled" and
"ras_enabled" to debugfs, for debugging.
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 4 May 2021 06:25:29 +0000 (02:25 -0400)]
drm/amdgpu: Rename to ras_*_enabled
Rename,
ras_hw_supported --> ras_hw_enabled, and
ras_features --> ras_enabled,
to show that ras_enabled is a subset of
ras_hw_enabled, which itself is a subset
of the ASIC capability.
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 4 May 2021 00:43:00 +0000 (20:43 -0400)]
drm/amdgpu: Move up ras_hw_supported
Move ras_hw_supported into struct amdgpu_dev.
The dependency is:
struct amdgpu_ras <== struct amdgpu_dev <== ASIC,
read as "struct amdgpu_ras depends on struct
amdgpu_dev, which depends on the hardware."
This can be loosely understood as, "if RAS is
supported, which is property of the ASIC (struct
amdgpu_dev), then we can access struct
amdgpu_ras."
v2: Fix a typo: must binary AND in ternary cond
in amdgpu_ras.c
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 4 May 2021 00:02:22 +0000 (20:02 -0400)]
drm/amdgpu: Remove redundant ras->supported
Remove redundant ras->supported, as this value
is also stored in adev->ras_features.
Use adev->ras_features, as that supercedes "ras",
since the latter is its member.
The dependency goes like this:
ras <== adev->ras_features <== hw_supported,
and is read as "ras depends on ras_features, which
depends on hw_supported." The arrows show the flow
of information, i.e. the dependency update.
"hw_supported" should also live in "adev".
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Thu, 6 May 2021 05:26:28 +0000 (13:26 +0800)]
drm/amdgpu: update the shader to clear specific SGPRs
Add shader codes to explicitly clear specific SGPRs, such as
flat_scratch_lo, flat_scratch_hi and so on. And also correct the
allocation size of SGPRs in PGM_RSRC1.
Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 23 Apr 2021 19:05:17 +0000 (15:05 -0400)]
drm/amdkfd: add ACPI SRAT parsing for topology
In NPS4 BIOS we need to find the closest numa node when creating
topology io link between cpu and gpu, if PCI driver doesn't set
it.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bas Nieuwenhuizen [Tue, 4 May 2021 09:43:34 +0000 (11:43 +0200)]
drm/amdgpu: Use device specific BO size & stride check.
The builtin size check isn't really the right thing for AMD
modifiers due to a couple of reasons:
1) In the format structs we don't do set any of the tilesize / blocks
etc. to avoid having format arrays per modifier/GPU
2) The pitch on the main plane is pixel_pitch * bytes_per_pixel even
for tiled ...
3) The pitch for the DCC planes is really the pixel pitch of the main
surface that would be covered by it ...
Note that we only handle GFX9+ case but we do this after converting
the implicit modifier to an explicit modifier, so on GFX9+ all
framebuffers should be checked here.
There is a TODO about DCC alignment, but it isn't worse than before
and I'd need to dig a bunch into the specifics. Getting this out in
a reasonable timeframe to make sure it gets the appropriate testing
seemed more important.
Finally as I've found that debugging addfb2 failures is a pita I was
generous adding explicit error messages to every failure case.
Fixes: f258907fdd83 ("drm/amdgpu: Verify bo size can fit framebuffer size on init.") Tested-by: Simon Ser <contact@emersion.fr> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bas Nieuwenhuizen [Wed, 5 May 2021 01:27:49 +0000 (03:27 +0200)]
drm/amdgpu: Init GFX10_ADDR_CONFIG for VCN v3 in DPG mode.
Otherwise tiling modes that require the values form this field
(In particular _*_X) would be corrupted upon video decode.
Copied from the VCN v2 code.
Fixes: 99541f392b4d ("drm/amdgpu: add mc resume DPG mode for VCN3.0")
Reviewed-and-Tested by: Leo Liu <leo.liu@amd.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 4 May 2021 15:00:42 +0000 (11:00 -0400)]
drm/amdgpu: change the default timeout for kernel compute queues
Change to 60s. This matches what we already do in virtualization.
Infinite timeout can lead to deadlocks in the kernel.
Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Mon, 3 May 2021 07:04:10 +0000 (12:34 +0530)]
drm/amdgpu: set vcn mgcg flag for picasso
enable vcn mgcg flag for picasso.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mike Li [Fri, 26 Mar 2021 19:14:12 +0000 (15:14 -0400)]
drm/amdkfd: Update L1 and add L2/3 cache information
The L1 cache information has been updated and the L2/L3
information has been added. The changes have been made
for Vega10 and newer ASICs. There are no changes
for the older ASICs before Vega10.
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fabio M. De Francesco [Tue, 27 Apr 2021 09:44:49 +0000 (11:44 +0200)]
drm/amd/amdgpu/amdgpu_drv.c: Replace drm_modeset_lock_all with drm_modeset_lock
drm_modeset_lock_all() is not needed here, so it is replaced with
drm_modeset_lock(). The crtc list around which we are looping never
changes, therefore the only lock we need is to protect access to
crtc->state.
Suggested-by: Daniel Vetter <daniel@ffwll.ch> Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kees Cook [Mon, 3 May 2021 05:06:07 +0000 (22:06 -0700)]
drm/radeon: Fix off-by-one power_state index heap overwrite
An out of bounds write happens when setting the default power state.
KASAN sees this as:
[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
==================================================================
BUG: KASAN: slab-out-of-bounds in
radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon]
Write of size 4 at addr ffff88810178d858 by task systemd-udevd/157
Without KASAN, this will manifest later when the kernel attempts to
allocate memory that was stomped, since it collides with the inline slab
freelist pointer:
Use a valid power_state index when initializing the "flags" and "misc"
and "misc2" fields.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211537 Reported-by: Erhard F. <erhard_f@mailbox.org> Fixes: a48b9b4edb8b ("drm/radeon/kms/pm: add asic specific callbacks for getting power state (v2)") Fixes: 79daedc94281 ("drm/radeon/kms: minor pm cleanups") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 20 Apr 2021 19:26:46 +0000 (15:26 -0400)]
drm/amdgpu: Add graphics cache rinse packet for sdma 5.0
Add emit mem sync callback for sdma_v5_0
In amdgpu sync object test, three threads created jobs
to send GFX IB and SDMA IB in sequence. After the first
GFX thread joined, sometimes the third thread will reuse
the same physical page to store the SDMA IB. There will
be a risk that SDMA will read GFX IB in the previous physical
page. So it's better to flush the cache before commit sdma IB.
Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 3 May 2021 13:43:21 +0000 (09:43 -0400)]
MAINTAINERS: fix a few more amdgpu tree links
Switch to gitlab.
Fixes: 101c2fae5108d7 ("MAINTAINERS: update radeon/amdgpu/amdkfd git trees") Cc: David Ward <david.ward@gatech.edu> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/pm: Add debugfs node to read private buffer
Add debugfs interface to read region allocated for FW private buffer
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/pm: Add interface to get FW private buffer
v1: Add new interface to get FW private buffer details
v2: Drop domain check
v3: Use amdgpu_bo_kmap to get cpu address
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Fri, 30 Apr 2021 19:31:42 +0000 (15:31 -0400)]
drm/amdkfd: fix no atomics settings in the kfd topology
To account for various PCIe and xGMI setups, check the no atomics settings
for a device in relation to every direct peer.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This version brings improvements across DP, eDP, DMUB, DSC, etc
Signed-off-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Sat, 24 Apr 2021 14:42:47 +0000 (10:42 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.64
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Wu [Fri, 16 Apr 2021 05:33:26 +0000 (13:33 +0800)]
drm/amd/display: Set stream_count to 0 when dc_resource_state_destruct.
[Why]
When hardware need to be reset, driver need to reset stream objects but
dc_resource_state_destruct function omit resetting stream_count. It will
lead page fault if some logic will touch stream object.
[How]
Set stream_count to 0 when dc_resource_state_destruct.
Signed-off-by: Paul Wu <paul.wu@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
George Shen [Fri, 16 Apr 2021 21:35:07 +0000 (17:35 -0400)]
drm/amd/display: Filter out YCbCr420 timing if VSC SDP not supported
[Why]
Per DP specification, YCbCr420 shall use VSC SDP.
[How]
For YCbCr420 timings, fail DP mode timing validation
if DPCD caps do not indicate VSC SDP colorimetry
support.
Signed-off-by: George Shen <george.shen@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: remove checking sink in is_timing_changed
[Why]
Sometimes, such as sleep wake, the link->local sink pointer changed,
but the dc_stream_state->sink pointer is not changed. The checking
of timing_changed reports wrong result, lead to link tear down
unexpected wrongly.
[How]
SST compare local sink, MST compare proper remote link.
Signed-off-by: Calvin Hou <Calvin.Hou@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add audio support for DFP type of active branch is DP case
[Why]
Per DP spec, for active protocol convertor adaptor, DP source should enable audio
for DFP type is DP, HDMI or DP++. Current is_dp_active_dongle() checking is not
precise, which treat branch device default as active dongle. As a result, we will
mistakenly disable audio for DFP type is DP case.
[How]
Make is_dp_active_dongle() checking more precise for active dongle types.
Rename active diongle type as SST branch device in case confusion.
Signed-off-by: Dale Zhao <dale.zhao@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>