Le Ma [Wed, 27 Nov 2019 05:17:17 +0000 (13:17 +0800)]
drm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs
This athub fatal error can be recovered by baco without system-level reboot,
so add a mode to use baco for the recovery. Not affect the default psp reset
situations for now.
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Le Ma [Tue, 26 Nov 2019 14:12:31 +0000 (22:12 +0800)]
drm/amdgpu: add concurrent baco reset support for XGMI
Currently each XGMI node reset wq does not run in parrallel if bound to same
cpu. Make change to bound the xgmi_reset_work item to different cpus.
XGMI requires all nodes enter into baco within very close proximity before
any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit
baco respectively.
To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved.
The case that PSP reset and baco reset coexist within an XGMI hive never exist
and is not in the consideration.
v2: define use_baco flag to simplify the code for xgmi baco sequence
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Le Ma [Fri, 22 Nov 2019 09:56:47 +0000 (17:56 +0800)]
drm/amdgpu: clear ras controller status registers when interrupt occurs
To fix issue that ras controller interrupt cannot be triggered anymore after
one time nbif uncorrectable error. And error count is stored in nbif ras object
for query.
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yong Zhao [Fri, 8 Nov 2019 05:30:49 +0000 (00:30 -0500)]
drm/amdkfd: Eliminate unnecessary kernel queue function pointers
Up to this point, those functions are all the same for all ASICs, so
no need to call them by functions pointers. Removing the function
pointers will greatly increase the code readablity. If there is ever
need for those function pointers, we can add it back then.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
James Zhu [Tue, 3 Dec 2019 20:40:10 +0000 (15:40 -0500)]
drm/amdgpu/gfx: Improvement on EDC GPR workarounds
SPI limits total CS waves in flight per SE to no more than 32 * num_cu and
we need to stuff 40 waves on a CU to completely clean the SGPR. This is
accomplished in the WR by cleaning the SE in two steps, half of the CU per
step.
Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Wed, 4 Dec 2019 07:51:16 +0000 (15:51 +0800)]
drm/amdgpu: add check before enabling/disabling broadcast mode
When security violation from new vbios happens, data fabric is
risky to stop working. So prevent the direct access to DF
mmFabricConfigAccessControl from the new vbios and onwards.
Alex Deucher [Tue, 26 Nov 2019 14:41:46 +0000 (09:41 -0500)]
drm/radeon: fix r1xx/r2xx register checker for POT textures
Shift and mask were reversed. Noticed by chance.
Tested-by: Meelis Roos <mroos@linux.ee> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Likun Gao [Mon, 2 Dec 2019 07:04:35 +0000 (15:04 +0800)]
drm/amdgpu/powerplay: unify smu send message function
Drop smu_send_smc_msg function from ASIC specify structure.
Reuse smu_send_smc_msg_with_param function for smu_send_smc_msg.
Set paramer to 0 for smu_send_msg function, otherwise it will send
with previous paramer value (Not a certain value).
Materialize msg type for smu send message function definition.
Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Mon, 2 Dec 2019 05:44:38 +0000 (13:44 +0800)]
drm/amdgpu: load np fw prior before loading the TAs
Platform TAs will independently toggle DF Cstate.
for instance, get/set topology from xgmi ta. do error
injection from ras ta. In such case, PMFW needs to be
loaded before TAs so that all the subsequent Cstate
calls recieved by PSP FW can be routed to PMFW.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Mon, 2 Dec 2019 05:16:09 +0000 (13:16 +0800)]
drm/amdgpu: drop asd shared memory
asd shared memory is not needed since drivers doesn't
invoke any further cmd to asd directly after the asd
loading. trust application is the one who needs
to talk to asd after the initialization
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Mon, 2 Dec 2019 15:47:38 +0000 (15:47 +0000)]
drm/amd/display: remove redundant assignment to variable v_total
The variable v_total is being initialized with a value that is never
read and it is being updated later with a new value. The initialization
is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
zhengbin [Wed, 27 Nov 2019 09:33:42 +0000 (17:33 +0800)]
drm/amd/powerplay: Remove unneeded variable 'ret' in amdgpu_smu.c
Fixes coccicheck warning:
drivers/gpu/drm/amd/powerplay/amdgpu_smu.c:1192:5-8: Unneeded variable: "ret". Return "0" on line 1195
drivers/gpu/drm/amd/powerplay/amdgpu_smu.c:1945:5-8: Unneeded variable: "ret". Return "0" on line 1961
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Wed, 20 Nov 2019 17:22:42 +0000 (17:22 +0000)]
drm/amd/display: fix double assignment to msg_id field
The msg_id field is being assigned twice. Fix this by replacing the second
assignment with an assignment to msg_size.
Addresses-Coverity: ("Unused value") Fixes: 11a00965d261 ("drm/amd/display: Add PSP block to verify HDCP2.2 steps") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhan liu [Mon, 2 Dec 2019 20:12:27 +0000 (15:12 -0500)]
drm/amd/display: Get NV14 specific ip params as needed
[Why]
NV14 is using its own ip params that's different from other
DCN2.0 ASICs.
[How]
Add ASIC revision check to make sure NV14 gets correct
ip params.
Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhan liu [Mon, 2 Dec 2019 19:54:16 +0000 (14:54 -0500)]
drm/amd/display: Adding NV14 IP Parameters
[Why]
NV14 IP Parameters are missing.
[How]
Add IP Parameters in.
Signed-off-by: Zhan liu <zhan.liu@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Monk Liu [Tue, 26 Nov 2019 11:42:25 +0000 (19:42 +0800)]
drm/amdgpu: fix calltrace during kmd unload(v3)
issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo
why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.
fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fullfill the CSB again during hw_init
to prevent CSB/VRAM lost after S3
v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3
v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever
take care of gfx7/8 as well
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Thu, 28 Nov 2019 16:30:10 +0000 (11:30 -0500)]
drm/amd/display: Drop AMD_EDID_UTILITY defines
We don't use this upstream in the Linux kernel.
Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhan Liu [Thu, 28 Nov 2019 19:12:11 +0000 (14:12 -0500)]
drm/amd/display: Include num_vmid and num_dsc within NV14's resource caps
[Why]
"num_vmid" and "num_dsc" are missing within NV14's resource caps structure.
[How]
Add the missing parts.
Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Monk Liu [Tue, 26 Nov 2019 11:40:08 +0000 (19:40 +0800)]
drm/amdgpu: use CPU to flush vmhub if sched stopped
otherwse the flush_gpu_tlb will hang if we unload the
KMD becuase the schedulers already stopped
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jules Irenge [Tue, 26 Nov 2019 00:35:14 +0000 (00:35 +0000)]
drm: radeon: replace 0 with NULL
Replace 0 with NULL to fix sparse tool warning
warning: Using plain integer as NULL pointer
Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Tue, 26 Nov 2019 12:10:29 +0000 (15:10 +0300)]
drm/amdgpu: Fix a bug in jpeg_v1_0_start()
Originally the last WREG32_SOC15() was a part of the if statement block
but the curly braces are on the wrong line.
Fixes: bb0db70f3f75 ("drm/amdgpu: separate JPEG1.0 code out from VCN1.0") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 20 Nov 2019 22:31:11 +0000 (17:31 -0500)]
drm/amdgpu: move pci handling out of pm ops
The documentation says the that PCI core handles this
for you unless you choose to implement it. Just rely
on the PCI core to handle the pci specific bits.
Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhan liu [Mon, 25 Nov 2019 22:25:18 +0000 (17:25 -0500)]
drm/amd/display: Modify comments to match the code
[Why]
This line of code was modified. However, comments
remained unchanged. As a result, comments and code are
mismatching.
[How]
Modifying comments to reflect code. At the same time,
explaining why the value was changed from 200ms to
3000ms.
Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Mon, 15 Jul 2019 20:18:03 +0000 (16:18 -0400)]
drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.
Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU: 1GB
New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bhawanpreet Lakha [Mon, 25 Nov 2019 15:34:13 +0000 (10:34 -0500)]
drm/amd/display: Null check aconnector in event_property_validate
[Why]
previously event_property_validate was only called after we enabled the display.
But after "Refactor HDCP to handle multiple displays per link" this function
can be called at any time. In certain cases we don't have a aconnector
[How]
Null check aconnector and exit early. This is ok because we only need to check the
ENABLED->DESIRED transition if a connector exists.
Fixes: b1abe5586ffc ("drm/amd/display: Refactor HDCP to handle multiple displays per link") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YueHaibing [Mon, 25 Nov 2019 14:58:43 +0000 (22:58 +0800)]
drm/amd/powerplay: remove set but not used variable 'stretch_amount2'
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c:
In function vegam_populate_clock_stretcher_data_table:
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c:1489:29:
warning: variable stretch_amount2 set but not used [-Wunused-but-set-variable]
It is never used, so can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YueHaibing [Mon, 25 Nov 2019 14:54:45 +0000 (22:54 +0800)]
drm/amd/display: remove set but not used variable 'msg_out'
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c: In function mod_hdcp_hdcp2_enable_encryption:
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c:633:77: warning: variable msg_out set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c: In function mod_hdcp_hdcp2_enable_dp_stream_encryption:
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c:710:77: warning: variable msg_out set but not used [-Wunused-but-set-variable]
It is never used, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Fri, 22 Nov 2019 23:15:04 +0000 (23:15 +0000)]
drm/radeon: remove redundant assignment to variable ret
The variable ret is being initialized with a value that is never
read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nathan Chancellor [Sat, 23 Nov 2019 19:23:36 +0000 (12:23 -0700)]
drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG
Commit b0f3cd3191cd ("drm/amdgpu: remove unnecessary JPEG2.0 code from
VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function:
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r'
is used uninitialized whenever 'while' loop exits because its condition
is false [-Wsometimes-uninitialized]
SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
while ((tmp_ & (mask)) != (expected_value)) { \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use
occurs here
if (r)
^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the
condition if it is always true
SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
^
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
while ((tmp_ & (mask)) != (expected_value)) { \
^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the
variable 'r' to silence this warning
int r;
^
= 0
1 warning generated.
To prevent warnings like this from happening in the future, make the
SOC15_WAIT_ON_RREG macro initialize its ret variable before the while
loop that can time out. This macro's return value is always checked so
it should set ret in both the success and fail path.
Nathan Chancellor [Sat, 23 Nov 2019 19:36:39 +0000 (12:36 -0700)]
drm/amd/display: Use NULL for pointer assignment in copy_stream_update_to_stream
Clang warns:
../drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:1965:26: warning:
expression which evaluates to zero treated as a null pointer constant of
type 'struct dc_dsc_config *' [-Wnon-literal-null-conversion]
update->dsc_config = false;
^~~~~
../drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:1971:25: warning:
expression which evaluates to zero treated as a null pointer constant of
type 'struct dc_dsc_config *' [-Wnon-literal-null-conversion]
update->dsc_config = false;
^~~~~
2 warnings generated.
Fixes: f6fe4053b91f ("drm/amd/display: Use a temporary copy of the current state when updating DSC config") Link: https://github.com/ClangBuiltLinux/linux/issues/777 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Fri, 22 Nov 2019 23:04:07 +0000 (23:04 +0000)]
drm/amd/powerplay: remove redundant assignment to variables HiSidd and LoSidd
The variables HiSidd and LoSidd are being initialized with values that
are never read and are being updated a little later with a new value.
The initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 22 Nov 2019 19:16:55 +0000 (14:16 -0500)]
MAINTAINERS: Drop Rex Zhu for amdgpu powerplay
No longer works on the driver.
Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Krzysztof Kozlowski [Thu, 21 Nov 2019 13:29:30 +0000 (21:29 +0800)]
drm/amd: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^ /\t/' -i */Kconfig
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Fri, 22 Nov 2019 20:15:43 +0000 (14:15 -0600)]
drm/amdgpu: Apply noretry setting for mmhub9.4
Config the translation retry behavior according to noretry
kernel parameter
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
changzhu [Tue, 19 Nov 2019 03:13:29 +0000 (11:13 +0800)]
drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.
After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.
changzhu [Tue, 19 Nov 2019 02:18:39 +0000 (10:18 +0800)]
drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Zhang [Thu, 21 Nov 2019 06:09:08 +0000 (14:09 +0800)]
drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Zhang [Thu, 21 Nov 2019 05:59:28 +0000 (13:59 +0800)]
drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stephen Rothwell [Thu, 21 Nov 2019 03:54:03 +0000 (14:54 +1100)]
merge fix for "ftrace: Rework event_create_dir()"
Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Thu, 21 Nov 2019 16:54:01 +0000 (16:54 +0000)]
drm/amdgpu: remove redundant assignment to pointer write_frame
The pointer write_frame is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Wed, 20 Nov 2019 16:32:46 +0000 (16:32 +0000)]
drm/amdgpu: Update Arcturus golden registers
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Tue, 19 Nov 2019 06:02:57 +0000 (14:02 +0800)]
drm/amdgpu: implement querying ras error count for mmhub9.4
Get mmhub error counter by accessing EDC_CNT registers.
v2: Add mmhub_v9_4_ prefix for local static variable and function
Signed-off-by: Dennis Li <dennis.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Tue, 19 Nov 2019 08:02:28 +0000 (16:02 +0800)]
drm/amdgpu: refine query function of mmhub EDC counter in vg20
Add codes to print the detail EDC info for the subblock of mmhub
v2: Move the EDC_CNT registers' defintion from mmhub_9_4 header
files to mmhub_1_0 ones. Add mmhub_v1_0_ prefix for the local
static variable and function.
v3: squash in DC fix
Signed-off-by: Dennis Li <dennis.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Tue, 19 Nov 2019 08:25:25 +0000 (16:25 +0800)]
drm/amdgpu: define soc15_ras_field_entry for reuse
The struct soc15_ras_field_entry will be reused by
other IPs, such as mmhub and gc
v2: rename ras_subblock_regs to gc_ras_fields_vg20,
because the future asic maybe have a different table.
Signed-off-by: Dennis Li <dennis.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Mon, 18 Nov 2019 21:33:07 +0000 (15:33 -0600)]
amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
Only for the debugger use case.
[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.
[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Mon, 18 Nov 2019 19:28:46 +0000 (13:28 -0600)]
drm/amdgpu: add flag to indicate amdgpu vm context
Flag added to indicate if the amdgpu vm context is used for compute or
graphics.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 7 Nov 2019 23:13:14 +0000 (18:13 -0500)]
drm/amdgpu: start to disentangle boco from runtime pm
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off
We originally only supported runtime pm on PX/HG
laptops so most of the runtime pm code looks for this.
Add a new flag to check for runtime pm enablement and
use this rather than checking for PX/HG.
Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>