Clang build fails with
amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized
whenever 'if' condition is true
if (adev->in_suspend || amdgpu_in_reset(adev)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
amdgpu_ras.c:2453:6: note: uninitialized use occurs here
if (ras_obj->ras_cb)
^~~~~~~
There is a logic error in the error handler's labels.
ex/ The sysfs: is the last goto label in the normal code but
is the middle of error handler. Rework the error handler.
cleanup: is the first error, so it's handler should be last.
interrupt: is the second error, it's handler is next. interrupt:
handles the failure of amdgpu_ras_interrupt_add_hander() by
calling amdgpu_ras_interrupt_remove_handler(). This is wrong,
remove() assumes the interrupt has been setup, not torn down by
add(). Change the goto label to cleanup.
sysfs is the last error, it's handler should be first. sysfs:
handles the failure of amdgpu_ras_sysfs_create() by calling
amdgpu_ras_sysfs_remove(). But when the create() fails there
is nothing added so there is nothing to remove. This error
handler is not needed. Remove the error handler and change
goto label to interrupt.
Fixes: b293e891b057 ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)") Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 17 Feb 2022 16:12:55 +0000 (11:12 -0500)]
drm/amdgpu: Dynamically initialize IP instance attributes
Dynamically initialize IP instance attributes. This eliminates bugs
stemming from adding new attributes to an IP instance.
Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Tom StDenis <tom.stdenis@amd.com> Fixes: 4d7ba312dd1f ("drm/amdgpu: Add "harvest" to IP discovery sysfs") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom St Denis [Tue, 15 Feb 2022 14:59:07 +0000 (09:59 -0500)]
drm/amd/amdgpu: Add APU flag to gca_config debugfs data (v3)
Needed by umr to detect if ip discovered ASIC is an APU or not.
(v2): Remove asic type from packet it's not strictly needed
(v3): Correct comment
Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Thu, 17 Feb 2022 05:19:58 +0000 (23:19 -0600)]
drm/amd: Refactor `amdgpu_aspm` to be evaluated per device
Evaluating `pcie_aspm_enabled` as part of driver probe has the implication
that if one PCIe bridge with an AMD GPU connected doesn't support ASPM
then none of them do. This is an invalid assumption as the PCIe core will
configure ASPM for individual PCIe bridges.
Create a new helper function that can be called by individual dGPUs to
react to the `amdgpu_aspm` module parameter without having negative results
for other dGPUs on the PCIe bus.
Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 16 Feb 2022 21:47:32 +0000 (16:47 -0500)]
drm/amdgpu: Fix ARM compilation warning
Fix this ARM warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld'
expects argument of type 'long int', but argument 4 has type 'size_t' {aka
'unsigned int'} [-Wformat=]
Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: kbuild-all@lists.01.org Cc: linux-kernel@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Fixes: a6c40b178092 ("drm/amdgpu: Show IP discovery in sysfs") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Tue, 1 Feb 2022 16:26:33 +0000 (10:26 -0600)]
drm/amd: Check if ASPM is enabled from PCIe subsystem
commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") enabled ASPM
by default but a variety of hardware configurations it turns out that this
caused a regression.
* PPC64LE hardware does not support ASPM at a hardware level.
CONFIG_PCIEASPM is often disabled on these architectures.
* Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem
disables it
Check with the PCIe subsystem to see that ASPM has been enabled
or not.
yipechai [Tue, 15 Feb 2022 06:38:13 +0000 (14:38 +0800)]
drm/amdgpu: Remove redundant .ras_late_init initialization in some ras blocks
1. Define amdgpu_ras_block_late_init_default in amdgpu_ras.c as
.ras_late_init common function, which is called when
.ras_late_init of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_init to
initialize .ras_late_init in ras blocks.
Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
yipechai [Mon, 14 Feb 2022 06:38:02 +0000 (14:38 +0800)]
drm/amdgpu: Optimize xxx_ras_late_init function of each ras block
1. Move calling ras block instance members from module internal
function to the top calling xxx_ras_late_init.
2. Module internal function calls can only use parameter variables
of xxx_ras_late_init instead of ras block instance members.
Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Tue, 26 Oct 2021 02:04:27 +0000 (10:04 +0800)]
drm/amdgpu/discovery: Add sw DM function for 3.1.6 DCE
Add 3.1.6 DCE IP and assign relevant sw DM function for the new DCE.
Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hansen Dsouza [Wed, 26 Jan 2022 20:44:50 +0000 (15:44 -0500)]
drm/amd/display: Add DCN316 resource and SMU clock manager
Add core DC implementation for DCN316.
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Li [Thu, 27 Jan 2022 19:29:31 +0000 (14:29 -0500)]
drm/amd/display: Add DMUB support for DCN316
Initialize DMUB for DCN316. Use same funcs as DCN31 for
DCN316.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: Changcheng Deng <deng.changcheng@zte.com.cn> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 28 Jan 2022 08:21:18 +0000 (13:51 +0530)]
drm/amdgpu: update vcn/jpeg PG flags for VCN 3.1.1
update vcn and jpeg power gating flags for VCN 3.1.1
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Wed, 16 Feb 2022 14:55:16 +0000 (08:55 -0600)]
drm/amd: smu7: downgrade voltage error to info
The message `Voltage value looks like a Leakage ID but it's not patched`
shows up as an error on Dell Precision 3540. This doesn't cause functional
problems and should be downgraded to info.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1162 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Tue, 15 Feb 2022 18:53:37 +0000 (19:53 +0100)]
drm/amd/display: For vblank_disable_immediate, check PSR is really used
Even if PSR is allowed for a present GPU, there might be no eDP link
which supports PSR.
Fixes: 708978487304 ("drm/amdgpu/display: Only set vblank_disable_immediate when PSR is not enabled") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Changcheng Deng [Tue, 15 Feb 2022 09:11:42 +0000 (09:11 +0000)]
drm/amdkfd: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use "flexible array members" for these cases. The older
style of one-element or zero-length arrays should no longer be used.
Reference:
https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Fri, 28 Jan 2022 17:29:01 +0000 (12:29 -0500)]
drm/amd/display: Add dsc pre-validation in atomic check
[Why]
The previous change:
"Add affected crtcs to atomic state for dsc mst unplug"
forces modeset on all added crctc regardless whether timing changed or not.
Per our implementation of dsc we need modeset only if timing changed.
Otherwise dsc can be programmed incorrectly leading to dsc engine hang.
[How]
During atomic_check pre-compute dsc params.
Only set mode_changed if timing is changed.
Reviewed-by: Hersen Wu <hersenwu@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Victor Skvortsov [Thu, 3 Feb 2022 21:13:40 +0000 (21:13 +0000)]
drm/amdgpu: Fix wait for RLCG command completion
if (!(tmp & flag)) condition will always evaluate to true
when the flag is 0x0 (AMDGPU_RLCG_GC_WRITE). Instead check
that address bits are cleared to determine whether
the command is complete.
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hersen Wu [Sun, 6 Feb 2022 21:06:00 +0000 (16:06 -0500)]
drm/amd/display: add dsc mst stream pbn log for debug
[why]
payload and slot number of display on dsc mst hub will be
adjusted when there is change on any display on dsc hub.
to monitor dsc enable/disable, pbn change, we need add log.
[How]
add mst_pbn to dc_dsc_config of dc_crtc_timing.
add dsc, pbn, display name within dc_core_enable_stream,
dc_core_disable_stream, dc_stream_log
Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Hersen Wu <hersenwu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Thu, 23 Dec 2021 22:39:57 +0000 (17:39 -0500)]
drm/amd/display: Add affected crtcs to atomic state for dsc mst unplug
[Why]
When display topology changed on DSC hub we add all crtcs with dsc support to
atomic state.
Refer to patch:"drm/amd/display: Trigger modesets on MST DSC connectors"
However the original implementation may skip crtc if the topology change
caused by unplug.
That potentially could lead to no-lightup or corruption on DSC hub after
unplug event on one of the connectors.
[How]
Update add_affected_mst_dsc_crtcs() to use old connector state
if new connector state has no crtc (undergoes modeset due to unplug)
Fixes: 44be939ff7ac58 ("drm/amd/display: Trigger modesets on MST DSC connectors") Reviewed-by: Hersen Wu <hersenwu@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Yang [Mon, 24 Jan 2022 20:06:39 +0000 (15:06 -0500)]
drm/amd/display: enable z9 denial interface by default
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Eric Yang <Eric.Yang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Mon, 14 Feb 2022 23:02:58 +0000 (18:02 -0500)]
drm/amdgpu: Add "harvest" to IP discovery sysfs
Add the "harvest" field to the IP attributes in
the IP discovery sysfs visualization, as this
field is present in the binary data.
At the time of this commit, the harvest data isn't
consistently correct in VBIOS, but it is exposed
for completeness, in the hopes that VBIOS will be
fixed in the future.
Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Wed, 2 Feb 2022 21:35:14 +0000 (16:35 -0500)]
drm/amd/display: make sure pipe power gating reach requested hw state
[why]
display mapping change will involved pipe power gating on and off.
when doing this too fase, sometimes usbc will have no display.
check HW status, it is still in pipe power gating.
[how]
insert polling HW status to make sure the required state reached.
also add dal registry key handling.
Reviewed-by: Sung joon Kim <Sungjoon.Kim@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hersen Wu [Wed, 2 Feb 2022 21:08:45 +0000 (16:08 -0500)]
drm/amd/display: dsc mst re-compute pbn for changes on hub
[why]
when unplug 1 dp from dsc mst hub, atomic_check new request
dc_state only include info for the unplug dp. this will not
trigger re-compute pbn for displays still connected to hub.
[how] all displays connected to dsc hub are available in
dc->current_state, by comparing dc->current_state and new
request from atomic_chceck, it will provide info of
displays connected to hub and do pbn re-compute.
Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Hersen Wu <hersenwu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Meenakshikumar Somasundaram [Thu, 13 Jan 2022 00:58:04 +0000 (19:58 -0500)]
drm/amd/display: Fix for dmub outbox notification enable
[Why]
Currently driver enables dmub outbox notification before oubox ISR is
registered. During boot scenario, sometimes dmub issues hpd outbox
message before driver registers ISR and those messages are missed.
[How]
Enable dmub outbox notification after outbox ISR is registered. Also,
restructured outbox enable code to call from dm layer and renamed APIs.
Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 20 Jan 2022 11:16:19 +0000 (19:16 +0800)]
drm/amd/pm: fix some OEM SKU specific stability issues
Add a quirk in sienna_cichlid_ppt.c to fix some OEM SKU
specific stability issues.
Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 20 Jan 2022 08:15:52 +0000 (16:15 +0800)]
drm/amdgpu: disable MMHUB PG for Picasso
MMHUB PG needs to be disabled for Picasso for stability reasons.
Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 19 Jan 2022 04:29:02 +0000 (12:29 +0800)]
drm/amd/pm: fulfill Sienna_Cichlid implementations for DriverSmuConfig setting
Fulfill the implementations for DriverSmuConfig setting on Sienna_Cichlid.
Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 19 Jan 2022 04:00:38 +0000 (12:00 +0800)]
drm/amd/pm: fulfill Navi1x implementations for DriverSmuConfig setting
Fulfill the implementations for DriverSmuConfig setting on Navi1x.
Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yiqing Yao [Mon, 24 Jan 2022 08:11:20 +0000 (16:11 +0800)]
drm/amd/pm: enable pm sysfs write for one VF mode
[why]
pm sysfs should be writable in one VF mode as is in passthrough
[how]
do not remove write access on pm sysfs if device is in one VF mode
Fixes: 11c9cc95f818 ("amdgpu/pm: Make sysfs pm attributes as read-only for VFs") Signed-off-by: Yiqing Yao <yiqing.yao@amd.com> Reviewed-by: Monk Liu <Monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 19 Jan 2022 02:51:23 +0000 (10:51 +0800)]
drm/amd/pm: correct the default DriverSmuConfig table settings
For Some ASICs, with the PMFW default settings, we may see the
power consumption reported via metrics table is "Very Erratic".
With the socket power alpha filter set as 10/100ms, we can correct
that issue.
Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom Rix [Mon, 14 Feb 2022 18:22:24 +0000 (10:22 -0800)]
drm/amdgpu: check return status before using stable_pstate
Clang static analysis reports this problem
amdgpu_ctx.c:616:26: warning: Assigned value is garbage
or undefined
args->out.pstate.flags = stable_pstate;
^ ~~~~~~~~~~~~~
amdgpu_ctx_stable_pstate can fail without setting
stable_pstate. So check.
Fixes: 8cda7a4f96e4 ("drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Bishop [Fri, 11 Feb 2022 19:57:39 +0000 (14:57 -0500)]
drm/radeon: Fix backlight control on iMac 12,1
The iMac 12,1 does not use the gmux driver for backlight, so the radeon
backlight device is needed to set the brightness.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1838 Signed-off-by: Nicholas Bishop <nicholasbishop@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sung Joon Kim [Tue, 1 Feb 2022 18:59:02 +0000 (13:59 -0500)]
drm/amd/display: reset lane settings after each PHY repeater LT
[why]
In LTTPR non-transparent mode, we need
to reset the cached lane settings before performing
link training on the next PHY repeater. Otherwise,
the cached lane settings will be used for the next
clock recovery e.g. VS = MAX (3) which should not be
the case according to the DP specs. We expect to use
minimum lane settings on each clock recovery sequence.
[how]
Reset DPCD and HW lane settings on each repeater LT.
Set training pattern to 0 for the repeater that failed LT
at the proper place.
Reviewed-by: Meenakshikumar Somasundaram <Meenakshikumar.Somasundaram@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Sung Joon Kim <sungkim@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Thu, 10 Feb 2022 19:41:20 +0000 (14:41 -0500)]
drm/amdkfd: navi2x requires extended engines to map and unmap sdma queues
SDMA 5.2.x queues are required to be mapped and unmapped from the extended
engines.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Thu, 10 Feb 2022 18:39:15 +0000 (13:39 -0500)]
drm/amdkfd: remove unneeded unmap single queue option
The KFD only unmaps all queues, all dynamics queues or all process queues
since RUN_LIST is mapped with all KFD queues.
There's no need to provide a single type unmap so remove this option.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Surbhi Kakarya [Wed, 26 Jan 2022 17:04:39 +0000 (12:04 -0500)]
drm/amdgpu: Handle the GPU recovery failure in SRIOV environment.
This patch handles the GPU recovery failure in sriov environment by
retrying the reset if the first reset fails. To determine the condition
of retry, a new macro AMDGPU_RETRY_SRIOV_RESET is added which returns
true if failure is due to ETIMEDOUT, EINVAL or EBUSY, otherwise return
false.A new macro AMDGPU_MAX_RETRY_LIMIT is used to limit the retry to 2.
It also handles the return status in Post Asic Reset by updating the return
code with asic_reset_res and eventually return the return code in
amdgpu_job_timedout().
yipechai [Wed, 9 Feb 2022 03:15:29 +0000 (11:15 +0800)]
drm/amdgpu: Merge amdgpu_ras_late_init/amdgpu_ras_late_fini to amdgpu_ras_block_late_init/amdgpu_ras_block_late_fini
1. Merge amdgpu_ras_late_init to
amdgpu_ras_block_late_init.
2. Remove amdgpu_ras_late_init since no ras block
calls amdgpu_ras_late_init.
3. Merge amdgpu_ras_late_fini to
amdgpu_ras_block_late_fini.
4. Remove amdgpu_ras_late_fini since no ras block
calls amdgpu_ras_late_fini.
Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
yipechai [Sun, 30 Jan 2022 09:03:32 +0000 (17:03 +0800)]
drm/amdgpu: Optimize xxx_ras_late_init/xxx_ras_late_fini for each ras block
1. Define amdgpu_ras_block_late_init to create sysfs nodes
and interrupt handles.
2. Define amdgpu_ras_block_late_fini to remove sysfs nodes
and interrupt handles.
3. Replace ras block variable members in struct
amdgpu_ras_block_object with struct ras_common_if, which
can make it easy to associate each ras block instance
with each ras block functional interface.
4. Add .ras_cb to struct amdgpu_ras_block_object.
5. Change each ras block to fit for the changement of struct
amdgpu_ras_block_object.
Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Thu, 10 Feb 2022 06:34:51 +0000 (14:34 +0800)]
drm/amdgpu: no rlcg legacy read in SRIOV case
rlcg legacy read is not available in SRIOV configration.
Otherwise, gmc_v9_0_flush_gpu_tlb will always complain
timeout and finally breaks driver load.
v2: bypass read in amdgpu_virt_get_rlcg_reg_access_flag (from Victor)
Fixes: 97d1a3b967a3cb ("drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rajneesh Bhardwaj [Fri, 11 Feb 2022 00:26:04 +0000 (19:26 -0500)]
drm/amdkfd: Fix leftover errors and warnings
A bunch of errors and warnings are leftover KFD over the years, attempt
to fix the errors and most warnings reported by checkpatch tool. Still a
few warnings remain which may be false positives so ignore them for now.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 3 Feb 2022 21:54:13 +0000 (16:54 -0500)]
drm/amdgpu: Show IP discovery in sysfs
Add IP discovery data in sysfs. The format is:
/sys/class/drm/cardX/device/ip_discovery/die/D/B/I/<attrs>
where,
X is the card ID, an integer,
D is the die ID, an integer,
B is the IP HW ID, an integer, aka block type,
I is the IP HW ID instance, an integer.
<attrs> are the attributes of the block instance. At the moment these
include HW ID, instance number, major, minor, revision, number of base
addresses, and the base addresses themselves.
A symbolic link of the acronym HW ID is also created, under D/, if you
prefer to browse by something humanly accessible.
Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Tom StDenis <tom.stdenis@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Mon, 14 Feb 2022 00:31:51 +0000 (10:31 +1000)]
Merge tag 'amd-drm-next-5.18-2022-02-11-1' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.18-2022-02-11-1:
amdgpu:
- Clean up of power management code
- Enable freesync video mode by default
- Clean up of RAS code
- Improve VRAM access for debug using SDMA
- Coding style cleanups
- SR-IOV fixes
- More display FP reorg
- TLB flush fixes for Arcuturus, Vega20
- Misc display fixes
- Rework special register access methods for SR-IOV
- DP2 fixes
- DP tunneling fixes
- DSC fixes
- More IP discovery cleanups
- Misc RAS fixes
- Enable both SMU i2c buses where applicable
- s2idle improvements
- DPCS header cleanup
- Add new CAP firmware support for SR-IOV
amdkfd:
- Misc cleanups
- SVM fixes
- CRIU support
- Clean up MQD manager
UAPI:
- Add interface to amdgpu CTX ioctl to request a stable power state for profiling
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207
- Add amdkfd support for CRIU
https://github.com/checkpoint-restore/criu/pull/1709
- Remove old unused amdkfd debugger interface
Was only implemented for Kaveri and was only ever used by an old HSA tool that was never open sourced
Linus Torvalds [Sun, 13 Feb 2022 19:58:11 +0000 (11:58 -0800)]
Merge tag 'kbuild-fixes-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix the truncated path issue for HAVE_GCC_PLUGINS test in Kconfig
- Move -Wunsligned-access to W=1 builds to avoid sprinkling warnings
for the latest Clang
- Fix missing fclose() in Kconfig
- Fix Kconfig to touch dep headers correctly when KCONFIG_AUTOCONFIG is
overridden.
* tag 'kbuild-fixes-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix failing to generate auto.conf
kconfig: fix missing fclose() on error paths
Makefile.extrawarn: Move -Wunaligned-access to W=1
kconfig: let 'shell' return enough output for deep path names
Linus Torvalds [Sun, 13 Feb 2022 18:06:40 +0000 (10:06 -0800)]
Merge tag 'irq-urgent-2022-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"Interrupt chip driver fixes:
- Don't install an hotplug notifier for GICV3-ITS on systems which do
not need it to prevent a warning in the notifier about inconsistent
state
- Add the missing device tree matching for the T-HEAD PLIC variant so
the related SoC is properly supported"
* tag 'irq-urgent-2022-02-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/sifive-plic: Add missing thead,c900-plic match string
dt-bindings: update riscv plic compatible string
irqchip/gic-v3-its: Skip HP notifier when no ITS is registered
Linus Torvalds [Sun, 13 Feb 2022 17:43:34 +0000 (09:43 -0800)]
Merge tag 'objtool_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix from Borislav Petkov:
"Fix a case where objtool would mistakenly warn about instructions
being unreachable"
* tag 'objtool_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bug: Merge annotate_reachable() into _BUG_FLAGS() asm
Linus Torvalds [Sun, 13 Feb 2022 17:22:52 +0000 (09:22 -0800)]
Merge tag 'x86_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
"Prevent softlockups when tearing down large SGX enclaves"
* tag 'x86_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Silence softlockup detection when releasing large enclaves
Linus Torvalds [Sun, 13 Feb 2022 17:16:45 +0000 (09:16 -0800)]
Merge tag '5.17-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small smb3 reconnect fixes and an error log clarification"
* tag '5.17-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: mark sessions for reconnection in helper function
cifs: call helper functions for marking channels for reconnect
cifs: call cifs_reconnect when a connection is marked
[smb3] improve error message when mount options conflict with posix
Linus Torvalds [Sat, 12 Feb 2022 18:29:02 +0000 (10:29 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two minor fixes in the lpfc driver. One changing the classification of
trace messages and the other fixing a build issue when NVME_FC is
disabled"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Reduce log messages seen after firmware download
scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabled
Linus Torvalds [Sat, 12 Feb 2022 18:01:55 +0000 (10:01 -0800)]
Merge tag 'tty-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are four small tty/serial fixes for 5.17-rc4. They are:
- 8250_pericom change revert to fix a reported regression
- two speculation fixes for vt_ioctl
- n_tty regression fix for polling
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt_ioctl: add array_index_nospec to VT_ACTIVATE
vt_ioctl: fix array_index_nospec in vt_setactivate
serial: 8250_pericom: Revert "Re-enable higher baud rates"
n_tty: wake up poll(POLLRDNORM) on receiving data
Linus Torvalds [Sat, 12 Feb 2022 17:56:18 +0000 (09:56 -0800)]
Merge tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes for 5.17-rc4 that resolve some
reported issues and add new device ids:
- usb-serial new device ids
- ulpi cleanup fixes
- f_fs use-after-free fix
- dwc3 driver fixes
- ax88179_178a usb network driver fix
- usb gadget fixes
There is a revert at the end of this series to resolve a build problem
that 0-day found yesterday. Most of these have been in linux-next,
except for the last few, and all have now passed 0-day tests"
* tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured"
usb: dwc2: drd: fix soft connect when gadget is unconfigured
usb: gadget: rndis: check size of RNDIS_MSG_SET command
USB: gadget: validate interface OS descriptor requests
usb: core: Unregister device on component_add() failure
net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup
usb: dwc3: gadget: Prevent core from processing stale TRBs
USB: serial: cp210x: add CPI Bulk Coin Recycler id
USB: serial: cp210x: add NCR Retail IO box id
USB: serial: ftdi_sio: add support for Brainboxes US-159/235/320
usb: gadget: f_uac2: Define specific wTerminalType
usb: gadget: udc: renesas_usb3: Fix host to USB_ROLE_NONE transition
usb: raw-gadget: fix handling of dual-direction-capable endpoints
usb: usb251xb: add boost-up property support
usb: ulpi: Call of_node_put correctly
usb: ulpi: Move of_node_put to ulpi_dev_release
USB: serial: option: add ZTE MF286D modem
USB: serial: ch341: add support for GW Instek USB2.0-Serial devices
usb: f_fs: Fix use-after-free for epfile
usb: dwc3: xilinx: fix uninitialized return value
Linus Torvalds [Sat, 12 Feb 2022 17:12:44 +0000 (09:12 -0800)]
Merge tag 's390-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
"Maintainers and reviewers changes:
- Add Alexander Gordeev as maintainer for s390.
- Christian Borntraeger will focus on s390 KVM maintainership and
stays as s390 reviewer.
Fixes:
- Fix clang build of modules loader KUnit test.
- Fix kernel panic in CIO code on FCES path-event when no driver is
attached to a device or the driver does not provide the path_event
function"
* tag 's390-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: verify the driver availability for path_event call
s390/module: fix building test_modules_helpers.o with clang
MAINTAINERS: downgrade myself to Reviewer for s390
MAINTAINERS: add Alexander Gordeev as maintainer for s390
Linus Torvalds [Sat, 12 Feb 2022 17:08:57 +0000 (09:08 -0800)]
Merge tag 'for-linus-5.17a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- Two small cleanups
- Another fix for addressing the EFI framebuffer above 4GB when running
as Xen dom0
- A patch to let Xen guests use reserved bits in MSI- and IO-APIC-
registers for extended APIC-IDs the same way KVM guests are doing it
already
* tag 'for-linus-5.17a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pci: Make use of the helper macro LIST_HEAD()
xen/x2apic: Fix inconsistent indenting
xen/x86: detect support for extended destination ID
xen/x86: obtain full video frame buffer address for Dom0 also under EFI
Linus Torvalds [Sat, 12 Feb 2022 17:04:05 +0000 (09:04 -0800)]
Merge tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:
"This fixes a corner case of fatal SIGSYS being ignored since v5.15.
Along with the signal fix is a change to seccomp so that seeing
another syscall after a fatal filter result will cause seccomp to kill
the process harder.
Summary:
- Force HANDLER_EXIT even for SIGNAL_UNKILLABLE
- Make seccomp self-destruct after fatal filter results
- Update seccomp samples for easier behavioral demonstration"
* tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
samples/seccomp: Adjust sample to also provide kill option
seccomp: Invalidate seccomp mode to catch death failures
signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
Linus Torvalds [Sat, 12 Feb 2022 16:57:37 +0000 (08:57 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"5 patches.
Subsystems affected by this patch series: binfmt, procfs, and mm
(vmscan, memcg, and kfence)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kfence: make test case compatible with run time set sample interval
mm: memcg: synchronize objcg lists with a dedicated spinlock
mm: vmscan: remove deadlock due to throttling failing to make progress
fs/proc: task_mmu.c: don't read mapcount for migration entry
fs/binfmt_elf: fix PT_LOAD p_align values for loaders