Alex Hung [Sat, 8 Jun 2024 04:09:53 +0000 (22:09 -0600)]
drm/amd/display: Fix possible overflow in integer multiplication
[WHAT & HOW]
Integer multiplies integer may overflow in context that expects an
expression of unsigned long long (64 bits). This can be fixed by casting
integer to unsigned long long to force 64 bits results.
This fixes 2 OVERFLOW_BEFORE_WIDEN issues reported by Coverity.
Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Mon, 15 Jul 2024 19:52:46 +0000 (15:52 -0400)]
drm/amd/display: Request 0MHz dispclk for zero display case
[Why]
If we aren't entering RCG/IPS2 or CLKSTOP is not supported by PMFW then
we should be requesting a dispclk value of 0MHz to PMFW.
Currenly we run at max clock since there's an assumption in APU clock
table formulation where we can run at any DISPCLK at any state so the
real clock value ends up as 1200Mhz - the maximum.
[How]
Set to 0 instead of the minimum value in the state array.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Duncan Ma <duncan.ma@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chris Park [Fri, 12 Jul 2024 16:50:48 +0000 (12:50 -0400)]
drm/amd/display: Add two dmmuy I2C entry for GPIO port mapping issue
[Why]
When only 4 I2C is declared, two dummies are required to correctly map
GPIO port.
[How]
Add one more I2C dummy entry to match GPIO port.
Signed-off-by: Chris Park <chris.park@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Li [Thu, 11 Jul 2024 18:38:11 +0000 (14:38 -0400)]
drm/amd/display: Run idle optimizations at end of vblank handler
[Why & How]
1. After allowing idle optimizations, hw programming is disallowed.
2. Before hw programming, we need to disallow idle optimizations.
Otherwise, in scenario 1, we will immediately kick hw out of idle
optimizations with register access.
Scenario 2 is less of a concern, since any register access will kick
hw out of idle optimizations. But we'll do it early for correctness.
Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Li [Thu, 11 Jul 2024 18:31:27 +0000 (14:31 -0400)]
drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts
[Why]
We manage interrupts for CRTCs in two places:
1. In manage_dm_interrupts(), when CRTC get enabled or disabled
2. When drm_vblank_get/put() starts or kills the vblank counter, calling
into amdgpu_dm_crtc_set_vblank()
The interrupts managed by these twp places should be identical.
[How]
Since manage_dm_interrupts() already use drm_crtc_vblank_on/off(), just
move all CRTC interrupt management into amdgpu_dm_crtc_set_vblank().
This has the added benefit of disabling all CRTC and HUBP interrupts
when there are no vblank requestors.
Note that there is a TODO item - unchanged from when it was first
introduced - to properly identify the HUBP instance from the OTG
instance, rather than just assume direct mapping.
Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Samson Tam [Sun, 14 Jul 2024 20:31:05 +0000 (16:31 -0400)]
drm/amd/display: roll back quality EASF and ISHARP and dc dependency changes
[Why]
Seeing several regressions related to quality EASF and ISHARP changes
and removing dc dependency changes.
[How]
Roll back SPL changes
Signed-off-by: Samson Tam <Samson.Tam@amd.com> Reviewed-by: Martin Leung <martin.leung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdkfd: Fix missing error code in kfd_queue_acquire_buffers
The fix involves setting 'err' to '-EINVAL' before each 'goto
out_err_unreserve'.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_queue.c:265 kfd_queue_acquire_buffers()
warn: missing error code 'err'
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_queue.c
226 int kfd_queue_acquire_buffers(struct kfd_process_device *pdd, struct queue_properties *properties)
227 {
228 struct kfd_topology_device *topo_dev;
229 struct amdgpu_vm *vm;
230 u32 total_cwsr_size;
231 int err;
232
233 topo_dev = kfd_topology_device_by_id(pdd->dev->id);
234 if (!topo_dev)
235 return -EINVAL;
236
237 vm = drm_priv_to_vm(pdd->drm_priv);
238 err = amdgpu_bo_reserve(vm->root.bo, false);
239 if (err)
240 return err;
241
242 err = kfd_queue_buffer_get(vm, properties->write_ptr, &properties->wptr_bo, PAGE_SIZE);
243 if (err)
244 goto out_err_unreserve;
245
246 err = kfd_queue_buffer_get(vm, properties->read_ptr, &properties->rptr_bo, PAGE_SIZE);
247 if (err)
248 goto out_err_unreserve;
249
250 err = kfd_queue_buffer_get(vm, (void *)properties->queue_address,
251 &properties->ring_bo, properties->queue_size);
252 if (err)
253 goto out_err_unreserve;
254
255 /* only compute queue requires EOP buffer and CWSR area */
256 if (properties->type != KFD_QUEUE_TYPE_COMPUTE)
257 goto out_unreserve;
This is clearly a success path.
258
259 /* EOP buffer is not required for all ASICs */
260 if (properties->eop_ring_buffer_address) {
261 if (properties->eop_ring_buffer_size != topo_dev->node_props.eop_buffer_size) {
262 pr_debug("queue eop bo size 0x%lx not equal to node eop buf size 0x%x\n",
263 properties->eop_buf_bo->tbo.base.size,
264 topo_dev->node_props.eop_buffer_size);
--> 265 goto out_err_unreserve;
Fixes: 629568d25fea ("drm/amdkfd: Validate queue cwsr area and eop buffer size") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Philip Yang <Philip.Yang@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream
This commit addresses a null pointer dereference issue in the
`commit_planes_for_stream` function at line 4140. The issue could occur
when `top_pipe_to_program` is null.
The fix adds a check to ensure `top_pipe_to_program` is not null before
accessing its stream_res. This prevents a null pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4140 commit_planes_for_stream() error: we previously assumed 'top_pipe_to_program' could be null (see line 3906)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe
This commit addresses a null pointer dereference issue in the
`dcn20_program_pipe` function. The issue could occur when
`pipe_ctx->plane_state` is null.
The fix adds a check to ensure `pipe_ctx->plane_state` is not null
before accessing. This prevents a null pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1925 dcn20_program_pipe() error: we previously assumed 'pipe_ctx->plane_state' could be null (see line 1877)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ISP I2C bus device can't be enumerated via ACPI mechanism
since it shares the memory map with the AMDGPU.
So use the MFD mechanism for registering the ISP I2C device
and add the required resources.
Signed-off-by: Venkata Narendra Kumar Gutta <vengutta@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Wed, 24 Jul 2024 07:24:02 +0000 (09:24 +0200)]
drm/amdgpu: fix contiguous handling for IB parsing v2
Otherwise we won't get correct access to the IB.
v2: keep setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS to avoid problems in
the VRAM backend.
Signed-off-by: Christian König <christian.koenig@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3501 Fixes: e362b7c8f8c7 ("drm/amdgpu: Modify the contiguous flags behaviour") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Tested-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: add macro to calculate offset with instance
Add macro definition which calculate offset of the
register with index override.
This is useful in case when there is an array of
registers which is common for all instances.
To read registers in that case it is easy to define
registers once and the index value is manually passed
to calculate proper offset of register for each instance.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Tue, 21 May 2024 17:22:15 +0000 (13:22 -0400)]
drm/amdkfd: allow users to target recommended SDMA engines
Certain GPUs have better copy performance over xGMI on specific
SDMA engines depending on the source and destination GPU.
Allow users to create SDMA queues on these recommended engines.
Close to 2x overall performance has been observed with this
optimization.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3
support gpu_metrics sysfs interface for smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit 2563391e57b5 ("drm/amd/display: DML2.1 resynchronization") blew
away the compiler warning fix from commit 2fde4fdddc1f
("drm/amd/display: Avoid -Wenum-float-conversion in
add_margin_and_round_to_dfs_grainularity()"), causing the warning to
reappear.
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion]
183 | divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
The comment in the vbios structure says:
// = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128
This fake edid struct has not been used in a long time, so I'm
not sure if there were actually any boards out there with a non-128 byte
EDID, but align the code with the comment.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Reported-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html Fixes: c324acd5032f ("drm/radeon/kms: parse the extended LCD info block") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The comment in the vbios structure says:
// = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128
This fake edid struct has not been used in a long time, so I'm
not sure if there were actually any boards out there with a non-128 byte
EDID, but align the code with the comment.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Reported-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 26 Jun 2024 19:03:05 +0000 (15:03 -0400)]
drm/amdkfd: Validate queue cwsr area and eop buffer size
When creating KFD user compute queue, check if queue eop buffer size,
cwsr area size, ctl stack size equal to the size of KFD node
properities.
Check the entire cwsr area which may split into multiple svm ranges
aligned to granularity boundary.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 26 Jun 2024 18:52:28 +0000 (14:52 -0400)]
drm/amdkfd: Store queue cwsr area size to node properties
Use the queue eop buffer size, cwsr area size, ctl stack size
calculation from Thunk, store the value to KFD node properties.
Those will be used to validate queue eop buffer size, cwsr area size,
ctl stack size when creating KFD user compute queue.
Those will be exposed to user space via sysfs KFD node properties, to
remove the duplicate calculation code from Thunk.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: reset vm state machine after gpu reset(vram lost)
[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.
[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.
v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add null check for set_output_gamma in dcn30_set_output_transfer_func
This commit adds a null check for the set_output_gamma function pointer
in the dcn30_set_output_transfer_func function. Previously,
set_output_gamma was being checked for nullity at line 386, but then it
was being dereferenced without any nullity check at line 401. This
could potentially lead to a null pointer dereference error if
set_output_gamma is indeed null.
To fix this, we now ensure that set_output_gamma is not null before
dereferencing it. We do this by adding a nullity check for
set_output_gamma before the call to set_output_gamma at line 401. If
set_output_gamma is null, we log an error message and do not call the
function.
This fix prevents a potential null pointer dereference error.
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:401 dcn30_set_output_transfer_func()
error: we previously assumed 'mpc->funcs->set_output_gamma' could be null (see line 386)
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c
373 bool dcn30_set_output_transfer_func(struct dc *dc,
374 struct pipe_ctx *pipe_ctx,
375 const struct dc_stream_state *stream)
376 {
377 int mpcc_id = pipe_ctx->plane_res.hubp->inst;
378 struct mpc *mpc = pipe_ctx->stream_res.opp->ctx->dc->res_pool->mpc;
379 const struct pwl_params *params = NULL;
380 bool ret = false;
381
382 /* program OGAM or 3DLUT only for the top pipe*/
383 if (pipe_ctx->top_pipe == NULL) {
384 /*program rmu shaper and 3dlut in MPC*/
385 ret = dcn30_set_mpc_shaper_3dlut(pipe_ctx, stream);
386 if (ret == false && mpc->funcs->set_output_gamma) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If this is NULL
387 if (stream->out_transfer_func.type == TF_TYPE_HWPWL)
388 params = &stream->out_transfer_func.pwl;
389 else if (pipe_ctx->stream->out_transfer_func.type ==
390 TF_TYPE_DISTRIBUTED_POINTS &&
391 cm3_helper_translate_curve_to_hw_format(
392 &stream->out_transfer_func,
393 &mpc->blender_params, false))
394 params = &mpc->blender_params;
395 /* there are no ROM LUTs in OUTGAM */
396 if (stream->out_transfer_func.type == TF_TYPE_PREDEFINED)
397 BREAK_TO_DEBUGGER();
398 }
399 }
400
--> 401 mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Then it will crash
402 return ret;
403 }
Fixes: d99f13878d6f ("drm/amd/display: Add DCN3 HWSEQ") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Hersen Wu <hersenxs.wu@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jun 2024 17:00:48 +0000 (13:00 -0400)]
drm/amdkfd: Validate user queue update
Ensure update queue new ring buffer is mapped on GPU with correct size.
Decrease queue old ring_bo queue_refcount and increase new ring_bo
queue_refcount.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jun 2024 16:44:57 +0000 (12:44 -0400)]
drm/amdkfd: Validate user queue svm memory residency
Queue CWSR area maybe registered to GPU as svm memory, create queue to
ensure svm mapped to GPU with KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag.
Add queue_refcount to struct svm_range, to track queue CWSR area usage.
Because unmap mmu notifier callback return value is ignored, if
application unmap the CWSR area while queue is active, pr_warn message
in dmesg log. To be safe, evict user queue.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 12 Jul 2024 22:57:14 +0000 (18:57 -0400)]
drm/amdgpu/gfx9.4.3: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.
Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 12 Jul 2024 22:50:26 +0000 (18:50 -0400)]
drm/amdgpu/gfx9: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.
Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.
v2: update irq naming (drop priv) (Alex)
Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.
v2: update irq naming (drop priv) (Alex)
Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.
v2: update irq naming (drop priv) (Alex)
Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw
This commit addresses a potential null pointer dereference issue in the
`dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
null.
The fix adds a check to ensure `dc->clk_mgr` is not null before
accessing its functions. This prevents a potential null pointer
dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.
Philip Yang [Thu, 20 Jun 2024 16:31:36 +0000 (12:31 -0400)]
drm/amdkfd: Ensure user queue buffers residency
Add atomic queue_refcount to struct bo_va, return -EBUSY to fail unmap
BO from the GPU if the bo_va queue_refcount is not zero.
Create queue to increase the bo_va queue_refcount, destroy queue to
decrease the bo_va queue_refcount, to ensure the queue buffers mapped on
the GPU when queue is active.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw
This commit addresses a potential null pointer dereference issue in the
`dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is
not null before accessing its functions. This prevents a potential null
pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:416 dcn401_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 225)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw
This commit addresses a potential null pointer dereference issue in the
`dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is
not null before accessing its functions. This prevents a potential null
pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jun 2024 16:21:57 +0000 (12:21 -0400)]
drm/amdkfd: Validate user queue buffers
Find user queue rptr, ring buf, eop buffer and cwsr area BOs, and
check BOs are mapped on the GPU with correct size and take the BO
reference.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer
This commit addresses a potential null pointer dereference issue in the
`dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue
could occur when `head_pipe` is null.
The fix adds a check to ensure `head_pipe` is not null before asserting
it. If `head_pipe` is null, the function returns NULL to prevent a
potential null pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer
This commit addresses a potential null pointer dereference issue in the
`dcn201_acquire_free_pipe_for_layer` function. The issue could occur
when `head_pipe` is null.
The fix adds a check to ensure `head_pipe` is not null before asserting
it. If `head_pipe` is null, the function returns NULL to prevent a
potential null pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010)
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix index out of bounds in DCN30 color transformation
This commit addresses a potential index out of bounds issue in the
`cm3_helper_translate_curve_to_hw_format` function in the DCN30 color
management module. The issue could occur when the index 'i' exceeds the
number of transfer function points (TRANSFER_FUNC_POINTS).
The fix adds a check to ensure 'i' is within bounds before accessing the
transfer function points. If 'i' is out of bounds, the function returns
false to indicate an error.
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Implement bounds check for stream encoder creation in DCN401
'stream_enc_regs' array is an array of dcn10_stream_enc_registers
structures. The array is initialized with four elements, corresponding
to the four calls to stream_enc_regs() in the array initializer. This
means that valid indices for this array are 0, 1, 2, and 3.
The error message 'stream_enc_regs' 4 <= 5 below, is indicating that
there is an attempt to access this array with an index of 5, which is
out of bounds. This could lead to undefined behavior
Here, eng_id is used as an index to access the stream_enc_regs array. If
eng_id is 5, this would result in an out-of-bounds access on the
stream_enc_regs array.
Thus fixing Buffer overflow error in dcn401_stream_encoder_create
Found by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c:1209 dcn401_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix index out of bounds in degamma hardware format translation
Fixes index out of bounds issue in
`cm_helper_translate_curve_to_degamma_hw_format` function. The issue
could occur when the index 'i' exceeds the number of transfer function
points (TRANSFER_FUNC_POINTS).
The fix adds a check to ensure 'i' is within bounds before accessing the
transfer function points. If 'i' is out of bounds the function returns
false to indicate an error.
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation
This commit addresses a potential index out of bounds issue in the
`cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30
color management module. The issue could occur when the index 'i'
exceeds the number of transfer function points (TRANSFER_FUNC_POINTS).
The fix adds a check to ensure 'i' is within bounds before accessing the
transfer function points. If 'i' is out of bounds, the function returns
false to indicate an error.
Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961: warning: Function parameter or struct member 'bs_coeffs_updated' not described in 'dpp401_dscl_program_isharp'
Fixes: 94beb4ac1b3b ("drm/amd/display: ensure EASF and ISHARP coefficients are programmed together") Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
dc/{dcn401,dcn303} are unused since the files in it got moved under their
respective new components location. Hence they are no longer necessary
Fixes: 2d62bb450ed1 ("drm/amd/display: Refactor DCN3X into component folder") Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Aberback [Thu, 6 Jun 2024 19:51:16 +0000 (15:51 -0400)]
drm/amd/display: Remove duplicate HWSS interfaces
[Why]
Some interface functions are defined in both the public and private HWSS
interfaces, which can lead to confusion and runtime issues, therefore
the duplicates should be eliminated.
[How]
- power_down should only be private, because it's only used within HWSS.
- update_plane_addr should only be public, as it's used outside HWSS.
Dillon Varone [Mon, 20 May 2024 15:12:07 +0000 (11:12 -0400)]
drm/amd/display: Various DML2 fixes for FAMS2
The disable fams2 operation was reworked, but some of the old code
remained. This commit removes the disable_fams2_drr from the
dml2_stream_parameters.
Alex Hung [Thu, 27 Jun 2024 22:45:39 +0000 (16:45 -0600)]
drm/amd/display: Check link_res->hpo_dp_link_enc before using it
[WHAT & HOW]
Functions dp_enable_link_phy and dp_disable_link_phy can pass link_res
without initializing hpo_dp_link_enc and it is necessary to check for
null before dereferencing.
This fixes 1 FORWARD_NULL issue reported by Coverity.
Fixes: 0beca868cde8 ("drm/amd/display: Check link_res->hpo_dp_link_enc before using it") Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Check top sink only when multiple streams for DP2
[why]
When switching from extended to second display only
mode, the top remote sink is not removed while the top stream
itself is released. This causes DML to think there is no
DP2 output encoder because top remote sink does not match
with the second stream and disables DTBCLK and causes
hang.
[how]
For DP2.0 MST hubs, only treat 1st remote sink as an encoder
only when there are multiple displays connected.
Reviewed-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: rename dcn3/dcn4 to more sound terms
Use more accurate names to refer to the asic architecture.
dcn3 in DML actually refers to DCN32 and DCN321, so rename it to dcn32x
dcn4 refers to any DCN4x soc., and hence rename dcn4 to dcn4x
Samson Tam [Wed, 10 Jul 2024 21:09:04 +0000 (17:09 -0400)]
drm/amd/display: ensure EASF and ISHARP coefficients are programmed together
[Why]
EASF coefficients are programmed to RAM and then RAM selector is toggled.
ISHARP coefficients are programmed after so they will not be in the same
RAM block
[How]
Move ISHARP programming before EASF programming
Add flag if ISHARP coefficients are updated. If so, then
force EASF coefficients programming
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why & how]
Add standard RCG helpers based on DCCG spec
Reviewed-by: Daniel Miess <daniel.miess@amd.com> Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com> Signed-off-by: Hansen Dsouza <hansen.dsouza@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Remove ASSERT if significance is zero in math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why & how]
Need to make sure plane_state is initialized
before accessing its members.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com> Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jun 2024 15:53:50 +0000 (11:53 -0400)]
drm/amdkfd: Refactor queue wptr_bo GART mapping
Add helper function kfd_queue_acquire_buffers to get queue wptr_bo
reference from queue write_ptr if it is mapped to the KFD node with
expected size.
Add wptr_bo to structure queue_properties because structure queue is
allocated after queue buffers are validated, then we can remove wptr_bo
parameter from pqm_create_queue.
Rename structure queue wptr_bo_gart to hold wptr_bo reference for GART
mapping and umapping. Move MES wptr_bo_gart mapping to init_user_queue,
the same location with queue ctx_bo GART mapping.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Sun, 14 Jul 2024 15:11:05 +0000 (11:11 -0400)]
drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer
Pass pointer reference to amdgpu_bo_unref to clear the correct pointer,
otherwise amdgpu_bo_unref clear the local variable, the original pointer
not set to NULL, this could cause use-after-free bug.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jun 2024 15:17:59 +0000 (11:17 -0400)]
drm/amdkfd: kfd_bo_mapped_dev support partition
Change amdgpu_amdkfd_bo_mapped_to_dev to use drm_priv as parameter
instead of adev, to support spatial partition. This is only used by CRIU
checkpoint restore now. No functional change.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit
Before this commit, only submits with both a BO_HANDLES chunk and a
'bo_list_handle' would be rejected (by amdgpu_cs_parser_bos).
But if UMD sent multiple BO_HANDLES, what would happen is:
* only the last one would be really used
* all the others would leak memory as amdgpu_cs_p1_bo_handles would
overwrite the previous p->bo_list value
This commit rejects submissions with multiple BO_HANDLES chunks to
match the implementation of the parser.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>