Revert "drm/display: Make all helpers visible and switch to depends on"
This reverts commit d674858ff979550a0e97b4ac766f2640f0d9d7e7, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
This reverts commit c0e0f139354c01e0213204e4a96e7076e5a3e396, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
Revert "drm: Switch DRM_DISPLAY_HELPER to depends on"
This reverts commit e075e496f516bf92bc0cbaf94d64e8d4a6b58321, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
Revert "drm: Switch DRM_DISPLAY_DP_AUX_BUS to depends on"
This reverts commit 4d15125d7fe637f401e64e33c99513adf6586fdd, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
Revert "drm: Switch DRM_DISPLAY_DP_HELPER to depends on"
This reverts commit 0323287de87d7e6e9c22c57d7440aa353a2298d0, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
Revert "drm: Switch DRM_DISPLAY_HDCP_HELPER to depends on"
This reverts commit 3166e7e6d935caaef07605a5c90773fbf9ffeaf4, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
Revert "drm: Switch DRM_DISPLAY_HDMI_HELPER to depends on"
This reverts commit f6d2dc03fa8546b284dd8c1af027d9fac5725921, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.
The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.
Dave Airlie [Tue, 30 Apr 2024 04:20:31 +0000 (14:20 +1000)]
Merge tag 'drm-intel-gt-next-2024-04-26' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next
UAPI Changes:
- drm/i915/guc: Use context hints for GT frequency
Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.
Userland can check whether this feature is supported using a new param-
I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.
The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint
- drm/i915/gt: Enable only one CCS for compute workload
Enable only one CCS engine by default with all the compute sices
allocated to it.
While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance
***
NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by
default to mitigate a hardware bug. All the EUs will still remain
usable, and all the userspace drivers have been confirmed to be able
to dynamically detect the change in number of CCS engines and adjust.
For the smaller percent of applications that get perf benefit from
letting the userspace driver dispatch across all 4 CCS engines we will
be introducing a sysfs control as a later patch to choose 4 CCS each
with 25% EUs (or 50% if 2 CCS).
- Add new and fix to existing workarounds: Wa_14018575942 (MTL),
Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438,
Wa_14020495402 (Gen12.70) (Tejas, John, Lucas)
- Fix UAF on destroy against retire race and remove two earlier
partial fixes (Janusz)
- Limit the reserved VM space to only the platforms that need it (Andi)
- Reset queue_priority_hint on parking for execlist platforms (Chris)
- Fix gt reset with GuC submission is disabled (Nirmoy)
- Correct capture of EIR register on hang (John)
- Remove usage of the deprecated ida_simple_xx() API
- Refactor confusing __intel_gt_reset() (Nirmoy)
- Fix the fix for GuC reset lock confusion (John)
- Simplify/extend platform check for Wa_14018913170 (John)
- Replace dev_priv with i915 (Andi)
- Add and use gt_to_guc() wrapper (Andi)
- Remove bogus null check (Rodrigo, Dan)
Merge tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix EEVDF corner cases
- Fix two nohz_full= related bugs that can cause boot crashes
and warnings
* tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU
sched/isolation: Prevent boot crash when the boot CPU is nohz_full
sched/eevdf: Prevent vlag from going out of bounds in reweight_eevdf()
sched/eevdf: Fix miscalculation in reweight_entity() when se is not curr
sched/eevdf: Always update V if se->on_rq when reweighting
Merge tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Make the CPU_MITIGATIONS=n interaction with conflicting
mitigation-enabling boot parameters a bit saner.
- Re-enable CPU mitigations by default on non-x86
- Fix TDX shared bit propagation on mprotect()
- Fix potential show_regs() system hang when PKE initialization
is not fully finished yet.
- Add the 0x10-0x1f model IDs to the Zen5 range
- Harden #VC instruction emulation some more
* tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
cpu: Re-enable CPU mitigations by default for !X86 architectures
x86/tdx: Preserve shared bit on mprotect()
x86/cpu: Fix check for RDPKRU in __show_regs()
x86/CPU/AMD: Add models 0x10-0x1f to the Zen5 range
x86/sev: Check for MWAITX and MONITORX opcodes in the #VC handler
sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU
housekeeping_setup() checks cpumask_intersects(present, online) to ensure
that the kernel will have at least one housekeeping CPU after smp_init(),
but this doesn't work if the maxcpus= kernel parameter limits the number of
processors available after bootup.
For example, a kernel with "maxcpus=2 nohz_full=0-2" parameters crashes at
boot time on a virtual machine with 4 CPUs.
Change housekeeping_setup() to use cpumask_first_and() and check that the
returned CPU number is valid and less than setup_max_cpus.
Another corner case is "nohz_full=0" on a machine with a single CPU or with
the maxcpus=1 kernel argument. In this case non_housekeeping_mask is empty
and tick_nohz_full_setup() makes no sense. And indeed, the kernel hits the
WARN_ON(tick_nohz_full_running) in tick_sched_do_timer().
And how should the kernel interpret the "nohz_full=" parameter? It should
be silently ignored, but currently cpulist_parse() happily returns the
empty cpumask and this leads to the same problem.
Change housekeeping_setup() to check cpumask_empty(non_housekeeping_mask)
and do nothing in this case.
sched/isolation: Prevent boot crash when the boot CPU is nohz_full
Documentation/timers/no_hz.rst states that the "nohz_full=" mask must not
include the boot CPU, which is no longer true after:
08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full").
However after:
aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work")
the kernel will crash at boot time in this case; housekeeping_any_cpu()
returns an invalid CPU number until smp_init() brings the first
housekeeping CPU up.
Change housekeeping_any_cpu() to check the result of cpumask_any_and() and
return smp_processor_id() in this case.
This is just the simple and backportable workaround which fixes the
symptom, but smp_processor_id() at boot time should be safe at least for
type == HK_TYPE_TIMER, this more or less matches the tick_do_timer_boot_cpu
logic.
There is no worry about cpu_down(); tick_nohz_cpu_down() will not allow to
offline tick_do_timer_cpu (the 1st online housekeeping CPU).
Fixes: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work") Reported-by: Chris von Recklinghausen <crecklin@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Phil Auld <pauld@redhat.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20240411143905.GA19288@redhat.com Closes: https://lore.kernel.org/all/20240402105847.GA24832@redhat.com/
Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux
Pull Rust fixes from Miguel Ojeda:
- Soundness: make internal functions generated by the 'module!' macro
inaccessible, do not implement 'Zeroable' for 'Infallible' and
require 'Send' for the 'Module' trait.
- Build: avoid errors with "empty" files and workaround 'rustdoc' ICE.
- Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'.
- Code docs: remove non-existing key from 'module!' macro example.
- Docs: trivial rendering fix in arch table.
* tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux:
rust: remove `params` from `module` macro example
kbuild: rust: force `alloc` extern to allow "empty" Rust files
kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE
rust: kernel: require `Send` for `Module` implementations
rust: phy: implement `Send` for `Registration`
rust: make mutually exclusive with CFI_CLANG
rust: macros: fix soundness issue in `module!` macro
rust: init: remove impl Zeroable for Infallible
docs: rust: fix improper rendering in Arch Support page
rust: don't select CONSTRUCTORS
Merge tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
separation
- A fix to avoid loading rv64/NOMMU kernel past the start of RAM
- A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
overflow in the bitmask
- The sud_test kselftest has been fixed to properly swizzle the syscall
number into the return register, which are not the same on RISC-V
- A fix for a build warning in the perf tools on rv32
- A fix for the CBO selftests, to avoid non-constants leaking into the
inline asm
- A pair of fixes for T-Head PBMT errata probing, which has been
renamed MAE by the vendor
* tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
perf riscv: Fix the warning due to the incompatible type
riscv: T-Head: Test availability bit before enabling MAE errata
riscv: thead: Rename T-Head PBMT to MAE
selftests: sud_test: return correct emulated syscall value on RISC-V
riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
riscv: Fix TASK_SIZE on 64-bit NOMMU
Merge tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Fix a race condition in the at24 eeprom handler, a NULL pointer
exception in the I2C core for controllers only using target modes,
drop a MAINTAINERS entry, and fix an incorrect DT binding for at24"
* tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: smbus: fix NULL function pointer dereference
MAINTAINERS: Drop entry for PCA9541 bus master selector
eeprom: at24: fix memory corruption race condition
dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
Merge tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- Single AMD driver fix for wake interrupt handling in clockstop mode
* tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: amd: fix for wake interrupt handling for clockstop mode
Wolfram Sang [Fri, 26 Apr 2024 06:44:08 +0000 (08:44 +0200)]
i2c: smbus: fix NULL function pointer dereference
Baruch reported an OOPS when using the designware controller as target
only. Target-only modes break the assumption of one transfer function
always being available. Fix this by always checking the pointer in
__i2c_transfer.
Reported-by: Baruch Siach <baruch@tkos.co.il> Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il Fixes: 4b1acc43331d ("i2c: core changes for slave support")
[wsa: dropped the simplification in core-smbus to avoid theoretical regressions] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Baruch Siach <baruch@tkos.co.il>
Merge tag 'soc-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"There are a lot of minor DT fixes for Mediatek, Rockchip, Qualcomm and
Microchip and NXP, addressing both build-time warnings and bugs found
during runtime testing.
Most of these changes are machine specific fixups, but there are a few
notable regressions that affect an entire SoC:
- The Qualcomm MSI support that was improved for 6.9 ended up being
wrong on some chips and now gets fixed.
- The i.MX8MP camera interface broke due to a typo and gets updated
again.
The main driver fix is also for Qualcomm platforms, rewriting an
interface in the QSEECOM firmware support that could lead to crashing
the kernel from a trusted application.
The only other code changes are minor fixes for Mediatek SoC drivers"
* tag 'soc-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits)
ARM: dts: imx6ull-tarragon: fix USB over-current polarity
soc: mediatek: mtk-socinfo: depends on CONFIG_SOC_BUS
soc: mediatek: mtk-svs: Append "-thermal" to thermal zone names
arm64: dts: imx8mp: Fix assigned-clocks for second CSI2
ARM: dts: microchip: at91-sama7g54_curiosity: Replace regulator-suspend-voltage with the valid property
ARM: dts: microchip: at91-sama7g5ek: Replace regulator-suspend-voltage with the valid property
arm64: dts: rockchip: Fix USB interface compatible string on kobol-helios64
arm64: dts: qcom: sc8180x: Fix ss_phy_irq for secondary USB controller
arm64: dts: qcom: sm8650: Fix the msi-map entries
arm64: dts: qcom: sm8550: Fix the msi-map entries
arm64: dts: qcom: sm8450: Fix the msi-map entries
arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
arm64: dts: qcom: x1e80100: Fix the compatible for cluster idle states
arm64: dts: qcom: Fix type of "wdog" IRQs for remoteprocs
arm64: dts: rockchip: regulator for sd needs to be always on for BPI-R2Pro
dt-bindings: rockchip: grf: Add missing type to 'pcie-phy' node
arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 2
arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 1
arm64: dts: rockchip: drop redundant pcie-reset-suspend in Scarlet Dumo
arm64: dts: rockchip: mark system power controller and fix typo on orangepi-5-plus
...
Jack Xiao [Wed, 24 Apr 2024 08:41:04 +0000 (16:41 +0800)]
drm/amdgpu/mes11: update ADD_QUEUE interface
Update ADD_QUEUE interface for mes11 to support
mes mapping legacy queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clear warning that uses uninitialized value fw_size.
Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Adjust registers sequence in the DIO list
This commit reorganizes the order in which some control registers are
presented to make it easier to identify the operations based on the
hardware doc.
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now we have two flags for contiguous VRAM buffer allocation.
If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's placement function.
This patch will change the default behaviour of the two flags.
When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
- This means contiguous is not mandatory.
- we will try to allocate the contiguous buffer. Say if the
allocation fails, we fallback to allocate the individual pages.
When we setTTM_PL_FLAG_CONTIGUOUS
- This means contiguous allocation is mandatory.
- we are setting this in amdgpu_bo_pin_restricted() before bo validation
and check this flag in the vram manager file.
- if this is set, we should allocate the buffer pages contiguously.
the allocation fails, we return -ENOSPC.
v2:
- keep the mem_flags and bo->flags check as is(Christian)
- place the TTM_PL_FLAG_CONTIGUOUS flag setting into the
amdgpu_bo_pin_restricted function placement range iteration
loop(Christian)
- rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block
(Christian)
- Keep the kernel BO allocation as is(Christain)
- If BO pin vram allocation failed, we need to return -ENOSPC as
RDMA cannot work with scattered VRAM pages(Philip)
v3(Christian):
- keep contiguous flag handling outside of pages_per_block
calculation
- remove the hacky implementation in contiguous flag error
handling code
v4(Christian):
- use any variable and return value for non-contiguous
fallback
v5: rebase to amd-staging-drm-next branch
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Tue, 16 Apr 2024 04:16:08 +0000 (22:16 -0600)]
drm/amd/display: Fix uninitialized variables in DC
This fixes 29 UNINIT issues reported by Coverity.
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Tue, 16 Apr 2024 01:05:34 +0000 (19:05 -0600)]
drm/amd/display: Fix uninitialized variables in DC
This fixes 49 UNINIT issues reported by Coverity.
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Tue, 16 Apr 2024 01:02:56 +0000 (19:02 -0600)]
drm/amd/display: Fix uninitialized variables in DM
This fixes 11 UNINIT issues reported by Coverity.
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Mon, 22 Apr 2024 18:02:17 +0000 (12:02 -0600)]
drm/amd/display: ASSERT when failing to find index by plane/stream id
[WHY]
find_disp_cfg_idx_by_plane_id and find_disp_cfg_idx_by_stream_id returns
an array index and they return -1 when not found; however, -1 is not a
valid index number.
[HOW]
When this happens, call ASSERT(), and return a positive number (which is
fewer than callers' array size) instead.
This fixes 4 OVERRUN and 2 NEGATIVE_RETURNS issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Mon, 22 Apr 2024 19:43:14 +0000 (13:43 -0600)]
drm/amd/display: Do not return negative stream id for array
[WHY]
resource_stream_to_stream_idx returns an array index and it return -1
when not found; however, -1 is not a valid array index number.
[HOW]
When this happens, call ASSERT(), and return a zero instead.
This fixes an OVERRUN and an NEGATIVE_RETURNS issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix overlapping copy within dml_core_mode_programming
[WHY]
&mode_lib->mp.Watermark and &locals->Watermark are
the same address. memcpy may lead to unexpected behavior.
[HOW]
memmove should be used.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Mon, 22 Apr 2024 19:52:27 +0000 (13:52 -0600)]
drm/amd/display: Skip finding free audio for unknown engine_id
[WHY]
ENGINE_ID_UNKNOWN = -1 and can not be used as an array index. Plus, it
also means it is uninitialized and does not need free audio.
[HOW]
Skip and return NULL.
This fixes 2 OVERRUN issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Tue, 23 Apr 2024 00:07:17 +0000 (18:07 -0600)]
drm/amd/display: Check pipe offset before setting vblank
pipe_ctx has a size of MAX_PIPES so checking its index before accessing
the array.
This fixes an OVERRUN issue reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lancelot SIX [Wed, 3 Apr 2024 09:21:24 +0000 (10:21 +0100)]
drm/amdkfd: Enable SQ watchpoint for gfx10
There are new control registers introduced in gfx10 used to configure
hardware watchpoints triggered by SMEM instructions:
SQ_WATCH{0,1,2,3}_{CNTL_ADDR_HI,ADDR_L}.
Those registers work in a similar way as the TCP_WATCH* registers
currently used for gfx9 and above.
This patch adds support to program the SQ_WATCH registers for gfx10.
The SQ_WATCH?_CNTL.MASK field has one bit more than
TCP_WATCH?_CNTL.MASK, so SQ watchpoints can have a finer granularity
than TCP_WATCH watchpoints. In this patch, we keep the capabilities
advertised to the debugger unchanged
(HSA_DBG_WATCH_ADDR_MASK_*_BIT_GFX10) as this reflects what both TCP and
SQ watchpoints can do and both watchpoints are configured together.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Thu, 18 Apr 2024 19:27:43 +0000 (13:27 -0600)]
drm/amd/display: Check index msg_id before read or write
[WHAT]
msg_id is used as an array index and it cannot be a negative value, and
therefore cannot be equal to MOD_HDCP_MESSAGE_ID_INVALID (-1).
[HOW]
Check whether msg_id is valid before reading and setting.
This fixes 4 OVERRUN issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Fix buffer size in gfx_v9_4_3_init_ cp_compute_microcode() and rlc_microcode()
The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating
about potential truncation of output when using the snprintf function.
The issue was due to the size of the buffer 'ucode_prefix' being too
small to accommodate the maximum possible length of the string being
written into it.
The string being written is "amdgpu/%s_mec.bin" or "amdgpu/%s_rlc.bin",
where %s is replaced by the value of 'chip_name'. The length of this
string without the %s is 16 characters. The warning message indicated
that 'chip_name' could be up to 29 characters long, resulting in a total
of 45 characters, which exceeds the buffer size of 30 characters.
To resolve this issue, the size of the 'ucode_prefix' buffer has been
reduced from 30 to 15. This ensures that the maximum possible length of
the string being written into the buffer will not exceed its size, thus
preventing potential buffer overflow and truncation issues.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: In function ‘gfx_v9_4_3_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
| ^~
......
439 | r = gfx_v9_4_3_init_rlc_microcode(adev, ucode_prefix);
| ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30
379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
| ^~
......
443 | r = gfx_v9_4_3_init_cp_compute_microcode(adev, ucode_prefix);
| ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30
413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 86301129698b ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0") Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add NULL pointer check for kzalloc
[Why & How]
Check return pointer of kzalloc before using it.
Reviewed-by: Alex Hung <alex.hung@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Fix ras mode2 reset failure in ras aca mode
Fix ras mode2 reset failure in ras aca mode.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_umc_bad_page_polling_timeout, the amdgpu_umc_handle_bad_pages
will be run many times so that double free err_addr in some special case.
So set the err_addr to NULL to avoid the warnings.
Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bob Zhou [Wed, 24 Apr 2024 07:24:05 +0000 (15:24 +0800)]
drm/amdgpu: add return result for amdgpu_i2c_{get/put}_byte
After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be
conducted to put wrong value.
So return and check the i2c transfer result.
Signed-off-by: Bob Zhou <bob.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bob Zhou [Tue, 23 Apr 2024 08:58:11 +0000 (16:58 +0800)]
drm/amdgpu: add error handle to avoid out-of-bounds
if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should
be stop to avoid out-of-bounds read, so directly return -EINVAL.
Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ma Jun [Mon, 22 Apr 2024 02:07:51 +0000 (10:07 +0800)]
drm/amdgpu: Initialize timestamp for some legacy SOCs
Initialize the interrupt timestamp for some legacy SOCs
to fix the coverity issue "Uninitialized scalar variable"
Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Queue buffer, though it is in system memory, has to be created using the
correct amdgpu device. Enforce this as the BO needs to mapped to the
GART for MES Hardware scheduler to access it.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc
Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001.
V2: To really improve the handling we would actually
need to have a separate value of 0xffffffff.(Christian)
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Remove unnecessary NULL check in dcn20_set_input_transfer_func
This commit removes an unnecessary NULL check in the
`dcn20_set_input_transfer_func` function in the `dcn20_hwseq.c` file.
The variable `tf` is assigned the address of
`plane_state->in_transfer_func` unconditionally, so it can never be
`NULL`. Therefore, the check `if (tf == NULL)` is unnecessary and has
been removed.
The plane_state->in_transfer_func itself cannot be NULL because it's a
structure, not a pointer. When we do tf =
&plane_state->in_transfer_func;, we're getting the address of that
structure, which will always be valid as long as plane_state itself is
not NULL.
we've already checked if plane_state is NULL with the line if (dpp_base
== NULL || plane_state == NULL) return false;. So, if the code execution
gets to the point where tf = &plane_state->in_transfer_func; is called,
plane_state is guaranteed to be not NULL, and therefore tf will also not
be NULL.
1101 bool result = true;
1102 bool use_degamma_ram = false;
1103
1104 if (dpp_base == NULL || plane_state == NULL)
1105 return false;
1106
1107 hws->funcs.set_shaper_3dlut(pipe_ctx, plane_state);
1108 hws->funcs.set_blend_lut(pipe_ctx, plane_state);
1109
1110 tf = &plane_state->in_transfer_func;
^^^^^
Before there was an if statement but now tf is assigned unconditionally
1111
--> 1112 if (tf == NULL) {
^^^^^^^^^^^^^^^^^
so these conditions are impossible.
Fixes the below Smatch static checker warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1112 dcn20_set_input_transfer_func() warn: address of 'plane_state->in_transfer_func' is non-NULL
Fixes: 285a7054bf81 ("drm/amd/display: Remove plane and stream pointers from dc scratch") Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
According to [1]:
```
DTN only logs 'pipe_count' instances of MPCC. However in some cases
there are different number of MPCC than DPP (pipe_count).
```
As DTN log still relies on pipe_count to print mpcc state, switch to
mpcc_count in all occurrences.
Subtract the VRAM pinned memory when checking for available memory
in amdgpu_amdkfd_reserve_mem_limit function since that memory is not
available for use.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Thu, 11 Apr 2024 19:43:06 +0000 (01:13 +0530)]
drm/amdgpu: update jpeg max decode resolution
jpeg ip version v2.1 and higher supports 16kx16k resolution decode
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jose Fernandez [Mon, 22 Apr 2024 14:35:44 +0000 (08:35 -0600)]
drm/amd/display: Fix division by zero in setup_dsc_config
When slice_height is 0, the division by slice_height in the calculation
of the number of slices will cause a division by zero driver crash. This
leaves the kernel in a state that requires a reboot. This patch adds a
check to avoid the division by zero.
The stack trace below is for the 6.8.4 Kernel. I reproduced the issue on
a Z16 Gen 2 Lenovo Thinkpad with a Apple Studio Display monitor
connected via Thunderbolt. The amdgpu driver crashed with this exception
when I rebooted the system with the monitor connected.
After applying this patch, the driver no longer crashes when the monitor
is connected and the system is rebooted. I believe this is the same
issue reported for 3113.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Jose Fernandez <josef@netflix.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3113 Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: add ip dump for each ip in devcoredump
Add ip dump for each ip of the asic in the
devcoredump for all the ips where a callback
is registered for register dump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: dump ip state before reset for each ip
Invoke the dump_ip_state function for each ip before
the asic resets and save the register values for
debugging via devcoredump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support to print ip information to be
used to print registers in devcoredump
buffer.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the protoype for print ip state to be used
to print the registers in devcoredump during
a gpu reset.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adding gfx10 gc registers to be used for register
dump via devcoredump during a gpu reset.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the prototype to dump ip registers
for all ips of different asics and set
them to NULL for now. Based on the
requirement add a function pointer for
each of them.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Fri, 29 Mar 2024 06:42:59 +0000 (14:42 +0800)]
drm/amdgpu: Add interface to reserve bad page
Add interface to reserve bad page.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>