Dave Airlie [Tue, 12 Jul 2022 05:50:41 +0000 (15:50 +1000)]
Merge tag 'drm/tegra/for-5.20-rc1' of https://gitlab.freedesktop.org/drm/tegra into drm-next
drm/tegra: Changes for v5.20-rc1
The bulk of these changes adds support for context isolation for the
various supported host1x engines, as well as support for the hardware
found on the new Tegra234 SoC generation.
There's also a couple of fixes and cleanups. To round things off, the
device tree bindings are converted to the new json-schema format that
allows DTBs to be validated.
Dave Airlie [Tue, 12 Jul 2022 01:07:30 +0000 (11:07 +1000)]
Merge tag 'amd-drm-next-5.20-2022-07-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-05:
amdgpu:
- Various spelling and grammer fixes
- Various eDP fixes
- Various DMCUB fixes
- VCN fixes
- GMC 11 fixes
- RAS fixes
- TMZ support for GC 10.3.7
- GPUVM TLB flush fixes
- SMU 13.0.x updates
- DCN 3.2 Support
- DCN 3.2.1 Support
- MES updates
- GFX11 modifiers support
- USB-C fixes
- MMHUB 3.0.1 support
- SDMA 6.0 doorbell fixes
- Initial devcoredump support
- Enable high priority gfx queue on asics which support it
- Enable GPU reset for SMU 13.0.4
- OLED display fixes
- MPO fixes
- DC frame size fixes
- ASPM support for PCIE 7.4/7.6
- GPU reset support for SMU 13.0.0
- GFX11 updates
- VCN JPEG fix
- BACO support for SMU 13.0.7
- VCN instance handling fix
- GFX8 GPUVM TLB flush fix
- GPU reset rework
- VCN 4.0.2 support
- GTT size fixes
- DP link training fixes
- LSDMA 6.0.1 support
- Various backlight fixes
- Color encoding fixes
- Backlight config cleanup
- VCN 4.x unified queue cleanup
amdkfd:
- MMU notifier fixes
- Updates for GC 10.3.6 and 10.3.7
- P2P DMA support using dma-buf
- Add available memory IOCTL
- SDMA 6.0.1 fix
- MES fixes
- HMM profiler support
radeon:
- License fix
- Backlight config cleanup
UAPI:
- Add available memory IOCTL to amdkfd
Proposed userspace: https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html
- HMM profiler support for amdkfd
Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080805.html
drm/tegra: vic: Use devm_platform_ioremap_resource()
Use the devm_platform_ioremap_resource() helper instead of calling
platform_get_resource() and devm_ioremap_resource() separately. Make the
code simpler without functional changes.
Mikko Perttunen [Fri, 8 Jul 2022 15:18:02 +0000 (18:18 +0300)]
gpu: host1x: Generalize host1x_cdma_push_wide()
host1x_cdma_push_wide() had the assumptions that the last parameter word
was a NOP opcode, and that NOP opcodes could be used in all situations.
Neither are true with the new job opcode sequence, so adjust the
function to not have these assumptions, and instead place an early
RESTART opcode when necessary to jump back to the beginning of the
pushbuffer.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Fri, 8 Jul 2022 15:18:01 +0000 (18:18 +0300)]
gpu: host1x: Initialize syncval in channel_submit()
During the refactoring of channel_submit(), assignment of syncval was
moved but it is also used in channel_submit(). Add this assignment back
to channel_submit() as well.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Robin Murphy [Mon, 11 Apr 2022 19:49:10 +0000 (20:49 +0100)]
drm/tegra: Include DMA API header where used
Even though the IOVA API never actually needed it, iova.h is still
carrying an include of dma-mapping.h, now solely for the sake of not
breaking tegra-drm. Fix that properly.
Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Dmitry Osipenko [Tue, 28 Jun 2022 22:42:39 +0000 (01:42 +0300)]
drm/tegra: Fix vmapping of prime buffers
The code assumes that Tegra GEM is permanently vmapped, which is not
true for the scattered buffers. After converting Tegra video decoder
driver to V4L API, we're now getting a BUG_ON from dma-buf core on playing
video using libvdpau-tegra on T30+ because tegra_gem_prime_vmap() sets
vaddr to NULL. Older pre-V4L video decoder driver wasn't vmapping dma-bufs.
Fix it by actually vmapping the exported GEMs.
YueHaibing [Sat, 5 Mar 2022 12:32:00 +0000 (20:32 +0800)]
drm/tegra: vic: Fix build warning when CONFIG_PM=n
drivers/gpu/drm/tegra/vic.c:326:12: error: ‘vic_runtime_suspend’ defined but not used [-Werror=unused-function]
static int vic_runtime_suspend(struct device *dev)
^~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/tegra/vic.c:292:12: error: ‘vic_runtime_resume’ defined but not used [-Werror=unused-function]
static int vic_runtime_resume(struct device *dev)
^~~~~~~~~~~~~~~~~~
Robin Murphy [Thu, 7 Jul 2022 17:30:44 +0000 (18:30 +0100)]
gpu: host1x: Register context bus unconditionally
Conditional registration is a problem for other subsystems which may
unwittingly try to interact with host1x_context_device_bus_type in an
uninitialised state on non-Tegra platforms. A look under /sys/bus on a
typical system already reveals plenty of entries from enabled but
otherwise irrelevant configs, so lets keep things simple and register
our context bus unconditionally too.
Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:20:07 +0000 (17:20 +0300)]
gpu: host1x: Use RESTART_W to skip timed out jobs on Tegra186+
When MLOCK enforcement is enabled, the 0-word write currently done
is rejected by the hardware outside of an MLOCK region. As such,
on these chips, which also have the newer, more convenient RESTART_W
opcode, use that instead to skip over the timed out job.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:20:06 +0000 (17:20 +0300)]
gpu: host1x: Add MLOCK release code on Tegra234
With the full-featured opcode sequence using MLOCKs, we need to also
unlock those MLOCKs in the event of a timeout. However, it turns out
that on Tegra186/Tegra194, by default, we don't need to do this;
furthermore, on Tegra234 it is much simpler to do; so only implement
this on Tegra234 for the time being.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:20:05 +0000 (17:20 +0300)]
gpu: host1x: Rewrite job opcode sequence
For new (Tegra186+) SoCs, use a new ('full-featured') job opcode
sequence that is compatible with virtualization. In particular,
the Host1x hardware in Tegra234 is more strict regarding the sequence,
requiring ACQUIRE_MLOCK-SETCLASS-SETSTREAMID opcodes to occur in
that sequence without gaps (except for SETPAYLOAD), so let's do it
properly in one go now.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:20:03 +0000 (17:20 +0300)]
gpu: host1x: Program interrupt destinations on Tegra234
On Tegra234, each Host1x VM has 8 interrupt lines. Each syncpoint
can be configured with which interrupt line should be used for
threshold interrupt, allowing for load balancing.
For now, to keep backwards compatibility, just set all syncpoints
to the first interrupt.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:19:59 +0000 (17:19 +0300)]
gpu: host1x: Deduplicate hardware headers
Host1x class information and opcodes are unchanged or backwards
compatible across SoCs so let's not duplicate them for each one
but have them in a shared header file.
At the same time, add opcode functions for acquire/release_mlock.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:19:54 +0000 (17:19 +0300)]
drm/tegra: Implement stream ID related callbacks on engines
Implement the get_streamid_offset and can_use_memory_ctx callbacks
required for supporting context isolation. Since old firmware on VIC
cannot support context isolation without hacks that we don't want to
implement, check the firmware binary to see if context isolation
should be enabled.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:19:53 +0000 (17:19 +0300)]
drm/tegra: Support context isolation
For engines that support context isolation, allocate a context when
opening a channel, and set up stream ID offset and context fields
when submitting a job.
As of this commit, the stream ID offset and fallback stream ID
are not used when context isolation is disabled. However, with
upcoming patches that enable a full featured job opcode sequence,
these will be necessary.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:19:52 +0000 (17:19 +0300)]
drm/tegra: nvdec: Fix TRANSCFG register offset
NVDEC's TRANSCFG register is at a different offset than VIC.
This becomes a problem now when context isolation is enabled and
the reset value of the register is no longer sufficient.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:19:51 +0000 (17:19 +0300)]
drm/tegra: falcon: Set DMACTX field on DMA transactions
The DMACTX field determines which context, as specified in the
TRANSCFG register, is used. While during boot it doesn't matter
which is used, later on it matters and this value is reused by
the firmware.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Mikko Perttunen [Mon, 27 Jun 2022 14:19:49 +0000 (17:19 +0300)]
gpu: host1x: Program context stream ID on submission
Add code to do stream ID switching at the beginning of a job. The
stream ID is switched to the stream ID specified by the context
passed in the job structure.
Before switching the stream ID, an OP_DONE wait is done on the
channel's engine to ensure that there is no residual ongoing
work that might do DMA using the new stream ID.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Also, fix the order of descriptions for VM/hypervisor
register apertures -- while the reg-names specification
was correct, the descriptions for these were switched.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
Lukas Bulwahn [Thu, 23 Jun 2022 09:54:52 +0000 (11:54 +0200)]
MAINTAINERS: Rectify entry for NVIDIA TEGRA DRM and VIDEO DRIVER
Commit fd27de58b0ad ("dt-bindings: display: tegra: Convert to json-schema")
converts nvidia,tegra20-host1x.txt to yaml, but missed to adjust its
references in MAINTAINERS.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference.
Repair these file references in NVIDIA TEGRA DRM and VIDEO DRIVER.
Randy Dunlap [Wed, 6 Jul 2022 18:42:24 +0000 (11:42 -0700)]
drm: xlnx: add <linux/io.h> for readl/writel
Add a header file to prevent build errors:
../drivers/gpu/drm/xlnx/zynqmp_dp.c: In function ‘zynqmp_dp_write’:
../drivers/gpu/drm/xlnx/zynqmp_dp.c:335:9: error: implicit declaration of function ‘writel’ [-Werror=implicit-function-declaration]
335 | writel(val, dp->iomem + offset);
../drivers/gpu/drm/xlnx/zynqmp_dp.c: In function ‘zynqmp_dp_read’:
../drivers/gpu/drm/xlnx/zynqmp_dp.c:340:16: error: implicit declaration of function ‘readl’ [-Werror=implicit-function-declaration]
340 | return readl(dp->iomem + offset);
Fixes: a204f9743b68 ("drm: Remove linux/i2c.h from drm_crtc.h") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Hyun Kwon <hyun.kwon@xilinx.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Michal Simek <michal.simek@xilinx.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220706184224.29116-1-rdunlap@infradead.org
Ville Syrjälä [Thu, 30 Jun 2022 15:06:00 +0000 (18:06 +0300)]
drm/i915: Nuke PCH_JSP
JSP is based on ICP and we don't really need to differentiate
between the two. So let's just delcare JSP to be ICP.
The only slight change here is for Wa_14011294188 which we
used to apply for JSP but now we'll only apply to MCC. This
should be fine since the issue being dealt with was introduced
in TGP and inherited into MCC. JSP being derived from ICP
should not need this workaround.
With LVDS dual link, up to 160MHz mode clock rate is supported.
With LVDS single link, up to 80MHz mode clock rate is supported.
Fix mode clock rate validation by swapping the maximum mode clock
rates of the two link modes.
Fixes: 463db5c2ed4a ("drm: bridge: ldb: Implement simple Freescale i.MX8MP LDB bridge") Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Robert Foss <robert.foss@linaro.org> Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com> Cc: Jonas Karlman <jonas@kwiboo.se> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Marek Vasut <marex@denx.de> Cc: NXP Linux Team <linux-imx@nxp.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220701065634.4027537-2-victor.liu@nxp.com
Move hpd polling check into wait_hpd_asserted() callback. For the cases
that aux transfer function wasn't used, do hpd polling check after pm
runtime resume, which will power on the bridge.
Hsin-Yi Wang [Wed, 6 Jul 2022 12:52:53 +0000 (20:52 +0800)]
drm/bridge: anx7625: Fix NULL pointer crash when using edp-panel
Move devm_of_dp_aux_populate_ep_devices() after pm runtime and i2c setup
to avoid NULL pointer crash.
edp-panel probe (generic_edp_panel_probe) calls pm_runtime_get_sync() to
read EDID. At this time, bridge should have pm runtime enabled and i2c
clients ready.
Hsin-Yi Wang [Wed, 6 Jul 2022 12:52:52 +0000 (20:52 +0800)]
drm/bridge: anx7625: use pm_runtime_force_suspend(resume)
There's no need to check for IRQ or disable it in suspend.
Use pm_runtime_force_suspend(resume) to make sure anx7625 is powered off
correctly. Make the system suspend/resume and pm runtime suspend/resume
more consistent.
Ville Syrjälä [Thu, 30 Jun 2022 19:51:13 +0000 (22:51 +0300)]
drm: Remove linux/media-bus-format.h from drm_crtc.h
drm_crtc.h has no need for linux/media-bus-format.h, so don't
include it. Avoids useless rebuilds of the entire universe when
touching linux/media-bus-format.h.
Quite a few placs do currently depend on linux/media-bus-format.h
without actually including it directly. All of those need to be
fixed up.
v2: Deal with ingenic as well
v3: Fix up mxsfb and remaining parts of imx
Ville Syrjälä [Thu, 30 Jun 2022 19:51:12 +0000 (22:51 +0300)]
drm: Remove linux/fb.h from drm_crtc.h
drm_crtc.h has no need for linux/fb.h, so don't include it.
Avoids useless rebuilds of the entire universe when
touching linux/fb.h.
Quite a few placs do currently depend on linux/fb.h or other
headers pulled in by it without actually including any of it
directly. All of those need to be fixed up.
Ville Syrjälä [Thu, 30 Jun 2022 19:51:11 +0000 (22:51 +0300)]
drm/vmwgfx: Stop using 'TRUE'
Stop using the 'TRUE' define. This ultimately gets defined by
acpi/actypes.h that gets included here via a convoluted chain of
other headers. drm_crtc.h is part of that chain, and I'm trying
to eliminate all unnecessary includes from it to avoid pointless
rebuilds.
v2: Split out from the bigger patch
Cc: Zack Rusin <zackr@vmware.com> Cc: VMware Graphics Reviewers <linux-graphics-maintainer@vmware.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220630195114.17407-2-ville.syrjala@linux.intel.com Reviewed-by: Zack Rusin <zackr@vmware.com<mailto:zackr@vmware.com>>
Hangyu Hua [Wed, 18 May 2022 06:58:56 +0000 (14:58 +0800)]
drm: bridge: sii8620: fix possible off-by-one
The next call to sii8620_burst_get_tx_buf will result in off-by-one
When ctx->burst.tx_count + size == ARRAY_SIZE(ctx->burst.tx_buf). The same
thing happens in sii8620_burst_get_rx_buf.
This patch also change tx_count and tx_buf to rx_count and rx_buf in
sii8620_burst_get_rx_buf. It is unreasonable to check tx_buf's size and
use rx_buf.
Paul Cercueil [Sat, 2 Jul 2022 23:07:27 +0000 (00:07 +0100)]
drm/ingenic: Use the highest possible DMA burst size
Until now, when running at the maximum resolution of 1280x720 at 32bpp
on the JZ4770 SoC the output was garbled, the X/Y position of the
top-left corner of the framebuffer warping to a random position with
the whole image being offset accordingly, every time a new frame was
being submitted.
This problem can be eliminated by using a bigger burst size for the DMA.
Set in each soc_info structure the maximum burst size supported by the
corresponding SoC, and use it in the driver.
Set the new value using regmap_update_bits() instead of
regmap_set_bits(), since we do want to override the old value of the
burst size. (Note that regmap_set_bits() wasn't really valid before for
the same reason, but it never seemed to be a problem).
Add binding for the Emerging Display Technology ETML0700Y5DHA panel.
It is a 7" WSVGA (1024x600) TFT LCD panel with:
- LVDS data interface,
- backlight and
- capacitive touch.
Anton Bambura [Sun, 29 May 2022 18:05:46 +0000 (21:05 +0300)]
dt-bindings: sharp,lq101r1sx01: Add compatible for LQ101R1SX03
LQ101R1SX03 is compatible with LQ101R1SX01 from software perspective,
document it. The LQ101R1SX03 is a newer revision of LQ101R1SX01, it has
minor differences in hardware pins in comparison to the older version.
The newer version of the panel can be found on Android tablets, like
ASUS TF701T.
Brian Norris [Sat, 18 Jun 2022 00:26:52 +0000 (17:26 -0700)]
drm/rockchip: vop: Don't crash for invalid duplicate_state()
It's possible for users to try to duplicate the CRTC state even when the
state doesn't exist. drm_atomic_helper_crtc_duplicate_state() (and other
users of __drm_atomic_helper_crtc_duplicate_state()) already guard this
with a WARN_ON() instead of crashing, so let's do that here too.
Geert Uytterhoeven [Fri, 24 Jun 2022 12:10:51 +0000 (14:10 +0200)]
drm/bridge: imx: i.MX8 bridge drivers should depend on ARCH_MXC
The various Freescale i.MX8 display bridges are only present on
Freescale i.MX8 SoCs. Hence add a dependency on ARCH_MXC, to prevent
asking the user about these drivers when configuring a kernel without
i.MX SoC support.
Fixes: e60c4354840b2fe8 ("drm/bridge: imx: Add LDB support for i.MX8qm") Fixes: 3818715f62b42b5c ("drm/bridge: imx: Add LDB support for i.MX8qxp") Fixes: 96988a526c97cfbe ("drm/bridge: imx: Add i.MX8qxp pixel link to DPI support") Fixes: 1ec17c26bc06289d ("drm/bridge: imx: Add i.MX8qm/qxp display pixel link support") Fixes: 93e163a9e0392aca ("drm/bridge: imx: Add i.MX8qm/qxp pixel combiner support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/42c542b53a1c8027b23a045045fbb7b34479913d.1656072500.git.geert+renesas@glider.be
Dave Airlie [Fri, 1 Jul 2022 04:14:52 +0000 (14:14 +1000)]
Merge tag 'drm-intel-gt-next-2022-06-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Document memory residency and Flat-CCS capability of obj (Ramalingam C)
- Disable GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK on Xe_HP+ (Matt Roper)
Cross-subsystem Changes:
- Rename intel-gtt symbols (Lucas De Marchi)
Core Changes:
Driver Changes:
- Support programming the EU priority in the GuC descriptor (DG2) (Matthew Brost)
- DG2 HuC loading support (Daniele Ceraolo Spurio)
- Fix build error without CONFIG_PM (YueHaibing)
- Enable THP on Icelake and beyond (Tvrtko Ursulin)
- Only setup private tmpfs mount when needed and fix logging (Tvrtko Ursulin)
- Make __guc_reset_context aware of guilty engines (Umesh Nerlige Ramappa)
- DG2 small bar memory probing fixes (Nirmoy Das)
- Remove unnecessary GuC err capture noise (Alan Previn)
- Fix i915_gem_object_ggtt_pin_ww regression on old platforms (Maarten Lankhorst)
- Fix undefined behavior in GuC backend due to shift overflowing the constant (Borislav Petkov)
- New DG2 workarounds (Swathi Dhanavanthri, Anshuman Gupta)
- Report no hwconfig support on ADL-N (Balasubramani Vivekanandan)
- Fix error_state_read ptr + offset use (Alan Previn)
- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Fix memory leaks in per-gt sysfs (Ashutosh Dixit)
- Fix dma_resv fence handling in multi-batch execbuf (Nirmoy Das)
- Add extra registers to GPU error dump on Gen11+ (Stuart Summers)
- More PVC+DG2 workarounds (Matt Roper)
- Improve user experience and driver robustness under SIGINT or similar (Tvrtko Ursulin)
- Don't show engine classes not present (Tvrtko Ursulin)
- Improve on suspend / resume time with VT-d enabled (Thomas Hellström)
- Add missing else (katrinzhou)
- Don't leak lmem mapping in vma_evict (Juha-Pekka Heikkila)
- Add smem fallback allocation for dpt (Juha-Pekka Heikkila)
- Tweak the ordering in cpu_write_needs_clflush (Matthew Auld)
- Do not access rq->engine without a reference (Niranjana Vishwanathapura)
- Revert "drm/i915: Hold reference to intel_context over life of i915_request" (Niranjana Vishwanathapura)
- Don't update engine busyness stats too frequently (Alan Previn)
- Add additional steps for Wa_22011802037 for execlist backend (Umesh Nerlige Ramappa)
- Fix a lockdep warning at error capture (Nirmoy Das)
- Ponte Vecchio prep work and new blitter engines (Matt Roper, John Harrison, Lucas De Marchi)
- Read correct RP_STATE_CAP register (PVC) (Matt Roper)
- Define MOCS table for PVC (Ayaz A Siddiqui)
- Driver refactor and support Ponte Vecchio forcewake handling (Matt Roper)
- Remove additional 3D flags from PIPE_CONTROL (Ponte Vecchio) (Stuart Summers)
- XEHPSDV and PVC do not use HuC (Daniele Ceraolo Spurio)
- Extract stepping information from PCI revid (Ponte Vecchio) (Matt Roper)
- Add initial PVC workarounds (Stuart Summers)
- SSEU handling driver refactor and Ponte Vecchio support (Matt Roper)
- GuC depriv applies to PVC (Matt Roper)
- Add register steering (Ponte Vecchio) (Matt Roper)
- Add recommended MMIO setting (Ponte Vecchio) (Matt Roper)
- Move multicast register handling to a dedicated file (Matt Roper)
- Cleanup interface for MCR operations (Matt Roper)
- Extend i915_vma_pin_iomap() (CQ Tang)
- Re-do the intel-gtt split (Lucas De Marchi)
- Correct duplicated/misplaced GT register definitions (Matt Roper)
- Prefer "XEHP_" prefix for registers (Matt Roper)
- Don't use DRM_DEBUG_WARN_ON for unexpected l3bank/mslice config (Tvrtko Ursulin)
- Don't use DRM_DEBUG_WARN_ON for ring unexpectedly not idle (Tvrtko Ursulin)
- Make drop_pages() return bool (Lucas De Marchi)
- Fix CFI violation with show_dynamic_id() (Nathan Chancellor)
- Use i915_probe_error instead of drm_error in GuC code (Vinay Belgaumkar)
- Fix use of static in macro mismatch (Andi Shyti)
- Update tiled blits selftest (Bommu Krishnaiah)
- Future-proof platform checks (Matt Roper)
- Only include what's needed (Jani Nikula)
- remove accidental static from a local variable (Jani Nikula)
- Add global forcewake request to drpc (Vinay Belgaumkar)
- Fix spelling typo in comment (pengfuyuan)
- Increase timeout for live_parallel_switch selftest (Akeem G Abodunrin)
- Use non-blocking H2G for waitboost (Vinay Belgaumkar)
Rodrigo Siqueira [Thu, 30 Jun 2022 18:46:20 +0000 (14:46 -0400)]
drm/amd/display: Fix __muldf3 undefined for 32 bit compilation
Sometimes when trying to enable some feature, we have to define some
values with educated guesses, but we mark those values as TBD, which
means "To Be Determined". However, the correct way to approach it is by
loading that information from the firmware. Anyway, some of the values
that we were experimenting with caused this issue:
This was caused because we were trying to assign an unsigned int to a
double value which causes issues for 32-bit architecture. This issue can
be fixed by changing the value type.
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Randy Dunlap <rdunlap@infradead.org> Fixes: 265280b99822 ("drm/amd/display: add CLKMGR changes for DCN32/321") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 19 Jan 2022 17:57:26 +0000 (12:57 -0500)]
drm/amdkfd: Bump KFD API version for SMI profiling event
Indicate SMI profiling events available.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Fri, 8 Apr 2022 14:25:11 +0000 (10:25 -0400)]
drm/amdkfd: Asynchronously free smi_client
The synchronize_rcu may take several ms, which noticeably slows down
applications close SMI event handle. Use call_rcu to free client->fifo
and client asynchronously and eliminate the synchronize_rcu call in the
user thread.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jan 2022 21:43:42 +0000 (16:43 -0500)]
drm/amdkfd: Add unmap from GPU SMI event
SVM range unmapped from GPUs when range is unmapped from CPU, or with
xnack on from MMU notifier when range is evicted or migrated.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Fri, 14 Jan 2022 02:24:20 +0000 (21:24 -0500)]
drm/amdkfd: Add user queue eviction restore SMI event
Output user queue eviction and restore event. User queue eviction may be
triggered by svm or userptr MMU notifier, TTM eviction, device suspend
and CRIU checkpoint and restore.
User queue restore may be rescheduled if eviction happens again while
restore.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Fri, 14 Jan 2022 00:28:13 +0000 (19:28 -0500)]
drm/amdkfd: Add migration SMI event
For migration start and end event, output timestamp when migration
starts, ends, svm range address and size, GPU id of migration source and
destination and svm range attributes,
Migration trigger could be prefetch, CPU or GPU page fault and TTM
eviction.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Fri, 14 Jan 2022 00:22:54 +0000 (19:22 -0500)]
drm/amdkfd: Add GPU recoverable fault SMI event
Use ktime_get_boottime_ns() as timestamp to correlate with other
APIs. Output timestamp when GPU recoverable fault starts and ends to
recover the fault, if migration happened or only GPU page table is
updated to recover, fault address, if read or write fault.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 13 Jan 2022 23:59:02 +0000 (18:59 -0500)]
drm/amdkfd: Enable per process SMI event
Process receive event from same process by default. Add a flag to be
able to receive event from all processes, this requires super user
permission.
Event using pid 0 to send the event to all processes, to keep the
default behavior of existing SMI events.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Thu, 20 Jan 2022 20:06:30 +0000 (15:06 -0500)]
drm/amdkfd: Add KFD SMI event IDs and triggers
Define new system management interface event IDs for migration, GPU
recoverable page fault, user queues eviction, restore and unmap from
GPU events and corresponding event triggers, those will be implemented
in the following patches.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Mon, 27 Jun 2022 01:35:10 +0000 (21:35 -0400)]
drm/amdkfd: fix cu mask for asics with wgps
GFX10 and up have work group processors (WGP) and WGP mode is the native
compile mode.
KFD and ROCr have no visibility into whether a dispatch is operating
in CU or WGP mode.
Enforce CU masking to be pairwise continguous in enablement and
round robin distribute CUs across the SEs in a pairwise manner to
assume WGP mode at all times.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
José Roberto de Souza [Wed, 29 Jun 2022 13:47:21 +0000 (06:47 -0700)]
drm/i915: Drain freed object after suspend display
Display is turned off by i915_drm_suspend() during the suspend
procedure, removing the last reference of some gem objects that were
used by display.
The issue is that those objects are only actually freed when
mm.free_work executed and that can happen very late in the suspend
process causing issues.
So here draining all freed objects released by display fixing suspend
issues.
Jani Nikula [Tue, 21 Jun 2022 12:37:32 +0000 (15:37 +0300)]
drm/i915/bios: debug log ddi port info after parsing
The ddc pin and aux channel sanitization may disable DVI/HDMI and DP,
respectively, of ports parsed earlier, in "last one wins" fashion. With
parsing and printing interleaved, we'll end up logging support first and
disabling later anyway.
Now that we've split ddi port info parsing and printing, take it further
by doing the printing in a separate loop, fixing the logging.
Note that this also changes the logging order from VBT child device
order to port number order.
Jani Nikula [Wed, 29 Jun 2022 09:27:54 +0000 (12:27 +0300)]
drm/edid: add HF-EEODB support to EDID read and allocation
HDMI 2.1 section 10.3.6 defines an HDMI Forum EDID Extension Override
Data Block, which may contain a different extension count than the base
block claims. Add support for reading more EDID data if available. The
extra blocks aren't parsed yet, though.
Hard-coding the EEODB parsing instead of using the iterators we have is
a bit of a bummer, but we have to be able to do this on a partially
allocated EDID while reading it.
v2:
- Check for CEA Data Block Collection size (Ville)
- Amend commit message and comment about hard-coded parsing
Jani Nikula [Wed, 29 Jun 2022 09:27:53 +0000 (12:27 +0300)]
drm/edid: do invalid block filtering in-place
Rewrite edid_filter_invalid_blocks() to filter invalid blocks
in-place. The main motivation is to not rely on passed in information on
invalid block count or the allocation size, which will be helpful in
follow-up work on HF-EEODB.
Jani Nikula [Wed, 29 Jun 2022 09:27:52 +0000 (12:27 +0300)]
drm/edid: add drm_edid_raw() to access the raw EDID data
Unfortunately, there are still plenty of interfaces around that require
a struct edid pointer, and it's impossible to change them all at
once. Add an accessor to the raw EDID data to help the transition.
While there are no such cases now, be defensive against raw EDID
extension count indicating bigger EDID than is actually allocated.
Add a helper function to be used as the "default" .get_modes()
hook. This also works as an example of what the driver .get_modes()
hooks are supposed to do regarding the new drm_edid_read*() and
drm_edid_connector_update() calls.
Jani Nikula [Wed, 29 Jun 2022 09:27:50 +0000 (12:27 +0300)]
drm/edid: add drm_edid_connector_update()
Add a new function drm_edid_connector_update() to replace the
combination of calls drm_connector_update_edid_property() and
drm_add_edid_modes(). Usually they are called in the drivers in this
order, however the former needs information from the latter.
Since the new drm_edid_read*() functions no longer call the connector
updates directly, and the read and update are separated, we'll need this
new function for the connector update.
This is all in drm_edid.c simply to keep struct drm_edid opaque.
v2:
- Share code with drm_connector_update_edid_property() (Ville)
- Add comment about override EDID handling
Add functions drm_edid_override_set() and drm_edid_override_reset() to
support "edid_override" connector debugfs, and to hide the details about
it in drm_edid.c. No functional changes at this time.
Also note in the connector.override_edid flag kernel-doc that this is
only supposed to be modified by the code doing debugfs EDID override
handling. Currently, it is still being modified by amdgpu in
create_eml_sink() and handle_edid_mgmt() for reasons unknown. This was
added in commit 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
and later moved to amdgpu_dm.c in commit e7b07ceef2a6 ("drm/amd/display:
Merge amdgpu_dm_types and amdgpu_dm").
Jani Nikula [Wed, 29 Jun 2022 09:27:46 +0000 (12:27 +0300)]
drm/edid: move drm_connector_update_edid_property() to drm_edid.c
The function needs access to drm_edid.c internals more than
drm_connector.c. We can make drm_reset_display_info(),
drm_add_display_info() and drm_update_tile_info() static. There will be
more benefits with follow-up struct drm_edid refactoring.
Alex Deucher [Thu, 23 Jun 2022 16:37:40 +0000 (12:37 -0400)]
drm/amdgpu: fix documentation warning
Fixes this issue:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5094: warning: expecting prototype for amdgpu_device_gpu_recover_imp(). Prototype was for amdgpu_device_gpu_recover() instead
Fixes: cf727044144d ("drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover") Reviewed-by: Kent Russell <kent.russell@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lucas De Marchi [Tue, 28 Jun 2022 19:10:16 +0000 (12:10 -0700)]
iosys-map: Add per-word write
Like was done for read, provide the equivalent for write. Even if
current users are not in the hot path, this should future-proof it.
v2:
- Remove default from _Generic() - callers wanting to write more
than u64 should use iosys_map_memcpy_to()
- Add WRITE_ONCE() cases dereferencing the pointer when using system
memory
v3:
- Fix precedence issue when casting inside WRITE_ONCE(). By not using ()
around vaddr__ the offset was not part of the cast, but rather added
to it, producing a wrong address
- Remove compiletime_assert() as WRITE_ONCE() already contains it
Lucas De Marchi [Tue, 28 Jun 2022 19:10:15 +0000 (12:10 -0700)]
iosys-map: Add per-word read
Instead of always falling back to memcpy_fromio() for any size, prefer
using read{b,w,l}(). When reading struct members it's common to read
individual integer variables individually. Going through memcpy_fromio()
for each of them poses a high penalty.
Employ a similar trick as __seqprop() by using _Generic() to generate
only the specific call based on a type-compatible variable.
For a pariticular i915 workload producing GPU context switches,
__get_engine_usage_record() is particularly hot since the engine usage
is read from device local memory with dgfx, possibly multiple times
since it's racy. Test execution time for this test shows a ~12.5%
improvement with DG2:
Before:
nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
median = 9.52548e+06; var = 526149;
After:
nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
median = 8.33955e+06; var = 333113;
Other things attempted that didn't prove very useful:
1) Change the _Generic() on x86 to just dereference the memory address
2) Change __get_engine_usage_record() to do just 1 read per loop,
comparing with the previous value read
3) Change __get_engine_usage_record() to access the fields directly as it
was before the conversion to iosys-map
(3) did gave a small improvement (~3%), but doesn't seem to scale well
to other similar cases in the driver.
Additional test by Chris Wilson using gem_create from igt with some
changes to track object creation time. This happens to accidentally
stress this code path:
Pre iosys_map conversion of engine busyness:
lmem0: Creating 262144 4KiB objects took 59274.2ms
Unpatched:
lmem0: Creating 262144 4KiB objects took 108830.2ms
With readl (this patch):
lmem0: Creating 262144 4KiB objects took 61348.6ms
s/readl/READ_ONCE/
lmem0: Creating 262144 4KiB objects took 61333.2ms
So we do take a little bit more time than before the conversion, but
that is due to other factors: bringing the READ_ONCE back would be as
good as just doing this conversion.
v2:
- Remove default from _Generic() - callers wanting to read more
than u64 should use iosys_map_memcpy_from()
- Add READ_ONCE() cases dereferencing the pointer when using system
memory
v3:
- Fix precedence issue when casting inside READ_ONCE(). By not using ()
around vaddr__ the offset was not part of the cast, but rather added
to it, producing a wrong address
- Remove compiletime_assert() as READ_ONCE() already contains it