Thomas Zimmermann [Thu, 6 Apr 2023 13:21:09 +0000 (15:21 +0200)]
video/aperture: Provide a VGA helper for gma500 and internal use
The hardware for gma500 is different from the rest, as it uses stolen
framebuffer memory that is not available via PCI BAR. The regular PCI
removal helper cannot detect the framebuffer, while the non-PCI helper
misses possible conflicting VGA devices (i.e., a framebuffer or text
console).
Gma500 therefore calls both helpers to catch all cases. It's confusing
as it implies that there's something about the PCI device that requires
ownership management. The relationship between the PCI device and the
VGA devices is non-obvious. At worst, readers might assume that calling
two functions for clearing aperture ownership is a bug in the driver.
Hence, move the PCI removal helper's code for VGA functionality into
a separate function and call this function from gma500. Documents the
purpose of each call to aperture helpers. The change contains comments
and example code form the discussion at [1].
Daniel Vetter [Thu, 6 Apr 2023 13:21:08 +0000 (15:21 +0200)]
fbdev: Simplify fb_is_primary_device for x86
vga_default_device really is supposed to cover all corners, at least
for x86. Additionally checking for rom shadowing should be redundant,
because the bios/fw only does that for the boot vga device.
If this turns out to be wrong, then most likely that's a special case
which should be added to the vgaarb code, not replicated all over.
Patch motived by changes to the aperture helpers, which also have this
open code in a bunch of places, and which are all removed in a
clean-up series. This function here is just for selecting the default
fbdev device for fbcon, but I figured for consistency it might be good
to throw this patch in on top.
Note that the shadow rom check predates vgaarb, which was only wired
up in commit 88674088d10c ("x86: Use vga_default_device() when
determining whether an fb is primary"). That patch doesn't explain why
we still fall back to the shadow rom check.
Daniel Vetter [Thu, 6 Apr 2023 13:21:07 +0000 (15:21 +0200)]
video/aperture: Only remove sysfb on the default vga pci device
Instead of calling aperture_remove_conflicting_devices() to remove the
conflicting devices, just call to aperture_detach_devices() to detach
the device that matches the same PCI BAR / aperture range. Since the
former is just a wrapper of the latter plus a sysfb_disable() call,
and now that's done in this function but only for the primary devices.
This fixes a regression introduced by commit ee7a69aa38d8 ("fbdev:
Disable sysfb device registration when removing conflicting FBs"),
where we remove the sysfb when loading a driver for an unrelated pci
device, resulting in the user losing their efifb console or similar.
Note that in practice this only is a problem with the nvidia blob,
because that's the only gpu driver people might install which does not
come with an fbdev driver of it's own. For everyone else the real gpu
driver will restore a working console.
Also note that in the referenced bug there's confusion that this same
bug also happens on amdgpu. But that was just another amdgpu specific
regression, which just happened to happen at roughly the same time and
with the same user-observable symptoms. That bug is fixed now, see
https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
Note that we should not have any such issues on non-pci multi-gpu
issues, because I could only find two such cases:
- SoC with some external panel over spi or similar. These panel
drivers do not use drm_aperture_remove_conflicting_framebuffers(),
so no problem.
- vga+mga, which is a direct console driver and entirely bypasses all
this.
For the above reasons the cc: stable is just notionally, this patch
will need a backport and that's up to nvidia if they care enough.
v2:
- Explain a bit better why other multi-gpu that aren't pci shouldn't
have any issues with making all this fully pci specific.
Daniel Vetter [Thu, 6 Apr 2023 13:21:06 +0000 (15:21 +0200)]
video/aperture: Drop primary argument
With the preceding patches it's become defunct. Also I'm about to add
a different boolean argument, so it's better to keep the confusion
down to the absolute minimum.
v2: Since the hypervfb patch got droppped (it's only a pci device for
gen1 vm, not for gen2) there is one leftover user in an actual driver
left to touch.
Daniel Vetter [Thu, 6 Apr 2023 13:21:05 +0000 (15:21 +0200)]
video/aperture: Move vga handling to pci function
A few reasons for this:
- It's really the only one where this matters. I tried looking around,
and I didn't find any non-pci vga-compatible controllers for x86
(since that's the only platform where we had this until a few
patches ago), where a driver participating in the aperture claim
dance would interfere.
- I also don't expect that any future bus anytime soon will
not just look like pci towards the OS, that's been the case for like
25+ years by now for practically everything (even non non-x86).
- Also it's a bit funny if we have one part of the vga removal in the
pci function, and the other in the generic one.
v2: Rebase.
v4:
- fix Daniel's S-o-b address
v5:
- add back an S-o-b tag with Daniel's Intel address
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: linux-fbdev@vger.kernel.org Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-6-tzimmermann@suse.de
Daniel Vetter [Thu, 6 Apr 2023 13:21:04 +0000 (15:21 +0200)]
video/aperture: Only kick vgacon when the pdev is decoding vga
Otherwise it's a bit silly, and we might throw out the driver for the
screen the user is actually looking at. I haven't found a bug report
for this case yet, but we did get bug reports for the analog case
where we're throwing out the efifb driver.
v2: Flip the check around to make it clear it's a special case for
kicking out the vgacon driver only (Thomas)
Daniel Vetter [Thu, 6 Apr 2023 13:21:03 +0000 (15:21 +0200)]
drm/aperture: Remove primary argument
Only really pci devices have a business setting this - it's for
figuring out whether the legacy vga stuff should be nuked too. And
with the preceding two patches those are all using the pci version of
this.
Which means for all other callers primary == false and we can remove
it now.
v2:
- Reorder to avoid compile fail (Thomas)
- Include gma500, which retained it's called to the non-pci version.
v4:
- fix Daniel's S-o-b address
v5:
- add back an S-o-b tag with Daniel's Intel address
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Deepak Rawat <drawat.floss@gmail.com> Cc: Neil Armstrong <neil.armstrong@linaro.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: Emma Anholt <emma@anholt.net> Cc: Helge Deller <deller@gmx.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: linux-hyperv@vger.kernel.org Cc: linux-amlogic@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-tegra@vger.kernel.org Cc: linux-fbdev@vger.kernel.org Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-4-tzimmermann@suse.de
Daniel Vetter [Thu, 6 Apr 2023 13:21:02 +0000 (15:21 +0200)]
video/aperture: use generic code to figure out the vga default device
Since vgaarb has been promoted to be a core piece of the pci subsystem
we don't have to open code random guesses anymore, we actually know
this in a platform agnostic way, and there's no need for an x86
specific hack. See also commit 1d38fe6ee6a8 ("PCI/VGA: Move vgaarb to
drivers/pci")
This should not result in any functional change, and the non-x86
multi-gpu pci systems are probably rare enough to not matter (I don't
know of any tbh). But it's a nice cleanup, so let's do it.
There's been a few questions on previous iterations on dri-devel and
irc:
- fb_is_primary_device() seems to be yet another implementation of
this theme, and at least on x86 it checks for both
vga_default_device OR rom shadowing. There shouldn't ever be a case
where rom shadowing gives any additional hints about the boot vga
device, but if there is then the default vga selection in vgaarb
should probably be fixed. And not special-case checks replicated all
over.
- Thomas also brought up that on most !x86 systems
fb_is_primary_device() returns 0, except on sparc/parisc. But these
2 special cases are about platform specific devices and not pci, so
shouldn't have any interactions.
- Furthermore fb_is_primary_device() is a bit a red herring since it's
only used to select the right fbdev driver for fbcon, and not for
the fw handover dance which the aperture helpers handle. At least
for x86 we might want to look into unifying them, but that's a
separate thing.
v2: Extend commit message trying to summarize various discussions.
v4:
- make the test for the primary device easier to read (Javier)
- fix commit message style (i.e., commit 1234 ("..."))
- fix Daniel's S-o-b address
v5:
- add back an S-o-b tag with Daniel's Intel address
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: linux-fbdev@vger.kernel.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-3-tzimmermann@suse.de
Daniel Vetter [Thu, 6 Apr 2023 13:21:01 +0000 (15:21 +0200)]
drm/gma500: Use drm_aperture_remove_conflicting_pci_framebuffers
This one nukes all framebuffers, which is a bit much. In reality
gma500 is igpu and never shipped with anything discrete, so there should
not be any difference.
v2: Unfortunately the framebuffer sits outside of the pci bars for
gma500, and so only using the pci helpers won't be enough. Otoh if we
only use non-pci helper, then we don't get the vga handling, and
subsequent refactoring to untangle these special cases won't work.
It's not pretty, but the simplest fix (since gma500 really is the only
quirky pci driver like this we have) is to just have both calls.
v4:
- fix Daniel's S-o-b address
v5:
- add back an S-o-b tag with Daniel's Intel address
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-2-tzimmermann@suse.de
Daniel Vetter [Wed, 12 Apr 2023 08:09:20 +0000 (10:09 +0200)]
MAINTAINERS: add drm_bridge for drm bridge maintainers
Otherwise core changes don't get noticed by the right people. I
noticed this because a patch set from Jagan Teki seems to have fallen
through the cracks.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jagan Teki <jagan@amarulasolutions.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Neil Armstrong <neil.armstrong@linaro.org> Cc: Robert Foss <rfoss@kernel.org> Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com> Cc: Jonas Karlman <jonas@kwiboo.se> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Jagan Teki <jagan@amarulasolutions.com>
[narmstrong: fixed ordering & Daniel's SoB] Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230412080921.10171-1-daniel.vetter@ffwll.ch
drm/i915: disable sampler indirect state in bindless heap
By default the indirect state sampler data (border colors) are stored
in the same heap as the SAMPLER_STATE structure. For userspace drivers
that can be 2 different heaps (dynamic state heap & bindless sampler
state heap). This means that border colors have to copied in 2
different places so that the same SAMPLER_STATE structure find the
right data.
This change is forcing the indirect state sampler data to only be in
the dynamic state pool (more convenient for userspace drivers, they
only have to have one copy of the border colors). This is reproducing
the behavior of the Windows drivers.
Tom Rix [Tue, 21 Mar 2023 18:24:14 +0000 (14:24 -0400)]
drm/vmwgfx: remove unused vmw_overlay function
clang with W=1 reports
drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c:56:35: error:
unused function 'vmw_overlay' [-Werror,-Wunused-function]
static inline struct vmw_overlay *vmw_overlay(struct drm_device *dev)
^
This function is not used, so remove it.
Martin Krastev [Tue, 21 Mar 2023 02:09:49 +0000 (22:09 -0400)]
drm/vmwgfx: Fix Legacy Display Unit atomic drm support
Legacy Display Unit (LDU) fb dirty support used a custom fb dirty callback. Latter
handled only the DIRTYFB IOCTL presentation path but not the ADDFB2/PAGE_FLIP/RMFB
IOCTL path, common for Wayland compositors.
Get rid of the custom callback in favor of drm_atomic_helper_dirtyfb and unify the
handling of the presentation paths inside of vmw_ldu_primary_plane_atomic_update.
This also homogenizes the fb dirty callbacks across all DUs: LDU, SOU and STDU.
Signed-off-by: Martin Krastev <krastevm@vmware.com> Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com> Fixes: 2f5544ff0300 ("drm/vmwgfx: Use atomic helper function for dirty fb IOCTL") Cc: <stable@vger.kernel.org> # v5.0+ Signed-off-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230321020949.335012-3-zack@kde.org
Zack Rusin [Tue, 21 Mar 2023 02:09:48 +0000 (22:09 -0400)]
drm/vmwgfx: Print errors when running on broken/unsupported configs
virtualbox implemented an incomplete version of the svga device which
they decided to drop soon after the initial release. The device was
always broken in various ways and never supported by vmwgfx.
vmwgfx should refuse to load on those configurations but currently
drm has no way of reloading fbdev when the specific pci driver refuses
to load, which would leave users without a usable fb. Instead of
refusing to load print an error and disable a bunch of functionality
that virtualbox never implemented to at least get fb to work on their
setup.
Daniel Vetter [Tue, 11 Apr 2023 10:28:09 +0000 (12:28 +0200)]
Merge tag 'mediatek-drm-next-6.4' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next
Mediatek DRM Next for Linux 6.4
1. Add support for 10-bit overlays
2. Add MediaTek SoC DRM (vdosys1) support for mt8195
3. Change mmsys compatible for mt8195 mediatek-drm
4. Only trigger DRM HPD events if bridge is attached
5. Change the aux retries times when receiving AUX_DEFER
Daniel Vetter [Tue, 11 Apr 2023 10:11:32 +0000 (12:11 +0200)]
Merge tag 'drm-msm-next-2023-04-10' of https://gitlab.freedesktop.org/drm/msm into drm-next
main pull request for v6.4
Core Display:
============
* Bugfixes for error handling during probe
* rework UBWC decoder programming
* prepare_commit cleanup
* bindings for SM8550 (MDSS, DPU), SM8450 (DP)
* timeout calculation fixup
* atomic: use drm_crtc_next_vblank_start() instead of our own
custom thing to calculate the start of next vblank
DP:
==
* interrupts cleanup
DPU:
===
* DSPP sub-block flush on sc7280
* support AR30 in addition to XR30 format
* Allow using REC_0 and REC_1 to handle wide (4k) RGB planes
* Split the HW catalog into individual per-SoC files
DSI:
===
* rework DSI instance ID detection on obscure platforms
GPU:
===
* uapi C++ compatibility fix
* a6xx: More robust gdsc reset
* a3xx and a4xx devfreq support
* update generated headers
* various cleanups and fixes
* GPU and GEM updates to avoid allocations which could trigger
reclaim (shrinker) in fence signaling path
* dma-fence deadline hint support and wait-boost
* a640 speedbin support
* a650 speedbin support
Conflicts in drivers/gpu/drm/msm/adreno/adreno_gpu.c:
Conflict between the 7fa5047a436b ("drm: Use of_property_present() for
testing DT property presence") and 9f251f934012 ("drm/msm/adreno: Use
OPP for every GPU generation"). The latter removed the of_ function
call outright, so I went with what's in the PR unchanged.
Daniel Vetter [Tue, 11 Apr 2023 10:02:38 +0000 (12:02 +0200)]
Merge tag 'drm-habanalabs-next-2023-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next
This tag contains additional habanalabs driver changes for v6.4:
- uAPI changes:
- Add a definition of a new Gaudi2 server type. This is used by userspace
to know what is the connectivity between the accelerators inside the
server
- New features and improvements:
- speedup h/w queues test in Gaudi2 to reduce device initialization times.
- Firmware related fixes:
- Fixes to the handshake protocol during f/w initialization.
- Sync f/w events interrupt in hard reset to avoid warning message.
- Improvements to extraction of the firmware version.
- Misc bug fixes and code cleanups. Notable fixes are:
- Multiple fixes for interrupt handling in Gaudi2.
- Unmap mapped memory in case TLB invalidation fails.
accel/habanalabs: add missing error flow in hl_sysfs_init()
hl_sysfs_fini() is called only if hl_sysfs_init() completes
successfully. Therefore if hl_sysfs_init() fails, need to remove any
sysfs group that was added until that point.
Moti Haimovski [Mon, 20 Mar 2023 20:59:11 +0000 (22:59 +0200)]
accel/habanalabs: speedup h/w queues test in Gaudi2
HW queues testing at driver load and after reset takes a substantial
amount of time.
This commit reduces the queues test time in Gaudi2 devices by running
all the tests in parallel instead of one after the other.
Time measurements on tests duration shows that the new method is almost
x100 faster than the serial approach.
Dani Liberman [Tue, 28 Mar 2023 17:41:35 +0000 (20:41 +0300)]
accel/habanalabs: fix handling of arc farm sei event
There is only single eq entry for arc farm sei event which aggregates
events from the four arc farms.
Fix the code to handle this event according to this behavior.
Ofir Bitton [Mon, 27 Mar 2023 10:40:56 +0000 (13:40 +0300)]
accel/habanalabs: remove Gaudi1 multi MSI code
Multi MSI interrupts aren't working in Gaudi1 and because of that,
we are only using a single MSI interrupt. Therefore, let's remove this
dead code in order to avoid confusion.
Koby Elbaz [Sun, 26 Mar 2023 15:22:57 +0000 (18:22 +0300)]
accel/habanalabs: don't wait for STS_OK after sending COMMS WFE
Sending COMMS_GOTO_WFE instructs the FW's CPU to halt (WFE state).
Once sent, FW's CPU isn't expected to continue communicating with LKD.
Therefore, the stage of waiting for COMMS_STS_OK should be skipped or
else waiting for COMMS_STS_OK will simply timeout, which will trigger
unexpected behavior.
Tal Cohen [Tue, 21 Mar 2023 08:59:28 +0000 (10:59 +0200)]
accel/habanalabs: sync f/w events interrupt in hard reset
Receiving events from FW, while the device is in hard reset, causes
a warning message in Driver log. The message may point to a
problem in the Driver or FW. But It also can appear as a result
of events that have been sent from FW just before the hard reset.
In order to avoid receiving events from FW while the device is in reset
and is already in 'disabled' mode, sync the f/w events interrupt right
before setting the device to 'disabled'.
Tomer Tayar [Sun, 26 Mar 2023 21:08:45 +0000 (00:08 +0300)]
accel/habanalabs: fix events mask of decoder abnormal interrupts
The decoder IRQ status register may have several set bits upon an
abnormal interrupt. Therefore, when setting the events mask, need to
check all bits and not using if-else.
Ofir Bitton [Wed, 15 Mar 2023 08:36:41 +0000 (10:36 +0200)]
accel/habanalabs: fix HBM MMU interrupt handling
Current mapping between HMMU event and HMMU block is wrong.
In addition the captured address in case of a page fault or
an access error is scrambled, Hence we must call the descramble
function.
Tal Cohen [Wed, 22 Mar 2023 09:20:05 +0000 (11:20 +0200)]
accel/habanalabs: send disable pci when compute ctx is active
Fix an issue in hard reset flow in which the driver didn't send a
disable pci message if there was an active compute context.
In hard reset, disable pci message should be sent no matter if
a compute context exists or not.
The disable pci message is sent in reset device. It informs the FW not
to raise more EQs. The Driver may ignore received EQs, when the device
is in disabled mode.
The duplication happens when hard reset is scheduled during compute
reset and also performs 'escalate_reset_flow'.
Koby Elbaz [Tue, 21 Mar 2023 14:03:07 +0000 (16:03 +0200)]
accel/habanalabs: change COMMS warning messages to error level
COMMS protocol is used for LKD <--> FW communication, and any
communication failure between the two might turn out to be
destructive, hence, it should be well emphasized.
Tal Cohen [Thu, 16 Mar 2023 15:30:46 +0000 (17:30 +0200)]
accel/habanalabs: print event type when device is disabled
When the device is in disabled state, the driver isn't suppose to
receive any events from FW. Printing the event type, as part of the
message that was already printed, shall help to get more info if this
unexpected message is received.
Koby Elbaz [Wed, 8 Mar 2023 15:53:39 +0000 (17:53 +0200)]
accel/habanalabs: unmap mapped memory when TLB inv fails
Once a memory mapping is added to the page tables, it's followed by
a TLB invalidation request which could potentially fail (HW failure).
Removing the mapping is simply a part of this failure handling routine.
TLB invalidation failure prints were updated to be more accurate.
Remove pci_clear_master to simplify the code,
the bus-mastering is also cleared in do_pci_disable_device,
like this:
./drivers/pci/pci.c:2197
static void do_pci_disable_device(struct pci_dev *dev)
{
u16 pci_command;
drm/vkms: Remove <drm/drm_simple_kms_helper.h> include
The driver doesn't use simple-KMS helpers to set a simple display pipeline
but only the drm_simple_encoder_init() function to initialize an encoder.
That helper is just a wrapper of drm_encoder_init(), but passing a struct
drm_encoder_funcs that sets the .destroy handler to drm_encoder_cleanup().
Since the <drm/drm_simple_kms_helper.h> header is only included for this
helper and because the connector is initialized with drm_connector_init()
as well, do the same for the encoder and drop the header include.
drm/msm/dpu: fetch DPU configuration from match data
In email discussion it was noted that there can be different SoC device
having slightly different SoC features, but sharing the same DPU hw
revision. Stop fetching catalog data using core_rev and use platform's
match data instead.
For sm8150+ the DPU_CTL_SPLIT_DISPLAY should be replaced with
DPU_CTL_ACTIVE_CFG support (which supports having a single CTL for both
interfaces in a split). Add comments where this conversion is required.
drm/msm/dpu: move UBWC/memory configuration to separate struct
UBWC and highest bank settings differ slightly between different DPU
units of the same generation, while the dpu_caps and dpu_mdp_cfg are
much more stable. To ease configuration reuse move ubwc_swizzle and
highest_bank_bit data to separate structure.
Konrad Dybcio [Tue, 4 Apr 2023 13:05:43 +0000 (16:05 +0300)]
drm/msm/dpu: Allow variable INTF_BLK size
These blocks are of variable length on different SoCs. Set the
correct values where I was able to retrieve it from downstream
DTs and leave the old defaults (0x280) otherwise.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
[DB: fixed some lengths, split the INTF changes away] Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/530816/ Link: https://lore.kernel.org/r/20230404130622.509628-4-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Konrad Dybcio [Tue, 4 Apr 2023 13:05:42 +0000 (16:05 +0300)]
drm/msm/dpu: Allow variable SSPP_BLK size
These blocks are of variable length on different SoCs. Set the
correct values where I was able to retrieve it from downstream
DTs and leave the old defaults (0x1c8) otherwise.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
[DB: fixed some of lengths, split the INTF changes away] Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/530814/ Link: https://lore.kernel.org/r/20230404130622.509628-3-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
drm/scheduler: Fix UAF race in drm_sched_entity_push_job()
After a job is pushed into the queue, it is owned by the scheduler core
and may be freed at any time, so we can't write nor read the submit
timestamp after that point.
Fixes oopses observed with the drm/asahi driver, found with kASAN.
Konrad Dybcio [Sat, 18 Mar 2023 13:42:52 +0000 (14:42 +0100)]
drm/msm/dsi: Switch the QCM2290-specific compatible to index autodetection
Now that the logic can handle multiple sets of registers, move
the QCM2290 to the common logic and mark it deprecated. This allows us
to remove a couple of structs, saving some memory.
Konrad Dybcio [Sat, 18 Mar 2023 13:42:51 +0000 (14:42 +0100)]
drm/msm/dsi: dsi_cfg: Merge SC7180 config into SDM845
The configs are identical, other than the number of *maximum* DSI
hosts allowed. This isn't an issue, unless somebody deliberately
tries to access the inexistent host by adding a dt node for it.
Remove the SC7180 struct and point the hw revision match to the
SDM845's one. On a note, this could have been done back when
7180 support was introduced.
Douglas Anderson [Fri, 27 Jan 2023 01:09:13 +0000 (17:09 -0800)]
drm/msm/dp: Return IRQ_NONE for unhandled interrupts
If our interrupt handler gets called and we don't really handle the
interrupt then we should return IRQ_NONE. The current interrupt
handler didn't do this, so let's fix it.
NOTE: for some of the cases it's clear that we should return IRQ_NONE
and some cases it's clear that we should return IRQ_HANDLED. However,
there are a few that fall somewhere in between. Specifically, the
documentation for when to return IRQ_NONE vs. IRQ_HANDLED is probably
best spelled out in the commit message of commit d9e4ad5badf4 ("Document
that IRQ_NONE should be returned when IRQ not actually handled"). That
commit makes it clear that we should return IRQ_HANDLED if we've done
something to make the interrupt stop happening.
The case where it's unclear is, for instance, in dp_aux_isr() after
we've read the interrupt using dp_catalog_aux_get_irq() and confirmed
that "isr" is non-zero. The function dp_catalog_aux_get_irq() not only
reads the interrupts but it also "ack"s all the interrupts that are
returned. For an "unknown" interrupt this has a very good chance of
actually stopping the interrupt from happening. That would mean we've
identified that it's our device and done something to stop them from
happening and should return IRQ_HANDLED. Specifically, it should be
noted that most interrupts that need "ack"ing are ones that are
one-time events and doing an "ack" is enough to clear them. However,
since these interrupts are unknown then, by definition, it's unknown
if "ack"ing them is truly enough to clear them. It's possible that we
also need to remove the original source of the interrupt. In this
case, IRQ_NONE would be a better choice.
Given that returning an occasional IRQ_NONE isn't the absolute end of
the world, however, let's choose that course of action. The IRQ
framework will forgive a few IRQ_NONE returns now and again (and it
won't even log them, which is why we have to log them ourselves). This
means that if we _do_ end hitting an interrupt where "ack"ing isn't
enough the kernel will eventually detect the problem and shut our
device down.
Signed-off-by: Douglas Anderson <dianders@chromium.org> Tested-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/520660/ Link: https://lore.kernel.org/r/20230126170745.v2.2.I2d7aec2fadb9c237cd0090a47d6a8ba2054bf0f8@changeid
[DB: reformatted commit message to make checkpatch happy] Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Konrad Dybcio [Sat, 18 Mar 2023 13:42:49 +0000 (14:42 +0100)]
drm/msm/dsi: Fix DSI index detection when version clash occurs
Currently, we allow for MAX_DSI entries in io_start to facilitate for
MAX_DSI number of DSI hosts at different addresses. The configuration
is matched against the DSI CTRL hardware revision read back from the
component. We need a way to resolve situations where multiple SoCs
with different register maps may use the same version of DSI CTRL. In
preparation to do so, make msm_dsi_config a 2d array where each entry
represents a set of configurations adequate for a given SoC.
This is totally fine to do, as the only differentiating factors
between same-version-different-SoCs configurations are the number of
DSI hosts (1 or 2, at least as of today) and the set of base registers.
The regulator setup is the same, because the DSI hardware is the same,
regardless of the SoC it was implemented in.
In addition to that, update the matching logic such that it will loop
over VARIANTS_MAX variants, making sure they are all taken into account.
Douglas Anderson [Fri, 27 Jan 2023 01:09:12 +0000 (17:09 -0800)]
drm/msm/dp: Clean up handling of DP AUX interrupts
The DP AUX interrupt handling was a bit of a mess.
* There were two functions (one for "native" transfers and one for
"i2c" transfers) that were quite similar. It was hard to say how
many of the differences between the two functions were on purpose
and how many of them were just an accident of how they were coded.
* Each function sometimes used "else if" to test for error bits and
sometimes didn't and again it was hard to say if this was on purpose
or just an accident.
* The two functions wouldn't notice whether "unknown" bits were
set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
and if it was set there would be no indication.
* The two functions wouldn't notice if more than one error was set.
Let's fix this by being more consistent / explicit about what we're
doing.
By design this could cause different handling for AUX transfers,
though I'm not actually aware of any bug fixed as a result of
this patch (this patch was created because we simply noticed how odd
the old code was by code inspection). Specific notes here:
1. In the old native transfer case if we got "done + wrong address"
we'd ignore the "wrong address" (because of the "else if"). Now we
won't.
2. In the old native transfer case if we got "done + timeout" we'd
ignore the "timeout" (because of the "else if"). Now we won't.
3. In the old native transfer case we'd see "nack_defer" and translate
it to the error number for "nack". This differed from the i2c
transfer case where "nack_defer" was given the error number for
"nack_defer". This 100% can't matter because the only user of this
error number treats "nack defer" the same as "nack", so it's clear
that the difference between the "native" and "i2c" was pointless
here.
4. In the old i2c transfer case if we got "done" plus any error
besides "nack" or "defer" then we'd ignore the error. Now we don't.
5. If there is more than one error signaled by the hardware it's
possible that we'll report a different one than we used to. I don't
know if this matters. If someone is aware of a case this matters we
should document it and change the code to make it explicit.
6. One quirk we keep (I don't know if this is important) is that in
the i2c transfer case if we see "done + defer" we report that as a
"nack". That seemed too intentional in the old code to just drop.
After this change we will add extra logging, including:
* A warning if we see more than one error bit set.
* A warning if we see an unexpected interrupt.
* A warning if we get an AUX transfer interrupt when shouldn't.
It actually turns out that as a result of this change then at boot we
sometimes see an error:
[drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
now I'm going to say that leaving this error reported in the logs is
OK-ish and hopefully it will encourage someone to track down what's
going on at init time.
One last note here is that this change renames one of the interrupt
bits. The bit named "i2c done" clearly was used for native transfers
being done too, so I renamed it to indicate this.