Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VCSM locking fix #6703

Draft
wants to merge 827 commits into
base: rpi-6.12.y
Choose a base branch
from
Draft

VCSM locking fix #6703

wants to merge 827 commits into from

Conversation

6by9
Copy link
Contributor

@6by9 6by9 commented Mar 5, 2025

Potential fix for #6701

popcornmix and others added 30 commits February 28, 2025 16:53
Apply fractional source co-ordinates into the scaling filters.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
When the margins are changed, the dlist needs to be regenerated
with the changed updated dest regions for each of the planes.

Setting the zpos_changed flag is sufficient to trigger that
without doing a full modeset, therefore set it should the
margins be changed.

Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Support displaying DRM_FORMAT_YUV444 and DRM_FORMAT_YVU444 formats.
Tested with kmstest and kodi. e.g.

kmstest -r 1920x1080@60 -f 400x300-YU24

Note: without the shift of width, only half the chroma is fetched,
resulting in correct left half of image and corrupt colours on right half.

The increase in width shouldn't affect fetching of Y data,
as the hardware will clamp at dest width.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The VC4 HDMI driver has a bunch of accessors to read from a register.
The read accessor was warning when accessing an unknown register, but
the write one was just returning silently.

Let's make sure we warn also when writing to an unknown register.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
DLIST generation can get pretty tricky and there's not a lot of debug in
the driver to help. Let's add a few more to track the generated DLIST
size.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
We need to allocate a few additional structures when checking our
atomic_state, especially related to hardware SRAM that will hold the
plane descriptors (DLIST) and the current line context (LBM) during
composition.

Since those allocation can fail, let's add some error message in that
case to help debug what goes wrong.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
LBM allocations need a different size depending on the line length,
format, etc.

This can get tricky, and fail. Let's add some more prints to ease the
debugging when it does.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The vc4_plane_atomic_check() directly returns the result of the final
function it calls.

Using the already defined ret variable to check its content on error,
and a separate return 0 on success, makes it easier to extend.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
We access multiple times the vc4_crtc_state->assigned_channel variable
in the vc4_crtc_get_scanout_position() function, so let's store it in a
local variable.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
It has been observed that a YUV422 unity scaled plane isn't displayed.
Enabling vertical scaling on the UV planes solves this. There is
already a similar clause to always enable horizontal scaling on the
UV planes.

Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
ABORT_ON_EMPTY chooses whether the HVS abandons the current frame
when it experiences an underflow, or attempts to continue.

In theory the frame should be black from the point of underflow,
compared to a shift of sebsequent pixels to the left.

Unfortunately it seems to put the HVS is a bad state where it is not
possible to recover simply. This typically requires a reboot
following the 'flip done timed out message'.

Discussion with Broadcom has suggested we don't use this flag.
All their testing is done with it disabled.

Additionally setting BLANK_INSERT_EN causes the HDMI to output
blank pixels on an underflow which avoids it losing sync.

After this change a 'flip done timed out' due to sdram bandwidth
starvation or too low a clock is recoverable once the situation improves.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The V3D IP has been separate since BCM2711, so let's make sure we issue
a WARN if we're running not only on BCM2711, but also anything newer.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
…output

Since we'll support BCM2712 soon, let's move the logic behind
vc4_hvs_get_fifo_from_output() to a switch to extend it more easily.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Since the BCM2712 will feature a significantly different HVS, let's move
the hardware initialisation part of our bind function into a separate
function.

That way, it will be easier to extend in the future.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Just like the HVS itself, the COB parameters will be fairly different in
the BCM2712.

Let's move the COB parameters computation and its initialisation to a
separate function that will be easier to extend in the future.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The HVS register set has been heavily modified in the BCM2712, and we'll
thus need a separate debugfs_reg32 array for it.

The name hvs_regs is thus a bit too generic, so let's rename it to
something more specific.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The BCM2712 will have a fairly different dlist, that will feature one
Pointer 0 word for each plane.

Let's prepare by changing the ptr0_offset variable that holds the offset
in a dlist of the pointer 0 word to an array.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
With the introduction of the support for BCM2712, the check of whether
we're running on vc5 or not to compute the LBM alignment requirement
doesn't work anymore.

Moreover, the LBM size will need to be computed in words for the
BCM2712, while we've had sizes in bytes so far.

Aligning on either 64 or 32 words is thus fairly harmful on BCM2712, so
let's just explicitly align the size when needed, and then call
drm_mm_insert_node_generic() with an alignment of 1.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The BCM2712 HVS has registers to report the size of the various SRAM the
driver uses, and their size actually differ depending on the stepping.

The initialisation of the memory pools happen in the __vc4_hvs_alloc()
function that also allocates the main HVS structure, that will then hold
the pointer to the memory mapping of the registers.

This creates some kind of circular dependency that we can break by
passing the mapping pointer as an argument for __vc4_hvs_alloc() to use
to query to get the SRAM sizes and initialise the memory pools
accordingly.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The offset fields in vc4_plane_state are described as being
the offset for each buffer in the bo, however it is used to
store the complete DMA address that is then written into the
register.

The DMA address including the fb ofset can be retrieved
using drm_fb_dma_get_gem_addr, and the offset adjustment due to
clipping is local to vc4_plane_mode_set.
Drop the offset field from the state, and compute the complete
DMA address in vc4_plane_mode_set.

Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Always enable SCALER_CONTROL before attempting other HVS
operations. It's safe to write to some parts of the HVS but
in general it's dangerous to do this because it can cause bus
lockups.

Signed-off-by: Tim Gover <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Similar to commit f2a4bcb ("drm/v3d: Use v3d_perfmon_find()"),
replace the open-coded `vc4_perfmon_find()` with the real thing.

Cc: Christian Gmeiner <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Use of_device_get_match_data to retrieve the generation value
as set in the struct of_device_id, rather than manually comparing
compatible strings.

Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The frame count values moved within registers DISPSTAT1 and
DISPSTAT2 with GEN5, so update the accessor function to
accommodate that.

Fixes: b51cd7a ("drm/vc4: hvs: Fix frame count register readout")
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The BCM2712 has an improved display pipeline, most notably with a
different HVS and only HDMI and writeback outputs.

Let's introduce it as a new VideoCore generation and compatible.

Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The HVS found in the BCM2712, while having a similar role, is very
different from the one found in the previous SoCs. Indeed, the register
layout is fairly different, and the DLIST format is new as well.

Let's introduce the needed functions to support the new HVS.

This commit adds the C-step register layout. The D-step will be
added later.

Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The PixelValves found on the BCM2712 are similar to the ones found in
the previous generation.

Compared to BCM2711:
- the pixelvalves only drive one HDMI controller each
- HDMI1 PixelValve has a FIFO long enough to support 4k at 60Hz
- support has been added for odd horizontal timings whilst at 2pixels/clock

Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The HDMI controllers found in the BCM2712 are largely the ones found in
the BCM2711 with a different PHY.

There's some difference with how timings are split between registers,
and HDMI1 is now able to run at 4k/60Hz.

Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The BCM2712 will have several TXP with small differences. Let's add a
structure tied to the compatible to deal with those differences.

Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The TXP data structure has a name too generic for the multiple variants
we'll have to support. Let's rename it to mention the SoC it applies to.

Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
6by9 and others added 26 commits February 28, 2025 16:53
The tests on vc4 (BCM2835-7) were checking for DSI1 muxing being
to restricted channel 2, and therefore muxing with TXP was impossible.

As we no longer have that restriction, update the capabilities
defined for DSI1, move the tests that used to be impossible to the
valid list, and extend for additional combinations that are now
possible.

Signed-off-by: Dave Stevenson <[email protected]>
If an HDMI connector has no EDID and the mode is set via the
kernel command line, then drm_reset_display_info() is the only
thing that will have set up any of connector->display_info.

With commit 26ff1c3 ("drm/connector: hdmi: Compute bpc
and format automatically"), it is now checked that
DRM_COLOR_FORMAT_RGB444 is supported. Whilst it doesn't fail
the request, it does log dev_warn for every commit, spamming
the log.

For HDMI connectors initialise the color_format field to say
it supports RGB444.

Signed-off-by: Dave Stevenson <[email protected]>
6.12 kernel reworked how ZRAM backends were configured.
Let's enable the ones we've lost.

I've chosen zstd as the default.

6.6 kernel supported:
$ cat /sys/block/zram0/comp_algorithm
[lzo-rle] lzo lz4 zstd

6.12 currently supports:
$ cat /sys/block/zram0/comp_algorithm
[lzo-rle] lzo

With this PR 6.12 supports:
$ cat /sys/block/zram0/comp_algorithm
lzo-rle lzo lz4 [zstd]

See: https://forums.raspberrypi.com/viewtopic.php?p=2296678#p2296678

Signed-off-by: Dom Cobley <[email protected]>
Change the standard rate of PLL_AUDIO_SEC from 192MHz to
153.6MHz to suit audio out.

Declare audio out hardware and give it a named pin control.

Signed-off-by: Nick Hollinghurst <[email protected]>
…rror

Connect PLL_AUDIO_SEC to CLK_AUDIO_OUT, which had been commented out
to avoid interference with I2S: we expect them never to be enabled
at the same time. Work around a rounding error that occurs when the
desired rate is exactly the max but not exactly achievable by the PLL.

Signed-off-by: Nick Hollinghurst <[email protected]>
Only 48000Hz stereo 16-bit output is currently supported.

It requires some additional OF plumbing to connect it to a
"dummy" codec and generic sound card.

Signed-off-by: Nick Hollinghurst <[email protected]>
Since RP1 Audio Out can only work on GPIOs 12, 13 which would
previously have needed dtoverlay=audremap, overload it both to
enable and pin-map the block (do not enable for other pinouts).

At the same time, generate a default "codec" and "sound card".

Signed-off-by: Nick Hollinghurst <[email protected]>
The mutex used in arducam-pivariety was not properly initialized,
which could lead to undefined behavior. This also caused a NULL
pointer dereference under certain conditions.

This patch ensures the mutex is correctly initialized during probe
and prevents NULL pointer dereferences.

Signed-off-by: Yuriy Pasichnyk <[email protected]>
Support for the RP1 firmware mailbox API is rolling out to Pi 5 EEPROM
images. For most users, the fact that the PIO is not available is no
cause for alarm. Change the message to a warning, so that it does not
appear with "quiet" in cmdline.txt.

Link: raspberrypi#6642

Signed-off-by: Phil Elwell <[email protected]>
On Pi5 5, GPIOs 46/48 are made available on the 'CAM/DISP 1' connector as
'CD1_IO0_MICCLK'/'CD1_IO1_MICDAT1'. These GPIOs are not connected on
CM5.

Add hogs for GPIO 46/48 on CM5 to prevent camera drivers from
inadvertently using them when connected to 'CAM/DISP 1'

Signed-off-by: Richard Oliver <[email protected]>
In some circumstances, devm_gpiod_get_array_optional() can return
PTR_ERR rather than NULL to indicate failure. Handle these cases.

Signed-off-by: Richard Oliver <[email protected]>
Fast transfer mode requires that the first bit of data is clocked with a
rising edge. This can cause extra bits of data to be clocked on hardware
where the clock signal uses a pull-up. This change ensures that clk is
driven low before fast data transfer mode is entered.

Signed-off-by: Richard Oliver <[email protected]>
Acknowledge the fact that bcmrpi3_defconfig is neither used nor
supported by us, and avoid a bunch of future merge conflicts since
it is already gone from rpi-6.14.y.

Signed-off-by: Phil Elwell <[email protected]>
…ed allocations

Add iommu_dma_numa_policy= kernel parameter which can be used to modify
the NUMA allocation policy of remapped buffer allocations.

Policy is only used for devices which are not associated with a NUMA node.

Syntax identical to what tmpfs accepts as it's mpol argument is accepted.

Some examples:

 iommu_dma_numa_policy=interleave
 iommu_dma_numa_policy=interleave=skip-interleave
 iommu_dma_numa_policy=bind:0-3,5,7,9-15
 iommu_dma_numa_policy=bind=static:1-2

Signed-off-by: Tvrtko Ursulin <[email protected]>
To help work around certain memory controller limitations or similar, a
random NUMA allocation memory policy is added.

Signed-off-by: Tvrtko Ursulin <[email protected]>
The presence of bcm7445 in the gio_aon compatible strings should be
cosmetic, but it doesn't match gio and it breaks the pinctrl utility.
Standardise on just brcm,bcmstb-gpio, while also making a note to
improve pinctrl.

Signed-off-by: Phil Elwell <[email protected]>
The IMX500 (unlike the IMX477/IMX708) requires two regsiters to be set
for the exposure shift value to work correctly. The additional register
write (which was missing) is for the integration time shift.

Signed-off-by: Naushir Patuck <[email protected]>
There are apparently incomplete clones of the HXD type chip in use.
Those return -EPIPE on GET_LINE_REQUEST and BREAK_REQUEST. Avoid
flooding the kernel log with those errors. Detect them during startup
and then use the line_settings cache instead of GET_LINE_REQUEST. Signal
missing break support via -ENOTTY.

Signed-off-by: Jan Kiszka <[email protected]>
[ johan: fix macro prefix, drop oom error message ]
Signed-off-by: Johan Hovold <[email protected]>
Instead of trying to minimize the delay between seeing HSYNC edge
and asserting VSYNC, try to predict the next HSYNC edge precisely.
This eliminates the round-trip delay but introduces mode-dependent
rounding error. HSYNC->VSYNC lag reduced from ~30ns to -5ns..+10ns
(plus up to 5ns synchronization jitter as before).

This may benefit e.g. SCART HATs, particularly those that generate
Composite Sync using a XNOR gate.

Signed-off-by: Nick Hollinghurst <[email protected]>
commit 59ac702 ("drm/vc4: Get the rid of DRM_ERROR()") converted
all calls to DRM_ERROR into drm_err, but also converted one DRM_DEBUG
into a drm_err.

Switch it back to drm_dbg.

Fixes: 59ac702 ("drm/vc4: Get the rid of DRM_ERROR()")
Signed-off-by: Dave Stevenson <[email protected]>
Whilst the particular buffer object won't get destroyed whilst
being looked up, the idr structure itself could be modified by
another thread.

Take the spinlock for the lookup.

Signed-off-by: Dave Stevenson <[email protected]>
The kernel_id that is used on the firmware side is logged on free,
but not on import, making tying the two up difficult.

Log the id on import

Signed-off-by: Dave Stevenson <[email protected]>
@6by9 6by9 marked this pull request as draft March 5, 2025 17:40
@6by9
Copy link
Contributor Author

6by9 commented Mar 5, 2025

Nope. Ran far longer (just over 100k iterations), but still failed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.