This has proven immensely useful for debugging memory leaks and
overallocation (which is a rather serious concern on the platform,
given that we typically run at about 256MB of CMA out of up to 1GB
total memory, with framebuffers that are about 8MB ecah).
The state of the art without this is to dump debug logs from every GL
application, guess as to kernel allocations based on bo_stats, and try
to merge that all together into a global picture of memory allocation
state. With this, you can add a couple of calls to the debug build of
the 3D driver and get a pretty detailed view of GPU memory usage from
/debug/dri/0/bo_stats (or when we debug print to dmesg on allocation
failure).
The Mesa side currently labels at the gallium resource level (so you
see that a 1920x20 pixmap has been created, presumably for the window
system panel), but we could extend that to be even more useful with
glObjectLabel() names being sent all the way down to the kernel.
(partial) example of sorted debugfs output with Mesa labeling all
resources:
kernel BO cache: 16392kb BOs (3)
tiling shadow 1920x1080: 8160kb BOs (1)
resource 1920x1080@32/0: 8160kb BOs (1)
scanout resource 1920x1080@32/0: 8100kb BOs (1)
kernel: 8100kb BOs (1)
v2: Use strndup_user(), use lockdep assertion instead of just a
comment, fix an array[-1] reference, extend comment about name
freeing.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20170725182718.31468-2-eric@anholt.net
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Backmerge drm-next with -rc2 in it to pull in a couple stm patches that
were previously incorrectly applied to -misc-next. By picking them up in
the correct manner, git will hopefully fix any errant trees that are out
in the wild.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
In order to support CEC the hsm clock needs to be enabled in
vc4_hdmi_bind(), not in vc4_hdmi_encoder_enable(). Otherwise you wouldn't
be able to support CEC when there is no hotplug detect signal, which is
required by some monitors that turn off the HPD when in standby, but keep
the CEC bus alive so they can be woken up.
The HDMI core also has to be enabled in vc4_hdmi_bind() for the same
reason.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20170716104804.48308-3-hverkuil@xs4all.nl
HDMI 1.4b support the CEA video modes as per range of CEA-861-D (VIC 1-64).
For any other mode, the VIC filed in AVI infoframes should be 0.
HDMI 2.0 sinks, support video modes range as per CEA-861-F spec, which is
extended to (VIC 1-107).
This patch adds a bool input variable, which indicates if the connected
sink is a HDMI 2.0 sink or not. This will make sure that we don't pass a
HDMI 2.0 VIC to a HDMI 1.4 sink.
This patch touches all drm drivers, who are callers of this function
drm_hdmi_avi_infoframe_from_display_mode but to make sure there is
no change in current behavior, is_hdmi2 is kept as false.
In case of I915 driver, this patch:
- checks if the connected display is HDMI 2.0.
- HDMI infoframes carry one of this two type of information:
- VIC for 4K modes for HDMI 1.4 sinks
- S3D information for S3D modes
As CEA-861-F has already defined VICs for 4K videomodes, this
patch doesn't allow sending HDMI infoframes for HDMI 2.0 sinks,
until the mode is 3D.
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
PS: This patch touches a few lines in few files, which were
already above 80 char, so checkpatch gives 80 char warning again.
- gpu/drm/omapdrm/omap_encoder.c
- gpu/drm/i915/intel_sdvo.c
V2: Rebase, Added r-b from Andrzej
V3: Addressed review comment from Ville:
- Do not send VICs in both AVI-IF and HDMI-IF
send only one of it.
V4: Rebase
V5: Added r-b from Neil.
Addressed review comments from Ville
- Do not block HDMI vendor IF, instead check for VIC while
handling AVI infoframes
V6: Rebase
V7: Rebase
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1499960000-9232-2-git-send-email-shashank.sharma@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
With instantaneous high precision vblank timestamping
that updates at leading edge of vblank, the emulated
"hw vblank counter" from vblank timestamping which
increments at leading edge of vblank, and reliable
page flip execution and completion at leading edge
of vblank, we should meet the requirements for fast
vblank irq disable/enable.
Testing against rpi-4.12-rc5 Linux kernel with timing
measurement equipment indicates this works fine,
so allow immediate vblank disable for power saving.
For debugging in case of unexpected trouble, booting
with kernel cmdline option drm.vblankoffdelay=0
would keep vblank irqs on to approximate old behavior.
v2: Respin onto drm-misc-next, per Eric's suggestion.
Drop !vc4->firmware_kms check, as the firmware_kms
implementation does not exist in upstream.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170622012811.2139-1-mario.kleiner.de@gmail.com
The T tiling format is what V3D uses for textures, with no raster
support at all until later revisions of the hardware (and always at a
large 3D performance penalty). If we can't scan out V3D's format,
then we often need to do a relayout at some stage of the pipeline,
either right before texturing from the scanout buffer (common in X11
without a compositor) or between a tiled screen buffer right before
scanout (an option I've considered in trying to resolve this
inconsistency, but which means needing to use the dirty fb ioctl and
having some update policy).
T-format scanout lets us avoid either of those shadow copies, for a
massive, obvious performance improvement to X11 window dragging
without a compositor. Unfortunately, enabling a compositor to work
around the discrepancy has turned out to be too costly in memory
consumption for the Raspbian distribution.
Because the HVS operates a scanline at a time, compositing from T does
increase the memory bandwidth cost of scanout. On my 1920x1080@32bpp
display on a RPi3, we go from about 15% of system memory bandwidth
with linear to about 20% with tiled. However, for X11 this still ends
up being a huge performance win in active usage.
This patch doesn't yet handle src_x/src_y offsetting within the tiled
buffer. However, we fail to do so for untiled buffers already.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
If one 'drm_gem_handle_create()' fails, we leak somes handles and some
memory.
In order to fix it:
- move the 'free(bo_state)' at the end of the function so that it is also
called in the eror handling path. This has the side effect to also try
to free it if the first 'kcalloc' fails. This is harmless.
- add a new label, err_delete_handle, in order to delete already
allocated handles in error handling path
- remove the now useless 'err' label
The way the code is now written will also delete the handles if the
'copy_to_user()' call fails.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512123803.1886-1-christophe.jaillet@wanadoo.fr
The bo->resv pointer could be NULL, leading to kernel oopses
like the one below.
This patch ensures that bo->resv is always set in vc4_create_object
ensuring that it is never NULL.
Thanks to Eric Anholt for pointing to the correct solution.
[ 19.738487] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 19.746805] pgd = ffff8000275fc000
[ 19.750319] [00000000] *pgd=0000000000000000
[ 19.754715] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 19.760369] Modules linked in: smsc95xx usbnet vc4 drm_kms_helper drm pwm_bcm2835 i2c_bcm2835 bcm2835_rng rng_core bcm2835_dma virt_dma
[ 19.772767] CPU: 0 PID: 1297 Comm: Xorg Not tainted 4.12.0-rc1-rpi3 #58
[ 19.779476] Hardware name: Raspberry Pi 3 Model B (DT)
[ 19.784688] task: ffff800028268000 task.stack: ffff800026c08000
[ 19.790705] PC is at ww_mutex_lock_interruptible+0x14/0xc0
[ 19.796329] LR is at vc4_submit_cl_ioctl+0x4fc/0x998 [vc4]
...
[ 20.240855] [<ffff0000088975f4>] ww_mutex_lock_interruptible+0x14/0xc0
[ 20.247528] [<ffff0000009b3ea4>] vc4_submit_cl_ioctl+0x4fc/0x998 [vc4]
[ 20.254372] [<ffff0000008f75f8>] drm_ioctl+0x180/0x438 [drm]
[ 20.260120] [<ffff00000821383c>] do_vfs_ioctl+0xa4/0x7d0
[ 20.265510] [<ffff000008213fe4>] SyS_ioctl+0x7c/0x98
[ 20.270550] [<ffff000008082f30>] el0_svc_naked+0x24/0x28
[ 20.275941] Code: d2800002 d5384103 910003fd f9800011 (c85ffc04)
[ 20.282527] ---[ end trace 1f6bd640ff32ae12 ]---
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/14e68768-6c92-2d74-92fd-196dbc50d8f7@xs4all.nl
Now that drm_[cm]alloc* helpers are simple one line wrappers around
kvmalloc_array and drm_free_large is just kvfree alias we can drop
them and replace by their native forms.
This shouldn't introduce any functional change.
Changes since v1
- fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day
build robot
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers
[danvet: Fixup vgem which grew another user very recently.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz
BCM2835's PLLD_DSI1 divider doesn't give us many choices for our pixel
clocks, so to support panels on the Raspberry Pi we need to set a
higher pixel clock rate than requested and adjust the mode we program
to extend out the HFP so that the refresh rate matches.
v2: Drop an unfinished comment (caught by Noralf)
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511235625.22427-2-eric@anholt.net
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we restrict this helper to only kms drivers (which is the case) we
can look up the correct mode easily ourselves. But it's a bit tricky:
- All legacy drivers look at crtc->hwmode. But that is updated already
at the beginning of the modeset helper, which means when we disable
a pipe. Hence the final timestamps might be a bit off. But since
this is an existing bug I'm not going to change it, but just try to
be bug-for-bug compatible with the current code. This only applies
to radeon&amdgpu.
- i915 tries to get it perfect by updating crtc->hwmode when the pipe
is off (i.e. vblank->enabled = false).
- All other atomic drivers look at crtc->state->adjusted_mode. Those
that look at state->requested_mode simply don't adjust their mode,
so it's the same. That has two problems: Accessing crtc->state from
interrupt handling code is unsafe, and it's updated before we shut
down the pipe. For nonblocking modesets it's even worse.
For atomic drivers try to implement what i915 does. To do that we add
a new hwmode field to the vblank structure, and update it from
drm_calc_timestamping_constants(). For atomic drivers that's called
from the right spot by the helper library already, so all fine. But
for safety let's enforce that.
For legacy driver this function is only called at the end (oh the
fun), which is broken, so again let's not bother and just stay
bug-for-bug compatible.
The benefit is that we can use drm_calc_vbltimestamp_from_scanoutpos
directly to implement ->get_vblank_timestamp in every driver, deleting
a lot of code.
v2: Completely new approach, trying to mimick the i915 solution.
v3: Fixup kerneldoc.
v4: Drop the WARN_ON to check that the vblank is off, atomic helpers
currently unconditionally call this. Recomputing the same stuff should
be harmless.
v5: Fix typos and move misplaced hunks to the right patches (Neil).
v6: Undo hunk movement (kbuild).
Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
Cc: Eric Anholt <eric@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170509140329.24114-4-daniel.vetter@ffwll.ch
Cygnus has V3D 2.6 instead of 2.1, and doesn't use the VC4 display
modules. The V3D can be uniquely identified by the IDENT[01]
registers, and there's nothing to key off of for the display change
other than the lack of DT nodes for the display components, but it's
convention to have new compatible strings anyway.
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Herring <robh@kernel.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170428224223.21904-3-eric@anholt.net
For the Raspberry Pi's bindings, the power domain also implicitly
turns on the clock and deasserts reset, but for the new Cygnus port we
start representing the clock in the devicetree.
v2: Document the clock-names property, check for -ENOENT for no clock
in DT.
v3: Drop NULL checks around clk calls which embed NULL checks.
v4: Drop clk-names (feedback by Rob Herring)
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Herring <robh@kernel.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170428224223.21904-1-eric@anholt.net
Until now, we've had to limit Raspberry Pi to 256MB of CMA memory to
keep from triggering the hardware addressing bug between the tile
binner and the tile alloc memory (where the top 4 bits come from the
tile state data array's address).
To work around that and allow more memory to be reserved for graphics,
allocate a single BO to store tile state data arrays and tile
alloc/overflow memory while the GPU is active, and make sure that that
one BO doesn't happen to cross a 256MB boundary. With that in place,
we can allocate textures and shaders anywhere in system memory (still
contiguous, of course).
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327231025.19391-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The HDMI driver is currently enabling all clocks at probe time and
keeps the power-domain connected to the HDMI encoder enabled.
Move all activation code to vc4_hdmi_encoder_enable() and make sure
the clks and power domain are released when the HDMI encoder is not used
by adding deactivation steps in vc4_hdmi_encoder_disable().
Note that the sequencing imposed by the IP requires that we move
vc4_hdmi_encoder_mode_set() code into vc4_hdmi_encoder_enable().
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This is needed for proper synchronization with display on another DRM
device (pl111 or tinydrm) with buffers produced by vc4 V3D. Fixes the
new igt vc4_dmabuf_poll testcase, and rendering of one of the glmark2
desktop tests on pl111+vc4.
This doesn't yet introduce waits on another device's fences before
vc4's rendering/display, because I don't have testcases for them.
v2: Reuse dma_fence_free(), retitle commit message to clarify that
it's not a full dma-buf fencing implementation yet.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170412191202.22740-6-eric@anholt.net
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Convert drivers to use the new of_graph_get_remote_node() helper
instead of parsing the endpoint node and then getting the remote device
node. Now drivers can just specify the device node and which
port/endpoint and get back the connected remote device node. The details
of the graph binding are nicely abstracted into the core OF graph code.
This changes some error messages to debug messages (in the graph core).
Graph connections are often "no connects" depending on the particular
board, so we want to avoid spurious messages. Plus the kernel is not a
DT validator.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Liviu Dudau <liviu.dudau@arm.com>
Tested-by: Eric Anholt <eric@anholt.net>
Tested-by: Jyri Sarha <jsarha@ti.com>
Tested by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>