Insulate users from changes to the internal hole tracking within
struct drm_mm_node by using an accessor for hole_follows.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[danvet: resolve conflicts in i915_vma.c]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Using mm->color_adjust makes the eviction scanner much tricker since we
don't know the actual neighbours of the target hole until after it is
created (after scanning is complete). To work out whether we need to
evict the neighbours because they impact upon the hole, we have to then
check the hole afterwards - requiring an extra step in the user of the
eviction scanner when they apply color_adjust.
v2: Massage kerneldoc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161222083641.2691-34-chris@chris-wilson.co.uk
The scan state occupies a large proportion of the struct drm_mm and is
rarely used and only contains temporary state. That makes it suitable to
moving to its struct and onto the stack of the callers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[danvet: Fix up etnaviv to compile, was missing a BUG_ON.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In addition to the last-in/first-out stack for accessing drm_mm nodes,
we occasionally and in the future often want to find a drm_mm_node by an
address. To do so efficiently we need to track the nodes in an interval
tree - lookups for a particular address will then be O(lg(N)), where N
is the number of nodes in the range manager as opposed to O(N).
Insertion however gains an extra O(lg(N)) step for all nodes
irrespective of whether the interval tree is in use. For future i915
patches, eliminating the linear walk is a significant improvement.
v2: Use generic interval-tree template for u64 and faster insertion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470236651-678-1-git-send-email-chris@chris-wilson.co.uk
The adj_start calculation for DRM_MM_CREATE_TOP should happen after
mm->color_adjust. There was an inconsistency between
drm_mm_insert_helper_range
and drm_mm_insert_helper, as the later was already updating after
color_adjust.
Didn't spot it before, as color_adjust is only done in systems without
LLC. But I'm not aware of anybody using this test case yet.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The drm_mm debugfs output is difficult to read as two different formats
are used for the addresses:
0x00000080000000-0x0000008000b000: 45056: used
0x8000b000-0x80016000: 45056: free
0x00000080016000-0x0000008001b000: 20480: used
0x8001b000-0x817a1000: 24666112: free
0x000000817a1000-0x000000817a8000: 28672: used
0x000000817a8000-0x00000081ba8000: 4194304: used
Fix this by using %#018llx for all addresses, thus making the output:
0x0000000080000000-0x000000008000b000: 45056: used
0x000000008000b000-0x0000000080016000: 45056: free
0x0000000080016000-0x000000008001b000: 20480: used
0x000000008001b000-0x00000000817a1000: 24666112: free
0x00000000817a1000-0x00000000817a8000: 28672: used
0x00000000817a8000-0x0000000081ba8000: 4194304: used
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
bad argument if(tmp)... in check_free_hole
fix oops: kernel BUG at drivers/gpu/drm/drm_mm.c:305!
[airlied: excellent, this was my task for today].
Signed-off-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Reviewed-by: Chris wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The current implementation is limited by the number of addresses that
fit into an unsigned long. This causes problems on 32-bit Tegra where
unsigned long is 32-bit but drm_mm is used to manage an IOVA space of
4 GiB. Given the 32-bit limitation, the range is limited to 4 GiB - 1
(or 4 GiB - 4 KiB for page granularity).
This commit changes the start and size of the range to be an unsigned
64-bit integer, thus allowing much larger ranges to be supported.
[airlied: fix i915 warnings and coloring callback]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
fixupo
Jesse's BIOS fb reconstruction code actually relies on the -ENOSPC
return value to detect overlapping framebuffers (which the bios uses
always when lighting up more than one screen). All this fanciness
happens in intel_alloc_plane_obj in intel_display.c.
Since no one else uses this we can safely remove the WARN without
repercussions.
Reported-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
entry->size is the size of the node, not the size of the hole after it.
So the code would actually find the hole which can satisfy the
constraints and which is preceded by the smallest node, not the smallest
hole satisfying the constraints.
Reported-by: "Huang, FrankR" <FrankR.Huang@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Clients like i915 need to segregate cache domains within the GTT which
can lead to small amounts of fragmentation. By allocating the uncached
buffers from the bottom and the cacheable buffers from the top, we can
reduce the amount of wasted space and also optimize allocation of the
mappable portion of the GTT to only those buffers that require CPU
access through the GTT.
For other drivers, allocating small bos from one end and large ones
from the other helps improve the quality of fragmentation.
Based on drm_mm work by Chris Wilson.
v3: Changed to use a TTM placement flag
v2: Updated kerneldoc
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Christian König <deathsimple@vodafone.de>
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: David Airlie <airlied@redhat.com>
While at it do a tiny bit of interface cleanup and convert boolean
return values to bool. With this patch all exported functions and inline
helpers which are part of the drm_mm public interface are documented.
Also drop superflous extern function modifiers since most of drm_mm.h
doesn't use them - more consistent that way.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
kerneldoc polish will follow in the next patch.
Hopefully documenting the lru scan support a bit better spurs someone
to give this a shot in the ttm eviction code. At least in i915 it
helped quite a lot with memory thrashing on platforms where eviction
was (we've fixed that too meanwhile) fairly expensive.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Need to get my stuff out the door ;-) Highlights:
- pc8+ support from Paulo
- more vma patches from Ben.
- Kconfig option to enable preliminary support by default (Josh
Triplett)
- Optimized cpu cache flush handling and support for write-through caching
of display planes on Iris (Chris)
- rc6 tuning from Stéphane Marchesin for more stability
- VECS seqno wrap/semaphores fix (Ben)
- a pile of smaller cleanups and improvements all over
Note that I've ditched Ben's execbuf vma conversion for 3.12 since not yet
ready. But there's still other vma conversion stuff in here.
* tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel: (62 commits)
drm/i915: Print seqnos as unsigned in debugfs
drm/i915: Fix context size calculation on SNB/IVB/VLV
drm/i915: Use POSTING_READ in lcpll code
drm/i915: enable Package C8+ by default
drm/i915: add i915.pc8_timeout function
drm/i915: add i915_pc8_status debugfs file
drm/i915: allow package C8+ states on Haswell (disabled)
drm/i915: fix SDEIMR assertion when disabling LCPLL
drm/i915: grab force_wake when restoring LCPLL
drm/i915: drop WaMbcDriverBootEnable workaround
drm/i915: Cleaning up the relocate entry function
drm/i915: merge HSW and SNB PM irq handlers
drm/i915: fix how we mask PMIMR when adding work to the queue
drm/i915: don't queue PM events we won't process
drm/i915: don't disable/reenable IVB error interrupts when not needed
drm/i915: add dev_priv->pm_irq_mask
drm/i915: don't update GEN6_PMIMR when it's not needed
drm/i915: wrap GEN6_PMIMR changes
drm/i915: wrap GTIMR changes
drm/i915: add the FCLK case to intel_ddi_get_cdclk_freq
...
The conditional is usually a recoverable driver bug, and so WARNing, and
preventing the drm_mm code from doing potential damage (BUG) is
desirable.
This issue was hit and fixed twice while developing the i915 multiple
address space code. The first fix is the patch just before this, and is
hit on an not frequently occuring error path. Another was fixed during
patch iteration, so it's hard to see from the patch:
commit c6cfb32567
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Jul 5 14:41:06 2013 -0700
drm/i915: Embed drm_mm_node in i915 gem obj
From the intel-gfx mailing list, we discussed this:
References: <20130705191235.GA3057@bwidawsk.net>
Cc: Dave Airlie <airlied@redhat.com>
CC: <dri-devel@lists.freedesktop.org>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We used to pre-allocate drm_mm nodes and save them in a linked list for
later usage so we always have spare ones in atomic contexts. However, this
is really racy if multiple threads are in an atomic context at the same
time and we don't have enough spare nodes. Moreover, all remaining users
run in user-context and just lock drm_mm with a spinlock. So we can easily
preallocate the node, take the spinlock and insert the node.
This may have worked well with BKL in place, however, with today's
infrastructure it really doesn't make any sense. Besides, most users can
easily embed drm_mm_node into their objects so no allocation is needed at
all.
Thus, remove the old pre-alloc API and all the helpers that it provides.
Drivers have already been converted and we should not use the old API for
new code, anymore.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>