Detangle the confusion that NOTRACE variants of the register read/write
routines were directly using the raw register access. We need for those
routines to reuse the common code for serializing register access and
ensuring the correct register power states. This is only possible now
that the only routines that required raw access use their own API.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The GT functions for enabling register access also need to occasionally
write to and read from registers. To avoid the potential recursion as we
modify the public interface to be stricter, introduce a private register
access API for the GT functions.
v2: Rebase
v3: Rebase onto uncore
v4: Use raw interfaces consistently so that we only use the low-level
readN functions from a single location.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, the register access code is split between i915_drv.c and
intel_pm.c. It only bares a superficial resemblance to the reset of the
powermanagement code, so move it all into its own file. This is to ease
further patches to enforce serialised register access.
v2: Scan for random abuse of I915_WRITE_NOTRACE
v3: Take the opportunity to rename the GT functions as uncore. Uncore is
the term used by the hardware design (and bspec) for all functions
outside of the GPU (and CPU) cores in what is also known as the System
Agent.
v4: Rebase onto SNB rc6 fixes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Wrestle patch into applying and inline
intel_uncore_early_sanitize (plus move the old comment to the new
function). Also keep the _santize postfix for intel_uncore_sanitize.]
[danvet: Squash in fixup spotted by Chris on irc: We need to call
intel_pm_init before intel_uncore_sanitize since the later will call
cancel_work on the delayed rps setup work the former initializes.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This backmerges Linus' merge commit of the latest drm-fixes pull:
commit 549f3a1218
Merge: 42577ca058ca4a
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue Jul 23 15:47:08 2013 -0700
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
We've accrued a few too many conflicts, but the real reason is that I
want to merge the 100% solution for Haswell concurrent registers
writes into drm-intel-next. But that depends upon the 90% bandaid
merged into -fixes:
commit a7cd1b8fea
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
Also, we can roll up on accrued conflicts.
Usually I'd backmerge a tagged -rc, but I want to get this done before
heading off to vacations next week ;-)
Conflicts:
drivers/gpu/drm/i915/i915_dma.c
drivers/gpu/drm/i915/i915_gem.c
v2: For added hilarity we have a init sequence conflict around the
gt_lock, so need to move that one, too. Spotted by Jani Nikula.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of unmapping the nodes in TTM and GEM users manually, we provide
a generic wrapper which does the correct thing for all vma-nodes.
v2: remove bdev->dev_mapping test in ttm_bo_unmap_virtual_unlocked() as
ttm_mem_io_free_vm() does nothing in that case (io_reserved_vm is 0).
v4: Fix docbook comments
v5: use drm_vma_node_size()
Cc: Dave Airlie <airlied@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Use the new vma-manager infrastructure. This doesn't change any
implementation details as the vma-offset-manager is nearly copied 1-to-1
from TTM.
The vm_lock is moved into the offset manager so we can drop it from TTM.
During lookup, we use the vma locking helpers to take a reference to the
found object.
In all other scenarios, locking stays the same as before. We always
guarantee that drm_vma_offset_remove() is called only during destruction.
Hence, helpers like drm_vma_node_offset_addr() are always safe as long as
the node has a valid offset.
This also drops the addr_space_offset member as it is a copy of vm_start
in vma_node objects. Use the accessor functions instead.
v4:
- remove vm_lock
- use drm_vma_offset_lock_lookup() to protect lookup (instead of vm_lock)
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Martin Peres <martin.peres@labri.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Use the new vma manager instead of the old hashtable. Also convert all
drivers to use the new convenience helpers. This drops all the
(map_list.hash.key << PAGE_SHIFT) non-sense.
Locking and access-management is exactly the same as before with an
additional lock inside of the vma-manager, which strictly wouldn't be
needed for gem.
v2:
- rebase on drm-next
- init nodes via drm_vma_node_reset() in drm_gem.c
v3:
- fix tegra
v4:
- remove duplicate if (drm_vma_node_has_offset()) checks
- inline now trivial drm_vma_node_offset_addr() calls
v5:
- skip node-reset on gem-init due to kzalloc()
- do not allow mapping gem-objects with offsets (backwards compat)
- remove unneccessary casts
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
If we want to map GPU memory into user-space, we need to linearize the
addresses to not confuse mm-core. Currently, GEM and TTM both implement
their own offset-managers to assign a pgoff to each object for user-space
CPU access. GEM uses a hash-table, TTM uses an rbtree.
This patch provides a unified implementation that can be used to replace
both. TTM allows partial mmaps with a given offset, so we cannot use
hashtables as the start address may not be known at mmap time. Hence, we
use the rbtree-implementation of TTM.
We could easily update drm_mm to use an rbtree instead of a linked list
for it's object list and thus drop the rbtree from the vma-manager.
However, this would slow down drm_mm object allocation for all other
use-cases (rbtree insertion) and add another 4-8 bytes to each mm node.
Hence, use the separate tree but allow for later migration.
This is a rewrite of the 2012-proposal by David Airlie <airlied@linux.ie>
v2:
- fix Docbook integration
- drop drm_mm_node_linked() and use drm_mm_node_allocated()
- remove unjustified likely/unlikely usage (but keep for rbtree paths)
- remove BUG_ON() as drm_mm already does that
- clarify page-based vs. byte-based addresses
- use drm_vma_node_reset() for initialization, too
v4:
- allow external locking via drm_vma_offset_un/lock_lookup()
- add locked lookup helper drm_vma_offset_lookup_locked()
v5:
- fix drm_vma_offset_lookup() to correctly validate range-mismatches
(fix (offset > start + pages))
- fix drm_vma_offset_exact_lookup() to actually do what it says
- remove redundant vm_pages member (add drm_vma_node_size() helper)
- remove unneeded goto
- fix documentation
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
This function is called without the dev->struct_mutex held, hence we
need to use the _unlocked unreference variants.
As soon as the object is registered userspace can sneak in here with a
gem_close ioctl call, so the object can (and with my new evil tests
actually does) get the final unreference in this place. The lack of
locking then results in hilarity and some good leakage.
To fix this we simply need to revert
Chris Wilson <chris@chris-wilson.co.uk>
v2: We need to make the trace call _before_ we drop our ref - the
object might very well be gone by then already.
v3: Just revert the original patch as suggested by Chris Wilson.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Remove the added white line again to tighten the return
block, requested by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So I made the mistake of missing that the desktop and mobile chipsets
have different layouts in their PCI configurations, and we were
incorrectly setting the wrong physical address for stolen memory on
mobile chipsets.
Since all gen3+ are actually consistent in the location of the GBSM
register in the PCI configuration space on device 2 (the GPU), use it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Drop cc: stable and fudge conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Our phys_object code can't deal with stolen memory and so blows up.
Fixing this is quite a bit of work and not worth it much for a single
page object, so just opt-out.
This is necessary prep work to enable stolen on gen2/3 platforms where
the overlay register file isn't stored in the gtt.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For now there are no callers, but these functions are going to be
needed for the code that allows Package C8+. Other future features may
also require this code.
Also merge the commit which introduced assert_can_disable_lcpll and
had the following commit message:
Most of the hardware needs to be disabled before LCPLL is disabled, so
let's add a function to assert some of items listed in the "Display
Sequences for LCPLL disabling" documentation.
The idea is that hsw_disable_lcpll should not disable the hardware,
the callers need to take care of calling hsw_disable_lcpll only once
everything is already disabled.
v2: - Rebase.
- Fix D_COMP wait timeout.
v3: - Use wait_for_atomic_use (Ben)
- Remove/add a useless/needed POSTING_READ (Ben)
- Early return in case LCPLL is already restored (Ben)
- Add ndelay(100) (Ben)
v4: - Merge the commit that added assert_can_disable_lcpll (Ben)
- Add interrupt assertions (Ben)
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Fix compile fail since there's no HAS_LP_PCH yet.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We currently don't support HDMI clock bending nor use SSC for DP or
HDMI on Haswell, so the only case where we need CLKOUT_DP is for VGA.
v2: - Replace the IS_ULT check for LPT-LP
- Simplify GEN0/DBUFF0 check due to change on the previous patch
- Also check for SBI_SSCCTL_DISABLE (Ben).
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now it implements 3 different sequences from BSpec and also has
support for ULT.
v2: - Change IS_ULT checks for LPT-LP checks
- Add check for LPT-LP + with_fdi (Ben)
- Merge DBUFF0/GEN0 bit definitions since they're the same
register (Ben)
- DBUFF0 (1<<0) is Disable, not Enable
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Try to decypher detection failures is a little tricker at the moment as
the only indicator of progress is when output_poll_execute() tells us
the result after the connector->detect() has run. This patch adds a
telltale to the start of each detect function so that we can track
progress and associate activity more clearly with each connector.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The recent addition of lockdep support to reservations and their subsequent
use by TTM showed up a number of potential problems with the way qxl was using
TTM objects.
a) it was allocating objects, and reserving them later without validating
underneath the reservation, which meant in extreme conditions the objects could
be evicted before the reservation ever used them.
b) it was reserving objects straight after allocating them, but with no
ability to back off should the reservations fail. It now allocates the necessary
objects then does a complete reservation pass on them to avoid deadlocks.
c) it had two lists per release tracking objects, unnecessary complicating
the reservation process.
This patch removes the dual object tracking, adds reservations ticket support
to the release and fence object handling. It then ports the internal fb
drawing code and the userspace facing ioctl to use the new interfaces properly,
along with cleanup up the error path handling in some codepaths.
Signed-off-by: Dave Airlie <airlied@redhat.com>
In order to fix an issue with reservations we need to create the releases
as pre-pinned objects, this changes the placement interface and bo creation
interface to allow creating pinned objects to save nested reservations later.
This is just a stepping stone to main fix which follows to actually fix how
qxl deals with reservations.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Due to the nature of qxl hw we cannot queue operations while in an irq
context, so we queue these operations as best we can until atomic allocations
fail, and dequeue them later in a work queue.
Daniel looked over the locking on the list and agrees it should be sufficent.
The atomic allocs use no warn, as the last thing we want if we haven't memory
to allocate space for a printk in an irq context is more printks.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use the generic gma functions instead of the medfield functions where
they are identical.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Use the generic gma functions instead of the oaktrail functions where
they are identical.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
This takes care of the remaining chips using the old generic code.
We don't check if the pipe number is valid but the old code peeked in
the register map before checking anyways so just ignore it.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
There is a slight difference in how we pick the palette register in the
generic function but we should be ok as long as psb_intel_crtc->pipe and
the register map is sane.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
This patch makes psb use the gma_xxx counterparts that are identical. I
took them in one sweep as they should not cause any regressions.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
This patch makes cdv use the gma_xxx counterparts that are identical. I
took them in one sweep as they should not cause any regressions.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Add chip specific callbacks for the generic and non-generic clock
calculation code. Also remove as much dupilicated code as possible.
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Replace any use of xxx_intel_pipe_has_type() with the generic
gma_pipe_has_type() function. Poulsbo still use it but that will be
removed when we rip out psb_intel_pipe_has_type().
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>