Commit Graph

13450 Commits

Author SHA1 Message Date
Chris Wilson
0b1de5d58e drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
This patch provides the infrastructure for performing a 16-byte aligned
read from WC memory using non-temporal instructions introduced with sse4.1.
Using movntdqa we can bypass the CPU caches and read directly from memory
and ignoring the page attributes set on the CPU PTE i.e. negating the
impact of an otherwise UC access. Copying using movntdqa from WC is almost
as fast as reading from WB memory, modulo the possibility of both hitting
the CPU cache or leaving the data in the CPU cache for the next consumer.
(The CPU cache itself my be flushed for the region of the movntdqa and on
later access the movntdqa reads from a separate internal buffer for the
cacheline.) The write back to the memory is however cached.

This will be used in later patches to accelerate accessing WC memory.

v2: Report whether the accelerated copy is successful/possible.
v3: Function alignment override was only necessary when using the
function target("sse4.1") - which is not necessary for emitting movntdqa
from __asm__.
v4: Improve notes on CPU cache behaviour vs non-temporal stores.
v5: Fix byte offsets for unrolled moves.
v6: Find all remaining typos of "movntqda", use kernel_fpu_begin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-2-git-send-email-chris@chris-wilson.co.uk
2016-08-12 13:06:37 +01:00
Chris Wilson
d31d7cb146 drm/i915: Support for creating write combined type vmaps
vmaps has a provision for controlling the page protection bits, with which
we can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose their mapping type, we add a parameter to
i915_gem_object_pin_map - but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map to a
single mapping type for the lifetime of the object. Not usually a problem,
but something to be aware of when setting up the object's vmap.

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT

v2: Remove the redundant braces around pin count check and fix the marker
     in documentation (Chris)

v3:
- Add a new enum for the vmalloc mapping type & pass that as an argument to
   i915_object_pin_map. (Tvrtko)
- Use PAGE_MASK to extract or filter the mapping type info and remove a
   superfluous BUG_ON.(Tvrtko)

v4:
- Rename the enums and clean up the pin_map function. (Chris)

v5: Drop the VM_NO_GUARD, minor cosmetics.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-1-git-send-email-chris@chris-wilson.co.uk
2016-08-12 13:06:36 +01:00
Daniel Vetter
b1116f645c drm: Remove superflous linux/fb.h includes
Everyone who uses the fbdev emulation helpers doesn't need to include
fb.h directly. Remove it.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-3-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:41:39 +02:00
Daniel Vetter
44adece57e drm/fb-helper: Add a dummy remove_conflicting_framebuffers
Lots of drivers don't properly compile without this when CONFIG_FB=n.
It's kinda a hack, but since CONFIG_FB doesn't stub any fucntions when
it's disabled I think it makes sense to add it to drm_fb_helper.h.

Long term we probably need to rethink all the logic to unload firmware
framebuffer drivers, at least if we want to be able to move away from
CONFIG_FB and fbcon.

v2: Unfortunately just stubbing out remove_conflicting_framebuffers in
drm_fb_helper.h upset gcc about static vs. non-static declarations, so
a new wrapper it needs to be. Means more churn :(

Cc: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: tomi.valkeinen@ti.com
Cc: dh.herrmann@gmail.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-2-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:41:18 +02:00
Ville Syrjälä
8d970654b7 drm/i915: Deal with NV12 CbCr plane AUX surface on SKL+
With NV12 we have two color planes to deal with so we must compute the
surface and x/y offsets for the second plane as well.

What makes this a bit nasty is that the hardware expects the surface
offset to be specified as a distance from the main surface offset.
What's worse, the distance must be non-negative (no neat wraparound or
anything). So we must make sure that the main surface offset is always
less or equal to the AUX surface offset. We do that by computing the AUX
offset first and the main surface offset second. If the main surface
offset ends up being above the AUX offset, we just push it down as far
as is required while still maintaining the required alignment etc.

Fortunately the AUX offset only reuqires 4K alignment, so we don't need
to do any of the backwards searching for an acceptable offset that we
must do for the main surface. And X tiled + NV12 isn't a supported
combination anyway.

Note that this just computes aux surface offsets, we do not yet program
them into the actual hardware registers, and hence we can't yet expose
NV12.

v2: Rebase due to drm_plane_state src/dst rects
    s/TODO.../something else/ in the commit message/ (Daniel)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-12-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:35:23 +03:00
Ville Syrjälä
b63a16f6cd drm/i915: Compute display surface offset in the plane check hook for SKL+
SKL has nasty limitations with the display surface offsets:
* source x offset + width must be less than the stride for X tiled
  surfaces or the display engine falls over
* the surface offset requires lots of alignment (256K or 1M)

These facts mean that we can't just pick any suitably aligned tile
boundary as the offset and expect the resulting x offset to be useable.
The solution is to start with the closest boundary as before, but then
keep searching backwards until we find one that works, or don't. This
means we must be prepared to fail, hence the whole surface offset
calculation needs to be moved to the .check_plane() hook from the
.update_plane() hook.

While at it we can check that the source width/height don't exceed
maximum plane size limits.

We'll store the results of the computation in the plane state to make
it easy for the .update_plane() hook to do its thing.

v2: Replace for+break loop with while loop
    Rebase due to drm_plane_state src/dst rects
    Rebase due to plane_check_state()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-11-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:35:10 +03:00
Ville Syrjälä
66a2d927cb drm/i915: Make intel_adjust_tile_offset() work for linear buffers
To make life less surprising we can make intel_adjust_tile_offset()
deal with linear buffers as well. Currently it doesn't seem like there's
a real need for this since only X tiling and NV12 (which would always
be tiled currently) should need it. But I've used it for some debug
hacks already so seems like a reasonable thing to have.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-10-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:34:39 +03:00
Ville Syrjälä
b9b2403845 drm/i915: Allow calling intel_adjust_tile_offset() multiple times
Minimize the resulting X coordinate after intel_adjust_tile_offset() is
done with it's offset adjustment. This allows calling
intel_adjust_tile_offset() multiple times in case we need to adjust
the offset several times.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:34:29 +03:00
Ville Syrjälä
60d5f2a45b drm/i915: Limit fb x offset due to fences
If there's a fence on the object it will be aligned to the start
of the object, and hence CPU rendering to any fb that straddles
the fence edge will come out wrong due to lines wrapping at the
wrong place.

We have no API to manage fences on a sub-object level, so we can't
really fix this in any way. Additonally gen2/3 fences are rather
coarse grained so adjusting the offset migth not even be possible.

Avoid these problems by requiring the fb layout to agree with the
fence layout (if present).

v2: Rebase due to i915_gem_object_get_tiling() & co.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-8-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:34:12 +03:00
Ville Syrjälä
c2ff7370ae drm/i915: Adjust obj tiling vs. fb modifier rules
Currently we require the object to be X tiled if the fb is X
tiled. The argument is supposedly FBC GTT tracking. But
actually that no longer holds water since FBC supports
Y tiling as well on SKL+.

A better rule IMO is to require that if there is a fence, the
fb modifier match the object tiling mode. But if the object is linear,
we can allow the fb modifier to be anything. The idea being that
if the user set the tiling mode on the object, presumably the intention
is to actually use the fence for CPU access. But if the tiling mode is
not set, the user has no intention of using a fence (and can't actually
since we disallow tiling mode changes when there are framebuffers
associated with the object).

On gen2/3 we must keep to the rule that the object and fb
must be either both linear or both X tiled. No mixing allowed
since the display engine itself will use the fence if it's present.

v2: Fix typos
v3: Rebase due to i915_gem_object_get_tiling() & co.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:34:07 +03:00
Ville Syrjälä
72618ebfde drm/i915: Use fb modifiers for display tiling decisions
Soon the fence tiling mode may not always match the fb modifier
even for X tiled buffers. So let's use the fb modifier
consistently for all display tiling decisions.

v2: Rebased due to s/ring/engine/
v3: Rebased due to s/engine/ring/ O_o
v4: Rebase due to i915_gem_object_get_tiling() & co.

Reviewed-by: Matthew Auld <matthew.auld@intel.com> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-6-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:33:43 +03:00
Ville Syrjälä
2949056c86 drm/i915: Pass around plane_state instead of fb+rotation
intel_compute_tile_offset() and intel_add_fb_offsets() get passed the fb
and the rotation. As both of those come from the plane state we can just
pass that in instead.

For extra consitency pass the plane state to intel_fb_xy_to_linear() as
well even though it only really needs the fb.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-5-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:33:32 +03:00
Ville Syrjälä
d21967740f drm/i915: Move SKL hw stride calculation into a helper
We repeat the SKL stride register value calculations a several places.
Move it into a small helper function.

v2: Rebase due to drm_plane_state src/dst rects

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-4-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:33:19 +03:00
Ville Syrjälä
ef78ec9423 drm/i915: Don't pass pitch to intel_compute_page_offset()
intel_compute_page_offset() can dig up the correct pitch from the fb
itself, no need for the caller to pass it in.

A bit of extra care is needed for the lower level
_intel_compute_page_offset() since that one gets called before the
rotated pitch under intel_fb is populated. Note that we don't actually
call it with anything but DRM_ROTATE_0 there so we wouldn't actually
look up the rotated pitch there, but still, leave the pitch as something
the caller has to pass to _intel_compute_page_offset() as an
indicator that something is a bit special.

This leaves 'stride_div' in the skl plane update hooks as a mostly useless
variable so just get rid of it.

v2: Add a note why stride_div got nuked
v3: Extract intel_fb_pitch() since it can be useful later

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-3-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:33:06 +03:00
Ville Syrjälä
6687c9062c drm/i915: Rewrite fb rotation GTT handling
Redo the fb rotation handling in order to:
- eliminate the NV12 special casing
- handle fb->offsets[] properly
- make the rotation handling easier for the plane code

To achieve these goals we reduce intel_rotation_info to only contain
(for each plane) the rotated view width,height,stride in tile units,
and the page offset into the object where the plane starts. Each plane
is handled exactly the same way, no special casing for NV12 or other
formats. We then store the computed rotation_info under
intel_framebuffer so that we don't have to recompute it again.

To handle fb->offsets[] we treat them as a linear offsets and convert
them to x/y offsets from the start of the relevant GTT mapping (either
normal or rotated). We store the x/y offsets under intel_framebuffer,
and for some extra convenience we also store the rotated pitch (ie.
tile aligned plane height). So for each plane we have the normal
x/y offsets, rotated x/y offsets, and the rotated pitch. The normal
pitch is available already in fb->pitches[].

While we're gathering up all that extra information, we can also easily
compute the storage requirements for the framebuffer, so that we can
check that the object is big enough to hold it.

When it comes time to deal with the plane source coordinates, we first
rotate the clipped src coordinates to match the relevant GTT view
orientation, then add to them the fb x/y offsets. Next we compute
the aligned surface page offset, and as a result we're left with some
residual x/y offsets. Finally, if required by the hardware, we convert
the remaining x/y offsets into a linear offset.

For gen2/3 we simply skip computing the final page offset, and just
convert the src+fb x/y offsets directly into a linear offset since
that's what the hardware wants.

After this all platforms, incluing SKL+, compute these things in exactly
the same way (excluding alignemnt differences).

v2: Use BIT(DRM_ROTATE_270) instead of ROTATE_270 when rotating
    plane src coordinates
    Drop some spurious changes that got left behind during
    development
v3: Split out more changes to prep patches (Daniel)
    s/intel_fb->plane[].foo.bar/intel_fb->foo[].bar/ for brevity
    Rename intel_surf_gtt_offset to intel_fb_gtt_offset
    Kill the pointless 'plane' parameter from intel_fb_gtt_offset()
v4: Fix alignment vs. alignment-1 when calling
    _intel_compute_tile_offset() from intel_fill_fb_info()
    Pass the pitch in tiles in
    stad of pixels to intel_adjust_tile_offset() from intel_fill_fb_info()
    Pass the full width/height of the rotated area to
    drm_rect_rotate() for clarity
    Use u32 for more offsets
v5: Preserve the upper_32_bits()/lower_32_bits() handling for the
    fb ggtt offset (Sivakumar)
v6: Rebase due to drm_plane_state src/dst rects

Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-2-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:32:46 +03:00
Ville Syrjälä
d721b02fd0 drm/i915: Account for TSEG size when determining 865G stolen base
Looks like the TSEG lives just above TOUD, stolen comes after TSEG.

The spec seems somewhat self-contradictory in places, in the ESMRAMC
register desctription it says:
 TSEG Size:
  10=(TOUD + 512 KB) to TOUD
  11 =(TOUD + 1 MB) to TOUD

so that agrees with TSEG being at TOUD. But the example given
elsehwere in the spec says:

 TOUD equals 62.5 MB = 03E7FFFFh
 TSEG selected as 512 KB in size,
 Graphics local memory selected as 1 MB in size
 General System RAM available in system = 62.5 MB
 General system RAM range00000000h to 03E7FFFFh
 TSEG address range03F80000h to 03FFFFFFh
 TSEG pre-allocated from03F80000h to 03FFFFFFh
 Graphics local memory pre-allocated from03E80000h to 03F7FFFFh

so here we have TSEG above stolen.

Real world evidence agrees with the TOUD->TSEG->stolen order however, so
let's fix up the code to account for the TSEG size.

Cc: Taketo Kabe <fdporg@vega.pgw.jp>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: stable@vger.kernel.org
Fixes: 0ad98c74e0 ("drm/i915: Determine the stolen memory base address on gen2")
Fixes: a4dff76924 ("x86/gpu: Add Intel graphics stolen memory quirk for gen2 platforms")
Reported-by: Taketo Kabe <fdporg@vega.pgw.jp>
Tested-by: Taketo Kabe <fdporg@vega.pgw.jp>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96473
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470653919-27251-1-git-send-email-ville.syrjala@linux.intel.com
Link: http://download.intel.com/design/chipsets/datashts/25251405.pdf
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-11 17:20:42 +03:00
Tvrtko Ursulin
5e334c199a drm/i915/guc: Consolidate firmware major-minor to one place
Currently to change the firmware one has to update the exported
module firmware string and the major-minor versions used for
verification after load. Consolidate that to a single place
defining correct major and minor versions per platform.

v2: Rebased for KBL.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Peter Antoine <peter.antoine@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470842206-35685-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-08-11 11:34:04 +01:00
Tvrtko Ursulin
c1bb11451e drm/i915: Store number of active engines in device info
Until now code was calling hweight32 to figure out the
number from device_info->ring_mask at runtime. Instead
we can cache it at engine init time and use directly.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470842530-35854-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-08-11 11:33:10 +01:00
Maarten Lankhorst
dfa2997055 drm/i915: Fix modeset handling during gpu reset, v5.
This function would call drm_modeset_lock_all, while the suspend/resume
functions already have their own locking. Fix this by factoring out
__intel_display_resume, and calling the atomic helpers for duplicating
atomic state and disabling all crtc's during suspend.

Changes since v1:
- Deal with -EDEADLK right after lock_all and clean up calls
  to hw readout.
- Always take all modeset locks so updates during gpu reset are blocked.
Changes since v2:
- Fix deadlock in intel_update_primary_planes.
- Move WARN_ON(EDEADLK) to __intel_display_resume.
- pctx -> ctx
- only call __intel_display_resume on success in intel_display_resume.
Changes since v3:
- Rebase on top of dev_priv -> dev change.
- Use drm_modeset_lock_all_ctx instead of drm_modeset_lock_all.
Changes since v4 [by vsyrjala]:
- Deal with skip_intermediate_wm
- Update comment w.r.t. mode_config.mutex vs. ->detect()
- Rebase due to INTEL_GEN() etc.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b8701e ("drm/i915: Use atomic helpers for suspend, v2.")
Cc: stable@vger.kernel.org
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470428910-12125-2-git-send-email-ville.syrjala@linux.intel.com
(cherry picked from commit 7397489399)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 01:06:58 +03:00
Matthew Auld
3871f42a57 drm/i915: fix aliasing_ppgtt leak
In i915_ggtt_cleanup_hw we need to remember to free aliasing_ppgtt. This
fixes the following kmemleak message:

unreferenced object 0xffff880213cca000 (size 8192):
  comm "modprobe", pid 1298, jiffies 4294745402 (age 703.930s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817c808e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8121f9c2>] kmem_cache_alloc_trace+0x142/0x1d0
    [<ffffffffa06d11ef>] i915_gem_init_ggtt+0x10f/0x210 [i915]
    [<ffffffffa06d71bb>] i915_gem_init+0x5b/0xd0 [i915]
    [<ffffffffa069749a>] i915_driver_load+0x97a/0x1460 [i915]
    [<ffffffffa06a26ef>] i915_pci_probe+0x4f/0x70 [i915]
    [<ffffffff81423015>] local_pci_probe+0x45/0xa0
    [<ffffffff81424463>] pci_device_probe+0x103/0x150
    [<ffffffff81515e6c>] driver_probe_device+0x22c/0x440
    [<ffffffff81516151>] __driver_attach+0xd1/0xf0
    [<ffffffff8151379c>] bus_for_each_dev+0x6c/0xc0
    [<ffffffff8151555e>] driver_attach+0x1e/0x20
    [<ffffffff81514fa3>] bus_add_driver+0x1c3/0x280
    [<ffffffff81516aa0>] driver_register+0x60/0xe0
    [<ffffffff8142297c>] __pci_register_driver+0x4c/0x50
    [<ffffffffa013605b>] 0xffffffffa013605b

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: b18b6bde30 ("drm/i915/bdw: Free PPGTT struct")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470420280-21417-1-git-send-email-matthew.auld@intel.com
(cherry picked from commit cb7f27601c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 01:05:48 +03:00
Matthew Auld
bf74c93cd7 drm/i915: fix WaInsertDummyPushConstPs
As pointed out by Chris Harris, we are using the wrong WA name, it
should in fact be WaToEnableHwFixForPushConstHWBug, also it should be
applied from C0 onwards for both BXT and KBL.

Fixes: 7b9005cd45 ("drm/i915: Add WaInsertDummyPushConstP for bxt and kbl")

Cc: Chris Harris <chris.harris@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reported-by: Chris Harris <chris.harris@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470127013-29653-1-git-send-email-matthew.auld@intel.com
(cherry picked from commit 575e3ccbce)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 01:04:38 +03:00
Ville Syrjälä
85bf59d188 drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2
The spec was recently fixed to have the correct iboost setting for the
SKL Y/U DP DDI buffer translation table entry 2. Update our tables
to match.

Cc: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470140517-13011-1-git-send-email-ville.syrjala@linux.intel.com
Cc: stable@vger.kernel.org
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
(cherry picked from commit 5ac9056753)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 01:03:31 +03:00
Matt Roper
58e311b09c drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
The bspec was updated a couple weeks ago to add an extra block per line
to plane watermark calculations for linear pixel formats.

Bspec update 115327 description:
  "Gen9+ - Updated the plane blocks per line calculation for linear
  cases. Adds +1 for all linear cases to handle the non-block aligned
  stride cases."

Cc: Lyude <cpaul@redhat.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470344880-27394-1-git-send-email-matthew.d.roper@intel.com
Reviewed-by: Lyude <cpaul@redhat.com>
(cherry picked from commit 055c3ff69d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 01:02:22 +03:00
Chris Wilson
3cffb0a447 drm/i915: Acquire audio powerwell for HD-Audio registers
On Haswell/Broadwell, the HD-Audio block is inside the HDMI/display
power well and so the sna-hda audio codec acquires the display power
well while it is operational. However, Skylake separates the powerwells
again, but yet we still need the audio powerwell to setup the registers.
(But then the hardware uses those registers even while powered off???)

Acquiring the powerwell around setting the chicken bits when setting up
the audio channel does at least silence the WARNs from touching our
registers whilst unpowered. We silence our own test cases, but maybe
there is a latent bug in using the audio channel?

v2: Grab both rpm wakelock and audio wakelock

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96214
Fixes: 03b135cebc "ALSA: hda - remove dependency on i915 power well for SKL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Libin Yang <libin.yang@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Marius Vlad <marius.c.vlad@intel.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1470240540-29004-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit d838a110f0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 01:01:14 +03:00
Chris Wilson
2ca17b87e8 drm/i915: Add missing rpm wakelock to GGTT pread
Joonas spotted a discrepancy between the pwrite and pread ioctls, in
that pwrite takes the rpm wakelock around its GGTT access, The wakelock
is required in order for the GTT to function. In disregard for the
current convention, we take the rpm wakelock around the access itself
rather than around the struct_mutex as the nesting is not strictly
required and such ordering will one day be fixed by explicitly noting
the barrier dependencies between the GGTT and rpm.

Fixes: b50a53715f ("drm/i915: Support for pread/pwrite ...")
Reported-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1470298193-21765-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 1dd5b6f202)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 00:59:48 +03:00
Chris Wilson
0a491b96aa drm/i915/fbc: FBC causes display flicker when VT-d is enabled on Skylake
Erratum SKL075: Display Flicker May Occur When Both VT-d And FBC Are Enabled

"Display flickering may occur when both FBC (Frame Buffer Compression)
and VT - d (Intel® Virtualization Technology for Directed I/O) are enabled
and in use by the display controller."

Ville found the w/a name in the database:
WaFbcTurnOffFbcWhenHyperVisorIsUsed:skl,bxt and also dug out that it
affects Broxton.

v2: Log when the quirk is applied.
v3: Ensure i915.enable_fbc is false when !HAS_FBC()
v4: Fix function name after rebase
v5: Add Broxton to the workaround

Note for backporting to stable, we need to add
  #define mkwrite_device_info(ptr) \
	((struct intel_device_info *)INTEL_INFO(ptr))

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1470296633-20388-1-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit 36dbc4d769)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 00:58:18 +03:00
Ville Syrjälä
d8b6161f72 drm/i915: Clean up the extra RPM ref on CHV with i915.enable_rc6=0
Remove the CHV early bail out from intel_cleanup_gt_powersave() so that
we'll clean up the extra RPM reference held due to i915.enable_rc6=0.

Cc: Imre Deak <imre.deak@intel.com>
Fixes: b268c699ac ("drm/i915: refactor RPM disabling due to RC6 being disabled")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470136053-23276-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 8dac1e1f20)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 00:56:58 +03:00
Ville Syrjälä
7ff9a55614 drm/i915: Program iboost settings for HDMI/DVI on SKL
Currently we fail to program the iboost stuff for HDMI/DVI. Let's remedy
that.

Cc: stable@vger.kernel.org
Fixes: f8896f5d58 ("drm/i915/skl: Buffer translation improvements")
Cc: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468328376-6380-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
(cherry picked from commit 8d8bb85eb7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 00:51:57 +03:00
Ville Syrjälä
5728e0de74 drm/i915: Fix iboost setting for DDI with 4 lanes on SKL
Bspec says:
"For DDIA with x4 capability (DDI_BUF_CTL DDIA Lane Capability Control =
 DDIA x4), the I_boost value has to be programmed in both
 tx_blnclegsctl_0 and tx_blnclegsctl_4."

Currently we only program tx_blnclegsctl_0. Let's do the other one as
well.

Cc: stable@vger.kernel.org
Fixes: f8896f5d58 ("drm/i915/skl: Buffer translation improvements")
Cc: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468328376-6380-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
(cherry picked from commit a7d8dbc07c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 00:50:29 +03:00
Chris Wilson
fae82e59d2 drm/i915: Handle ENOSPC after failing to insert a mappable node
Even after adding individual page support for GTT mmaping, we can still
fail to find any space within the mappable region, and
drm_mm_insert_node() will then report ENOSPC. We have to then handle
this error by using the shmem access to the pages.

Fixes: b50a53715f ("drm/i915: Support for pread/pwrite ... objects")
Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com
Link: http://patchwork.freedesktop.org/patch/msgid/1468690956-23480-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit d1054ee492)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-11 00:49:02 +03:00
Chris Wilson
17f298cf54 drm/i915: Move setting of request->batch into its single callsite
request->batch_obj is only set by execbuffer for the convenience of
debugging hangs. By moving that operation to the callsite, we can
simplify all other callers and future patches. We also move the
complications of reference handling of the request->batch_obj next to
where the active tracking is set up for the request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470832906-13972-2-git-send-email-chris@chris-wilson.co.uk
2016-08-10 16:07:52 +01:00
Chris Wilson
737aac2465 drm/i915: Mark unmappable GGTT entries as PIN_HIGH
We allocate a few objects into the GGTT that we never need to access via
the mappable aperture (such as contexts, status pages). We can request
that these are bound high in the VM to increase the amount of mappable
aperture available. However, anything that may be frequently pinned
(such as logical contexts) we want to use the fast search & insert.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470832906-13972-1-git-send-email-chris@chris-wilson.co.uk
2016-08-10 16:07:51 +01:00
Chris Wilson
b5163dbb17 drm/i915: Fix nesting of rps.mutex and struct_mutex during powersave init
During intel_gt_powersave_init() we take the RPS mutex to ensure that
all locking requirements are met as we talk to the punit, but we also
require the struct_mutex for allocating a slice of the global GTT for a
power context on Valleyview. struct_mutex must be the outer lock here,
as we nest rps.mutex inside later on.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: 773ea9a801 ("drm/i915: Perform static RPS frequency setup before...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470833904-29886-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2016-08-10 16:07:36 +01:00
Chris Wilson
b06bc7ec4d drm/i915: Flush GT idle status upon reset
Upon resetting the GPU, we force the engines to be idle by clearing
their request lists. However, I neglected to clear the GT active status
and so the next request following the reset was not marking the device
as busy again. (We had to wait until any outstanding retire worker
finally ran and cleared the active status.)

Fixes: 67d97da349 ("drm/i915: Only start retire worker when idle")
Testcase: igt/pm_rps/reset
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit b913b33c43)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-08-10 17:43:49 +03:00
Imre Deak
eebb40e081 drm/i915: Remove LVDS and PPS suspend time save/restore
In the preceding patches we made sure that:
- the LVDS encoder takes care of reiniting both the LVDS register
and its PPS
- the eDP encoder takes care of reiniting its PPS
- the PPS register unlocking workaround is applied explicitly whenever
the PPS context is lost

Based on the above we can safely remove the opaque LVDS and PPS save /
restore from generic code.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-6-git-send-email-imre.deak@intel.com
2016-08-10 16:02:14 +03:00
Imre Deak
8090ba8c21 drm/i915: Apply the PPS register unlock workaround more consistently
Atm, we apply this workaround somewhat inconsistently at the following
points: driver loading, LVDS init, eDP PPS init, system resume. As this
workaround also affects registers other than PPS (timing, PLL) a more
consistent way is to apply it early after the PPS HW context is known to
be lost: driver loading, system resume and on VLV/CHV/BXT when turning
on power domains.

This is needed by the next patch that removes saving/restoring of the
PP_CONTROL register.

This also removes the incorrect programming of the workaround on HSW+
PCH platforms which don't have the register locking mechanism.

v2: (Ville)
- Don't apply the workaround on BXT.
- Simplify platform checks using HAS_DDI().
v3:
- Move the call of intel_pps_unlock_regs_wa() to the more
  logical vlv_display_power_well_init() (also fixing CHV) (Ville).

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-5-git-send-email-imre.deak@intel.com
2016-08-10 16:01:42 +03:00
Imre Deak
335f752ba9 drm/i915/dp: Restore PPS HW state from the encoder resume hook
Similarly to the previous patch, initialize the PPS from the DP
encoder's resume hook. Note that as opposed to LVDS we can't do this
during encoder enabling, since we need the PPS for DP detection as well.
The PPS init code is now the same for init and resume, so factor out a
new intel_dp_pps_init() helper for this.

v2:
- Factor out intel_dp_pps_init() (Ville).

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-4-git-send-email-imre.deak@intel.com
2016-08-10 16:00:59 +03:00
Imre Deak
ed6143b8f7 drm/i915/lvds: Restore initial HW state during encoder enabling
Atm the LVDS encoder depends on the PPS HW context being saved/restored
from generic suspend/resume code. Since the PPS is specific to the LVDS
and eDP encoders a cleaner way is to reinitialize it during encoder
enabling, so do this here for LVDS. Follow-up patches will init the PPS
for the eDP encoder similarly and remove the suspend/resume time save /
restore.

v2:
- Apply BSpec +1 offset and use DIV_ROUND_UP() when programming the
power cycle delay. (Ville)
v3: (Ville)
- Fix +1 vs. round-up order.
- s/reset_on_powerdown/powerdown_on_reset/

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-3-git-send-email-imre.deak@intel.com
2016-08-10 16:00:47 +03:00
Imre Deak
5a162e229a drm/i915: Merge TARGET_POWER_ON and PANEL_POWER_ON flag definitions
These two flags mean the same thing, so remove the duplication.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-2-git-send-email-imre.deak@intel.com
2016-08-10 16:00:44 +03:00
Imre Deak
44cb734cd2 drm/i915: Merge the PPS register definitions
The PPS registers are pretty much the same everywhere, the differences
being:
- Register fields appearing, disappearing from one platform to the
  next: panel-reset-on-powerdown, backlight-on, panel-port,
  register-unlock
- Different register base addresses
- Different number of PPS instances: 2 on VLV/CHV/BXT, 1 everywhere
  else.

We can merge the separate set of PPS definitions by extending the PPS
instance argument to all platforms and using instance 0 on platforms
with a single instance. This means we'll need to calculate the register
addresses dynamically based on the given platform and PPS instance.

v2:
- Simplify if ladder in intel_pps_get_registers(). (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-1-git-send-email-imre.deak@intel.com
2016-08-10 16:00:07 +03:00
Dave Gordon
774439e12b drm/i915/guc: re-optimise i915_guc_client layout
As we're tweaking the GuC-related code in debugfs, we can
drop the no-longer-used 'q_fail' and repack the structure.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-10 10:40:05 +01:00
Dave Gordon
c18468c4b2 drm/i915/guc: use for_each_engine_id() where appropriate
Now that host structures are indexed by host engine-id rather than
guc_id, we can usefully convert some for_each_engine() loops to use
for_each_engine_id() and avoid multiple dereferences of engine->id.

Also a few related tweaks to cache structure members locally wherever
they're used more than once or twice, hopefully eliminating memory
references.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-10 10:40:05 +01:00
Dave Gordon
e02757d91f drm/i915/guc: add engine mask to GuC client & pass to GuC
The Context Descriptor passed by the kernel to the GuC contains a field
specifying which engine(s) the context will use. Historically, this was
always set to "all of them", but if we had a separate client for each
engine, we could be more precise, and set only the bit for the engine
that the client was associated with. So this patch enables this usage,
in preparation for having multiple clients, though at this point there
is still only a single client used for all supported engines.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-10 10:40:05 +01:00
Dave Gordon
84b7f88235 drm/i915/guc: refactor guc_init_doorbell_hw()
We have essentially the same code in each of two different
loops, so we can refactor it into a little helper function.

This also reduces the amount of work done during startup,
as we now only reprogram h/w found to be in a state other
than that expected, and so avoid the overhead of setting
doorbell registers to the state they're already in.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-10 10:40:05 +01:00
Dave Gordon
8888cd0154 drm/i915/guc: doorbell reset should avoid used doorbells
guc_init_doorbell_hw() borrows the (currently single) GuC client to use
in reinitialising ALL the doorbell registers (as the hardware doesn't
reset them when the GuC is reset). As a prerequisite for accommodating
multiple clients, it should only reset doorbells that are supposed to be
disabled, avoiding those that are marked as in use by any client.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-10 10:40:05 +01:00
Chris Wilson
dbd6ef29a7 drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh
The bottom-half we use for processing the breadcrumb interrupt is a
task, which is an RCU protected struct. When accessing this struct, we
need to be holding the RCU read lock to prevent it disappearing beneath
us. We can use the RCU annotation to mark our irq_seqno_bh pointer as
being under RCU guard and then use the RCU accessors to both provide
correct ordering of access through the pointer.

Most notably, this fixes the access from hard irq context to use the RCU
read lock, which both Daniel and Tvrtko complained about.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-3-git-send-email-chris@chris-wilson.co.uk
2016-08-10 10:37:49 +01:00
Chris Wilson
83348ba84e drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs
In commit 2529d57050 ("drm/i915: Drop racy markup of missed-irqs from
idle-worker") the racy detection of missed interrupts was removed when
we went idle. This however opened up the issue that the stuck waiters
were not being reported, causing a test case failure. If we move the
stuck waiter detection out of hangcheck and into the breadcrumb
mechanims (i.e. the waiter) itself, we can avoid this issue entirely.
This leaves hangcheck looking for a stuck GPU (inspecting for request
advancement and HEAD motion), and breadcrumbs looking for a stuck
waiter - hopefully make both easier to understand by their segregation.

v2: Reduce the error message as we now run independently of hangcheck,
and the hanging batch used by igt also counts as a stuck waiter causing
extra warnings in dmesg.
v3: Move the breadcrumb's hangcheck kickstart to the first missed wait.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97104
Fixes: 2529d57050 (waiter"drm/i915: Drop racy markup of missed-irqs...")
Testcase: igt/drv_missed_irq
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-2-git-send-email-chris@chris-wilson.co.uk
2016-08-10 10:37:35 +01:00
Chris Wilson
70cb472c6d drm/i915: Always mark the writer as also a read for busy ioctl
One of the few guarantees we want the busy ioctl to provide is that the
reported busy writer is included in the set of busy read engines. This
should be provided by the ordering of setting and retiring the active
trackers, but we can do better by explicitly setting the busy read
engine flag for the last writer.

v2: More comments inside __busy_write_id() to explain why both fields
are set.

Fixes: 3fdc13c7a3 ("drm/i915: Remove (struct_mutex) locking for busy-ioctl")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470762505-12799-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-10 10:37:21 +01:00
Chris Wilson
1426f7157e drm/i915: Correct typo for __i915_gem_active_get_rcu in a comment
I mistyped and added an extra _request_ to __i915_gem_active_get_rcu()
Also, the same happened to another comment for i915_gem_active_get_rcu()

Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470758602-1338-1-git-send-email-chris@chris-wilson.co.uk
2016-08-09 17:17:56 +01:00
Mario Kleiner
196f954e25 drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
This reverts commit 013dd9e038
("drm/i915/dp: fall back to 18 bpp when sink capability is unknown")

This commit introduced a regression into stable kernels,
as it reduces output color depth to 6 bpc for any video
sink connected to a Displayport connector if that sink
doesn't report a specific color depth via EDID, or if
our EDID parser doesn't actually recognize the proper
bpc from EDID.

Affected are active DisplayPort->VGA converters and
active DisplayPort->DVI converters. Both should be
able to handle 8 bpc, but are degraded to 6 bpc with
this patch.

The reverted commit was meant to fix
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105331

A followup patch implements a fix for that specific bug,
which is caused by a faulty EDID of the affected DP panel
by adding a new EDID quirk for that panel.

DP 18 bpp fallback handling and other improvements to
DP sink bpc detection will be handled for future
kernels in a separate series of patches.

Please backport to stable.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-08-09 08:56:01 +10:00