Commit Graph

24301 Commits

Author SHA1 Message Date
Chris Wilson
a211da9c77 drm/i915/gt: Make timeslicing an explicit engine property
In order to allow userspace to rely on timeslicing to reorder their
batches, we must support preemption of those user batches. Declare
timeslicing as an explicit property that is a combination of having the
kernel support and HW support.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk
2020-05-01 15:17:33 +01:00
Chris Wilson
3b55cdeb8f drm/i915/pmu: Keep a reference to module while active
While a perf event is open, keep a reference to the module so we don't
remove the driver internals mid-sampling.

Testcase: igt/perf_pmu/module-unload
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430183324.23984-1-chris@chris-wilson.co.uk
2020-05-01 09:24:34 +01:00
Christophe Leroy
b44f687386 drm/i915/gem: Replace user_access_begin by user_write_access_begin
When i915_gem_execbuffer2_ioctl() is using user_access_begin(),
that's only to perform unsafe_put_user() so use
user_write_access_begin() in order to only open write access.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ebf1250b6d4f351469fb339e5399d8b92aa8a1c1.1585898438.git.christophe.leroy@c-s.fr
2020-05-01 12:35:22 +10:00
Chris Wilson
16e8745967 drm/i915/gt: Move the batch buffer pool from the engine to the gt
Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.

v2: Hook up to debugfs/drop_caches to clear the cache on demand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
2020-04-30 19:12:02 +01:00
Joonas Lahtinen
230982d8d8 drm/i915: Update DRIVER_DATE to 20200430
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-04-30 11:13:21 +03:00
Joonas Lahtinen
8b46ed57f3 Merge tag 'gvt-next-2020-04-22' of https://github.com/intel/gvt-linux into drm-intel-next-queued
gvt-next-2020-04-22

- remove non-upstream xen support bits (Christoph)
- guest context shadow copy optimization (Yan)
- guest context tracking for shadow skip optimization (Yan)

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422051230.GH11247@zhen-hp.sh.intel.com
2020-04-30 10:53:21 +03:00
Zbigniew Kempczyński
79eb8c7f01 drm/i915/selftests: Add tiled blits selftest
Extend coverage of the blitter client by exercising conversion to and
from tiled sources. In the process we perform spot checks to verify that
the tiling/detiling is being applied correctly, along with position
invariance of the tiling parameters.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430064957.14942-1-chris@chris-wilson.co.uk
2020-04-30 08:31:12 +01:00
Chris Wilson
de3b4d9361 drm/i915/gt: Restore aggressive post-boost downclocking
We reduced the clocks slowly after a boost event based on the
observation that the smoothness of animations suffered. However, since
reducing the evalution intervals, we should be able to respond to the
rapidly fluctuating workload of a simple desktop animation and so
restore the more aggressive downclocking.

References: 2a8862d2f3 ("drm/i915: Reduce the RPS shock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk
2020-04-30 00:57:38 +01:00
Chris Wilson
3f88dde6ee drm/i915/gt: Apply the aggressive downclocking to parking
We treat parking as a manual RPS timeout event, and downclock the GPU
for the next unpark and batch execution. However, having restored the
aggressive downclocking and observed that we have very light workloads
whose only interaction is through the manual parking events, carry over
the aggressive downclocking to the fake RPS events.

References: 21abf0bf16 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk
2020-04-30 00:57:37 +01:00
Chris Wilson
36d516be86 drm/i915/gt: Switch to manual evaluation of RPS
As with the realisation for soft-rc6, we respond to idling the engines
within microseconds, far faster than the response times for HW RC6 and
RPS. Furthermore, our fast parking upon idle, prevents HW RPS from
running for many desktop workloads, as the RPS evaluation intervals are
on the order of tens of milliseconds, but the typical workload is just a
couple of milliseconds, but yet we still need to determine the best
frequency for user latency versus power.

Recognising that the HW evaluation intervals are a poor fit, and that
they were deprecated [in bspec at least] from gen10, start to wean
ourselves off them and replace the EI with a timer and our accurate
busy-stats. The principle benefit of manually evaluating RPS intervals
is that we can be more responsive for better performance and powersaving
for both spiky workloads and steady-state.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698
Fixes: 98479ada42 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk
2020-04-30 00:57:37 +01:00
Chris Wilson
8e99299a04 drm/i915/gt: Track use of RPS interrupts in flags
Use the new intel_rps.flags field to store whether or not interrupts are
being used with RPS.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi@etezian.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-3-chris@chris-wilson.co.uk
2020-04-30 00:57:36 +01:00
Chris Wilson
9bad2adbdd drm/i915/gt: Move rps.enabled/active to flags
Pull the boolean intel_rps.enabled and intel_rps.active into a single
flags field, in preparation for more.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-2-chris@chris-wilson.co.uk
2020-04-30 00:57:35 +01:00
Chris Wilson
426d0073fb drm/i915/gt: Always enable busy-stats for execlists
In the near future, we will utilize the busy-stats on each engine to
approximate the C0 cycles of each, and use that as an input to a manual
RPS mechanism. That entails having busy-stats always enabled and so we
can remove the enable/disable routines and simplify the pmu setup. As a
consequence of always having the stats enabled, we can also show the
current active time via sysfs/engine/xcs/active_time_ns.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
2020-04-30 00:57:34 +01:00
Chris Wilson
be1cb55a07 drm/i915/gt: Keep a no-frills swappable copy of the default context state
We need to keep the default context state around to instantiate new
contexts (aka golden rendercontext), and we also keep it pinned while
the engine is active so that we can quickly reset a hanging context.
However, the default contexts are large enough to merit keeping in
swappable memory as opposed to kernel memory, so we store them inside
shmemfs. Currently, we use the normal GEM objects to create the default
context image, but we can throw away all but the shmemfs file.

This greatly simplifies the tricky power management code which wants to
run underneath the normal GT locking, and we definitely do not want to
use any high level objects that may appear to recurse back into the GT.
Though perhaps the primary advantage of the complex GEM object is that
we aggressively cache the mapping, but here we are recreating the
vm_area everytime time we unpark. At the worst, we add a lightweight
cache, but first find a microbenchmark that is impacted.

Having started to create some utility functions to make working with
shmemfs objects easier, we can start putting them to wider use, where
GEM objects are overkill, such as storing persistent error state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
2020-04-29 19:02:37 +01:00
Ville Syrjälä
58911c2407 drm: Nuke mode->hsync
Let's just calculate the hsync rate on demand. No point in wasting
space storing it and risking the cached value getting out of sync
with reality.

v2: Move drm_mode_hsync() next to its only users
    Drop the TODO

Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428171940.19552-2-ville.syrjala@linux.intel.com
2020-04-29 18:44:26 +03:00
Dan Carpenter
8c35a19576 drm/i915/selftests: fix error handling in __live_lrc_indirect_ctx_bb()
If intel_context_create() fails then it leads to an error pointer
dereference.  I shuffled things around to make error handling easier.

Fixes: 1dd47b54ba ("drm/i915: Add live selftests for indirect ctx batchbuffers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda
2020-04-29 15:16:35 +01:00
Chris Wilson
24aac336ff drm/i915: Avoid dereferencing a dead context
Once the intel_context is closed, the GEM context may be freed and so
the link from intel_context.gem_context is invalid.

<3>[  219.782944] BUG: KASAN: use-after-free in intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
<3>[  219.782996] Read of size 8 at addr ffff8881d7dff0b8 by task kworker/0:1/12

<4>[  219.783052] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G     U            5.7.0-rc2-g1f3ffd7683d54-kasan_118+ #1
<4>[  219.783055] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
<4>[  219.783105] Workqueue: events heartbeat [i915]
<4>[  219.783109] Call Trace:
<4>[  219.783113]  <IRQ>
<4>[  219.783119]  dump_stack+0x96/0xdb
<4>[  219.783177]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
<4>[  219.783182]  print_address_description.constprop.6+0x16/0x310
<4>[  219.783239]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
<4>[  219.783295]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
<4>[  219.783300]  __kasan_report+0x137/0x190
<4>[  219.783359]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
<4>[  219.783366]  kasan_report+0x32/0x50
<4>[  219.783426]  intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
<4>[  219.783481]  execlists_reset+0x39c/0x13d0 [i915]
<4>[  219.783494]  ? mark_held_locks+0x9e/0xe0
<4>[  219.783546]  ? execlists_hold+0xfc0/0xfc0 [i915]
<4>[  219.783551]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  219.783557]  ? _raw_spin_unlock_irqrestore+0x34/0x60
<4>[  219.783606]  ? execlists_submission_tasklet+0x118/0x3a0 [i915]
<4>[  219.783615]  tasklet_action_common.isra.14+0x13b/0x410
<4>[  219.783623]  ? __do_softirq+0x1e4/0x9a7
<4>[  219.783630]  __do_softirq+0x226/0x9a7
<4>[  219.783643]  do_softirq_own_stack+0x2a/0x40
<4>[  219.783647]  </IRQ>
<4>[  219.783692]  ? heartbeat+0x3e2/0x10f0 [i915]
<4>[  219.783696]  do_softirq.part.13+0x49/0x50
<4>[  219.783700]  __local_bh_enable_ip+0x1a2/0x1e0
<4>[  219.783748]  heartbeat+0x409/0x10f0 [i915]
<4>[  219.783801]  ? __live_idle_pulse+0x9f0/0x9f0 [i915]
<4>[  219.783806]  ? lock_acquire+0x1ac/0x8a0
<4>[  219.783811]  ? process_one_work+0x811/0x1870
<4>[  219.783827]  ? rcu_read_lock_sched_held+0x9c/0xd0
<4>[  219.783832]  ? rcu_read_lock_bh_held+0xb0/0xb0
<4>[  219.783836]  ? _raw_spin_unlock_irq+0x1f/0x40
<4>[  219.783845]  process_one_work+0x8ca/0x1870
<4>[  219.783848]  ? lock_acquire+0x1ac/0x8a0
<4>[  219.783852]  ? worker_thread+0x1d0/0xb80
<4>[  219.783864]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
<4>[  219.783870]  ? do_raw_spin_lock+0x129/0x290
<4>[  219.783886]  worker_thread+0x82/0xb80
<4>[  219.783895]  ? __kthread_parkme+0xaf/0x1b0
<4>[  219.783902]  ? process_one_work+0x1870/0x1870
<4>[  219.783906]  kthread+0x34e/0x420
<4>[  219.783911]  ? kthread_create_on_node+0xc0/0xc0
<4>[  219.783918]  ret_from_fork+0x3a/0x50

<3>[  219.783950] Allocated by task 1264:
<4>[  219.783975]  save_stack+0x19/0x40
<4>[  219.783978]  __kasan_kmalloc.constprop.3+0xa0/0xd0
<4>[  219.784029]  i915_gem_create_context+0xa2/0xab8 [i915]
<4>[  219.784081]  i915_gem_context_create_ioctl+0x1fa/0x450 [i915]
<4>[  219.784085]  drm_ioctl_kernel+0x1d8/0x270
<4>[  219.784088]  drm_ioctl+0x676/0x930
<4>[  219.784092]  ksys_ioctl+0xb7/0xe0
<4>[  219.784096]  __x64_sys_ioctl+0x6a/0xb0
<4>[  219.784100]  do_syscall_64+0x94/0x530
<4>[  219.784103]  entry_SYSCALL_64_after_hwframe+0x49/0xb3

<3>[  219.784120] Freed by task 12:
<4>[  219.784141]  save_stack+0x19/0x40
<4>[  219.784145]  __kasan_slab_free+0x130/0x180
<4>[  219.784148]  kmem_cache_free_bulk+0x1bd/0x500
<4>[  219.784152]  kfree_rcu_work+0x1d8/0x890
<4>[  219.784155]  process_one_work+0x8ca/0x1870
<4>[  219.784158]  worker_thread+0x82/0xb80
<4>[  219.784162]  kthread+0x34e/0x420
<4>[  219.784165]  ret_from_fork+0x3a/0x50

Fixes: 2e46a2a0b0 ("drm/i915: Use explicit flag to mark unreachable intel_context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428090255.10035-1-chris@chris-wilson.co.uk
2020-04-29 15:16:00 +01:00
Nathan Chancellor
2ea4a7ba9b drm/i915/gt: Avoid uninitialized use of rpcurupei in frequency_show
When building with clang + -Wuninitialized:

drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:407:7: warning: variable
'rpcurupei' is uninitialized when used here [-Wuninitialized]
                           rpcurupei,
                           ^~~~~~~~~
drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:304:16: note: initialize the
variable 'rpcurupei' to silence this warning
                u32 rpcurupei, rpcurup, rpprevup;
                             ^
                              = 0
1 warning generated.

rpupei is assigned twice; based on the second argument to
intel_uncore_read, it seems this one should have been assigned to
rpcurupei.

Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Link: https://github.com/ClangBuiltLinux/linux/issues/1016
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429030051.920203-1-natechancellor@gmail.com
2020-04-29 07:46:21 +01:00
Matt Roper
8598eb781c drm/i915: Use proper fault mask in interrupt postinstall too
The IRQ postinstall handling had open-coded pipe fault mask selection
that never got updated for gen11.  Switch it to use
gen8_de_pipe_fault_mask() to ensure we don't miss updates for new
platforms.

Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: d506a65d56 ("drm/i915: Catch GTT fault errors for gen11+ planes")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424231423.4065231-1-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 869129ee0c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-28 16:38:03 -07:00
Chris Wilson
f6a7c21c99 drm/i915/execlists: Verify we don't submit two identical CCIDs
Check that we do not submit two contexts into ELSP with the same CCID
[upper portion of the descriptor].

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1793
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-3-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Chris Wilson
5c4a53e3b1 drm/i915/execlists: Track inflight CCID
The presumption is that by using a circular counter that is twice as
large as the maximum ELSP submission, we would never reuse the same CCID
for two inflight contexts.

However, if we continually preempt an active context such that it always
remains inflight, it can be resubmitted with an arbitrary number of
paired contexts. As each of its paired contexts will use a new CCID,
eventually it will wrap and submit two ELSP with the same CCID.

Rather than use a simple circular counter, switch over to a small bitmap
of inflight ids so we can avoid reusing one that is still potentially
active.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Chris Wilson
2632f174a2 drm/i915/execlists: Avoid reusing the same logical CCID
The bspec is confusing on the nature of the upper 32bits of the LRC
descriptor. Once upon a time, it said that it uses the upper 32b to
decide if it should perform a lite-restore, and so we must ensure that
each unique context submitted to HW is given a unique CCID [for the
duration of it being on the HW]. Currently, this is achieved by using
a small circular tag, and assigning every context submitted to HW a
new id. However, this tag is being cleared on repinning an inflight
context such that we end up re-using the 0 tag for multiple contexts.

To avoid accidentally clearing the CCID in the upper 32bits of the LRC
descriptor, split the descriptor into two dwords so we can update the
GGTT address separately from the CCID.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Daniel Vetter
274ed9e9ea drm/i915: Use devm_drm_dev_alloc
Luckily we're already well set up in the main driver, with
drm_dev_put() being the last thing in both the unload error case and
the pci remove function.

Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-39-daniel.vetter@ffwll.ch
2020-04-28 20:51:33 +02:00
Matt Atwood
f9d77427c3 drm/i915/tgl: Wa_14011059788
Reflect recent Bspec changes

v2: fix whitespace, typo

Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Radhakrishna Sripada <Radhakrishna.sripada@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415193535.14597-1-matthew.s.atwood@intel.com
2020-04-28 11:14:34 -07:00
Chris Wilson
96a4faf524 drm/i915/selftests: Tweak the tolerance for clock ticks to 12.5%
Give a small bump for our tolerance on comparing the expected vs
measured clock ticks/time from 10% to 12.5% to accommodate a bad result
on Sandybridge that was off by 10.3%. Hopefully, that is the worst we
will see.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1802
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428114307.5153-1-chris@chris-wilson.co.uk
2020-04-28 14:25:21 +01:00
Colin Ian King
d631461d5c drm/i915/gt: fix spelling mistake "evalution" -> "evaluation"
There is a spelling mistaking in a pr_notice message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428084920.1035125-1-colin.king@canonical.com
2020-04-28 09:53:59 +01:00
Matt Roper
869129ee0c drm/i915: Use proper fault mask in interrupt postinstall too
The IRQ postinstall handling had open-coded pipe fault mask selection
that never got updated for gen11.  Switch it to use
gen8_de_pipe_fault_mask() to ensure we don't miss updates for new
platforms.

Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: d506a65d56 ("drm/i915: Catch GTT fault errors for gen11+ planes")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424231423.4065231-1-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2020-04-27 11:36:41 -07:00
Chris Wilson
2abaad4eb5 drm/i915/gt: Check cacheline is valid before acquiring
The hwsp_cacheline pointer from i915_request is very, very flimsy. The
i915_request.timeline (and the hwsp_cacheline) are lost upon retiring
(after an RCU grace). Therefore we need to confirm that once we have the
right pointer for the cacheline, it is not in the process of being
retired and disposed of before we attempt to acquire a reference to the
cacheline.

<3>[  547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915]
<3>[  547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536

<4>[  547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G     U            5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1
<4>[  547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016
<4>[  547.208564] Call Trace:
<4>[  547.208579]  dump_stack+0x96/0xdb
<4>[  547.208707]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208719]  print_address_description.constprop.6+0x16/0x310
<4>[  547.208841]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208963]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208975]  __kasan_report+0x137/0x190
<4>[  547.209106]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.209127]  kasan_report+0x32/0x50
<4>[  547.209257]  ? i915_gemfs_fini+0x40/0x40 [i915]
<4>[  547.209376]  active_debug_hint+0x6a/0x70 [i915]
<4>[  547.209389]  debug_print_object+0xa7/0x220
<4>[  547.209405]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  547.209426]  debug_object_assert_init+0x297/0x430
<4>[  547.209449]  ? debug_object_free+0x360/0x360
<4>[  547.209472]  ? lock_acquire+0x1ac/0x8a0
<4>[  547.209592]  ? intel_timeline_read_hwsp+0x4f/0x840 [i915]
<4>[  547.209737]  ? i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[  547.209861]  i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[  547.209990]  ? __live_alloc.isra.15+0xc0/0xc0 [i915]
<4>[  547.210005]  ? rcu_read_lock_sched_held+0xd0/0xd0
<4>[  547.210017]  ? print_usage_bug+0x580/0x580
<4>[  547.210153]  intel_timeline_read_hwsp+0xbc/0x840 [i915]
<4>[  547.210284]  __emit_semaphore_wait+0xd5/0x480 [i915]
<4>[  547.210415]  ? i915_fence_get_timeline_name+0x110/0x110 [i915]
<4>[  547.210428]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  547.210442]  ? _raw_spin_unlock_irq+0x2a/0x40
<4>[  547.210567]  ? __await_execution.constprop.51+0x2e0/0x570 [i915]
<4>[  547.210706]  i915_request_await_dma_fence+0x8f7/0xc70 [i915]

Fixes: 85bedbf191 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427093038.29219-1-chris@chris-wilson.co.uk
(cherry picked from commit 2759e39535)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-27 09:47:40 -07:00
Chris Wilson
f524a774a4 drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()
While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.

An example of forgetting to hold the obj->vma.lock is

[1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
[1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
[1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
[1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
[1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
[1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
[1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
[1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
[1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
[1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
[1642834.465034] FS:  00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
[1642834.465036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
[1642834.465038] Call Trace:
[1642834.465083]  i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
[1642834.465121]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465151]  drm_ioctl_kernel+0x86/0xd0 [drm]
[1642834.465156]  ? avc_has_perm+0x3b/0x160
[1642834.465178]  drm_ioctl+0x206/0x390 [drm]
[1642834.465216]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465221]  ? selinux_file_ioctl+0x122/0x1c0
[1642834.465226]  ? __do_munmap+0x24b/0x4d0
[1642834.465231]  ksys_ioctl+0x82/0xc0
[1642834.465235]  __x64_sys_ioctl+0x16/0x20
[1642834.465238]  do_syscall_64+0x5b/0xf0
[1642834.465243]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1642834.465245] RIP: 0033:0x7f6a3d7b047b
[1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
[1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
[1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
[1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
[1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30

Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.

We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
(cherry picked from commit cb593e5d2b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-27 09:47:37 -07:00
Xiyu Yang
5d5e100a20 drm/i915/selftests: Fix i915_address_space refcnt leak
igt_ppgtt_pin_update() invokes i915_gem_context_get_vm_rcu(), which
returns a reference of the i915_address_space object to "vm" with
increased refcount.

When igt_ppgtt_pin_update() returns, "vm" becomes invalid, so the
refcount should be decreased to keep refcount balanced.

The reference counting issue happens in two exception handling paths of
igt_ppgtt_pin_update(). When i915_gem_object_create_internal() returns
IS_ERR, the refcnt increased by i915_gem_context_get_vm_rcu() is not
decreased, causing a refcnt leak.

Fix this issue by jumping to "out_vm" label when
i915_gem_object_create_internal() returns IS_ERR.

Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1587361342-83494-1-git-send-email-xiyuyang19@fudan.edu.cn
(cherry picked from commit e07c7606a0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-27 09:47:33 -07:00
Chris Wilson
6dc0d028f5 drm/i915/gt: Fix up clock frequency
The bspec lists both the clock frequency and the effective interval. The
interval corresponds to observed behaviour, so adjust the frequency to
match.

v2: Mika rightfully asked if we could measure the clock frequency from a
selftest.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427154554.12736-1-chris@chris-wilson.co.uk
2020-04-27 17:34:33 +01:00
Chris Wilson
4243cd5388 drm/i915/gt: Sanitize GT first
We see that if the HW doesn't actually sleep, the HW may eat the poison
we set in its write-only HWSP during sanitize:

  intel_gt_resume.part.8: 0000:00:02.0
  __gt_unpark: 0000:00:02.0
  gt_sanitize: 0000:00:02.0 force:yes
  process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90
  process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a
  assert_pending_valid: Nothing pending for promotion!

The CS TAIL pointer should have been reset by reset_csb_pointers(), so
in this case it is likely that we have read back from the CPU cache and
so we must clflush our control over that page. In doing so, push the
sanitisation to the start of the GT sequence so that our poisoning is
assuredly before we start talking to the HW.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk
2020-04-27 11:39:23 +01:00
Chris Wilson
2759e39535 drm/i915/gt: Check cacheline is valid before acquiring
The hwsp_cacheline pointer from i915_request is very, very flimsy. The
i915_request.timeline (and the hwsp_cacheline) are lost upon retiring
(after an RCU grace). Therefore we need to confirm that once we have the
right pointer for the cacheline, it is not in the process of being
retired and disposed of before we attempt to acquire a reference to the
cacheline.

<3>[  547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915]
<3>[  547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536

<4>[  547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G     U            5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1
<4>[  547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016
<4>[  547.208564] Call Trace:
<4>[  547.208579]  dump_stack+0x96/0xdb
<4>[  547.208707]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208719]  print_address_description.constprop.6+0x16/0x310
<4>[  547.208841]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208963]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208975]  __kasan_report+0x137/0x190
<4>[  547.209106]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.209127]  kasan_report+0x32/0x50
<4>[  547.209257]  ? i915_gemfs_fini+0x40/0x40 [i915]
<4>[  547.209376]  active_debug_hint+0x6a/0x70 [i915]
<4>[  547.209389]  debug_print_object+0xa7/0x220
<4>[  547.209405]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  547.209426]  debug_object_assert_init+0x297/0x430
<4>[  547.209449]  ? debug_object_free+0x360/0x360
<4>[  547.209472]  ? lock_acquire+0x1ac/0x8a0
<4>[  547.209592]  ? intel_timeline_read_hwsp+0x4f/0x840 [i915]
<4>[  547.209737]  ? i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[  547.209861]  i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[  547.209990]  ? __live_alloc.isra.15+0xc0/0xc0 [i915]
<4>[  547.210005]  ? rcu_read_lock_sched_held+0xd0/0xd0
<4>[  547.210017]  ? print_usage_bug+0x580/0x580
<4>[  547.210153]  intel_timeline_read_hwsp+0xbc/0x840 [i915]
<4>[  547.210284]  __emit_semaphore_wait+0xd5/0x480 [i915]
<4>[  547.210415]  ? i915_fence_get_timeline_name+0x110/0x110 [i915]
<4>[  547.210428]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  547.210442]  ? _raw_spin_unlock_irq+0x2a/0x40
<4>[  547.210567]  ? __await_execution.constprop.51+0x2e0/0x570 [i915]
<4>[  547.210706]  i915_request_await_dma_fence+0x8f7/0xc70 [i915]

Fixes: 85bedbf191 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427093038.29219-1-chris@chris-wilson.co.uk
2020-04-27 11:39:23 +01:00
Chris Wilson
68ace460c5 drm/i915/execlists: Check preempt-timeout target before submit_ports
We evaluate *active, which is a pointer into execlists->inflight[]
during dequeue to decide how long a preempt-timeout we need to apply.
However, as soon as we do the submit_ports, the HW may send its ACK
interrupt causing us to promote execlists->pending[] tp
execlists->inflight[], overwriting the value of *active. We know *active
is only stable until we submit (as we only submit when there is no
pending promotion).

[   16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915]
[   16.102356]
[   16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1:
[   16.102780]  execlists_dequeue+0x1449/0x1600 [i915]
[   16.103160]  __execlists_submission_tasklet+0x48/0x60 [i915]
[   16.103540]  execlists_submit_request+0x38e/0x3c0 [i915]
[   16.103940]  submit_notify+0x8f/0xc0 [i915]
[   16.104308]  __i915_sw_fence_complete+0x61/0x420 [i915]
[   16.104683]  i915_sw_fence_complete+0x58/0x80 [i915]
[   16.105054]  i915_sw_fence_commit+0x16/0x20 [i915]
[   16.105457]  __i915_request_queue+0x60/0x70 [i915]
[   16.105843]  i915_gem_do_execbuffer+0x2d6b/0x4230 [i915]
[   16.106227]  i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915]
[   16.106257]  drm_ioctl_kernel+0xe9/0x130
[   16.106279]  drm_ioctl+0x27d/0x45e
[   16.106311]  ksys_ioctl+0x89/0xb0
[   16.106336]  __x64_sys_ioctl+0x42/0x60
[   16.106370]  do_syscall_64+0x6e/0x2c0
[   16.106397]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200426094231.21995-1-chris@chris-wilson.co.uk
2020-04-27 11:36:59 +01:00
Nick Desaulniers
9f4069b055 drm/i915: re-disable -Wframe-address
The top level Makefile disables this warning. When building an
i386_defconfig with Clang, this warning is triggered a whole bunch via
includes of headers from perf.

Link: https://github.com/ClangBuiltLinux/continuous-integration/pull/182
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200426214215.139435-1-ndesaulniers@google.com
2020-04-27 09:58:27 +01:00
Mika Kuoppala
b8a1181122 drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL
Use indirect ctx bb to load cmd buffer control value
from context image to avoid corruption.

v2: add to lrc layout (Chris)
v3: end to a cacheline (Chris)
v4: add to lrc fixed (Chris)
v5: value in offset+1

Testcase: igt/i915_selftest/gt_lrc
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com
2020-04-25 19:08:56 +01:00
Mika Kuoppala
1dd47b54ba drm/i915: Add live selftests for indirect ctx batchbuffers
Indirect ctx batchbuffers are a hw feature of which
batch can be run, by hardware, during context restoration stage.
Driver can setup a batchbuffer and also an offset into the
context image. When context image is marshalled from
memory to registers, and when the offset from the start of
context register state is equal of what driver pre-determined,
batch will run. So one can manipulate context restoration
process at cacheline granularity, given some limitations,
as you need to have rudimentaries in place before you can
run a batch.

Add selftest which will write the ring start register
to a canary spot. This will test that hardware will run a
batchbuffer for the context in question.

v2: request wait fix, naming (Chris)
v3: test order (Chris)
v4: rebase

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com
2020-04-25 19:08:18 +01:00
Mika Kuoppala
685d21096f drm/i915: Add per ctx batchbuffer wa for timestamp
Restoration of a previous timestamp can collide
with updating the timestamp, causing a value corruption.

Combat this issue by using indirect ctx bb to
modify the context image during restoring process.

We can preload value into scratch register. From which
we then do the actual write with LRR. LRR is faster and
thus less error prone as probability of race drops.

v2: tidying (Chris)
v3: lrr for all engines
v4: grp
v5: reg bit
v6: wa_bb_offset, virtual engines (Chris)

References: HSDES#16010904313
Testcase: igt/i915_selftest/gt_lrc
Suggested-by: Joseph Koston <joseph.koston@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com
2020-04-25 18:39:32 +01:00
Mika Kuoppala
168c6d231b drm/i915: Add engine scratch register to live_lrc_fixed
General purpose registers are per engine and
in a fixed location. Add to live_lrc_fixed.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-1-mika.kuoppala@linux.intel.com
2020-04-25 17:58:33 +01:00
Chris Wilson
9669a50799 drm/i915: Drop rq->ring->vma peeking from error capture
We only hold the active spinlock while dumping the error state, and this
does not prevent another thread from retiring the request -- as it is
quite possible that despite us capturing the current state, the GPU has
completed the request. As such, it is dangerous to dereference state
below the request as it may already be freed, and the simplest way to
avoid the danger is not include it in the error state.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424191410.27570-1-chris@chris-wilson.co.uk
2020-04-24 22:14:35 +01:00
Rafael J. Wysocki
e07515563d PM: sleep: core: Rename DPM_FLAG_NEVER_SKIP
Rename DPM_FLAG_NEVER_SKIP to DPM_FLAG_NO_DIRECT_COMPLETE which
matches its purpose more closely.

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for PCI parts
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24 21:33:09 +02:00
Chris Wilson
9c878557b1 drm/i915/gt: Use the RPM config register to determine clk frequencies
For many configuration details within RC6 and RPS we are programming
intervals for the internal clocks. From gen11, these clocks are
configuration via the RPM_CONFIG and so for convenience, we would like
to convert to/from more natural units (ns).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk
2020-04-24 19:10:17 +01:00
Chris Wilson
555a322429 drm/i915/gt: Trace RPS events
Add tracek to the RPS events (interrupts, worker, enabling, threshold
selection, frequency setting), so that if we have to debug reticent HW
we have some traces to start from.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk
2020-04-24 18:38:46 +01:00
Chris Wilson
1ebf7aaf3a drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and
upon receipt of that interrupt we reprogram the GPU clocks down to the
next idle notch [to help convserve power during rc6]. However, on
execlists, we benefit from soft-rc6 immediately parking the GPU and
setting idle frequencies upon idling [within a jiffie], and here the
interrupt prevents us from restarting from our last frequency.

In the process, we can simply opt for a static pm_events mask and rely
on the enable/disable interrupts to flush the worker on parking.

This will reduce the amount of oscillation observed during steady
workloads with microsleeps, as each time the rc6 timeout occurs we
immediately follow with a waitboost for a dropped frame.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk
2020-04-24 17:20:58 +01:00
Ville Syrjälä
7db8736db0 drm/i915: Split some long lines
Split some overly long lines.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-4-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24 17:59:59 +03:00
Ville Syrjälä
8fdda38549 drm/i915: Introduce .set_idle_link_train() vfunc
Relocate a bunch of DDI specific code from intel_dp.c to intel_ddi.c
by introducing a .set_idle_link_train() vfunc.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-3-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24 17:57:15 +03:00
Ville Syrjälä
fb83f72c48 drm/i915: Introduce .set_signal_levels() vfunc
Sort out some of the mess between intel_ddi.c intel_dp.c by
introducing a .set_signal_levels() vfunc.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-2-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24 17:53:26 +03:00
Ville Syrjälä
eee3f91195 drm/i915: Introduce .set_link_train() vfunc
Sort out some of the mess between intel_ddi.c intel_dp.c by
introducing a .set_link_train() vfunc.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24 17:45:44 +03:00
Ville Syrjälä
d7ff281c6d drm/i915: Have pfit calculations return an error code
Change intel_{gmch,pch}_panel_fitting() to return a normal
error vs. success int. We'll need this later to validate that
the margin properties aren't misconfigured.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-6-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
2020-04-24 17:37:22 +03:00
Ville Syrjälä
4cecc7c0cc drm/i915: Pass connector state to pfit calculations
Pass the entire connector state to intel_{gmch,pch}_panel_fitting().
For now we just need to get at .scaling_mode but in the future we'll
want access to the margin properties as well.

v2: Deal with intel_dp_ycbcr420_config()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-5-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
2020-04-24 17:33:35 +03:00