drm/i915/execlists: Read the context-status buffer from the HWSP

The engine provides a mirror of the CSB in the HWSP. If we use the
cacheable reads from the HWSP, we can shave off a few mmio reads per
context-switch interrupt (which are quite frequent!). Just removing a
couple of mmio is not enough to actually reduce any latency, but a small
reduction in overall cpu usage.

Much appreciation for Ben dropping the bombshell that the CSB was in the
HWSP and for Michel in digging out the details.

v2: Don't be lazy, add the defines for the indices.
v3: Include the HWSP in debugfs/i915_engine_info
v4: Check for GVT-g, it currently depends on intercepting CSB mmio
v5: Fixup GVT-g mmio path
v6: Disable HWSP if VT-d is active as the iommu adds unpredictable
memory latency. (Mika)
v7: Also markup the CSB read with READ_ONCE() as it may still be an mmio
read and we want to stop the compiler from issuing a later (v.slow) reload.

Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913133534.26927-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
This commit is contained in:
Chris Wilson
2017-09-13 14:35:34 +01:00
parent 34a04e5e46
commit 6d2cb5aa38
3 changed files with 38 additions and 7 deletions

View File

@@ -3315,6 +3315,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
upper_32_bits(addr), lower_32_bits(addr));
if (i915.enable_execlists) {
const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
u32 ptr, read, write;
unsigned int idx;
@@ -3337,10 +3338,12 @@ static int i915_engine_info(struct seq_file *m, void *unused)
write += GEN8_CSB_ENTRIES;
while (read < write) {
idx = ++read % GEN8_CSB_ENTRIES;
seq_printf(m, "\tExeclist CSB[%d]: 0x%08x, context: %d\n",
seq_printf(m, "\tExeclist CSB[%d]: 0x%08x [0x%08x in hwsp], context: %d [%d in hwsp]\n",
idx,
I915_READ(RING_CONTEXT_STATUS_BUF_LO(engine, idx)),
I915_READ(RING_CONTEXT_STATUS_BUF_HI(engine, idx)));
hws[idx * 2],
I915_READ(RING_CONTEXT_STATUS_BUF_HI(engine, idx)),
hws[idx * 2 + 1]);
}
rcu_read_lock();