We removed retiring requests from the shrinker in order to decouple the
mutexes from reclaim in preparation for unravelling the struct_mutex.
The impact of not retiring is that we are much less agressive in making
global objects available for shrinking, as such objects remain pinned
until they are flushed by a heartbeat pulse following the last retired
request along their timeline. In order to ensure that pulse occurs in
time for memory reclamation, we should kick it from kswapd.
The catch is that we have added some flush_work() into the retirement
phase (to ensure that we reach a global idle in a timely manner), but
these flush_work() are not eligible (i.e do not belong to WQ_MEM_RELCAIM)
for use from inside kswapd. To avoid flushing those workqueues, we teach
the retirer not to do so unless we are actually waiting, and only do the
plain retire from inside the shrinker.
Note that for execlists, we already retire completed contexts as they
are scheduled out, so it should not be keeping global state
unnecessarily pinned. The legacy ringbuffer however...
References: 9e9539800d ("drm/i915: Remove waiting & retiring from shrinker paths")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708173748.32734-1-chris@chris-wilson.co.uk
Avoid waking up the device and taking stale locks if we know that the
object is not currently mmapped. This is particularly useful as not many
object are actually mmapped and so we can destroy them without waking
the device up, and gives us a little more freedom of workqueue ordering
during shutdown.
v2: Pull the release_mmap() into its single user in freeing the objects,
where there can not be any race with a concurrent user of the freed
object. Or so one hopes!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-2-chris@chris-wilson.co.uk
The obj->lut_list is traversed when the object is closed as the file
table is destroyed during process termination. As this occurs before we
kill any outstanding context if, due to some bug or another, the closure
is blocked, then we fail to shootdown any inflight operations
potentially leaving the GPU spinning forever. As we only need to guard
the list against concurrent closures and insertions, the hold is short
and merits being treated as a simple spinlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
Rearrange the allocation of the mm_struct registration to avoid
allocating underneath the i915->mm_lock, so that we avoid tainting the
lock (and in turn many other locks that may be held as i915->mm_lock is
taken, and those locks we may want on the free [shrinker] paths). In
doing so, we convert the lookup to be RCU protected by courtesy of
converting the free-worker to be an rcu_work.
v2: Remember to use hash_rcu variants to protect the list iteration from
concurrent add/del.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619194038.5088-1-chris@chris-wilson.co.uk
Catch up with upstream, in particular to get c1e8d7c6a7 ("mmap locking
API: convert mmap_sem comments").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Start using device specific parameters instead of module parameters for
most things. The module parameters become the immutable initial values
for i915 parameters. The device specific parameters in i915->params
start life as a copy of i915_modparams. Any later changes are only
reflected in the debugfs.
The stragglers are:
* i915.force_probe and i915.modeset. Needed before dev_priv is
available. This is fine because the parameters are read-only and never
modified.
* i915.verbose_state_checks. Passing dev_priv to I915_STATE_WARN and
I915_STATE_WARN_ON would result in massive and ugly churn. This is
handled by not exposing the parameter via debugfs, and leaving the
parameter writable in sysfs. This may be fixed up in follow-up work.
* i915.inject_probe_failure. Only makes sense in terms of the module,
not the device. This is handled by not exposing the parameter via
debugfs.
v2: Fix uc i915 lookup code (Michał Winiarski)
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200618150402.14022-1-jani.nikula@intel.com
Pull drm fixes from Dave Airlie:
"One sun4i fix and a connector hotplug race The ast fix is for a
regression in 5.6, and one of the i915 ones fixes an oops reported by
dhowells.
core:
- fix race in connectors sending hotplug
i915:
- Avoid use after free in cmdparser
- Avoid NULL dereference when probing all display encoders
- Fixup to module parameter type
sun4i:
- clock divider fix
ast:
- 24/32 bpp mode setting fix"
* tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm:
drm/ast: fix missing break in switch statement for format->cpp[0] case 4
drm/sun4i: hdmi ddc clk: Fix size of m divider
drm/i915/display: Only query DP state of a DDI encoder
drm/i915/params: fix i915.reset module param type
drm/i915/gem: Mark the buffer pool as active for the cmdparser
drm/connector: notify userspace on hotplug after register complete
Pull i915 uaccess updates from Al Viro:
"Low-hanging fruit in i915; there are several trickier followups, but
that'll wait for the next cycle"
* 'uaccess.i915' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
i915:get_engines(): get rid of pointless access_ok()
i915: alloc_oa_regs(): get rid of pointless access_ok()
i915 compat ioctl(): just use drm_ioctl_kernel()
i915: switch copy_perf_config_registers_or_number() to unsafe_put_user()
i915: switch query_{topology,engine}_info() to copy_to_user()
Pull drm fixes from Dave Airlie:
"These are the fixes from last week for the stuff merged in the merge
window. It got a bunch of nouveau fixes for HDA audio on some new
GPUs, some i915 and some amdpgu fixes.
i915:
- gvt: Fix one clang warning on debug only function
- Use ARRAY_SIZE for coccicheck warning
- Use after free fix for display global state.
- Whitelisting context-local timestamp on Gen9 and two scheduler
fixes with deps (Cc: stable)
- Removal of write flag from sysfs files where ineffective
nouveau:
- HDMI/DP audio HDA fixes
- display hang fix for Volta/Turing
- GK20A regression fix.
amdgpu:
- Prevent hwmon accesses while GPU is in reset
- CTF interrupt fix
- Backlight fix for renoir
- Fix for display sync groups
- Display bandwidth validation workaround"
* tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (28 commits)
drm/nouveau/kms/nv50-: clear SW state of disabled windows harder
drm/nouveau: gr/gk20a: Use firmware version 0
drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs
drm/nouveau/disp/gp100: split SOR implementation from gm200
drm/nouveau/disp: modify OR allocation policy to account for HDA requirements
drm/nouveau/disp: split part of OR allocation logic into a function
drm/nouveau/disp: provide hint to OR allocation about HDA requirements
drm/amd/display: Revalidate bandwidth before commiting DC updates
drm/amdgpu/display: use blanked rather than plane state for sync groups
drm/i915/params: fix i915.fake_lmem_start module param sysfs permissions
drm/i915/params: don't expose inject_probe_failure in debugfs
drm/i915: Whitelist context-local timestamp in the gen9 cmdparser
drm/i915: Fix global state use-after-frees with a refcount
drm/i915: Check for awaits on still currently executing requests
drm/i915/gt: Do not schedule normal requests immediately along virtual
drm/i915: Reorder await_execution before await_request
drm/nouveau/kms/gt215-: fix race with audio driver runpm
drm/nouveau/disp/gm200-: fix NV_PDISP_SOR_HDMI2_CTRL(n) selection
Revert "drm/amd/display: disable dcn20 abm feature for bring up"
drm/amd/powerplay: ack the SMUToHost interrupt on receive V2
...
If the execbuf is interrupted after building the cmdparser pipeline, and
before we commit to submitting the request to HW, we would attempt to
clean up the cmdparser early. While we held active references to the vma
being parsed and constructed, we did not hold an active reference for
the buffer pool itself. The result was that an interrupted execbuf could
still have run the cmdparser pipeline, but since the buffer pool was
idle, its target vma could have been recycled.
Note this problem only occurs if the cmdparser is running async due to
pipelined waits on busy fences, and the execbuf is interrupted.
Fixes: 686c7c35ab ("drm/i915/gem: Asynchronous cmdparser")
Fixes: 16e8745967 ("drm/i915/gt: Move the batch buffer pool from the engine to the gt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
(cherry picked from commit 57a78ca4ec)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reduce the 3 relocation paths down to the single path that accommodates
all. The primary motivation for this is to guard the relocations with a
natural fence (derived from the i915_request used to write the
relocation from the GPU).
The tradeoff in using async gpu relocations is that it increases latency
over using direct CPU relocations, for the cases where the target is
idle and accessible by the CPU. The benefit is greatly reduced lock
contention and improved concurrency by pipelining.
Note that forcing the async gpu relocations does reveal a few issues
they have. Firstly, is that they are visible as writes to gem_busy,
causing to mark some buffers are being to written to by the GPU even
though userspace only reads. Secondly is that, in combination with the
cmdparser, they can cause priority inversions. This should be the case
where the work is being put into a common workqueue losing our priority
information and so being executed in FIFO from the worker, denying us
the opportunity to reorder the requests afterwards.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk
If the execbuf is interrupted after building the cmdparser pipeline, and
before we commit to submitting the request to HW, we would attempt to
clean up the cmdparser early. While we held active references to the vma
being parsed and constructed, we did not hold an active reference for
the buffer pool itself. The result was that an interrupted execbuf could
still have run the cmdparser pipeline, but since the buffer pool was
idle, its target vma could have been recycled.
Note this problem only occurs if the cmdparser is running async due to
pipelined waits on busy fences, and the execbuf is interrupted.
Fixes: 686c7c35ab ("drm/i915/gem: Asynchronous cmdparser")
Fixes: 16e8745967 ("drm/i915/gt: Move the batch buffer pool from the engine to the gt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
Merge updates from Andrew Morton:
"A few little subsystems and a start of a lot of MM patches.
Subsystems affected by this patch series: squashfs, ocfs2, parisc,
vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
swap, memcg, pagemap, memory-failure, vmalloc, kasan"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
kasan: move kasan_report() into report.c
mm/mm_init.c: report kasan-tag information stored in page->flags
ubsan: entirely disable alignment checks under UBSAN_TRAP
kasan: fix clang compilation warning due to stack protector
x86/mm: remove vmalloc faulting
mm: remove vmalloc_sync_(un)mappings()
x86/mm/32: implement arch_sync_kernel_mappings()
x86/mm/64: implement arch_sync_kernel_mappings()
mm/ioremap: track which page-table levels were modified
mm/vmalloc: track which page-table levels were modified
mm: add functions to track page directory modifications
s390: use __vmalloc_node in stack_alloc
powerpc: use __vmalloc_node in alloc_vm_stack
arm64: use __vmalloc_node in arch_alloc_vmap_stack
mm: remove vmalloc_user_node_flags
mm: switch the test_vmalloc module to use __vmalloc_node
mm: remove __vmalloc_node_flags_caller
mm: remove both instances of __vmalloc_node_flags
mm: remove the prot argument to __vmalloc_node
mm: remove the pgprot argument to __vmalloc
...
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.
Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.
End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.
So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.
At the same time, some users simply don't even care.
For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.
This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.
The current semantics end up being:
- __get_user_pages_fast(): no change. If you don't ask for a write,
you won't break COW. You'd better know what you're doing.
- get_user_pages_fast(): the fast-case "look it up in the page tables
without anything getting mmap_sem" now refuses to follow a read-only
page, since it might need COW breaking. Which happens in the slow
path - the fast path doesn't know if the memory might be COW or not.
- get_user_pages() (including the slow-path fallback for gup_fast()):
for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
very similar semantics to FOLL_FORCE.
If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path. So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".
Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.
But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.
[ It might be worth noting that we've always had this ambiguity, and it
could arguably be seen as a user-space issue.
You only get private COW mappings that could break either way in
situations where user space is doing cooperative things (ie fork()
before an execve() etc), but it _is_ surprising and very subtle, and
fork() is supposed to give you independent address spaces.
So let's treat this as a kernel issue and make the semantics of
get_user_pages() easier to understand. Note that obviously a true
shared mapping will still get a page that can change under us, so this
does _not_ mean that get_user_pages() somehow returns any "stable"
page ]
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull uaccess/readdir updates from Al Viro:
"Finishing the conversion of readdir.c to unsafe_... API.
This includes the uaccess_{read,write}_begin series by Christophe
Leroy"
* 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
readdir.c: get rid of the last __put_user(), drop now-useless access_ok()
readdir.c: get compat_filldir() more or less in sync with filldir()
switch readdir(2) to unsafe_copy_dirent_name()
drm/i915/gem: Replace user_access_begin by user_write_access_begin
uaccess: Selectively open read or write user access
uaccess: Add user_read_access_begin/end and user_write_access_begin/end
Since the removal of the no-semaphore boosting, we rely on timeslicing to
reorder passed inter-dependency hogs across the engines. However, we
require preemption to support timeslicing into user payloads, and not all
machine support preemption so we do not universally enable timeslicing,
even when it would correctly preempt our own inter-engine semaphores.
Since timeslicing and semaphore priority deboosting is now disabled on
Broadwell/Braswell, we have to follow suite and not use semaphores.
Testcase: igt/gem_exec_schedule/semaphore-codependency # bdw/bsw
Fixes: 18e4af04d2 ("drm/i915: Drop no-semaphore boosting")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-1-chris@chris-wilson.co.uk
(cherry picked from commit 0eb670aac2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since the removal of the no-semaphore boosting, we rely on timeslicing to
reorder passed inter-dependency hogs across the engines. However, we
require preemption to support timeslicing into user payloads, and not all
machine support preemption so we do not universally enable timeslicing,
even when it would correctly preempt our own inter-engine semaphores.
Since timeslicing and semaphore priority deboosting is now disabled on
Broadwell/Braswell, we have to follow suite and not use semaphores.
Testcase: igt/gem_exec_schedule/semaphore-codependency # bdw/bsw
Fixes: 18e4af04d2 ("drm/i915: Drop no-semaphore boosting")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-1-chris@chris-wilson.co.uk
UAPI Changes:
- drm/i915: Show per-engine default property values in sysfs
By providing the default values configured into the kernel via sysfs, it
is much more convenient for userspace to restore those sane defaults, or
at least know what are considered good baseline. This is useful, for
example, to cleanup after any failed userspace prior to commencing new
jobs.
Cross-subsystem Changes:
- video/hdmi: Add Unpack only function for DRM infoframe
- Includes pull request gvt-next-2020-05-12
Driver Changes:
- Restore Cherryview back to full-ppgtt (Chris, Mika)
- Document locking guidelines for i915 (Chris, Daniel, Joonas)
- Fix GitLab #1746: Handle idling during i915_gem_evict_something busy loops (Chris)
- Display WA #1105: Require linear fb stride to be multiple of 512 bytes on
gen9/glk (Ville)
- Add Wa_14010685332 for ICP/ICL (Matt R)
- Restrict w/a 1607087056 for EHL/JSL (Swathi)
- Fix interrupt handling for DP AUX transactions on Tigerlake (Imre)
- Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" (Mika)
- Fix HDC pipeline flush hardware bit on Gen12 (Mika)
- Flush L3 when flushing render on Gen12 (Mika)
- Invalidate aux table entries forcibly between BB on Gen12 (Mika)
- Add aux table invalidate for all engines on Gen12 (Mika)
- Force pte cacheline to main memory Gen8+ (Mika)
- Add and enable TGL+ SAGV support (Stanislav)
- Implement vm_ops->access on i915 mmaps for GDB (Chris, Kristian)
- Replace zero-length array with flexible-array (Gustavo)
- Improve batch buffer pool effectiveness to mitigate soft-rc6 hit (Chris)
- Remove wait priority boosting (Chris)
- Keep driver module referenced when PMU is active (Chris)
- Sanitize RPS interrupts upon resume (Chris)
- Extend pcode read timeout to 20 ms (Chris)
- Wait for ACT sent before enabling MST pipe (Ville)
- Extend support to async relocations to SNB (Chris)
- Remove CNL pre-prod workarounds (Ville)
- Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled (Sultan)
- Record the active CCID from before reset (Chris)
- Mark concurrent submissions with a weak-dependency (Chris)
- Peel dma-fence-chains for await to allow engine-to-engine sync (Lionel)
- Prevent using semaphores to chain up to external fences (Chris)
- Fix GLK watermark calculations (Ville)
- Emit await(batch) before MI_BB_START (Chris)
- Reset execlists registers before HWSP (Chris)
- Drop no-semaphore boosting in favor of fast timeslicing (Chris)
- Fix enabled infoframe states of lspcon (Gwan-gyeong)
- Program DP SDPs on pipe updates (Gwan-gyeong)
- Stop sending DP SDPs on ddi disable (Gwan-gyeong)
- Store CS timestamp frequency in Hz (Ville)
- Remove unused HAS_FWTABLE macro (Pascal)
- Use batchbuffer chaining for relocations to save ring space (Chris)
- Try different engines for relocs if MI ops not supported (Chris, Tvrtko)
- Lazily acquire the device wakeref for freeing objects (Chris)
- Streamline display code arithmetics around rounding etc. (Ville)
- Use bw state for per crtc SAGV evaluation (Stanislav)
- Track active_pipes in bw_state (Stanislav)
- Nuke mode.vrefresh usage (Ville)
- Warn if the FBC is still writing to stolen on removal (Chris)
- Added new PCode commands prepping for QGV rescricting (Stansilav)
- Stop holding onto the pinned_default_state (Chris)
- Propagate error from completed fences (Chris)
- Ignore submit-fences on the same timeline (Chris)
- Pull waiting on an external dma-fence into its routine (Chris)
- Replace the hardcoded I915_FENCE_TIMEOUT with Kconfig (Chris)
- Mark up the racy read of execlists->context_tag (Chris)
- Tidy up the return handling for completed dma-fences (Chris)
- Introduce skl_plane_wm_level accessor (Stanislav)
- Extract SKL SAGV checking (Stanislav)
- Make active_pipes check skl specific (Stanislav)
- Suspend tasklets before resume sanitization (Chris)
- Remove redundant exec_fence (Chris)
- Mark the addition of the initial-breadcrumb in the request (Chris)
- Transfer old virtual breadcrumbs to irq_worker (Chris)
- Read the DP SDPs from the video DIP (Gwan-gyeong)
- Program DP SDPs with computed configs (Gwan-gyeong)
- Add state readout for DP VSC and DP HDR Metadata Infoframe SDP
(Gwan-gyeong)
- Add compute routine for DP PSR VSC SDP (Gwan-gyeong)
- Use new DP VSC SDP compute routine on PSR (Gwan-gyeong)
- Restrict qgv points which don't have enough bandwidth. (Stanislav)
- Nuke pointless div by 64bit (Ville)
- Static checker code fixes (Nathan, Mika, Chris)
- Add logging function for DP VSC SDP (Gwan-gyeong)
- Include HDMI DRM infoframe, DP HDR metadata and DP VSC SDP in the
crtc state dump (Gwan-gyeong)
- Make timeslicing explicit engine property (Chris, Tvrtko)
- Selftest and debugging improvements (Chris)
- Align variable names with BSpec (Ville)
- Tidy up gen8+ breadcrumb emission code (Chris)
- Turn intel_digital_port_connected() in a vfunc (Ville)
- Use stashed away hpd isr bits in intel_digital_port_connected() (Ville)
- Extract i915_cs_timestamp_{ns_to_ticks,tick_to_ns}() (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515160703.GA19043@jlahtine-desk.ger.corp.intel.com
As we no longer use the shmemfs allocation directly, we do not expect to
receive -ENOSPC from a backing store allocation. The potential sources
for -ENOSPC are then our own internal eviction code, so the choice is
either to kill the potential application with SIGBUS or to retry the
faulthandler.
In this patch we retry the fault handler, but since this is a should
never happen condition, it is arguable that we gather up copious debug
and kill the application. At worst, we cause an interruptible busy-wait,
stalling the application -- all causes should be transient and the
system should eventually recover. A small stall is hopefully a better
outcome than random oomkiller.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515200031.12034-1-chris@chris-wilson.co.uk
Now that we have fast timeslicing on semaphores, we no longer need to
prioritise none-semaphore work as we will yield any work blocked on a
semaphore to the next in the queue. Previously with no timeslicing,
blocking on the semaphore caused extremely bad scheduling with multiple
clients utilising multiple rings. Now, there is no impact and we can
remove the complication.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk
Driver Changes:
- Fix GitLab #1698: Performance regression with Linux 5.7-rc1 on
Iris Plus 655 and 4K screen (Chris)
- Add Wa_14011059788 for Tigerlake (Matt A)
- Add per ctx batchbuffer wa for timestamp for Gen12 (Mika)
- Use indirect ctx bb to load cmd buffer control value
from context image to avoid corruption (Mika)
- Enable DP Display Audio WA (Uma, Jani)
- Update forcewake firmware ranges for Icelake (Radhakrishna)
- Add missing deinitialization cases of load failure for display (Jose)
- Implement TC cold sequences for Icelake and Tigerlake (Jose)
- Unbreak enable_dpcd_backlight modparam (Lyude)
- Move the late flush_submission in retire to the end (Chris)
- Demote "Reducing compressed framebufer size" message to info (Peter)
- Push MST link retraining to the hotplug work (Ville)
- Hold obj->vma.lock over for_each_ggtt_vma() (Chris)
- Fix timeout handling during TypeC AUX power well enabling for ICL (Imre)
- Fix skl+ non-scaled pfit modes (Ville)
- Prefer soft-rc6 over RPS DOWN_TIMEOUT (Chris)
- Sanitize GT first before poisoning HWSP (Chris)
- Fix up clock RPS frequency readout (Chris)
- Avoid reusing the same logical CCID (Chris)
- Avoid dereferencing a dead context (Chris)
- Always enable busy-stats for execlists (Chris)
- Apply the aggressive downclocking to parking (Chris)
- Restore aggressive post-boost downclocking (Chris)
- Scrub execlists state on resume (Chris)
- Add debugfs attributes for LPSP (Ansuman)
- Improvements to kernel selftests (Chris, Mika)
- Add tiled blits selftest (Zbigniew)
- Fix error handling in __live_lrc_indirect_ctx_bb() (Dan)
- Add pre/post plane updates for SAGV (Stanislav)
- Add ICL PG3 PW ID for EHL (Anshuman)
- Fix Sphinx build duplicate label warning (Jani)
- Error log non-zero audio power refcount after unbind (Jani)
- Remove object_is_locked assertion from unpin_from_display_plane (Chris)
- Use single set of AUX powerwell ops for gen11+ (Matt R)
- Prefer drm_WARN_ON over WARN_ON (Pankaj)
- Poison residual state [HWSP] across resume (Chris, Tvrtko)
- Convert request-before-CS assertion to debug (Chris)
- Carefully order virtual_submission_tasklet (Chris)
- Check carefully for an idle engine in wait-for-idle (Chris)
- Only close vma we open (Chris)
- Trace RPS events (Chris)
- Use the RPM config register to determine clk frequencies (Chris)
- Drop rq->ring->vma peeking from error capture (Chris)
- Check preempt-timeout target before submit_ports (Chris)
- Check HWSP cacheline is valid before acquiring (Chris)
- Use proper fault mask in interrupt postinstall too (Matt R)
- Keep a no-frills swappable copy of the default context state (Chris)
- Add atomic helpers for bandwidth (Stanislav)
- Refactor setting dma info to a common helper from device info (Michael)
- Refactor DDI transcoder code for clairty (Ville)
- Extend PG3 power well ID to ICL (Anshuman)
- Refactor PFIT code for readability and future extensibility (Ville)
- Clarify code split between intel_ddi.c and intel_dp.c (Ville)
- Move out code to return the digital_port of the aux ch (Jose)
- Move rps.enabled/active and use of RPS interrupts to flags (Chris)
- Remove superfluous inlines and dead code (Jani)
- Re-disable -Wframe-address from top-level Makefile (Nick)
- Static checker and spelling fixes (Colin, Nathan)
- Split long lines (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430124904.GA100924@jlahtine-desk.ger.corp.intel.com