Commit Graph

5683 Commits

Author SHA1 Message Date
Ben Skeggs
6db25fb13a drm/nouveau/nvif: rename client ctor/dtor
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24 18:50:50 +10:00
Ben Skeggs
188e905ce4 drm/nouveau/kms/tu102: set NVC57D_HEAD_SET_HEAD_USAGE_BOUNDS_UPSCALING_ALLOWED to TRUE
Fixes issues when switching between scaling modes.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:50 +10:00
Gustavo A. R. Silva
f6e7393ede drm/nouveau: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:50 +10:00
Ralph Campbell
7763d24f3b drm/nouveau/vmm/gp100-: fix mapping 2MB sysmem pages
The nvif_object_ioctl() method NVIF_VMM_V0_PFNMAP wasn't correctly
setting the hardware specific GPU page table entries for 2MB sized
pages. Fix this by adding functions to set and clear PD0 GPU page
table entries.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:50 +10:00
Ralph Campbell
e5c7864f62 drm/nouveau/mmu: make nvkm_vmm_ctor() static
The function nvkm_vmm_ctor() is not called outside of the file defining
it, so make it static.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:50 +10:00
Aditya Pakki
8f29432417 drm/nouveau: fix reference count leak in nouveau_debugfs_strap_peek
nouveau_debugfs_strap_peek() calls pm_runtime_get_sync() that
increments the reference count. In case of failure, decrement the
ref count before returning the error.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Aditya Pakki
990a116298 drm/nouveau: Fix reference count leak in nouveau_connector_detect
nouveau_connector_detect() calls pm_runtime_get_sync and in turn
increments the reference count. In case of failure, decrement the
ref count before returning the error.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Aditya Pakki
a2cdf39536 drm/nouveau: fix reference count leak in nv50_disp_atomic_commit
nv50_disp_atomic_commit() calls calls pm_runtime_get_sync and in turn
increments the reference count. In case of failure, decrement the
ref count before returning the error.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Aditya Pakki
659fb5f154 drm/nouveau: fix multiple instances of reference count leaks
On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
ref count before returning the error.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Aditya Pakki
bfad51c763 drm/nouveau/drm/noveau: fix reference count leak in nouveau_fbcon_open
nouveau_fbcon_open() calls calls pm_runtime_get_sync() that
increments the reference count. In case of failure, decrement the
ref count before returning the error.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Ben Skeggs
eddb047329 drm/nouveau/sec2/gp102: allow module to load when LSFW is missing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Ben Skeggs
b9c246ad3b drm/nouveau/gr/gm200-: explicitly handle nofw
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Ben Skeggs
38fd546beb drm/nouveau/pmu/gm200-: explicitly handle nofw
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:49 +10:00
Ben Skeggs
46fc98bfb8 drm/nouveau/pmu/gm20x: don't pretend we support loading with our custom FW
It technically loads, and runs, but is ultimately pointless outside of
a very narrow window (fanless systems where one wants to attempt using
the, broken for a lot of gm20x, memory reclocking code).

It's also potentially dangerous to override the VBIOS-provided "Pre-OS"
PMU, which would be responsible for fan control otherwise.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:48 +10:00
Ben Skeggs
de088372da drm/nouveau/acr: store a mask of LS falcons the controlling LSFW can bootstrap
This will prevent some pain with broken firmware trees, as under some
circumstances the HSFW can fail and leave the GPU in a state we don't
know how to recover from.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:48 +10:00
Ben Skeggs
587debc9a7 drm/nouveau/acr: store a mask of LS falcons the HSFW can bootstrap
This will prevent reloading of HS FW where it's pointless, and bypass
hitting some timeouts.

Not a situation one should generally hit, but can occur with a messed
up firmware installation.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:48 +10:00
Ben Skeggs
90e9cf749a drm/nouveau/acr: allow module to load when HSFW(s) are missing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:48 +10:00
Ben Skeggs
8fdc45e4b6 drm/nouveau/acr: refuse to load LSFW if HSFW is missing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:48 +10:00
Ben Skeggs
8140f92c27 drm/nouveau/core: drop error message when no compatible FW found
This is less than useful with some subdevs having _nofw variants in their
FWIF lists - it's cleaner to handle them all in the same way.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:47 +10:00
Ben Skeggs
b9f327f1af drm/nouveau/mmu/gp100-: enable mmu invalidate depth optimisation
This causes us to invalidate MMU only at the level we made modifications -
ie: if we've only modified PTEs, there's no need to have MMU dump the PDs
it's fetched into L2.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:47 +10:00
Timur Tabi
b448a266cc drm/nouveau/nvfw: firmware structures should begin with nvfw_
Rename all structures that are used directly by firmware to have a nvfw_
prefix.

This makes it easier to identify structures that have a fixed, specific
layout.  A future patch will define several more such structures, so it's
important to be consistent now.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:47 +10:00
Timur Tabi
804f570502 drm/nouveau/tmr: fix nvkm_usec/nvkm_msec definitions
nvkm_timer_wait_init() takes a u64 as a duration parameter, but the
expression "(m) * 1000" will be promoted only to a 32-bit integer,
if 'm' is also an integer.  Changing the 1000 to 1000ULL ensures that
the expression will be 64 bits.

This change currently has no effect as there are no callers of
nvkm_msec() that exceed 2000ms.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:47 +10:00
Ben Skeggs
9c64a8dbcb drm/nouveau/therm/gt215: make gt215_therm_init static
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:47 +10:00
Ben Skeggs
3b54befd49 drm/nouveau/mmu: make a couple of functions static
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:47 +10:00
Ben Skeggs
94cad89ae4 drm/nouveau/mc/gp10b: make gp10b_mc_init static
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:46 +10:00
Ben Skeggs
8b962dc4ec drm/nouveau/nvfw/acr: make lsb_header_tail_dump static
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:46 +10:00
Ben Skeggs
f612b0f66c drm/nouveau/gr/gf100-: make some functions static
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:50:46 +10:00
Ben Skeggs
8869dff1bd drm/nouveau/disp/gm200-: remove 'head' parameter from nvkm_ior_func.hdmi.scdc()
It's no longer required.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:49:34 +10:00
Ben Skeggs
15fbc3b938 drm/nouveau/fbcon: zero-initialise the mode_cmd2 structure
This is tripping up the format modifier patches.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:33:14 +10:00
Ben Skeggs
498595abf5 drm/nouveau/fbcon: fix module unload when fbcon init has failed for some reason
Stale pointer was tripping up the unload path.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:33:14 +10:00
Ben Skeggs
705d9d0229 drm/nouveau/kms/tu102: wait for core update to complete when assigning windows
Fixes a race on Turing between the core cross-channel error checks and
the following window update.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:33:14 +10:00
Ben Skeggs
0508831470 drm/nouveau/kms/gf100: use correct format modifiers
The disp015x classes are used by both gt21x and gf1xx (aside from gf119), but page
kinds differ between Tesla and Fermi.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:33:13 +10:00
Ben Skeggs
163d5446c3 drm/nouveau/disp/gm200-: fix regression from HDA SOR selection changes
Fixes: 9b5ca547bb ("drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-24 18:33:13 +10:00
Dave Airlie
41206a073c Merge v5.8-rc6 into drm-next
I've got a silent conflict + two trees based on fixes to merge.

Fixes a silent merge with amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-07-24 08:48:05 +10:00
Christian König
ed024ca62a drm/nouveau: stop using TTM_MEMTYPE_FLAG_MAPPABLE
The driver doesn't expose any not-mapable memory resources.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378244/
2020-07-21 16:33:21 +02:00
Christian König
f5a9a9383f drm/ttm: remove TTM_MEMTYPE_FLAG_CMA
The original intention was to avoid CPU page table unmaps
when BOs move between the GTT and SYSTEM domain.

The problem is that this never correctly handled changes
in the caching attributes or backing pages.

Just drop this for now and simply unmap the CPU page
tables in all cases.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378240/
2020-07-21 16:21:43 +02:00
Christian König
ce74773305 drm/ttm: remove io_reserve_fastpath flag
Just use the use_io_reserve_lru flag. It doesn't make much
sense to have two flags.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378238/
2020-07-21 16:19:50 +02:00
Christian König
4b8edc39a4 drm/ttm: cleanup io_mem interface with nouveau
Nouveau is the only user of this functionality and evicting io space
on -EAGAIN is really a misuse of the return code.

Instead switch to using -ENOSPC here which makes much more sense and
simplifies the code.

This could unbreak something as we now cleanly return EAGAIN, but the
chance for this are rather low.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378237/
2020-07-21 16:13:29 +02:00
Lyude Paul
2d7865082d drm/nouveau/kms/nvd9-: Fix disabling CRCs alongside OR reprogramming
While I had thought I'd tested this before, it looks like this one issue
slipped by my original CRC patches. Basically, there seem to be a few
rules we need to follow when sending CRC commands to the display
controller:

* CRCs cannot be both disabled and enabled for a single head in the same
  flush
* If a head with CRC reporting enabled switches from one OR to another,
  there must be a flush before the OR is re-enabled regardless of the
  final state of CRC reporting.

So, split nv50_crc_atomic_prepare_notifier_contexts() into two
functions:
* nv_crc_atomic_release_notifier_contexts() - checks whether the CRC
  notifier contexts were released successfully after the first flush
* nv_crc_atomic_init_notifier_contexts() - prepares any CRC notifier
  contexts for use before enabling reporting

Additionally, in order to force a flush when we re-assign ORs with heads
that have CRCs enabled we split our atomic check function into two:

* nv50_crc_atomic_check_head() - called from our heads' atomic checks,
  determines whether a state needs to set or clear CRC reporting
* nv50_crc_atomic_check_outp() - called at the end of the atomic check
  after all ORs have been added to the atomic state, and sets
  nv50_atom->flush_disable if needed

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <skeggsb@gmail.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200629223635.103804-1-lyude@redhat.com
2020-07-16 18:16:33 -04:00
Lyude Paul
12885ecbfe drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:

https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt

We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.

Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.

Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.

Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.

Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.

Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.

In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".

Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
  some corrections to the empty function declarations that we use if
  CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
  debugfs_create_file() - Greg K-H
Changes since v3:
  (no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
  Kbuild bot

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2020-07-16 18:16:33 -04:00
Lyude Paul
0bc8ffe097 drm/nouveau/kms/nv50-: Move hard-coded object handles into header
While most of the functionality on Nvidia GPUs doesn't require using an
explicit handle instead of the main VRAM handle + offset, there are a
couple of places that do require explicit handles, such as CRC
functionality. Since this means we're about to add another
nouveau-chosen handle, let's just go ahead and move any hard-coded
handles into a single header. This is just to keep things slightly
organized, and to make it a little bit easier if we need to add more
handles in the future.

This patch should contain no functional changes.

Changes since v3:
* Correct SPDX license identifier (checkpatch)

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-9-lyude@redhat.com
2020-07-16 18:16:32 -04:00
Lyude Paul
ebec884728 drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h
In order to make sure that we flush disable updates at the right time
when disabling CRCs, we'll need to be able to look at the outp state to
see if we're changing it at the same time that we're disabling CRCs.

So, expose the struct in disp.h.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-8-lyude@redhat.com
2020-07-16 18:16:32 -04:00
Lyude Paul
dbdaf719c6 drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom
While we're not quite ready yet to add support for flexible wndw
mappings, we are going to need to at least keep track of the static wndw
mappings we're currently using in each head's atomic state. We'll likely
use this in the future to implement real flexible window mapping, but
the primary reason we'll need this is for CRC support.

See: on nvidia hardware, each CRC entry in the CRC notifier dma context
has a "tag". This tag corresponds to the nth update on a specific
EVO/NvDisplay channel, which itself is referred to as the "controlling
channel". For gf119+ this can be the core channel, ovly channel, or base
channel. Since we don't expose CRC entry tags to userspace, we simply
ignore this feature and always use the core channel as the controlling
channel. Simple.

Things get a little bit more complicated on gv100+ though. GV100+ only
lets us set the controlling channel to a specific wndw channel, and that
wndw must be owned by the head that we're grabbing CRCs when we enable
CRC generation. Thus, we always need to make sure that each atomic head
state has at least one wndw that is mapped to the head, which will be
used as the controlling channel.

Note that since we don't have flexible wndw mappings yet, we don't
expect to run into any scenarios yet where we'd have a head with no
mapped wndws. When we do add support for flexible wndw mappings however,
we'll need to make sure that we handle reprogramming CRC capture if our
controlling wndw is moved to another head (and potentially reject the
new head state entirely if we can't find another available wndw to
replace it).

With that being said, nouveau currently tracks wndw visibility on heads.
It does not keep track of the actual ownership mappings, which are
(currently) statically programmed. To fix this, we introduce another
bitmask into nv50_head_atom.wndw to keep track of ownership separately
from visibility. We then introduce a nv50_head callback to handle
populating the wndw ownership map, and call it during the atomic check
phase when core->assign_windows is set to true.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-7-lyude@redhat.com
2020-07-16 18:16:32 -04:00
Lyude Paul
fb2420b701 drm/nouveau/kms/nv50-: Fix disabling dithering
While we expose the ability to turn off hardware dithering for nouveau,
we actually make the mistake of turning it on anyway, due to
dithering_depth containing a non-zero value if our dithering depth isn't
also set to 6 bpc.

So, fix it by never enabling dithering when it's disabled.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-6-lyude@redhat.com
2020-07-16 18:16:31 -04:00
Lyude Paul
9c8e9b790d drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit
Currently, we modify the depth value stored in the atomic state when
performing a commit in order to workaround the fact we haven't
implemented support for depths higher then 10 yet. This isn't idempotent
though, as it will happen every atomic commit where we modify the OR
state even if the head's depth in the atomic state hasn't been modified.

Normally this wouldn't matter, since we don't modify OR state outside of
modesets, but since the CRC capture region is implemented as part of the
OR state in hardware we'll want to make sure all commits modifying OR
state are idempotent so as to avoid changing the depth unexpectedly.

So, fix this by simply not writing the reduced depth value we come up
with to the atomic state.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-5-lyude@redhat.com
2020-07-16 18:16:31 -04:00
Ralph Campbell
b223555dc4 nouveau/hmm: support mapping large sysmem pages
Nouveau currently only supports mapping PAGE_SIZE sized pages of system
memory when shared virtual memory (SVM) is enabled. Use the new
hmm_pfn_to_map_order() function to support mapping system memory pages
that are PMD_SIZE.

Link: https://lore.kernel.org/r/20200701225352.9649-5-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10 16:24:28 -03:00
Ralph Campbell
4725c6b82a nouveau: fix mapping 2MB sysmem pages
The nvif_object_ioctl() method NVIF_VMM_V0_PFNMAP wasn't correctly setting
the hardware specific GPU page table entries for 2MB sized pages. Fix this
by adding functions to set and clear PD0 GPU page table entries.

Link: https://lore.kernel.org/r/20200701225352.9649-4-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10 16:24:28 -03:00
Ralph Campbell
0cafc62e4d nouveau/hmm: fault one page at a time
The SVM page fault handler groups faults into a range of contiguous
virtual addresses and requests hmm_range_fault() to populate and return
the page frame number of system memory mapped by the CPU.  In preparation
for supporting large pages to be mapped by the GPU, process faults one
page at a time. In addition, use the hmm_range default_flags to fix a
corner case where the input hmm_pfns array is not reinitialized after
hmm_range_fault() returns -EBUSY and must be called again.

Link: https://lore.kernel.org/r/20200701225352.9649-2-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10 16:24:28 -03:00
Ralph Campbell
ed710a6ed7 drm/nouveau/nouveau: fix page fault on device private memory
If system memory is migrated to device private memory and no GPU MMU
page table entry exists, the GPU will fault and call hmm_range_fault()
to get the PFN for the page. Since the .dev_private_owner pointer in
struct hmm_range is not set, hmm_range_fault returns an error which
results in the GPU program stopping with a fatal fault.
Fix this by setting .dev_private_owner appropriately.

Fixes: 08ddddda66 ("mm/hmm: check the device private page owner in hmm_range_fault()")
Cc: stable@vger.kernel.org
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-08 13:30:42 +10:00
Ralph Campbell
ad61f5f5e0 drm/nouveau/svm: fix migrate page regression
The patch to add zero page migration to GPU memory inadvertently included
part of a future change which broke normal page migration to GPU memory
by copying too much data and corrupting GPU memory.
Fix this by only copying one page instead of a byte count.

Fixes: 9d4296a7d4 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-08 13:30:42 +10:00