If system memory is migrated to device private memory and no GPU MMU
page table entry exists, the GPU will fault and call hmm_range_fault()
to get the PFN for the page. Since the .dev_private_owner pointer in
struct hmm_range is not set, hmm_range_fault returns an error which
results in the GPU program stopping with a fatal fault.
Fix this by setting .dev_private_owner appropriately.
Fixes: 08ddddda66 ("mm/hmm: check the device private page owner in hmm_range_fault()")
Cc: stable@vger.kernel.org
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The patch to add zero page migration to GPU memory inadvertently included
part of a future change which broke normal page migration to GPU memory
by copying too much data and corrupting GPU memory.
Fix this by only copying one page instead of a byte count.
Fixes: 9d4296a7d4 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tegra TRM says worst-case reply time is 1216us, and this should fix some
spurious timeouts that have been popping up.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Prevents "snd_hda_codec_hdmi hdaudioC1D0: HDMI: pin nid 5 not registered"
that occur on some configurations.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull drm fixes from Dave Airlie:
"These are the fixes from last week for the stuff merged in the merge
window. It got a bunch of nouveau fixes for HDA audio on some new
GPUs, some i915 and some amdpgu fixes.
i915:
- gvt: Fix one clang warning on debug only function
- Use ARRAY_SIZE for coccicheck warning
- Use after free fix for display global state.
- Whitelisting context-local timestamp on Gen9 and two scheduler
fixes with deps (Cc: stable)
- Removal of write flag from sysfs files where ineffective
nouveau:
- HDMI/DP audio HDA fixes
- display hang fix for Volta/Turing
- GK20A regression fix.
amdgpu:
- Prevent hwmon accesses while GPU is in reset
- CTF interrupt fix
- Backlight fix for renoir
- Fix for display sync groups
- Display bandwidth validation workaround"
* tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (28 commits)
drm/nouveau/kms/nv50-: clear SW state of disabled windows harder
drm/nouveau: gr/gk20a: Use firmware version 0
drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs
drm/nouveau/disp/gp100: split SOR implementation from gm200
drm/nouveau/disp: modify OR allocation policy to account for HDA requirements
drm/nouveau/disp: split part of OR allocation logic into a function
drm/nouveau/disp: provide hint to OR allocation about HDA requirements
drm/amd/display: Revalidate bandwidth before commiting DC updates
drm/amdgpu/display: use blanked rather than plane state for sync groups
drm/i915/params: fix i915.fake_lmem_start module param sysfs permissions
drm/i915/params: don't expose inject_probe_failure in debugfs
drm/i915: Whitelist context-local timestamp in the gen9 cmdparser
drm/i915: Fix global state use-after-frees with a refcount
drm/i915: Check for awaits on still currently executing requests
drm/i915/gt: Do not schedule normal requests immediately along virtual
drm/i915: Reorder await_execution before await_request
drm/nouveau/kms/gt215-: fix race with audio driver runpm
drm/nouveau/disp/gm200-: fix NV_PDISP_SOR_HDMI2_CTRL(n) selection
Revert "drm/amd/display: disable dcn20 abm feature for bring up"
drm/amd/powerplay: ack the SMUToHost interrupt on receive V2
...
The most innocuous result of not having done this is that we end up
sending unnecessary methods when we next enable the window.
However, interactions with the code handling skipping disables when
an update immediately follows, and window ownership assignment, can
lead to upsetting the display hardware on Volta and newer.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tegra firmware doesn't actually use any version numbers and passing -1
causes the existing firmware binaries not to be found. Use version 0 to
find the correct files.
Fixes: ef16dc278e ("drm/nouveau/gr/gf100-: select implementation based on available FW")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some HDA pin widgets may be disabled by BIOS, and unavailable from a
SOR. Our SOR allocation policy uses this information to allocate an
appropriate SOR when HDA is supported by a display.
Thank you to NVIDIA for providing the information to determine this.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Since GM200, SORs are no longer tied to a specific connector, and we
allocate them instead, with the assumption that all SORs are equally
capable.
However, there's a 1<->1 mapping between SOR and HDA pin widget, and
it turns out that it's possible for some widgets to be disabled...
In order to avoid picking a SOR without a valid pin widget, some new
rules need to be added.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No logical changes here, this is just moving the code to make the
changes in the next commit more obvious.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The audio driver can call into nouveau right while we're in the middle
of re-fetching the EDID, and decide it no longer needs to be awake.
Stop depending on EDID in the audio component get_eld() callback, and
instead cache whether audio support is present from the prior modeset.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This is a SOR register, and not indexed by the bound head.
Fixes display not coming up on high-bandwidth HDMI displays under a
number of configurations.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When calling OpenCL clEnqueueSVMMigrateMem() on a region of memory that
is backed by pte_none() or zero pages, migrate_vma_setup() will fill the
source PFN array with an entry indicating the source page is zero.
Use this to optimize migration to device private memory by allocating
GPU memory and zero filling it instead of failing to migrate the page.
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In nouveau_dmem_init(), a number of struct nouveau_dmem_chunk are allocated
and put on the dmem->chunk_empty list. Then in nouveau_dmem_pages_alloc(),
a nouveau_dmem_chunk is removed from the list and GPU memory is allocated.
However, the nouveau_dmem_chunk is never removed from the chunk_empty
list nor placed on the chunk_free or chunk_full lists. This results
in only one chunk ever being actually used (2MB) and quickly leads to
migration to device private memory failures.
Fix this by having just one list of free device private pages and if no
pages are free, allocate a chunk of device private pages and GPU memory.
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Currently, the nv50_mstc_mode_valid() function is happy to take any and
all modes, even the ones we can't actually support sometimes like
interlaced modes.
Luckily, the only difference between the mode validation that needs to
be performed for MST vs. SST is that eventually we'll need to check the
minimum PBN against the MSTB's full PBN capabilities (remember-we don't
care about the current bw state here). Otherwise, all of the other code
can be shared.
So, we move all of the common mode validation in
nouveau_connector_mode_valid() into a separate helper,
nv50_dp_mode_valid(), and use that from both nv50_mstc_mode_valid() and
nouveau_connector_mode_valid(). Note that we allow for returning the
calculated clock that nv50_dp_mode_valid() came up with, since we'll
eventually want to use that for PBN calculation in
nv50_mstc_mode_valid().
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This just limits the BPC for MST connectors to a maximum of 8 from
nv50_mstc_get_modes(), instead of doing so during
nv50_msto_atomic_check(). This doesn't introduce any functional changes
yet (other then userspace now lying about the max bpc, but we can't
support that yet anyway so meh). But, we'll need this in a moment so
that we can share mode validation between SST and MST which will fix
some real world issues.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We advertise being able to set interlaced modes, so let's actually make
sure to do that. Otherwise, we'll end up hanging the display engine due
to trying to set a mode with timings adjusted for interlacing without
telling the hardware it's actually an interlaced mode.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Right now, we make the mistake of allowing interlacing on all
connectors. Nvidia hardware does not always support interlacing with DP
though, so we need to make sure that we don't allow interlaced modes to
be set in such situations as otherwise we'll end up accidentally hanging
the display HW.
This fixes some hangs with Turing, which would be caused by attempting
to set an interlaced mode on hardware that doesn't support it. This
patch likely fixes other hardware hanging in the same way as well.
Note that we say we probe PIOR caps, but they don't actually have any
interlacing caps. So, the get_caps() function for PIORs just sets
interlacing support to true.
Changes since v1:
* Actually probe caps correctly this time, both on EVO and NVDisplay.
Changes since v2:
* Fix probing for < GF119
* Use vfunc table, in prep for adding more caps in the future.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We'll need the core channel initialized and ready by the time that we
start creating modesetting objects, so that we can call the
NV507D_GET_CAPABILITIES method to make the hardware expose it's
modesetting capabilities for later probing.
So, when loading the driver prepare the core channel from within
nouveau_display_create(). Everywhere else, we initialize the core
channel during resume.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Since the commit 742db30c4e ("drm/nouveau: Add HD-audio component
notifier support"), the nouveau driver notifies and pokes the HD-audio
HPD and ELD via audio component, but this seems broken. The culprit
is the naive assumption that crtc->index corresponds to the HDA pin.
Actually this rather corresponds to the MST dev_id (alias "pipe" in
the audio component framework) while the actual port number is given
from the output ior id number.
This patch corrects the assignment of port and dev_id arguments in the
audio component ops to recover from the HDMI/DP audio regression.
Fixes: 742db30c4e ("drm/nouveau: Add HD-audio component notifier support")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207223
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Using ENODEV as this prevents probe failed errors in dmesg.
v2: move check further down
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fixes warnings on GPUs with smaller a smaller mmio region like vGPUs.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Replace nouveau_pr3_present() in favor of a more generic one,
pci_pr3_present().
Also the presence of upstream bridge _PR3 doesn't need to go hand in
hand with device's _DSM, so check _PR3 before _DSM.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fixes coccicheck warning:
drivers/gpu/drm/nouveau/nvkm/subdev/acr/hsfw.c:103:23-30: WARNING opportunity for kmemdup
drivers/gpu/drm/nouveau/nvkm/subdev/acr/hsfw.c:113:22-29: WARNING opportunity for kmemdup
Fixes: 22dcda45a3 ("drm/nouveau/acr: implement new subdev to replace "secure boot"")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The variable ret is being initialized with a value that is never
read and it is being updated later with a new value. The initialization
is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When memory is migrated to the GPU, it is likely to be accessed by GPU
code soon afterwards. Instead of waiting for a GPU fault, map the
migrated memory into the GPU page tables with the same access permissions
as the source CPU page table entries. This preserves copy on write
semantics.
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>