The original intention was to avoid CPU page table unmaps
when BOs move between the GTT and SYSTEM domain.
The problem is that this never correctly handled changes
in the caching attributes or backing pages.
Just drop this for now and simply unmap the CPU page
tables in all cases.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378240/
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().
struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).
It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.
To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/371142/
Signed-off-by: Christian König <christian.koenig@amd.com>
Instead of signaling failure by setting the node pointer to
NULL do so by returning -ENOSPC.
v2: add memset() to make sure that mem is always initialized.
v3: drop memset() only set mm_node = NULL, move mm_node init in amdgpu
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/373181/
Some conflicts with ttm_bo->offset removal, but drm-misc-next needs updating to v5.8.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
drm-misc-next for v5.9:
UAPI Changes:
- Add DRM_MODE_TYPE_USERDEF for video modes specified in cmdline.
Cross-subsystem Changes:
- Assorted devicetree binding updates.
- Add might_sleep() to dma_fence_wait().
- Fix fbdev's get_user_pages_fast() handling, and use pin_user_pages.
- Small cleanup with IS_BUILTIN in video/fbdev drivers.
- Fix video/hdmi coding style for infoframe size.
Core Changes:
- Silence vblank output during init.
- Fix DP-MST corruption during send msg timeout.
- Clear leak in drm_gem_objecs_lookup().
- Make newlines work with force connector attribute.
- Fix module refcounting error in drm_encoder_slave, and use new i2c api.
- Header fix for drm_managed.c
- More struct_mutex removal for !legacy drivers:
- Remove gem_free_object()
- Removal of drm_gem_object_put_unlocked().
- Show current->comm alongside pid in debug printfs.
- Add drm_client_modeset_check() + drm_client_framebuffer_flush().
- Replace drm_fb_swab16 with drm_fb_swap that also supports 32-bits.
- Remove mode->vrefresh, and compactify drm_display_mode.
- Use drm_* macros for logging and warnings.
- Add WARN when drm_gem_get_pages is used on a private obj.
- Handle importing and imported dmabuf better in shmem helpers.
- Small fix for drm/mm hole size comparison, and remove invalid entry optimization.
- Add a drm/mm selftest.
- Set DSI connector type for DSI panels.
- Assorted small fixes and documentation updates.
- Fix DDI I2C device registration for MST ports, and flushing on destroy.
- Fix master_set return type, used by vmwgfx.
- Make the drm_set/drop_master ioctl symmetrical.
Driver Changes:
Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4, i915, omap, fbdev/sm712fb, fbdev/pxafb, console/newport_con, msm, virtio, udl, malidp, hdlcd, bridge/ti-sn65dsi86, panfrost.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4 (multiple), i915.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE TX26D202VM0BWA panel.
- Use GEM CMA functions in arc, arm, atmel-hlcdc, fsi-dcu, hisilicon, imx, ingenic, komeda, malidp, mcde, meson, msxfb, rcar-du, shmobile, stm, sti, tilcdc, tve200, zte.
- Remove gem_print_info.
- Improve gem_create_object_helper so udl can use shmem helpers.
- Convert vc4 dt bindings to schemas, and add clock properties.
- Device initialization cleanups for mgag200.
- Add a workaround to fix DP-MST short pulses handling on broken hardware in i915.
- Allow build test compiling arm drivers.
- Use managed pci functions in mgag200 and ast.
- Use dev_groups in malidp.
- Add per pixel alpha support for PX30 VOP in rockchip.
- Silence deferred probe logs in panfrost.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/001cd9a6-405d-4e29-43d8-354f53ae4e8b@linux.intel.com
Add rename the gpu busy percentage for consistency and
add the mem busy percentage documentation.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The existing code used the major version number of the DRM driver
instead of the device major number of the DRM subsystem for
validating access for a devices cgroup.
This meant that accesses allowed by the devices cgroup weren't
permitted and certain accesses denied by the devices cgroup were
permitted (if they matched the wrong major device number).
Signed-off-by: Lorenz Brun <lorenz@brun.one>
Fixes: 6b855f7b83 ("drm/amdkfd: Check against device cgroup")
Reviewed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
When we want to use float point operation on Linux
we need to use within special kernel protection
(`kernel_fpu_{begin,end}()`.), otherwise the kernel
can clobber userspace FPU register state. For detecting
these issues we use a tool named objtool (with -Ffa
flags) to highlight the FPU problems, all warnings can
be summed up as follows:
./tools/objtool/objtool check -Ffa
drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.o
[..] dc/dsc/rc_calc.o: warning: objtool: get_qp_set()+0x2f8:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: dsc_roundf()+0x5:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: dsc_ceil()+0x5:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: get_ofs_set()+0x3eb:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: calc_rc_params()+0x3c:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/dc_dsc.o: warning: objtool:
get_dsc_bandwidth_range.isra.0()+0x8d:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/dc_dsc.o: warning: objtool: setup_dsc_config()+0x2ef:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc_dpi.o: warning: objtool:copy_pps_fields()+0xbb:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc_dpi.o: warning: objtool:
dscc_compute_dsc_parameters()+0x7b:
FPU instruction outside of kernel_fpu_{begin,end}()
This commit fixes the above issues by rework DSC as described:
1. Isolate all FPU operations in a single file;
2. Use FPU flags only in the file that handles FPU operations;
3. Isolate all functions that require float point operation in static
functions;
4. Add a mid-layer function that does not use any float point operation,
and that could be safely invoked in other parts of the code.
5. Keep float point operation under DC_FP_{START/END} macro.
CC: Christian König <christian.koenig@amd.com>
CC: Alexander Deucher <Alexander.Deucher@amd.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Tony Cheng <tony.cheng@amd.com>
CC: Harry Wentland <hwentlan@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Initializes Powertune data for a specific Hawaii card by fixing what
looks like a typo in the code. The device ID 66B1 is not a supported
device ID for this driver, and is not mentioned elsewhere. 67B1 is a
valid device ID, and is a Hawaii Pro GPU.
I have tested on my R9 390 which has device ID 67B1, and it works
fine without problems.
Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Use kfree() instead of kvfree() to free rgb_user in
calculate_user_regamma_ramp() because the memory is allocated with
kcalloc().
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use kvfree() instead of kfree() to free coeff in build_regamma()
because the memory is allocated with kvzalloc().
Fixes: e752058b86 ("drm/amd/display: Optimize gamma calculations")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull drm fixes from Dave Airlie:
"These are the fixes from last week for the stuff merged in the merge
window. It got a bunch of nouveau fixes for HDA audio on some new
GPUs, some i915 and some amdpgu fixes.
i915:
- gvt: Fix one clang warning on debug only function
- Use ARRAY_SIZE for coccicheck warning
- Use after free fix for display global state.
- Whitelisting context-local timestamp on Gen9 and two scheduler
fixes with deps (Cc: stable)
- Removal of write flag from sysfs files where ineffective
nouveau:
- HDMI/DP audio HDA fixes
- display hang fix for Volta/Turing
- GK20A regression fix.
amdgpu:
- Prevent hwmon accesses while GPU is in reset
- CTF interrupt fix
- Backlight fix for renoir
- Fix for display sync groups
- Display bandwidth validation workaround"
* tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (28 commits)
drm/nouveau/kms/nv50-: clear SW state of disabled windows harder
drm/nouveau: gr/gk20a: Use firmware version 0
drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs
drm/nouveau/disp/gp100: split SOR implementation from gm200
drm/nouveau/disp: modify OR allocation policy to account for HDA requirements
drm/nouveau/disp: split part of OR allocation logic into a function
drm/nouveau/disp: provide hint to OR allocation about HDA requirements
drm/amd/display: Revalidate bandwidth before commiting DC updates
drm/amdgpu/display: use blanked rather than plane state for sync groups
drm/i915/params: fix i915.fake_lmem_start module param sysfs permissions
drm/i915/params: don't expose inject_probe_failure in debugfs
drm/i915: Whitelist context-local timestamp in the gen9 cmdparser
drm/i915: Fix global state use-after-frees with a refcount
drm/i915: Check for awaits on still currently executing requests
drm/i915/gt: Do not schedule normal requests immediately along virtual
drm/i915: Reorder await_execution before await_request
drm/nouveau/kms/gt215-: fix race with audio driver runpm
drm/nouveau/disp/gm200-: fix NV_PDISP_SOR_HDMI2_CTRL(n) selection
Revert "drm/amd/display: disable dcn20 abm feature for bring up"
drm/amd/powerplay: ack the SMUToHost interrupt on receive V2
...
[Why]
Whenever we switch between tiled formats without also switching pixel
formats or doing anything else that recreates the DC plane state we
can run into underflow or hangs since we're not updating the
DML parameters before committing to the hardware.
[How]
If the update type is FULL then call validate_bandwidth again to update
the DML parmeters before committing the state.
This is basically just a workaround and protective measure against
update types being added DC where we could run into this issue in
the future.
We can only fully validate the state in advance before applying it to
the hardware if we recreate all the plane and stream states since
we can't modify what's currently in use.
The next step is to update DM to ensure that we're creating the plane
and stream states for whatever could potentially be a full update in
DC to pre-emptively recreate the state for DC global validation.
The workaround can stay until this has been fixed in DM.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull hmm updates from Jason Gunthorpe:
"This series adds a selftest for hmm_range_fault() and several of the
DEVICE_PRIVATE migration related actions, and another simplification
for hmm_range_fault()'s API.
- Simplify hmm_range_fault() with a simpler return code, no
HMM_PFN_SPECIAL, and no customizable output PFN format
- Add a selftest for hmm_range_fault() and DEVICE_PRIVATE related
functionality"
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
MAINTAINERS: add HMM selftests
mm/hmm/test: add selftests for HMM
mm/hmm/test: add selftest driver for HMM
mm/hmm: remove the customizable pfn format from hmm_range_fault
mm/hmm: remove HMM_PFN_SPECIAL
drm/amdgpu: remove dead code after hmm_range_fault()
mm/hmm: make hmm_range_fault return 0 or -1
Pull power management updates from Rafael Wysocki:
"These rework the system-wide PM driver flags, make runtime switching
of cpuidle governors easier, improve the user space hibernation
interface code, add intel-speed-select interface documentation, add
more debug messages to the ACPI code handling suspend to idle, update
the cpufreq core and drivers, fix a minor issue in the cpuidle core
and update two cpuidle drivers, improve the PM-runtime framework,
update the Intel RAPL power capping driver, update devfreq core and
drivers, and clean up the cpupower utility.
Specifics:
- Rework the system-wide PM driver flags to make them easier to
understand and use and update their documentation (Rafael Wysocki,
Alan Stern).
- Allow cpuidle governors to be switched at run time regardless of
the kernel configuration and update the related documentation
accordingly (Hanjun Guo).
- Improve the resume device handling in the user space hibernarion
interface code (Domenico Andreoli).
- Document the intel-speed-select sysfs interface (Srinivas
Pandruvada).
- Make the ACPI code handing suspend to idle print more debug
messages to help diagnose issues with it (Rafael Wysocki).
- Fix a helper routine in the cpufreq core and correct a typo in the
struct cpufreq_driver kerneldoc comment (Rafael Wysocki, Wang
Wenhu).
- Update cpufreq drivers:
- Make the intel_pstate driver start in the passive mode by
default on systems without HWP (Rafael Wysocki).
- Add i.MX7ULP support to the imx-cpufreq-dt driver and add
i.MX7ULP to the cpufreq-dt-platdev blacklist (Peng Fan).
- Convert the qoriq cpufreq driver to a platform one, make the
platform code create a suitable device object for it and add
platform dependencies to it (Mian Yousaf Kaukab, Geert
Uytterhoeven).
- Fix wrong compatible binding in the qcom driver (Ansuel Smith).
- Build the omap driver by default for ARCH_OMAP2PLUS (Anders
Roxell).
- Add r8a7742 SoC support to the dt cpufreq driver (Lad
Prabhakar).
- Update cpuidle core and drivers:
- Fix three reference count leaks in error code paths in the
cpuidle core (Qiushi Wu).
- Convert Qualcomm SPM to a generic cpuidle driver (Stephan
Gerhold).
- Fix up the execution order when entering a domain idle state in
the PSCI driver (Ulf Hansson).
- Fix a reference counting issue related to clock management and
clean up two oddities in the PM-runtime framework (Rafael Wysocki,
Andy Shevchenko).
- Add ElkhartLake support to the Intel RAPL power capping driver and
remove an unused local MSR definition from it (Jacob Pan, Sumeet
Pawnikar).
- Update devfreq core and drivers:
- Replace strncpy() with strscpy() in the devfreq core and use
lockdep asserts instead of manual checks for a locked mutex in
it (Dmitry Osipenko, Krzysztof Kozlowski).
- Add a generic imx bus scaling driver and make it register an
interconnect device (Leonard Crestez, Gustavo A. R. Silva).
- Make the cpufreq notifier in the tegra30 driver take boosting
into account and delete an unuseful error message from that
driver (Dmitry Osipenko, Markus Elfring).
- Remove unneeded semicolon from the cpupower code (Zou Wei)"
* tag 'pm-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (51 commits)
cpuidle: Fix three reference count leaks
PM: runtime: Replace pm_runtime_callbacks_present()
PM / devfreq: Use lockdep asserts instead of manual checks for locked mutex
PM / devfreq: imx-bus: Fix inconsistent IS_ERR and PTR_ERR
PM / devfreq: Replace strncpy with strscpy
PM / devfreq: imx: Register interconnect device
PM / devfreq: Add generic imx bus scaling driver
PM / devfreq: tegra30: Delete an error message in tegra_devfreq_probe()
PM / devfreq: tegra30: Make CPUFreq notifier to take into account boosting
PM: hibernate: Restrict writes to the resume device
PM: runtime: clk: Fix clk_pm_runtime_get() error path
cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver
ACPI: EC: PM: s2idle: Extend GPE dispatching debug message
ACPI: PM: s2idle: Print type of wakeup debug messages
powercap: RAPL: remove unused local MSR define
PM: runtime: Make clear what we do when conditions are wrong in rpm_suspend()
Documentation: admin-guide: pm: Document intel-speed-select
PM: hibernate: Split off snapshot dev option
PM: hibernate: Incorporate concurrency handling
Documentation: ABI: make current_governer_ro as a candidate for removal
...
Pull drm fixes from Dave Airlie:
"A couple of amdgpu fixes and minor ingenic fixes:
amdgpu:
- display atomic test fix
- Fix soft hang in display vupdate code
ingenic:
- fix pointer cast
- fix crtc atomic check callback"
* tag 'drm-fixes-2020-05-29-1' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fix potential integer wraparound resulting in a hang
drm/amd/display: drop cursor position check in atomic test
gpu/drm: Ingenic: Fix opaque pointer casted to wrong type
gpu/drm: ingenic: Fix bogus crtc_atomic_check callback
Return an error for sysfs and debugfs power interfaces during
gpu reset and suspend. Prevents access to the hw while it may
be in an unusable state.
v2: squash in fix to drop suspend check
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
If VUPDATE_END is before VUPDATE_START the delay calculated can become
very large, causing a soft hang.
[How]
Take the absolute value of the difference between START and END.
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>