Fixes the BUG_ON spuriously triggering under the following
circumstances:
* reservation_object_reserve_shared is called with shared_count ==
shared_max - 1, so obj->staged is freed in preparation of an in-place
update.
* reservation_object_add_shared_fence is called with the first fence,
after which shared_count == shared_max.
* reservation_object_add_shared_fence is called with a follow-up fence
from the same context.
In the second reservation_object_add_shared_fence call, the BUG_ON
triggers. However, nothing bad would happen in
reservation_object_add_shared_inplace, since both fences are from the
same context, so they only occupy a single slot.
Prevent this by moving the BUG_ON to where an overflow would actually
happen (e.g. if a buggy caller didn't call
reservation_object_reserve_shared before).
v2:
* Fix description of breaking scenario (Christian König)
* Add bugzilla reference
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/106418
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180704151405.10357-1-michel@daenzer.net
The current Wound-Wait mutex algorithm is actually not Wound-Wait but
Wait-Die. Implement also Wound-Wait as a per-ww-class choice. Wound-Wait
is, contrary to Wait-Die a preemptive algorithm and is known to generate
fewer backoffs. Testing reveals that this is true if the
number of simultaneous contending transactions is small.
As the number of simultaneous contending threads increases, Wait-Wound
becomes inferior to Wait-Die in terms of elapsed time.
Possibly due to the larger number of held locks of sleeping transactions.
Update documentation and callers.
Timings using git://people.freedesktop.org/~thomash/ww_mutex_test
tag patch-18-06-15
Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly
chosen out of 100000. Four core Intel x86_64:
Algorithm #threads Rollbacks time
Wound-Wait 4 ~100 ~17s.
Wait-Die 4 ~150000 ~19s.
Wound-Wait 16 ~360000 ~109s.
Wait-Die 16 ~450000 ~82s.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
drm-misc-next for 4.17:
UAPI Changes:
- drm/vc4: Expose performance counters to userspace (Boris)
Cross-subsystem Changes:
- MAINTAINERS: Linus to maintain panel-arm-versatile in -misc (Linus)
Core Changes:
- Only use swiotlb when necessary (Chunming)
Driver Changes:
- drm/panel: Add support for ARM Versatile panels (Linus)
- pl111: Improvements around versatile panel support (Linus)
----------------------------------------
Tagged on 2018-02-06:
drm-misc-next for 4.17:
UAPI Changes:
- Validate mode flags + type (Ville)
- Deprecate unused mode flags PIXMUX, BCAST (Ville)
- Deprecate unused mode types BUILTIN, CRTC_C, CLOCK_C, DEFAULT (Ville)
Cross-subsystem Changes:
- MAINTAINERS: s/Daniel/Maarten/ for drm-misc (Daniel)
Core Changes:
- gem: Export gem functions for drivers to use (Samuel)
- bridge: Introduce bridge timings in drm_bridge (Linus)
- dma-buf: Allow exclusive fence to be bundled in fence array when
calling reservation_object_get_fences_rcu (Christian)
- dp: Add training pattern 4 and HBR3 support to dp helpers (Manasi)
- fourcc: Add alpha bit to formats to avoid driver format LUTs (Maxime)
- mode: Various cleanups + add new device-wide .mode_valid hook (Ville)
- atomic: Fix state leak when non-blocking commits fail (Leo)
NOTE: IIRC, this was cross-picked to -fixes so it might fall out
- crc: Allow polling on the data fd (Maarten)
Driver Changes:
- bridge/vga-dac: Add THS8134* support (Linus)
- tinydrm: Various MIPI DBI improvements/cleanups (Noralf)
- bridge/dw-mipi-dsi: Cleanups + use create_packet helper (Brian)
- drm/sun4i: Add Display Engine frontend support (Maxime)
- drm/sun4i: Add zpos support + increase num planes from 2 to 4 (Maxime)
- various: Use drm_mode_get_hv_timing() to fill plane clip rectangle (Ville)
- stm: Add 8-bit clut support, add dsi phy v1.31 support, +fixes (Phillipe)
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Samuel Li <Samuel.Li@amd.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Philippe Cornu <philippe.cornu@st.com>
Cc: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
* tag 'drm-misc-next-2018-02-13' of git://anongit.freedesktop.org/drm/drm-misc: (115 commits)
drm/radeon: only enable swiotlb path when need v2
drm/amdgpu: only enable swiotlb alloc when need v2
drm: add func to get max iomem address v2
drm/vc4: Expose performance counters to userspace
drm: Print the pid when debug logging an ioctl error.
drm/stm: ltdc: remove non-alpha color formats on layer 2 for older hw
drm/stm: ltdc: add non-alpha color formats
drm/bridge/synopsys: dsi: Add 1.31 version support
drm/bridge/synopsys: dsi: Add read feature
drm/pl111: Support multiple endpoints on the CLCD
drm/pl111: Support variants with broken VBLANK
drm/pl111: Support variants with broken clock divider
drm/pl111: Handle the Versatile RGB/BGR565 mode
drm/pl111: Properly detect the ARM PL110 variants
drm/panel: Add support for ARM Versatile panels
drm/panel: Device tree bindings for ARM Versatile panels
drm/bridge: Rename argument from crtc to bridge
drm/crc: Add support for polling on the data fd.
drm/sun4i: Use drm_mode_get_hv_timing() to populate plane clip rectangle
drm/rcar-du: Use drm_mode_get_hv_timing() to populate plane clip rectangle
...
- WARNING: Missing a blank line after declarations
- WARNING: line over 80 characters
- WARNING: please, no space before tabs
Signed-off-by: Jagan Teki <jteki@openedev.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
When the timeout value passed to reservation_object_wait_timeout_rcu
is zero, no wait should be done if the fences are not signaled.
Return '1' for idle and '0' for busy if the specified timeout is '0'
to keep consistent with the case of non-zero timeout.
v2: call fence_put if not signaled in the case of timeout==0
v3: switch to reservation_object_test_signaled_rcu
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-By: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
This adds some extra functions to deal with rcu.
reservation_object_get_fences_rcu() will obtain the list of shared
and exclusive fences without obtaining the ww_mutex.
reservation_object_wait_timeout_rcu() will wait on all fences of the
reservation_object, without obtaining the ww_mutex.
reservation_object_test_signaled_rcu() will test if all fences of the
reservation_object are signaled without using the ww_mutex.
reservation_object_get_excl and reservation_object_get_list require
the reservation object to be held, updating requires
write_seqcount_begin/end. If only the exclusive fence is needed,
rcu_dereference followed by fence_get_rcu can be used, if the shared
fences are needed it's recommended to use the supplied functions.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-By: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the list of shared fences to a struct, and return it in
reservation_object_get_list().
Add reservation_object_get_excl to get the exclusive fence.
Add reservation_object_reserve_shared(), which reserves space
in the reservation_object for 1 more shared fence.
reservation_object_add_shared_fence() and
reservation_object_add_excl_fence() are used to assign a new
fence to a reservation_object pointer, to complete a reservation.
Changes since v1:
- Add reservation_object_get_excl, reorder code a bit.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>