The rps disabling code wasn't properly cancelling outstanding work
items. Also add a comment that explains why we're not racing with
the work item that could unmask interrupts - that piece of code
confused me quite a bit.
v2: Ben Widawsky pointed out that the first patch would deadlock
(and a few lesser problems). All corrected.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
This patch closes the following race:
We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick of the
work item. Scheduler isn't grumpy, so the work queue takes rps_lock,
grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR). Note that
pm_imr == pm_iir because we've just masked the interrupt we've got.
Now hw sends out PM interrupt B (not masked), we process it and mask
it. Later on the irq handler also clears PMIIR.
Then the work item proceeds and at the end clears PMIMR. Because
(local) pm_imr == pm_iir we have
pm_imr & ~pm_iir == 0
so all interrupts are enabled.
Hardware is still interrupt-happy, and sends out a new PM interrupt B.
PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
we get it and hit the WARN in the interrupt handler (because
dev_priv->pm_iir == PM_B).
That's why I've moved the
WRITE(PMIMR, 0)
up under the protection of the rps_lock. And write an uncoditional 0
to PMIMR, because that's what we'll do anyway.
This races looks much more likely because we can arbitrarily extend
the window by grabing dev->struct mutex right after the irq handler
has processed the first PM_B interrupt.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
Quoting Chris Wilson's more concise description:
"Ah I think I see the problem. As you point out we only mask the current
interrupt received, so that if we have a task pending (and so IMR != 0) we
actually unmask the pending interrupt and so could receive it again before the
tasklet is finally kicked off by the grumpy scheduler."
We need the hw to issue PM interrupts A, B, A while the scheduler is hating us
and refuses to run the rps work item. On receiving PM interrupt A we hit the
WARN because
dev_priv->pm_iir == PM_A | PM_B
Also add a posting read as suggested by Chris to ensure proper ordering of the
writes to PMIMR and PMIIR. Just in case somebody weakens write ordering.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
I can't think of any sensible reason to limit this to a mask of 0x0f,
ie, SDVO_OUTPUT_{TMDS,RGB,CVBS,SVID}0.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
I have no evidence for this byte being used this way, and lots of
counterexamples. Restore the struct to its empirical definition and
patch up gmbus setup to match.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
If an older userspace passes in a smaller arg than the current kernel
ioctl arg struct, then extra fields should be initialized to zero
rather than passing random data to the DRM driver.
Signed-off-by: Rob Clark <rob@ti.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
FB scratch indices are dword indices, but we were treating
them as byte indices. As such, we were getting the wrong
FB scratch data for non-0 indices. Fix the indices and
guard the indexing against indices larger than the scratch
allocation.
Fixes memory corruption on some boards if data was written
past the end of the FB scratch array.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
There are a number of fixes in mainline required for code in -next,
also there was a few conflicts I'd rather resolve myself.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Conflicts:
drivers/gpu/drm/radeon/evergreen.c
drivers/gpu/drm/radeon/r600.c
drivers/gpu/drm/radeon/radeon_asic.h
The intent here was to return an error code, but instead the code
returns the number of bytes remaining (that weren't copied).
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.
v2: fix typo for DRM_MODE_CONNECTOR_VGA case.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It's handled via external clock. It should already be protected
by the external ss flag, but add an explicit check just in case.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
factor out most of evergreen blit code and use the refactored code
from r600 that is now common for both r600 and evergreen
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
blit copy functions deal with GPU pages, not CPU pages,
so rename the variables and parameters accordingly
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
reorganize the code such that only the primitives (i.e., the functions
that load the CP ring) are hardware specific; dynamically link the
primitives in a (new) pointer structure inside r600_blit at
blit initialization time so that the functions that control the blit
operations can be made common for r600 and evergreen parts
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lots of new (and hopefully useful) benchmark. Load the driver
with radeon_benchmark=<test_number> and enjoy. Among tests
added are VRAM to VRAM blits and blits with buffer size sweeps.
The latter can be from GTT to VRAM, VRAM to GTT, and VRAM to VRAM
and there are two types of sweeps: powers of two and (probably
more interesting) buffers sizes that correspond to common modes.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
factor out repeated code into functions
fix units in which the throughput is reported (megabytes per second
and megabits per second make sense, others are kind of confusing)
make report more amenable to awk and friends (e.g. whitespace is
always the separator, unit is separated from the number, etc)
add #defines for some hard coded constants
besides "beautification" this reorg is done in preparation
for writing more elaborate benchmarks
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
some 3d register bits look like magic in r600 blit functions
use predefined constants to make it more intuitive what they are
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
some bits in 3D registers used by blit functions look like
magic and this is hard to follow; change them to a little bit
more meaningful pre-defined constants
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Covert 4k pages to multiples of 64x64x4 tiles.
This is also more efficient than a scanline based
approach from the MC's perspective.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Covert 4k pages to multiples of 64x64x4 tiles.
This is also more efficient than a scanline based
approach from the MC's perspective.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There is no point in re-doing in post_xfer all the initialization
that was already done by pre_xfer. Instead, only do the work which
differs from pre_xfer.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
in case of using two drivers such as fimd and hdmi controller that
they have their own hardware interrupt, drm framework doesn't provide
pipe number corresponding to it. so the pipe should be set to event's
from specific crtc.
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
this patch adds the following comments and code clean.
- add comment of exynos_drm_crtc_apply() call at page flip time.
- add comment that when exynos_drm_fbdev_reinit() is called,
why num_connector is 0 and also the framebuffers should be destroyed.
- remove buf_off member from struct exynos_drm_overlay because this member
isn't used anymore.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
this patch solves the problem that fb_helper is released
when exynos_drm_fbdev_reinit() was called. if this function call
is ok then just return.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
this patch adds common members to overlay structure and
makes each driver such as fimd or hdmi driver set them to
its own structure.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This shrinks the sizes of a lot of functions in the radeon driver
dramatically.
With a non force inline + -Os kernel this is default anyways.
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With this patch I'm only about 50k larger with DRM debugging
enables (why is that enabled by default?!?), and slightly
smaller without.
[airlied: moved r100.c additions to radeon_ring.c]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With the dropped inlines gccs starts warning about genuinely unused
functions. Remove r600_bpe_from_format, evergreen_cs_track_validate_cb,
evergreen-cs_packet_next_is_pkt3_nop which are all unused.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fixes
evergreen_cs_parse 4080 23124 +19044
and others compared to a non force inline kernel.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes kernel panics when running the vbltest from the drm repo. We
can't just skip initializing the vblank system since it sets up certain
state for us, see: "vmwgfx: Enable use of the vblank system."
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Make sure we null the display private, make sure we catch and
handle vblank failing to init and don't call vblank_cleanup if
we haven't initialized the display system.
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If the panel is powered up, there's no need to delay for the 'off'
interval when turning the panel on.
Signed-off-by: Keith Packard <keithp@keithp.com>
This eliminates a fairly long delay when power sequencing newer
hardware
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>