Adding a reference count helps drivers properly implement the
unbind-while-in-use case.
References are taken and put every time a channel is allocated or freed.
Once the final reference is put, the device is removed from the
dma_device_list and a release callback function is called to signal
the driver to free the memory.
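As a rough sketch of the scheme (using the kernel's kref primitive;
the struct layout and helper names here are illustrative, not
necessarily the literal ones from the patch):

#include <linux/kref.h>
#include <linux/list.h>

struct dma_device_sketch {
        struct kref ref;                /* one reference per channel */
        struct list_head global_node;   /* entry on dma_device_list */
        void (*device_release)(struct dma_device_sketch *dev);
};

static void dma_device_release_sketch(struct kref *ref)
{
        struct dma_device_sketch *dev =
                container_of(ref, struct dma_device_sketch, ref);

        /* Final reference dropped: leave the global list and tell
         * the driver it may now free its memory.
         */
        list_del(&dev->global_node);
        dev->device_release(dev);
}

/* kref_init(&dev->ref) is assumed to happen at registration time.
 * These are then called on every channel allocation / free:
 */
static void dma_device_get_sketch(struct dma_device_sketch *dev)
{
        kref_get(&dev->ref);
}

static void dma_device_put_sketch(struct dma_device_sketch *dev)
{
        kref_put(&dev->ref, dma_device_release_sketch);
}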
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-5-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The module reference is taken to ensure the callbacks still exist
when they are called. If the channel holds the last reference to the
module, the module can disappear before device_free_chan_resources() is
called, which would result in a call into freed memory.
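A sketch of the idea on the channel get/put paths (simplified; the
resource handling itself is elided):

#include <linux/module.h>

static int dma_chan_get_sketch(struct dma_chan *chan)
{
        struct module *owner = chan->device->dev->driver->owner;

        /* Pin the module so device_free_chan_resources() still
         * exists by the time we call it.
         */
        if (!try_module_get(owner))
                return -ENODEV;
        /* ... allocate channel resources ... */
        return 0;
}

static void dma_chan_put_sketch(struct dma_chan *chan)
{
        chan->device->device_free_chan_resources(chan);
        /* Only after the callback has returned is it safe to let
         * the module go away.
         */
        module_put(chan->device->dev->driver->owner);
}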
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-3-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
dma_chan_to_owner() dereferences the driver from the struct device to
obtain the owner and calls module_[get|put](). However, if the backing
device is unbound before the dma_device is unregistered, the driver
will be cleared and this will cause a NULL pointer dereference.
Instead, store a pointer to the owner module in the dma_device struct
so the module reference can be properly put when the channel is put, even
if the backing device was destroyed first.
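Sketched, the change looks roughly like this (the exact spot where the
owner is captured is an assumption based on the description above):

/* Before: NULL pointer dereference if the backing device was unbound
 * first, because dev->driver has already been cleared by then.
 */
static struct module *dma_chan_to_owner_old(struct dma_chan *chan)
{
        return chan->device->dev->driver->owner;
}

/* After: cache the owner in struct dma_device while the driver is
 * still bound, e.g. from dma_async_device_register() ...
 */
static void dma_device_cache_owner_sketch(struct dma_device *device)
{
        device->owner = device->dev->driver->owner;
}

/* ... so putting the channel no longer needs the backing device: */
static struct module *dma_chan_to_owner_new(struct dma_chan *chan)
{
        return chan->device->owner;
}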
This change helps to support a safer unbind of DMA engines.
If the dma_device is unregistered in the driver's remove function,
there is no guarantee that there are no existing clients, and a user's
action may trigger the WARN_ONCE in dma_async_device_unregister(),
which is unlikely to leave the system in a consistent state.
Instead, a better approach is to allow the backing driver to go away
and fail any subsequent requests to it.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-2-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In some cases we seem to submit two transactions in a row, which
causes us to lose track of the first. If we then cancel the
request, we may still get an interrupt, which then dereferences a NULL
ds_run value.
So try to avoid starting a new transaction if the ds_run value
is set.
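In the transfer-start path the guard is conceptually a one-liner
(sketch against simplified k3dma channel state):

struct k3_dma_desc;                     /* hardware descriptor, simplified */

struct k3_dma_chan_sketch {
        struct k3_dma_desc *ds_run;     /* descriptor currently in flight */
        struct k3_dma_desc *ds_done;    /* last completed descriptor */
};

static int k3_dma_start_txd_sketch(struct k3_dma_chan_sketch *c)
{
        /* Don't start a new transaction while one is still in flight:
         * we would lose track of the first one, and after a cancel the
         * completion interrupt could then trip over a NULL ds_run.
         */
        if (c->ds_run)
                return -EAGAIN;

        /* ... otherwise fetch the next queued descriptor, set
         * c->ds_run and program the hardware ...
         */
        return 0;
}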
While this patch avoids the null pointer crash, I've had some
reports of the k3dma driver still getting confused, which
suggests the ds_run/ds_done value handling still isn't quite
right. However, I've not run into an issue recently with it
so I think this patch is worth pushing upstream to avoid the
crash.
Signed-off-by: John Stultz <john.stultz@linaro.org>
[add ss tag]
Link: https://lore.kernel.org/r/20191218190906.6641-1-john.stultz@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
With the old DMA code no longer handling DMA requests for device tree
based SoCs, we can move the omap3-specific context save and restore into
the dmaengine driver.
Let's do this by adding cpu_pm notifier handling to save and restore the
context, and enable it based on device tree match data. This way we can
also use the match data to configure more SoC-specific features later on.
Note that we only clear the channels in use, while the platform code also
clears reserved channels 0 and 1 on high-security SoCs. Based on testing
on n900 this is not needed, though, and the system idles just fine.
With the dmaengine driver handling context save and restore, we must now
remove the old custom calls for context save and restore.
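A sketch of the notifier handling described above (the omap_dma_*
helpers are assumptions standing in for the driver's actual busy check
and context save/restore code):

#include <linux/cpu_pm.h>
#include <linux/notifier.h>

struct omap_dmadev_sketch {
        struct notifier_block nb;
        /* ... saved channel/controller registers ... */
};

static int omap_dma_busy(struct omap_dmadev_sketch *od);             /* assumed */
static void omap_dma_context_save(struct omap_dmadev_sketch *od);    /* assumed */
static void omap_dma_context_restore(struct omap_dmadev_sketch *od); /* assumed */

static int omap_dma_cpu_pm_notifier(struct notifier_block *nb,
                                    unsigned long cmd, void *v)
{
        struct omap_dmadev_sketch *od =
                container_of(nb, struct omap_dmadev_sketch, nb);

        switch (cmd) {
        case CPU_CLUSTER_PM_ENTER:
                if (omap_dma_busy(od))
                        return NOTIFY_BAD;      /* veto idle while in use */
                omap_dma_context_save(od);
                break;
        case CPU_CLUSTER_PM_EXIT:
                omap_dma_context_restore(od);
                break;
        }

        return NOTIFY_OK;
}

/* Registered from probe only when the DT match data asks for it:
 *
 *      od->nb.notifier_call = omap_dma_cpu_pm_notifier;
 *      cpu_pm_register_notifier(&od->nb);
 */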
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Vinod Koul <vkoul@kernel.org>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Let's drop the boilerplate code in the system suspend/resume callbacks and
convert to use pm_runtime_force_suspend|resume(). This change also has a
nice side effect, as pm_runtime_force_resume() may decide to leave the
device in a low power state when that is feasible, thus avoiding wasting
both time and energy during system resume.
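The conversion typically collapses the dev_pm_ops to something like
this (sketch; the runtime callbacks stand in for whatever the driver
already implements):

#include <linux/pm.h>
#include <linux/pm_runtime.h>

static int sketch_runtime_suspend(struct device *dev);  /* existing callback */
static int sketch_runtime_resume(struct device *dev);   /* existing callback */

static const struct dev_pm_ops sketch_pm_ops = {
        SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
                                pm_runtime_force_resume)
        SET_RUNTIME_PM_OPS(sketch_runtime_suspend,
                           sketch_runtime_resume, NULL)
};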
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20191205143746.24873-2-ulf.hansson@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The driver forgets to call pm_runtime_disable and pm_runtime_put_sync
on probe failure and in remove.
Add the calls and modify probe failure handling to fix it.
To simplify the fix, the patch adjusts the calling order and merges checks
for devm_kcalloc.
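The shape of the fix, sketched (labels and the exact unwind order are
illustrative):

#include <linux/platform_device.h>
#include <linux/pm_runtime.h>

static int edma_probe_sketch(struct platform_device *pdev)
{
        int ret;

        pm_runtime_enable(&pdev->dev);
        ret = pm_runtime_get_sync(&pdev->dev);
        if (ret < 0)
                goto err_disable;

        /* ... devm_kcalloc allocations, controller setup ... */

        return 0;

err_disable:
        /* get_sync bumps the usage count even on failure */
        pm_runtime_put_sync(&pdev->dev);
        pm_runtime_disable(&pdev->dev);
        return ret;
}

static int edma_remove_sketch(struct platform_device *pdev)
{
        /* ... unregister the dma device ... */
        pm_runtime_put_sync(&pdev->dev);
        pm_runtime_disable(&pdev->dev);
        return 0;
}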
Fixes: 2b6b3b7420 ("ARM/dmaengine: edma: Merge the two drivers under drivers/dma/")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191124052855.6472-1-hslester96@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The Spreadtrum audio compress offload mode will use a 2-stage DMA
transfer to save power. That means we can request 2 DMA channels, one as
the source channel and another as the destination channel. Once the
source channel's transaction is done, it will trigger the destination
channel's transaction automatically by a hardware signal.
In this case, the source channel will transfer data from the IRAM buffer
to the DSP FIFO for decoding/encoding; once the IRAM buffer has been
emptied by a completed transfer, the destination channel will start to
transfer data from the DDR buffer to the IRAM buffer. The destination
channel uses link-list mode to fill the IRAM, but since the IRAM buffer
is only 32K while the DDR buffer can be as large as 2M, we would need
lots of link-list nodes for a cyclic transfer. Instead of wasting lots
of link-list memory, we can use the wrap address support to reduce the
number of link-list nodes: when the transfer address reaches the wrap
address, it jumps to the wrap_to address specified by the wrap_to
register, so only 2 link-list nodes are needed for a cyclic transfer
from DDR to IRAM.
Thus this patch adds wrap address support for this case.
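Conceptually, the wrap window is just two addresses (a hypothetical
configuration struct mirroring the wrap registers described above,
purely for illustration):

#include <linux/types.h>

struct sprd_dma_wrap_cfg_sketch {
        phys_addr_t wrap_addr;  /* address that triggers the wrap */
        phys_addr_t wrap_to;    /* address the transfer jumps back to */
};

/* With the 2M DDR buffer wrapped back onto itself, two link-list
 * nodes are enough for an endless DDR -> IRAM cyclic refill.
 */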
[Baolin Wang changes the commit message]
Signed-off-by: Eric Long <eric.long@unisoc.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Link: https://lore.kernel.org/r/85a5484bc1f3dd53ce6f92700ad8b35f30a0b096.1571812029.git.baolin.wang@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
When devm_request_irq fails, currently only dma_async_device_unregister
gets called. This does not free the resources allocated by
of_dma_controller_register. Therefore, call of_dma_controller_free in
this error path as well.
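A generic sketch of the corrected error path (the surrounding driver
structure, registration order and irq handler name are assumptions):

        ret = of_dma_controller_register(pdev->dev.of_node,
                                         of_dma_xlate_by_chan_id, od);
        if (ret)
                goto err_unregister;

        ret = devm_request_irq(&pdev->dev, irq, sketch_irq_handler, 0,
                               dev_name(&pdev->dev), od);
        if (ret)
                goto err_of_free;       /* must undo the OF registration */

        return 0;

err_of_free:
        of_dma_controller_free(pdev->dev.of_node);
err_unregister:
        dma_async_device_unregister(&od->ddev);
        return ret;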
Signed-off-by: Satendra Singh Thakur <sst2005@gmail.com>
Link: https://lore.kernel.org/r/20191109113523.6067-1-sst2005@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add support for AXI Multichannel Direct Memory Access (AXI MCDMA)
core, which is a soft Xilinx IP core that provides high-bandwidth
direct memory access between memory and AXI4-Stream target peripherals.
The AXI MCDMA core provides a scatter-gather interface with multiple
independent transmit and receive channels. The driver supports the
device_prep_slave_sg slave transfer mode.
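From a client's point of view, this mode is driven through the standard
dmaengine slave-sg helpers; a minimal usage sketch (the completion
callback name is an assumption):

        struct dma_async_tx_descriptor *desc;

        /* sgl/sg_len describe the memory to stream to the peripheral */
        desc = dmaengine_prep_slave_sg(chan, sgl, sg_len,
                                       DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
        if (!desc)
                return -ENOMEM;

        desc->callback = sketch_xfer_done;      /* assumed callback */
        dmaengine_submit(desc);
        dma_async_issue_pending(chan);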
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Link: https://lore.kernel.org/r/1571763622-29281-7-git-send-email-radhey.shyam.pandey@xilinx.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Like paRAM slots, channels could be used by other cores, and in this
case we need to make sure that the driver does not alter these channels.
Handle the generic dma-channel-mask property to record in a bitmap the
channels that can not be used by Linux, and convert the legacy rsv_chans
information if it is provided via platform_data.
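A sketch of reading the property into such a bitmap (the 64-channel
sizing and the channels_mask field name are assumptions):

        u32 mask[2];    /* room for up to 64 channels (assumption) */

        if (!of_property_read_u32_array(node, "dma-channel-mask",
                                        mask, ARRAY_SIZE(mask))) {
                /* Per the generic binding, a cleared bit marks a channel
                 * reserved for another core: the driver must not touch it.
                 */
                bitmap_from_arr32(ecc->channels_mask, mask, 64);
        }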
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191025073056.25450-4-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Clang warns:
drivers/dma/fsl-dpaa2-qdma/dpdmai.c:148:25: warning: variable 'cfg' is
uninitialized when used within its own initialization [-Wuninitialized]
        DPDMAI_CMD_CREATE(cmd, cfg);
        ~~~~~~~~~~~~~~~~~~~~~~~^~~~
drivers/dma/fsl-dpaa2-qdma/dpdmai.c:42:24: note: expanded from macro
'DPDMAI_CMD_CREATE'
        typeof(_cfg) (cfg) = (_cfg); \
                      ~~~    ^~~~
1 warning generated.
Looking at the preprocessed source, we can see that this is true.
int dpdmai_create(struct fsl_mc_io *mc_io, u32 cmd_flags,
                  const struct dpdmai_cfg *cfg, u16 *token)
{
        struct fsl_mc_command cmd = { 0 };
        int err;

        cmd.header = mc_encode_cmd_header((((0x90E) << 4) | 0), cmd_flags, 0);

        do {
                typeof(cmd)(cmd) = (cmd);
                typeof(cfg)(cfg) = (cfg);
                ((cmd).params[0] |= mc_enc((8), (8), (cfg)->priorities[0]));
                ((cmd).params[0] |= mc_enc((16), (8), (cfg)->priorities[1]));
        } while (0);
I cannot see a good reason to create another version of cfg when the
parameter one will work perfectly fine and cmd can just be used as is.
Remove them to fix this warning.
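After the change the macro reduces to roughly the following, operating
directly on the caller's cmd and cfg (a sketch of the fix, not the
verbatim hunk):

#define DPDMAI_CMD_CREATE(cmd, cfg)                                    \
do {                                                                   \
        (cmd).params[0] |= mc_enc(8, 8, (cfg)->priorities[0]);         \
        (cmd).params[0] |= mc_enc(16, 8, (cfg)->priorities[1]);        \
} while (0)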
Fixes: f2835adf8a ("dmaengine: fsl-dpaa2-qdma: Add the DPDMAI(Data Path DMA Interface) support")
Link: https://github.com/ClangBuiltLinux/linux/issues/746
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20191022171648.37732-1-natechancellor@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Yegor Yefremov <yegorslists@googlemail.com> reported that musb and ftdi
uart can fail on the first open of the uart unless connected via a hub.
This is because the first dma call done by musb_ep_program() must wait
if cppi41 is PM runtime suspended. Otherwise musb_ep_program() continues
with other non-dma packets before the DMA transfer is started, causing
at least ftdi uarts to fail to receive data.
Let's fix the issue by waking up cppi41 with PM runtime calls added to
cppi41_dma_prep_slave_sg(), returning NULL if the device is still idled.
This way we have musb_ep_program() continue with PIO until cppi41 is
awake.
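Sketched, the top of cppi41_dma_prep_slave_sg() gains something like
this (the is_suspended flag is an assumption about the driver's
bookkeeping):

        int error;

        /* Kick off an async resume; -EINPROGRESS only means the
         * resume is already under way.
         */
        error = pm_runtime_get(cdd->ddev.dev);
        if (error < 0 && error != -EINPROGRESS) {
                pm_runtime_put_noidle(cdd->ddev.dev);
                return NULL;
        }

        if (cdd->is_suspended) {
                /* Not awake yet: refuse the descriptor so that musb
                 * falls back to PIO for this transfer.
                 */
                pm_runtime_mark_last_busy(cdd->ddev.dev);
                pm_runtime_put_autosuspend(cdd->ddev.dev);
                return NULL;
        }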
Fixes: fdea2d09b9 ("dmaengine: cppi41: Add basic PM runtime support")
Reported-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: stable@vger.kernel.org # v4.9+
Link: https://lore.kernel.org/r/20191023153138.23442-1-tony@atomide.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Whenever we reset the channel, we need to clear desc_pendingcount
along with desc_submitcount. Otherwise when a new transaction is
submitted, the irq coalesce level could be programmed to an incorrect
value in the axidma case.
This behavior can be observed when terminating pending transactions
with xilinx_dma_terminate_all() and then submitting new transactions
without releasing and requesting the channel.
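The fix boils down to clearing both counters together wherever the
channel is reset (sketch):

static void xilinx_dma_reset_counts_sketch(struct xilinx_dma_chan *chan)
{
        chan->desc_submitcount = 0;
        /* A stale desc_pendingcount would otherwise be programmed as
         * the IRQ coalescing level for the next axidma transaction.
         */
        chan->desc_pendingcount = 0;
}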
Signed-off-by: Nicholas Graumann <nick.graumann@gmail.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Link: https://lore.kernel.org/r/1571150904-3988-8-git-send-email-radhey.shyam.pandey@xilinx.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
bam_dma_terminate_all() will leak resources if any of the transactions
are committed to the hardware (present in the desc FIFO) but not complete.
Since bam_dma_terminate_all() does not cause the hardware to be updated,
the hardware will still operate on any previously committed transactions.
This can cause memory corruption if the memory for the transaction has been
reassigned, and will cause a sync issue between the BAM and its client(s).
Fix this by properly updating the hardware in bam_dma_terminate_all().
Fixes: e7c0fe2a5c ("dmaengine: add Qualcomm BAM dma driver")
Signed-off-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191017152606.34120-1-jeffrey.l.hugo@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DPAA2 (Data Path Acceleration Architecture 2) qDMA supports
virtualized channels by allowing DMA jobs to be enqueued into
different work queues. A core can initiate a DMA transaction by
preparing a frame descriptor (FD) for each DMA job and enqueuing
the job through a hardware portal. DPAA2 components can also
prepare an FD and enqueue a DMA job through a hardware portal.
The qDMA prefetches DMA jobs through the DPAA2 hardware portal. It
then schedules and dispatches them to internal DMA hardware engines,
which generate read and write requests. Both qDMA source data and
destination data can be either contiguous or non-contiguous, described
by one or more scatter/gather tables.
The qDMA supports global bandwidth flow control, where all DMA
transactions are stalled if the bandwidth threshold has been reached.
Transaction-based read throttling is also supported.
Add the NXP DPAA2 qDMA driver to support some Layerscape SoCs,
such as the LS1088A, LS208xA, and LX2.
Signed-off-by: Peng Ma <peng.ma@nxp.com>
Link: https://lore.kernel.org/r/20190930020440.7754-2-peng.ma@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>