qcacmn: Do not hold the NOL lock across del_timer_sync

On NOL timer expiry, the timer callback handler dfs_remove_from_nol is
called. It acquires dfs_nol_lock at several points while accessing the
NOL list, for example when deleting an NOL element or when printing or
copying the NOL elements. The handler also schedules a thread,
dfs_nol_elem_free_work, to free the NOL element memory once all
operations are done. This thread calls del_timer_sync and therefore waits
for the timer handler to complete before freeing the NOL timer element.

Consider a case where the NOL timer expires on a channel (say chan 100)
and the timer handler is called on Core1. After all cleanup operations,
the thread that frees the NOL timer element is scheduled and acquires the
NOL lock. The lock is released only after del_timer_sync completes, which
requires the handler (dfs_remove_from_nol) to have finished executing on
all other CPUs.

While the lock is held on Core1, the NOL timer expires on another channel
(say chan 120) and the timer handler dfs_remove_from_nol is called on
Core0. The handler tries to acquire the same NOL lock. Since the lock is
already held by the cleanup thread, the handler spins. del_timer_sync
cannot complete because dfs_remove_from_nol is still executing on another
CPU: del_timer_sync waits for the handler to finish, and the handler
waits for the lock held by the thread. This results in a deadlock.

Release the NOL lock before calling del_timer_sync. This gives the
handler a chance to execute. Re-acquire the lock when the next element is
removed from the free list.
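
The locking order described above can be sketched in user-space C. This
is only an illustration under stated assumptions, not the driver code:
nol_lock, struct nol_elem, timer_handler, free_all_elems and run_demo are
hypothetical names, a pthread mutex stands in for dfs_nol_lock, and
pthread_join stands in for del_timer_sync, since both block until the
handler has finished.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for dfs_nol_lock. */
static pthread_mutex_t nol_lock = PTHREAD_MUTEX_INITIALIZER;

struct nol_elem {
    pthread_t handler;       /* stand-in for the element's NOL timer */
    struct nol_elem *next;
};

static struct nol_elem *free_list; /* protected by nol_lock */
static int handlers_done;          /* protected by nol_lock */

/* Stand-in for dfs_remove_from_nol: it needs the lock to finish. */
static void *timer_handler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&nol_lock);
    handlers_done++;
    pthread_mutex_unlock(&nol_lock);
    return NULL;
}

/* Stand-in for the cleanup thread: the blocking wait runs unlocked. */
static int free_all_elems(void)
{
    int freed = 0;

    pthread_mutex_lock(&nol_lock);
    while (free_list != NULL) {
        struct nol_elem *elem = free_list;

        free_list = elem->next;
        pthread_mutex_unlock(&nol_lock);   /* let the handler run */
        pthread_join(elem->handler, NULL); /* del_timer_sync stand-in */
        freed++;
        pthread_mutex_lock(&nol_lock);     /* re-acquire for next element */
    }
    pthread_mutex_unlock(&nol_lock);
    return freed;
}

/* Starts two handlers that block on the lock, then cleans them up. */
int run_demo(void)
{
    static struct nol_elem elems[2];
    int i, freed;

    pthread_mutex_lock(&nol_lock);
    handlers_done = 0;
    free_list = NULL;
    for (i = 0; i < 2; i++) {
        int rc = pthread_create(&elems[i].handler, NULL,
                                timer_handler, NULL);
        assert(rc == 0);
        (void)rc;
        elems[i].next = free_list;
        free_list = &elems[i];
    }
    pthread_mutex_unlock(&nol_lock);

    freed = free_all_elems();
    return (freed == handlers_done) ? freed : -1;
}
```

If pthread_join were called with nol_lock still held, a handler that had
not yet taken the lock could never finish and the join would never
return, which mirrors the deadlock described above. Build with -pthread.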

CRs-Fixed: 2301063
Change-Id: I822714edee3269ccbb93838e3892796219e1b88e