Commit a9c4284bf5 ("ALSA: firewire-lib: add context information to
tracepoints") adds new members to tracepoint events of this module, to
represent context information. One of the members is bool type and
this causes sparse warnings.
16:1: warning: expression using sizeof bool
60:1: warning: expression using sizeof bool
16:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
60:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
This commit suppresses the warnings, by changing type of the member
to 'unsigned int'. Additionally, this commit applies '!!' idiom to
get 0/1 from 'in_interrupt()'.
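The '!!' idiom can be illustrated in isolation. This is only a sketch:
the helper names and the mask value are hypothetical and are not taken
from the driver. It assumes the context check yields a bit mask rather
than a strict 0/1 value.

```c
#include <assert.h>

/* Hypothetical stand-in for a context check that returns a bit mask,
 * not a strict 0/1 value. */
unsigned long fake_context_mask(int in_irq)
{
	return in_irq ? 0x00010100UL : 0UL;
}

/* Collapse any non-zero mask to exactly 1 so that the value can be
 * stored in an 'unsigned int' member instead of a bool. */
unsigned int context_flag(unsigned long mask)
{
	unsigned int flag = !!mask;

	assert(flag == 0 || flag == 1);
	return flag;
}
```

With this, context_flag(fake_context_mask(1)) yields 1 no matter which
bits of the mask were set.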
Fixes: a9c4284bf5 ("ALSA: firewire-lib: add context information to tracepoints")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In the Ethernet/TCP world, CAP_NET_RAW is sufficient to allow a program
to listen to all incoming packets on a specific interface, and the
higher CAP_NET_ADMIN is required to set the interface into promiscuous
mode. We want to emulate that same basic division of privilege in the
RDMA stack, so when dealing with Raw Ethernet QPs, allow apps with
CAP_NET_RAW to listen to all incoming flows (and direct them as they see
fit in their own listen stream). Do not require CAP_NET_ADMIN just to
listen to traffic already incoming. Reserve CAP_NET_ADMIN if we attempt
to set promiscuous mode.
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The problem is that the function 'send_reply_to_slave' gets the
'req_sa_mad' as a pointer whose address is only aligned to 4 bytes
but is 8 bytes in size. This can result in unaligned access faults
on certain architectures.
Sowmini Varadhan pointed to this reply from Dave Miller that says
that memcpy should not be used to solve alignment issues:
https://lkml.org/lkml/2015/10/21/352
Optimization of memcpy to 'ldx' instruction can only happen if the
compiler knows that the size of the data we are copying is 8 bytes
and it assumes it is aligned to 8 bytes. If the compiler knows the
type is not aligned to 8 it must not optimize the 8 byte copy.
Defining the data type as aligned to 4 forces the compiler to treat
all accesses as though they aren't aligned and avoids the 'ldx'
optimization.
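As a rough illustration of the approach, assuming GCC or Clang (the
'aligned' attribute is a compiler extension) and a hypothetical message
layout that is not the real MAD structure:

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit type that promises only 4-byte alignment. The compiler may
 * not assume 8-byte alignment when it loads or stores this type. */
typedef uint64_t u64_align4 __attribute__((aligned(4)));

/* Hypothetical message layout: the 64-bit field sits at offset 4. */
struct sa_mad_like {
	uint32_t   hdr;
	u64_align4 guid;
};

/* No padding is inserted before 'guid', so the struct is 12 bytes. */
static_assert(sizeof(struct sa_mad_like) == 12, "unexpected padding");

uint64_t read_guid(const struct sa_mad_like *mad)
{
	/* Emitted as an access that is valid at 4-byte alignment. */
	return mad->guid;
}
```

The type carries the alignment information, so every access through it
is handled conservatively without any memcpy at the call sites.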
Full credit for the idea goes to Jason Gunthorpe
<jgunthorpe@obsidianresearch.com>.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When booting with nr_cpus=1, uncore_pci_probe tries to init the PCI/uncore
also for the other packages and fails with a warning when they are not found.
The warning is bogus because it's correct to fail here for packages which are
not initialized. Remove it and return silently.
Fixes: cf6d445f68 "perf/x86/uncore: Track packages, not per CPU data"
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove the variable 'ret' from functions where it was not
used, and return 0 directly.
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The most recent release of AXS103 [v1.1] is proven to work
at 100 MHz in dual-core mode, so this change makes use of that.
For that we:
* Update axc003_idu.dtsi to state the CPU clock frequency actually used
* Remove clock override in AXS platform code for dual-core HW
Note we're still leaving a hack for clock "downgrade" on early boot
for quad-core hardware.
Also note this change will break functionality of AXS103 v1.0 hardware.
That means all users of AXS103 __must__ upgrade their boards with the
most recent firmware.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Rearrange the inode tagging functions so that they are higher up in
xfs_cache.c and so there is no need for forward prototypes to be
defined. This is purely code movement, no other change.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Inode radix tree tagging for reclaim passes a lot of unnecessary
variables around. Over time the xfs_perag has grown an xfs_mount
backpointer and an internal agno, so we don't need to pass other
variables into the tagging functions to supply this information.
Rework the functions to pass the minimal variable set required
and simplify the internal logic and flow.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The cluster inode variable uses unconventional naming - iq - which
makes it hard to distinguish from the inode passed into the
function - ip - and that is a vector for mistakes to be made.
Rename all the cluster inode variables to use more conventional
prefixes to reduce potential future confusion (cilist, cilist_size,
cip).
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
xfs_iflush_cluster() does a gang lookup on the radix tree, meaning
it can find inodes beyond the current cluster if there is sparse
cache population. Gang lookups return results in ascending index
order, so stop trying to cluster inodes once the first inode outside
the cluster mask is detected.
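The early exit can be sketched as follows. This is a simplified model,
not the XFS code: the lookup result is assumed to be a plain array of
inode numbers in ascending order, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many looked-up inode numbers belong to the cluster whose
 * base is first_index. 'inos' is assumed to be sorted ascending, as a
 * gang lookup returns it; 'mask' keeps the cluster-base bits.
 */
size_t count_cluster_members(const uint64_t *inos, size_t n,
			     uint64_t first_index, uint64_t mask)
{
	size_t i, in_cluster = 0;

	for (i = 0; i < n; i++) {
		assert(i == 0 || inos[i - 1] < inos[i]);
		/* First inode past the cluster: all later ones are too. */
		if ((inos[i] & mask) != first_index)
			break;
		in_cluster++;
	}
	return in_cluster;
}
```

Because of the ordering guarantee, a 'break' is sufficient here where a
'continue' would keep scanning entries that can never match.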
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The last thing we do before using call_rcu() on an xfs_inode to be
freed is mark it as invalid. This means there is a window between
when we know for certain that the inode is going to be freed and
when we do actually mark it as "freed".
This is important in the context of RCU lookups - we can look up the
inode, find that it is valid, and then use it as such not realising
that it is in the final stages of being freed.
As such, mark the inode as being invalid the moment we know it is
going to be reclaimed. This can be done while we still hold the
XFS_ILOCK_EXCL and the flush lock in xfs_inode_reclaim, meaning that
it occurs well before we remove it from the radix tree, and that
the i_flags_lock, the XFS_ILOCK and the inode flush lock all act as
synchronisation points for detecting that an inode is about to go
away.
For defensive purposes, this allows us to add a further check to
xfs_iflush_cluster to ensure we skip inodes that are being freed
after we grab the XFS_ILOCK_SHARED and the flush lock - if the
inode number is valid while we have these locks held, we know
that it has not progressed through reclaim to the point where it is
clean and is about to be freed.
[bfoster: fixed __xfs_inode_clear_reclaim() using ip->i_ino after it
had already been zeroed.]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The xfs_inode freed in xfs_inode_free() has multiple allocated
structures attached to it. We free these in xfs_inode_free() before
we mark the inode as invalid, and before we run call_rcu() to queue
the structure for freeing.
Unfortunately, this freeing can race with other accesses that are in
the current RCU grace period that have found the inode in the radix
tree with a valid state. This includes xfs_iflush_cluster(), which
calls xfs_inode_clean(), and that accesses the inode log item on the
xfs_inode.
The log item structure is freed in xfs_inode_free(), so there is the
possibility we can be accessing freed memory in xfs_iflush_cluster()
after validating the xfs_inode structure as being valid for this RCU
context. Hence we can get spuriously incorrect clean state returned
from such checks. This can lead to us thinking the inode is dirty
when it is, in fact, clean, and so incorrectly attaching it to the
buffer for IO and completion processing.
This then leads to use-after-free situations on the xfs_inode itself
if the IO completes after the current RCU grace period expires. The
buffer callbacks will access the xfs_inode and try to do all sorts
of things it shouldn't with freed memory.
IOWs, xfs_iflush_cluster() only works correctly when racing with
inode reclaim if the inode log item is present and correctly stating
the inode is clean. If the inode is being freed, then reclaim has
already made sure the inode is clean, and hence xfs_iflush_cluster
can skip it. However, we are accessing the inode under RCU
read lock protection and so also must ensure that all dynamically
allocated memory we reference in this context is not freed until the
RCU grace period expires.
To fix this, move all the potential memory freeing into
xfs_inode_free_callback() so that we are guaranteed RCU protected
lookup code will always have the memory structures it needs
available during the RCU grace period that lookup races can occur
in.
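The ordering can be modelled in user space. The sketch below is a
single-threaded simulation with hypothetical names; it uses a simple
deferred-free list in place of call_rcu() and does not implement RCU.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much-reduced stand-ins for the inode and its log item. */
struct log_item { int dirty; };
struct inode_like {
	unsigned long      ino;		/* 0 means "being freed" */
	struct log_item   *item;	/* attached allocation */
	struct inode_like *next_free;	/* deferred-free list link */
};

struct inode_like *deferred;	/* stands in for the call_rcu() queue */

/* Mark the inode invalid and queue it. Nothing is freed yet, so a
 * racing reader that already found the inode can still use ->item. */
void inode_free(struct inode_like *ip)
{
	ip->ino = 0;
	ip->next_free = deferred;
	deferred = ip;
}

/* Runs once the simulated grace period has expired: only now is the
 * attached memory released, followed by the inode itself. */
unsigned int grace_period_expired(void)
{
	unsigned int freed = 0;

	while (deferred) {
		struct inode_like *ip = deferred;

		deferred = ip->next_free;
		assert(ip->ino == 0);
		free(ip->item);
		free(ip);
		freed++;
	}
	return freed;
}
```

Between inode_free() and grace_period_expired() the attached log item
stays readable, which is the property the lookup side relies on.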
Discovered-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When unmounting XFS, we call:
xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy
This goes over the whole indirection array and calls
xfs_iext_irec_remove for each one of the erps (from the last one to
the first one). As a result, we keep shrinking (reallocating
actually) the indirection array until we shrink out all of its
elements. When we have files with huge numbers of extents, umount
takes 30-80 sec, depending on the number of files that XFS loaded
and the number of indirection entries of each file. The unmount
stack looks like:
[<ffffffffc0b6d200>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
[<ffffffffc0b6cd8e>] xfs_iext_irec_remove+0xee/0xf0 [xfs]
[<ffffffffc0b6cdcd>] xfs_iext_destroy+0x3d/0xb0 [xfs]
[<ffffffffc0b6cef6>] xfs_idestroy_fork+0xb6/0xf0 [xfs]
[<ffffffffc0b87002>] xfs_inode_free+0xb2/0xc0 [xfs]
[<ffffffffc0b87260>] xfs_reclaim_inode+0x250/0x340 [xfs]
[<ffffffffc0b87583>] xfs_reclaim_inodes_ag+0x233/0x370 [xfs]
[<ffffffffc0b8823d>] xfs_reclaim_inodes+0x1d/0x20 [xfs]
[<ffffffffc0b96feb>] xfs_unmountfs+0x7b/0x1a0 [xfs]
[<ffffffffc0b98e4d>] xfs_fs_put_super+0x2d/0x70 [xfs]
[<ffffffff811e9e36>] generic_shutdown_super+0x76/0x100
[<ffffffff811ea207>] kill_block_super+0x27/0x70
[<ffffffff811ea519>] deactivate_locked_super+0x49/0x60
[<ffffffff811eaaee>] deactivate_super+0x4e/0x70
[<ffffffff81207593>] cleanup_mnt+0x43/0x90
[<ffffffff81207632>] __cleanup_mnt+0x12/0x20
[<ffffffff8108f8e7>] task_work_run+0xa7/0xe0
[<ffffffff81014ff7>] do_notify_resume+0x97/0xb0
[<ffffffff81717c6f>] int_signal+0x12/0x17
Further, this reallocation prevents us from freeing the extent list
from an RCU callback as allocation can block. Hence if the extent
list is in indirect format, optimise the freeing of the extent list
to only use kmem_free calls by freeing entire extent buffer pages at
a time, rather than extent by extent.
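A user-space sketch of the single-pass teardown, with hypothetical
structure names and free() standing in for kmem_free():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced indirection array: each entry owns one buffer. */
struct irec_like { void *buf; };
struct ext_list {
	struct irec_like *irecs;
	size_t            nirecs;
};

/* Free every extent buffer, then the array itself, in one pass. The
 * array is never shrunk or reallocated, so nothing here allocates. */
size_t ext_list_destroy(struct ext_list *l)
{
	size_t i, freed = 0;

	assert(l != NULL);
	for (i = 0; i < l->nirecs; i++) {
		free(l->irecs[i].buf);
		freed++;
	}
	free(l->irecs);
	l->irecs = NULL;
	l->nirecs = 0;
	return freed;
}
```

The cost is one free per buffer plus one for the array, instead of one
reallocation of the array per removed entry.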
[dchinner: simplified freeing loop based on Christoph's suggestion]
Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Some careless idiot(*) wrote crap code in commit 1a3e8f3 ("xfs:
convert inode cache lookups to use RCU locking") back in late 2010,
and so xfs_iflush_cluster checks the wrong inode for whether it is
still valid under RCU protection. Fix it to lock and check the
correct inode.
(*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
cc: <stable@vger.kernel.org> # 3.10.x-
Discovered-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When a failure due to an inode buffer occurs, the error handling
fails to abort the inode writeback correctly. This can result in the
inode being reclaimed whilst still in the AIL, leading to
use-after-free situations as well as filesystems that cannot be
unmounted as the inode log items left in the AIL never get removed.
Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
the inode flush being aborted correctly.
cc: <stable@vger.kernel.org> # 3.10.x-
Reported-by: Shyam Kaushik <shyam@zadarastorage.com>
Diagnosed-by: Shyam Kaushik <shyam@zadarastorage.com>
Tested-by: Shyam Kaushik <shyam@zadarastorage.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Joe Lawrence reported a list_add corruption with 4.6-rc1 when
testing some custom md administration code that made its own
block device nodes for the md array. The simple test loop of:
for i in {0..100}; do
	mknod --mode=0600 $tmp/tmp_node b $MAJOR $MINOR
	mdadm --detail --export $tmp/tmp_node > /dev/null
	rm -f $tmp/tmp_node
done
Would produce this warning in bd_acquire() when mdadm opened the
device node:
list_add double add: new=ffff88043831c7b8, prev=ffff8804380287d8, next=ffff88043831c7b8.
And then produce this from bd_forget from kdevtmpfs evicting a block
dev inode:
list_del corruption. prev->next should be ffff8800bb83eb10, but was ffff88043831c7b8
This is a regression caused by commit c19b3b05 ("xfs: move di_mode
to vfs inode"). The issue is that xfs_inactive() frees the
unlinked inode, and the above commit meant that this freeing zeroed
the mode in the struct inode. The problem is that after evict() has
called ->evict_inode, it expects the i_mode to be intact so that it
can call bd_forget() or cd_forget() to drop the reference to the
block device inode attached to the XFS inode.
In reality, the only thing we do in xfs_fs_evict_inode() that is not
generic is call xfs_inactive(). We can move the xfs_inactive() call
to xfs_fs_destroy_inode() without any problems at all, and this
will leave the VFS inode intact until it is completely done with it.
So, remove xfs_fs_evict_inode(), and do the work it used to do in
->destroy_inode instead.
cc: <stable@vger.kernel.org> # 4.6
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
This fix prevents nodes from wrongly creating a 00:00:00:00:00:00 originator
which can potentially interfere with the rest of the neighbor statistics.
Fixes: d6f94d91f7 ("batman-adv: ELP - adding basic infrastructure")
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
The undefined behavior sanitizer detected a signed integer overflow in a
setup with near perfect link quality:
UBSAN: Undefined behaviour in net/batman-adv/bat_iv_ogm.c:1246:25
signed integer overflow:
8713350 * 255 cannot be represented in type 'int'
The problem happens because the calculation of mixed unsigned and signed
integers resulted in a signed integer multiplication.
batadv_ogm_packet::tq (u8 255)
* tq_own (u8 255)
* tq_asym_penalty (int 134; max 255)
* tq_iface_penalty (int 255; max 255)
The tq_iface_penalty, tq_asym_penalty and inv_asym_penalty can just be
changed to unsigned int because they are not expected to become negative.
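The effect of the type change can be reproduced in isolation. This is a
minimal sketch that assumes a 32-bit 'int'; the function name is
hypothetical and only the multiplication from the report is modelled.

```c
#include <assert.h>
#include <stdint.h>

unsigned int tq_product(uint8_t tq, uint8_t tq_own,
			unsigned int tq_asym_penalty,
			unsigned int tq_iface_penalty)
{
	/* Both penalties are bounded by 255 according to the report. */
	assert(tq_asym_penalty <= 255 && tq_iface_penalty <= 255);

	/*
	 * tq * tq_own is computed in int (at most 65025). Multiplying by
	 * an unsigned int operand makes the rest of the chain unsigned,
	 * so 8713350 * 255 no longer overflows a signed int.
	 */
	return tq * tq_own * tq_asym_penalty * tq_iface_penalty;
}
```

For the values in the report the product is 2221904250, which exceeds
INT_MAX (2147483647) but fits in a 32-bit unsigned int.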
Fixes: c039876892 ("batman-adv: add WiFi penalty")
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
When the MAC address of the primary interface is changed,
update the originator address in the ELP and OGM skb buffers as
well in order to reflect the change.
Fixes: d6f94d91f7 ("batman-adv: ELP - adding basic infrastructure")
Reported-by: Marek Lindner <marek@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
The function batadv_iv_ogm_orig_add_if allocates new buffers for bcast_own
and bcast_own_sum. It is expected that these buffers are unchanged in case
either bcast_own or bcast_own_sum couldn't be resized.
But the error handling of this function frees the already resized buffer
for bcast_own when the allocation of the new bcast_own_sum buffer failed.
This will lead to an invalid memory access when some code will try to
access bcast_own.
Instead the resized new bcast_own buffer has to be kept. This will not lead
to problems because the size of the buffer was only increased and therefore
no user of the buffer will try to access bytes outside of the new buffer.
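The intended error handling can be sketched in user space. All names
below are hypothetical, and 'alloc_budget' exists only so that an
allocation failure can be provoked in a test.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fault-injection hook for this sketch: number of allocations that may
 * still succeed, or -1 for "never fail". */
int alloc_budget = -1;

void *budget_alloc(size_t size)
{
	if (alloc_budget == 0)
		return NULL;
	if (alloc_budget > 0)
		alloc_budget--;
	return calloc(1, size);
}

/* Hypothetical reduced originator with two per-interface buffers. */
struct orig_like {
	unsigned long *own;
	unsigned char *own_sum;
	size_t         num_if;
};

/* Grow both buffers to max_if entries; 0 on success, -1 on failure. */
int orig_add_if(struct orig_like *o, size_t max_if)
{
	void *buf;

	assert(o->own && o->own_sum && max_if > o->num_if);

	buf = budget_alloc(max_if * sizeof(*o->own));
	if (!buf)
		return -1;
	memcpy(buf, o->own, o->num_if * sizeof(*o->own));
	free(o->own);
	o->own = buf;	/* the old buffer is gone: this one must survive */

	buf = budget_alloc(max_if * sizeof(*o->own_sum));
	if (!buf)
		return -1;	/* keep the larger 'own'; do not free it */
	memcpy(buf, o->own_sum, o->num_if * sizeof(*o->own_sum));
	free(o->own_sum);
	o->own_sum = buf;
	o->num_if = max_if;
	return 0;
}
```

After a failed second allocation 'own' is larger than num_if requires,
which is harmless, and every existing index remains valid.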
Fixes: d0015fdd3d ("batman-adv: provide orig_node routing API")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
The function batadv_neigh_ifinfo_get increases the reference counter of the
batadv_neigh_ifinfo. It has to be decreased again when the reference is
not used anymore to correctly free the objects.
Fixes: 9786906022 ("batman-adv: B.A.T.M.A.N. V - implement neighbor comparison API calls")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
batadv_neigh_ifinfo_get can return NULL when it can no longer find (even if
only temporarily) the neigh_ifinfo in the list neigh->ifinfo_list. This
has to be checked to avoid kernel Oopses when the ifinfo is dereferenced.
This is a situation which isn't expected but is already handled by functions
like batadv_v_neigh_cmp. The same kind of warning is therefore used before
the function returns without dereferencing the pointers.
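A reduced sketch of the pattern, pairing every successful get with a put
and returning early when a lookup yields NULL. The types and helpers
are hypothetical simplifications, not the batman-adv API, and the
warning itself is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced ifinfo with a plain, non-atomic reference count. */
struct ifinfo_like {
	int refcount;
	int throughput;
};

/* Take a reference; NULL models an entry that could not be found. */
struct ifinfo_like *ifinfo_get(struct ifinfo_like *info)
{
	if (!info)
		return NULL;
	info->refcount++;
	return info;
}

void ifinfo_put(struct ifinfo_like *info)
{
	assert(info->refcount > 0);
	info->refcount--;
}

/* Compare two neighbours without dereferencing a missing ifinfo. */
int neigh_cmp(struct ifinfo_like *a, struct ifinfo_like *b)
{
	struct ifinfo_like *ia = ifinfo_get(a);
	struct ifinfo_like *ib = ifinfo_get(b);
	int ret = 0;

	if (ia && ib)
		ret = ia->throughput - ib->throughput;

	/* Drop exactly the references that were taken above. */
	if (ia)
		ifinfo_put(ia);
	if (ib)
		ifinfo_put(ib);
	return ret;
}
```

The reference counts are unchanged after the call whether or not one of
the lookups failed.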
Fixes: 9786906022 ("batman-adv: B.A.T.M.A.N. V - implement neighbor comparison API calls")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
batadv_send_skb_to_orig() calls dev_queue_xmit(), which consumes the skb,
so we can't use skb->len after the call returns.
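The usual remedy is to sample the length before handing the buffer
over. A user-space sketch with hypothetical names, in which the
transmit function frees the packet to model the hand-off:

```c
#include <assert.h>
#include <stdlib.h>

struct pkt { size_t len; };

/* Hypothetical transmit function that consumes (frees) the packet. */
int xmit_and_consume(struct pkt *p)
{
	free(p);
	return 0;
}

/* Account the bytes of a forwarded packet. The length is read before
 * the hand-off because the packet must not be touched afterwards. */
int forward_and_count(struct pkt *p, size_t *tx_bytes)
{
	size_t len;
	int ret;

	assert(p != NULL && tx_bytes != NULL);
	len = p->len;
	ret = xmit_and_consume(p);
	if (ret == 0)
		*tx_bytes += len;
	return ret;
}
```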
Fixes: 953324776d ("batman-adv: network coding - buffer unicast packets before forward")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
If we take "retry forever" literally on metadata IO errors, we can
hang at unmount, as it retries those writes forever. This is the
default behavior, unfortunately.
Add an error configuration option for this behavior and default it
to "fail" so that an unmount will trigger actual errors, a shutdown
and allow the unmount to succeed. It will be noisy, though, as it
will log the errors and the shutdown that occurs.
To fix this, we need to mark the filesystem as being in the process
of unmounting. Do this with a mount flag that is added at the
appropriate time (i.e. before the blocking AIL sync). We also need
to add this flag if mount fails after the initial phase of log
recovery has been run.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Now that most of the infrastructure is in place, we can start adding
support for configuring specific errors such as ENODEV, ENOSPC, EIO,
etc. Add these error configurations and configure them all to have
appropriate behaviours. That is, all will be configured to retry
forever by default, except for ENODEV, which is an unrecoverable
error, so it will be configured to not retry on error.
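The defaults described above could be expressed as a small table. This
is a sketch only: the structure, the lookup helper and the fallback for
unlisted errors are assumptions, and -1 is used as the "retry forever"
value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RETRY_FOREVER (-1)

struct error_default {
	int err;		/* positive errno value */
	int max_retries;	/* RETRY_FOREVER or a retry count */
};

static const struct error_default error_defaults[] = {
	{ EIO,    RETRY_FOREVER },
	{ ENOSPC, RETRY_FOREVER },
	{ ENODEV, 0 },		/* unrecoverable: do not retry */
};

int default_max_retries(int err)
{
	size_t i;

	assert(err > 0);
	for (i = 0; i < sizeof(error_defaults) / sizeof(error_defaults[0]); i++)
		if (error_defaults[i].err == err)
			return error_defaults[i].max_retries;
	return RETRY_FOREVER;	/* assumed fallback for unlisted errors */
}
```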
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
On reception of an error, we can fail immediately, perform some
bounded number of retries or retry indefinitely. The current behaviour
we have is to retry forever.
However, we'd like the ability to choose how long the filesystem
should try after an error: it can either fail immediately, retry a
few times, or retry forever. This is implemented by using the
max_retries sysfs attribute to hold the number of times we allow
the filesystem to retry after an error, with -1 being a special case
where the filesystem will retry indefinitely.
Add both a maximum retry count and a retry timeout so that we can
bound by time and/or physical IO attempts.
Finally, plumb these into xfs_buf_iodone error processing so that
the error behaviour follows the selected configuration.
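The resulting decision can be sketched as a pure function. The names
are hypothetical, and treating a timeout of 0 as "no time limit" is an
assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>

#define RETRY_FOREVER (-1)

struct error_cfg {
	int  max_retries;	/* -1: retry forever, 0: fail immediately */
	long timeout_secs;	/* 0: no time limit (assumed) */
};

/* Decide whether a failed write should be retried once more. */
bool should_retry(const struct error_cfg *cfg, int retries_done,
		  long secs_since_first_error)
{
	assert(retries_done >= 0 && secs_since_first_error >= 0);

	if (cfg->max_retries != RETRY_FOREVER &&
	    retries_done >= cfg->max_retries)
		return false;
	if (cfg->timeout_secs != 0 &&
	    secs_since_first_error > cfg->timeout_secs)
		return false;
	return true;
}
```

Either bound ends the retries on its own, which gives the "time and/or
physical IO attempts" behaviour.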
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Before we start expanding the number of error classes and errors we
can configure behaviour for, we need a simple and clear way to
define the default behaviour that we initialize each mount with.
Introduce a table based method for keeping the initial configuration
in, and apply that to the existing initialization code.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
With the error configuration handle for async metadata write errors
in place, we can now add initial support to the IO error processing
in xfs_buf_iodone_error().
Add an infrastructure function to look up the configuration handle,
and rearrange the error handling to prepare the way for different
error handling configurations to be used.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Now we have the basic infrastructure, add the first error class so
we can build up the infrastructure in a meaningful way. Add the
metadata async write IO error class and sysfs entry, and introduce a
default configuration that matches the existing "retry forever"
behavior for async write metadata buffers.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Pull libata updates from Tejun Heo:
"Trivial changes except for special case timeout bumping.
I have two more libata branches which depend on SCSI and dmaengine
tree respectively. I'll send pull requests for them once the
prerequisite trees are pulled in"
* 'for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata-scsi: use %*ph to dump small buffers
treewide: Fix typos in libata.xml
libata-core: Allow longer timeout for drive spinup from PUIS
libata: Fixup awkward whitespace in warning by removing line continuation.
We need to be able to change the way XFS behaves in error
conditions depending on the type of underlying storage. This is
necessary for handling non-traditional block devices with extended
error cases, such as thin provisioned devices that can return ENOSPC
as an IO error.
Introduce the basic sysfs infrastructure needed to define and
configure error behaviours. This is done to be generic enough to
extend to configuring behaviour in other error conditions, such as
ENOMEM, which also has different desired behaviours according to
machine configuration.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reports have surfaced of a lockdep splat complaining about an
irq-safe -> irq-unsafe locking order in the xfs_buf_bio_end_io() bio
completion handler. This only occurs when I/O errors are present
because bp->b_lock is only acquired in this context to protect
setting an error on the buffer. The problem is that this lock can be
acquired with the (request_queue) q->queue_lock held. See
scsi_end_request() or ata_qc_schedule_eh(), for example.
Replace the locked test/set of b_io_error with a cmpxchg() call.
This eliminates the need for the lock and thus the lock ordering
problem goes away.
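In user space the same "set only if still unset" update can be written
with C11 atomics. This is an analogue with hypothetical names, not the
kernel cmpxchg() call itself.

```c
#include <assert.h>
#include <stdatomic.h>

struct buf_like {
	atomic_int io_error;	/* 0 means "no error recorded yet" */
};

/* Record 'error' only if no earlier error has been recorded. No lock
 * is taken, so this is safe to call from a completion context. */
void buf_set_io_error(struct buf_like *bp, int error)
{
	int expected = 0;

	assert(error != 0);
	/* Stores 'error' only when io_error still equals 'expected'. */
	atomic_compare_exchange_strong(&bp->io_error, &expected, error);
}
```

The first error wins and later ones are ignored, which matches what a
locked test-and-set would do.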
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Commit 0b89e9aa28 (cpuidle: delay enabling interrupts until all
coupled CPUs leave idle) rightfully fixed a regression by letting
the coupled idle state framework handle local interrupt enabling
when the CPU is exiting an idle state.
The current code checks if the idle state is coupled and, if so, it
will let the coupled code enable interrupts. This way, it can
decrement the ready-count before handling the interrupt. This
mechanism prevents the other CPUs from waiting for a CPU which is
handling interrupts.
But the check is done against the state index returned by the back
end driver's ->enter functions which could be different from the
initial index passed as parameter to the cpuidle_enter_state()
function.
entered_state = target_state->enter(dev, drv, index);
[ ... ]
if (!cpuidle_state_is_coupled(drv, entered_state))
	local_irq_enable();
[ ... ]
If the 'index' is referring to a coupled idle state but the
'entered_state' is *not* coupled, then the interrupts are enabled
again. All CPUs blocked on the sync barrier may busy loop longer
if the CPU has interrupts to handle before decrementing the
ready-count. That consumes more energy than it saves.
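A sketch of the decision, assuming the remedy is to test the index that
was requested rather than the one the driver reports back. The types
and the helper name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct state_like {
	bool coupled;
};

/* Interrupts are re-enabled here only if the *requested* state is not
 * coupled; for a coupled request the coupled code does it later. */
bool should_enable_irqs(const struct state_like *states, int nstates,
			int index, int entered_state)
{
	assert(index >= 0 && index < nstates);
	(void)entered_state;	/* may differ from index; not consulted */
	return !states[index].coupled;
}
```

With a coupled state requested and a non-coupled state reported back,
the function returns false and leaves interrupt enabling to the coupled
code.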
Fixes: 0b89e9aa28 (cpuidle: delay enabling interrupts until all coupled CPUs leave idle)
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull regulator fix from Mark Brown:
"Fix build warnings from regulator_can_change_voltage()
Cut down on noise for mainstream users of the API and people
doing build testing by dropping the deprecated flag from
regulator_can_change_voltage() as it triggers even on the
EXPORT_SYMBOL_GPL() which affects all builds rather than just
the remaining drivers with calls to it (for which fixes are
currently pending).
The function remains deprecated and is expected to be removed
entirely in v4.8"
* tag 'regulator-fix-can-change-voltage' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: Silence build warnings from regulator_can_change_voltage()