Pull irq fixes from Ingo Molnar:
"A sparse irq race/locking fix, and a MSI irq domains population fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Make sparse_irq_lock protect what it should protect
genirq/msi: Fix populating multiple interrupts
for_each_active_irq() iterates the sparse irq allocation bitmap. The caller
must hold sparse_irq_lock. Several code pathes expect that an active bit in
the sparse bitmap also has a valid interrupt descriptor.
Unfortunately that's not true. The (de)allocation is a two step process,
which holds the sparse_irq_lock only across the queue/remove from the radix
tree and the set/clear in the allocation bitmap.
If a iteration locks sparse_irq_lock between the two steps, then it might
see an active bit but the corresponding irq descriptor is NULL. If that is
dereferenced unconditionally, then the kernel oopses. Of course, all
iterator sites could be audited and fixed, but....
There is no reason why the sparse_irq_lock needs to be dropped between the
two steps, in fact the code becomes simpler when the mutex is held across
both and the semantics become more straight forward, so future problems of
missing NULL pointer checks in the iteration are avoided and all existing
sites are fixed in one go.
Expand the lock held sections so both operations are covered and the bitmap
and the radixtree are in sync.
Fixes: a05a900a51 ("genirq: Make sparse_lock a mutex")
Reported-and-tested-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
On allocating the interrupts routed via a wire-to-MSI bridge, the allocator
iterates over the MSI descriptors to build the hierarchy, but fails to use
the descriptor interrupt number, and instead uses the base number,
generating the wrong IRQ domain mappings.
The fix is to use the MSI descriptor interrupt number when setting up
the interrupt instead of the base interrupt for the allocation range.
The only saving grace is that although the MSI descriptors are allocated
in bulk, the wired interrupts are only allocated one by one (so
desc->irq == virq) and the bug went unnoticed so far.
Fixes: 2145ac9310 ("genirq/msi: Add msi_domain_populate_irqs")
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170906103540.373864a2.john@metanate.com
Pull device properties framework updates from Rafael Wysocki:
"These introduce fwnode operations for all of the separate types of
'firmware nodes' that can be handled by the device properties
framework, make the framework use const fwnode arguments all over, add
a helper for the consolidated handling of node references and switch
over the framework to the new UUID API.
Specifics:
- Introduce fwnode operations for all of the separate types of
'firmware nodes' that can be handled by the device properties
framework and drop the type field from struct fwnode_handle (Sakari
Ailus, Arnd Bergmann).
- Make the device properties framework use const fwnode arguments
where possible (Sakari Ailus).
- Add a helper for the consolidated handling of node references to
the device properties framework (Sakari Ailus).
- Switch over the ACPI part of the device properties framework to the
new UUID API (Andy Shevchenko)"
* tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: device property: Switch to use new generic UUID API
device property: export irqchip_fwnode_ops
device property: Introduce fwnode_property_get_reference_args
device property: Constify fwnode property API
device property: Constify argument to pset fwnode backend
ACPI: Constify internal fwnode arguments
ACPI: Constify acpi_bus helper functions, switch to macros
ACPI: Prepare for constifying acpi_get_next_subnode() fwnode argument
device property: Get rid of struct fwnode_handle type field
ACPI: Use IS_ERR_OR_NULL() instead of non-NULL check in is_acpi_data_node()
Pull irq updates from Thomas Gleixner:
"The interrupt subsystem delivers this time:
- Refactoring of the GIC-V3 driver to prepare for the GIC-V4 support
- Initial GIC-V4 support
- Consolidation of the FSL MSI support
- Utilize the effective affinity interface in various ARM irqchip
drivers
- Yet another interrupt chip driver (UniPhier AIDET)
- Bulk conversion of the irq chip driver to use %pOF
- The usual small fixes and improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
irqchip/ls-scfg-msi: Add MSI affinity support
irqchip/ls-scfg-msi: Add LS1043a v1.1 MSI support
irqchip/ls-scfg-msi: Add LS1046a MSI support
arm64: dts: ls1046a: Add MSI dts node
arm64: dts: ls1043a: Share all MSIs
arm: dts: ls1021a: Share all MSIs
arm64: dts: ls1043a: Fix typo of MSI compatible string
arm: dts: ls1021a: Fix typo of MSI compatible string
irqchip/ls-scfg-msi: Fix typo of MSI compatible strings
irqchip/irq-bcm7120-l2: Use correct I/O accessors for irq_fwd_mask
irqchip/mmp: Make mmp_intc_conf const
irqchip/gic: Make irq_chip const
irqchip/gic-v3: Advertise GICv4 support to KVM
irqchip/gic-v4: Enable low-level GICv4 operations
irqchip/gic-v4: Add some basic documentation
irqchip/gic-v4: Add VLPI configuration interface
irqchip/gic-v4: Add VPE command interface
irqchip/gic-v4: Add per-VM VPE domain creation
irqchip/gic-v3-its: Set implementation defined bit to enable VLPIs
irqchip/gic-v3-its: Allow doorbell interrupts to be injected/cleared
...
Pull irqchip updates for 4.14 from Marc Zyngier:
- irqchip-specific part of the monster GICv4 series
- new UniPhier AIDET irqchip driver
- new variants of some Freescale MSI widget
- blanket removal of of_node->full_name in printk
- random collection of fixes
kernel/irq/proc.c: In function ‘show_irq_affinity’:
include/linux/cpumask.h:24:29: warning: ‘mask’ may be used uninitialized in this function [-Wmaybe-uninitialized]
#define cpumask_bits(maskp) ((maskp)->bits)
gcc is silly, but admittedly it can't know that this won't be called with
anything else than the enumerated constants.
Shut up the warning by creating a default clause.
Fixes: 6bc6d4abd2 ("genirq/proc: Use the the accessor to report the effective affinity
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This code generates a Smatch warning:
kernel/irq/irqdomain.c:1511 irq_domain_push_irq()
warn: variable dereferenced before check 'root_irq_data' (see line 1508)
irq_get_irq_data() can return a NULL pointer, but the code dereferences
the returned pointer before checking it.
Move the NULL pointer check before the dereference.
[ tglx: Rewrote changelog to be precise and conforming to the instructions
in submitting-patches and added a Fixes tag. Sigh! ]
Fixes: 495c38d300 ("irqdomain: Add irq_domain_{push,pop}_irq() functions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Daney <david.daney@cavium.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170825121409.6rfv4vt6ztz2oqkt@mwanda
When assigning an interrupt to a vcpu, it is not unlikely that
the level of the hierarchy implementing irq_set_vcpu_affinity
is not the top level (think a generic MSI domain on top of a
virtualization aware interrupt controller).
In such a case, let's iterate over the hierarchy until we find
an irqchip implementing it.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
irq_modify_status starts by clearing the trigger settings from
irq_data before applying the new settings, but doesn't restore them,
leaving them to IRQ_TYPE_NONE.
That's pretty confusing to the potential request_irq() that could
follow. Instead, snapshot the settings before clearing them, and restore
them if the irq_modify_status() invocation was not changing the trigger.
Fixes: 1e2a7d7849 ("irqdomain: Don't set type when mapping an IRQ")
Reported-and-tested-by: jeffy <jeffy.chen@rock-chips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com
For an already existing irqdomain hierarchy, as might be obtained via
a call to pci_enable_msix_range(), a PCI driver wishing to add an
additional irqdomain to the hierarchy needs to be able to insert the
irqdomain to that already initialized hierarchy. Calling
irq_domain_create_hierarchy() allows the new irqdomain to be created,
but no existing code allows for initializing the associated irq_data.
Add a couple of helper functions (irq_domain_push_irq() and
irq_domain_pop_irq()) to initialize the irq_data for the new
irqdomain added to an existing hierarchy.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/1503017616-3252-6-git-send-email-david.daney@cavium.com
When developing new (and therefore buggy) interrupt related
code, it can sometimes be useful to inject interrupts without
having to rely on a device to actually generate them.
This functionnality relies either on the irqchip driver to
expose a irq_set_irqchip_state(IRQCHIP_STATE_PENDING) callback,
or on the core code to be able to retrigger a (edge-only)
interrupt.
To use this feature:
echo -n trigger > /sys/kernel/debug/irq/irqs/IRQNUM
WARNING: This is DANGEROUS, and strictly a debug feature.
Do not use it on a production system. Your HW is likely to
catch fire, your data to be corrupted, and reporting this will
make you look an even bigger fool than the idiot who wrote
this patch.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170818081156.9264-1-marc.zyngier@arm.com
That commit was part of the changes moving x86 to the generic CPU hotplug
interrupt migration code. The force flag was required on x86 before the
hierarchical irqdomain rework, but invoking set_affinity() with force=true
stayed and had no side effects.
At some point in the past, the force flag got repurposed to support the
exynos timer interrupt affinity setting to a not yet online CPU, so the
interrupt controller callback does not verify the supplied affinity mask
against cpu_online_mask.
Setting the flag in the CPU hotplug code causes the cpu online masking to
be blocked on these irq controllers and results in potentially affining an
interrupt to the CPU which is unplugged, i.e. instead of moving it away,
it's just reassigned to it.
As the force flags is not longer needed on x86, it's safe to revert that
patch so the ARM irqchips which use the force flag work again.
Add comments to that effect, so this won't happen again.
Note: The online mask handling should be done in the generic code and the
force flag and the masking in the irq chips removed all together, but
that's not a change possible for 4.13.
Fixes: 77f85e66aa ("genirq/cpuhotplug: Set force affinity flag on hotplug migration")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707271217590.3109@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The newly added irqchip_fwnode_ops structure is not exported, which can
lead to link errors:
ERROR: "irqchip_fwnode_ops" [drivers/gpio/gpio-xgene-sb.ko] undefined!
I checked that all other such symbols that were introduced are
exported if they need to be, this is the only missing one.
Fixes: db3e50f323 (device property: Get rid of struct fwnode_handle type field)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Instead of relying on the struct fwnode_handle type field, define
fwnode_operations structs for all separate types of fwnodes. To find out
the type, compare to the ops field to relevant ops structs.
This change has two benefits:
1. it avoids adding the type field to each and every instance of struct
fwnode_handle, thus saving memory and
2. makes the ops field the single factor that defines both the types of
the fwnode as well as defines the implementation of its operations,
decreasing the possibility of bugs when developing code dealing with
fwnode internals.
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Interrupts with the IRQF_FORCE_RESUME flag set have also the
IRQF_NO_SUSPEND flag set. They are not disabled in the suspend path, but
must be forcefully resumed. That's used by XEN to keep IPIs enabled beyond
the suspension of device irqs. Force resume works by pretending that the
interrupt was disabled and then calling __irq_enable().
Incrementing the disabled depth counter was enough to do that, but with the
recent changes which use state flags to avoid unnecessary hardware access,
this is not longer sufficient. If the state flags are not set, then the
hardware callbacks are not invoked and the interrupt line stays disabled in
"hardware".
Set the disabled and masked state when pretending that an interrupt got
disabled by suspend.
Fixes: bf22ff45be ("genirq: Avoid unnecessary low level irq function calls")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/20170717174703.4603-2-jgross@suse.com
Pull irq fix from Thomas Gleixner:
"Fix the fallout from reworking the locking and resource management in
request/free_irq()"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Keep chip buslock across irq_request/release_resources()
Moving the irq_request/release_resources() callbacks out of the spinlocked,
irq disabled and bus locked region, unearthed an interesting abuse of the
irq_bus_lock/irq_bus_sync_unlock() callbacks.
The OMAP GPIO driver does merily power management inside of them. The
irq_request_resources() callback of this GPIO irqchip calls a function
which reads a GPIO register. That read aborts now because the clock of the
GPIO block is not magically enabled via the irq_bus_lock() callback.
Move the callbacks under the bus lock again to prevent this. In the
free_irq() path this requires to drop the bus_lock before calling
synchronize_irq() and reaquiring it before calling the
irq_release_resources() callback.
The bus lock can't be held because:
1) The data which has been changed between bus_lock/un_lock is cached in
the irq chip driver private data and needs to go out to the irq chip
via the slow bus (usually SPI or I2C) before calling
synchronize_irq().
That's the reason why this bus_lock/unlock magic exists in the first
place, as you cannot do SPI/I2C transactions while holding desc->lock
with interrupts disabled.
2) synchronize_irq() will actually deadlock, if there is a handler on
flight. These chips use threaded handlers for obvious reasons, as
they allow to do SPI/I2C communication. When the threaded handler
returns then bus_lock needs to be taken in irq_finalize_oneshot() as
we need to talk to the actual irq chip once more. After that the
threaded handler is marked done, which makes synchronize_irq() return.
So if we hold bus_lock accross the synchronize_irq() call, the
handler cannot mark itself done because it blocks on the bus
lock. That in turn makes synchronize_irq() wait forever on the
threaded handler to complete....
Add the missing unlock of desc->request_mutex in the error path of
__free_irq() and add a bunch of comments to explain the locking and
protection rules.
Fixes: 46e48e2573 ("genirq: Move irq resource handling out of spinlocked region")
Reported-and-tested-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Not-longer-ranted-at-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Pull irq fixes from Thomas Gleixner:
- A few fixes mopping up the fallout of the big irq overhaul
- Move the interrupt resource management logic out of the spin locked,
irq disabled region to avoid unnecessary restrictions of the resource
callbacks
- Preparation for reworking the per cpu irq request function.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqdomain: Allow ACPI device nodes to be used as irqdomain identifiers
genirq/debugfs: Remove redundant NULL pointer check
genirq: Allow to pass the IRQF_TIMER flag with percpu irq request
genirq/timings: Move free timings out of spinlocked region
genirq: Move irq resource handling out of spinlocked region
genirq: Add mutex to irq desc to serialize request/free_irq()
genirq: Move bus locking into __setup_irq()
genirq: Force inlining of __irq_startup_managed to prevent build failure
genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN
A number of irqchip implementations are (ab)using the irqdomain allocator
by passing a fwnode that is neither a FWNODE_OF or a FWNODE_IRQCHIP.
This is pretty bad, but it also feels pretty crap to force these drivers to
allocate their own irqchip_fwid when they already have a proper fwnode.
Instead, let's teach the irqdomain allocator about ACPI device nodes, and
add some lovely name generation code... Tested on an arm64 D05 system.
Reported-and-tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ma Jun <majun258@huawei.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Link: http://lkml.kernel.org/r/20170707083959.10349-1-marc.zyngier@arm.com
debugfs_remove() can be called with a NULL pointer.
Fixes: 087cdfb662 ("genirq/debugfs: Add proper debugfs interface")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq timings infrastructure tracks when interrupts occur in order to
statistically predict te next interrupt event.
There is no point to track timer interrupts and try to predict them because
the next expiration time is already known. This can be avoided via the
IRQF_TIMER flag which is passed by timer drivers in request_irq(). It marks
the interrupt as timer based which alloes to ignore these interrupts in the
timings code.
Per CPU interrupts which are requested via request_percpu_+irq() have no
flag argument, so marking per cpu timer interrupts is not possible and they
get tracked pointlessly.
Add __request_percpu_irq() as a variant of request_percpu_irq() with a
flags argument and make request_percpu_irq() an inline wrapper passing
flags = 0.
The flag parameter is restricted to IRQF_TIMER as all other IRQF_ flags
make no sense for per cpu interrupts.
The next step is to convert all existing users of request_percpu_irq() and
then remove the wrapper and the underscores.
[ tglx: Massaged changelog ]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: nicolas.pitre@linaro.org
Cc: vincent.guittot@linaro.org
Cc: rafael@kernel.org
Link: http://lkml.kernel.org/r/1499344144-3964-1-git-send-email-daniel.lezcano@linaro.org
Aside of being conceptually wrong, there is also an actual (hard to
trigger and mostly theoretical) problem.
CPU0 CPU1
free_irq(X) interrupt X
spin_lock(desc->lock)
wake irq thread()
spin_unlock(desc->lock)
spin_lock(desc->lock)
remove action()
shutdown_irq()
release_resources() thread_handler()
spin_unlock(desc->lock) access released resources.
synchronize_irq()
Move the release resources invocation after synchronize_irq() so it's
guaranteed that the threaded handler has finished.
Move the resource request call out of the desc->lock held region as well,
so the invocation context is the same for both request and release.
This solves the problems with those functions on RT as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.117028181@linutronix.de