There is no good reason to keep them apart, and this makes using the API
a bit simpler.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
The new name is irq_poll as iopoll is already taken. Better suggestions
welcome.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
KASAN found that our additional element processing scripts drop off
the end of the VPD page into unallocated space. The reason is that
not every element has additional information but our traversal
routines think they do, leading to them expecting far more additional
information than is present. Fix this by adding a gate to the
traversal routine so that it only processes elements that are expected
to have additional information (list is in SES-2 section 6.1.13.1:
Additional Element Status diagnostic page overview)
Reported-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Tested-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This was really dumb - reference counting for the main EDAC sysfs
object. While we could've simply registered it as the first thing in the
module init path and then hand it around to what needs it.
Do that and rip out all the code around it, thus simplifying the whole
handling significantly.
Move the edac_subsys node back to edac_module.c.
Signed-off-by: Borislav Petkov <bp@suse.de>
Since commit 4baadb9e05 ("ARM: shmobile: r8a7778: remove obsolete
setup code"), Renesas R-Car SoCs are only supported in generic DT-only
ARM multi-platform builds. The driver doesn't need to use platform data
anymore, hence remove platform data configuration.
Make gpio_rcar_priv.has_both_edge_trigger a boolean for consistency with
gpio_rcar_info.has_both_edge_trigger.
Move gpio_rcar_priv.irq_parent down while we're at it, to prevent gaps
on 64-bit.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
For adjtimex()'s ADJ_SETOFFSET, make sure the tv_usec value is
sane. We might multiply them later which can cause an overflow
and undefined behavior.
This patch introduces new helper functions to simplify the
checking code and adds comments to clarify
Orginally this patch was by Sasha Levin, but I've basically
rewritten it, so he should get credit for finding the issue
and I should get the blame for any mistakes made since.
Also, credit to Richard Cochran for the phrasing used in the
comment for what is considered valid here.
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
* pci/aspm:
PCI/ASPM: Make sysfs link_state_store() consistent with link_state_show()
* pci/hotplug:
PCI: pciehp: Always protect pciehp_disable_slot() with hotplug mutex
* pci/misc:
x86/PCI: Simplify pci_bios_{read,write}
PCI: Simplify config space size computation
PCI: Limit config space size for Netronome NFP6000 family
PCI: Add Netronome vendor and device IDs
PCI: Support PCIe devices with short cfg_size
x86/PCI: Clarify AMD Fam10h config access restrictions comment
PCI: Print warnings for all invalid expansion ROM headers
PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask
* pci/msi:
PCI/MSI: Remove empty pci_msi_init_pci_dev()
PCI/MSI: Initialize MSI capability for all architectures
Pull rdma fixes from Doug Ledford:
"Most are minor to important fixes.
There is one performance enhancement that I took on the grounds that
failing to check if other processes can run before running what's
intended to be a background, idle-time task is a bug, even though the
primary effect of the fix is to improve performance (and it was a very
simple patch)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/mlx5: Postpone remove_keys under knowledge of coming preemption
IB/mlx4: Use vmalloc for WR buffers when needed
IB/mlx4: Use correct order of variables in log message
iser-target: Remove explicit mlx4 work-around
mlx4: Expose correct max_sge_rd limit
IB/mad: Require CM send method for everything except ClassPortInfo
IB/cma: Add a missing rcu_read_unlock()
IB core: Fix ib_sg_to_pages()
IB/srp: Fix srp_map_sg_fr()
IB/srp: Fix indirect data buffer rkey endianness
IB/srp: Initialize dma_length in srp_map_idb
IB/srp: Fix possible send queue overflow
IB/srp: Fix a memory leak
IB/sa: Put netlink request into the request list before sending
IB/iser: use sector_div instead of do_div
IB/core: use RCU for uverbs id lookup
IB/qib: Minor fixes to qib per SFF 8636
IB/core: Fix user mode post wr corruption
IB/qib: Fix qib_mr structure
OPP bindings (for few properties) allow a platform to choose a
value/range among a set of available options. The options are present as
opp-<prop>-<name>, where the platform needs to supply the <name> string.
The OPP properties which allow such an option are: opp-microvolt and
opp-microamp.
Add support to the OPP-core to parse these bindings, by introducing
dev_pm_opp_{set|put}_prop_name() APIs.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
OPP bindings allow a platform to enable OPPs based on the version of the
hardware they are used for.
Add support to the OPP-core to parse these bindings, by introducing
dev_pm_opp_{set|put}_supported_hw() APIs.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge "Reset controller changes for v4.5" from Philipp Zabel:
- oftree support for getting reset devices by index
- fixed return value consistency of of_reset_control_get
- added support for STi co-processor resets
- added STi status callback
- various fixes
* tag 'reset-for-4.5' of git://git.pengutronix.de/git/pza/linux:
reset: check return value of reset_controller_register()
reset: remove redundant $(CONFIG_RESET_CONTROLLER) from Makefile
reset: use ENOTSUPP instead of ENOSYS
reset: sunxi: mark the of_device_id array as __initconst
reset: sti: add a missing blank line after declaration
reset: sti: Provide ops .status() call-back
reset: sti: Add support for resetting co-processors
ARM: STi: Add DT defines for co-processor reset lines
reset: Fix of_reset_control_get() for consistent return values
reset: add of_reset_control_get_by_index()
Pass the net pointer to the call_batch callback functions so we can skip
recurrent lookups.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
As of commit 4baadb9e05 ("ARM: shmobile: r8a7778: remove obsolete
setup code"), the Renesas R-Car HPB-DMAC driver is no longer used.
In theory it could still be used on R-Car Gen1 SoCs, but that requires
adding DT support to the driver, which is not planned.
Remove the driver, it can be resurrected from git history when needed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This change makes the DT file to be easier to read since the memcpy
channels array does not need the '/bits/ 16' to be specified, which might
confuse some people.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
On dm814x we have some clocks at DPLLS and some at PRCM. Let's add a new
omap_prcm_init_data entry for the DPLLS so we can initalize timer clocks
early.
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Pull VFIO fixes from Alex Williamson:
- Various fixes for removing redundancy, const'ifying structs, avoiding
stack usage, fixing WARN usage (Krzysztof Kozlowski, Julia Lawall,
Kees Cook, Dan Carpenter)
- Revert No-IOMMU mode as the intended user has not emerged (Alex
Williamson)
* tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfio:
Revert: "vfio: Include No-IOMMU mode"
vfio: fix a warning message
vfio: platform: remove needless stack usage
vfio-pci: constify pci_error_handlers structures
vfio: Drop owner assignment from platform_driver
Pull DT fixes from Rob Herring:
"I think this should be all for 4.4:
- Fix incorrect warning about overlapping memory regions
- Export of_irq_find_parent again which was made static in 4.4, but
has users pending for 4.5.
- Fix of_msi_map_rid declaration location
- Fix re-entrancy for of_fdt_unflatten_tree
- Clean-up of phys_addr_t printks"
* tag 'devicetree-fixes-for-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of/irq: move of_msi_map_rid declaration to the correct ifdef section
of/irq: Export of_irq_find_parent again
of/fdt: Add mutex protection for calls to __unflatten_device_tree()
of/address: fix typo in comment block of of_translate_one()
of: do not use 0x in front of %pa
of: Fix comparison of reserved memory regions
If partition parsers need to clean up their resources, we shouldn't
assume that all memory will fit in a single kmalloc() that the caller
can kfree(). We should allow the parser to provide a proper cleanup
routine.
Note that this means we need to keep a hold on the parser's module for a
bit longer, and release it later with mtd_part_parser_put().
Alongside this, define a default callback that we'll automatically use
if the parser doesn't provide one, so we can still retain the old
behavior.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
sock_cgroup_data is a struct containing an anonymous union.
sock_cgroup_set_prioidx() and sock_cgroup_set_classid() were
initializing a field inside the anonymous union as follows.
struct sock_ccgroup_data skcd_buf = { .val = VAL };
While this is fine on more recent compilers, gcc-4.4.7 triggers the
following errors.
include/linux/cgroup-defs.h: In function ‘sock_cgroup_set_prioidx’:
include/linux/cgroup-defs.h:619: error: unknown field ‘val’ specified in initializer
include/linux/cgroup-defs.h:619: warning: missing braces around initializer
include/linux/cgroup-defs.h:619: warning: (near initialization for ‘skcd_buf.<anonymous>’)
This is because .val belongs to the anonymous union nested inside the
struct but the initializer is missing the nesting. Fix it by adding
an extra pair of braces.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alaa Hleihel <alaa@dev.mellanox.co.il>
Fixes: bd1060a1d6 ("sock, cgroup: add sock->sk_cgroup")
Signed-off-by: David S. Miller <davem@davemloft.net>
ROL on a 32 bit integer with a shift of 32 or more is undefined and the
result is arch-dependent. Avoid this by handling the trivial case of
roling by 0 correctly.
The trivial solution of checking if shift is 0 breaks gcc's detection
of this code as a ROL instruction, which is unacceptable.
This bug was reported and fixed in GCC
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157):
The standard rotate idiom,
(x << n) | (x >> (32 - n))
is recognized by gcc (for concreteness, I discuss only the case that x
is an uint32_t here).
However, this is portable C only for n in the range 0 < n < 32. For n
== 0, we get x >> 32 which gives undefined behaviour according to the
C standard (6.5.7, Bitwise shift operators). To portably support n ==
0, one has to write the rotate as something like
(x << n) | (x >> ((-n) & 31))
And this is apparently not recognized by gcc.
Note that this is broken on older GCCs and will result in slower ROL.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For some of the core partitioning code, it helps to keep info about the
parsed partition (and who parsed them) together in one place.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
We only want to modify these arrays inside the parser "drivers", so the
drivers should construct them however they like, then return them as
immutable arrays.
This will make other refactorings easier.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
This introduces the MEMBLOCK_NOMAP attribute and the required plumbing
to make it usable as an indicator that some parts of normal memory
should not be covered by the kernel direct mapping. It is up to the
arch to actually honor the attribute when laying out this mapping,
but the memblock code itself is modified to disregard these regions
for allocations and other general use.
Cc: linux-mm@kvack.org
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In checking fixes for of_irq_find_parent declaration location, I found
that of_msi_map_rid is also wrong. of_msi_map_rid is not implemented for
Sparc, so it should not be in the Sparc specific section of the header.
Move it to just depend on OF_IRQ.
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
of_irq_find_parent was made static since it had no users outside of
of_irq.c. Export it again since we are going to use it again.
Signed-off-by: Carlo Caione <carlo@endlessm.com>
[robh: move of_irq_find_parent to correct ifdef section]
Signed-off-by: Rob Herring <robh@kernel.org>
based upon the corresponding patch from Neil's March patchset,
again with kmap-related horrors removed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
new method: ->get_link(); replacement of ->follow_link(). The differences
are:
* inode and dentry are passed separately
* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.
It's a flagday change - the old method is gone, all in-tree instances
converted. Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode. That'll change
in the next commits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
kmap() in page_follow_link_light() needed to go - allowing to hold
an arbitrary number of kmaps for long is a great way to deadlocking
the system.
new helper (inode_nohighmem(inode)) needs to be used for pagecache
symlinks inodes; done for all in-tree cases. page_follow_link_light()
instrumented to yell about anything missed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In cgroup v1, dealing with cgroup membership was difficult because the
number of membership associations was unbound. As a result, cgroup v1
grew several controllers whose primary purpose is either tagging
membership or pull in configuration knobs from other subsystems so
that cgroup membership test can be avoided.
net_cls and net_prio controllers are examples of the latter. They
allow configuring network-specific attributes from cgroup side so that
network subsystem can avoid testing cgroup membership; unfortunately,
these are not only cumbersome but also problematic.
Both net_cls and net_prio aren't properly hierarchical. Both inherit
configuration from the parent on creation but there's no interaction
afterwards. An ancestor doesn't restrict the behavior in its subtree
in anyway and configuration changes aren't propagated downwards.
Especially when combined with cgroup delegation, this is problematic
because delegatees can mess up whatever network configuration
implemented at the system level. net_prio would allow the delegatees
to set whatever priority value regardless of CAP_NET_ADMIN and net_cls
the same for classid.
While it is possible to solve these issues from controller side by
implementing hierarchical allowable ranges in both controllers, it
would involve quite a bit of complexity in the controllers and further
obfuscate network configuration as it becomes even more difficult to
tell what's actually being configured looking from the network side.
While not much can be done for v1 at this point, as membership
handling is sane on cgroup v2, it'd be better to make cgroup matching
behave like other network matches and classifiers than introducing
further complications.
In preparation, this patch updates sock->sk_cgrp_data handling so that
it points to the v2 cgroup that sock was created in until either
net_prio or net_cls is used. Once either of the two is used,
sock->sk_cgrp_data reverts to its previous role of carrying prioidx
and classid. This is to avoid adding yet another cgroup related field
to struct sock.
As the mode switching can happen at most once per boot, the switching
mechanism is aimed at lowering hot path overhead. It may leak a
finite, likely small, number of cgroup refs and report spurious
prioidx or classid on switching; however, dynamic updates of prioidx
and classid have always been racy and lossy - socks between creation
and fd installation are never updated, config changes don't update
existing sockets at all, and prioidx may index with dead and recycled
cgroup IDs. Non-critical inaccuracies from small race windows won't
make any noticeable difference.
This patch doesn't make use of the pointer yet. The following patch
will implement netfilter match for cgroup2 membership.
v2: Use sock_cgroup_data to avoid inflating struct sock w/ another
cgroup specific field.
v3: Add comments explaining why sock_data_prioidx() and
sock_data_classid() use different fallback values.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce sock->sk_cgrp_data which is a struct sock_cgroup_data.
->sk_cgroup_prioidx and ->sk_classid are moved into it. The struct
and its accessors are defined in cgroup-defs.h. This is to prepare
for overloading the fields with a cgroup pointer.
This patch mostly performs equivalent conversions but the followings
are noteworthy.
* Equality test before updating classid is removed from
sock_update_classid(). This shouldn't make any noticeable
difference and a similar test will be implemented on the helper side
later.
* sock_update_netprioidx() now takes struct sock_cgroup_data and can
be moved to netprio_cgroup.h without causing include dependency
loop. Moved.
* The dummy version of sock_update_netprioidx() converted to a static
inline function while at it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 0d76d6e8b2 and merge
commit c402293bd7, reversing changes made
to c89359a42e.
The virtio-vsock device specification is not finalized yet. Michael
Tsirkin voiced concerned about merging this code when the hardware
interface (and possibly the userspace interface) could still change.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The users of BUS_NOTIFY_BIND_DRIVER have no chance to do any cleanup in case of
a probe failure. In the result there might be problems, such as some resources
that had been allocated will continue to be allocated and therefore lead to a
resource leak.
Introduce a new notification to inform the subscriber that ->probe() failed. Do
the same in case of failed device_bind_driver() call.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull cgroup fixes from Tejun Heo:
"More change than I'd have liked at this stage. The pids controller
and the changes made to cgroup core to support it introduced and
revealed several important issues.
- Assigning membership to a newly created task and migrating it can
race leading to incorrect accounting. Oleg fixed it by widening
threadgroup synchronization. It looks like we'll be able to merge
it with a different percpu rwsem which is used in fork path making
things simpler and cheaper.
- The recent change to extend cgroup membership to zombies (so that
pid accounting can extend till the pid is actually released) missed
pinning the underlying data structures leading to use-after-free.
Fixed.
- v2 hierarchy was calling subsystem callbacks with the wrong target
cgroup_subsys_state based on the incorrect assumption that they
share the same target. pids is the first controller affected by
this. Subsys callbacks updated so that they can deal with
multi-target migrations"
* 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup_pids: don't account for the root cgroup
cgroup: fix handling of multi-destination migration from subtree_control enabling
cgroup_freezer: simplify propagation of CGROUP_FROZEN clearing in freezer_attach()
cgroup: pids: kill pids_fork(), simplify pids_can_fork() and pids_cancel_fork()
cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate()
cgroup: make css_set pin its css's to avoid use-afer-free
cgroup: fix cftype->file_offset handling
Pull libata fixes from Tejun Heo:
"Nothing too interesting. All are device specific additions and
workarounds"
* 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads
libata-eh.c: Introduce new ata port flag for controller which lockup on read log page
sata_sil: disable trim
AHCI: Fix softreset failed issue of Port Multiplier
sata/mvebu: use #ifdef around suspend/resume code
ahci: Order SATA device IDs for codename Lewisburg
ahci: Add Device ID for Intel Sunrise Point PCH
Pull perf fixes from Ingo Molnar:
"This tree includes four core perf fixes for misc bugs, three fixes to
x86 PMU drivers, and two updates to old email addresses"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Do not send exit event twice
perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro
perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell
perf: Fix PERF_EVENT_IOC_PERIOD deadlock
treewide: Remove old email address
perf/x86: Fix LBR call stack save/restore
perf: Update email address in MAINTAINERS
perf/core: Robustify the perf_cgroup_from_task() RCU checks
perf/core: Fix RCU problem with cgroup context switching code
Currently all NAND controller drivers are providing both the mtd_info and
nand_chip struct and then let the NAND subsystem to initialize a few
things before registering the mtd instance to the MTD layer.
Embed an mtd_info field into nand_chip to add some consistency to all NAND
controller drivers.
This change will also help factorizing boilerplate code copied in all NAND
drivers.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Workqueue stalls can happen from a variety of usage bugs such as
missing WQ_MEM_RECLAIM flag or concurrency managed work item
indefinitely staying RUNNING. These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.
To alleviate the situation, this patch implements workqueue lockup
detector. It periodically monitors all worker_pools periodically and,
if any pool failed to make forward progress longer than the threshold
duration, triggers warning and dumps workqueue state as follows.
BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
workqueue events_power_efficient: flags=0x80
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
pending: check_lifetime, neigh_periodic_work
workqueue cgroup_pidlist_destroy: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
pending: cgroup_pidlist_destroy_work_fn
...
The detection mechanism is controller through kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.
v2: Decoupled from softlockup control knobs.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
touch_softlockup_watchdog() is used to tell watchdog that scheduler
stall is expected. One group of usage is from paths where the task
may not be able to yield for a long time such as performing slow PIO
to finicky device and coming out of suspend. The other is to account
for scheduler and timer going idle.
For scheduler softlockup detection, there's no reason to distinguish
the two cases; however, workqueue lockup detector is planned and it
can use the same signals from the former group while the latter would
spuriously prevent detection. This patch introduces a new function
touch_softlockup_watchdog_sched() and convert the latter group to call
it instead. For now, it just calls touch_softlockup_watchdog() and
there's no functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>