Introduce API functions to restart and cancel tty buffer work, rather
than manipulate buffer work directly.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The job_control() check in n_tty_read() has nearly identical purpose
and results as tty_check_change(). Both functions' purpose is to
determine if the current task's pgrp is the foreground pgrp for the tty,
and if not, to signal the current pgrp.
Introduce __tty_check_change() which takes the signal to send
and performs the shared operations for job control() and
tty_check_change().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The tty lock is strictly for serializing tty lifetime events
(open/close/hangup), and not for line discipline serialization.
The tty core already provides serialization of concurrent writes
to the same tty, and line discipline lifetime management (by ldisc
references), so pinning the tty via tty_lock() is unnecessary and
counter-productive; remove tty lock use.
However, the line discipline is responsible for serializing reads
(if required by the line discipline); add read_lock mutex to
serialize calls of r3964_read().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The tty core provides read_wait waitqueue specifically for line
disciplines to wait readers; otherwise, the line discipline may
miss wakeups generated by the tty core.
NB: The tty core already provides serialization for the line discipline's
close() method, and guarantees no readers or writers will be using the
closing instance of the line discipline. Completely remove that wakeup.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the removal of tty_wait_until_sent_from_close(), tty drivers
no longer wait during open for parallel closes to complete (instead,
the tty core waits before calling the driver open() method). Thus,
the close_wait waitqueue is no longer used for waiting.
Remove struct tty_port::close_wait.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tty_wait_until_sent_from_close() drops the tty lock while waiting
for the tty driver to finish sending previously accepted data (ie.,
data remaining in its write buffer and transmit fifo).
tty_wait_until_sent_from_close() was added by commit a57a7bf3fc
("TTY: define tty_wait_until_sent_from_close") to prevent the entire
tty subsystem from being unable to open new ttys while waiting for
one tty to close while output drained.
However, since commit 0911261d4c ("tty: Don't take tty_mutex for tty
count changes"), holding a tty lock while closing does not prevent other
ttys from being opened/closed/hung up, but only prevents lifetime event
changes for the tty under lock.
Holding the tty lock while waiting for output to drain does prevent
parallel non-blocking opens (O_NONBLOCK) from advancing or returning
while the tty lock is held. However, all parallel opens _already_
block even if the tty lock is dropped while closing and the parallel
open advances. Blocking in open has been in mainline since at least 2.6.29
(see tty_port_block_til_ready(); note the test for O_NONBLOCK is _after_
the wait while ASYNC_CLOSING).
IOW, before this patch a non-blocking open will sleep anyway for the
_entire_ duration of a parallel hardware shutdown, and when it wakes, the
error return will cause a release of its tty, and it will restart with
a fresh attempt to open. Similarly with a blocking open that is already
waiting; when it's woken, the hardware shutdown has already completed
to ASYNC_INITIALIZED is not set, which forces a release and restart as
well.
So, holding the tty lock across the _entire_ close (which is what this
patch does), even while waiting for output to drain, is equivalent to
the current outcome wrt parallel opens.
Cc: Alan Cox <alan@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros. Furthermore, clean out the
places where we open-coded feature tests.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Instead of overloading EIO for CRC errors and corrupt structures,
return the same error codes that XFS returns for the same issues.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This merge resolves conflicts with 75aec9df3a ("bridge: Remove
br_nf_push_frag_xmit_sk") as part of Eric Biederman's effort to improve
netns support in the network stack that reached upstream via David's
net-next tree.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Conflicts:
net/bridge/br_netfilter_hooks.c
Chanwoo writes:
Update extcon for 4.4
Detailed description for patchset:
1. Update the extcon core:
- Modify the unique identification and name of each external connector
with the additional prefix to clarify both attribute and meaning of
external connector as following:
: EXTCON_CHG_* mean the charger connector.
: EXTCON_JACK_* mean the jack connector.
: EXTCON_DISP_* mean the display port connector.
- Keep the standard name of USB charging port by refering to the
"Battery Charging v1.2 Spec and Adopters Agreement"[1] to use
the standard name of USB charging port as following:
: EXTCON_CHG_USB_SDP /* Standard Downstream Port */
: EXTCON_CHG_USB_DCP /* Dedicated Charging Port */
: EXTCON_CHG_USB_CDP /* Charging Downstream Port */
: EXTCON_CHG_USB_ACA /* Accessory Charger Adapter */
[1] www.usb.org/developers/docs/devclass_docs/BCv1.2_070312.zip
2. Update the extcon-arizona.c driver:
- Support the WM8998 and WM1814 codec for jack detection.
- Support for the ADC mode microphone detection and the general
purpose switch for pop suppression.
- Fix bug include fixing the headphone detection accuracy
at the top end of the range and some corrections around
the use of the microphone clamps.
3. Update the extcon-gpio.c driver:
- Clean-up the extcon-gpio driver and fix minor issue before
supporting the Device tree binding of it.
4. Clean-up and fix the minor issue for extcon drivers:
- Export OF module alias information for extcon-rt8973a.c and extcon-sm5502.c.
- Fix wrong type of variable of for extcon-rt8973a.c and extcon-sm5502.c.
- Use resource managed API for extcon-axp288.c.
Some encoders have both outputs low in stable states, others also have
a stable state with both outputs high (half-period mode) and some have
a stable state in all steps (quarter-period mode). The driver used to
support the former states and with this change it can also support the
later.
This commit also deprecates the 'half-period' property and introduces
a new property 'steps-per-period'. This property specifies the
number of steps (stable states) produced by the rotary encoder
for each GPIO period.
Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar>
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The flag matches the DT GPIO_SINGLE_ENDED flag and allows drivers to
parse and use the DT flag to handle single-ended (open-drain or
open-source) GPIOs.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Introduce common interface acpi_pci_root_create() and related data
structures to create PCI root bus for ACPI PCI host bridges. It will
be used to kill duplicated arch specific code for IA64 and x86. It may
also help ARM64 in future.
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Provide generic request/free implementations that pinctrl aware gpio
drivers can use instead of open coding if they use a 1:1 pin to gpio
signal mapping.
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
When calling __clk_get_name() on a const clock:
warning: passing argument 1 of '__clk_get_name' discards 'const' qualifier from pointer target type
include/linux/clk-provider.h:613:13: note: expected 'struct clk *' but argument is of type 'const struct clk *'
__clk_get_name() does not modify the passed clock, hence make it const.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Many voltage Regulators need a input voltage that is higher than the
output voltage. Allow to specify a minimum dropout voltage which will
be used later to find the best input voltage for regulators.
[Changed uv to uV for consistency and legibility -- broonie]
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
A recent change to the dst_output handling caused a new warning
when the call to NF_HOOK() is the only used of a local variable
passed as 'dev', and CONFIG_NETFILTER is disabled:
net/ipv6/ip6_output.c: In function 'ip6_output':
net/ipv6/ip6_output.c:135:21: warning: unused variable 'dev' [-Wunused-variable]
The reason for this is that the NF_HOOK macro in this case does
not reference the variable at all, and the call to dev_net(dev)
got removed from the ip6_output function. To avoid that warning now
and in the future, this changes the macro into an equivalent
inline function, which tells the compiler that the variable is
passed correctly but still unused.
The dn_forward function apparently had the same problem in
the past and added a local workaround that no longer works
with the inline function. In order to avoid a regression, we
have to also remove the #ifdef from decnet in the same patch.
Fixes: ede2059dba ("dst: Pass net into dst->output")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
since commit 8405a8fff3 ("netfilter: nf_qeueue: Drop queue entries on
nf_unregister_hook") all pending queued entries are discarded.
So we can simply remove all of the owner handling -- when module is
removed it also needs to unregister all its hooks.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This newly introduced netdevice notifier is called before actual change
upper happens. That provides a possibility for notifier handlers to
know upper change will happen and react to it, including possibility to
forbid the change. That is valuable for drivers which can check if the
upper device linkage is supported and forbid that in case it is not.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far, we've always considered that for a given PCI device, its
MSI controller was either set by the architecture-specific
pcibios hook, or simply inherited from the host bridge.
This doesn't cover things like firmware-defined topologies like
msi-map (DT) or IORT (ACPI), which can provide information about
which MSI controller to use on a per-device basis.
This patch adds the necessary hook into the MSI code to allow this
feature, and provides the msi-map functionnality as a first
implementation.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
While msi-parent is used to point to the MSI controller that
works for all the devices behind a root complex, it doesn't
allow configurations where each individual device can be routed
to a separate MSI controller.
The msi-map property provides this flexibility (and much more),
so let's add a utility function that parses it, and return the
corresponding MSI domain.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Since 126b16e2ad ("Docs: dt: add generic MSI bindings"),
the definition of "msi-parent" has evolved, while maintaining
some degree of compatibility. It can now express multiple MSI
controllers as parents, as well as some sideband data being
communicated to the controller.
This patch adds the parsing of the property, iterating over
the multiple parents until a suitable irqdomain is found.
It can also fallback to the original parsing if the old
binding is detected.
This support code gets used in the subsequent patches.
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Add pci_msi_domain_get_msi_rid() to return the MSI requester id (RID).
Initially needed by gic-v3 based systems. It will be used by follow on
patch to drivers/irqchip/irq-gic-v3-its-pci-msi.c
Initially supports mapping the RID via OF device tree. In the future,
this could be extended to use ACPI _IORT tables as well.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The device tree property "msi-map" specifies how to create the PCI
requester id used in some MSI controllers. Add a new function
of_msi_map_rid() that finds the msi-map property and applies its
translation to a given requester id.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Allow for arch-specific interrupt types to be set. For that, add
kvm_arch_set_irq() which takes interrupt type-specific action if it
recognizes the interrupt type given, and -EWOULDBLOCK otherwise.
The default implementation always returns -EWOULDBLOCK.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch modifies the id and name of external connector with the
additional prefix to clarify both attribute and meaning of external
connector as following:
- EXTCON_CHG_* mean the charger connector.
- EXTCON_JACK_* mean the jack connector.
- EXTCON_DISP_* mean the display port connector.
Following table show the new name of external connector with old name:
--------------------------------------------------
Old extcon name | New extcon name |
--------------------------------------------------
EXTCON_TA | EXTCON_CHG_USB_DCP |
EXTCON_CHARGE_DOWNSTREAM| EXTCON_CHG_USB_CDP |
EXTCON_FAST_CHARGER | EXTCON_CHG_USB_FAST |
EXTCON_SLOW_CHARGER | EXTCON_CHG_USB_SLOW |
--------------------------------------------------
EXTCON_MICROPHONE | EXTCON_JACK_MICROPHONE |
EXTCON_HEADPHONE | EXTCON_JACK_HEADPHONE |
EXTCON_LINE_IN | EXTCON_JACK_LINE_IN |
EXTCON_LINE_OUT | EXTCON_JACK_LINE_OUT |
EXTCON_VIDEO_IN | EXTCON_JACK_VIDEO_IN |
EXTCON_VIDEO_OUT | EXTCON_JACK_VIDEO_OUT |
EXTCON_SPDIF_IN | EXTCON_JACK_SPDIF_IN |
EXTCON_SPDIF_OUT | EXTCON_JACK_SPDIF_OUT |
--------------------------------------------------
EXTCON_HMDI | EXTCON_DISP_HDMI |
EXTCON_MHL | EXTCON_DISP_MHL |
EXTCON_DVI | EXTCON_DISP_DVI |
EXTCON_VGA | EXTCON_DISP_VGA |
--------------------------------------------------
And, when altering the name of USB charger connector, EXTCON refers to the
"Battery Charging v1.2 Spec and Adopters Agreement"[1] to use the standard
name of USB charging port as following. Following name of USB charging port
are already used in power_supply subsystem. We chan check it on patch[2].
- EXTCON_CHG_USB_SDP /* Standard Downstream Port */
- EXTCON_CHG_USB_DCP /* Dedicated Charging Port */
- EXTCON_CHG_USB_CDP /* Charging Downstream Port */
- EXTCON_CHG_USB_ACA /* Accessory Charger Adapter */
[1] www.usb.org/developers/docs/devclass_docs/BCv1.2_070312.zip
[2] commit 85efc8a18c ("power_supply: Add types for USB chargers")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
[ckeepax: For the Arizona changes]
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Reviewed-by: Roger Quadros <rogerq@ti.com>
Pull "Qualcomm ARM Based SoC Updates for 4.4" from Andy Gross:
* Implement id_table driver matching in SMD
* Avoid NULL pointer exception on remove of SMEM
* Reorder SMEM/SMD configs
* Make qcom_smem_get() return a pointer
* Handle big endian CPUs correctly in SMEM
* Represent SMD channel layout in structures
* Use __iowrite32_copy() in SMD
* Remove use of VLAIs in SMD
* Handle big endian CPUs correctly in SMD/RPM
* Handle big endian CPUs corretly in SMD
* Reject sending SMD packets that are too large
* Fix endianness issue in SCM __qcom_scm_is_call_available
* Add missing prototype for qcom_scm_is_available()
* Correct SMEM items for upper channels
* Use architecture level to build SCM correctly
* Delete unneeded of_node_put in SMD
* Correct active/slep state flagging in SMD/RPM
* Move RPM message ram out of SMEM DT node
* tag 'qcom-soc-for-4.4' of git://codeaurora.org/quic/kernel/agross-msm:
soc: qcom: smem: Move RPM message ram out of smem DT node
soc: qcom: smd-rpm: Correct the active vs sleep state flagging
soc: qcom: smd: delete unneeded of_node_put
firmware: qcom-scm: build for correct architecture level
soc: qcom: smd: Correct SMEM items for upper channels
qcom-scm: add missing prototype for qcom_scm_is_available()
qcom-scm: fix endianess issue in __qcom_scm_is_call_available
soc: qcom: smd: Reject send of too big packets
soc: qcom: smd: Handle big endian CPUs
soc: qcom: smd_rpm: Handle big endian CPUs
soc: qcom: smd: Remove use of VLAIS
soc: qcom: smd: Use __iowrite32_copy() instead of open-coding it
soc: qcom: smd: Represent channel layout in structures
soc: qcom: smem: Handle big endian CPUs
soc: qcom: Make qcom_smem_get() return a pointer
soc: qcom: Reorder SMEM/SMD configs
soc: qcom: smem: Avoid NULL pointer exception on remove
soc: qcom: smd: Implement id_table driver matching
pids controller is completely broken in that it uncharges when a task
exits allowing zombies to escape resource control. With the recent
updates, cgroup core now maintains cgroup association till task free
and pids controller can be fixed by uncharging on free instead of
exit.
This patch adds cgroup_subsys->free() method and update pids
controller to use it instead of ->exit() for uncharging.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
cgroup_exit() is called when a task exits and disassociates the
exiting task from its cgroups and half-attach it to the root cgroup.
This is unnecessary and undesirable.
No controller actually needs an exiting task to be disassociated with
non-root cgroups. Both cpu and perf_event controllers update the
association to the root cgroup from their exit callbacks just to keep
consistent with the cgroup core behavior.
Also, this disassociation makes it difficult to track resources held
by zombies or determine where the zombies came from. Currently, pids
controller is completely broken as it uncharges on exit and zombies
always escape the resource restriction. With cgroup association being
reset on exit, fixing it is pretty painful.
There's no reason to reset cgroup membership on exit. The zombie can
be removed from its css_set so that it doesn't show up on
"cgroup.procs" and thus can't be migrated or interfere with cgroup
removal. It can still pin and point to the css_set so that its cgroup
membership is maintained. This patch makes cgroup core keep zombies
associated with their cgroups at the time of exit.
* Previous patches decoupled populated_cnt tracking from css_set
lifetime, so a dying task can be simply unlinked from its css_set
while pinning and pointing to the css_set. This keeps css_set
association from task side alive while hiding it from "cgroup.procs"
and populated_cnt tracking. The css_set reference is dropped when
the task_struct is freed.
* ->exit() callback no longer needs the css arguments as the
associated css never changes once PF_EXITING is set. Removed.
* cpu and perf_events controllers no longer need ->exit() callbacks.
There's no reason to explicitly switch away on exit. The final
schedule out is enough. The callbacks are removed.
* On traditional hierarchies, nothing changes. "/proc/PID/cgroup"
still reports "/" for all zombies. On the default hierarchy,
"/proc/PID/cgroup" keeps reporting the cgroup that the task belonged
to at the time of exit. If the cgroup gets removed before the task
is reaped, " (deleted)" is appended.
v2: Build brekage due to missing dummy cgroup_free() when
!CONFIG_CGROUP fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
css_set_rwsem is the inner lock protecting css_sets and is accessed
from hot paths such as fork and exit. Internally, it has no reason to
be a rwsem or even mutex. There are no internal blocking operations
while holding it. This was rwsem because css task iteration used to
expose it to external iterator users. As the previous patch updated
css task iteration such that the locking is not leaked to its users,
there's no reason to keep it a rwsem.
This patch converts css_set_rwsem to a spinlock and rename it to
css_set_lock. It uses bh-safe operations as a planned usage needs to
access it from RCU callback context.
Signed-off-by: Tejun Heo <tj@kernel.org>
css_sets are synchronized through css_set_rwsem but the locking scheme
is kinda bizarre. The hot paths - fork and exit - have to write lock
the rwsem making the rw part pointless; furthermore, many readers
already hold cgroup_mutex.
One of the readers is css task iteration. It read locks the rwsem
over the entire duration of iteration. This leads to silly locking
behavior. When cpuset tries to migrate processes of a cgroup to a
different NUMA node, css_set_rwsem is held across the entire migration
attempt which can take a long time locking out forking, exiting and
other cgroup operations.
This patch updates css task iteration so that it locks css_set_rwsem
only while the iterator is being advanced. css task iteration
involves two levels - css_set and task iteration. As css_sets in use
are practically immutable, simply pinning the current one is enough
for resuming iteration afterwards. Task iteration is tricky as tasks
may leave their css_set while iteration is in progress. This is
solved by keeping track of active iterators and advancing them if
their next task leaves its css_set.
v2: put_task_struct() in css_task_iter_next() moved outside
css_set_rwsem. A later patch will add cgroup operations to
task_struct free path which may grab the same lock and this avoids
deadlock possibilities.
css_set_move_task() updated to use list_for_each_entry_safe() when
walking task_iters and advancing them. This is necessary as
advancing an iter may remove it from the list.
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently, cgroup_has_tasks() tests whether the target cgroup has any
css_set linked to it. This works because a css_set's refcnt converges
with the number of tasks linked to it and thus there's no css_set
linked to a cgroup if it doesn't have any live tasks.
To help tracking resource usage of zombie tasks, putting the ref of
css_set will be separated from disassociating the task from the
css_set which means that a cgroup may have css_sets linked to it even
when it doesn't have any live tasks.
This patch replaces cgroup_has_tasks() with cgroup_is_populated()
which tests cgroup->nr_populated instead which locally counts the
number of populated css_sets. Unlike cgroup_has_tasks(),
cgroup_is_populated() is recursive - if any of the descendants is
populated, the cgroup is populated too. While this changes the
meaning of the test, all the existing users are okay with the change.
While at it, replace the open-coded ->populated_cnt test in
cgroup_events_show() with cgroup_is_populated().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Currently, cgroup->nr_populated counts whether the cgroup has any
css_sets linked to it and the number of children which has non-zero
->nr_populated. This works because a css_set's refcnt converges with
the number of tasks linked to it and thus there's no css_set linked to
a cgroup if it doesn't have any live tasks.
To help tracking resource usage of zombie tasks, putting the ref of
css_set will be separated from disassociating the task from the
css_set which means that a cgroup may have css_sets linked to it even
when it doesn't have any live tasks.
This patch updates cgroup->nr_populated so that for the cgroup itself
it counts the number of css_sets which have tasks associated with them
so that empty css_sets don't skew the populated test.
Signed-off-by: Tejun Heo <tj@kernel.org>
Merge "Broadcom soc changes for v4.4 (try 2)" from Florian Fainelli:
This pull request contains the following Broadcom SoC platform and driver changes:
- Brian Norris create a drivers/soc/brcmstb/ stub as a place holder for SoC-specific
code which is coming next
- Florian Fainelli adds support for configuring the BCM7xxx SoCs Bus Interface Unit
with their specific write-pairing setting, which must be saved and restored during
system-wide suspend/resume, and consequently updates the brcmstb machine code to
initialize the BIU
- Jon Mason adds support for the Northstar Plus SoCs by introducing a custom machine
descriptor matching their compatible string and setting up the PL310 L2 cache and
enabling the relevant ARM errata for their Cortex-A9
* tag 'arm-soc/for-4.4/soc' of http://github.com/Broadcom/stblinux:
ARM: brcmstb: Setup BIU control registers during boot
soc: brcmstb: Add Bus Interface Unit control setup
soc: add stubs for brcmstb SoC's
ARM: NSP: Add basic support for Broadcom Northstar Plus SoC
bdi's are initialized in two steps, bdi_init() and bdi_register(), but
destroyed in a single step by bdi_destroy() which, for a bdi embedded
in a request_queue, is called during blk_cleanup_queue() which makes
the queue invisible and starts the draining of remaining usages.
A request_queue's user can access the congestion state of the embedded
bdi as long as it holds a reference to the queue. As such, it may
access the congested state of a queue which finished
blk_cleanup_queue() but hasn't reached blk_release_queue() yet.
Because the congested state was embedded in backing_dev_info which in
turn is embedded in request_queue, accessing the congested state after
bdi_destroy() was called was fine. The bdi was destroyed but the
memory region for the congested state remained accessible till the
queue got released.
a13f35e871 ("writeback: don't embed root bdi_writeback_congested in
bdi_writeback") changed the situation. Now, the root congested state
which is expected to be pinned while request_queue remains accessible
is separately reference counted and the base ref is put during
bdi_destroy(). This means that the root congested state may go away
prematurely while the queue is between bdi_dstroy() and
blk_cleanup_queue(), which was detected by Andrey's KASAN tests.
The root cause of this problem is that bdi doesn't distinguish the two
steps of destruction, unregistration and release, and now the root
congested state actually requires a separate release step. To fix the
issue, this patch separates out bdi_unregister() and bdi_exit() from
bdi_destroy(). bdi_unregister() is called from blk_cleanup_queue()
and bdi_exit() from blk_release_queue(). bdi_destroy() is now just a
simple wrapper calling the two steps back-to-back.
While at it, the prototype of bdi_destroy() is moved right below
bdi_setup_and_register() so that the counterpart operations are
located together.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: a13f35e871 ("writeback: don't embed root bdi_writeback_congested in bdi_writeback")
Cc: stable@vger.kernel.org # v4.2+
Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com>
Link: http://lkml.kernel.org/g/CAAeHK+zUJ74Zn17=rOyxacHU18SgCfC6bsYW=6kCY5GXJBwGfQ@mail.gmail.com
Reviewed-by: Jan Kara <jack@suse.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This is only usable for the static 1:1 mapping of physical memory.
Any access to vmalloc or module regions will require some way of doing
an IOTLB flush. It's theoretically possible to hook into the
tlb_flush_kernel_range() function, but that seems like overkill — most
of the addresses accessed through a kernel PASID *will* be in the 1:1
mapping.
If we really need to allow access to more interesting kernel regions,
then the answer will probably be an explicit IOTLB flush call after use,
akin to the DMA API's unmap function.
In fact, it might be worth introducing that sooner rather than later, and
making it just BUG() if the address isn't in the static 1:1 mapping.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Taking inspiration from the existing arch/arm code, break out some
generic functions to interface the DMA-API to the IOMMU-API. This will
do the bulk of the heavy lifting for IOMMU-backed dma-mapping.
Since associating an IOVA allocator with an IOMMU domain is a fairly
common need, rather than introduce yet another private structure just to
do this for ourselves, extend the top-level struct iommu_domain with the
notion. A simple opaque cookie allows reuse by other IOMMU API users
with their various different incompatible allocator types.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>