When using GPIO irqchip helpers to setup irqchip for a gpiolib based
driver, it is not possible to select which GPIOs to add to the IRQ domain.
Instead it just adds all GPIOs which is not always desired. For example
there might be GPIOs that for some reason cannot generated normal
interrupts at all.
To support this we add a flag irq_need_valid_mask to struct gpio_chip. When
this flag is set the core allocates irq_valid_mask that holds one bit for
each GPIO the chip has. By default all bits are set but drivers can
manipulate this using set_bit() and clear_bit() accordingly.
Then when gpiochip_irqchip_add() is called, this mask is checked and all
GPIOs with bit is set are added to the IRQ domain created for the GPIO
chip.
Suggested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Marc Kleine-Budde says:
====================
pull-request: can 2016-09-22
this is a pull request of one patch for the upcoming linux-4.8 release.
The patch by Sergei Miroshnichenko fixes a potential deadlock in the generic
CAN device code that cann occour after a bus-off.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide a nand_cleanup() function to free all nand related resources
without unregistering the mtd device.
This should allow drivers to call mtd_device_unregister() and handle
its return value and still being able to cleanup all nand related
resources.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Daniel Walter <dwalter@sigma-star.at>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The generic NAND DT bindings allows one to tweak the ECC strength and
step size to their need. It can be used to lower the ECC strength to
match a bootloader/firmware config, but might also be used to get a better
reliability.
In the latter case, the user might want to use the maximum ECC strength
without having to explicitly calculate the exact value (this value not
only depends on the OOB size, but also on the NAND controller, and can
be tricky to extract).
Add a generic 'nand-ecc-maximize' DT property and the associated
NAND_ECC_MAXIMIZE flag, to let ECC controller drivers select the best
ECC strength and step-size on their own.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Rob Herring <robh@kernel.org>
The NAND framework provides several helpers to query timing modes supported
by a NAND chip, but this implies that all NAND controller drivers have
to implement the same timings selection dance. Also currently NAND
devices can be resetted at arbitrary places which also resets the timing
for ONFI chips to timing mode 0.
Provide a common logic to select the best timings based on ONFI or
->onfi_timing_mode_default information. Hook this into nand_reset()
to make sure the new timing is applied each time during a reset.
NAND controller willing to support timings adjustment should just
implement the ->setup_data_interface() method.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Currently we have no data structure to fully describe a NAND timing.
We only have struct nand_sdr_timings for NAND timings in SDR mode,
but nothing for DDR mode and also no container to store both types
of timing.
This patch adds struct nand_data_interface which stores the timing
type and a union of different timings. This can be used to pass to
drivers in order to configure the timing.
Add kerneldoc for struct nand_sdr_timings while touching it anyway.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
When NAND devices are resetted some initialization may have to be done,
like for example they have to be configured for the timing mode that
shall be used. To get a common place where this initialization can be
implemented create a nand_reset() function. This currently only issues
a NAND_CMD_RESET to the NAND device. The places issuing this command
manually are replaced with a call to nand_reset().
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
'extern' is not necessary for function declarations. To prevent
people from adding the keyword to new declarations remove the
existing ones.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
From: Andrey Vagin <avagin@openvz.org>
Each namespace has an owning user namespace and now there is not way
to discover these relationships.
Pid and user namepaces are hierarchical. There is no way to discover
parent-child relationships too.
Why we may want to know relationships between namespaces?
One use would be visualization, in order to understand the running
system. Another would be to answer the question: what capability does
process X have to perform operations on a resource governed by namespace
Y?
One more use-case (which usually called abnormal) is checkpoint/restart.
In CRIU we are going to dump and restore nested namespaces.
There [1] was a discussion about which interface to choose to determing
relationships between namespaces.
Eric suggested to add two ioctl-s [2]:
> Grumble, Grumble. I think this may actually a case for creating ioctls
> for these two cases. Now that random nsfs file descriptors are bind
> mountable the original reason for using proc files is not as pressing.
>
> One ioctl for the user namespace that owns a file descriptor.
> One ioctl for the parent namespace of a namespace file descriptor.
Here is an implementaions of these ioctl-s.
$ man man7/namespaces.7
...
Since Linux 4.X, the following ioctl(2) calls are supported for
namespace file descriptors. The correct syntax is:
fd = ioctl(ns_fd, ioctl_type);
where ioctl_type is one of the following:
NS_GET_USERNS
Returns a file descriptor that refers to an owning user names‐
pace.
NS_GET_PARENT
Returns a file descriptor that refers to a parent namespace.
This ioctl(2) can be used for pid and user namespaces. For
user namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
meaning.
In addition to generic ioctl(2) errors, the following specific ones
can occur:
EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
EPERM The requested namespace is outside of the current namespace
scope.
[1] https://lkml.org/lkml/2016/7/6/158
[2] https://lkml.org/lkml/2016/7/9/101
Changes for v2:
* don't return ENOENT for init_user_ns and init_pid_ns. There is nothing
outside of the init namespace, so we can return EPERM in this case too.
> The fewer special cases the easier the code is to get
> correct, and the easier it is to read. // Eric
Changes for v3:
* rename ns->get_owner() to ns->owner(). get_* usually means that it
grabs a reference.
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: "W. Trevor King" <wking@tremily.us>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Pid and user namepaces are hierarchical. There is no way to discover
parent-child relationships.
In a future we will use this interface to dump and restore nested
namespaces.
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Return -EPERM if an owning user namespace is outside of a process
current user namespace.
v2: In a first version ns_get_owner returned ENOENT for init_user_ns.
This special cases was removed from this version. There is nothing
outside of init_user_ns, so we can return EPERM.
v3: rename ns->get_owner() to ns->owner(). get_* usually means that it
grabs a reference.
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp. Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.
The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off). For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).
However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.
Add a "hotplug_user_indicators" flag to the pci_dev structure. When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.
[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If a driver sets BLK_MQ_F_BLOCKING, it is allowed to block in its
->queue_rq() handler. For that case, blk-mq ensures that we always
calls it from a safe context.
Signed-off-by: Jens Axboe <axboe@fb.com>
Tested-by: Josef Bacik <jbacik@fb.com>
Add a waitqueue head to the client structure. Have clients set a wait
on that queue prior to requesting a lock from the server. If the lock
is blocked, then we can use that to wait for wakeups.
Note that we do need to do this "manually" since we need to set the
wait on the waitqueue prior to requesting the lock, but requesting a
lock can involve activities that can block.
However, only do that for NFSv4.1 locks, either by compiling out
all of the waitqueue handling when CONFIG_NFS_V4_1 is disabled, or
skipping all of it at runtime if we're dealing with v4.0, or v4.1
servers that don't send lock callbacks.
Note too that even when we expect to get a lock callback, RFC5661
section 20.11.4 is pretty clear that we still need to poll for them,
so we do still sleep on a timeout. We do however always poll at the
longest interval in that case.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
[Anna: nfs4_retry_setlk() "status" should default to -ERESTARTSYS]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull media fixes from Mauro Carvalho Chehab:
- several fixes for new drivers added for Kernel 4.8 addition (cec
core, pulse8 cec driver and Mediatek vcodec)
- a regression fix for cx23885 and saa7134 drivers
- an important fix for rcar-fcp, making rcar_fcp_enable() return 0 on
success
* tag 'media/v4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (25 commits)
[media] cx23885/saa7134: assign q->dev to the PCI device
[media] rcar-fcp: Make sure rcar_fcp_enable() returns 0 on success
[media] cec: fix ioctl return code when not registered
[media] cec: don't Feature Abort broadcast msgs when unregistered
[media] vcodec:mediatek: Refine VP8 encoder driver
[media] vcodec:mediatek: Refine H264 encoder driver
[media] vcodec:mediatek: change H264 profile default to profile high
[media] vcodec:mediatek: Add timestamp and timecode copy for V4L2 Encoder
[media] vcodec:mediatek: Fix visible_height larger than coded_height issue in s_fmt_out
[media] vcodec:mediatek: Fix fops_vcodec_release flow for V4L2 Encoder
[media] vcodec:mediatek:code refine for v4l2 Encoder driver
[media] cec-funcs.h: add missing vendor-specific messages
[media] cec-edid: check for IEEE identifier
[media] pulse8-cec: fix error handling
[media] pulse8-cec: set correct Signal Free Time
[media] mtk-vcodec: add HAS_DMA dependency
[media] cec: ignore messages when log_addr_mask == 0
[media] cec: add item to TODO
[media] cec: set unclaimed addresses to CEC_LOG_ADDR_INVALID
[media] cec: add CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK flag
...
asm-generic is only intended for architecture defaults, and we can simply
kill it off by moving the two defintions directly to <linux/libata.h>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
stop_two_cpus() and stop_cpus() use stop_cpus_lock to avoid the deadlock,
we need to ensure that the stopper functions can't be queued "backwards"
from one another. This doesn't look nice; if we use lglock then we do not
really need stopper->lock, cpu_stop_queue_work() could use lg_local_lock()
under local_irq_save().
OTOH it would be even better to avoid lglock in stop_machine.c and remove
lg_double_lock(). This patch adds "bool stop_cpus_in_progress" set/cleared
by queue_stop_cpus_work(), and changes cpu_stop_queue_two_works() to busy
wait until it is cleared.
queue_stop_cpus_work() sets stop_cpus_in_progress = T lockless, but after
it queues a work on CPU1 it must be visible to stop_two_cpus(CPU1, CPU2)
which checks it under the same lock. And since stop_two_cpus() holds the
2nd lock too, queue_stop_cpus_work() can not clear stop_cpus_in_progress
if it is also going to queue a work on CPU2, it needs to take that 2nd
lock to do this.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20151121181148.GA433@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Provide a down_read()/up_read() variant that keeps preemption disabled
over the whole thing, when possible.
This avoids a needless preemption point for constructs such as:
percpu_down_read(&global_rwsem);
spin_lock(&lock);
...
spin_unlock(&lock);
percpu_up_read(&global_rwsem);
Which perturbs timings. In particular it was found to cure a
performance regression in a follow up patch in fs/locks.c
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
adds PHY-mode "trgmii" as an extension for the operation
mode of the PHY interface for PHY_INTERFACE_MODE_TRGMII.
and adds a variable trgmii inside mtk_mac as the indication
to make the difference between the MAC connected to internal
switch or connected to external PHY by the given configuration
on the board and then to perform the corresponding setup on
TRGMII hardware module.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells says:
====================
rxrpc: Preparation for slow-start algorithm [ver #2]
Here are some patches that prepare for improvements in ACK generation and
for the implementation of the slow-start part of the protocol:
(1) Stop storing the protocol header in the Tx socket buffers, but rather
generate it on the fly. This potentially saves a little space and
makes it easier to alter the header just before transmission (the
flags may get altered and the serial number has to be changed).
(2) Mask off the Tx buffer annotations and add a flag to record which ones
have already been resent.
(3) Track RTT on a per-peer basis for use in future changes. Tracepoints
are added to log this.
(4) Send PING ACKs in response to incoming calls to elicit a PING-RESPONSE
ACK from which RTT data can be calculated. The response also carries
other useful information.
(5) Expedite PING-RESPONSE ACK generation from sendmsg. If we're actively
using sendmsg, this allows us, under some circumstances, to avoid
having to rely on the background work item to run to generate this
ACK.
This requires ktime_sub_ms() to be added.
(6) Set the REQUEST-ACK flag on some DATA packets to elicit ACK-REQUESTED
ACKs from which RTT data can be calculated.
(7) Limit the use of pings and ACK requests for RTT determination.
Changes:
(V2) Don't use the C division operator for 64-bit division. One instance
should use do_div() and the other should be using nsecs_to_jiffies().
The last two patches got transposed, leading to an undefined symbol
in one of them.
Reported-by: kbuild test robot <lkp@intel.com>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
When file permissions are modified via chmod(2) and the user is not in
the owning group or capable of CAP_FSETID, the setgid bit is cleared in
inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file
permissions as well as the new ACL, but doesn't clear the setgid bit in
a similar way; this allows to bypass the check in chmod(2). Fix that.
References: CVE-2016-7097
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
A timer was used to restart after the bus-off state, leading to a
relatively large can_restart() executed in an interrupt context,
which in turn sets up pinctrl. When this happens during system boot,
there is a high probability of grabbing the pinctrl_list_mutex,
which is locked already by the probe() of other device, making the
kernel suspect a deadlock condition [1].
To resolve this issue, the restart_timer is replaced by a delayed
work.
[1] https://github.com/victronenergy/venus/issues/24
Signed-off-by: Sergei Miroshnichenko <sergeimir@emcraft.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Drivers must be ready to accept NULL from ptp_clock_register() if the
PTP clock subsystem is configured out.
This patch documents that and ensures that all drivers cope well
with a NULL return.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Eugenia Emantayev <eugenia@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This exports the functionality of extracting the tag from the payload,
without moving next vlan tag into hw accel tag.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Advanced JIT compilers and translators may want to use
eBPF verifier as a base for parsers or to perform custom
checks and validations.
Add ability for external users to invoke the verifier
and provide callbacks to be invoked for every intruction
checked. For now only add most basic callback for
per-instruction pre-interpretation checks is added. More
advanced users may also like to have per-instruction post
callback and state comparison callback.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move verifier's internal structures to a header file and
prefix their names with bpf_ to avoid potential namespace
conflicts. Those structures will soon be used by external
analyzers.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds hardware offload capability to cls_bpf classifier,
similar to what have been done with U32 and flower.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Optimize RAID6 recovery functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.
AVX512 optimized recovery functions, which is simply based
on recov_avx2.c written by Jim Kukunas
This patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Optimize RAID6 gen_syndrom functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.
AVX512 optimized gen_syndrom functions, which is simply based
on avx2.c written by Yuanhan Liu and sse2.c written by hpa.
The patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
For a host to access an Open-Channel SSD, it has to know its geometry,
so that it writes and reads at the appropriate device bounds.
Currently, the geometry information is kept within the kernel, and not
exported to user-space for consumption. This patch exposes the
configuration through sysfs and enables user-space libraries, such as
liblightnvm, to use the sysfs implementation to get the geometry of an
Open-Channel SSD.
The sysfs entries are stored within the device hierarchy, and can be
found using the "lightnvm" device type.
An example configuration looks like this:
/sys/class/nvme/
└── nvme0n1
├── capabilities: 3
├── device_mode: 1
├── erase_max: 1000000
├── erase_typ: 1000000
├── flash_media_type: 0
├── media_capabilities: 0x00000001
├── media_type: 0
├── multiplane: 0x00010101
├── num_blocks: 1022
├── num_channels: 1
├── num_luns: 4
├── num_pages: 64
├── num_planes: 1
├── page_size: 4096
├── prog_max: 100000
├── prog_typ: 100000
├── read_max: 10000
├── read_typ: 10000
├── sector_oob_size: 0
├── sector_size: 4096
├── media_manager: gennvm
├── ppa_format: 0x380830082808001010102008
├── vendor_opcode: 0
├── max_phys_secs: 64
└── version: 1
Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
LightNVM compatible device drivers does not have a method to expose
LightNVM specific sysfs entries.
To enable LightNVM sysfs entries to be exposed, lightnvm device
drivers require a struct device to attach it to. To allow both the
actual device driver and lightnvm sysfs entries to coexist, the device
driver tracks the lifetime of the nvm_dev structure.
This patch refactors NVMe and null_blk to handle the lifetime of struct
nvm_dev, which eliminates the need for struct gendisk when a lightnvm
compatible device is provided.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>