Commit Graph

85471 Commits

Author SHA1 Message Date
Michael S. Tsirkin
7d7072e3ba skb_array: resize support
Update skb_array after ptr_ring API changes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 13:58:27 -07:00
Michael S. Tsirkin
5d49de5320 ptr_ring: resize support
This adds ring resize support. Seems to be necessary as
users such as tun allow userspace control over queue size.

If resize is used, this costs us ability to peek at queue without
consumer lock - should not be a big deal as peek and consumer are
usually run on the same CPU.

If ring is made bigger, ring contents is preserved.  If ring is made
smaller, extra pointers are passed to an optional destructor callback.

Cleanup function also gains destructor callback such that
all pointers in queue can be cleaned up.

This changes some APIs but we don't have any users yet,
so it won't break bisect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 13:58:27 -07:00
Michael S. Tsirkin
ad69f35d1d skb_array: array based FIFO for skbs
A simple array based FIFO of pointers.  Intended for net stack so uses
skbs for type safety. Implemented as a set of wrappers around ptr_ring.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 13:58:27 -07:00
Mauro Carvalho Chehab
a087ce704b [media] media-device: dynamically allocate struct media_devnode
struct media_devnode is currently embedded at struct media_device.

While this works fine during normal usage, it leads to a race
condition during devnode unregister. the problem is that drivers
assume that, after calling media_device_unregister(), the struct
that contains media_device can be freed. This is not true, as it
can't be freed until userspace closes all opened /dev/media devnodes.

In other words, if the media devnode is still open, and media_device
gets freed, any call to an ioctl will make the core to try to access
struct media_device, with will cause an use-after-free and even GPF.

Fix this by dynamically allocating the struct media_devnode and only
freeing it when it is safe.

Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-06-15 17:57:24 -03:00
Michael S. Tsirkin
2e0ab8ca83 ptr_ring: array based FIFO for pointers
A simple array based FIFO of pointers.  Intended for net stack which
commonly has a single consumer/producer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 13:57:21 -07:00
Mauro Carvalho Chehab
163f1e93e9 [media] media-devnode: fix namespace mess
Along all media controller code, "mdev" is used to represent
a pointer to struct media_device, and "devnode" for a pointer
to struct media_devnode.

However, inside media-devnode.[ch], "mdev" is used to represent
a pointer to struct media_devnode.

This is very confusing and may lead to development errors.

So, let's change all occurrences at media-devnode.[ch] to
also use "devnode" for such pointers.

This patch doesn't make any functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-06-15 17:56:06 -03:00
WANG Cong
b2313077ed net_sched: make tcf_hash_check() boolean
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 12:43:35 -07:00
David Ahern
9ff7438460 net: vrf: Handle ipv6 multicast and link-local addresses
IPv6 multicast and link-local addresses require special handling by the
VRF driver:
1. Rather than using the VRF device index and full FIB lookups,
   packets to/from these addresses should use direct FIB lookups based on
   the VRF device table.

2. fail sends/receives on a VRF device to/from a multicast address
   (e.g, make ping6 ff02::1%<vrf> fail)

3. move the setting of the flow oif to the first dst lookup and revert
   the change in icmpv6_echo_reply made in ca254490c8 ("net: Add VRF
   support to IPv6 stack"). Linklocal/mcast addresses require use of the
   skb->dev.

With this change connections into and out of a VRF enslaved device work
for multicast and link-local addresses work (icmp, tcp, and udp)
e.g.,

1. packets into VM with VRF config:
    ping6 -c3 fe80::e0:f9ff:fe1c:b974%br1
    ping6 -c3 ff02::1%br1

    ssh -6 fe80::e0:f9ff:fe1c:b974%br1

2. packets going out a VRF enslaved device:
    ping6 -c3 fe80::18f8:83ff:fe4b:7a2e%eth1
    ping6 -c3 ff02::1%eth1
    ssh -6 root@fe80::18f8:83ff:fe4b:7a2e%eth1

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 12:34:34 -07:00
David Ahern
cd2a9e62c8 net: l3mdev: Remove const from flowi6 arg to get_rt6_dst
Allow drivers to pass flow arg to functions where the arg is not const
and allow the driver to make updates as needed (eg., setting oif).

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 12:34:34 -07:00
J. Bruce Fields
39a9beab5a rpc: share one xps between all backchannels
The spec allows backchannels for multiple clients to share the same tcp
connection.  When that happens, we need to use the same xprt for all of
them.  Similarly, we need the same xps.

This fixes list corruption introduced by the multipath code.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Trond Myklebust <trondmy@primarydata.com>
2016-06-15 10:32:25 -04:00
J. Bruce Fields
d50039ea5e nfsd4/rpc: move backchannel create logic into rpc code
Also simplify the logic a bit.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Trond Myklebust <trondmy@primarydata.com>
2016-06-15 10:32:25 -04:00
Pablo Neira Ayuso
8588ac097b netfilter: nf_tables: reject loops from set element jump to chain
Liping Zhang says:

"Users may add such a wrong nft rules successfully, which will cause an
endless jump loop:

  # nft add rule filter test tcp dport vmap {1: jump test}

This is because before we commit, the element in the current anonymous
set is inactive, so osp->walk will skip this element and miss the
validate check."

To resolve this problem, this patch passes the generation mask to the
walk function through the iter container structure depending on the code
path:

1) If we're dumping the elements, then we have to check if the element
   is active in the current generation. Thus, we check for the current
   bit in the genmask.

2) If we're checking for loops, then we have to check if the element is
   active in the next generation, as we're in the middle of a
   transaction. Thus, we check for the next bit in the genmask.

Based on original patch from Liping Zhang.

Reported-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Liping Zhang <liping.zhang@spreadtrum.com>
2016-06-15 12:17:23 +02:00
Tudor Ambarus
5a7de97309 crypto: rsa - return raw integers for the ASN.1 parser
Return the raw key with no other processing so that the caller
can copy it or MPI parse it, etc.

The scope is to have only one ANS.1 parser for all RSA
implementations.

Update the RSA software implementation so that it does
the MPI conversion on top.

Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-15 17:07:54 +08:00
Stephan Mueller
3cfc3b9721 crypto: drbg - use aligned buffers
Hardware cipher implementation may require aligned buffers. All buffers
that potentially are processed with a cipher are now aligned.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-15 17:07:53 +08:00
Stephan Mueller
3559128521 crypto: drbg - use CTR AES instead of ECB AES
The CTR DRBG derives its random data from the CTR that is encrypted with
AES.

This patch now changes the CTR DRBG implementation such that the
CTR AES mode is employed. This allows the use of steamlined CTR AES
implementation such as ctr-aes-aesni.

Unfortunately there are the following subtile changes we need to apply
when using the CTR AES mode:

- the CTR mode increments the counter after the cipher operation, but
  the CTR DRBG requires the increment before the cipher op. Hence, the
  crypto_inc is applied to the counter (drbg->V) once it is
  recalculated.

- the CTR mode wants to encrypt data, but the CTR DRBG is interested in
  the encrypted counter only. The full CTR mode is the XOR of the
  encrypted counter with the plaintext data. To access the encrypted
  counter, the patch uses a NULL data vector as plaintext to be
  "encrypted".

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-15 17:07:53 +08:00
Linus Walleij
61f922db72 gpio: userspace ABI for reading GPIO line events
This adds an ABI for listening to events on GPIO lines.
The mechanism returns an anonymous file handle to a request
to listen to a specific offset on a specific gpiochip.
To fetch the stream of events from the file handle, userspace
simply reads an event.

- Events can be requested with the same flags as ordinary
  handles, i.e. open drain or open source. An ioctl() call
  GPIO_GET_LINEEVENT_IOCTL is issued indicating the desired
  line.

- Events can be requested for falling edge events, rising
  edge events, or both.

- All events are timestamped using the kernel real time
  nanosecond timestamp (the same as is used by IIO).

- The supplied consumer label will appear in "lsgpio"
  listings of the lines, and in /proc/interrupts as the
  mechanism will request an interrupt from the gpio chip.

- Events are not supported on gpiochips that do not serve
  interrupts (no legal .to_irq() call). The event interrupt
  is threaded to avoid any realtime problems.

- It is possible to also directly read the current value
  of the registered GPIO line by issuing the same
  GPIOHANDLE_GET_LINE_VALUES_IOCTL as used by the
  line handles. Setting the value is not supported: we
  do not listen to events on output lines.

This ABI is strongly influenced by Industrial I/O and surpasses
the old sysfs ABI by providing proper precision timestamps,
making it possible to set flags like open drain, and put
consumer names on the GPIO lines.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-15 09:29:17 +02:00
Linus Walleij
d7c51b47ac gpio: userspace ABI for reading/writing GPIO lines
This adds a userspace ABI for reading and writing GPIO lines.
The mechanism returns an anonymous file handle to a request
to read/write n offsets from a gpiochip. This file handle
in turn accepts two ioctl()s: one that reads and one that
writes values to the selected lines.

- Handles can be requested as input/output, active low,
  open drain, open source, however when you issue a request
  for n lines with GPIO_GET_LINEHANDLE_IOCTL, they must all
  have the same flags, i.e. all inputs or all outputs, all
  open drain etc. If a granular control of the flags for
  each line is desired, they need to be requested
  individually, not in a batch.

- The GPIOHANDLE_GET_LINE_VALUES_IOCTL read ioctl() can be
  issued also to output lines to verify that the hardware
  is in the expected state.

- It reads and writes up to GPIOHANDLES_MAX lines at once,
  utilizing the .set_multiple() call in the driver if
  possible, making the call efficient if several lines
  can be written with a single register update.

The limitation of GPIOHANDLES_MAX to 64 lines is done under
the assumption that we may expect hardware that can issue a
transaction updating 64 bits at an instant but unlikely
anything larger than that.

ChangeLog v2->v3:
- Use gpiod_get_value_cansleep() so we support also slowpath
  GPIO drivers.
- Fix up the UAPI docs kerneldoc.
- Allocate the anonymous fd last, so that the release
  function don't get called until that point of something
  fails. After this point, skip the errorpath.
ChangeLog v1->v2:
- Handle ioctl_compat() properly based on a similar patch
  to the other ioctl() handling code.
- Use _IOWR() as we pass pointers both in and out of the
  ioctl()
- Use kmalloc() and kfree() for the linehandled, do not
  try to be fancy with devm_* it doesn't work the way I
  thought.
- Fix const-correctness on the linehandle name field.

Acked-by: Michael Welling <mwelling@ieee.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-15 09:28:50 +02:00
Borislav Petkov
ca5f2b4c4f PM / sleep: Make pm_prepare_console() return void
Nothing is using its return value so change it to return void.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-15 01:26:04 +02:00
Paul E. McKenney
d95f5ba90f torture: Break online and offline functions out of torture_onoff()
This commit breaks torture_online() and torture_offline() out of
torture_onoff() in preparation for allowing waketorture finer-grained
control of its CPU-hotplug workload.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-14 16:02:16 -07:00
Paul E. McKenney
810ce8b5df rcu: Document RCU_NONIDLE() restrictions in comment header
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-14 16:01:42 -07:00
Mauro Carvalho Chehab
dc19ed1571 Update my main e-mails at the Kernel tree
For the third time in three years, I'm changing my e-mail at
Samsung. That's bad, as it may stop communications with me for
a while. So, this time, I'll also the mchehab@kernel.org e-mail,
as it remains stable since ever.

Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-06-14 14:55:18 -03:00
Kees Cook
8112c4f140 seccomp: remove 2-phase API
Since nothing is using the 2-phase API, and it adds more complexity than
benefit, remove it.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>
2016-06-14 10:54:40 -07:00
Andy Lutomirski
2f275de5d1 seccomp: Add a seccomp_data parameter secure_computing()
Currently, if arch code wants to supply seccomp_data directly to
seccomp (which is generally much faster than having seccomp do it
using the syscall_get_xyz() API), it has to use the two-phase
seccomp hooks. Add it to the easy hooks, too.

Cc: linux-arch@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-06-14 10:54:39 -07:00
Olof Johansson
d8d8126f22 Merge tag 'reset-for-4.8-2' of git://git.pengutronix.de/git/pza/linux into next/drivers
Reset controller changes for v4.8, part 2

- add Amlogic Meson SoC Reset Controller driver

* tag 'reset-for-4.8-2' of git://git.pengutronix.de/git/pza/linux:
  dt-bindings: reset: Add bindings for the Meson SoC Reset Controller
  reset: Add support for the Amlogic Meson SoC Reset Controller

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-14 10:30:46 -07:00
Peter Zijlstra
726328d92a locking/spinlock, arch: Update and fix spin_unlock_wait() implementations
This patch updates/fixes all spin_unlock_wait() implementations.

The update is in semantics; where it previously was only a control
dependency, we now upgrade to a full load-acquire to match the
store-release from the spin_unlock() we waited on. This ensures that
when spin_unlock_wait() returns, we're guaranteed to observe the full
critical section we waited on.

This fixes a number of spin_unlock_wait() users that (not
unreasonably) rely on this.

I also fixed a number of ticket lock versions to only wait on the
current lock holder, instead of for a full unlock, as this is
sufficient.

Furthermore; again for ticket locks; I added an smp_rmb() in between
the initial ticket load and the spin loop testing the current value
because I could not convince myself the address dependency is
sufficient, esp. if the loads are of different sizes.

I'm more than happy to remove this smp_rmb() again if people are
certain the address dependency does indeed work as expected.

Note: PPC32 will be fixed independently

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: chris@zankel.net
Cc: cmetcalf@mellanox.com
Cc: davem@davemloft.net
Cc: dhowells@redhat.com
Cc: james.hogan@imgtec.com
Cc: jejb@parisc-linux.org
Cc: linux@armlinux.org.uk
Cc: mpe@ellerman.id.au
Cc: ralf@linux-mips.org
Cc: realmz6@gmail.com
Cc: rkuo@codeaurora.org
Cc: rth@twiddle.net
Cc: schwidefsky@de.ibm.com
Cc: tony.luck@intel.com
Cc: vgupta@synopsys.com
Cc: ysato@users.sourceforge.jp
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:55:15 +02:00
Peter Zijlstra
7cb45c0fe9 locking/barriers: Move smp_cond_load_acquire() to asm-generic/barrier.h
Since all asm/barrier.h should/must include asm-generic/barrier.h the
latter is a good place for generic infrastructure like this.

This also allows archs to override the new smp_acquire__after_ctrl_dep().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:55:14 +02:00
Peter Zijlstra
33ac279677 locking/barriers: Introduce smp_acquire__after_ctrl_dep()
Introduce smp_acquire__after_ctrl_dep(), this construct is not
uncommon, but the lack of this barrier is.

Use it to better express smp_rmb() uses in WRITE_ONCE(), the IPC
semaphore code and the qspinlock code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:55:14 +02:00
Peter Zijlstra
1f03e8d291 locking/barriers: Replace smp_cond_acquire() with smp_cond_load_acquire()
This new form allows using hardware assisted waiting.

Some hardware (ARM64 and x86) allow monitoring an address for changes,
so by providing a pointer we can use this to replace the cpu_relax()
with hardware optimized methods in the future.

Requested-by: Will Deacon <will.deacon@arm.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:54:27 +02:00
David Howells
965475acca KEYS: Strip trailing spaces
Strip some trailing spaces.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-14 10:29:44 +01:00
Ingo Molnar
245050c287 Merge branch 'linus' into locking/core, to pick up fixes before merging new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:17:42 +02:00
Andi Kleen
2c95afc1e8 perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86
The NMI watchdog uses either the fixed cycles or a generic cycles
counter. This causes a lot of conflicts with users of the PMU who want
to run a full group including the cycles fixed counter, for example
the --topdown support recently added to perf stat. The code needs to
fall back to not use groups, which can cause measurement inaccuracy
due to multiplexing errors.

This patch switches the NMI watchdog to use reference cycles
on Intel systems.  This is actually more accurate than cycles,
because cycles can tick faster than the measured CPU Frequency
due to Turbo mode.

The ref cycles always tick at their frequency, or slower when
the system is idling. That means the NMI watchdog can never
expire too early, unlike with cycles.

The reference cycles tick roughly at the frequency of the TSC,
so the same period computation can be used.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1465478079-19993-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:16:59 +02:00
Ingo Molnar
3559ff9650 Merge branch 'linus' into perf/core, to pick up fixes before merging new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:14:34 +02:00
Ingo Molnar
07f9f22087 Merge branch 'sched/urgent' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 11:04:13 +02:00
Lars Ellenberg
1b57e66384 drbd: correctly handle failed crypto_alloc_hash
crypto_alloc_hash returns an ERR_PTR(), not NULL.

Also reset peer_integrity_tfm to NULL, to not call crypto_free_hash()
on an errno in the cleanup path.

Reported-by: Insu Yun <wuninsu@gmail.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-13 21:43:08 -06:00
Fabian Frederick
7e5fec3168 drbd: code cleanups without semantic changes
This contains various cosmetic fixes ranging from simple typos to
const-ifying, and using booleans properly.

Original commit messages from Fabian's patch set:
drbd: debugfs: constify drbd_version_fops
drbd: use seq_put instead of seq_print where possible
drbd: include linux/uaccess.h instead of asm/uaccess.h
drbd: use const char * const for drbd strings
drbd: kerneldoc warning fix in w_e_end_data_req()
drbd: use unsigned for one bit fields
drbd: use bool for peer is_ states
drbd: fix typo
drbd: use | for bitmask combination
drbd: use true/false for bool
drbd: fix drbd_bm_init() comments
drbd: introduce peer state union
drbd: fix maybe_pull_ahead() locking comments
drbd: use bool for growing
drbd: remove redundant declarations
drbd: replace if/BUG by BUG_ON

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-13 21:43:07 -06:00
Lars Ellenberg
505675f96c drbd: allow larger max_discard_sectors
Make sure we have at least 67 (> AL_UPDATES_PER_TRANSACTION)
al-extents available, and allow up to half of that to be
discarded in one bio.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-13 21:43:05 -06:00
Lars Ellenberg
dd4f699da6 drbd: when receiving P_TRIM, zero-out partial unaligned chunks
We can avoid spurious data divergence caused by partially-ignored
discards on certain backends with discard_zeroes_data=0, if we
translate partial unaligned discard requests into explicit zero-out.

The relevant use case is LVM/DM thin.

If on different nodes, DRBD is backed by devices with differing
discard characteristics, discards may lead to data divergence
(old data or garbage left over on one backend, zeroes due to
unmapped areas on the other backend). Online verify would now
potentially report tons of spurious differences.

While probably harmless for most use cases (fstrim on a file system),
DRBD cannot have that, it would violate our promise to upper layers
that our data instances on the nodes are identical.

To be correct and play safe (make sure data is identical on both copies),
we would have to disable discard support, if our local backend (on a
Primary) does not support "discard_zeroes_data=true".

We'd also have to translate discards to explicit zero-out on the
receiving (typically: Secondary) side, unless the receiving side
supports "discard_zeroes_data=true".

Which both would allocate those blocks, instead of unmapping them,
in contrast with expectations.

LVM/DM thin does set discard_zeroes_data=0,
because it silently ignores discards to partial chunks.

We can work around this by checking the alignment first.
For unaligned (wrt. alignment and granularity) or too small discards,
we zero-out the initial (and/or) trailing unaligned partial chunks,
but discard all the aligned full chunks.

At least for LVM/DM thin, the result is effectively "discard_zeroes_data=1".

Arguably it should behave this way internally, by default,
and we'll try to make that happen.

But our workaround is still valid for already deployed setups,
and for other devices that may behave this way.

Setting discard-zeroes-if-aligned=yes will allow DRBD to use
discards, and to announce discard_zeroes_data=true, even on
backends that announce discard_zeroes_data=false.

Setting discard-zeroes-if-aligned=no will cause DRBD to always
fall-back to zero-out on the receiving side, and to not even
announce discard capabilities on the Primary, if the respective
backend announces discard_zeroes_data=false.

We used to ignore the discard_zeroes_data setting completely.
To not break established and expected behaviour, and suddenly
cause fstrim on thin-provisioned LVs to run out-of-space,
instead of freeing up space, the default value is "yes".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-13 21:43:05 -06:00
Philipp Reisner
a5ca66c419 drbd: Introduce new disk config option rs-discard-granularity
As long as the value is 0 the feature is disabled. With setting
it to a positive value, DRBD limits and aligns its resync requests
to the rs-discard-granularity setting. If the sync source detects
all zeros in such a block, the resync target discards the range
on disk.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-13 21:43:04 -06:00
Olof Johansson
735251aeb9 Merge tag 'renesas-rcar-sysc-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/drivers
Renesas ARM Based SoC R-Car SYSC Updates for v4.8

e0c98b9171 soc: renesas: rcar-sysc: Add support for R-Car M3-W power areas
74699228b9 soc: renesas: Add r8a7796 SYSC PM Domain Binding Definitions
2ff1bf77e4 soc: renesas: rcar-sysc: Document r8a7796 support

* tag 'renesas-rcar-sysc-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  soc: renesas: rcar-sysc: Add support for R-Car M3-W power areas
  soc: renesas: Add r8a7796 SYSC PM Domain Binding Definitions
  soc: renesas: rcar-sysc: Document r8a7796 support

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-13 15:39:46 -07:00
Olof Johansson
057b670df0 Merge tag 'renesas-dt-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/dt
Renesas ARM Based SoC DT Updates for v4.8

* Fix W=1 dtc warnings
* Reverence both DMA controllers on R-Car Gen 2 SoCs
* Remove nonexistent thermal sensor clock from r8a7794 SoC
* Correct unit names for cpu nodes on r8a7790 SoC
* Add MMCIF0 to r8a7793 SoC
* RTS/CTS hardware flow control for kzm9g and bockw boards

* tag 'renesas-dt-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: (30 commits)
  ARM: dts: silk: Fix W=1 dtc warnings
  ARM: dts: porter: Fix W=1 dtc warnings
  ARM: dts: marzen: Fix W=1 dtc warnings
  ARM: dts: lager: Fix W=1 dtc warnings
  ARM: dts: kzm9g: Fix W=1 dtc warnings
  ARM: dts: kzm9d: Fix W=1 dtc warnings
  ARM: dts: koelsch: Fix W=1 dtc warnings
  ARM: dts: gose: Fix W=1 dtc warnings
  ARM: dts: genmai: Fix W=1 dtc warnings
  ARM: dts: bockw: Fix W=1 dtc warnings
  ARM: dts: armadillo800eva: Fix W=1 dtc warnings
  ARM: dts: ape6evm: Fix W=1 dtc warnings
  ARM: dts: sh73a0: Fix W=1 dtc warnings
  ARM: dts: r8a7794: Fix W=1 dtc warnings
  ARM: dts: r8a7793: Fix W=1 dtc warnings
  ARM: dts: r8a7791: Fix W=1 dtc warnings
  ARM: dts: r8a7790: Fix W=1 dtc warnings
  ARM: dts: r8a7778: Fix W=1 dtc warnings
  ARM: dts: r8a7740: Fix W=1 dtc warnings
  ARM: dts: r8a73a4: Fix W=1 dtc warnings
  ...

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-13 15:37:25 -07:00
Olof Johansson
95eb940c0e Merge tag 'samsung-dt-odroid-xu-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/dt
Topic branch for adding Exynos 5410 Odroid XU board for v4.8.

This brings support for Hardkernel's Odroid XU board.  It was the first
design with big.LITTLE SoC from Samsung: Exynos5410.  The board is not
very popular.  Newer XU3 and XU4 got more attention.

Board details:
1. Exynos5410 octa-core (A15+A7, however as of now only one cluster is
   enabled),
2. 2 GB DDR3 RAM,
3. PowerVR SGX544MP3 GPU (not enabled in DTS),
4. USB 3.0 Host x 1, USB 3.0 OTG x 1, USB 2.0 Host x 4,
5. HDMI 1.4a, MIPI DSI and Display Port (Display Port not on all of
   revisions though),
6. eMMC 4.5 and microSD slots.

* tag 'samsung-dt-odroid-xu-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: (28 commits)
  ARM: dts: exynos: Add watchdog and Security SubSystem to Exynos5410
  ARM: dts: exynos: Configure PWM, usb3503, PMIC and thermal on Odroid XU board
  ARM: dts: exynos: Add Thermal Management Unit to Exynos5410
  ARM: dts: exynos: Interrupt for USB DWC3-1 differs between Exynos5420 and 5410
  dt-bindings: clock: Add watchdog and SSS clock IDs to Exynos5410
  dt-bindings: clock: Add TMU clock ID to Exynos5410
  ARM: dts: exynos: Add RTC and I2C to Exynos5410
  ARM: dts: exynos: Add I2C, PWM and UART pinctrl to Exynos5410
  ARM: dts: exynos: Move HSI2C nodes to exynos54xx.dtsi
  ARM: dts: exynos: Add initial support for Odroid XU board
  ARM: dts: exynos: Add USB to Exynos5410
  ARM: dts: exynos: Move common Exynos5410/542x/5800 nodes to new DTSI
  ARM: dts: exynos: MCT is not an interrupt controller and extend length of iomap
  ARM: dts: exynos: Enable UART3 on Exynos5410
  ARM: dts: exynos: Include common exynos5 in exynos5410.dtsi
  ARM: dts: exynos: Move Exynos5250 and Exynos5420 nodes under soc
  ARM: dts: exynos: Use phandle to get parent node in exynos5250-snow
  ARM: dts: exynos: Prepare for inclusion of exynos5.dtsi in exynos5410.dtsi
  ARM: dts: exynos: Move common nodes to exynos5.dtsi
  ARM: dts: exynos: Split Odroid XU3 LEDs to separate DTSI
  ...

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-13 14:51:58 -07:00
Rafael J. Wysocki
bb4b9933e2 Merge back earlier cpufreq changes for v4.8. 2016-06-13 23:33:17 +02:00
Jean-Michel Hautbois
0f614d834b i2c: Add generic support passing secondary devices addresses
Some I2C devices have multiple addresses assigned, for example each address
corresponding to a different internal register map page of the device.
So far drivers which need support for this have handled this with a driver
specific and non-generic implementation, e.g. passing the additional address
via platform data.

This patch provides a new helper function called i2c_new_secondary_device()
which is intended to provide a generic way to get the secondary address
as well as instantiate a struct i2c_client for the secondary address.

The function expects a pointer to the primary i2c_client, a name
for the secondary address and an optional default address. The name is used
as a handle to specify which secondary address to get.

The default address is used as a fallback in case no secondary address
was explicitly specified. In case no secondary address and no default
address were specified the function returns NULL.

For now the function only supports look-up of the secondary address
from devicetree, but it can be extended in the future
to for example support board files and/or ACPI.

Signed-off-by: Jean-Michel Hautbois <jean-michel.hautbois@veo-labs.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-06-13 22:32:09 +02:00
Mika Westerberg
9d26d3a8f1 PCI: Put PCIe ports into D3 during suspend
Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered.  Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.

With recent PCIe hardware we can power down the ports to save power given
that we take into account few restrictions:

  - The PCIe port hardware is recent enough, starting from 2015.

  - Devices connected to PCIe ports are effectively in D3cold once the port
    is transitioned to D3 (the config space is not accessible anymore and
    the link may be powered down).

  - Devices behind the PCIe port need to be allowed to transition to D3cold
    and back.  There is a way both drivers and userspace can forbid this.

  - If the device behind the PCIe port is capable of waking the system it
    needs to be able to do so from D3cold.

This patch adds a new flag to struct pci_device called 'bridge_d3'.  This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port.  When system
later on is suspended we only need to check this flag and if it is true
transition the port to D3 otherwise we leave it in D0.

Also provide override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-13 14:57:36 -05:00
Gustavo Padovan
ceb74152c4 drm: make drm_vblank_{get,put}() static
As they are not used anywhere outside drm_irq.c make them static.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1465308482-15104-4-git-send-email-gustavo@padovan.org
2016-06-13 18:37:33 +02:00
Trond Myklebust
f1dc237c60 SUNRPC: Reduce latency when send queue is congested
Use the low latency transport workqueue to process the task that is
next in line on the xprt->sending queue.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-06-13 12:35:51 -04:00
Trond Myklebust
40a5f1b19b SUNRPC: RPC transport queue must be low latency
rpciod can easily get congested due to the long list of queued rpc_tasks.
Having the receive queue wait in turn for those tasks to complete can
therefore be a bottleneck.

Address the problem by separating the workqueues into:
- rpciod: manages rpc_tasks
- xprtiod: manages transport related work.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-06-13 12:35:51 -04:00
Trond Myklebust
42d42a5b0c SUNRPC: Small optimisation of client receive
Do not queue the client receive work if we're still processing.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-06-13 12:35:51 -04:00
Gustavo Padovan
93507d135b drm: remove legacy drm_arm_vblank_event()
We don't have any user of this function anymore, let's remove it.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1465308482-15104-3-git-send-email-gustavo@padovan.org
2016-06-13 18:34:06 +02:00
Gustavo Padovan
db749b7e3d drm: remove legacy drm_send_vblank_event()
We don't have any user of this function anymore, let's remove it.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1465308482-15104-2-git-send-email-gustavo@padovan.org
2016-06-13 18:34:02 +02:00