Commit Graph

57000 Commits

Author SHA1 Message Date
Jonathan Cameron
0464415dd2 staging:iio:in kernel users: Add a data field for channel specific info.
Used to allow information about a given channel mapping to be passed
through from board files to the consumer drivers.

Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2012-11-10 10:17:27 +00:00
Jonathan Cameron
84b36ce5f7 staging:iio: Add support for multiple buffers
Route all buffer writes through the demux.
Addition or removal of a buffer results in tear down and
setup of all the buffers for a given device.

Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Tested-by: srinivas pandruvada <srinivas.pandruvada@intel.com>
2012-11-10 10:17:21 +00:00
Henrik Rydberg
800963fd59 Input: document new members of struct input_dev
Fixes kernel-doc warnings for the members added in 3.7-rc1.

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2012-11-10 00:40:24 -08:00
Linus Torvalds
0020dd0b8c Merge tag 'stable/for-linus-3.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
 "There are three ARM compile fixes (we forgot to export certain
  functions, and if the drivers are built as a module we go belly-up).

  There is also a mismatch in the irq_enter() / exit_idle() call sequence,
  which was fixed some time ago in other pieces of code but failed to
  appear in the Xen code.

  Lastly, a fix to help with troubleshooting in the field in case we
  cannot get the appropriate parameter, and also fallback code for
  working with very old hypervisors."

Bug-fixes:
 - Fix compile issues on ARM.
 - Fix hypercall fallback code for old hypervisors.
 - Print out which HVM parameter failed if it fails.
 - Fix idle notifier call after irq_enter.

* tag 'stable/for-linus-3.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/arm: Fix compile errors when drivers are compiled as modules (export more).
  xen/arm: Fix compile errors when drivers are compiled as modules.
  xen/generic: Disable fallback build on ARM.
  xen/events: fix RCU warning, or Call idle notifier after irq_enter()
  xen/hvm: If we fail to fetch an HVM parameter print out which flag it is.
  xen/hypercall: fix hypercall fallback code for very old hypervisors
2012-11-10 06:56:21 +01:00
Donald Dutile
bff73156d3 PCI: Provide method to reduce the number of total VFs supported
Some implementations of SRIOV provide a capability structure
value of TotalVFs that is greater than what the software can support.
Provide a method to reduce the capability structure reported value
to the value the driver can support.
This ensures sysfs reports the current capability of the system,
hardware and software.
Example for its use: igb & ixgbe -- report 8 & 64 as TotalVFs,
but drivers only support 7 & 63 maximum.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 21:37:39 -07:00
Donald Dutile
1789382a72 PCI: SRIOV control and status via sysfs
Provide files under sysfs to determine the maximum number of VFs
an SR-IOV-capable PCIe device supports, and methods to enable and
disable the VFs on a per-device basis.

Currently, VF enablement by SR-IOV-capable PCIe devices is done
via driver-specific module parameters.  If these are not set up in modprobe
files, the admin has to unload and reload the PF driver with the desired
number of VFs.  Additionally, the enablement is system-wide: all
devices controlled by the same driver have the same number of VFs
enabled.  Although the latter is probably desired, there are PCI
configurations set up by the system BIOS that may not allow that.

Two files are created for the PF of PCIe devices with SR-IOV support:

    sriov_totalvfs	Contains the maximum number of VFs the device
			could support as reported by the TotalVFs register
			in the SR-IOV extended capability.

    sriov_numvfs	Contains the number of VFs currently enabled on
			this device as reported by the NumVFs register in
			the SR-IOV extended capability.

			Writing zero to this file disables all VFs.

			Writing a positive number to this file enables that
			number of VFs.

These files are readable for all SR-IOV PF devices.  Writes to the
sriov_numvfs file are effective only if a driver that supports the
sriov_configure() method is attached.
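
The write semantics described above can be sketched in user-space C. This is
only an illustration, not the kernel's implementation: the enum, the function
name, and the rule that a request above sriov_totalvfs is rejected are all
assumptions made for the sketch.

```c
#include <stdio.h>

/* Hypothetical outcome of a write to sriov_numvfs; names are illustrative. */
enum sriov_action {
    SRIOV_DISABLE_ALL,  /* "0" written: disable all VFs */
    SRIOV_ENABLE,       /* positive count written: enable that many VFs */
    SRIOV_REJECT        /* unparsable, negative, or above totalvfs (assumed) */
};

/* Classify the text written to the file; on success store the count. */
enum sriov_action sriov_numvfs_parse(const char *buf, int totalvfs, int *numvfs)
{
    int n;
    char trailing;

    /* Accept exactly one decimal integer, optionally followed by whitespace. */
    if (sscanf(buf, "%d %c", &n, &trailing) != 1)
        return SRIOV_REJECT;
    if (n < 0 || n > totalvfs)
        return SRIOV_REJECT;

    *numvfs = n;
    return n == 0 ? SRIOV_DISABLE_ALL : SRIOV_ENABLE;
}
```

A caller would pass the string written by the admin (for example "4\n") and
the value read from sriov_totalvfs, then act on the returned classification.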

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 20:10:39 -07:00
Jesse Gross
f8f626754e ipv6: Move ipv6_find_hdr() out of Netfilter code.
Open vSwitch will soon also use ipv6_find_hdr() so this moves it
out of Netfilter-specific code into a more common location.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-09 17:05:07 -08:00
Nicolas Dichtel
c075b13098 ip6tnl: advertise tunnel param via rtnl
It is useful for daemons that monitor link events to have the full parameters of
these interfaces when a rtnl message is sent.
It also allows dumping them via rtnetlink.

It is based on what is done for GRE tunnels.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 19:36:20 -05:00
Nicolas Dichtel
0974658da4 ipip: advertise tunnel param via rtnl
It is useful for daemons that monitor link events to have the full parameters of
these interfaces when a rtnl message is sent.
It also allows dumping them via rtnetlink.

It is based on what is done for GRE tunnels.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 19:36:20 -05:00
Andreas Larsson
0bce04be44 of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again
This bug-fix makes sure that of_address_to_resource is defined extern for sparc
so that the sparc-specific implementation of of_address_to_resource() is once
again used when including include/linux/of_address.h in a sparc context. A
number of drivers in mainline rely on this function working for sparc.

The bug was introduced in a850a75544, "of/address:
add empty static inlines for !CONFIG_OF". Contrary to that commit title, the
static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never
defined for sparc. This is good behavior for the other functions in
include/linux/of_address.h, as the extern functions defined in
drivers/of/address.c only get linked when OF_ADDRESS is configured. However,
for of_address_to_resource there exists a sparc-specific implementation in
arch/sparc/kernel/of_device_common.c.

Solution suggested by: Sam Ravnborg <sam@ravnborg.org>

Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 16:30:50 -08:00
Tony Lindgren
edf8dde393 Merge branch 'linus' into omap-for-v3.8/cleanup-headers-prepare-multiplatform-v3
2012-11-09 14:58:01 -08:00
Tony Lindgren
f56f52e02a Merge branch 'omap-for-v3.8/cleanup-headers-prepare-multiplatform-v3' into omap-for-v3.8/dt
Conflicts:
	arch/arm/plat-omap/dmtimer.c

Resolved as suggested by Jon Hunter.
2012-11-09 14:54:17 -08:00
David Howells
c48c8d51c2 Fix the wanxl firmware to include missing constants
Fix the wanxl firmware to include missing constants such as PARITY_NONE.  It
should be #including the linux/hdlc/ioctl.h header.

To make this work, we also have to guard parts of ioctl.h with !__ASSEMBLY__.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 16:28:37 -05:00
David Howells
d77807230e UAPI: (Scripted) Disintegrate include/linux/hdlc
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Acked-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 16:27:51 -05:00
Linus Torvalds
a4275153cc Merge tag 'mmc-fixes-for-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
 - sdhci: fix a NULL dereference at resume-time, seen on OLPC XO-4
 - sdhci: fix against 3.7-rc1 for UHS modes without a vqmmc regulator
 - sdhci-of-esdhc: disable CMD23 on boards where it's broken
 - sdhci-s3c: fix against 3.7-rc1 for card detection with runtime PM
 - dw_mmc, omap_hsmmc: fix potential NULL derefs, compiler warnings

* tag 'mmc-fixes-for-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sdhci-s3c: fix the card detection in runtime-pm
  mmc: sdhci-s3c: use clk_prepare_enable and clk_disable_unprepare
  mmc: dw_mmc: constify dw_mci_idmac_ops in exynos back-end
  mmc: dw_mmc: fix modular build for exynos back-end
  mmc: sdhci: fix NULL dereference in sdhci_request() tuning
  mmc: sdhci: fix IS_ERR() checking of regulator_get()
  mmc: fix sdhci-dove probe/removal
  mmc: sh_mmcif: fix use after free
  mmc: sdhci-pci: fix 'Invalid iomem size' error message condition
  mmc: mxcmmc: Fix MODULE_ALIAS
  mmc: omap_hsmmc: fix NULL pointer dereference for dt boot
  mmc: omap_hsmmc: fix host reference after mmc_free_host
  mmc: dw_mmc: fix multiple drv_data NULL dereferences
  mmc: dw_mmc: enable controller interrupt before calling mmc_start_host
  mmc: sdhci-of-esdhc: disable CMD23 for some Freescale SoCs
  mmc: dw_mmc: remove _dev_info compile warning
  mmc: dw_mmc: convert the variable type of irq
2012-11-09 21:32:33 +01:00
Jingoo Han
b891b4dc1e PCI: Fix bit definitions of PCI_EXP_LNKCAP2 register
According to the PCIe 3.0 spec, PCI_EXP_LNKCAP2_SLS_2_5GB is
1st bit of PCI_EXP_LNKCAP2 register, not 0th bit. So, the bit
definition of supported link speed vector should be fixed.
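
A small sketch of how the corrected layout can be decoded. The macro names
are local stand-ins rather than the kernel header's identifiers, and the 5.0
and 8.0 GT/s bits are assumed to follow the 2.5 GT/s bit directly (bits 2 and
3), with bit 0 unused.

```c
#include <stdint.h>

/* Supported Link Speeds Vector in Link Capabilities 2; bit 0 is not a speed. */
#define LNKCAP2_SLS_2_5GB 0x00000002u  /* bit 1: 2.5 GT/s */
#define LNKCAP2_SLS_5_0GB 0x00000004u  /* bit 2: 5.0 GT/s (assumed position) */
#define LNKCAP2_SLS_8_0GB 0x00000008u  /* bit 3: 8.0 GT/s (assumed position) */

/* Return the highest supported speed in units of 0.1 GT/s, or 0 if none. */
unsigned int lnkcap2_max_speed_tenths(uint32_t lnkcap2)
{
    if (lnkcap2 & LNKCAP2_SLS_8_0GB)
        return 80;
    if (lnkcap2 & LNKCAP2_SLS_5_0GB)
        return 50;
    if (lnkcap2 & LNKCAP2_SLS_2_5GB)
        return 25;
    return 0;
}
```

With the old off-by-one definitions, a register value of 0x02 would not have
been recognised as 2.5 GT/s; with the layout above it is.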

[bhelgaas: change "Current" to "Supported"]
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 11:17:59 -07:00
Tejun Heo
574bd9f7c7 cgroup: implement generic child / descendant walk macros
Currently, cgroup doesn't provide any generic helper for walking a
given cgroup's children or descendants.  This patch adds the following
three macros.

* cgroup_for_each_child() - walk immediate children of a cgroup.

* cgroup_for_each_descendant_pre() - visit all descendants of a cgroup
  in pre-order tree traversal.

* cgroup_for_each_descendant_post() - visit all descendants of a
  cgroup in post-order tree traversal.

All three only require the user to hold RCU read lock during
traversal.  Verifying that each iterated cgroup is online is the
responsibility of the user.  When used with proper synchronization,
cgroup_for_each_descendant_pre() can be used to propagate state
updates to descendants in a reliable way.  See comments for details.
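
The difference between the two descendant orders can be shown on a toy tree.
This is a user-space sketch only: struct node and both functions are
hypothetical, use plain recursion instead of the kernel macros, and do no RCU
locking or online checks.

```c
#include <stddef.h>

/* Toy node: first child plus next sibling; not the kernel's struct cgroup. */
struct node {
    int id;
    struct node *first_child;
    struct node *next_sibling;
};

/* Append the ids of all descendants of root in pre-order (parent first). */
size_t walk_pre(const struct node *root, int *out, size_t n)
{
    for (const struct node *c = root->first_child; c; c = c->next_sibling) {
        out[n++] = c->id;           /* visit the child ... */
        n = walk_pre(c, out, n);    /* ... then its own descendants */
    }
    return n;
}

/* Append the ids of all descendants of root in post-order (children first). */
size_t walk_post(const struct node *root, int *out, size_t n)
{
    for (const struct node *c = root->first_child; c; c = c->next_sibling) {
        n = walk_post(c, out, n);   /* visit the child's descendants ... */
        out[n++] = c->id;           /* ... then the child itself */
    }
    return n;
}
```

Pre-order suits pushing state down from a parent, because every node is seen
after its ancestors; post-order suits teardown, because every node is seen
after its descendants.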

v2: s/config/state/ in commit message and comments per Michal.  More
    documentation on synchronization rules.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Li Zefan <lizefan@huawei.com>
2012-11-09 09:12:29 -08:00
Tejun Heo
eb6fd5040e cgroup: use rculist ops for cgroup->children
Use RCU safe list operations for cgroup->children.  This will be used
to implement cgroup children / descendant walking which can be used by
controllers.

Note that cgroup_create() now puts a new cgroup at the end of the
->children list instead of head.  This isn't strictly necessary but is
done so that the iteration order is more conventional.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
2012-11-09 09:12:29 -08:00
Tejun Heo
a8638030f6 cgroup: add cgroup_subsys->post_create()
Currently, there's no way for a controller to find out whether a new
cgroup finished all ->create() allocations successfully and is
considered "live" by cgroup.

This becomes a problem later when we add generic descendants walking
to cgroup which can be used by controllers, as controllers don't have a
synchronization point where they can synchronize against new cgroups
appearing in such walks.

This patch adds ->post_create().  It's called after all ->create()
succeeded and the cgroup is linked into the generic cgroup hierarchy.
This plays the counterpart of ->pre_destroy().

When used in combination with the to-be-added generic descendant
iterators, ->post_create() can be used to implement reliable state
inheritance.  It will be explained with the descendant iterators.

v2: Added a paragraph about its future use w/ descendant iterators per
    Michal.

v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
    Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>
2012-11-09 09:12:29 -08:00
Johannes Berg
8b2c98243e mac80211: clarify interface iteration and make it configurable
During hardware restart, all interfaces are iterated even
though they haven't been re-added to the driver, document
this behaviour. The same also happens during resume, which
is even more confusing since all of the interfaces were
previously removed from the driver. Make this optional so
drivers relying on the current behaviour can still use it,
but to let drivers that don't want this behaviour disable
it.

Also convert all API users, keeping the old semantics
except in hwsim, where the new normal ones are desired.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-09 17:34:35 +01:00
Johannes Berg
9214ad7f9a mac80211: call driver method when restart completes
When the driver requests a restart (reconfiguration) it
gets all the normal method calls, but can't really tell
why they're happening. Call a new restart_complete op
in the driver when the restart completes, so it could
keep its own state about the restart and clear it there.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-09 17:34:35 +01:00
Philipp Reisner
986836503e Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6
2012-11-09 14:20:23 +01:00
Philipp Reisner
328e0f125b drbd: Broadcast sync progress no more often than once per second
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:43 +01:00
Lars Ellenberg
eb12010e9a drbd: disambiguation, s/ERR_DISCARD/ERR_DISCARD_IMPOSSIBLE/
If for some reason (typically "split-brained" cluster manager)
drbd replica data has diverged, we can choose a victim,
and reconnect using "--discard-my-data", causing the victim
to become sync-target, fetching all changed blocks from the peer.

If we are Primary, we are potentially in use, and we refuse to
"roll back" changes to the data below the page cache and other users.

Rename the error symbol for this to ERR_DISCARD_IMPOSSIBLE.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:50 +01:00
Philipp Marek
3174f8c504 drbd: pass some more information to userspace.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:45 +01:00
Lars Ellenberg
58ffa580a7 drbd: introduce stop-sector to online verify
We now can schedule only a specific range of sectors for online verify,
or interrupt a running verify without interrupting the connection.

Had to bump the protocol version differently, we are now 101.
Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; }
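
Written out as a complete C function, the predicate quoted above looks
roughly like this. The plain int parameter is an assumption for the sketch;
the commit message does not show where the real code takes the protocol
version from.

```c
#include <stdbool.h>

/* Stop-sector is usable from protocol 97 on, except for protocol 100. */
bool verify_can_do_stop_sector(int protocol)
{
    return protocol >= 97 && protocol != 100;
}
```

So a peer speaking protocol 101 qualifies, while one speaking 100 or anything
below 97 does not.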

Also, the return value convention for worker callbacks has changed,
we returned "true/false" for "keep the connection up" in 8.3,
we return 0 for success and <= for failure in 8.4.
Affected: receive_state()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:32 +01:00
Andrew Morton
a80a6b85b4 revert "epoll: support for disabling items, and a self-test app"
Revert commit 03a7beb55b ("epoll: support for disabling items, and a
self-test app") pending resolution of the issues identified by Michael
Kerrisk, copied below.

We'll revisit this for 3.8.

: I've taken a look at this patch as it currently stands in 3.7-rc1, and
: done a bit of testing. (By the way, the test program
: tools/testing/selftests/epoll/test_epoll.c does not compile...)
:
: There are one or two places where the behavior seems a little strange,
: so I have a question or two at the end of this mail. But other than
: that, I want to check my understanding so that the interface can be
: correctly documented.
:
: Just to go through my understanding, the problem is the following
: scenario in a multithreaded application:
:
: 1. Multiple threads are performing epoll_wait() operations,
:    and maintaining a user-space cache that contains information
:    corresponding to each file descriptor being monitored by
:    epoll_wait().
:
: 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
:    a file descriptor from the epoll interest list, and
:    delete the corresponding record from the user-space cache.
:
: 3. The problem with (2) is that some other thread may have
:    previously done an epoll_wait() that retrieved information
:    about the fd in question, and may be in the middle of using
:    information in the cache that relates to that fd. Thus,
:    there is a potential race.
:
: 4. The race can't be solved purely in user space, because doing
:    so would require applying a mutex across the epoll_wait()
:    call, which would of course blow thread concurrency.
:
: Right?
:
: Your solution is the EPOLL_CTL_DISABLE operation. I want to
: confirm my understanding about how to use this flag, since
: the description that has accompanied the patches so far
: has been a bit sparse.
:
: 0. In the scenario you're concerned about, deleting a file
:    descriptor means (safely) doing the following:
:    (a) Deleting the file descriptor from the epoll interest list
:        using EPOLL_CTL_DEL
:    (b) Deleting the corresponding record in the user-space cache
:
: 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
:    conjunction with EPOLLONESHOT.
:
: 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
:    conjunction is a logical error.
:
: 3. The correct way to code multithreaded applications using
:    EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
:
:    a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
:       should EPOLLONESHOT.
:
:    b. When a thread wants to delete a file descriptor, it
:       should do the following:
:
:       [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
:       [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
:           was zero, then the file descriptor can be safely
:           deleted by the thread that made this call.
:       [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
:           then the descriptor is in use. In this case, the calling
:           thread should set a flag in the user-space cache to
:           indicate that the thread that is using the descriptor
:           should perform the deletion operation.
:
: Is all of the above correct?
:
: The implementation depends on checking on whether
: (events & ~EP_PRIVATE_BITS) == 0
: This relies on the fact that EPOLL_CTL_ADD and EPOLL_CTL_MOD always
: set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
: causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
: cleared.
:
: A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
: is only useful in conjunction with EPOLLONESHOT. However, as things
: stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
: not have EPOLLONESHOT set in 'events'. This results in the following
: (slightly surprising) behavior:
:
: (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
:     (the indicator that the file descriptor can be safely deleted).
: (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
:
: This doesn't seem particularly useful, and in fact is probably an
: indication that the user made a logic error: they should only be using
: epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
: EPOLLONESHOT was set in 'events'. If that is correct, then would it
: not make sense to return an error to user space for this case?
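
The deletion steps [1] to [3] in the quoted mail amount to a small decision
on the result of the disable call. Since this commit reverts
EPOLL_CTL_DISABLE, the sketch below does not call epoll_ctl() at all; rc and
err stand for the call's return value and errno, and the enum and function
names are hypothetical.

```c
#include <errno.h>

/* What the deleting thread should do after the disable attempt. */
enum delete_action {
    DELETE_NOW,     /* step [2]: safe to delete the fd and its cache record */
    DEFER_TO_USER,  /* step [3]: flag the cache entry; the thread currently
                       using the descriptor performs the deletion */
    DELETE_ERROR    /* any other failure: not covered by the quoted steps */
};

/* Map the (return value, errno) pair of the disable call to an action. */
enum delete_action after_disable(int rc, int err)
{
    if (rc == 0)
        return DELETE_NOW;
    if (err == EBUSY)
        return DEFER_TO_USER;
    return DELETE_ERROR;
}
```

The surprising case raised in the mail is that, without EPOLLONESHOT, the
first call yields DELETE_NOW and the next one DEFER_TO_USER for the same
descriptor.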

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-09 06:41:46 +01:00
Zheng Liu
992e9fdd7b ext4: add some tracepoints in extent status tree
This patch adds some tracepoints in extent status tree.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-11-08 21:57:33 -05:00
Li RongQing
a4477c4ddb ipv6: remove rt6i_peer_genid from rt6_info and its handler
6431cbc25f (Create a mechanism for upward inetpeer propagation into routes)
introduced this code, but the mechanism is never enabled because
rt6i_peer_genid is always zero, whether it is left unassigned or assigned by
rt6_peer_genid(). After 5943634fc5 (ipv4: Maintain redirect and PMTU info
in struct rtable again), the ipv4-related code for this mechanism was
removed, so I think we may be able to remove the ipv6 code now as well.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-08 21:16:08 -05:00
Zheng Liu
19b303d8b5 ext4: print map->m_flags in trace_ext4_ext/ind_map_blocks_exit
When we use trace_ext4_ext/ind_map_blocks_exit, print the value of
map->m_flags in order that we can understand the extent's current
status.

Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-11-08 14:34:04 -05:00
Zheng Liu
b5645534ce ext4: print 'flags' in ext4_ext_handle_uninitialized_extents
In trace_ext4_ext_handle_uninitialized_extents we don't care about the
value of map->m_flags because this value is probably 0, and we prefer
to get the value of flags because we can know how to handle this
extent in this function.

Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-11-08 14:33:43 -05:00
Dmitry Torokhov
2be975c6d9 Input: introduce managed input devices (add devres support)
There is a demand from driver writers to use the managed devices framework
for their drivers. Unfortunately, up to this moment input devices did not
provide support for managed devices, and that led to mixing two styles
of resource management, which usually introduced more bugs, such as
manually unregistering the input device but relying on devres to free the
interrupt handler, which (unless the device is properly shut off) can cause
the ISR to reference already freed memory.

This change introduces devm_input_allocate_device(), which allocates a
managed instance of an input device so that driver writers who prefer
using the devm_* framework do not have to mix two styles.

Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2012-11-08 09:10:05 -08:00
Behan Webster
fd47d3e1c2 jbd2: remove VLAIS usage from JBD2 code
The use of variable length arrays in structs (VLAIS) in the Linux Kernel code
precludes the use of compilers which don't implement VLAIS (for instance the
Clang compiler). Since ctx is always a 32-bit CRC, hard coding a size of 4
bytes accomplishes the same thing without the use of VLAIS. This is the same
technique already employed in fs/ext4/ext4.h
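
The fixed-size pattern can be sketched in user space as follows. The
fake_shash_desc type is a stand-in for the kernel's struct shash_desc, and
the helper names are hypothetical; only the 4-byte ctx member reflects what
the message describes.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the kernel's struct shash_desc; fields are illustrative. */
struct fake_shash_desc {
    void *tfm;
    unsigned int flags;
};

/* A fixed 4-byte context replaces a variable-length array member (VLAIS). */
struct crc32_desc {
    struct fake_shash_desc shash;
    char ctx[4];  /* a 32-bit CRC always needs exactly 4 bytes */
};

/* Store the running CRC in the context bytes. */
void crc32_desc_seed(struct crc32_desc *desc, uint32_t crc)
{
    memcpy(desc->ctx, &crc, sizeof(desc->ctx));
}

/* Read the running CRC back out of the context bytes. */
uint32_t crc32_desc_value(const struct crc32_desc *desc)
{
    uint32_t crc;

    memcpy(&crc, desc->ctx, sizeof(crc));
    return crc;
}
```

Because the member size is a compile-time constant, the struct is valid
standard C and needs no compiler extension.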

Signed-off-by: Mark Charlebois <charlebm@gmail.com>
Signed-off-by: Behan Webster <behanw@converseincode.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-11-08 11:24:46 -05:00
Eric Sandeen
37be2f59d3 ext4: remove ext4_handle_release_buffer()
ext4_handle_release_buffer() was intended to remove journal
write access from a buffer, but it doesn't actually do anything
at all other than add a BUFFER_TRACE point, but it's not reliably
used for that either.  Remove all the associated dead code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2012-11-08 11:22:46 -05:00
Philipp Reisner
9a51ab1c1b drbd: New disk option al-updates
By disabling al-updates one might increase performance. The price for
that is that in case a crashed primary (that had al-updates disabled)
is reintegrated, it will receive a full resync instead of a bitmap-based
resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:31 +01:00
Andreas Gruenbacher
26ec92871b drbd: Stop using NLA_PUT*().
These macros no longer exist in kernel version v3.5-rc1.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:30 +01:00
Philipp Reisner
d60de03a66 drbd: Load balancing method: striping
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:11 +01:00
Philipp Reisner
380207d08e drbd: Load balancing of read requests
New config option for the disk section "read-balancing", with
the values: prefer-local, prefer-remote, round-robin, when-congested-remote.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:10 +01:00
Philipp Reisner
b80c043327 drbd: The minor_count module parameter is only a hint nowadays
* The max of minor_count is 255
* In drbdadm count the number of minors, instead of finding
  the highest minor number
* No longer use the magic in the init script

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:01 +01:00
Andreas Gruenbacher
0317d9ecbc drbd: Fix the maximum accepted minor device number
The maximum minor device number allowed by the kernel is ((1 << 20) - 1).  Reject
device numbers higher than that to catch possible errors earlier.
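
The parentheses in that bound matter in C, because '-' binds tighter than
'<<'. A minimal sketch, in which the macro and function names are
illustrative and MINOR_BITS is assumed to match the kernel's 20 minor bits:

```c
#define MINOR_BITS 20                        /* assumed: 20 minor bits */
#define MAX_MINOR  ((1u << MINOR_BITS) - 1)  /* intended bound: 1048575 */

/* What the unparenthesized expression would parse as: 1 << (20 - 1). */
#define MISPARSED  (1u << (MINOR_BITS - 1))  /* 524288, not the bound */

/* Accept only minor numbers that fit into MINOR_BITS bits. */
int minor_is_valid(unsigned int minor)
{
    return minor <= MAX_MINOR;
}
```

A check written against MISPARSED would wrongly reject roughly the upper half
of the valid range.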

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:54 +01:00
Andreas Gruenbacher
32bdb64038 drbd: Define scale factors in a single place
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:53 +01:00
Philipp Reisner
65d94927e0 drbd: Changed some defaults
* Enabled the resync controller, with a fill target of 50KiB. That gives
  reasonable resync speeds without tuning. A much better default than
  the 250KiB/s fixed.

* Enable bitmap compression. It is safe to use, and most people have
  more CPU power than network bandwidth.

* ko-count of 7: Abort a connection if the peer fails to process a
  write request within 42 seconds.

* al-extents of 1237: ~5 GiB seems to be a much more sane default
  these days.
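
The two figures in this list can be reproduced with simple arithmetic. The
sketch below rests on two assumptions that the message does not state: a
network timeout of 6 seconds, and an activity-log extent size of 4 MiB. The
function names are hypothetical.

```c
/* Assumptions, not taken from the commit message. */
#define TIMEOUT_SECONDS 6u  /* assumed network timeout */
#define AL_EXTENT_MIB   4u  /* assumed size of one activity-log extent */

/* Seconds until the connection is aborted for a given ko-count. */
unsigned int ko_abort_seconds(unsigned int ko_count)
{
    return ko_count * TIMEOUT_SECONDS;
}

/* Area covered by the activity log, in MiB, for a given al-extents value. */
unsigned int al_hot_area_mib(unsigned int al_extents)
{
    return al_extents * AL_EXTENT_MIB;
}
```

Under these assumptions a ko-count of 7 gives 42 seconds, and 1237 extents
cover 4948 MiB, which is a little under 5 GiB.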

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:53 +01:00
Lars Ellenberg
d5d7ebd422 drbd: on attach, enforce clean meta data
Detection of unclean shutdown has moved into user space.

The kernel code will, whenever it updates the meta data, mark it as
"unclean", and will refuse to attach to such unclean meta data.

"drbdadm up" now schedules "drbdmeta apply-al", which will apply
the activity log to the bitmap, and/or reinitialize it, if necessary,
as well as set a "clean" indicator flag.

This moves a bit of code out of kernel space.
As a side effect, it also prevents some 8.3 module from accidentally
ignoring the 8.4 style activity log, if someone should downgrade,
whether on purpose, or accidentally because he changed kernel versions
without providing an 8.4 for the new kernel, and the new kernel comes
with in-tree 8.3.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:51 +01:00
Philipp Reisner
cdfda633d2 drbd: detach from frozen backing device
* drbd-8.3:
  documentation: Documented detach's --force and disk's --disk-timeout
  drbd: Implemented the disk-timeout option
  drbd: Force flag for the detach operation
  drbd: Allow new IOs while the local disk is in FAILED state
  drbd: Bitmap IO functions can not return prematurely if the disk breaks
  drbd: Added a kref to bm_aio_ctx
  drbd: Hold a reference to ldev while doing meta-data IO
  drbd: Keep a reference to the bio until the completion handler finished
  drbd: Implemented wait_until_done_or_disk_failure()
  drbd: Replaced md_io_mutex by an atomic: md_io_in_use
  drbd: moved md_io into mdev
  drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
  drbd: Keep a reference to barrier acked requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:50 +01:00
Philipp Reisner
d942ae4453 drbd: Fixes from the 8.3 development branch
* commit 'ae57a0a':
   drbd: Only print sanitize state's warnings, if the state change happens
   drbd: we should write meta data updates with FLUSH FUA
   drbd: fix limit define, we support 1 PiByte now
   drbd: fix log message argument order
   drbd: Typo in user-visible message.
   drbd: Make "(rcv|snd)buf-size" and "ping-timeout" available for the proxy, too.
   drbd: Allow keywords to be used in multiple config sections.
   drbd: fix typos in comments.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:49 +01:00
Lars Ellenberg
f03c254961 drbd: allow ping-timeout of up to 30 seconds
Allow up to 300 deci-seconds (tenths of a second) to be configured for the "ping timeout".
There may be setups where heavy congestion, huge buffers, and asymmetric
bandwidth limitations may need a "huge" ping-timeout as work-around
for "spurious connection loss" problems.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:48 +01:00
Andreas Gruenbacher
6dff290220 drbd: Rename --dry-run to --tentative
drbdadm already has a --dry-run option, so this option cannot directly be
passed through to drbdsetup.  Rename the drbdsetup option to resolve this
conflict.

For backward compatibility, make --dry-run an alias of --tentative.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:47 +01:00
Andreas Gruenbacher
089c075d88 drbd: Convert the generic netlink interface to accept connection endpoints
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:46 +01:00
Andreas Gruenbacher
7c3063cc6f drbd: Also need to check for DRBD_GENLA_F_MANDATORY flags before nla_find_nested()
This is done by introducing drbd_nla_find_nested() which handles the flag
before calling nla_find_nested().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:45 +01:00
Andreas Gruenbacher
789c1b626c drbd: Use the terminology suggested by the command names in the source code and messages
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:44 +01:00