Commit Graph

26775 Commits

Author SHA1 Message Date
Justin TerAvest
4aede84b33 fixlet: Remove fs_excl from struct task.
fs_excl is a poor man's priority inheritance for filesystems to hint to
the block layer that an operation is important. It was never clearly
specified, not widely adopted, and will not prevent starvation in many
cases (like across cgroups).

fs_excl was introduced with the time sliced CFQ IO scheduler, to
indicate when a process held FS exclusive resources and thus needed
a boost.

It doesn't cover all file systems, and it was never fully complete.
Lets kill it.

Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-12 08:35:10 +02:00
Eric Dumazet
4915a0de43 net: introduce __netdev_alloc_skb_ip_align
RX rings should use GFP_KERNEL allocations if possible, add
__netdev_alloc_skb_ip_align() helper to ease this.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-11 20:08:34 -07:00
Benjamin Herrenschmidt
a63fdc5156 mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2011-07-12 11:08:01 +10:00
Rafael J. Wysocki
c6d22b3726 PM / Domains: Allow callbacks to execute all runtime PM helpers
A deadlock may occur if one of the PM domains' .start_device() or
.stop_device() callbacks or a device driver's .runtime_suspend() or
.runtime_resume() callback executed by the core generic PM domain
code uses a "wrong" runtime PM helper function.  This happens, for
example, if .runtime_resume() from one device's driver calls
pm_runtime_resume() for another device in the same PM domain.
A similar situation may take place if a device's parent is in the
same PM domain, in which case the runtime PM framework may execute
pm_genpd_runtime_resume() automatically for the parent (if it is
suspended at the moment).  This, of course, is undesirable, so
the generic PM domains code should be modified to prevent it from
happening.

The runtime PM framework guarantees that pm_genpd_runtime_suspend()
and pm_genpd_runtime_resume() won't be executed in parallel for
the same device, so the generic PM domains code need not worry
about those cases.  Still, it needs to prevent the other possible
race conditions between pm_genpd_runtime_suspend(),
pm_genpd_runtime_resume(), pm_genpd_poweron() and pm_genpd_poweroff()
from happening and it needs to avoid deadlocks at the same time.
To this end, modify the generic PM domains code to relax
synchronization rules so that:

* pm_genpd_poweron() doesn't wait for the PM domain status to
  change from GPD_STATE_BUSY.  If it finds that the status is
  not GPD_STATE_POWER_OFF, it returns without powering the domain on
  (it may modify the status depending on the circumstances).

* pm_genpd_poweroff() returns as soon as it finds that the PM
  domain's status changed from GPD_STATE_BUSY after it's released
  the PM domain's lock.

* pm_genpd_runtime_suspend() doesn't wait for the PM domain status
  to change from GPD_STATE_BUSY after executing the domain's
  .stop_device() callback and executes pm_genpd_poweroff() only
  if pm_genpd_runtime_resume() is not executed in parallel.

* pm_genpd_runtime_resume() doesn't wait for the PM domain status
  to change from GPD_STATE_BUSY after executing pm_genpd_poweron()
  and sets the domain's status to GPD_STATE_BUSY and increments its
  counter of resuming devices (introduced by this change) immediately
  after acquiring the lock.  The counter of resuming devices is then
  decremented after executing __pm_genpd_runtime_resume() for the
  device and the domain's status is reset to GPD_STATE_ACTIVE (unless
  there are more resuming devices in the domain, in which case the
  status remains GPD_STATE_BUSY).

This way, for example, if a device driver's .runtime_resume()
callback executes pm_runtime_resume() for another device in the same
PM domain, pm_genpd_poweron() called by pm_genpd_runtime_resume()
invoked by the runtime PM framework will not block and it will see
that there's nothing to do for it.  Next, the PM domain's lock will
be acquired without waiting for its status to change from
GPD_STATE_BUSY and the device driver's .runtime_resume() callback
will be executed.  In turn, if pm_runtime_suspend() is executed by
one device driver's .runtime_resume() callback for another device in
the same PM domain, pm_genpd_poweroff() executed by
pm_genpd_runtime_suspend() invoked by the runtime PM framework as a
result will notice that one of the devices in the domain is being
resumed, so it will return immediately.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-12 00:39:36 +02:00
Rafael J. Wysocki
17b75eca76 PM / Domains: Do not execute device callbacks under locks
Currently, the .start_device() and .stop_device() callbacks from
struct generic_pm_domain() as well as the device drivers' runtime PM
callbacks used by the generic PM domains code are executed under
the generic PM domain lock.  This, unfortunately, is prone to
deadlocks, for example if a device and its parent are boths members
of the same PM domain.  For this reason, it would be better if the
PM domains code didn't execute device callbacks under the lock.

Rework the locking in the generic PM domains code so that the lock
is dropped for the execution of device callbacks.  To this end,
introduce PM domains states reflecting the current status of a PM
domain and such that the PM domain lock cannot be acquired if the
status is GPD_STATE_BUSY.  Make threads attempting to acquire a PM
domain's lock wait until the status changes to either
GPD_STATE_ACTIVE or GPD_STATE_POWER_OFF.

This change by itself doesn't fix the deadlock problem mentioned
above, but the mechanism introduced by it will be used for for this
purpose by a subsequent patch.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-12 00:39:29 +02:00
Steven Rostedt
04da85b861 ftrace: Fix warning when CONFIG_FUNCTION_TRACER is not defined
The struct ftrace_hash was declared within CONFIG_FUNCTION_TRACER
but was referenced outside of it.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-11 10:12:59 -04:00
Jiri Kosina
b7e9c223be Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
2011-07-11 14:15:55 +02:00
Kuninori Morimoto
1522043bf7 sh: move CLKDEV_xxx_ID macro to sh_clk.h
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-07-11 15:07:25 +09:00
Theodore Ts'o
4862fd6047 jbd2: remove jbd2_dev_to_name() from jbd2 tracepoints
Using function calls in TP_printk causes perf heartburn, so print the
MAJOR/MINOR device numbers instead.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-07-10 22:05:08 -04:00
Daniel Baluta
d84e0bd797 skbuff: update struct sk_buff members comments
Rearrange struct sk_buff members comments to follow their
definition order. Also, add missing comments for ooo_okay
and dropcount members.

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-10 07:04:04 -07:00
Andy Green
5a9aaf0cf8 I2C: OMAP1/OMAP2+: create omap I2C functionality flags for each cpu_... test
These represent the 8 kinds of implementation functionality
that up until now were inferred by the 16 remaining cpu_...()
tests in the omap i2c driver.

Changed to use BIT() as suggested by Balaji T Krishnamoorthy.

Cc: patches@linaro.org
Cc: Ben Dooks <ben-linux@fluff.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andy Green <andy.green@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-07-10 05:27:15 -06:00
Andy Green
d72fe7883f I2C: OMAP2+: Introduce I2C IP versioning constants
These represent the two kinds of (incompatible) OMAP I2C
peripheral unit in use so far.

The constants are in linux/i2c-omap.h so the omap i2c driver can have
them too.

Cc: patches@linaro.org
Cc: Ben Dooks <ben-linux@fluff.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andy Green <andy.green@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-07-10 05:27:14 -06:00
Magnus Damm
18b4f3f5d0 PM / Domains: Export pm_genpd_poweron() in header
Allow SoC-specific code to call pm_genpd_poweron().

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-10 10:39:14 +02:00
Wu Fengguang
1a12d8bd7b writeback: scale IO chunk size up to half device bandwidth
Originally, MAX_WRITEBACK_PAGES was hard-coded to 1024 because of a
concern of not holding I_SYNC for too long.  (At least, that was the
comment previously.)  This doesn't make sense now because the only
time we wait for I_SYNC is if we are calling sync or fsync, and in
that case we need to write out all of the data anyway.  Previously
there may have been other code paths that waited on I_SYNC, but not
any more.					    -- Theodore Ts'o

So remove the MAX_WRITEBACK_PAGES constraint. The writeback pages
will adapt to as large as the storage device can write within 500ms.

XFS is observed to do IO completions in a batch, and the batch size is
equal to the write chunk size. To avoid dirty pages to suddenly drop
out of balance_dirty_pages()'s dirty control scope and create large
fluctuations, the chunk size is also limited to half the control scope.

The balance_dirty_pages() control scrope is

	[(background_thresh + dirty_thresh) / 2, dirty_thresh]

which is by default [15%, 20%] of global dirty pages, whose range size
is dirty_thresh / DIRTY_FULL_SCOPE.

The adpative write chunk size will be rounded to the nearest 4MB
boundary.

http://bugzilla.kernel.org/show_bug.cgi?id=13930

CC: Theodore Ts'o <tytso@mit.edu>
CC: Dave Chinner <david@fromorbit.com>
CC: Chris Mason <chris.mason@oracle.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:03 -07:00
Wu Fengguang
ffd1f609ab writeback: introduce max-pause and pass-good dirty limits
The max-pause limit helps to keep the sleep time inside
balance_dirty_pages() within MAX_PAUSE=200ms. The 200ms max sleep means
per task rate limit of 8pages/200ms=160KB/s when dirty exceeded, which
normally is enough to stop dirtiers from continue pushing the dirty
pages high, unless there are a sufficient large number of slow dirtiers
(eg. 500 tasks doing 160KB/s will still sum up to 80MB/s, exceeding the
write bandwidth of a slow disk and hence accumulating more and more dirty
pages).

The pass-good limit helps to let go of the good bdi's in the presence of
a blocked bdi (ie. NFS server not responding) or slow USB disk which for
some reason build up a large number of initial dirty pages that refuse
to go away anytime soon.

For example, given two bdi's A and B and the initial state

	bdi_thresh_A = dirty_thresh / 2
	bdi_thresh_B = dirty_thresh / 2
	bdi_dirty_A  = dirty_thresh / 2
	bdi_dirty_B  = dirty_thresh / 2

Then A get blocked, after a dozen seconds

	bdi_thresh_A = 0
	bdi_thresh_B = dirty_thresh
	bdi_dirty_A  = dirty_thresh / 2
	bdi_dirty_B  = dirty_thresh / 2

The (bdi_dirty_B < bdi_thresh_B) test is now useless and the dirty pages
will be effectively throttled by condition (nr_dirty < dirty_thresh).
This has two problems:
(1) we lose the protections for light dirtiers
(2) balance_dirty_pages() effectively becomes IO-less because the
    (bdi_nr_reclaimable > bdi_thresh) test won't be true. This is good
    for IO, but balance_dirty_pages() loses an important way to break
    out of the loop which leads to more spread out throttle delays.

DIRTY_PASSGOOD_AREA can eliminate the above issues. The only problem is,
DIRTY_PASSGOOD_AREA needs to be defined as 2 to fully cover the above
example while this patch uses the more conservative value 8 so as not to
surprise people with too many dirty pages than expected.

The max-pause limit won't noticeably impact the speed dirty pages are
knocked down when there is a sudden drop of global/bdi dirty thresholds.
Because the heavy dirties will be throttled below 160KB/s which is slow
enough. It does help to avoid long dirty throttle delays and especially
will make light dirtiers more responsive.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:02 -07:00
Wu Fengguang
c42843f2f0 writeback: introduce smoothed global dirty limit
The start of a heavy weight application (ie. KVM) may instantly knock
down determine_dirtyable_memory() if the swap is not enabled or full.
global_dirty_limits() and bdi_dirty_limit() will in turn get global/bdi
dirty thresholds that are _much_ lower than the global/bdi dirty pages.

balance_dirty_pages() will then heavily throttle all dirtiers including
the light ones, until the dirty pages drop below the new dirty thresholds.
During this _deep_ dirty-exceeded state, the system may appear rather
unresponsive to the users.

About "deep" dirty-exceeded: task_dirty_limit() assigns 1/8 lower dirty
threshold to heavy dirtiers than light ones, and the dirty pages will
be throttled around the heavy dirtiers' dirty threshold and reasonably
below the light dirtiers' dirty threshold. In this state, only the heavy
dirtiers will be throttled and the dirty pages are carefully controlled
to not exceed the light dirtiers' dirty threshold. However if the
threshold itself suddenly drops below the number of dirty pages, the
light dirtiers will get heavily throttled.

So introduce global_dirty_limit for tracking the global dirty threshold
with policies

- follow downwards slowly
- follow up in one shot

global_dirty_limit can effectively mask out the impact of sudden drop of
dirtyable memory. It will be used in the next patch for two new type of
dirty limits. Note that the new dirty limits are not going to avoid
throttling the light dirtiers, but could limit their sleep time to 200ms.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:02 -07:00
Wu Fengguang
e98be2d599 writeback: bdi write bandwidth estimation
The estimation value will start from 100MB/s and adapt to the real
bandwidth in seconds.

It tries to update the bandwidth only when disk is fully utilized.
Any inactive period of more than one second will be skipped.

The estimated bandwidth will be reflecting how fast the device can
writeout when _fully utilized_, and won't drop to 0 when it goes idle.
The value will remain constant at disk idle time. At busy write time, if
not considering fluctuations, it will also remain high unless be knocked
down by possible concurrent reads that compete for the disk time and
bandwidth with async writes.

The estimation is not done purely in the flusher because there is no
guarantee for write_cache_pages() to return timely to update bandwidth.

The bdi->avg_write_bandwidth smoothing is very effective for filtering
out sudden spikes, however may be a little biased in long term.

The overheads are low because the bdi bandwidth update only occurs at
200ms intervals.

The 200ms update interval is suitable, because it's not possible to get
the real bandwidth for the instance at all, due to large fluctuations.

The NFS commits can be as large as seconds worth of data. One XFS
completion may be as large as half second worth of data if we are going
to increase the write chunk to half second worth of data. In ext4,
fluctuations with time period of around 5 seconds is observed. And there
is another pattern of irregular periods of up to 20 seconds on SSD tests.

That's why we are not only doing the estimation at 200ms intervals, but
also averaging them over a period of 3 seconds and then go further to do
another level of smoothing in avg_write_bandwidth.

CC: Li Shaohua <shaohua.li@intel.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:01 -07:00
Jan Kara
f7d2b1ecd0 writeback: account per-bdi accumulated written pages
Introduce the BDI_WRITTEN counter. It will be used for estimating the
bdi's write bandwidth.

Peter Zijlstra <a.p.zijlstra@chello.nl>:
Move BDI_WRITTEN accounting into __bdi_writeout_inc().
This will cover and fix fuse, which only calls bdi_writeout_inc().

CC: Michael Rubin <mrubin@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:01 -07:00
Wu Fengguang
d46db3d582 writeback: make writeback_control.nr_to_write straight
Pass struct wb_writeback_work all the way down to writeback_sb_inodes(),
and initialize the struct writeback_control there.

struct writeback_control is basically designed to control writeback of a
single file, but we keep abuse it for writing multiple files in
writeback_sb_inodes() and its callers.

It immediately clean things up, e.g. suddenly wbc.nr_to_write vs
work->nr_pages starts to make sense, and instead of saving and restoring
pages_skipped in writeback_sb_inodes it can always start with a clean
zero value.

It also makes a neat IO pattern change: large dirty files are now
written in the full 4MB writeback chunk size, rather than whatever
remained quota in wbc->nr_to_write.

Acked-by: Jan Kara <jack@suse.cz>
Proposed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:01 -07:00
Jean-François Dagenais
f607e7fc5f w1: ds1wm: add a reset recovery parameter
This fixes a regression in 3.0 reported by Paul Parsons regarding the
removal of the msleep(1) in the ds1wm_reset() function:

: The linux-3.0-rc4 DS1WM 1-wire driver is logging "bus error, retrying"
: error messages on an HP iPAQ hx4700 PDA (XScale-PXA270):
:
: <snip>
: Driver for 1-wire Dallas network protocol.
: DS1WM w1 busmaster driver - (c) 2004 Szabolcs Gyurko
: 1-Wire driver for the DS2760 battery monitor  chip  - (c) 2004-2005, Szabolcs Gyurko
: ds1wm ds1wm: pass: 1 bus error, retrying
: ds1wm ds1wm: pass: 2 bus error, retrying
: ds1wm ds1wm: pass: 3 bus error, retrying
: ds1wm ds1wm: pass: 4 bus error, retrying
: ds1wm ds1wm: pass: 5 bus error, retrying
: ...
:
: The visible result is that the battery charging LED is erratic; sometimes
: it works, mostly it doesn't.
:
: The linux-2.6.39 DS1WM 1-wire driver worked OK.  I haven't tried 3.0-rc1,
: 3.0-rc2, or 3.0-rc3.

This sleep should not be required on normal circuitry provided the
pull-ups on the bus are correctly adapted to the slaves.  Unfortunately,
this is not always the case.  The sleep is restored but as a parameter to
the probe function in the pdata.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Paul Parsons <lost.distance@yahoo.com>
Tested-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Jean-François Dagenais <dagenaisj@sonatest.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-08 21:14:44 -07:00
Greg Kroah-Hartman
18fbb93fbe Merge branch 'for-next' of master.kernel.org:/pub/scm/linux/kernel/git/balbi/usb into usb-next
* 'for-next' of master.kernel.org:/pub/scm/linux/kernel/git/balbi/usb:
  usb: gadget: m66592-udc: add pullup function
  usb: gadget: m66592-udc: add function for external controller
  usb: gadget: r8a66597-udc: add pullup function
  usb: gadget: zero: add superspeed support
  usb: gadget: add SS descriptors to Ethernet gadget
  usb: gadget: r8a66597-udc: add support for TEST_MODE
  usb: gadget: m66592-udc: add support for TEST_MODE
  usb: gadget: r8a66597-udc: Make BUSWAIT configurable through platform data
  usb: gadget: r8a66597-udc: fix cannot connect after rmmod gadget driver
  usb: update email address in r8a66597-udc and m66592-udc
  usb: musb: restore INDEX register in resume path
  usb: gadget: fix up depencies
  usb: gadget: fusb300_udc: fix compile warnings
  usb: gadget: ci13xx_udc.c: fix compile warning
  usb: gadget: net2272: fix compile warnings
  usb: gadget: langwell_udc: fix compile warnings
  usb: gadget: fusb300_udc: drop dead code
2011-07-08 15:30:55 -07:00
Shreshtha Kumar Sahu
def90f4239 amba pl011: workaround for uart registers lockup
This workaround aims to break the deadlock situation
which raises during continuous transfer of data for long
duration over uart with hardware flow control. It is
observed that CTS interrupt cannot be cleared in uart
interrupt register (ICR). Hence further transfer over
uart gets blocked.

It is seen that during such deadlock condition ICR
don't get cleared even on multiple write. This leads
pass_counter to decrease and finally reach zero. This
can be taken as trigger point to run this UART_BT_WA.

Workaround backups the register configuration, does soft
reset of UART using BIT-0 of PRCC_K_SOFTRST_SET/CLEAR
registers and restores the registers.

This patch also provides support for uart init and exit
function calls if present.

Signed-off-by: Shreshtha Kumar Sahu <shreshthakumar.sahu@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 15:09:23 -07:00
Yoshihiro Shimoda
bb59dbff4e usb: gadget: m66592-udc: add function for external controller
M66592 has the pin of WR0 and WR1. So, if one write-pin of CPU
connects to the pins, we have to change the setting of FIFOSEL
register in the controller. If we don't change the setting,
the controller cannot send the data of odd length.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-07-09 01:08:39 +03:00
Yoshihiro Shimoda
45304e8cd9 usb: update email address in ohci-sh and r8a66597-hcd
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 14:57:12 -07:00
Yoshihiro Shimoda
f2e9039a43 usb: r8a66597-hcd: add function for external controller
R8A66597 has the pin of WR0 and WR1. So, if one write-pin of CPU
connects to the pins, we have to change the setting of FIFOSEL
register in the controller. If we don't change the setting,
the controller cannot send the data of odd length.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 14:57:11 -07:00
Michal Hocko
d8bf4ca9ca rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-08 22:21:58 +02:00
Shawn Guo
b98c023920 dt: add empty of_property_read_u32[_array] for non-dt
The patch adds empty functions of_property_read_u32 and
of_property_read_u32_array for non-dt build, so that drivers
migrating to dt can save some '#ifdef CONFIG_OF'.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
[grant.likely: Moved things around so only one new static inline is needed]
[grant.likely: Added _string variant]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-08 12:52:33 -06:00
Luciano Coelho
1a84ff7564 cfg80211: return -ENOENT when stopping sched_scan while not running
If we try to stop a scheduled scan while it is not running, we should
return -ENOENT instead of simply ignoring the command and returning
success.  This is more consistent with other parts of the code.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-08 11:47:54 -04:00
John W. Linville
204d1641d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-07-08 11:03:36 -04:00
Donggeun Kim
086ef502c2 power_supply: MAX17042: Support additional properties
This patch supports additional properties (PRESENT, CYCLE_COUNT,
VOLTAGE_MAX, VOLTAGE_MIN_DESIGN, CURRENT_NOW, CURRENT_AVG,
CHARGE_FULL, and TEMP).
Plus, initialization code for registers is added.

Signed-off-by: Donggeun Kim <dg77.kim@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: KyungMin Park <kyungmin.park@samsung.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
2011-07-08 17:01:58 +04:00
Donggeun Kim
bb4ce97087 power_supply: Add charger driver for MAX8998/LP3974
This patch supports power supply APIs for MAX8998/LP3974.

Signed-off-by: Donggeun Kim <dg77.kim@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: KyungMin Park <kyungmin.park@samsung.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
2011-07-08 16:59:34 +04:00
Donggeun Kim
149c077b4b power_supply: Add charger driver for MAX8997/8966
MAX8997/8966 chip is a multi-function device which includes
PMIC, RTC, Fuel Gauge, MUIC, Haptic, Flash control, and
Battery charging control.
The driver for it is located at drivers/mfd.

This patch supports battery charging control of MAX8997/8966 chip and
provides power supply class information to userspace.

Signed-off-by: Donggeun Kim <dg77.kim@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: KyungMin Park <kyungmin.park@samsung.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
2011-07-08 16:58:59 +04:00
Rhyland Klein
58ddafae2d bq20z75: Add support for external notification
Adding support for external power change notification. One problem found
is that there is a lag time before the sensor will return a new status.
To ensure that we only fire off the power_supply_changed event when the
status returned from the sensor is actually different, we delay sending
the the notification, and instead poll on it looking for a change. The
amount of time to poll is configurable via platform data.

Signed-off-by: Rhyland Klein <rklein@nvidia.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
2011-07-08 16:53:07 +04:00
Dima Zavin
732375c6a5 plist: Remove the need to supply locks to plist heads
This was legacy code brought over from the RT tree and
is no longer necessary.

Signed-off-by: Dima Zavin <dima@android.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Walker <dwalker@codeaurora.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1310084879-10351-2-git-send-email-dima@android.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-08 14:02:53 +02:00
Yoshihiro Shimoda
5154e9f126 usb: gadget: r8a66597-udc: Make BUSWAIT configurable through platform data
BUSWAIT is a 4-bit-wide value that controls the number of access waits
from the CPU to on-chip USB module. b'0000 inserts 0 wait (2 access
cycles) and b'1111 inserts 15 waits (17 access cycles, hardware
initial value), respectively.

BUSWAIT value depends on peripheral clock frequency supplied to on-chip
of each CPU, hence should be configurable through platform data.

Note that this patch assumes that b'0000 (0 wait, 2 access cycles) is
rerely used and considered as invalid. If valid 'buswait' data is not
provided by platform, initial b'1111 (15 waits, 17 access cycles) will
be applied as a safe default.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-07-08 12:47:42 +03:00
Shaohua Li
316cc67d5e block: document blk_plug list access
I'm often confused why not disable preempt when changing blk_plug list. It
would be better to add comments here in case others have the similar concerns.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-08 08:19:21 +02:00
Shaohua Li
55c022bbdd block: avoid building too big plug list
When I test fio script with big I/O depth, I found the total throughput drops
compared to some relative small I/O depth. The reason is the thread accumulates
big requests in its plug list and causes some delays (surely this depends
on CPU speed).
I thought we'd better have a threshold for requests. When a threshold reaches,
this means there is no request merge and queue lock contention isn't severe
when pushing per-task requests to queue, so the main advantages of blk plug
don't exist. We can force a plug list flush in this case.
With this, my test throughput actually increases and almost equals to small
I/O depth. Another side effect is irq off time decreases in blk_flush_plug_list()
for big I/O depth.
The BLK_MAX_REQUEST_COUNT is choosen arbitarily, but 16 is efficiently to
reduce lock contention to me. But I'm open here, 32 is ok in my test too.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-08 08:19:20 +02:00
Kumar Gala
a77ce8167c driver core: Add ability for arch code to setup pdev_archdata
On some architectures we need to setup pdev_archdata before we add the
device.  Waiting til a bus_notifier is too late since we might need the
pdev_archdata in the bus notifier.  One example is setting up of dma_mask
pointers such that it can be used in a bus_notifier.

We add weak noop version of arch_setup_pdev_archdata() and allow the arch
code to override with access the full definitions of struct device,
struct platform_device, and struct pdev_archdata.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:35 -05:00
Timur Tabi
6db7199407 drivers/virt: introduce Freescale hypervisor management driver
Add the drivers/virt directory, which houses drivers that support
virtualization environments, and add the Freescale hypervisor management
driver.

The Freescale hypervisor management driver provides several services to
drivers and applications related to the Freescale hypervisor:

1. An ioctl interface for querying and managing partitions

2. A file interface to reading incoming doorbells

3. An interrupt handler for shutting down the partition upon receiving the
   shutdown doorbell from a manager partition

4. A kernel interface for receiving callbacks when a managed partition
   shuts down.

Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:27 -05:00
Keith Packard
bc67f799e7 Merge branch 'drm-intel-fixes' into drm-intel-next 2011-07-07 13:39:38 -07:00
Linus Torvalds
2a9d6df425 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: we should write meta data updates with FLUSH FUA
  drbd: fix limit define, we support 1 PiByte now
  drbd: when receive times out on meta socket, also check last receive time on data socket
  drbd: account bitmap IO during resync as resync-(related-)-io
  drbd: don't cond_resched_lock with IRQs disabled
  drbd: add missing spinlock to bitmap receive
  drbd: Use the correct max_bio_size when creating resync requests
  cfq-iosched: make code consistent
  cfq-iosched: fix a rcu warning
2011-07-07 13:22:26 -07:00
David Howells
c902ce1bfb FS-Cache: Add a helper to bulk uncache pages on an inode
Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-07 13:21:56 -07:00
Linus Torvalds
27a3b735b7 Merge branches 'core-urgent-for-linus', 'perf-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: Fix boot crash when kmemleak and debugobjects enabled

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  jump_label: Fix jump_label update for modules
  oprofile, x86: Fix race in nmi handler while starting counters

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Disable (revert) SCHED_LOAD_SCALE increase
  sched, cgroups: Fix MIN_SHARES on 64-bit boxen
2011-07-07 13:17:45 -07:00
Christoph Lameter
ea6bd8ee1a SLUB: Fix build breakage in linux/mm_types.h
On Wed, 6 Jul 2011, Jonathan Cameron wrote:

> Getting:
>
>   CHK     include/linux/version.h
>   CHK     include/generated/utsrelease.h
> make[1]: `include/generated/mach-types.h' is up to date.
>   CC      arch/arm/kernel/asm-offsets.s
> In file included from include/linux/sched.h:64:0,
>                  from arch/arm/kernel/asm-offsets.c:13:
> include/linux/mm_types.h:74:15: error: duplicate member '_count'
> make[1]: *** [arch/arm/kernel/asm-offsets.s] Error 1
> make: *** [prepare0] Error 2
>
> Issue looks to have been introduced by
>
>     mm: Rearrange struct page
>
> fc9bb8c768
>
> Guessing it's a known issue, but just thought I'd flag it up in case
> it's something very specific about my build.
>
> gcc-2.6 armv7a
>
> Reverting that patch works, but given I don't know the history, I'm
> not proposing doing that in general!

Well _count exists in two unionized structs but always has the same offset
within the larger struct. Maybe ARM creates different offsets there for
some reason?

The following is a patch to restructure the union / structs combo in such
a way that only a single definition of _count

Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Tested-by: Piotr Hosowicz <piotr@hosowicz.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-07 22:24:30 +03:00
Ben Greear
d18a90dd85 slub: Add method to verify memory is not freed
This is for tracking down suspect memory usage.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-07 22:17:08 +03:00
Christoph Lameter
90810645f7 slab allocators: Provide generic description of alignment defines
Provide description for alignment defines.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-07 21:04:12 +03:00
Simon Guinot
659fb32d1b genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd)
This fixes a regression introduced by e59347a "arm: orion:
Use generic irq chip".

Depending on the device, interrupts acknowledgement is done by setting
or by clearing a dedicated register. Replace irq_gc_ack() with some
{set,clr}_bit variants allows to handle both cases.

Note that this patch affects the following SoCs: Davinci, Samsung and
Orion. Except for this last, the change is minor: irq_gc_ack() is just
renamed into irq_gc_ack_set_bit().

For the Orion SoCs, the edge GPIO interrupts support is currently
broken. irq_gc_ack() try to acknowledge a such interrupt by setting
the corresponding cause register bit. The Orion GPIO device expect the
opposite. To fix this issue, the irq_gc_ack_clr_bit() variant is used.

Tested on Network Space v2.

Reported-by: Joey Oravec <joravec@drewtech.com>
Signed-off-by: Simon Guinot <sguinot@lacie.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2011-07-07 16:02:26 +00:00
Steven Rostedt
43dd61c9a0 ftrace: Fix regression of :mod:module function enabling
The new code that allows different utilities to pick and choose
what functions they trace broke the :mod: hook that allows users
to trace only functions of a particular module.

The reason is that the :mod: hook bypasses the hash that is setup
to allow individual users to trace their own functions and uses
the global hash directly. But if the global hash has not been
set up, it will cause a bug:

echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter

produces:

 [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
 [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
 BUG: unable to handle kernel paging request at ffffffff8160ec90
 IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
 Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
 Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt

 Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
 RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
 RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
 RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
 R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
 FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
 Stack:
  0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
  ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
  ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
 Call Trace:
  [<ffffffff810d92f5>] match_records+0x155/0x1b0
  [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
  [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
  [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
  [<ffffffff81166e48>] vfs_write+0xc8/0x190
  [<ffffffff81167001>] sys_write+0x51/0x90
  [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
 Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
 RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
  RSP <ffff88003a96bda8>
 CR2: ffffffff8160ec90
 ---[ end trace a5d031828efdd88e ]---

Reported-by: Brian Marete <marete@toshnix.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-07 11:30:08 -04:00
Michael Büsch
eb032b9837 Update my e-mail address
Signed-off-by: Michael Buesch <m@bues.ch>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-07 15:18:01 +02:00
Shirley Ma
a6686f2f38 skbuff: skb supports zero-copy buffers
This patch adds userspace buffers support in skb shared info. A new
struct skb_ubuf_info is needed to maintain the userspace buffers
argument and index, a callback is used to notify userspace to release
the buffers once lower device has done DMA (Last reference to that skb
has gone).

If there is any userspace apps to reference these userspace buffers,
then these userspaces buffers will be copied into kernel. This way we
can prevent userspace apps from holding these userspace buffers too long.

Use destructor_arg to point to the userspace buffer info; a new tx flags
SKBTX_DEV_ZEROCOPY is added for zero-copy buffer check.

Signed-off-by: Shirley Ma <xma@...ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-07 04:41:13 -07:00