An nsproxy argument here has always been awkard and now the nsproxy argument
is completely unnecessary so remove it, replacing it with the set we want
the registered tables to show up in.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Piecing together directories by looking first in one directory
tree, than in another directory tree and finally in a third
directory tree makes it hard to verify that some directory
entries are not multiply defined and makes it hard to create
efficient implementations the sysctl filesystem.
Replace the sysctl wide list of roots with autogenerated
links from the core sysctl directory tree to the other
sysctl directory trees.
This simplifies sysctl directory reading and lookups as now
only entries in a single sysctl directory tree need to be
considered.
Benchmark before:
make-dummies 0 999 -> 0.44s
rmmod dummy -> 0.065s
make-dummies 0 9999 -> 1m36s
rmmod dummy -> 0.4s
Benchmark after:
make-dummies 0 999 -> 0.63s
rmmod dummy -> 0.12s
make-dummies 0 9999 -> 2m35s
rmmod dummy -> 18s
The slowdown is caused by the lookups used in insert_headers
and put_links to see if we need to add links or remove links.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Simplify the code and the sysctl semantics by autogenerating
sysctl directories when a sysctl table is registered that needs
the directories and autodeleting the directories when there are
no more sysctl tables registered that need them.
Autogenerating directories keeps sysctl tables from depending
on each other, removing all of the arcane register/unregister
ordering constraints and makes it impossible to get the order
wrong when reigsering and unregistering sysctl tables.
Autogenerating directories yields one unique entity that dentries
can point to, retaining the current effective use of the dcache.
Add struct ctl_dir as the type of these new autogenerated
directories.
The attached_by and attached_to fields in ctl_table_header are
removed as they are no longer needed.
The child field in ctl_table is no longer needed by the core of
the sysctl code. ctl_table.child can be removed once all of the
existing users have been updated.
Benchmark before:
make-dummies 0 999 -> 0.7s
rmmod dummy -> 0.07s
make-dummies 0 9999 -> 1m10s
rmmod dummy -> 0.4s
Benchmark after:
make-dummies 0 999 -> 0.44s
rmmod dummy -> 0.065s
make-dummies 0 9999 -> 1m36s
rmmod dummy -> 0.4s
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Add a ctl_table_root pointer to ctl_table set so it is easy to
go from a ctl_table_set to a ctl_table_root.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Add nreg to ctl_table_header. When nreg drops to 0 the ctl_table_header
will be unregistered.
Factor out drop_sysctl_table from unregister_sysctl_table, and add
the logic for decrementing nreg.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
While useful at one time for selinux and the sysctl sanity
checks those users no longer use the parent field and we can
safely remove it.
Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmil.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Split the registration of a complex ctl_table array which may have
arbitrary numbers of directories (->child != NULL) and tables of files
into a series of simpler registrations that only register tables of files.
Graphically:
register('dir', { + file-a
+ file-b
+ subdir1
+ file-c
+ subdir2
+ file-d
+ file-e })
is transformed into:
wrapper->subheaders[0] = register('dir', {file1-a, file1-b})
wrapper->subheaders[1] = register('dir/subdir1', {file-c})
wrapper->subheaders[2] = register('dir/subdir2', {file-d, file-e})
return wrapper
This guarantees that __register_sysctl_table will only see a simple
ctl_table array with all entries having (->child == NULL).
Care was taken to pass the original simple ctl_table arrays to
__register_sysctl_table whenever possible.
This change is derived from a similar patch written
by Lucrian Grijincu.
Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Make __register_sysctl_table the core sysctl registration operation and
make it take a char * string as path.
Now that binary paths have been banished into the real of backwards
compatibility in kernel/binary_sysctl.c where they can be safely
ignored there is no longer a need to use struct ctl_path to represent
path names when registering ctl_tables.
Start the transition to using normal char * strings to represent
pathnames when registering sysctl tables. Normal strings are easier
to deal with both in the internal sysctl implementation and for
programmers registering sysctl tables.
__register_sysctl_paths is turned into a backwards compatibility wrapper
that converts a ctl_path array into a normal char * string.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
In sysctl_net register the two networking roots in the proper order.
In register_sysctl walk the sysctl sets in the reverse order of the
sysctl roots.
Remove parent from ctl_table_set and setup_sysctl_set as it is no
longer needed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
This adds a small helper retire_sysctl_set to remove the intimate knowledge about
the how a sysctl_set is implemented from net/sysct_net.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Move the core sysctl code from kernel/sysctl.c and kernel/sysctl_check.c
into fs/proc/proc_sysctl.c.
Currently sysctl maintenance is hampered by the sysctl implementation
being split across 3 files with artificial layering between them.
Consolidate the entire sysctl implementation into 1 file so that
it is easier to see what is going on and hopefully allowing for
simpler maintenance.
For functions that are now only used in fs/proc/proc_sysctl.c remove
their declarations from sysctl.h and make them static in fs/proc/proc_sysctl.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Simplify the code by treating the base sysctl table like any other
sysctl table and register it with register_sysctl_table.
To ensure this table is registered early enough to avoid problems
call sysctl_init from proc_sys_init.
Rename sysctl_net.c:sysctl_init() to net_sysctl_init() to avoid
name conflicts now that kernel/sysctl.c:sysctl_init() is no longer
static.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
- In sysctl.h move functions only available if CONFIG_SYSCL
is defined inside of #ifdef CONFIG_SYSCTL
- Move the stub function definitions for !CONFIG_SYSCTL
into sysctl.h and make them static inlines.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Now that there are no users of get_driver() or put_driver(), this
patch (as1513) removes those routines completely.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Davem says:
1) Fix JIT code generation on x86-64 for divide by zero, from Eric Dumazet.
2) tg3 header length computation correction from Eric Dumazet.
3) More build and reference counting fixes for socket memory cgroup
code from Glauber Costa.
4) module.h snuck back into a core header after all the hard work we
did to remove that, from Paul Gortmaker and Jesper Dangaard Brouer.
5) Fix PHY naming regression and add some new PCI IDs in stmmac, from
Alessandro Rubini.
6) Netlink message generation fix in new team driver, should only advertise
the entries that changed during events, from Jiri Pirko.
7) SRIOV VF registration and unregistration fixes, and also add a
missing PCI ID, from Roopa Prabhu.
8) Fix infinite loop in tx queue flush code of brcmsmac, from Stanislaw Gruszka.
9) ftgmac100/ftmac100 build fix, missing interrupt.h include.
10) Memory leak fix in net/hyperv do_set_mutlicast() handling, from Wei Yongjun.
11) Off by one fix in netem packet scheduler, from Vijay Subramanian.
12) TCP loss detection fix from Yuchung Cheng.
13) TCP reset packet MD5 calculation uses wrong address, fix from Shawn Lu.
14) skge carrier assertion and DMA mapping fixes from Stephen Hemminger.
15) Congestion recovery undo performed at the wrong spot in BIC and CUBIC
congestion control modules, fix from Neal Cardwell.
16) Ethtool ETHTOOL_GSSET_INFO is unnecessarily restrictive, from Michał Mirosław.
17) Fix triggerable race in ipv6 sysctl handling, from Francesco Ruggeri.
18) Statistics bug fixes in mlx4 from Eugenia Emantayev.
19) rds locking bug fix during info dumps, from your's truly.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
rds: Make rds_sock_lock BH rather than IRQ safe.
netprio_cgroup.h: dont include module.h from other includes
net: flow_dissector.c missing include linux/export.h
team: send only changed options/ports via netlink
net/hyperv: fix possible memory leak in do_set_multicast()
drivers/net: dsa/mv88e6xxx.c files need linux/module.h
stmmac: added PCI identifiers
llc: Fix race condition in llc_ui_recvmsg
stmmac: fix phy naming inconsistency
dsa: Add reporting of silicon revision for Marvell 88E6123/88E6161/88E6165 switches.
tg3: fix ipv6 header length computation
skge: add byte queue limit support
mv643xx_eth: Add Rx Discard and Rx Overrun statistics
bnx2x: fix compilation error with SOE in fw_dump
bnx2x: handle CHIP_REVISION during init_one
bnx2x: allow user to change ring size in ISCSI SD mode
bnx2x: fix Big-Endianess in ethtool -t
bnx2x: fixed ethtool statistics for MF modes
bnx2x: credit-leakage fixup on vlan_mac_del_all
macvlan: fix a possible use after free
...
After adding devpts multiple-insrances sysctl kernel.pty.max limit pty count for
each devpts instance independently, while kernel.pty.nr shows total pty count.
This patch restores sysctl kernel.pty.max as global limit (4096 by default),
adds pty reseve for main devpts (mounted without "newinstance" argument),
and new sysctl to tune it: kernel.pty.reserve (1024 by default)
Also it adds devpts mount option "max=%d" to limit pty count for each devpts
instance independently. (by default NR_UNIX98_PTY_MAX == 2^20)
Thus devpts instances in containers cannot eat up all available pty even if we didn't
set any limits, while with "max" argument we can adjust limits more precisely.
Plus, now open("/dev/ptmx") return -ENOSPC in case lack of pty indexes,
this is more informative than -EIO.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
cleanup hack added in v2.6.27-3203-g15582d3
comment from that patch:
: pty: If the administrator creates a device for a ptmx slave we should not error
:
: The open path for ptmx slaves is via the ptmx device. Opening them any
: other way is not allowed. Vegard Nossum found that previously this was not
: the case and mknod foo c 128 42; cat foo would produce nasty diagnostics
:
: Signed-off-by: Alan Cox <alan@redhat.com>
: Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
devpts_get_tty() returns non-null only for inodes on devpts, but there is no
inodes for master-devices, /dev/ptmx (/dev/pts/ptmx) is the only way to open them.
Thus we can completely forbid lookup for master-devices and eliminate that hack in
tty_init_dev() because tty_open() will get EIO from tty_driver_lookup_tty().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch changes event message behaviour to send only updated records
instead of whole list. This fixes bug on which userspace receives non-actual
data in case multiple events occur in row.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
lineno:24 allows files with 4 million lines, an insane file-size, even
for never-to-get-in-tree machine generated code. Reduce this to 18
bits, which still allows 256k lines. This is still insanely big, but
its not raving mad.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Change describe_flags() to emit '=[pmflt_]+' for current callsite
flags, or just '=_' when they're disabled. Having '=' in output
allows a more selective grep expression; in contrast '-' may appear
in filenames, line-ranges, and format-strings. '=' also has better
mnemonics, saying; "the current setting is equal to <flags>".
This allows grep "=_" <dbgfs>/dynamic_debug/control to see disabled
callsites while avoiding the many occurrences of " = " seen in format
strings.
Enlarge flagsbufs to handle additional flag char, and alter
ddebug_parse_flags() to allow flags=0, so that user can turn off all
debug flags via:
~# echo =_ > <dbgfs>/dynamic_debug/control
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If CONFIG_DYNAMIC_DEBUG is defined, honor it over DEBUG, so that
pr_debug()s are controllable, instead of always-on. When DEBUG is
also defined, change _DPRINTK_FLAGS_DEFAULT to enable printing by
default.
Also adding _DPRINTK_FLAGS_INCL_MODNAME would be nice, but there are
numerous cases of pr_debug(NAME ": ...), which would result in double
printing of module-name. So defer this until things settle.
Cc: David Miller <davem@davemloft.net>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently any enabled dynamic-debug flag on a pr_debug callsite will
enable printing, even if _DPRINTK_FLAGS_PRINT is off. Checking print
flag directly allows "-p" to disable callsites without fussing with
other flags, so the following disables everything, without altering
flags user may have set:
echo -p > $DBGFS/dynamic_debug/control
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix new kernel-doc warning:
Warning(include/linux/usb.h:1251): No description found for parameter 'num_mapped_sgs'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Pass information that quota is stored in system file to userspace
ext2: protect inode changes in the SETVERSION and SETFLAGS ioctls
jbd: Issue cache flush after checkpointing
Export new attributes sensb_res for tech B and sensf_res
for tech F in the target info (returned as a response to
NFC_CMD_GET_TARGET).
The max size of the attributes nfcid1, sensb_res and sensf_res
is exported to user space though include/linux/nfc.
Signed-off-by: Ilan Elias <ilane@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We already extract some basic info but it's incomplete, reads info
about the first core only. Used data structure doesn't allow easy
adding of more cores.
This patch adds new struct and array for storing power info. The plan
is to: switch all extractors (including the ones using NVRAM) to new
struct, switch drivers, then deprecate and finally drop old SSB fields.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix new kernel-doc warnings:
Warning(include/linux/device.h:299): No description found for parameter 'name'
Warning(include/linux/device.h:299): No description found for parameter 'subsys'
Warning(include/linux/device.h:299): No description found for parameter 'node'
Warning(include/linux/device.h:299): No description found for parameter 'add_dev'
Warning(include/linux/device.h:299): No description found for parameter 'remove_dev'
Warning(include/linux/device.h:685): No description found for parameter 'id'
Warning(include/linux/device.h:1009): No description found for parameter '__driver'
Warning(include/linux/device.h:1009): No description found for parameter '__register'
Warning(include/linux/device.h:1009): No description found for parameter '__unregister'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The SUDMAC uses 8-bit width only. So, when the driver uses SUDMAC,
we have to clear the MBW_32.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Power management fixes for 3.3
Two fixes for regressions introduced during the merge window, one fix for
a long-standing obscure issue in the computation of hibernate image size
and two small PM documentation fixes.
* tag 'pm-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Sleep: Fix read_unlock_usermodehelper() call.
PM / Hibernate: Rewrite unlock_system_sleep() to fix s2disk regression
PM / Hibernate: Correct additional pages number calculation
PM / Documentation: Fix minor issue in freezing_of_tasks.txt
PM / Documentation: Fix spelling mistake in basic-pm-debugging.txt
Complete the renaming from "flush" to "invalidate" across
both tmem frontends (cleancache and frontswap) and both tmem backends
(Xen and zcache), as required by akpm.
This change is completely cosmetic.
[v10: no change]
[v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 3]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[v11: Remove the frontswap part]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Per akpm suggestions alter the use of the term flush to be
invalidate. The next patch will do this across all MM.
This change is completely cosmetic.
[v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 3]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Jan Beulich <JBeulich@novell.com>
Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[v10: Fixed fs: move code out of buffer.c conflict change]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If V4L2_CAP_DEVICE_CAPS is set, then the new device_caps field is filled with
the capabilities of the opened device node.
The capabilities field traditionally contains the capabilities of the physical
device, being a superset of all capabilities available at the several device
nodes. E.g., if you open /dev/video0, then if it contains VBI caps then that means
that there is a corresponding vbi node as well. And the capabilities field of
both the video and vbi nodes should contain identical caps.
However, it would be very useful to also have a capabilities field that contains
just the caps for the currently open device, hence the new CAP bit and field.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The usual kernel-doc fixups from Randy. Some of them David acked as
merged in his tree, this is the random left-overs.
* kernel-doc:
docbook: fix sched source file names in device-drivers book
docbook: change iomap source filename in deviceiobook
docbook: don't use serial_core.h in device-drivers book
kernel-doc: fix kernel-doc warnings in sched
kernel-doc: fix new warnings in cfg80211.h
kernel-doc: fix new warning in usb.h
kernel-doc: fix new warnings in device.h
kernel-doc: fix new warnings in debugfs
kernel-doc: fix new warning in regulator core
kernel-doc: fix new warnings in pci
kernel-doc: fix new warnings in driver-core
kernel-doc: fix new warnings in auditsc.c
scripts/kernel-doc: fix fatal error caused by cfg80211.h
Fix new kernel-doc notation warnings:
Warning(include/linux/sched.h:2094): No description found for parameter 'p'
Warning(include/linux/sched.h:2094): Excess function parameter 'tsk' description in 'is_idle_task'
Warning(kernel/sched/cpupri.c:139): No description found for parameter 'newpri'
Warning(kernel/sched/cpupri.c:139): Excess function parameter 'pri' description in 'cpupri_set'
Warning(kernel/sched/cpupri.c:208): Excess function parameter 'bootmem' description in 'cpupri_init'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix new kernel-doc warning:
Warning(include/linux/usb.h:1251): No description found for parameter 'num_mapped_sgs'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix new kernel-doc warnings:
Warning(include/linux/device.h:299): No description found for parameter 'name'
Warning(include/linux/device.h:299): No description found for parameter 'subsys'
Warning(include/linux/device.h:299): No description found for parameter 'node'
Warning(include/linux/device.h:299): No description found for parameter 'add_dev'
Warning(include/linux/device.h:299): No description found for parameter 'remove_dev'
Warning(include/linux/device.h:685): No description found for parameter 'id'
Warning(include/linux/device.h:1009): No description found for parameter '__driver'
Warning(include/linux/device.h:1009): No description found for parameter '__register'
Warning(include/linux/device.h:1009): No description found for parameter '__unregister'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit cc39c6a9bb ("mm: account skipped entries to avoid looping in
find_get_pages") correctly fixed an infinite loop; but left a problem
that find_get_pages() on shmem would return 0 (appearing to callers to
mean end of tree) when it meets a run of nr_pages swap entries.
The only uses of find_get_pages() on shmem are via pagevec_lookup(),
called from invalidate_mapping_pages(), and from shmctl SHM_UNLOCK's
scan_mapping_unevictable_pages(). The first is already commented, and
not worth worrying about; but the second can leave pages on the
Unevictable list after an unusual sequence of swapping and locking.
Fix that by using shmem_find_get_pages_and_swap() (then ignoring the
swap) instead of pagevec_lookup().
But I don't want to contaminate vmscan.c with shmem internals, nor
shmem.c with LRU locking. So move scan_mapping_unevictable_pages() into
shmem.c, renaming it shmem_unlock_mapping(); and rename
check_move_unevictable_page() to check_move_unevictable_pages(), looping
down an array of pages, oftentimes under the same lock.
Leave out the "rotate unevictable list" block: that's a leftover from
when this was used for /proc/sys/vm/scan_unevictable_pages, whose flawed
handling involved looking at pages at tail of LRU.
Was there significance to the sequence first ClearPageUnevictable, then
test page_evictable, then SetPageUnevictable here? I think not, we're
under LRU lock, and have no barriers between those.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: <stable@vger.kernel.org> [back to 3.1 but will need respins]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kdump only allocates memory for the prstatus ELF note. For s390x,
besides of prstatus multiple ELF notes for various different register
types are stored. Therefore the currently allocated memory is not
sufficient. With this patch the KEXEC_NOTE_BYTES macro can be defined
by architecture code and for s390x it is set to the correct size now.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparc64 allmodconfig:
In file included from include/linux/compat.h:15,
from /usr/src/25/arch/sparc/include/asm/siginfo.h:19,
from include/linux/signal.h:5,
from include/linux/sched.h:73,
from arch/sparc/kernel/asm-offsets.c:13:
include/linux/fs.h:618: warning: parameter has incomplete type
It seems that my sparc64 compiler (gcc-3.4.5) doesn't like the forward
declaration of enums.
Fix this by moving the "enum migrate_mode" definition into its own header
file.
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andy Isaacson <adi@hexapodia.org>
Cc: Nai Xia <nai.xia@gmail.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Device manufacturers frequently provide register sequences, usually not
fully documented, to be run at startup in order to provide better defaults
for devices (for example, improving performance in the light of silicon
evaluation). Support such updates by allowing drivers to register update
sets with the core. These updates will be written to the device immediately
and will also be rewritten when the cache is synced.
The assumption is that the reason for resyncing the cache will always be
that the device has been powered off. If this turns out to not be the case
then a separate operation can be provided.
Currently the implementation only allows a single set of updates to be
specified for a device, this could be extended in future.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
It doesn't seem right for the thermal subsystem to export a symbol
named generate_netlink_event. This function is thermal-specific and
its name should reflect that fact. Rename it to
thermal_generate_netlink_event.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: R.Durgadoss <durgadoss.r@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Now that all users of 'struct sysdev' are removed from the kernel, we
can safely remove the .h and .c files for this code, to ensure that no
one accidentally starts to use it again.
Many thanks for Kay who did all the hard work here on making this
happen.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is a case in __sk_mem_schedule(), where an allocation
is beyond the maximum, but yet we are allowed to proceed.
It happens under the following condition:
sk->sk_wmem_queued + size >= sk->sk_sndbuf
The network code won't revert the allocation in this case,
meaning that at some point later it'll try to do it. Since
this is never communicated to the underlying res_counter
code, there is an inbalance in res_counter uncharge operation.
I see two ways of fixing this:
1) storing the information about those allocations somewhere
in memcg, and then deducting from that first, before
we start draining the res_counter,
2) providing a slightly different allocation function for
the res_counter, that matches the original behavior of
the network code more closely.
I decided to go for #2 here, believing it to be more elegant,
since #1 would require us to do basically that, but in a more
obscure way.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Laurent Chavey <chavey@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the memcg sock code, we'll need to register allocations
that are temporarily over limit. Let's make sure that margin
is 0 in this case.
I am keeping this as a separate patch, so that if any weirdness
interaction appears in the future, we can now exactly what caused
it.
Suggested by Johannes Weiner
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correctly implement a loss detection heuristic: New sequences (above
high_seq) sent during the fast recovery are deemed lost when higher
sequences are SACKed.
Current code does not catch these losses, because tcp_mark_head_lost()
does not check packets beyond high_seq. The fix is straight-forward by
checking packets until the highest sacked packet. In addition, all the
FLAG_DATA_LOST logic are in-effective and redundant and can be removed.
Update the loss heuristic comments. The algorithm above is documented
as heuristic B, but it is redundant too because heuristic A already
covers B.
Note that this change only marks some forward-retransmitted packets LOST.
It does NOT forbid TCP performing further CWR on new losses. A potential
follow-up patch under preparation is to perform another CWR on "new"
losses such as
1) sequence above high_seq is lost (by resetting high_seq to snd_nxt)
2) retransmission is lost.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In native mode display all available staticstics.
In SRIOV mode on VF display only SW counters statistics,
in SRIOV mode on hypervisor display SW counters and errors (got from FW)
statistics.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>