The only user is cxgb3 driver.
old_neigh is used to check device change, but it must not happen
on redirect. In this sense, we can remove old_neigh argument.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sound fixes from Takashi Iwai:
"Most of commits found here are for ASoC device specific fixes,
arizona, cs4271, wm5102, wm2200, etc, in addition to a couple of
memory leak fixes in ASoC core.
Other than that, regression fixes in HD-audio and USB-audio, and a fix
for new Realtek codecs."
* tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
ALSA: usb-audio: Fix NULL dereference by access to non-existing substream
ALSA: hda - Add support of new codec ALC284
ALSA: usb-audio: Make ebox44_table static
ALSA: hdspm - Fix wordclock status on AES32
Revert "ALSA: hda - Shut up pins at power-saving mode with Conexnat codecs"
ALSA: hda - Disable runtime D3 for Intel CPT & co
ALSA: pxa27x: fix ac97 warm reset
ALSA: pxa27x: fix ac97 cold reset
ASoC: wm_adsp: Ensure that block writes are from DMA aligned addresses
ASoC: wm2000: Fix sense of speech clarity enable
ASoC: wm5100: Remove DSP B and left justified formats
ASoC: arizona: Remove DSP B and left justified AIF modes
ASoC: wm2200: Remove DSP B and left justified AIF modes
ASoC: wm5102: Improve speaker enable performance
ASoC: core: fix the memory leak in case of remove_aux_dev()
ASoC: core: fix the memory leak in case of device_add() failure
ASoC: cs42l52: Catch no-match case in cs42l52_get_clk
ASoC: lm49453: Update lm49453_reg_defs values as per LM49453 HW revision-B
ASoC: lm49453: Fix adc, mic and sidetone volume ranges
ASoC: arizona: Correct FLL source definitions
...
NCQ capability was used to check availability of SATA Settings page
from Identify Device Data Log, which contains DevSlp timing variables.
It does not work on some HDDs and leads to error messages.
IDENTIFY word 78 bit 5(Hardware Feature Control) can't work either
because it is only the sufficient condition of Identify Device data
log, not the necessary condition.
This patch replaced ata_device->sata_settings with ->devslp_timing
to only save DevSlp timing variables(8 bytes), instead of the whole
SATA Settings page(512 bytes).
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=51881
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Shane Huang <shane.huang@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Clockevent cleanup series from Shawn Guo.
Resolved move/change conflict in mach-pxa/time.c due to the sys_timer
cleanup.
* clocksource/cleanup:
clocksource: use clockevents_config_and_register() where possible
ARM: use clockevents_config_and_register() where possible
clockevents: export clockevents_config_and_register for module use
+ sync to Linux 3.8-rc3
Signed-off-by: Olof Johansson <olof@lixom.net>
Conflicts:
arch/arm/mach-pxa/time.c
ipv6_prefix_equal() just casts its arguments and it is the only
user of __ipv6_prefix_equal().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce __ipv6_addr_diff64() to to find the first different
bit between two addresses on 64bit architectures.
32bit version is still available as __ipv6_addr_diff32(),
and __ipv6_addr_diff() automatically selects appropriate
version.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass an optional device_node pointer in the platform data, which in turn
will be put into a mtd_part_parser_data. This way, code that sets up the
platform devices can pass along the node from DT so that the partitions
can be parsed.
For non-DT boards, this change has no effect.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Artem Bityutskiy <dedekind1@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Pull driver core fixes from Greg Kroah-Hartman:
"Here are two patches for 3.8-rc3.
One removes the __dev* defines from init.h now that all usages of it
are gone from your tree. The other fix is for debugfs's paramater
that was using the wrong base for the option.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
debugfs: convert gid= argument from decimal, not octal
Remove __dev* markings from init.h
Pull networking fixes from David Miller:
1) Fix regression allowing IP_TTL setting of zero, fix from Cong Wang.
2) Fix leak regressions in tunap, from Jason Wang.
3) be2net driver always returns IRQ_HANDLED in INTx handler, fix from
Sathya Perla.
4) qlge doesn't really support NETIF_F_TSO6, don't set that flag. Fix
from Amerigo Wang.
5) Add 802.11ad Atheros wil6210 driver, from Vladimir Kondratiev.
6) Fix MTU calculations in mac80211 layer, from T Krishna Chaitanya.
7) Station info layer of mac80211 needs to use del_timer_sync(), from
Johannes Berg.
8) tcp_read_sock() can loop forever, because we don't immediately stop
when recv_actor() returns zero. Fix from Eric Dumazet.
9) Fix WARN_ON() in tcp_cleanup_rbuf(). We have to use sk_eat_skb() in
tcp_recv_skb() to handle the case where a large GRO packet is split
up while it is use by a splice() operation. Fix also from Eric
Dumazet.
10) addrconf_get_prefix_route() in ipv6 tests flags incorrectly, it
does:
if (X && (p->flags & Y) != 0)
when it really meant to go:
if (X && (p->flags & X) != 0)
fix from Romain Kuntz.
11) Fix lost Kconfig dependency for bfin_mac driver hardware
timestamping. From Lars-Peter Clausen.
12) Fix regression in handling of RST without ACK in TCP, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
be2net: fix unconditionally returning IRQ_HANDLED in INTx
tuntap: fix leaking reference count
tuntap: forbid calling TUNSETIFF when detached
tuntap: switch to use rtnl_dereference()
net, wireless: overwrite default_ethtool_ops
qlge: remove NETIF_F_TSO6 flag
tcp: accept RST without ACK flag
net: ethernet: xilinx: Do not use NO_IRQ in axienet
net: ethernet: xilinx: Do not use axienet on PPC
bnx2x: Allow management traffic after boot from SAN
bnx2x: Fix fastpath structures when memory allocation fails
bfin_mac: Restore hardware time-stamping dependency on BF518
tun: avoid owner checks on IFF_ATTACH_QUEUE
bnx2x: move debugging code before the return
tuntap: refuse to re-attach to different tun_struct
ipv6: use addrconf_get_prefix_route for prefix route lookup [v2]
ipv6: fix the noflags test in addrconf_get_prefix_route
tcp: fix splice() and tcp collapsing interaction
tcp: splice: fix an infinite loop in tcp_read_sock()
net: prevent setting ttl=0 via IP_TTL
...
Add tracepoints for page dirtying, writeback_single_inode start, inode
dirtying and writeback. For the latter two inode events, a pair of
events are defined to denote start and end of the operations (the
starting one has _start suffix and the one w/o suffix happens after
the operation is complete). These inode ops are FS specific and can
be non-trivial and having enclosing tracepoints is useful for external
tracers.
This is part of tracepoint additions to improve visiblity into
dirtying / writeback operations for io tracer and userland.
v2: writeback_dirty_inode[_start] TPs may be called for files on
pseudo FSes w/ unregistered bdi. Check whether bdi->dev is %NULL
before dereferencing.
v3: buffer dirtying moved to a block TP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The former is triggered from touch_buffer() and the latter
mark_buffer_dirty().
This is part of tracepoint additions to improve visiblity into
dirtying / writeback operations for io tracer and userland.
v2: Transformed writeback_dirty_buffer to block_dirty_buffer and made
it share TP definition with block_touch_buffer.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We want to add a trace point to touch_buffer() but macros and inline
functions defined in header files can't have tracing points. Move
touch_buffer() to fs/buffer.c and make it a proper function.
The new exported function is also declared inline. As most uses of
touch_buffer() are inside buffer.c with nilfs2 as the only other user,
the effect of this change should be negligible.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bio_{front|back}_merge tracepoints report a bio merging into an
existing request but didn't specify which request the bio is being
merged into. Add @req to it. This makes it impossible to share the
event template with block_bio_queue - split it out.
@req isn't used or exported to userland at this point and there is no
userland visible behavior change. Later changes will make use of the
extra parameter.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bio completion didn't kick block_bio_complete TP. Only dm was
explicitly triggering the TP on IO completion. This makes
block_bio_complete TP useless for tracers which want to know about
bios, and all other bio based drivers skip generating blktrace
completion events.
This patch makes all bio completions via bio_endio() generate
block_bio_complete TP.
* Explicit trace_block_bio_complete() invocation removed from dm and
the trace point is unexported.
* @rq dropped from trace_block_bio_complete(). bios may fly around
w/o queue associated. Verifying and accessing the assocaited queue
belongs to TP probes.
* blktrace now gets both request and bio completions. Make it ignore
bio completions if request completion path is happening.
This makes all bio based drivers generate blktrace completion events
properly and makes the block_bio_complete TP actually useful.
v2: With this change, block_bio_complete TP could be invoked on sg
commands which have bio's with %NULL bi_bdev. Update TP
assignment code to check whether bio->bi_bdev is %NULL before
dereferencing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Original-patch-by: Namhyung Kim <namhyung@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With commit f2a33cde55a03 "ACPI: Drop ACPI device .bind() and .unbind()
callbacks", acpi_op_bind and acpi_op_unbind are not used any more. So
remove them.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
nfc.h being GPL makes it quite controversial for non GPL applications to
include it.
Moreover, nfc.h only includes structures and API definitions that are hardly
copyrightable.
Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The reg_notifier()'s return value need not be checked
as it is only supposed to do post regulatory work and
that should never fail. Any behaviour to regulatory
that needs to be considered before cfg80211 does work
to a driver should be specified by using the already
existing flags, the reg_notifier() just does post
processing should it find it needs to.
Also make lbs_reg_notifier static.
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
[move lbs_reg_notifier to not break compile]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch add support to manage LLI by SW for select phy channels.
There is a HW issue in certain controllers due to which on certain
occassions HW LLI cannot be used on some physical channels. To avoid
the HW issue on a specific phy channel, the phy channel number can be
added to the list of soft_lli_channels and there after all the transfers
on that channel will use software LLI, for peripheral to memory
transfers.
SoftLLI introduces relink overhead, that could impact performace for
certain use cases.
This is based on a previous patch of Narayanan Gopalakrishnan.
Cc: Shreshtha Kumar Sahu <shreshthakumar.sahu@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
DMAC_ICFG[0:2]=SCHNB only allows to count 'multiple of 4' physical
channels so it was ok with platforms having 8 channels but cannot be
used for next versions (with 10 or 14 channels). This patch allows to
provide the number of physical channels for a DMA device via
platform_data, or still rely on SCHNB if platform_data announces 0
channel.
Signed-off-by: Gerald Baeza <gerald.baeza@stericsson.com>
Reviewed-by: Per Forlin <per.forlin@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
This patch moves arch-vt8500/timer.c into drivers/clocksource and
updates the necessary Kconfig/Makefile options.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
IN6ADDR_* and in6addr_* are not exported to userspace, and are defined
in include/linux/in6.h.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Router Alert option is very small and we can store the value
itself in the skb.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move generalized version of ipv6_is_mld() to header,
and use it from ip6_mc_input().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7a3198a8 ("ipv6: helper function to get tclass") introduced
ipv6_tclass(), but similar function is already available as
ipv6_get_dsfield().
We might be able to call ipv6_tclass() from ipv6_get_dsfield(),
but it is confusing to have two versions.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is not only for readability but also for optimization.
What we do here is to build the 32bit word at the beginning of the ipv6
header (the "ip6_flow" virtual member of struct ip6_hdr in RFC3542) and
we do not need to read the tclass portion of the target buffer.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ACPI handles of PCI root bridges need to be known to
acpi_bind_one(), so that it can create the appropriate
"firmware_node" and "physical_node" files for them, but currently
the way it gets to know those handles is not exactly straightforward
(to put it lightly).
This is how it works, roughly:
1. acpi_bus_scan() finds the handle of a PCI root bridge,
creates a struct acpi_device object for it and passes that
object to acpi_pci_root_add().
2. acpi_pci_root_add() creates a struct acpi_pci_root object,
populates its "device" field with its argument's address
(device->handle is the ACPI handle found in step 1).
3. The struct acpi_pci_root object created in step 2 is passed
to pci_acpi_scan_root() and used to get resources that are
passed to pci_create_root_bus().
4. pci_create_root_bus() creates a struct pci_host_bridge object
and passes its "dev" member to device_register().
5. platform_notify(), which for systems with ACPI is set to
acpi_platform_notify(), is called.
So far, so good. Now it starts to be "interesting".
6. acpi_find_bridge_device() is used to find the ACPI handle of
the given device (which is the PCI root bridge) and executes
acpi_pci_find_root_bridge(), among other things, for the
given device object.
7. acpi_pci_find_root_bridge() uses the name (sic!) of the given
device object to extract the segment and bus numbers of the PCI
root bridge and passes them to acpi_get_pci_rootbridge_handle().
8. acpi_get_pci_rootbridge_handle() browses the list of ACPI PCI
root bridges and finds the one that matches the given segment
and bus numbers. Its handle is then used to initialize the
ACPI handle of the PCI root bridge's device object by
acpi_bind_one(). However, this is *exactly* the ACPI handle we
started with in step 1.
Needless to say, this is quite embarassing, but it may be avoided
thanks to commit f3fd0c8 (ACPI: Allow ACPI handles of devices to be
initialized in advance), which makes it possible to initialize the
ACPI handle of a device before passing it to device_register().
Accordingly, add a new __weak routine, pcibios_root_bridge_prepare(),
defaulting to an empty implementation that can be replaced by the
interested architecutres (x86 and ia64 at the moment) with functions
that will set the root bridge's ACPI handle before its dev member is
passed to device_register(). Make both x86 and ia64 provide such
implementations of pcibios_root_bridge_prepare() and remove
acpi_pci_find_root_bridge() and acpi_get_pci_rootbridge_handle() that
aren't necessary any more.
Included is a fix for breakage on systems with non-ACPI PCI host
bridges from Bjorn Helgaas.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The core does not modify these fields, so they can be made const. This allows
drivers to declare their op tables as const.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Current simple-card driver calls asoc_simple_card_dai_init()
if platform had a asoc_simple_card_dai_init pointer.
And, this initialization function works only
when platform has an applicable initial value for each dai settings.
And basically, almost all sound card requires certain initialization.
This means that almost all platform has initialization settings,
and driver do nothing if it doesn't have settings.
And additionally, current simple-card supports sysclk settings but it was
only for codec. In order to abolish deviation between cpu and codec,
and in order to simplify processing,
this patch adds asoc_simple_dai, and removed pointless
struct asoc_simple_dai_init_info which was trigger of
calling asoc_simple_card_dai_init().
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The header is used by drivers/dma/amba-pl08x.c, which can be compiled
under x86, where PL080 exists under a PCI-to-AMBA bridge. This patche
moves it where it can be accessed by other architectures, and fixes
all users.
Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
canqun zhang reported that we're hitting BUG_ON in the
nf_conntrack_destroy path when calling kfree_skb while
rmmod'ing the nf_conntrack module.
Currently, the nf_ct_destroy hook is being set to NULL in the
destroy path of conntrack.init_net. However, this is a problem
since init_net may be destroyed before any other existing netns
(we cannot assume any specific ordering while releasing existing
netns according to what I read in recent emails).
Thanks to Gao feng for initial patch to address this issue.
Reported-by: canqun zhang <canqunzhang@gmail.com>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
You should never look at such a module, so it's excised from all paths
which traverse the modules list.
We add the state at the end, to avoid gratuitous ABI break (ksplice).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
writeback_inodes_sb(_nr)_if_idle() is re-implemented by replacing down_read()
with down_read_trylock() because
- If ->s_umount is write locked, then the sb is not idle. That is
writeback_inodes_sb(_nr)_if_idle() needn't wait for the lock.
- writeback_inodes_sb(_nr)_if_idle() grabs s_umount lock when it want to start
writeback, it may bring us deadlock problem when doing umount. In order to
fix the problem, ext4 and btrfs implemented their own writeback functions
instead of writeback_inodes_sb(_nr)_if_idle(), but it introduced the redundant
code, it is better to implement a new writeback_inodes_sb(_nr)_if_idle().
The name of these two functions is cumbersome, so rename them to
try_to_writeback_inodes_sb(_nr).
This idea came from Christoph Hellwig.
Some code is from the patch of Kamal Mostafa.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Since:
commit 2c60db0370
Author: Eric Dumazet <edumazet@google.com>
Date: Sun Sep 16 09:17:26 2012 +0000
net: provide a default dev->ethtool_ops
wireless core does not correctly assign ethtool_ops.
After alloc_netdev*() call, some cfg80211 drivers provide they own
ethtool_ops, but some do not. For them, wireless core provide generic
cfg80211_ethtool_ops, which is assigned in NETDEV_REGISTER notify call:
if (!dev->ethtool_ops)
dev->ethtool_ops = &cfg80211_ethtool_ops;
But after Eric's commit, dev->ethtool_ops is no longer NULL (on cfg80211
drivers without custom ethtool_ops), but points to &default_ethtool_ops.
In order to fix the problem, provide function which will overwrite
default_ethtool_ops and use it by wireless core.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge misc fixes from Andrew Morton:
"The audit fixes have been floating around for a while - Al and Eric
aren't responding to either myself or Kees so I asked Kees to
re-review them and here they are."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
lib/rbtree.c: avoid the use of non-static __always_inline
MAINTAINERS: Omar had moved
mm: compaction: partially revert capture of suitable high-order page
linux/audit.h: move ptrace.h include to kernel header
kernel/audit.c: avoid negative sleep durations
audit: catch possible NULL audit buffers
audit: create explicit AUDIT_SECCOMP event type
MAINTAINERS: fix a status pattern
MAINTAINERS: fix arch/arm/plat-omap/include/plat/omap_hwmod.h
mm: thp: acquire the anon_vma rwsem for write during split
mm: mmap: annotate vm_lock_anon_vma locking properly for lockdep
lockdep, rwsem: provide down_write_nest_lock()
arch/mn10300/Kconfig: select CONFIG_GENERIC_ATOMIC64
mm: bootmem: fix free_all_bootmem_core() with odd bitmap alignment
mm: use aligned zone start for pfn_to_bitidx calculation
fs/exec.c: work around icc miscompilation
mm: compaction: fix echo 1 > compact_memory return error issue
mm: memblock: fix wrong memmove size in memblock_merge_regions()
drivers/video/ssd1307fb.c: fix bit order bug in the byte translation function
mm: migrate: check page_count of THP before migrating
...
lib/rbtree.c declared __rb_erase_color() as __always_inline void, and
then exported it with EXPORT_SYMBOL.
This was because __rb_erase_color() must be exported for augmented
rbtree users, but it must also be inlined into rb_erase() so that the
dummy callback can get optimized out of that call site.
(Actually with a modern compiler, none of the dummy callback functions
should even be generated as separate text functions).
The above usage is legal C, but it was unusual enough for some compilers
to warn about it. This change makes things more explicit, with a static
__always_inline ____rb_erase_color function for use in rb_erase(), and a
separate non-inline __rb_erase_color function for use in
rb_erase_augmented call sites.
Signed-off-by: Michel Lespinasse <walken@google.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
waiting for POLLIN on a local TCP socket. It was easier to trigger if
there was disk IO and dirty pages at the same time and he bisected it to
commit 1fb3f8ca0e ("mm: compaction: capture a suitable high-order page
immediately when it is made available").
The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations but the approach was flawed. For Eric, the problem was
that page->pfmemalloc was not being cleared for captured pages leading
to a poor interaction with swap-over-NFS support causing the packets to
be dropped. However, I identified a few more problems with the patch
including the fact that it can increase contention on zone->lock in some
cases which could result in async direct compaction being aborted early.
In retrospect the capture patch took the wrong approach. What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way. While the patch was
showing to improve allocation success rates at the time, the benefit is
marginal given the relative complexity and it should be revisited from
scratch in the context of the other reclaim-related changes that have
taken place since the patch was first written and tested. This patch
partially reverts commit 1fb3f8ca0e ("mm: compaction: capture a
suitable high-order page immediately when it is made available").
Reported-and-tested-by: Eric Wong <normalperson@yhbt.net>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While the kernel internals want pt_regs (and so it includes
linux/ptrace.h), the user version of audit.h does not need it. So move
the include out of the uapi version.
This avoids issues where people want the audit defines and userland
ptrace api. Including both the kernel ptrace and the userland ptrace
headers can easily lead to failure.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The seccomp path was using AUDIT_ANOM_ABEND from when seccomp mode 1
could only kill a process. While we still want to make sure an audit
record is forced on a kill, this should use a separate record type since
seccomp mode 2 introduces other behaviors.
In the case of "handled" behaviors (process wasn't killed), only emit a
record if the process is under inspection. This change also fixes
userspace examination of seccomp audit events, since it was considered
malformed due to missing fields of the AUDIT_ANOM_ABEND event type.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Julien Tinnes <jln@google.com>
Acked-by: Will Drewry <wad@chromium.org>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>