Even if we make use of classifier and actions from the egress
path, we're going into handle_ing() executing additional code
on a per-packet cost for ingress qdisc, just to realize that
nothing is attached on ingress.
Instead, this can just be blinded out as a no-op entirely with
the use of a static key. On input fast-path, we already make
use of static keys in various places, e.g. skb time stamping,
in RPS, etc. It makes sense to not waste time when we're assured
that no ingress qdisc is attached anywhere.
Enabling/disabling of that code path is being done via two
helpers, namely net_{inc,dec}_ingress_queue(), that are being
invoked under RTNL mutex when a ingress qdisc is being either
initialized or destructed.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull core locking changes from Ingo Molnar:
"Main changes:
- jump label asm preparatory work for PowerPC (Anton Blanchard)
- rwsem optimizations and cleanups (Davidlohr Bueso)
- mutex optimizations and cleanups (Jason Low)
- futex fix (Oleg Nesterov)
- remove broken atomicity checks from {READ,WRITE}_ONCE() (Peter
Zijlstra)"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
powerpc, jump_label: Include linux/jump_label.h to get HAVE_JUMP_LABEL define
jump_label: Allow jump labels to be used in assembly
jump_label: Allow asm/jump_label.h to be included in assembly
locking/mutex: Further simplify mutex_spin_on_owner()
locking: Remove atomicy checks from {READ,WRITE}_ONCE
locking/rtmutex: Rename argument in the rt_mutex_adjust_prio_chain() documentation as well
locking/rwsem: Fix lock optimistic spinning when owner is not running
locking: Remove ACCESS_ONCE() usage
locking/rwsem: Check for active lock before bailing on spinning
locking/rwsem: Avoid deceiving lock spinners
locking/rwsem: Set lock ownership ASAP
locking/rwsem: Document barrier need when waking tasks
locking/futex: Check PF_KTHREAD rather than !p->mm to filter out kthreads
locking/mutex: Refactor mutex_spin_on_owner()
locking/mutex: In mutex_spin_on_owner(), return true when owner changes
Pull EFI update from Ingo Molnar:
"This tree includes various fixes, cleanups, a new efi=debug boot
option and EFI boot stub memory allocation optimizations"
* 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub: Retrieve FDT size when loaded from UEFI config table
efi: Clean up the efi_call_phys_[prolog|epilog]() save/restore interaction
efi: Disable interrupts around EFI calls, not in the epilog/prolog calls
x86/efi: Add a "debug" option to the efi= cmdline
firmware: dmi_scan: Use direct access to static vars
firmware: dmi_scan: Use full dmi version for SMBIOS3
Some trivial reorders while preserving the RX/TX cache lines
split to fill a couple of holes.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e1000e is the only driver requiring pm_qos_req, instead of causing
every device to waste up to 240 bytes. Allocate it for the specific
driver.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull KVM updates from Paolo Bonzini:
"First batch of KVM changes for 4.1
The most interesting bit here is irqfd/ioeventfd support for ARM and
ARM64.
Summary:
ARM/ARM64:
fixes for live migration, irqfd and ioeventfd support (enabling
vhost, too), page aging
s390:
interrupt handling rework, allowing to inject all local interrupts
via new ioctl and to get/set the full local irq state for migration
and introspection. New ioctls to access memory by virtual address,
and to get/set the guest storage keys. SIMD support.
MIPS:
FPU and MIPS SIMD Architecture (MSA) support. Includes some
patches from Ralf Baechle's MIPS tree.
x86:
bugfixes (notably for pvclock, the others are small) and cleanups.
Another small latency improvement for the TSC deadline timer"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
KVM: use slowpath for cross page cached accesses
kvm: mmu: lazy collapse small sptes into large sptes
KVM: x86: Clear CR2 on VCPU reset
KVM: x86: DR0-DR3 are not clear on reset
KVM: x86: BSP in MSR_IA32_APICBASE is writable
KVM: x86: simplify kvm_apic_map
KVM: x86: avoid logical_map when it is invalid
KVM: x86: fix mixed APIC mode broadcast
KVM: x86: use MDA for interrupt matching
kvm/ppc/mpic: drop unused IRQ_testbit
KVM: nVMX: remove unnecessary double caching of MAXPHYADDR
KVM: nVMX: checks for address bits beyond MAXPHYADDR on VM-entry
KVM: x86: cache maxphyaddr CPUID leaf in struct kvm_vcpu
KVM: vmx: pass error code with internal error #2
x86: vdso: fix pvclock races with task migration
KVM: remove kvm_read_hva and kvm_read_hva_atomic
KVM: x86: optimize delivery of TSC deadline timer interrupt
KVM: x86: extract blocking logic from __vcpu_run
kvm: x86: fix x86 eflags fixed bit
KVM: s390: migrate vcpu interrupt state
...
This change makes it so that instead of using smp_wmb/rmb which varies
depending on the kernel configuration we can can use dma_wmb/rmb which for
most architectures should be equal to or slightly more strict than
smp_wmb/rmb.
The advantage to this is that these barriers are available to uniprocessor
builds as well so the performance should improve under such a
configuration.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
drm: Use of-graph helpers to loop over endpoints
Convert all drm callers that use of_graph_get_next_endpoint to loop over
of-graph endpoints to the newly introduced for_each_endpoint_of_node
helper macro.
* tag 'of-graph-drm-2015-04-08' of git://git.pengutronix.de/git/pza/linux:
drm/rockchip: use for_each_endpoint_of_node macro, drop endpoint reference on break
drm/rcar-du: use for_each_endpoint_of_node macro
drm/imx: use for_each_endpoint_of_node macro in imx_drm_encoder_get_mux_id
drm: use for_each_endpoint_of_node macro in drm_of_find_possible_crtcs
of: Explicitly include linux/types.h in of_graph.h
dt-bindings: brcm: rationalize Broadcom documentation naming
of/unittest: replace 'selftest' with 'unittest'
Documentation: rename of_selftest.txt to of_unittest.txt
Documentation: update the of_selftest.txt
dt: OF_UNITTEST make dependency broken
MAINTAINERS: Pantelis Antoniou device tree overlay maintainer
of: Add of_graph_get_port_by_id function
of: Add for_each_endpoint_of_node helper macro
of: Decrement refcount of previous endpoint in of_graph_get_next_endpoint
Currently, smpboot_unpark_threads() is invoked before the incoming CPU
has been added to the scheduler's runqueue structures. This might
potentially cause the unparked kthread to run on the wrong CPU, since the
correct CPU isn't fully set up yet.
That causes a sporadic, hard to debug boot crash triggering on some
systems, reported by Borislav Petkov, and bisected down to:
2a442c9c64 ("x86: Use common outgoing-CPU-notification code")
This patch places smpboot_unpark_threads() in a CPU hotplug
notifier with priority set so that these kthreads are unparked just after
the CPU has been added to the runqueues.
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
* pnp:
PNP: Avoid leaving unregistered device objects in lists
PNP: Convert pnp_lock into a mutex
PNP: tty/serial/8250/8250_fintek: Use module_pnp_driver to register driver
PNP: platform/x86/apple-gmux: Use module_pnp_driver to register driver
PNP: net/sb1000: Use module_pnp_driver to register driver
PNP: media/rc: Use module_pnp_driver to register driver
PNP: ide/ide-pnp: Use module_pnp_driver to register driver
PNP: ata/pata_isapnp: Use module_pnp_driver to register driver
PNP: tpm/tpm_infineon: Use module_pnp_driver to register driver
PNP: Add helper macro for pnp_register_driver boilerplate
PNP / ACPI: Use ACPI_COMPANION_SET() during initialization
* device-properties:
device property: Introduce firmware node type for platform data
device property: Make it possible to use secondary firmware nodes
driver core: Implement device property accessors through fwnode ones
driver core: property: Update fwnode_property_read_string_array()
driver core: Add comments about returning array counts
ACPI: Introduce has_acpi_companion()
driver core / ACPI: Represent ACPI companions using fwnode_handle
Pull vfs and fs fixes from Al Viro:
"Several AIO and OCFS2 fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ocfs2: _really_ sync the right range
ocfs2_file_write_iter: keep return value and current position update in sync
[regression] ocfs2: do *not* increment ->ki_pos twice
ioctx_alloc(): fix vma (and file) leak on failure
fix mremap() vs. ioctx_kill() race
... returning -E... upon error and amount of data left in iter after
(possible) truncation upon success. Note, that normal case gives
a non-zero (positive) return value, so any tests for != 0 _must_ be
updated.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Conflicts:
fs/ext4/file.c
Most filesystems call through to these at some point, so we'll start
here.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
all remaining instances of aio_{read,write} (all 4 of them) have explicit
->read and ->write resp.; do_sync_read/do_sync_write is never called by
__vfs_read/__vfs_write anymore and no other users had been left.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
simillar to iov_iter_fault_in_readable() but differs in that it is
not limited to faulting in the first iovec and instead faults in
"bytes" bytes iterating over the iovecs as necessary.
Also, instead of only faulting in the first and last page of the
range, all pages are faulted in.
This function is needed by NTFS when it does multi page file
writes.
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
just make const char iname[] the last member and compare name->name with
name->iname instead of checking name->separate
We need to make sure that out-of-line name doesn't end up allocated adjacent
to struct filename refering to it; fortunately, it's easy to achieve - just
allocate that struct filename with one byte in ->iname[], so that ->iname[0]
will be inside the same object and thus have an address different from that
of out-of-line name [spotted by Boqun Feng <boqun.feng@gmail.com>]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
irqchip core change for v4.1 (round 3) from Jason Cooper
Purge the gic_arch_extn hacks and abuse by using the new stacked domains
NOTE: Due to the nature of these changes, patches crossing subsystems have
been kept together in their own branches.
- tegra
- Handle the LIC properly
- omap
- Convert crossbar to stacked domains
- kill arm,routable-irqs in GIC binding
- exynos
- Convert PMU wakeup to stacked domains
- shmobile, ux500, zynq (irq_set_wake branch)
- Switch from abusing gic_arch_extn to using gic_set_irqchip_flags
Add configuration setting for drivers to allow/block an RSS Redirection
Table and a Hash Key querying for discrete VFs.
On some devices VF share the mentioned above information with PF and
querying it may adduce a theoretical security risk. We want to let a
system administrator to decide if he/she wants to take this risk or not.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The active low flag in the DT cell is currently ignored.
This occurs because of_get_named_gpio_flags() does not apply the flags
to the underlying struct gpio_desc so the test in clk_register_gpio_gate()
was bogus.
Note that this patch changes the internal kernel API for
clk_register_gpio_gate() but there are currently no other users.
Signed-off-by: Martin Fuzzey <mfuzzey@parkeon.com>
Acked-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
Pull SCSI fixes from James Bottomley:
"This is our remaining set of three fixes for 4.0: two oops fixes(one
for cable pulls triggering oopses and the other be2iscsi specific) and
one warn on in sysfs on multipath devices using enclosures"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
Defer processing of REQ_PREEMPT requests for blocked devices
be2iscsi: Fix kernel panic when device initialization fails
enclosure: fix WARN_ON removing an adapter in multi-path devices
The current string_get_size() overflows when the device size goes over
2^64 bytes because the string helper routine computes the suffix from
the size in bytes. However, the entirety of Linux thinks in terms of
blocks, not bytes, so this will artificially induce an overflow on very
large devices. Fix this by making the function string_get_size() take
blocks and the block size instead of bytes. This should allow us to
keep working until the current SCSI standard overflows.
Also fix virtio_blk and mmc (both of which were also artificially
multiplying by the block size to pass a byte side to string_get_size()).
The mathematics of this is pretty simple: we're taking a product of
size in blocks (S) and block size (B) and trying to re-express this in
exponential form: S*B = R*N^E (where N, the exponent is either 1000 or
1024) and R < N. Mathematically, S = RS*N^ES and B=RB*N^EB, so if RS*RB
< N it's easy to see that S*B = RS*RB*N^(ES+EB). However, if RS*BS > N,
we can see that this can be re-expressed as RS*BS = R*N (where R =
RS*BS/N < N) so the whole exponent becomes R*N^(ES+EB+1)
[jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>]
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
If f2fs was corrupted with missing dot dentries, it needs to recover them after
fsck.f2fs detection.
The underlying precedure is:
1. The fsck.f2fs remains F2FS_INLINE_DOTS flag in directory inode, if it detects
missing dot dentries.
2. When f2fs looks up the corrupted directory, it triggers f2fs_add_link with
proper inode numbers and their dot and dotdot names.
3. Once f2fs recovers the directory without errors, it removes F2FS_INLINE_DOTS
finally.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>