Firmware Assisted Dump (FA_DUMP) on ppc64 reserves substantial amounts
of memory when booting a secondary kernel. Srikar Dronamraju reported
that multiple nodes may have no memory managed by the buddy allocator
but still return true for populated_zone().
Commit 1d82de618d ("mm, vmscan: make kswapd reclaim in terms of
nodes") was reported to cause kswapd to spin at 100% CPU usage when
fadump was enabled. The old code happened to handle a populated node
with zero free pages by coincidence, but the current code tries to
reclaim populated zones without realising that this is impossible.
We cannot just convert populated_zone() as many existing users really
need to check for present_pages. This patch introduces a managed_zone()
helper and uses it in the few cases where it is critical that the check
is made for managed pages -- zonelist construction and page reclaim.
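Roughly, the distinction is as follows (the helper bodies are shown as an
illustrative sketch):

    static inline bool managed_zone(struct zone *zone)
    {
            /* pages actually handed to the buddy allocator */
            return zone->managed_pages;
    }

    static inline bool populated_zone(struct zone *zone)
    {
            /* pages physically present, including reserved ones (e.g. fadump) */
            return zone->present_pages;
    }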
Link: http://lkml.kernel.org/r/20160831195104.GB8119@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fdb dumps spanning multiple skbs currently restart from the first
interface again for every skb. This results in unnecessary
iterations over the already visited interfaces and their fdb
entries. In large scale setups, we have seen this slow down
fdb dumps considerably. On a system with 30k MACs we see
fdb dumps spanning more than 300 skbs.
To fix the problem, this patch replaces the existing single fdb
marker with three markers: netdev hash entries, netdevs and fdb
index to continue where we left off instead of restarting from the
first netdev. This is consistent with link dumps.
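A sketch of how the three markers could sit in the netlink callback
context (the cb->args slot assignments are illustrative):

    static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
    {
            int h    = cb->args[0];  /* netdev hash bucket to resume from */
            int idx  = cb->args[1];  /* netdev index within that bucket */
            int fidx = cb->args[2];  /* fdb entry index within that netdev */

            /* walk buckets/netdevs/fdb entries from (h, idx, fidx) onwards,
             * saving the markers back into cb->args when the skb fills up
             */
    }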
In the process of fixing the performance issue, this patch also
re-implements fix done by
commit 472681d57a ("net: ndo_fdb_dump should report -EMSGSIZE to rtnl_fdb_dump")
(with an internal fix from Wilson Kok) in the following ways:
- change ndo_fdb_dump handlers to return error code instead
of the last fdb index
- use cb->args strictly for dump frag markers and not error codes.
This is consistent with other dump functions.
The results below were taken on a system with 1000 netdevs
and 35085 fdb entries:
before patch:
$time bridge fdb show | wc -l
15065
real 1m11.791s
user 0m0.070s
sys 1m8.395s
(the existing code does not return all MACs)
after patch:
$time bridge fdb show | wc -l
35085
real 0m2.017s
user 0m0.113s
sys 0m1.942s
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add const to the parameter of flow_keys_have_l4 for readability.
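That is, the signature becomes the following (the function body shown here
is a sketch of the existing check):

    static inline bool flow_keys_have_l4(const struct flow_keys *keys)
    {
            return (keys->ports.ports || keys->tags.flow_label);
    }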
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't expose skbs to in-kernel users, such as the AFS filesystem, but
instead provide a notification hook that indicates that a call needs
attention and another that indicates that there's a new call to be
collected.
This makes the following possibilities more achievable:
(1) Call refcounting can be made simpler if skbs don't hold refs to calls.
(2) skbs referring to non-data events will be able to be freed much sooner
rather than being queued for AFS to pick up as rxrpc_kernel_recv_data
will be able to consult the call state.
(3) We can shortcut the receive phase when a call is remotely aborted
because we don't have to go through all the packets to get to the one
cancelling the operation.
(4) It makes it easier to do encryption/decryption directly between AFS's
buffers and sk_buffs.
(5) Encryption/decryption can more easily be done in the AFS's thread
contexts - usually that of the userspace process that issued a syscall
- rather than in one of rxrpc's background threads on a workqueue.
(6) AFS will be able to wait synchronously on a call inside AF_RXRPC.
To make this work, the following interface function has been added:
int rxrpc_kernel_recv_data(
struct socket *sock, struct rxrpc_call *call,
void *buffer, size_t bufsize, size_t *_offset,
bool want_more, u32 *_abort_code);
This is the recvmsg equivalent. It allows the caller to find out about the
state of a specific call and to transfer received data into a buffer
piecemeal.
afs_extract_data() and rxrpc_kernel_recv_data() now do all the extraction
logic between them. They don't wait synchronously yet because the socket
lock needs to be dealt with.
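A hypothetical caller might use it along these lines (buffer handling and
error paths are illustrative):

    size_t offset = 0;
    u32 abort_code;
    int ret;

    /* pull the next chunk of the reply; want_more=true says we expect
     * further data after this buffer has been filled
     */
    ret = rxrpc_kernel_recv_data(afs_socket, call, buffer, sizeof(buffer),
                                 &offset, true, &abort_code);
    if (ret == -EAGAIN)
            return 0;               /* nothing available yet, retry later */
    if (ret < 0)
            goto call_aborted;      /* abort_code says why */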
Five interface functions have been removed:
rxrpc_kernel_is_data_last()
rxrpc_kernel_get_abort_code()
rxrpc_kernel_get_error_number()
rxrpc_kernel_free_skb()
rxrpc_kernel_data_consumed()
As a temporary hack, sk_buffs going to an in-kernel call are queued on the
rxrpc_call struct (->knlrecv_queue) rather than being handed over to the
in-kernel user. To process the queue internally, a temporary function,
temp_deliver_data() has been added. This will be replaced with common code
between the rxrpc_recvmsg() path and the kernel_rxrpc_recv_data() path in a
future patch.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull audit fixes from Paul Moore:
"Two small patches to fix some bugs with the audit-by-executable
functionality we introduced back in v4.3 (both patches are marked
for the stable folks)"
* 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit:
audit: fix exe_file access in audit_exe_compare
mm: introduce get_task_exe_file
Pull xfs and iomap fixes from Dave Chinner:
"Most of these changes are small regression fixes that address problems
introduced in the 4.8-rc1 window. The two fixes that aren't (IO
completion fix and superblock inprogress check) are fixes for problems
introduced some time ago and need to be pushed back to stable kernels.
Changes in this update:
- iomap FIEMAP_EXTENT_MERGED usage fix
- additional mount-time feature restrictions
- rmap btree query fixes
- freeze/unmount io completion workqueue fix
- memory corruption fix for deferred operations handling"
* tag 'xfs-iomap-for-linus-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: track log done items directly in the deferred pending work item
iomap: don't set FIEMAP_EXTENT_MERGED for extent based filesystems
xfs: prevent dropping ioend completions during buftarg wait
xfs: fix superblock inprogress check
xfs: simple btree query range should look right if LE lookup fails
xfs: fix some key handling problems in _btree_simple_query_range
xfs: don't log the entire end of the AGF
xfs: disallow mounting of realtime + rmap filesystems
xfs: don't perform lookups on zero-height btrees
Introduce a driver to provide calls into secure monitor mode.
In the Amlogic SoCs these calls are used for multiple reasons: accessing
NVMEM, setting USB boot, enabling JTAG, etc.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Carlo Caione <carlo@endlessm.com>
[khilman: add in SZ_4K cleanup]
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The XC instruction can be used to improve the speed of raid6
recovery. The loops now operate on blocks of 256 bytes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Switch to using the new gpio_desc interface and devm gpio get calls to
automatically manage the gpio resource. Use gpiod_get_value, which handles
active high / low configuration.
If gpio_detect is set then force loading of the driver as it is
reasonable to assume that the battery may not be present.
Update the is_present flag immediately in the IRQ.
Remove legacy gpio specification from platform data.
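A minimal sketch of the detection path with the new API (the connection ID
and field names are assumptions):

    struct gpio_desc *detect;

    detect = devm_gpiod_get_optional(&client->dev, "sbs,battery-detect",
                                     GPIOD_IN);
    if (IS_ERR(detect))
            return PTR_ERR(detect);

    /* gpiod_get_value() already accounts for active-high/low flags */
    if (detect)
            chip->is_present = gpiod_get_value_cansleep(detect);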
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
The arm64 debug monitor initialisation code uses a CPU hotplug notifier
to clear the OS lock when CPUs come online.
This patch converts the code to the new hotplug mechanism.
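A sketch of the converted registration (the state name and the
clear_os_lock() helper are illustrative):

    static int debug_monitors_cpu_startup(unsigned int cpu)
    {
            /* runs on the CPU coming online: clear the OS lock */
            clear_os_lock();
            return 0;
    }

    static int __init debug_monitors_init(void)
    {
            return cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
                                     "arm64/debug_monitors:online",
                                     debug_monitors_cpu_startup, NULL);
    }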
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm64 hw_breakpoint implementation uses a CPU hotplug notifier to
reset the {break,watch}point registers when CPUs come online.
This patch converts the code to the new hotplug mechanism, whilst moving
the invocation earlier to remove the need to disable IRQs explicitly in
the driver (which could cause havok if we trip a watchpoint in an IRQ
handler whilst restoring the debug register state).
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some operations (setxattr/chmod) can make the cached acl stale. We either
need to clear overlay's acl cache for the affected inode or prevent acl
caching on the overlay altogether. Preventing caching has the following
advantages:
- no double caching, less memory used
- overlay cache doesn't go stale when the underlying fs clears its own cache
Possible disadvantage is performance loss. If that becomes a problem
get_acl() can be optimized for overlayfs.
This patch disables caching by presetting i_*acl to a value that
- has bit 0 set, so is_uncached_acl() will return true
- is not equal to ACL_NOT_CACHED, so get_acl() will not overwrite it
The constant -3 was chosen for this purpose.
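A sketch of the idea (the macro name here is made up):

    /* bit 0 set, so is_uncached_acl() is true, yet not ACL_NOT_CACHED (-1) */
    #define OVL_NO_ACL_CACHE ((struct posix_acl *)(-3))

    inode->i_acl = OVL_NO_ACL_CACHE;
    inode->i_default_acl = OVL_NO_ACL_CACHE;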
Fixes: 39a25b2b37 ("ovl: define ->get_acl() for overlay inodes")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
I ran into this:
================================================================================
UBSAN: Undefined behaviour in kernel/time/time.c:783:2
signed integer overflow:
5273 + 9223372036854771711 cannot be represented in type 'long int'
CPU: 0 PID: 17363 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #88
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
04/01/2014
0000000000000000 ffff88011457f8f0 ffffffff82344f50 0000000041b58ab3
ffffffff84f98080 ffffffff82344ea4 ffff88011457f918 ffff88011457f8c8
ffff88011457f8e0 7fffffffffffefff ffff88011457f6d8 dffffc0000000000
Call Trace:
[<ffffffff82344f50>] dump_stack+0xac/0xfc
[<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
[<ffffffff8242f4c8>] ubsan_epilogue+0xd/0x8a
[<ffffffff8242fc04>] handle_overflow+0x202/0x23d
[<ffffffff8242fa02>] ? val_to_string.constprop.6+0x11e/0x11e
[<ffffffff823c7837>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff8131b581>] ? __sigqueue_free.part.13+0x51/0x70
[<ffffffff8146d4e0>] ? rcu_is_watching+0x110/0x110
[<ffffffff8242fc4d>] __ubsan_handle_add_overflow+0xe/0x10
[<ffffffff81476ef8>] timespec64_add_safe+0x298/0x340
[<ffffffff81476c60>] ? timespec_add_safe+0x330/0x330
[<ffffffff812f7990>] ? wait_noreap_copyout+0x1d0/0x1d0
[<ffffffff8184bf18>] poll_select_set_timeout+0xf8/0x170
[<ffffffff8184be20>] ? poll_schedule_timeout+0x2b0/0x2b0
[<ffffffff813aa9bb>] ? __might_sleep+0x5b/0x260
[<ffffffff833c8a87>] __sys_recvmmsg+0x107/0x790
[<ffffffff833c8980>] ? SyS_recvmsg+0x20/0x20
[<ffffffff81486378>] ? hrtimer_start_range_ns+0x3b8/0x1380
[<ffffffff845f8bfb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60
[<ffffffff8148bcea>] ? do_setitimer+0x39a/0x8e0
[<ffffffff813aa9bb>] ? __might_sleep+0x5b/0x260
[<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790
[<ffffffff833c91e9>] SyS_recvmmsg+0xd9/0x160
[<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790
[<ffffffff823c7853>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff8162f680>] ? __context_tracking_exit.part.3+0x30/0x1b0
[<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790
[<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
[<ffffffff845f936a>] entry_SYSCALL64_slow_path+0x25/0x25
================================================================================
Line 783 is this:
783 set_normalized_timespec64(&res, lhs.tv_sec + rhs.tv_sec,
784 lhs.tv_nsec + rhs.tv_nsec);
In other words, since lhs.tv_sec and rhs.tv_sec are both time64_t, this
is a signed addition which will cause undefined behaviour on overflow.
Note that this is not currently a huge concern since the kernel should be
built with -fno-strict-overflow by default, but it could be a problem in
the future, with older compilers, or with compilers other than gcc.
The easiest way to avoid the overflow is to cast one of the arguments to
unsigned (so the addition will be done using unsigned arithmetic).
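For example (the exact unsigned type used in the final patch may differ):

    set_normalized_timespec64(&res, (timeu64_t) lhs.tv_sec + rhs.tv_sec,
                              lhs.tv_nsec + rhs.tv_nsec);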
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
More -misc stuff
- moar drm_crtc.c split up & documentation
- some fixes for the simple kms helpers (Andrea)
- I included all the dri1 patches from David - we're not removing any code
or drivers, and it seems to have worked as a wake-up call to motivate a
few more people to upstream kms conversions for these. Feel free to
revert if you disagree strongly.
- a few other single patches
* tag 'topic/drm-misc-2016-08-31' of git://anongit.freedesktop.org/drm-intel: (24 commits)
drm: drm_probe_helper: Fix output_poll_work scheduling
drm: bridge/dw-hdmi: Fix colorspace and scan information registers values
drm/doc: Polish docs for drm_property&drm_property_blob
drm: Unify handling of blob and object properties
drm: Extract drm_property.[hc]
drm: move drm_mode_legacy_fb_format to drm_fourcc.c
drm/doc: Polish docs for drm_mode_object
drm: Remove drm_mode_object->atomic_count
drm: Extract drm_mode_object.[hc]
drm/doc: Polish kerneldoc for encoders
drm: Extract drm_encoder.[hc]
drm/fb-helper: don't call remove_conflicting_framebuffers for FB=m && DRM=y
drm/atomic-helper: Add NO_DISABLE_AFTER_MODESET flag support for plane commit
drm/atomic-helper: Disable appropriate planes in disable_planes_on_crtc()
drm/atomic-helper: Add atomic_disable CRTC helper callback
drm: simple_kms_helper: add support for bridges
drm: simple_kms_helper: make connector optional at init time
drm/bridge: introduce bridge detaching mechanism
drm/simple-helpers: Always add planes to the state update
drm: reduce GETCLIENT to a minimum
...
For more convenient access if one has a pointer to the task.
As a minor nit, take advantage of the fact that only the task lock + rcu
are needed to safely grab ->exe_file. This saves the mm refcount dance.
Use the helper in proc_exe_link.
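A sketch of what such a helper might look like (details assumed):

    struct file *get_task_exe_file(struct task_struct *task)
    {
            struct file *exe_file = NULL;
            struct mm_struct *mm;

            task_lock(task);
            mm = task->mm;
            if (mm && !(task->flags & PF_KTHREAD))
                    exe_file = get_mm_exe_file(mm);
            task_unlock(task);
            return exe_file;
    }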
Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Cc: <stable@vger.kernel.org> # 4.3.x
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add a header for the SIP interface defined to access the dcf controller
handling ddr rate changes on rk3399 (and most likely later socs).
This interface is shared between the clock driver and the devfreq driver.
The SIP interface counterpart was merged from pull-request #684 [0]
into the upstream arm-trusted-firmware codebase.
[0] https://github.com/ARM-software/arm-trusted-firmware/pull/684
Signed-off-by: Lin Huang <hl@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Intel Quark UART uses DesignWare DMA IP. However, the DMA IP is connected in
such a way that the handshake interface uses inverted polarity. We have to
provide a possibility to set this in the DMA driver when configuring a channel.
Introduce a new member of the custom slave configuration called 'hs_polarity'
and set active-low polarity when this value is 'true'.
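A client could then request the inverted handshake roughly like this (the
field values are illustrative):

    struct dw_dma_slave param = {
            .dma_dev     = dma_dev,
            .src_id      = 0,
            .dst_id      = 1,
            .hs_polarity = true,    /* handshake interface is active low */
    };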
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USART device provides a fractional baud rate generator to get a more
accurate baud rate. It can be used only when the USART is configured in
'normal mode', and this feature is not available on the AT91RM9200 SoC.
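For reference, the baud rate in normal mode with the fractional divider is
derived roughly as follows (field names per the datasheet, shown as an
illustration):

    /* BRGR.CD is the clock divider, BRGR.FP the 3-bit fractional part */
    baud = clk / (16 * (CD + FP / 8));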
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are no Peripheral Identification Registers on the ZTE PL011 device, so
although the amba-pl011 driver is ready to work for the ZTE device, the
device cannot be probed by the driver at all.
With the arm,primecell-periphid DT bindings (bindings/arm/primecell.txt) in
place, the cleanest way is to use a pseudo-ID to probe the device from the
AMBA bus. We create an unofficial vendor number, AMBA_VENDOR_LINUX, which
will practically never become an official vendor ID, and take the
Configuration, Revision number, and Part number as input to compose a
pseudo-ID for the ZTE device.
Also, since we start using vendor_zte to probe the ZTE device, the
__maybe_unused annotation on vendor_zte is removed.
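The pseudo-ID follows the usual AMBA periphid layout, roughly (composition
shown as a sketch):

    /* periphid: [31:24] config, [23:20] revision, [19:12] designer, [11:0] part */
    periphid = (config << 24) | (rev << 20) | (AMBA_VENDOR_LINUX << 12) | part;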
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some reason we do not really understand, ZTE hardware designers
chose to define the PL011 Flag Register bit positions differently from
the standard ones, as below.
Bit Standard ZTE
-----------------------------------
CTS 0 1
DSR 1 3
BUSY 3 8
RI 8 0
Let's define these bits into vendor data and get ZTE PL011 supported
properly.
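Expressed as register bit masks, that amounts to something like this (the
macro names are illustrative):

    #define ZX_UART01x_FR_RI    0x0001   /* bit 0 */
    #define ZX_UART01x_FR_CTS   0x0002   /* bit 1 */
    #define ZX_UART01x_FR_DSR   0x0008   /* bit 3 */
    #define ZX_UART01x_FR_BUSY  0x0100   /* bit 8 */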
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The RK818 chip is a Power Management IC (PMIC) for multimedia and handheld
devices. It contains the following components:
- Regulators
- RTC
- Clocking
- Battery support
Both the RK808 and RK818 chips use a similar register map,
so we can reuse the RTC and Clocking functionality.
Signed-off-by: Wadim Egorov <w.egorov@phytec.de>
Tested-by: Andy Yan <andy.yan@rock-chips.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
v2: Fixed the very obvious failure to set ucounts
on struct mnt_ns reported by Andrei Vagin, and addressed the kbuild
test report.
Reported-by: Andrei Vagin <avagin@openvz.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The LP873X chip is a power management IC for Portable Navigation Systems
and Tablet Computing devices. It contains the following components:
- Regulators.
- Configurable General Purpose Output Signals (GPO).
The PMIC interacts with the main processor through I2C. It has a
couple of LDOs (Linear Regulators), a couple of BUCKs (Step-Down DC-DC
Converter Cores) and GPOs (General Purpose Output Signals).
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Many modules call misc_register and misc_deregister in their module init
and exit methods without any additional code. This ends up being
boilerplate. This patch adds a helper macro, module_misc_device(), that
replaces module_init()/module_exit() with template functions.
This patch also converts drivers to use the new macro.
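The helper boils down to something like the following, so a driver only
needs a single line (sketch):

    #define module_misc_device(__misc_device) \
            module_driver(__misc_device, misc_register, misc_deregister)

    /* a driver then only needs: */
    module_misc_device(my_misc_device);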
Changes since v1:
Include device.h in miscdevice.h, as the module_driver macro was not
available from other include files on some architectures.
Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make the location monitor callback function prototype more useful by
changing the argument from an integer to a void pointer.
All VME bridge drivers were simply passing the location monitor index
(e.g. 0-3) as the argument to these callbacks. It is much more useful
to pass back a pointer to data that the callback-registering driver
cares about.
There appear to be no in-kernel callers of vme_lm_attach (or
vme_lm_request for that matter), so this change only affects the VME
subsystem and bridge drivers.
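After the change the attach prototype looks roughly like this (the argument
names are a sketch):

    int vme_lm_attach(struct vme_resource *resource, int monitor,
                      void (*callback)(void *), void *data);

    /* bridge drivers now pass 'data' back instead of the monitor index */
    callback(data);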
This has been tested with Tsi148 hardware, but the CA91Cx42 changes
have only been compiled.
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Acked-by: Martyn Welch <martyn@welchs.me.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MEN Chameleon specification states that a chameleon FPGA can include a
bridge descriptor, which then opens up a new bus behind this bridge. MCB
included subdevice handling code in the core, but no support for bus
descriptors in the parser, due to a lack of hardware access.
As this is technically dead code that nevertheless gets executed on
device add, I've decided to remove it.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The to_mcb_{bus,device,driver}() macros lacked type safety, so convert them to
inline functions to enforce compile time type checking.
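That is, each macro becomes an inline wrapper around container_of(); for
example (the dev member name is assumed):

    static inline struct mcb_bus *to_mcb_bus(struct device *dev)
    {
            return container_of(dev, struct mcb_bus, dev);
    }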
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With commit [1], address range filter information is now found in
struct hw_perf_event::addr_filters. As such, pass the event itself
to the coresight_source::enable/disable() functions so that both the
event attributes and filters are accessible for configuration.
[1] 'commit 375637bc52 ("perf/core: Introduce address range filtering")'
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Hyper-V, performance critical channels use the monitor
mechanism to signal the host when the guest posts messages
for the host. This mechanism minimizes the hypervisor intercepts
and also makes the host more efficient, in that each time the
host is woken up, it processes a batch of messages as opposed to
just one. The goal here is to improve throughput, and this is at
the expense of increased latency.
Implement a mechanism to let the client driver decide if latency
is important.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a rare race when we remove an entry from the global list
hv_context.percpu_list[cpu] in hv_process_channel_removal() ->
percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() ->
process_chn_event() -> pcpu_relid2channel() is trying to query the list,
we can get a kernel fault.
Similarly, we also have the issue in the code path: vmbus_process_offer() ->
percpu_channel_enq().
We can resolve the issue by disabling the tasklet when updating the list.
The patch also moves vmbus_release_relid() to a later place where
the channel has been removed from the per-cpu and the global lists.
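The pattern is essentially the following (the tasklet lookup is
illustrative):

    struct tasklet_struct *t = ...;   /* the per-cpu vmbus event tasklet */

    tasklet_disable(t);
    list_del(&channel->percpu_list);  /* safe: vmbus_on_event() cannot run now */
    tasklet_enable(t);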
Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Newer revisions of the ChromeOS EC add more events besides the keyboard
ones. So handle interrupts in the MFD driver and let consumers register
for notifications for the events they care about.
To keep backward compatibility, if the EC doesn't support MKBP event, we
fall back to the old MKBP key matrix host command.
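Consumers would then subscribe roughly like this (the notifier field name
is an assumption):

    static struct notifier_block my_nb = {
            .notifier_call = my_ec_event_handler,
    };

    blocking_notifier_chain_register(&ec_dev->event_notifier, &my_nb);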
Cc: Randall Spangler <rspangler@chromium.org>
Cc: Vincent Palatin <vpalatin@chromium.org>
Cc: Benson Leung <bleung@chromium.org>
Signed-off-by: Vic Yang <victoryang@google.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Some platforms (e.g. USB-DMAC on R-Car SoCs) have memory alignment
restrictions. If the memory alignment does not match, the USB peripheral
driver decides not to use the DMA controller, and then the performance
is not good.
In the case of u_ether.c, since it calls skb_reserve() in rx_submit(),
it is possible to cause a memory alignment mismatch.
So, this patch adds a new quirk "quirk_avoids_skb_reserve" to avoid
calling skb_reserve() in u_ether.c to improve performance.
A peripheral driver will set this flag and network gadget drivers
(e.g. f_ncm.c) will reference the flag via gadget_avoids_skb_reserve().
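In u_ether's rx_submit() the check would look roughly like this:

    /* only reserve headroom when the UDC can cope with the resulting alignment */
    if (!gadget_avoids_skb_reserve(dev->gadget))
            skb_reserve(skb, NET_IP_ALIGN);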
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Today the mpls iptunnel lwtunnel_output redirect expects the tunnel
output function to handle fragmentation. This is ok but can be
avoided if we do not do the mpls output redirect too early,
i.e. we could wait until ip fragmentation is done and then call
mpls output for each ip fragment.
To make this work we will need,
1) the lwtunnel state to carry encap headroom
2) and do the redirect to the encap output handler on the ip fragment
(essentially do the output redirect after fragmentation)
This patch adds tunnel headroom in lwtstate to make sure we
account for tunnel data in mtu calculations during fragmentation
and adds new xmit redirect handler to redirect to lwtunnel xmit func
after ip fragmentation.
This includes IPV6 and some mtu fixes and testing from David Ahern.
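Conceptually, the output path does the redirect after fragmentation, along
these lines (simplified sketch of the hook point):

    /* called on each already-fragmented skb, e.g. from the IP finish-output path */
    res = lwtunnel_xmit(skb);
    if (res < 0 || res == LWTUNNEL_XMIT_DONE)
            return res;
    /* LWTUNNEL_XMIT_CONTINUE: fall through to the normal neighbour output */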
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Allow nf_tables reject expression from input, forward and output hooks,
since the routing information is only available there; otherwise we crash.
2) Fix unsafe list iteration when flushing timeout and accounting objects.
3) Fix refcount leak on timeout policy parsing failure.
4) Unlink timeout object for unconfirmed conntracks too
5) Missing validation of pkttype mangling from bridge family.
6) Fix refcount leak on ebtables on second lookup for the specific
bridge match extension, this patch from Sabrina Dubroca.
7) Remove unnecessary ip_hdr() in nf_tables_netdev family.
Patches 1-5 and 7 are from Liping Zhang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Three little fixes:
* revert a recent wext patch, which Ben Hutchings noticed was
wrong, and it turns out not to be necessary for any driver
* fix an infinite loop that can occur under certain conditions
in mac80211's TDLS code (depending on regulatory information)
* add a cfg80211_get_station() static inline when cfg80211 isn't
built, to allow other modules to not have to depend on it
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The PCC status field exposes an error bit (bit 2) to indicate any errors
during the execution of the last command. This patch checks the error bit
before notifying success/failure to the cpufreq driver.
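A sketch of such a check (the mask name and accessors are assumptions):

    #define PCC_CMD_ERROR   BIT(2)   /* per the PCC status field layout */

    status = readw_relaxed(&pcc_comm_addr->status);
    if (status & PCC_CMD_ERROR)
            ret = -EIO;     /* last command failed, report it to cpufreq */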
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CPPC tables contain entries for per CPU feedback counters which
allow us to compute the delivered performance over a given interval
of time.
The math for delivered performance per the CPPCv5.0+ spec is:
delivered perf = reference perf * delta(delivered perf ctr) / delta(ref perf ctr)
Maintaining deltas of the counters in the kernel is messy, as it
depends on when the reads are triggered (e.g. via the cpufreq
->get() interface). Also, the ->get() interface only returns one
value, so it can't return raw values. So instead, leave it to userspace
to keep track of the raw values and do the math for the CPUs it cares about.
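The userspace math then reduces to something like this (counter names
illustrative):

    /* two snapshots (t0, t1) of the feedback counters for one CPU */
    delivered_perf = ref_perf *
                     (delivered_ctr_t1 - delivered_ctr_t0) /
                     (reference_ctr_t1 - reference_ctr_t0);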
The delivered and reference perf counters are exposed via the same
sysfs file to avoid the potential "skid" that would occur if these
values were read individually from userspace.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Compute the expected transition latency for frequency transitions
using the values from the PCCT tables when the desired perf
register is in PCC.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Reviewed-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPPC, defined in section 8.4.7 of the ACPI 6.0 specification, suggests:
"To amortize the cost of PCC transactions, OSPM should read or write
all PCC registers via a single read or write command when possible"
This patch enables opportunistic batching of frequency transition
requests whenever the requests happen to overlap in time.
Currently the access to PCC is serialized by a spin lock which does
not scale well as we increase the number of cores in the system. This
patch improves the scalability by allowing the different CPU cores to
update the PCC subspace in parallel and by batching requests, which
reduces certain types of operations (checking the command completion bit,
ringing the doorbell) by a significant margin.
Profiling shows a significant improvement in the overall efficiency
of servicing freq. transition requests. With this patch we observe close
to 30% of the frequency transition requests being batched with other
requests while running apache bench on an ARM platform with 6
independent domains (or sets of related cpus).
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>