ARM: tegra: switch to dmaengine
The Tegra code-base has contained both a legacy DMA and a dmaengine
driver since v3.6-rcX. This series flips Tegra's defconfig to enable
dmaengine rather than the legacy driver, and removes the legacy driver
and all client code.
* tag 'tegra-for-3.7-dmaengine' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra:
ASoC: tegra: remove support of legacy DMA driver based access
spi: tegra: remove support of legacy DMA driver based access
ARM: tegra: apbio: remove support of legacy DMA driver based access
ARM: tegra: dma: remove legacy APB DMA driver
ARM: tegra: config: enable dmaengine based APB DMA driver
+ sync to 3.6-rc6
Device tree related changes for omaps.
Note that this branch is based on omap-cleanup-sparseirq-for-v3.7
to avoid merge conflicts with the sparseirq changes for gpio-twl4030
driver.
* tag 'omap-devel-dt-merged-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
arm/dts: Mux uart pins for omap4-sdp
ARM: OMAP2+: select PINCTRL in Kconfig
arm/dts: Add pinctrl driver entries for omap2/3/4
arm/dts: Add omap36xx.dtsi file and rename omap3-beagle to omap3-beagle-xm
ARM: dts: omap3-overo: Add support for the blue LED
Documentation: dt: Update the OMAP documentation with Overo/Toby
ARM: dts: OMAP3: Add support for Gumstix Overo with Tobi expansion board
ARM: dts: OMAP4: Add reg and interrupts for every nodes
ARM: dts: AM33XX: Specify reg and interrupt property for all nodes
ARM: dts: AM33XX: Convert all hex numbers to lower-case
ARM: dts: omap3-beagle: Enable audio support
ARM: dts: omap5: Add McPDM and DMIC section to the dtsi file
ARM: dts: omap5: Add McBSP entries
ARM: dts: omap4: Add reg-names for McPDM and DMIC
ARM: dts: omap4: Add McBSP entries
ARM: dts: omap3: Add McBSP entries
ARM: dts: omap2420-h4: Include omap2420.dtsi file instead the common omap2
ARM: dts: omap2: Add McBSP entries for OMAP2420 and OMAP2430 SoC
ARM: dts: omap3-beagle: Add heartbeat and mmc LEDs support
ARM: dts: omap3: Add gpio-twl4030 properties for BeagleBoard and omap3-EVM
...
Remove the offset from ipi_msg_type and assume that SGI0 is the
wakeup interrupt now that all WFI hotplug users call
gic_raise_softirq() with 0 instead of 1. This allows us to
track how many wakeup interrupts are sent and also removes the
unknown IPI printk message for WFI hotplug based systems.
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As specified by ftrace-design.txt, TIF_SYSCALL_TRACEPOINT was
added, as well as NR_syscalls in asm/unistd.h. Additionally,
__sys_trace was modified to call trace_sys_enter and
trace_sys_exit when appropriate.
Tests #2 - #4 of "perf test" now complete successfully.
Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Wade Farnsworth <wade_farnsworth@mentor.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In order to easily detect pathological cases, print some diagnostics
when the kernel boots.
This also provides helpers to detect that HYP mode is actually available,
which can be used by other subsystems to enable HYP specific features.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This patch does two things:
* Ensure that asynchronous aborts are masked at kernel entry.
The bootloader should be masking these anyway, but this reduces
the damage window just in case it doesn't.
* Enter svc mode via exception return to ensure that CPU state is
properly serialised. This does not matter when switching from
an ordinary privileged mode ("PL1" modes in ARMv7-AR rev C
parlance), but it potentially does matter when switching from a
another privileged mode such as hyp mode.
This should allow the kernel to boot safely either from svc mode or
hyp mode, even if no support for use of the ARM Virtualization
Extensions is built into the kernel.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Enabling boot from HYP mode requires the use of some more
virt-specific instructions ("eret" and "msr elr_hyp, reg").
Add the necessary encoding to asm/opcode-virt.h.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
From Tony Lindgren:
Remove the ancient omap specific atags that are no longer needed.
At some point we were planning to pass the bootloader information
with custom atags that did not work out too well.
There's no need for these any longer as the kernel has been booting
fine without them for quite some time. And Now we have device tree
support that can be used instead.
* tag 'cleanup-omap-tags-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP: remove plat/board.h file
ARM: OMAP: move debug_card_init() function
ARM: OMAP1: move lcd pdata out of arch/arm/*
ARM: OMAP1: move omap1_bl pdata out of arch/arm/*
ARM: OMAP: remove the omap custom tags
ARM: OMAP1: remove the crystal type tag parsing
ARM: OMAP: remove the sti console workaround
ARM: OMAP: omap3evm: cleanup revision bits
ARM: OMAP: cleanup struct omap_board_config_kernel
+ sync to 3.6-rc5
From Stephen Warren:
ARM: tegra: i2c driver enhancements mostly related to clocking
This branch contains a number of fixes and cleanups to the Tegra I2C
driver related to clocks. These are based on the common clock conversion
in order to avoid duplicating the clock driver changes before and after
the conversion. Finally, a bug-fix related to I2C_M_NOSTART is included.
This branch is based on previous pull request tegra-for-3.7-common-clk.
* tag 'tegra-for-3.7-drivers-i2c' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra:
i2c: tegra: dynamically control fast clk
i2c: tegra: I2_M_NOSTART functionality not supported in Tegra20
ARM: tegra: clock: remove unused clock entry for i2c
ARM: tegra: clock: add connection name in i2c clock entry
i2c: tegra: pass proper name for getting clock
ARM: tegra: clock: add i2c fast clock entry in clock table
ARM: Tegra: Add smp_twd clock for Tegra20
ARM: tegra: cpu-tegra: explicitly manage re-parenting
ARM: tegra: fix overflow in tegra20_pll_clk_round_rate()
ARM: tegra: Fix data type for io address
ARM: tegra: remove tegra_timer from tegra_list_clks
ARM: tegra30: clocks: fix the wrong tegra_audio_sync_clk_ops name
ARM: tegra: clocks: separate tegra_clk_32k_ops from Tegra20 and Tegra30
ARM: tegra: Remove duplicate code
ARM: tegra: Port tegra to generic clock framework
ARM: tegra: Add clk_tegra structure and helper functions
ARM: tegra: Rename tegra20 clock file
ARM: tegra20: Separate out clk ops and clk data
ARM: tegra30: Separate out clk ops and clk data
ARM: tegra: fix U16 divider range check
...
+ sync to v3.6-rc4
Resolved remove/modify conflict in arch/arm/mach-sa1100/leds-hackkit.c
caused by the sync with v3.6-rc4.
Signed-off-by: Olof Johansson <olof@lixom.net>
Some subsystems (KVM for example) need access to a cycle counter.
In the KVM case, this is used to measure the time delta between
host and guest in order to accurately generate timer events for
the guest.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For now, this patch just adds a definition for the HVC instruction.
More can be added here later, as needed.
Now that we have a real example of how to use the opcode injection
macros properly, this patch also adds a cross-reference from the
explanation in opcodes.h (since without an example, figuring out
how to use the macros is not that easy).
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds some __inst_() macros for injecting custom opcodes
in assembler (both inline and in .S files). They should make it
easier and cleaner to get things right in little-/big-
endian/ARM/Thumb-2 kernels without a lot of #ifdefs.
This pure-preprocessor approach is preferred over the alternative
method of wedging extra assembler directives into the assembler
input using top-level asm() blocks, since there is no way to
guarantee that the compiler won't reorder those with respect to
each other or with respect to non-toplevel asm() blocks, unless
-fno-toplevel-reorder is passed (which is in itself somewhat
undesirable because it defeats some potential optimisations).
Currently <asm/unified.h> _does_ silently rely on the compiler not
reordering at the top level, but it seems better to avoid adding
extra code which depends on this if the same result can be achieved
in another way.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Most of the existing macros don't work with assembler, due to the
use of type casts and C functions from <linux/swab.h>.
This patch abstracts out those operations and provides simple
explicit versions for use in assembly code.
__opcode_is_thumb32() and __opcode_is_thumb16() are also converted
to do bitmask-based testing to avoid confusion if these are used in
assembly code (the assembler typically treats all arithmetic values
as signed).
These changes avoid the need for the compiler to pre-evaluate
constant expressions used to generate opcodes. By ensuring that
the forms of these expressions can be evaluated directly by the
assembler, we can just stringify the expressions directly into the
asm during the preprocessing pass. The alternative approach
(passing the evaluated expression via an inline asm "i" constraint)
gets painful because the contents of the asm and the constraints
must be kept in sync. This makes the resulting macros awkward to
use.
Retaining the C forms of the macros allows more efficient code to
be generated when opcodes are generated programmatically at run-
time, but there is no way to embed run-time-generated opcodes in
asm() blocks.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The existing __mem_to_opcode_thumb32() is incorrect for BE32
platforms. However, these don't support Thumb-2 kernels, so this
option is not so relevant for those platforms anyway.
This operation is complicated by the lack of unaligned memory
access support prior to ARMv6.
Rather than provide a "working" macro which will probably won't get
used (or worse, will get misused), this patch removes the macro for
BE32 kernels. People manipulating Thumb opcodes prior to ARMv6
should almost certainly be splitting these operations into
halfwords anyway, using __opcode_thumb32_{first,second,compose}()
and the 16-bit opcode transformations.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This lets us build a multiplatform kernel for experimental purposes.
However, it will not be useful for any real work, because it relies
on a number of useful things to be disabled for now:
* SMP support must be turned off because of conflicting symbols.
Marc Zyngier has proposed a solution by adding a new SOC
operations structure to hold indirect function pointers
for these, but that work is currently stalled
* We turn on SPARSE_IRQ unconditionally, which is not supported
on most platforms. Each of them is currently in a different
state, but most are being worked on.
* A common clock framework is in place since v3.4 but not yet
being used. Work on this is on its way.
* DEBUG_LL for early debugging is currently disabled.
* THUMB2_KERNEL does not work with allyesconfig because the
kernel gets too big
[Rob Herring]: Rebased to not be dependent on the mass mach header rename.
As a result, omap2plus, imx, mxs and ux500 are not converted. Highbank,
picoxcell, mvebu, and socfpga are converted.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Cc: Dinh Nguyen <dinguyen@altera.com>
Move highbank debug-macro.S over to common debug macro directory.
Also, remove v7 specific movw/movt instructions so this can compile under
v6 mode.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Based on suggestion by Russell King, create a common location for debug
macros and select the included debug macro file using config option.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Most platforms don't need mach/gpio.h and it prevents multi-platform
kernel images. Add CONFIG_NEED_MACH_GPIO_H and make platforns select it
if they need gpio.h. This is platforms that define __GPIOLIB_COMPLEX
or have lots of implicit includes pulled in by mach/gpio.h.
at91 and omap have gpio clean-up pending and can drop
CONFIG_NEED_MACH_GPIO_H once that is in.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Almost each SMP platform defines pen_release to manage booting secondary
CPUs. This of course clashes with the single zImage effort.
Add the pen_release definition to the ARM SMP code, and remove all others.
This should only be used by platforms which lack any kind of CPU power
management...
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Now that all SMP platforms have been converted to use struct
smp_operations, remove the "weak" attribute from the hooks
in smp.c, and make the functions static wherever possible.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This adds a 'struct smp_operations' to abstract the CPU initialization
and hot plugging functions on SMP systems, which otherwise conflict
in a multiplatform kernel. This also helps shmobile and potentially
others that have more than one method to do these.
To allow the kernel to continue building, the platform hooks are
defined as weak symbols which are overrided by the platform code.
Once all platforms are converted, the "weak" attribute will be
removed and the function made static.
Unlike the original version from Marc, this new version from Arnd
does not use a generalized abstraction for per-soc data structures
but only tries to solve the problem for the SMP operations. This
way, we can collapse the previous four data structures into a
single struct, which is less systematic but also easier to follow
as a causal reader.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull ARM fixes from Russell King:
"It's been a while... so there's a little more here than normal.
Mostly updates from Will for the breakpoint stuff, and plugging a few
holes in the user access functions which crept in when domain support
was disabled for ARMv7 CPUs."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7529/1: delay: set loops_per_jiffy when moving to timer-based loop
ARM: 7528/1: uaccess: annotate [__]{get,put}_user functions with might_fault()
ARM: 7527/1: uaccess: explicitly check __user pointer when !CPU_USE_DOMAINS
ARM: 7526/1: traps: send SIGILL if get_user fails on undef handling path
ARM: 7521/1: Fix semihosting Kconfig text
ARM: 7513/1: Make sure dtc is built before running it
ARM: 7512/1: Fix XIP build due to PHYS_OFFSET definition moving
ARM: 7499/1: mm: Fix vmalloc overlap check for !HIGHMEM
ARM: 7503/1: mm: only flush both pmd entries for classic MMU
ARM: 7502/1: contextidr: avoid using bfi instruction during notifier
ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores
ARM: 7497/1: hw_breakpoint: allow single-byte watchpoints on all addresses
ARM: 7496/1: hw_breakpoint: don't rely on dfsr to show watchpoint access type
ARM: Fix ioremap() of address zero
The user access functions may generate a fault, resulting in invocation
of a handler that may sleep.
This patch annotates the accessors with might_fault() so that we print a
warning if they are invoked from atomic context and help lockdep keep
track of mmap_sem.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The {get,put}_user macros don't perform range checking on the provided
__user address when !CPU_HAS_DOMAINS.
This patch reworks the out-of-line assembly accessors to check the user
address against a specified limit, returning -EFAULT if is is out of
range.
[will: changed get_user register allocation to match put_user]
[rmk: fixed building on older ARM architectures]
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
During the p2v changes, the PHYS_OFFSET #define moved into a
!__ASSEMBLY__ section. This causes a XIP build to fail with
arch/arm/kernel/head.o: In function 'stext':
arch/arm/kernel/head.S:146: undefined reference to 'PHYS_OFFSET'
Momentarily leave the #ifndef __ASSEMBLY__ section so we can
define PHYS_OFFSET for all compilation units.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'soc-core' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: mach-shmobile: Add compilation support for dtbs using 'make dtbs'
+ sync to 3.6-rc3
From Will Deacon:
Bunch of perf updates for the ARM backend that pave the way for
big.LITTLE support in the future. The separation of CPU and PMU code
is also the start of being able to move some of this stuff under
drivers/.
* tag 'arm-perf-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
ARM: perf: move irq registration into pmu implementation
ARM: perf: move CPU-specific PMU handling code into separate file
ARM: perf: prepare for moving CPU PMU code into separate file
ARM: perf: probe devicetree in preference to current CPU
ARM: perf: remove mysterious compiler barrier
ARM: pmu: remove arm_pmu_type enumeration
ARM: pmu: remove unused reservation mechanism
ARM: perf: add devicetree bindings for 11MPcore, A5, A7 and A15 PMUs
ARM: PMU: Add runtime PM Support
As Stephen Rothwell reports, a849088aa1 ("ARM: Fix ioremap() of
address zero") from the arm-current tree and commit c279443709 ("ARM:
Add fixed PCI i/o mapping") from the arm-soc tree conflict in
a nontrivial way in arch/arm/mm/mmu.c.
Rob Herring explains:
The PCI i/o reserved area has a dummy physical address of 0 and
needs to be skipped by ioremap searches. So we don't set
VM_ARM_STATIC_MAPPING to prevent matches by ioremap. The vm_struct
settings don't really matter when we do the real mapping of the
i/o space.
Since commit a849088aa1 is at the start of the fixes branch
in the arm tree, we can merge it into the branch that contains
the other ioremap changes.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 774c096bf9 (ARM: v6/v7 cache: allow cache calls to be
optimized) got dropped when the merge conflicts for moving the contents
of the files in commit 753790e713 (ARM: move cache/processor/fault
glue to separate include files) was fixed up in merge bd1274dc00
(Merge branch 'v6v7' into devel).
This puts the change back.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There is no point reserving space at the bottom of the kernel stack for
per-thread crunch state, and per-thread VFP state if these are not being
supported by the kernel being built. Remove these members from the
thread union when these features are disabled.
Reported-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some platforms might require to increase atomic coherent pool to make
sure that their device will be able to allocate all their buffers from
atomic context. This function can be also used to decrease atomic
coherent pool size if coherent allocations are not used for the given
sub-platform.
Suggested-by: Josh Coombs <josh.coombs@gmail.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Data aborts taken to hyp mode do not provide a valid instruction
syndrome field in the HSR if the faulting instruction is a memory
access using a writeback addressing mode.
For hypervisors emulating MMIO accesses to virtual peripherals, taking
such an exception requires disassembling the faulting instruction in
order to determine the behaviour of the access. Since this requires
manually walking the two stages of translation, the world must be
stopped to prevent races against page aging in the guest, where the
first-stage translation is invalidated after the hypervisor has
translated to an IPA and the physical page is reused for something else.
This patch avoids taking this heavy performance penalty when running
Linux as a guest by ensuring that our I/O accessors do not make use of
writeback addressing modes.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit a76d7bd96d ("ARM: 7467/1: mutex: use generic xchg-based
implementation for ARMv6+") removed the barrier-less, ARM-specific
mutex implementation in favour of the generic xchg-based code.
Since then, a bug was uncovered in the xchg code when running on SMP
platforms, due to interactions between the locking paths and the
MUTEX_SPIN_ON_OWNER code. This was fixed in 0bce9c46bf ("mutex: place
lock in contended state after fastpath_lock failure"), however, the
atomic_dec-based mutex algorithm is now marginally more efficient for
ARM (~0.5% improvement in hackbench scores on dual A15).
This patch moves ARMv6+ platforms to the atomic_dec-based mutex code.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As pointed out by Arnd Bergmann, this fixes a couple of issues but will
increase code size:
The original macro user_termio_to_kernel_termios was not endian safe. It
used an unsigned short ptr to access the low bits in a 32-bit word.
Both user_termio_to_kernel_termios and kernel_termios_to_user_termio are
missing error checking on put_user/get_user and copy_to/from_user.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This moves ARM over to the asm-generic/unaligned.h header. This has the
benefit of better code generated especially for ARMv7 on gcc 4.7+
compilers.
As Arnd Bergmann, points out: The asm-generic version uses the "struct"
version for native-endian unaligned access and the "byteshift" version
for the opposite endianess. The current ARM version however uses the
"byteshift" implementation for both.
Thanks to Nicolas Pitre for the excellent analysis:
Test case:
int foo (int *x) { return get_unaligned(x); }
long long bar (long long *x) { return get_unaligned(x); }
With the current ARM version:
foo:
ldrb r3, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
ldrb r1, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
ldrb r2, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
mov r3, r3, asl #16 @ tmp154, MEM[(const u8 *)x_1(D) + 2B],
ldrb r0, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
orr r3, r3, r1, asl #8 @, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
orr r3, r3, r2 @ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
orr r0, r3, r0, asl #24 @,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
bx lr @
bar:
stmfd sp!, {r4, r5, r6, r7} @,
mov r2, #0 @ tmp184,
ldrb r5, [r0, #6] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
ldrb r4, [r0, #5] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
ldrb ip, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
ldrb r1, [r0, #4] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
mov r5, r5, asl #16 @ tmp175, MEM[(const u8 *)x_1(D) + 6B],
ldrb r7, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
orr r5, r5, r4, asl #8 @, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
ldrb r6, [r0, #7] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
orr r5, r5, r1 @ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
ldrb r4, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
mov ip, ip, asl #16 @ tmp188, MEM[(const u8 *)x_1(D) + 2B],
ldrb r1, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
orr ip, ip, r7, asl #8 @, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
orr r3, r5, r6, asl #24 @,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
orr ip, ip, r4 @ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
orr ip, ip, r1, asl #24 @, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
mov r1, r3 @,
orr r0, r2, ip @ tmp171, tmp184, tmp194
ldmfd sp!, {r4, r5, r6, r7}
bx lr
In both cases the code is slightly suboptimal. One may wonder why
wasting r2 with the constant 0 in the second case for example. And all
the mov's could be folded in subsequent orr's, etc.
Now with the asm-generic version:
foo:
ldr r0, [r0, #0] @ unaligned @,* x
bx lr @
bar:
mov r3, r0 @ x, x
ldr r0, [r0, #0] @ unaligned @,* x
ldr r1, [r3, #4] @ unaligned @,
bx lr @
This is way better of course, but only because this was compiled for
ARMv7. In this case the compiler knows that the hardware can do
unaligned word access. This isn't that obvious for foo(), but if we
remove the get_unaligned() from bar as follows:
long long bar (long long *x) {return *x; }
then the resulting code is:
bar:
ldmia r0, {r0, r1} @ x,,
bx lr @
So this proves that the presumed aligned vs unaligned cases does have
influence on the instructions the compiler may use and that the above
unaligned code results are not just an accident.
Still... this isn't fully conclusive without at least looking at the
resulting assembly fron a pre ARMv6 compilation. Let's see with an
ARMv5 target:
foo:
ldrb r3, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
ldrb r1, [r0, #1] @ zero_extendqisi2 @ tmp140,
ldrb r2, [r0, #2] @ zero_extendqisi2 @ tmp143,
ldrb r0, [r0, #3] @ zero_extendqisi2 @ tmp146,
orr r3, r3, r1, asl #8 @, tmp142, tmp139, tmp140,
orr r3, r3, r2, asl #16 @, tmp145, tmp142, tmp143,
orr r0, r3, r0, asl #24 @,, tmp145, tmp146,
bx lr @
bar:
stmfd sp!, {r4, r5, r6, r7} @,
ldrb r2, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
ldrb r7, [r0, #1] @ zero_extendqisi2 @ tmp140,
ldrb r3, [r0, #4] @ zero_extendqisi2 @ tmp149,
ldrb r6, [r0, #5] @ zero_extendqisi2 @ tmp150,
ldrb r5, [r0, #2] @ zero_extendqisi2 @ tmp143,
ldrb r4, [r0, #6] @ zero_extendqisi2 @ tmp153,
ldrb r1, [r0, #7] @ zero_extendqisi2 @ tmp156,
ldrb ip, [r0, #3] @ zero_extendqisi2 @ tmp146,
orr r2, r2, r7, asl #8 @, tmp142, tmp139, tmp140,
orr r3, r3, r6, asl #8 @, tmp152, tmp149, tmp150,
orr r2, r2, r5, asl #16 @, tmp145, tmp142, tmp143,
orr r3, r3, r4, asl #16 @, tmp155, tmp152, tmp153,
orr r0, r2, ip, asl #24 @,, tmp145, tmp146,
orr r1, r3, r1, asl #24 @,, tmp155, tmp156,
ldmfd sp!, {r4, r5, r6, r7}
bx lr
Compared to the initial results, this is really nicely optimized and I
couldn't do much better if I were to hand code it myself.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Inspired by the AArgh64 claim that it should be separate from ARM and one
reason was being able to use more asm-generic headers. Doing a diff of
arch/arm/include/asm and include/asm-generic there are numerous asm
headers which are functionally identical to their asm-generic counterparts.
Delete the ARM version and use the generic ones.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
LPAE does not use two pmd entries for a pte, so the additional tlb
flushing is not required.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch moves the CPU-specific IRQ registration and parsing code into
the CPU PMU backend. This is required because a PMU may have more than
one interrupt, which in turn can be either PPI (per-cpu) or SPI
(requiring strict affinity setting at the interrupt distributor).
Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
[will: cosmetic edits and reworked interrupt dispatching]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The CPU PMU code is tightly coupled with generic ARM PMU handling code.
This makes it cumbersome when trying to add support for other ARM PMUs
(e.g. interconnect, L2 cache controller, bus) as the generic parts of
the code are not readily reusable.
This patch cleans up perf_event.c so that reusable code is exposed via
header files to other potential PMU drivers. The CPU code is
consistently named to identify it as such and also to prepare for moving
it into a separate file.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm_pmu_type enumeration was initially introduced to identify
different PMU types in the system, the usual one being that on the CPU
(ARM_PMU_DEVICE_CPU). With the removal of the PMU reservation code and
the introduction of devicetree bindings for the CPU PMU, the enumeration
is no longer required.
This patch removes the enumeration and updates the various CPU PMU
platform devices so that they no longer pass an .id field referring
to identify the PMU type.
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Pawel Moll <pawel.moll@arm.com>
Acked-by: Jon Hunter <jon-hunter@ti.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiandong Zheng <jdzheng@broadcom.com>
Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
[will: cosmetic edits and actual removal of the enum type]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The PMU reservation mechanism was originally intended to allow OProfile
and perf-events to co-ordinate over access to the CPU PMU. Since then,
OProfile for ARM has moved to using perf as its backend, so the
reservation code is no longer used.
This patch removes the reservation code for the CPU PMU on ARM.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add runtime PM support to the ARM PMU driver so that devices such as OMAP
supporting dynamic PM can use the platform->runtime_* hooks to initialise
hardware at runtime. Without having these runtime PM hooks in place any
configuration of the PMU hardware would be lost when low power states are
entered and hence would prevent PMU from working.
This change also replaces the PMU platform functions enable_irq and disable_irq
added by Ming Lei with runtime_resume and runtime_suspend funtions. Ming had
added the enable_irq and disable_irq functions as a method to configure the
cross trigger interface on OMAP4 for routing the PMU interrupts. By adding
runtime PM support, we can move the code called by enable_irq and disable_irq
into the runtime PM callbacks runtime_resume and runtime_suspend.
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Benoit Cousson <b-cousson@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Kevin Hilman <khilman@ti.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The extra feature may be used by SOCs are prefetch, burst8,
write buffer coalesce
Signed-off-by: Chao Xie <xiechao.mail@gmail.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>