Currently, "cma=" kernel parameter is used to specify the size of CMA,
but we can't specify where it is located. We want to locate CMA below
4GB for devices only supporting 32-bit addressing on 64-bit systems
without iommu.
This enables to specify the placement of CMA by extending "cma=" kernel
parameter.
Examples:
1. locate 64MB CMA below 4GB by "cma=64M@0-4G"
2. locate 64MB CMA exact at 512MB by "cma=64M@512M"
Note that the DMA contiguous memory allocator on x86 assumes that
page_address() works for the pages to allocate. So this change requires
to limit end address of contiguous memory area upto max_pfn_mapped to
prevent from locating it on highmem area by the argument of
dma_contiguous_reserve().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The DMA Contiguous Memory Allocator support on x86 is disabled when
swiotlb config option is enabled. So DMA CMA is always disabled on
x86_64 because swiotlb is always enabled. This attempts to support for
DMA CMA with enabling swiotlb config option.
The contiguous memory allocator on x86 is integrated in the function
dma_generic_alloc_coherent() which is .alloc callback in nommu_dma_ops
for dma_alloc_coherent().
x86_swiotlb_alloc_coherent() which is .alloc callback in swiotlb_dma_ops
tries to allocate with dma_generic_alloc_coherent() firstly and then
swiotlb_alloc_coherent() is called as a fallback.
The main part of supporting DMA CMA with swiotlb is that changing
x86_swiotlb_free_coherent() which is .free callback in swiotlb_dma_ops
for dma_free_coherent() so that it can distinguish memory allocated by
dma_generic_alloc_coherent() from one allocated by
swiotlb_alloc_coherent() and release it with dma_generic_free_coherent()
which can handle contiguous memory. This change requires making
is_swiotlb_buffer() global function.
This also needs to change .free callback in the dma_map_ops for amd_gart
and sta2x11, because these dma_ops are also using
dma_generic_alloc_coherent().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset enhances the DMA Contiguous Memory Allocator on x86.
Currently the DMA CMA is only supported with pci-nommu dma_map_ops and
furthermore it can't be enabled on x86_64. But I would like to allocate
big contiguous memory with dma_alloc_coherent() and tell it to the device
that requires it, regardless of which dma mapping implementation is
actually used in the system.
So this makes it work with swiotlb and intel-iommu dma_map_ops, too. And
this also extends "cma=" kernel parameter to specify placement constraint
by the physical address range of memory allocations. For example, CMA
allocates memory below 4GB by "cma=64M@0-4G", it is required for the
devices only supporting 32-bit addressing on 64-bit systems without iommu.
This patch (of 5):
Calling dma_alloc_coherent() with __GFP_ZERO must return zeroed memory.
But when the contiguous memory allocator (CMA) is enabled on x86 and the
memory region is allocated by dma_alloc_from_contiguous(), it doesn't
return zeroed memory. Because dma_generic_alloc_coherent() forgot to fill
the memory region with zero if it was allocated by
dma_alloc_from_contiguous()
Most implementations of dma_alloc_coherent() return zeroed memory
regardless of whether __GFP_ZERO is specified. So this fixes it by
unconditionally zeroing the allocated memory region.
Alternatively, we could fix dma_alloc_from_contiguous() to return zeroed
out memory and remove memset() from all caller of it. But we can't simply
remove the memset on arm because __dma_clear_buffer() is used there for
ensuring cache flushing and it is used in many places. Of course we can
do redundant memset in dma_alloc_from_contiguous(), but I think this patch
is less impact for fixing this problem.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull core irq updates from Thomas Gleixner:
"The irq department delivers:
- Another tree wide update to get rid of the horrible create_irq
interface along with its even more horrible variants. That also
gets rid of the last leftovers of the initial sparse irq hackery.
arch/driver specific changes have been either acked or ignored.
- A fix for the spurious interrupt detection logic with threaded
interrupts.
- A new ARM SoC interrupt controller
- The usual pile of fixes and improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
Documentation: brcmstb-l2: Add Broadcom STB Level-2 interrupt controller binding
irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller
genirq: Improve documentation to match current implementation
ARM: iop13xx: fix msi support with sparse IRQ
genirq: Provide !SMP stub for irq_set_affinity_notifier()
irqchip: armada-370-xp: Move the devicetree binding documentation
irqchip: gic: Use mask field in GICC_IAR
genirq: Remove dynamic_irq mess
ia64: Use irq_init_desc
genirq: Replace dynamic_irq_init/cleanup
genirq: Remove irq_reserve_irq[s]
genirq: Replace reserve_irqs in core code
s390: Avoid call to irq_reserve_irqs()
s390: Remove pointless arch_show_interrupts()
s390: pci: Check return value of alloc_irq_desc() proper
sh: intc: Remove pointless irq_reserve_irqs() invocation
x86, irq: Remove pointless irq_reserve_irqs() call
genirq: Make create/destroy_irq() ia64 private
tile: Use SPARSE_IRQ
tile: pci: Use irq_alloc/free_hwirq()
...
Pull DeviceTree updates from Rob Herring:
- Another round of clean-up of FDT related code in architecture code.
This removes knowledge of internal FDT details from most
architectures except powerpc.
- Conversion of kernel's custom FDT parsing code to use libfdt.
- DT based initialization for generic serial earlycon. The
introduction of generic serial earlycon support went in through the
tty tree.
- Improve the platform device naming for DT probed devices to ensure
unique naming and use parent names instead of a global index.
- Fix a race condition in of_update_property.
- Unify the various linker section OF match tables and fix several
function prototype errors.
- Update platform_get_irq_byname to work in deferred probe cases.
- 2 binding doc updates
* tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (58 commits)
of: handle NULL node in next_child iterators
of/irq: provide more wrappers for !CONFIG_OF
devicetree: bindings: Document micrel vendor prefix
dt: bindings: dwc2: fix required value for the phy-names property
of_pci_irq: kill useless variable in of_irq_parse_pci()
of/irq: do irq resolution in platform_get_irq_byname()
of: Add a testcase for of_find_node_by_path()
of: Make of_find_node_by_path() handle /aliases
of: Create unlocked version of for_each_child_of_node()
lib: add glibc style strchrnul() variant
of: Handle memory@0 node on PPC32 only
pci/of: Remove dead code
of: fix race between search and remove in of_update_property()
of: Use NULL for pointers
of: Stop naming platform_device using dcr address
of: Ensure unique names without sacrificing determinism
tty/serial: pl011: add DT based earlycon support
of/fdt: add FDT serial scanning for earlycon
of/fdt: add FDT address translation support
serial: earlycon: add DT support
...
Pull KVM updates from Paolo Bonzini:
"At over 200 commits, covering almost all supported architectures, this
was a pretty active cycle for KVM. Changes include:
- a lot of s390 changes: optimizations, support for migration, GDB
support and more
- ARM changes are pretty small: support for the PSCI 0.2 hypercall
interface on both the guest and the host (the latter acked by
Catalin)
- initial POWER8 and little-endian host support
- support for running u-boot on embedded POWER targets
- pretty large changes to MIPS too, completing the userspace
interface and improving the handling of virtualized timer hardware
- for x86, a larger set of changes is scheduled for 3.17. Still, we
have a few emulator bugfixes and support for running nested
fully-virtualized Xen guests (para-virtualized Xen guests have
always worked). And some optimizations too.
The only missing architecture here is ia64. It's not a coincidence
that support for KVM on ia64 is scheduled for removal in 3.17"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits)
KVM: add missing cleanup_srcu_struct
KVM: PPC: Book3S PR: Rework SLB switching code
KVM: PPC: Book3S PR: Use SLB entry 0
KVM: PPC: Book3S HV: Fix machine check delivery to guest
KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs
KVM: PPC: Book3S HV: Make sure we don't miss dirty pages
KVM: PPC: Book3S HV: Fix dirty map for hugepages
KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address
KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates()
KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number
KVM: PPC: Book3S: Add ONE_REG register names that were missed
KVM: PPC: Add CAP to indicate hcall fixes
KVM: PPC: MPIC: Reset IRQ source private members
KVM: PPC: Graciously fail broken LE hypercalls
PPC: ePAPR: Fix hypercall on LE guest
KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler
KVM: PPC: BOOK3S: Always use the saved DAR value
PPC: KVM: Make NX bit available with magic page
KVM: PPC: Disable NX for old magic page using guests
KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest
...
check_irq_vectors_for_cpu_disable() can overestimate the number of
available interrupt vectors, so the check for cpu down succeeds, but
the actual cpu removal fails.
It iterates from FIRST_EXTERNAL_VECTOR to NR_VECTORS, which is wrong
because the systems vectors are not taken into account.
Limit the search to first_system_vector instead of NR_VECTORS.
The second indicator for vector availability the used_vectors bitmap
is not taken into account at all. So system vectors,
e.g. IA32_SYSCALL_VECTOR (0x80) and IRQ_MOVE_CLEANUP_VECTOR (0x20),
are accounted as available.
Add a check for the used_vectors bitmap and do not account vectors
which are marked there.
[ tglx: Simplified code. Rewrote changelog and code comments. ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Elliott, Robert (Server Storage)" <Elliott@hp.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1400160305-17774-2-git-send-email-prarit@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I just went over this when looking at some Xen-related ftrace initialization
problems. They were related to Xen code that is not upstream but this clean up
would make sense here.
I think that this was already the intention when text_ip_addr() was introduced
in the commit 87fbb2ac60 (ftrace/x86: Use breakpoints for converting
function graph caller). Anyway, better do it now before it shots people into
their leg ;-)
Link: http://lkml.kernel.org/p/1401812601-2359-1-git-send-email-pmladek@suse.cz
Signed-off-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull x86/UV changes from Ingo Molnar:
"Continued updates for SGI UV 3 hardware support"
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/UV: Fix conditional in gru_exit()
x86/UV: Set n_lshift based on GAM_GR_CONFIG MMR for UV3
Pull x86 RAS changes from Ingo Molnar:
"Improve mcheck device initialization and bootstrap robustness"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mce: Panic when a core has reached a timeout
x86/mce: Improve mcheck_init_device() error handling
Pull x86 IOSF platform updates from Ingo Molnar:
"IOSF (Intel OnChip System Fabric) updates:
- generalize the IOSF interface to allow mixed mode drivers: non-IOSF
drivers to utilize of IOSF features on IOSF platforms.
- add 'Quark X1000' IOSF/MBI support
- clean up BayTrail and Quark PCI ID enumeration"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, iosf: Add PCI ID macros for better readability
x86, iosf: Add Quark X1000 PCI ID
x86, iosf: Added Quark MBI identifiers
x86, iosf: Make IOSF driver modular and usable by more drivers
Pull x86 microcode changes from Ingo Molnar:
"A microcode-debugging boot flag plus related refactoring"
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode: Add a disable chicken bit
x86, boot: Carve out early cmdline parsing function
Pull scheduler updates from Ingo Molnar:
"The main scheduling related changes in this cycle were:
- various sched/numa updates, for better performance
- tree wide cleanup of open coded nice levels
- nohz fix related to rq->nr_running use
- cpuidle changes and continued consolidation to improve the
kernel/sched/idle.c high level idle scheduling logic. As part of
this effort I pulled cpuidle driver changes from Rafael as well.
- standardized idle polling amongst architectures
- continued work on preparing better power/energy aware scheduling
- sched/rt updates
- misc fixlets and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
sched/numa: Decay ->wakee_flips instead of zeroing
sched/numa: Update migrate_improves/degrades_locality()
sched/numa: Allow task switch if load imbalance improves
sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code
sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice()
sched: Initialize rq->age_stamp on processor start
sched, nohz: Change rq->nr_running to always use wrappers
sched: Fix the rq->next_balance logic in rebalance_domains() and idle_balance()
sched: Use clamp() and clamp_val() to make sys_nice() more readable
sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups()
sched/numa: Fix initialization of sched_domain_topology for NUMA
sched: Call select_idle_sibling() when not affine_sd
sched: Simplify return logic in sched_read_attr()
sched: Simplify return logic in sched_copy_attr()
sched: Fix exec_start/task_hot on migrated tasks
arm64: Remove TIF_POLLING_NRFLAG
metag: Remove TIF_POLLING_NRFLAG
sched/idle: Make cpuidle_idle_call() void
sched/idle: Reflow cpuidle_idle_call()
sched/idle: Delay clearing the polling bit
...
Pull perf updates from Ingo Molnar:
"The tooling changes maintained by Jiri Olsa until Arnaldo is on
vacation:
User visible changes:
- Add -F option for specifying output fields (Namhyung Kim)
- Propagate exit status of a command line workload for record command
(Namhyung Kim)
- Use tid for finding thread (Namhyung Kim)
- Clarify the output of perf sched map plus small sched command
fixes (Dongsheng Yang)
- Wire up perf_regs and unwind support for ARM64 (Jean Pihet)
- Factor hists statistics counts processing which in turn also fixes
several bugs in TUI report command (Namhyung Kim)
- Add --percentage option to control absolute/relative percentage
output (Namhyung Kim)
- Add --list-cmds to 'kmem', 'mem', 'lock' and 'sched', for use by
completion scripts (Ramkumar Ramachandra)
Development/infrastructure changes and fixes:
- Android related fixes for pager and map dso resolving (Michael
Lentine)
- Add libdw DWARF post unwind support for ARM (Jean Pihet)
- Consolidate types.h for ARM and ARM64 (Jean Pihet)
- Fix possible null pointer dereference in session.c (Masanari Iida)
- Cleanup, remove unused variables in map_switch_event() (Dongsheng
Yang)
- Remove nr_state_machine_bugs in perf latency (Dongsheng Yang)
- Remove usage of trace_sched_wakeup(.success) (Peter Zijlstra)
- Cleanups for perf.h header (Jiri Olsa)
- Consolidate types.h and export.h within tools (Borislav Petkov)
- Move u64_swap union to its single user's header, evsel.h (Borislav
Petkov)
- Fix for s390 to properly parse tracepoints plus test code
(Alexander Yarygin)
- Handle EINTR error for readn/writen (Namhyung Kim)
- Add a test case for hists filtering (Namhyung Kim)
- Share map_groups among threads of the same group (Arnaldo Carvalho
de Melo, Jiri Olsa)
- Making some code (cpu node map and report parse callchain callback)
global to be usable by upcomming changes (Don Zickus)
- Fix pmu object compilation error (Jiri Olsa)
Kernel side changes:
- intrusive uprobes fixes from Oleg Nesterov. Since the interface is
admin-only, and the bug only affects user-space ("any probed
jmp/call can kill the application"), we queued these fixes via the
development tree, as a special exception.
- more fuzzer motivated race fixes and related refactoring and
robustization.
- allow PMU drivers to be built as modules. (No actual module yet,
because the x86 Intel uncore module wasn't ready in time for this)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
perf tools: Add automatic remapping of Android libraries
perf tools: Add cat as fallback pager
perf tests: Add a testcase for histogram output sorting
perf tests: Factor out print_hists_*()
perf tools: Introduce reset_output_field()
perf tools: Get rid of obsolete hist_entry__sort_list
perf hists: Reset width of output fields with header length
perf tools: Skip elided sort entries
perf top: Add --fields option to specify output fields
perf report/tui: Fix a bug when --fields/sort is given
perf tools: Add ->sort() member to struct sort_entry
perf report: Add -F option to specify output fields
perf tools: Call perf_hpp__init() before setting up GUI browsers
perf tools: Consolidate management of default sort orders
perf tools: Allow hpp fields to be sort keys
perf ui: Get rid of callback from __hpp__fmt()
perf tools: Consolidate output field handling to hpp format routines
perf tools: Use hpp formats to sort final output
perf tools: Support event grouping in hpp ->sort()
perf tools: Use hpp formats to sort hist entries
...
This patch cleans up some code in xstate offsets computation in xsave
area:
1. It changes xstate_comp_offsets as an array. This avoids possible NULL pointer
caused by possible kmalloc() failure during boot time.
2. It changes the global variable xstate_comp_sizes to a local variable because
it is used only in setup_xstate_comp().
3. It adds missing offsets for FP and SSE in xsave area.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1401387164-43416-17-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
There is very little and maybe practically nothing we can do to recover
from a system where at least one core has reached a timeout during the
whole monarch cores gathering. So panic when that happens.
Link: http://lkml.kernel.org/r/20140523091041.GA21332@pd.tnic
Signed-off-by: Borislav Petkov <bp@suse.de>
In standard form, each state is saved in the xsave area in fixed offset.
But in compacted form, offset of each saved state only can be calculated during
run time because some xstates may not be enabled and saved.
We define kernel API get_xsave_addr() returns address of a given state saved in a xsave area.
It can be called in kernel to get address of each xstate in xsave area in
either standard format or compacted format.
It's useful when kernel wants to directly access each state in xsave area.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1401387164-43416-17-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch adds a kernel parameter noxsaves to disable xsaves/xrstors feature.
The kernel will fall back to use xsaveopt and xrstor to save and restor
xstates. By using this parameter, xsave area occupies more memory because
standard form of xsave area in xsaveopt/xrstor occupies more memory than
compacted form of xsave area.
This patch adds a description of the kernel parameter noxsaveopt in doc.
The code to support the parameter noxsaveopt has been in the kernel before.
This patch just adds the description of this parameter in the doc.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1401387164-43416-4-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Detect the xsaveopt, xsavec, xgetbv, and xsaves features in processor extended
state enumberation sub-leaf (eax=0x0d, ecx=1):
Bit 00: XSAVEOPT is available
Bit 01: Supports XSAVEC and the compacted form of XRSTOR if set
Bit 02: Supports XGETBV with ECX = 1 if set
Bit 03: Supports XSAVES/XRSTORS and IA32_XSS if set
The above features are defined in the new word 10 in cpu features.
The IA32_XSS MSR (index DA0H) contains a state-component bitmap that specifies
the state components that software has enabled xsaves and xrstors to manage.
If the bit corresponding to a state component is clear in XCR0 | IA32_XSS,
xsaves and xrstors will not operate on that state component, regardless of
the value of the instruction mask.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1401387164-43416-3-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Pull perf fixes from Ingo Molnar:
"The biggest changes are fixes for races that kept triggering Trinity
crashes, plus liblockdep build fixes and smaller misc fixes.
The liblockdep bits in perf/urgent are a pull mistake - they should
have been in locking/urgent - but by the time I noticed other commits
were added and testing was done :-/ Sorry about that"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()
perf: Prevent false warning in perf_swevent_add
perf: Limit perf_event_attr::sample_period to 63 bits
tools/liblockdep: Remove all build files when doing make clean
tools/liblockdep: Build liblockdep from tools/Makefile
perf/x86/intel: Fix Silvermont's event constraints
perf: Fix perf_event_init_context()
perf: Fix race in removing an event
Print the AGP bridge info the same way as the rest of the kernel, e.g.,
"0000:00:04.0" instead of "00:04:00".
Also print the AGP aperture address range the same way we print resources,
and label it explicitly as a bus address range.
No functional change except the message changes.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Replace printk() with pr_info(), pr_err(), etc. Define pr_fmt() to prefix
output with "AGP: ".
No functional change except the addition of "AGP: " prefix in dmesg output.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
I noticed on some of my systems that page fault tracing doesn't
work:
cd /sys/kernel/debug/tracing
echo 1 > events/exceptions/enable
cat trace;
# nothing shows up
I eventually traced it down to CONFIG_KVM_GUEST. At least in a
KVM VM, enabling that option breaks page fault tracing, and
disabling fixes it. I tried on some old kernels and this does
not appear to be a regression: it never worked.
There are two page-fault entry functions today. One when tracing
is on and another when it is off. The KVM code calls do_page_fault()
directly instead of calling the traced version:
> dotraplinkage void __kprobes
> do_async_page_fault(struct pt_regs *regs, unsigned long
> error_code)
> {
> enum ctx_state prev_state;
>
> switch (kvm_read_and_reset_pf_reason()) {
> default:
> do_page_fault(regs, error_code);
> break;
> case KVM_PV_REASON_PAGE_NOT_PRESENT:
I'm also having problems with the page fault tracing on bare
metal (same symptom of no trace output). I'm unsure if it's
related.
Steven had an alternative to this which has zero overhead when
tracing is off where this includes the standard noops even when
tracing is disabled. I'm unconvinced that the extra complexity
of his apporach:
http://lkml.kernel.org/r/20140508194508.561ed220@gandalf.local.home
is worth it, expecially considering that the KVM code is already
making page fault entry slower here. This solution is
dirt-simple.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge x86/espfix into x86/vdso, due to changes in the vdso setup code
that otherwise cause conflicts.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Merge in Linus' tree with:
fa81511bb0 x86-64, modify_ldt: Make support for 16-bit segments a runtime option
... reverted, to avoid a conflict. This commit is no longer necessary
with the proper fix in place.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Add a cmdline param which disables the microcode loader. This is useful
mostly in debugging situations where we want to turn off microcode
loading, both early from the initrd and late, as a means to be able to
rule out its influence on the machine.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1400525957-11525-3-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch fixes a bug in precise_store_data_hsw() whereby
it would set the data source memory level to the wrong value.
As per the the SDM Vol 3b Table 18-41 (Layout of Data Linear
Address Information in PEBS Record), when status bit 0 is set
this is a L1 hit, otherwise this is a L1 miss.
This patch encodes the memory level according to the specification.
In V2, we added the filtering on the store events.
Only the following events produce L1 information:
* MEM_UOPS_RETIRED.STLB_MISS_STORES
* MEM_UOPS_RETIRED.LOCK_STORES
* MEM_UOPS_RETIRED.SPLIT_STORES
* MEM_UOPS_RETIRED.ALL_STORES
Cc: mingo@elte.hu
Cc: acme@ghostprotocols.net
Cc: jolsa@redhat.com
Cc: jmario@redhat.com
Cc: ak@linux.intel.com
Tested-and-Reviewed-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140515155644.GA3884@quad
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is just a cleanup to get rid of the create/destroy_irq variants
which were designed in hell.
The long term solution for x86 is to switch over to irq domains and
cleanup the whole vector allocation mess.
The generic irq_alloc_hwirqs() interface deliberately prevents
multi-MSI vector allocation to further enforce the irq domain
conversion (aside of the desire to support ioapic hotplug).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20140507154334.482904047@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Checkin:
b3b42ac2cb x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
disabled 16-bit segments on 64-bit kernels due to an information
leak. However, it does seem that people are genuinely using Wine to
run old 16-bit Windows programs on Linux.
A proper fix for this ("espfix64") is coming in the upcoming merge
window, but as a temporary fix, create a sysctl to allow the
administrator to re-enable support for 16-bit segments.
It adds a "/proc/sys/abi/ldt16" sysctl that defaults to zero (off). If
you hit this issue and care about your old Windows program more than
you care about a kernel stack address information leak, you can do
echo 1 > /proc/sys/abi/ldt16
as root (add it to your startup scripts), and you should be ok.
The sysctl table is only added if you have COMPAT support enabled on
x86-64, but I assume anybody who runs old windows binaries very much
does that ;)
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/CA%2B55aFw9BPoD10U1LfHbOMpHWZkvJTkMcfCs9s3urPr1YyWBxw@mail.gmail.com
Cc: <stable@vger.kernel.org>
As the decision to what needs to be done (converting a call to the
ftrace_caller to ftrace_caller_regs or to convert from ftrace_caller_regs
to ftrace_caller) can easily be determined from the rec->flags of
FTRACE_FL_REGS and FTRACE_FL_REGS_EN, there's no need to have the
ftrace_check_record() return either a UPDATE_MODIFY_CALL_REGS or a
UPDATE_MODIFY_CALL. Just he latter is enough. This added flag causes
more complexity than is required. Remove it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>