android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Michael Opdenacker	9d71cee667	x86/xen: remove deprecated IRQF_DISABLED This patch proposes to remove the IRQF_DISABLED flag from x86/xen code. It's a NOOP since 2.6.35 and it will be removed one day. Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-11-06 15:31:01 -05:00
Oleg Nesterov	8a8de66c4f	uprobes: Introduce arch_uprobe->ixol Currently xol_get_insn_slot() assumes that we should simply copy arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn) just the copy of the original insn. This is not true for arm which needs to create another insn to execute it out-of-line. So this patch simply adds the new member, ->ixol into the union. This doesn't make any difference for x86 and powerpc, but arm can divorce insn/ixol and initialize the correct xol insn in arch_uprobe_analyze_insn(). Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2013-11-06 20:00:05 +01:00
David A. Long	3820b4d278	uprobes: Move function declarations out of arch Move the function declarations from the arch headers to the common header, since only the function bodies are architecture-specific. These changes are from Vincent Rabin's uprobes patch. [ oleg: update arch/powerpc/include/asm/uprobes.h ] Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: David A. Long <dave.long@linaro.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2013-11-06 19:59:37 +01:00
H. Peter Anvin	4c08edd305	x86, hyperv: Move a variable to avoid an unused variable warning The variable hv_lapic_frequency causes an unused variable warning if CONFIG_X86_LOCAL_APIC is disabled. Since the variable is only used inside a small if statement, move the declaration of that variable into the if statement itself. Cc: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1381444224-3303-1-git-send-email-kys@microsoft.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-06 10:02:05 -08:00
John Stultz	1ca7d67cf5	seqcount: Add lockdep functionality to seqcount/seqlock structures Currently seqlocks and seqcounts don't support lockdep. After running across a seqcount related deadlock in the timekeeping code, I used a less-refined and more focused variant of this patch to narrow down the cause of the issue. This is a first-pass attempt to properly enable lockdep functionality on seqlocks and seqcounts. Since seqcounts are used in the vdso gettimeofday code, I've provided non-lockdep accessors for those needs. I've also handled one case where there were nested seqlock writers and there may be more edge cases. Comments and feedback would be appreciated! Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-3-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:40:26 +01:00
Yan, Zheng	f891d8cfb8	perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX device. For PCI bus without UBOX device, we find the next bus that has UBOX device and use its 'bus to socket' mapping. Besides the counter/control registers in IRP boxes are not properly aligned. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: "Yan Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:34:31 +01:00
Yan, Zheng	d1e8f4a836	perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes The encoding for filter registers of IvyBridge-EP uncore QPI boxes is completely the same as SandyBridge-EP. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: "Yan Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:34:29 +01:00
Peter Zijlstra	0a196848ca	perf: Fix arch_perf_out_copy_user default The arch_perf_output_copy_user() default of __copy_from_user_inatomic() returns bytes not copied, while all other argument functions given DEFINE_OUTPUT_COPY() return bytes copied. Since copy_from_user_nmi() is the odd duck out by returning bytes copied where all other copy_{to,from} functions return bytes not copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes not copied. Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied while expecting its worker functions to return bytes copied. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: will.deacon@arm.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:34:25 +01:00
Josh Triplett	a890b6fefd	kvm: Delete prototype for non-existent function kvm_check_iopl The prototype for kvm_check_iopl appeared in commit `f850e2e603` ("KVM: x86 emulator: Check IOPL level during io instruction emulation"), but the function never actually existed. Remove the prototype. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 11:07:09 +02:00
Josh Triplett	35a5121b58	kvm: Delete prototype for non-existent function complete_pio complete_pio ceased to exist in commit `7972995b0c` ("KVM: x86 emulator: Move string pio emulation into emulator.c"), but the prototype remained. Remove its prototype. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 11:06:32 +02:00
Marcelo Tosatti	8b414521bc	hung_task: add method to reset detector In certain occasions it is possible for a hung task detector positive to be false: continuation from a paused VM, for example. Add a method to reset detection, similar as is done with other kernel watchdogs. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 09:49:02 +02:00
Marcelo Tosatti	d63285e94a	pvclock: detect watchdog reset at pvclock read Implement reset of kernel watchdogs at pvclock read time. This avoids adding special code to every watchdog. This is possible for watchdogs which measure time based on sched_clock() or ktime_get() variants. Suggested by Don Zickus. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 09:48:43 +02:00
Michael S. Tsirkin	01b71917b5	kvm: optimize out smp_mb after srcu_read_unlock I noticed that srcu_read_lock/unlock both have a memory barrier, so just by moving srcu_read_unlock earlier we can get rid of one call to smp_mb() using smp_mb__after_srcu_read_unlock instead. Unsurprisingly, the gain is small but measureable using the unit test microbenchmark: before vmcall in the ballpark of 1410 cycles after vmcall in the ballpark of 1360 cycles Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 09:32:31 +02:00
Josh Boyer	b53b5eda81	x86/cpu: Increase max CPU count to 8192 The MAXSMP option is intended to enable silly large numbers of CPUs for testing purposes. The current value of 4096 isn't very silly any longer as there are actual SGI machines that approach 6096 CPUs when taking HT into account. Increase the value to a nice round 8192 to account for this and allow for short term future increases. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: prarit@redhat.com Cc: Russ Anderson <rja@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131105143816.GK9944@hansolo.jdub.homelinux.org [ Tweaked it so that MAXSMP simply sets the maximum of the normal range. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 08:16:34 +01:00
Josh Boyer	bb61ccc7db	x86/cpu: Allow higher NR_CPUS values The current range for SMP configs is 2 - 512 CPUs, or a full 4096 in the case of MAXSMP. There are machines that have 1024 CPUs in them today and configuring a kernel for that means you are forced to set MAXSMP. This adds additional unnecessary overhead. While that overhead might be considered tiny for large machines, it isn't necessarily so if you are building a kernel that runs across a wide variety of machines. To cover the range of more common machines today, we allow NR_CPUS to be up to 4096 when CPUMASK_OFFSTACK is enabled. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: prarit@redhat.com Cc: Russ Anderson <rja@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131105143728.GJ9944@hansolo.jdub.homelinux.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 08:16:28 +01:00
HATAYAMA Daisuke	a477c8594b	x86/cpu: Always print SMP information in /proc/cpuinfo Currently show_cpuinfo_core() displays cpu core information only if the number of threads per a whole cores is 2 or larger. However, this condition doesn't care about the number of sockets. For example, this condition doesn't hold on systems with two logical cpus consisting of two sockets and a single core on each socket - yet the topology information would be interesting to see in that case as well. I don't know whether or not there are processors in real world by which such configurations are possible, but at least on vitual machine environments, such configuration can occur, typically when no explicit SMP information is provided in advance. For example, on qemu/KVM, SMP information is specified via -smp command-line option, more specifically, its syntax is: -smp n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus] If this is not specified, qemu tells configuration with n-sockets, 1-core and 1-thread to the guest machine, on which guest, MP information is not displayed in /proc/cpuinfo. I saw this situation on VMWare guest environment, too. To fix this issue, this patch simply removes the condition because this information is useful even if there's only 1 thread. Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/5277D644.4090707@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 08:13:56 +01:00
Ingo Molnar	c90423d1de	Merge branch 'sched/core' into core/locking, to prepare the kernel/locking/ file move Conflicts: kernel/Makefile There are conflicts in kernel/Makefile due to file moving in the scheduler tree - resolve them. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 07:50:37 +01:00
Ingo Molnar	164777530b	Merge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 06:50:21 +01:00
Ingo Molnar	97c53b402f	Merge tag 'v3.12' into core/locking to pick up mutex upates Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 06:39:45 +01:00
Kevin Hao	ab4ead02ec	ftrace/x86: skip over the breakpoint for ftrace caller In commit `8a4d0a687a` "ftrace: Use breakpoint method to update ftrace caller", we choose to use breakpoint method to update the ftrace caller. But we also need to skip over the breakpoint in function ftrace_int3_handler() for them. Otherwise weird things would happen. Cc: stable@vger.kernel.org # 3.5+ Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2013-11-05 16:01:47 -05:00
Gleb Natapov	a9d4e4393b	KVM: x86: trace cpuid emulation when called from emulator Currently cpuid emulation is traced only when executed by intercept. Move trace point so that emulator invocation is traced too. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-05 09:11:40 +02:00
Gleb Natapov	6d4d85ec56	KVM: emulator: cleanup decode_register_operand() a bit Make code shorter. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-05 09:11:30 +02:00
Gleb Natapov	aa9ac1a632	KVM: emulator: check rex prefix inside decode_register() All decode_register() callers check if instruction has rex prefix to properly decode one byte operand. It make sense to move the check inside. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-05 09:11:18 +02:00
Chen Gang	7f71be4c9f	x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to 86_defconfig The defconfig kernel can not run under neither fedora16 x86_64 laptop nor fedora17 x86_64 pc. After enable DEVTMPFS* in x86_64_defconfig, it will be OK. DEVTMPFS* is only related with software, so for i386_defconfig may also need them (at least, it has no negative effect for defconfig). Signed-off-by: Chen Gang <gang.chen@asianux.com> Link: http://lkml.kernel.org/r/52784DFF.8040004@asianux.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2013-11-04 20:01:55 -08:00
David S. Miller	394efd19d5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be.h drivers/net/netconsole.c net/bridge/br_private.h Three mostly trivial conflicts. The net/bridge/br_private.h conflict was a function signature (argument addition) change overlapping with the extern removals from Joe Perches. In drivers/net/netconsole.c we had one change adjusting a printk message whilst another changed "printk(KERN_INFO" into "pr_info(". Lastly, the emulex change was a new inline function addition overlapping with Joe Perches's extern removals. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-04 13:48:30 -05:00
Gleb Natapov	95f328d3ad	Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue Conflicts: arch/powerpc/include/asm/processor.h	2013-11-04 10:20:57 +02:00
Ingo Molnar	2a3ede8cb2	Merge branch 'perf/urgent' into perf/core to fix conflicts Conflicts: tools/perf/bench/numa.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-04 07:49:35 +01:00
Paolo Bonzini	daf727225b	KVM: x86: fix emulation of "movzbl %bpl, %eax" When I was looking at RHEL5.9's failure to start with unrestricted_guest=0/emulate_invalid_guest_state=1, I got it working with a slightly older tree than kvm.git. I now debugged the remaining failure, which was introduced by commit `660696d1` (KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x instructions, 2013-04-24) introduced a similar mis-emulation to the one in commit `8acb4207` (KVM: fix sil/dil/bpl/spl in the mod/rm fields, 2013-05-30). The incorrect decoding occurs in 8-bit movzx/movsx instructions whose 8-bit operand is sil/dil/bpl/spl. Needless to say, "movzbl %bpl, %eax" does occur in RHEL5.9's decompression prolog, just a handful of instructions before finally giving control to the decompressed vmlinux and getting out of the invalid guest state. Because OpMem8 bypasses decode_modrm, the same handling of the REX prefix must be applied to OpMem8. Reported-by: Michele Baldessari <michele@redhat.com> Cc: stable@vger.kernel.org Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-03 16:50:37 +02:00
Linus Torvalds	9581b7d268	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two fixes: - Fix 'NMI handler took too long to run' false positives [ Genuine NMI overhead speedups will come for v3.13, this commit only fixes a measurement bug ] - Fix perf ring-buffer missed barrier causing (rare) ring-buffer data corruption on ppc64" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix NMI measurements perf: Fix perf ring buffer memory ordering	2013-11-01 12:54:51 -07:00
Ingo Molnar	fb10d5b7ef	Merge branch 'linus' into sched/core Resolve cherry-picking conflicts: Conflicts: mm/huge_memory.c mm/memory.c mm/mprotect.c See this upstream merge commit for more details: `52469b4fcd` Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-01 08:24:41 +01:00
Paolo Bonzini	98f73630f9	KVM: x86: emulate SAHF instruction Yet another instruction that we fail to emulate, this time found in Windows 2008R2 32-bit. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 22:14:10 +01:00
Bjorn Helgaas	33de1b8bf6	Merge branch 'pci/misc' into next * pci/misc: PCI: Report pci_pme_active() kmalloc failure mn10300/PCI: Remove useless pcibios_last_bus frv/PCI: Remove pcibios_last_bus PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0 x86/PCI: Coalesce multiple overlapping host bridge windows MAINTAINERS: Add arch/x86/pci to PCI file patterns PCI/PM: Remove pci_pm_complete() PCI: Add pci_dev_show_local_cpu() to simplify code mn10300/PCI: Remove unused pci_mem_start cris/PCI: Remove unused pci_mem_start PCI: Make pci_dev_pm_ops static Conflicts: drivers/pci/pci-sysfs.c	2013-10-31 14:12:40 -06:00
Lv Zheng	40bce100ca	ACPICA: Cleanup asmlinkage for ACPICA APIs. Add an asmlinkage wrapper around acpi_enter_sleep_state() to prevent an empty stub from being called by assmebly code for ACPI_REDUCED_HARDWARE set. As arch/x86/kernel/acpi/wakeup_xx.S is only compiled when CONFIG_ACPI=y and there are no users of ACPI_HARDWARE_REDUCED, currently this is in fact not a real issue, but a cleanup to reduce source code differences between Linux and ACPICA upstream. [rjw: Changelog] Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-10-31 14:37:35 +01:00
Michael S. Tsirkin	6026620475	kvm/vmx: error message typo fix mst can't be blamed for lack of switch entries: the issue is with msrs actually. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 11:55:15 +01:00
Paolo Bonzini	c67a04cb9a	KVM: x86: fix KVM_SET_XCRS loop The loop was always using 0 as the index. This means that any rubbish after the first element of the array went undetected. It seems reasonable to assume that no KVM userspace did that. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 11:31:19 +01:00
Paolo Bonzini	46c34cb059	KVM: x86: fix KVM_SET_XCRS for CPUs that do not support XSAVE The KVM_SET_XCRS ioctl must accept anything that KVM_GET_XCRS could return. XCR0's bit 0 is always 1 in real processors with XSAVE, and KVM_GET_XCRS will always leave bit 0 set even if the emulated processor does not have XSAVE. So, KVM_SET_XCRS must ignore that bit when checking for attempts to enable unsupported save states. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 11:30:46 +01:00
David Rientjes	d68ce0177c	x86, hyperv: Fix build error due to missing <asm/apic.h> include `9e7827b5ea` ("x86, hyperv: Get the local APIC timer frequency from the hypervisor") breaks the build with some configs because apic.h isn't directly included: arch/x86/kernel/cpu/mshyperv.c: In function 'ms_hyperv_init_platform': arch/x86/kernel/cpu/mshyperv.c:90:3: error: 'lapic_timer_frequency' undeclared (first use in this function) arch/x86/kernel/cpu/mshyperv.c:90:3: note: each undeclared identifier is reported only once for each function it appears in Fix it by including asm/apic.h. Signed-off-by: David Rientjes <rientjes@google.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1310111604160.31170@chino.kir.corp.google.com Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-10-30 15:37:47 -07:00
Linus Torvalds	12aee278b5	Merge branch 'akpm' (fixes from Andrew Morton) Merge three fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting percpu: fix this_cpu_sub() subtrahend casting for unsigneds mm/pagewalk.c: fix walk_page_range() access of wrong PTEs	2013-10-30 14:27:10 -07:00
Greg Thelen	bd09d9a351	percpu: fix this_cpu_sub() subtrahend casting for unsigneds this_cpu_sub() is implemented as negation and addition. This patch casts the adjustment to the counter type before negation to sign extend the adjustment. This helps in cases where the counter type is wider than an unsigned adjustment. An alternative to this patch is to declare such operations unsupported, but it seemed useful to avoid surprises. This patch specifically helps the following example: unsigned int delta = 1 preempt_disable() this_cpu_write(long_counter, 0) this_cpu_sub(long_counter, delta) preempt_enable() Before this change long_counter on a 64 bit machine ends with value 0xffffffff, rather than 0xffffffffffffffff. This is because this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta), which is basically: long_counter = 0 + 0xffffffff Also apply the same cast to: __this_cpu_sub() __this_cpu_sub_return() this_cpu_sub_return() All percpu_test.ko passes, especially the following cases which previously failed: l -= ui_one; __this_cpu_sub(long_counter, ui_one); CHECK(l, long_counter, -1); l -= ui_one; this_cpu_sub(long_counter, ui_one); CHECK(l, long_counter, -1); CHECK(l, long_counter, 0xffffffffffffffff); ul -= ui_one; __this_cpu_sub(ulong_counter, ui_one); CHECK(ul, ulong_counter, -1); CHECK(ul, ulong_counter, 0xffffffffffffffff); ul = this_cpu_sub_return(ulong_counter, ui_one); CHECK(ul, ulong_counter, 2); ul = __this_cpu_sub_return(ulong_counter, ui_one); CHECK(ul, ulong_counter, 1); Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-10-30 14:27:03 -07:00
Alex Williamson	e0f0bbc527	kvm: Create non-coherent DMA registeration We currently use some ad-hoc arch variables tied to legacy KVM device assignment to manage emulation of instructions that depend on whether non-coherent DMA is present. Create an interface for this, adapting legacy KVM device assignment and adding VFIO via the KVM-VFIO device. For now we assume that non-coherent DMA is possible any time we have a VFIO group. Eventually an interface can be developed as part of the VFIO external user interface to query the coherency of a group. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 19:02:23 +01:00
Alex Williamson	d96eb2c6f4	kvm/x86: Convert iommu_flags to iommu_noncoherent Default to operating in coherent mode. This simplifies the logic when we switch to a model of registering and unregistering noncoherent I/O with KVM. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 19:02:13 +01:00
Alex Williamson	ec53500fae	kvm: Add VFIO device So far we've succeeded at making KVM and VFIO mostly unaware of each other, but areas are cropping up where a connection beyond eventfds and irqfds needs to be made. This patch introduces a KVM-VFIO device that is meant to be a gateway for such interaction. The user creates the device and can add and remove VFIO groups to it via file descriptors. When a group is added, KVM verifies the group is valid and gets a reference to it via the VFIO external user interface. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 19:02:03 +01:00
Borislav Petkov	84cffe499b	kvm: Emulate MOVBE This basically came from the need to be able to boot 32-bit Atom SMP guests on an AMD host, i.e. a host which doesn't support MOVBE. As a matter of fact, qemu has since recently received MOVBE support but we cannot share that with kvm emulation and thus we have to do this in the host. We're waay faster in kvm anyway. :-) So, we piggyback on the #UD path and emulate the MOVBE functionality. With it, an 8-core SMP guest boots in under 6 seconds. Also, requesting MOVBE emulation needs to happen explicitly to work, i.e. qemu -cpu n270,+movbe... Just FYI, a fairly straight-forward boot of a MOVBE-enabled 3.9-rc6+ kernel in kvm executes MOVBE ~60K times. Signed-off-by: Andre Przywara <andre@andrep.de> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:41 +01:00
Borislav Petkov	0bc5eedb82	kvm, emulator: Add initial three-byte insns support Add initial support for handling three-byte instructions in the emulator. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:41 +01:00
Borislav Petkov	b51e974fcd	kvm, emulator: Rename VendorSpecific flag Call it EmulateOnUD which is exactly what we're trying to do with vendor-specific instructions. Rename ->only_vendor_specific_insn to something shorter, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:40 +01:00
Borislav Petkov	1ce19dc16c	kvm, emulator: Use opcode length Add a field to the current emulation context which contains the instruction opcode length. This will streamline handling of opcodes of different length. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:39 +01:00
Borislav Petkov	9c15bb1d0a	kvm: Add KVM_GET_EMULATED_CPUID Add a kvm ioctl which states which system functionality kvm emulates. The format used is that of CPUID and we return the corresponding CPUID bits set for which we do emulate functionality. Make sure ->padding is being passed on clean from userspace so that we can use it for something in the future, after the ioctl gets cast in stone. s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:39 +01:00
Tim Gardner	d780a31271	KVM: Fix modprobe failure for kvm_intel/kvm_amd The x86 specific kvm init creates a new conflicting debugfs directory which causes modprobe issues with kvm_intel and kvm_amd. For example, sudo modprobe kvm_amd modprobe: ERROR: could not insert 'kvm_amd': Bad address The simplest fix is to just rename the directory. The following KVM config options are set: CONFIG_KVM_GUEST=y CONFIG_KVM_DEBUG_FS=y CONFIG_HAVE_KVM=y CONFIG_HAVE_KVM_IRQCHIP=y CONFIG_HAVE_KVM_IRQ_ROUTING=y CONFIG_HAVE_KVM_EVENTFD=y CONFIG_KVM_APIC_ARCHITECTURE=y CONFIG_KVM_MMIO=y CONFIG_KVM_ASYNC_PF=y CONFIG_HAVE_KVM_MSI=y CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y CONFIG_KVM=m CONFIG_KVM_INTEL=m CONFIG_KVM_AMD=m CONFIG_KVM_DEVICE_ASSIGNMENT=y Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> [Change debugfs directory name. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 12:10:42 +01:00
Peter Zijlstra	e00b12e64b	perf/x86: Further optimize copy_from_user_nmi() Now that we can deal with nested NMI due to IRET re-enabling NMIs and can deal with faults from NMI by making sure we preserve CR2 over NMIs we can in fact simply access user-space memory from NMI context. So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and rework the fault path to do the minimal required work before taking the in_atomic() fault handler. In particular avoid perf_sw_event() which would make perf recurse on itself (it should be harmless as our recursion protections should be able to deal with this -- but why tempt fate). Also rename notify_page_fault() to kprobes_fault() as that is a much better name; there is no notifier in it and its specific to kprobes. Don measured that his worst case NMI path shrunk from ~300K cycles to ~150K cycles. Cc: Stephane Eranian <eranian@google.com> Cc: jmario@redhat.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: dave.hansen@linux.intel.com Tested-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-29 12:02:54 +01:00
Peter Zijlstra	e8a923cc1f	perf/x86: Fix NMI measurements OK, so what I'm actually seeing on my WSM is that sched/clock.c is 'broken' for the purpose we're using it for. What triggered it is that my WSM-EP is broken :-( [ 0.001000] tsc: Fast TSC calibration using PIT [ 0.002000] tsc: Detected 2533.715 MHz processor [ 0.500180] TSC synchronization [CPU#0 -> CPU#6]: [ 0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock. [ 0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed For some reason it consistently detects TSC skew, even though NHM+ should have a single clock domain for 'reasonable' systems. This marks sched_clock_stable=0, which means that we do fancy stuff to try and get a 'sane' clock. Part of this fancy stuff relies on the tick, clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until it either wakes up or gets kicked by another cpu. While this is perfectly fine for the scheduler -- it only cares about actually running stuff, and when we're running stuff we're obviously not idle. This does somewhat break down for perf which can trigger events just fine on an otherwise idle cpu. So I've got NMIs get get 'measured' as taking ~1ms, which actually don't last nearly that long: <idle>-0 [013] d.h. 886.311970: rcu_nmi_enter <-do_nmi ... <idle>-0 [013] d.h. 886.311997: perf_sample_event_took: HERE!!! : 1040990 So ftrace (which uses sched_clock(), not the fancy bits) only sees ~27us, but we measure ~1ms !! Now since all this measurement stuff lives in x86 code, we can actually fix it. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mingo@kernel.org Cc: dave.hansen@linux.intel.com Cc: eranian@google.com Cc: Don Zickus <dzickus@redhat.com> Cc: jmario@redhat.com Cc: acme@infradead.org Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-29 12:01:20 +01:00

1 2 3 4 5 ...

18359 Commits