android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Michael Ellerman	b536da7c2d	powerpc/64s: Drop unused loc parameter to MASKABLE_EXCEPTION macros We pass the "loc" (location) parameter to MASKABLE_EXCEPTION and friends, but it's not used, so drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:38 +10:00
Michael Ellerman	0a55c24185	powerpc/64s: Remove PSERIES naming from the MASKABLE macros Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:38 +10:00
Michael Ellerman	6adc6e9c07	powerpc/64s: Drop _MASKABLE_RELON_EXCEPTION_PSERIES() _MASKABLE_RELON_EXCEPTION_PSERIES() does nothing useful, update all callers to use __MASKABLE_RELON_EXCEPTION_PSERIES() directly. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:37 +10:00
Michael Ellerman	9bf2877ac1	powerpc/64s: Drop _MASKABLE_EXCEPTION_PSERIES() _MASKABLE_EXCEPTION_PSERIES() does nothing useful, update all callers to use __MASKABLE_EXCEPTION_PSERIES() directly. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:37 +10:00
Michael Ellerman	bdf08e1da0	powerpc/64s: Rename EXCEPTION_PROLOG_PSERIES to EXCEPTION_PROLOG Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:36 +10:00
Michael Ellerman	270373f14f	powerpc/64s: Rename EXCEPTION_RELON_PROLOG_PSERIES To just EXCEPTION_RELON_PROLOG(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:36 +10:00
Michael Ellerman	6ebb939740	powerpc/64s: Rename EXCEPTION_RELON_PROLOG_PSERIES_1 The EXCEPTION_RELON_PROLOG_PSERIES_1() macro does the same job as EXCEPTION_PROLOG_2 (which we just recently created), except for "RELON" (relocation on) exceptions. So rename it as such. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:35 +10:00
Michael Ellerman	94f3cc8e36	powerpc/64s: Remove PSERIES from the NORI macros Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:35 +10:00
Michael Ellerman	cb58a4a4b3	powerpc/64s: Rename EXCEPTION_PROLOG_PSERIES_1 to EXCEPTION_PROLOG_2 As with the other patches in this series, we are removing the "PSERIES" from the name as it's no longer meaningful. In this case it's not simply a case of removing the "PSERIES" as that would result in a clash with the existing EXCEPTION_PROLOG_1. Instead we name this one EXCEPTION_PROLOG_2, as it's usually used in sequence after 0 and 1. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:35 +10:00
Michael Ellerman	b706f42362	powerpc/64s: Rename STD_RELON_EXCEPTION_PSERIES_OOL to STD_RELON_EXCEPTION_OOL Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:34 +10:00
Michael Ellerman	e42389c5f1	powerpc/64s: Rename STD_RELON_EXCEPTION_PSERIES to STD_RELON_EXCEPTION Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:34 +10:00
Michael Ellerman	75e8bef3d6	powerpc/64s: Rename STD_EXCEPTION_PSERIES_OOL to STD_EXCEPTION_OOL Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:33 +10:00
Michael Ellerman	e899fce509	powerpc/64s: Rename STD_EXCEPTION_PSERIES to STD_EXCEPTION The "PSERIES" in STD_EXCEPTION_PSERIES is to differentiate the macros from the legacy iSeries versions, which are called STD_EXCEPTION_ISERIES. It is not anything to do with pseries vs powernv or powermac etc. We removed the legacy iSeries code in 2012, in commit 8ee3e0d69623x ("powerpc: Remove the main legacy iSerie platform code"). So remove "PSERIES" from the macros. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:33 +10:00
Michael Ellerman	92b6d65c07	powerpc/64s: Move SET_SCRATCH0() into EXCEPTION_RELON_PROLOG_PSERIES() EXCEPTION_RELON_PROLOG_PSERIES() only has two users, STD_RELON_EXCEPTION_PSERIES() and STD_RELON_EXCEPTION_HV() both of which "call" SET_SCRATCH0(), so just move SET_SCRATCH0() into EXCEPTION_RELON_PROLOG_PSERIES(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:32 +10:00
Michael Ellerman	4a7a0a8444	powerpc/64s: Move SET_SCRATCH0() into EXCEPTION_PROLOG_PSERIES() EXCEPTION_PROLOG_PSERIES() only has two users, STD_EXCEPTION_PSERIES() and STD_EXCEPTION_HV() both of which "call" SET_SCRATCH0(), so just move SET_SCRATCH0() into EXCEPTION_PROLOG_PSERIES(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:31 +10:00
Darren Stevens	250a93501d	powerpc/pasemi: Search for PCI root bus by compatible property Pasemi arch code finds the root of the PCI-e bus by searching the device-tree for a node called 'pxp'. But the root bus has a compatible property of 'pasemi,rootbus' so search for that instead. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:31 +10:00
Christophe Leroy	9412b23450	powerpc/lib: Implement strlen() in assembly for PPC32 The generic implementation of strlen() reads strings byte per byte. This patch implements strlen() in assembly based on a read of entire words, in the same spirit as what some other arches and glibc do. On a 8xx the time spent in strlen is reduced by 3/4 for long strings. strlen() selftest on an 8xx provides the following values: Before the patch (ie with the generic strlen() in lib/string.c): len 256 : time = 1.195055 len 016 : time = 0.083745 len 008 : time = 0.046828 len 004 : time = 0.028390 After the patch: len 256 : time = 0.272185 ==> 78% improvment len 016 : time = 0.040632 ==> 51% improvment len 008 : time = 0.033060 ==> 29% improvment len 004 : time = 0.029149 ==> 2% degradation On a 832x: Before the patch: len 256 : time = 0.236125 len 016 : time = 0.018136 len 008 : time = 0.011000 len 004 : time = 0.007229 After the patch: len 256 : time = 0.094950 ==> 60% improvment len 016 : time = 0.013357 ==> 26% improvment len 008 : time = 0.010586 ==> 4% improvment len 004 : time = 0.008784 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:30 +10:00
Mahesh Salgaonkar	94675cceac	powerpc/pseries: Defer the logging of rtas error to irq work queue. rtas_log_buf is a buffer to hold RTAS event data that are communicated to kernel by hypervisor. This buffer is then used to pass RTAS event data to user through proc fs. This buffer is allocated from vmalloc (non-linear mapping) area. On Machine check interrupt, register r3 points to RTAS extended event log passed by hypervisor that contains the MCE event. The pseries machine check handler then logs this error into rtas_log_buf. The rtas_log_buf is a vmalloc-ed (non-linear) buffer we end up taking up a page fault (vector 0x300) while accessing it. Since machine check interrupt handler runs in NMI context we can not afford to take any page fault. Page faults are not honored in NMI context and causes kernel panic. Apart from that, as Nick pointed out, pSeries_log_error() also takes a spin_lock while logging error which is not safe in NMI context. It may endup in deadlock if we get another MCE before releasing the lock. Fix this by deferring the logging of rtas error to irq work queue. Current implementation uses two different buffers to hold rtas error log depending on whether extended log is provided or not. This makes bit difficult to identify which buffer has valid data that needs to logged later in irq work. Simplify this using single buffer, one per paca, and copy rtas log to it irrespective of whether extended log is provided or not. Allocate this buffer below RMA region so that it can be accessed in real mode mce handler. Fixes: `b96672dd84` ("powerpc: Machine check interrupt is a non-maskable interrupt") Cc: stable@vger.kernel.org # v4.14+ Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:29 +10:00
Mahesh Salgaonkar	74e96bf44f	powerpc/pseries: Avoid using the size greater than RTAS_ERROR_LOG_MAX. The global mce data buffer that used to copy rtas error log is of 2048 (RTAS_ERROR_LOG_MAX) bytes in size. Before the copy we read extended_log_length from rtas error log header, then use max of extended_log_length and RTAS_ERROR_LOG_MAX as a size of data to be copied. Ideally the platform (phyp) will never send extended error log with size > 2048. But if that happens, then we have a risk of buffer overrun and corruption. Fix this by using min_t instead. Fixes: `d368514c30` ("powerpc: Fix corruption when grabbing FWNMI data") Reported-by: Michal Suchanek <msuchanek@suse.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:28 +10:00
Benjamin Herrenschmidt	e27e0a9465	powerpc/xive: Remove xive_kexec_teardown_cpu() It's identical to xive_teardown_cpu() so just use the latter Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:28 +10:00
Benjamin Herrenschmidt	dbc5740247	powerpc/xive: Remove now useless pr_debug statements Those overly verbose statement in the setup of the pool VP aren't particularly useful (esp. considering we don't actually use the pool, we configure it bcs HW requires it only). So remove them which improves the code readability. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:27 +10:00
Nicholas Piggin	34c604d275	powerpc/64s: free page table caches at exit_mmap time The kernel page table caches are tied to init_mm, so there is no more need for them after userspace is finished. destroy_context() gets called when we drop the last reference for an mm, which can be much later than the task exit due to other lazy mm references to it. We can free the page table cache pages on task exit because they only cache the userspace page tables and kernel threads should not access user space addresses. The mapping for kernel threads itself is maintained in init_mm and page table cache for that is attached to init_mm. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Merge change log additions from Aneesh] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:27 +10:00
Nicholas Piggin	5a6099346c	powerpc/64s/radix: tlb do not flush on page size when fullmm When the mm is being torn down there will be a full PID flush so there is no need to flush the TLB on page size changes. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:26 +10:00
Michael Ellerman	7cd129b4b5	powerpc: Add a checkpatch wrapper with our preferred settings This makes it easy to run checkpatch with settings that I like. Usage is eg: $ ./arch/powerpc/tools/checkpatch.sh -g origin/master.. To check all commits since origin/master. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Russell Currey <ruscur@russell.cc>	2018-08-07 21:49:25 +10:00
Michael Ellerman	4da1f79227	powerpc/64: Disable irq restore warning for now We recently added a warning in arch_local_irq_restore() to check that the soft masking state matches reality. Unfortunately it trips in a few places, which are not entirely trivial to fix. The key problem is if we're doing function_graph tracing of restore_math(), the warning pops and then seems to recurse. It's not entirely clear because the system continuously oopses on all CPUs, with the output interleaved and unreadable. It's also been observed on a G5 coming out of idle. Until we can fix those cases disable the warning for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:24 +10:00
Martin Schwidefsky	26f843848b	s390: fix br_r1_trampoline for machines without exrl For machines without the exrl instruction the BFP jit generates code that uses an "br %r1" instruction located in the lowcore page. Unfortunately there is a cut & paste error that puts an additional "larl %r1,.+14" instruction in the code that clobbers the branch target address in %r1. Remove the larl instruction. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: `de5cb6eb51` ("s390: use expoline thunks in the BPF JIT") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-08-07 13:38:16 +02:00
Martin Schwidefsky	5eda25b102	s390/lib: use expoline for all bcr instructions The memove, memset, memcpy, __memset16, __memset32 and __memset64 function have an additional indirect return branch in form of a "bzr" instruction. These need to use expolines as well. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: `97489e0663` ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-08-07 13:38:13 +02:00
Thomas Gleixner	bc2d8d262c	cpu/hotplug: Fix SMT supported evaluation Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied on the kernel command line as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and makes the sysfs interface unusable. Rework this so that during bringup of the non boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitely right after the initial SMP bringup has finished. SMT evaulation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: `73d5e2b472` ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2018-08-07 12:25:30 +02:00
Ard Biesheuvel	22240df7ac	crypto: arm64/ghash-ce - implement 4-way aggregation Enhance the GHASH implementation that uses 64-bit polynomial multiplication by adding support for 4-way aggregation. This more than doubles the performance, from 2.4 cycles per byte to 1.1 cpb on Cortex-A53. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:51:40 +08:00
Ard Biesheuvel	8e492eff7d	crypto: arm64/ghash-ce - replace NEON yield check with block limit Checking the TIF_NEED_RESCHED flag is disproportionately costly on cores with fast crypto instructions and comparatively slow memory accesses. On algorithms such as GHASH, which executes at ~1 cycle per byte on cores that implement support for 64 bit polynomial multiplication, there is really no need to check the TIF_NEED_RESCHED particularly often, and so we can remove the NEON yield check from the assembler routines. However, unlike the AEAD or skcipher APIs, the shash/ahash APIs take arbitrary input lengths, and so there needs to be some sanity check to ensure that we don't hog the CPU for excessive amounts of time. So let's simply cap the maximum input size that is processed in one go to 64 KB. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:51:39 +08:00
Ondrej Mosnacek	877ccce7cb	crypto: x86/aegis,morus - Fix and simplify CPUID checks It turns out I had misunderstood how the x86_match_cpu() function works. It evaluates a logical OR of the matching conditions, not logical AND. This caused the CPU feature checks for AEGIS to pass even if only SSE2 (but not AES-NI) was supported (or vice versa), leading to potential crashes if something tried to use the registered algs. This patch switches the checks to a simpler method that is used e.g. in the Camellia x86 code. The patch also removes the MODULE_DEVICE_TABLE declarations which actually seem to cause the modules to be auto-loaded at boot, which is not desired. The crypto API on-demand module loading is sufficient. Fixes: `1d373d4e8e` ("crypto: x86 - Add optimized AEGIS implementations") Fixes: `6ecc9d9ff9` ("crypto: x86 - Add optimized MORUS implementations") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:51:15 +08:00
Ard Biesheuvel	30f1a9f53e	crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable Squeeze out another 5% of performance by minimizing the number of invocations of kernel_neon_begin()/kernel_neon_end() on the common path, which also allows some reloads of the key schedule to be optimized away. The resulting code runs at 2.3 cycles per byte on a Cortex-A53. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:38:04 +08:00
Ard Biesheuvel	e0bd888dc4	crypto: arm64/aes-ce-gcm - implement 2-way aggregation Implement a faster version of the GHASH transform which amortizes the reduction modulo the characteristic polynomial across two input blocks at a time. On a Cortex-A53, the gcm(aes) performance increases 24%, from 3.0 cycles per byte to 2.4 cpb for large input sizes. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:38:04 +08:00
Ard Biesheuvel	71e52c278c	crypto: arm64/aes-ce-gcm - operate on two input blocks at a time Update the core AES/GCM transform and the associated plumbing to operate on 2 AES/GHASH blocks at a time. By itself, this is not expected to result in a noticeable speedup, but it paves the way for reimplementing the GHASH component using 2-way aggregation. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:38:04 +08:00
Herbert Xu	3465893d27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Merge crypto-2.6 to pick up NEON yield revert.	2018-08-07 17:37:10 +08:00
Ard Biesheuvel	f10dc56c64	crypto: arm64 - revert NEON yield for fast AEAD implementations As it turns out, checking the TIF_NEED_RESCHED flag after each iteration results in a significant performance regression (~10%) when running fast algorithms (i.e., ones that use special instructions and operate in the < 4 cycles per byte range) on in-order cores with comparatively slow memory accesses such as the Cortex-A53. Given the speed of these ciphers, and the fact that the page based nature of the AEAD scatterwalk API guarantees that the core NEON transform is never invoked with more than a single page's worth of input, we can estimate the worst case duration of any resulting scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages, processing a page's worth of input at 4 cycles per byte results in a delay of ~250 us, which is a reasonable upper bound. So let's remove the yield checks from the fused AES-CCM and AES-GCM routines entirely. This reverts commit `7b67ae4d5c` and partially reverts commit `7c50136a8a`. Fixes: `7c50136a8a` ("crypto: arm64/aes-ghash - yield NEON after every ...") Fixes: `7b67ae4d5c` ("crypto: arm64/aes-ccm - yield NEON after every ...") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:26:23 +08:00
Paul Burton	b023a93960	MIPS: Avoid using array as parameter to write_c0_kpgd() Passing an array (swapper_pg_dir) as the argument to write_c0_kpgd() in setup_pw() will become problematic if we modify __write_64bit_c0_split() to cast its val argument to unsigned long long, because for 32-bit kernel builds the size of a pointer will differ from the size of an unsigned long long. This would fall foul of gcc's pointer-to-int-cast diagnostic. Cast the value to a long, which should be the same width as the pointer that we ultimately want & will be sign extended if required to the unsigned long long that __write_64bit_c0_split() ultimately needs. Signed-off-by: Paul Burton <paul.burton@mips.com>	2018-08-06 18:44:09 -07:00
Paul Burton	ee67855ecd	MIPS: vdso: Allow clang's --target flag in VDSO cflags The MIPS VDSO code filters out a subset of known-good flags from KBUILD_CFLAGS to use when building VDSO libraries. When we build using clang we need to allow the --target flag through, otherwise we'll generally attempt to build the VDSO for the architecture of the build machine rather than for MIPS. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20154/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org	2018-08-06 15:53:33 -07:00
Paul Burton	4467f7ad7d	MIPS: genvdso: Remove GOT checks Our genvdso tool performs some rather paranoid checking that the VDSO library isn't attempting to make use of a GOT by constraining the number of entries that the GOT is allowed to contain to the minimum 2 entries that are always generated by binutils. Unfortunately lld prior to revision 334390 generates a third entry, which is unused & thus harmless but falls foul of genvdso's checks & causes the build to fail. Since we already check that the VDSO contains no relocations it seems reasonable to presume that it also doesn't contain use of a GOT, which would involve relocations. Thus rather than attempting to work around this issue by allowing 3 GOT entries when using lld, simply remove the GOT checks which seem overly paranoid. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20152/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org	2018-08-06 15:28:46 -07:00
M. Vefa Bicakci	405c018a25	xen/pv: Call get_cpu_address_sizes to set x86_virt/phys_bits Commit `d94a155c59` ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption") has moved the query and calculation of the x86_virt_bits and x86_phys_bits fields of the cpuinfo_x86 struct from the get_cpu_cap function to a new function named get_cpu_address_sizes. One of the call sites related to Xen PV VMs was unfortunately missed in the aforementioned commit. This prevents successful boot-up of kernel versions 4.17 and up in Xen PV VMs if CONFIG_DEBUG_VIRTUAL is enabled, due to the following code path: enlighten_pv.c::xen_start_kernel mmu_pv.c::xen_reserve_special_pages page.h::__pa physaddr.c::__phys_addr physaddr.h::phys_addr_valid phys_addr_valid uses boot_cpu_data.x86_phys_bits to validate physical addresses. boot_cpu_data.x86_phys_bits is no longer populated before the call to xen_reserve_special_pages due to the aforementioned commit though, so the validation performed by phys_addr_valid fails, which causes __phys_addr to trigger a BUG, preventing boot-up. Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: xen-devel@lists.xenproject.org Cc: x86@kernel.org Cc: stable@vger.kernel.org # for v4.17 and up Fixes: `d94a155c59` ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption") Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2018-08-06 16:27:41 -04:00
Thomas Gleixner	315706049c	Merge branch 'x86/pti-urgent' into x86/pti Integrate the PTI Global bit fixes which conflict with the 32bit PTI support. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2018-08-06 20:56:34 +02:00
Dave Hansen	c40a56a781	x86/mm/init: Remove freed kernel image areas from alias mapping The kernel image is mapped into two places in the virtual address space (addresses without KASLR, of course): 1. The kernel direct map (0xffff880000000000) 2. The "high kernel map" (0xffffffff81000000) We actually execute out of #2. If we get the address of a kernel symbol, it points to #2, but almost all physical-to-virtual translations point to Parts of the "high kernel map" alias are mapped in the userspace page tables with the Global bit for performance reasons. The parts that we map to userspace do not (er, should not) have secrets. When PTI is enabled then the global bit is usually not set in the high mapping and just used to compensate for poor performance on systems which lack PCID. This is fine, except that some areas in the kernel image that are adjacent to the non-secret-containing areas are unused holes. We free these holes back into the normal page allocator and reuse them as normal kernel memory. The memory will, of course, get used via the normal map, but the alias mapping is kept. This otherwise unused alias mapping of the holes will, by default keep the Global bit, be mapped out to userspace, and be vulnerable to Meltdown. Remove the alias mapping of these pages entirely. This is likely to fracture the 2M page mapping the kernel image near these areas, but this should affect a minority of the area. The pageattr code changes all aliases mapping the physical pages that it operates on (by default). We only want to modify a single alias, so we need to tweak its behavior. This unmapping behavior is currently dependent on PTI being in place. Going forward, we should at least consider doing this for all configurations. Having an extra read-write alias for memory is not exactly ideal for debugging things like random memory corruption and this does undercut features like DEBUG_PAGEALLOC or future work like eXclusive Page Frame Ownership (XPFO). Before this patch: current_kernel:---[ High Kernel Mapping ]--- current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte current_kernel-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd current_kernel-0xffffffff82c00000-0xffffffff82e00000 2M RW NX pte current_kernel-0xffffffff82e00000-0xffffffff83200000 4M RW PSE NX pmd current_kernel-0xffffffff83200000-0xffffffffa0000000 462M pmd current_user:---[ High Kernel Mapping ]--- current_user-0xffffffff80000000-0xffffffff81000000 16M pmd current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte current_user-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte current_user-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd After this patch: current_kernel:---[ High Kernel Mapping ]--- current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K pte current_kernel-0xffffffff82000000-0xffffffff82400000 4M ro PSE GLB NX pmd current_kernel-0xffffffff82400000-0xffffffff82488000 544K ro NX pte current_kernel-0xffffffff82488000-0xffffffff82600000 1504K pte current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd current_kernel-0xffffffff82c00000-0xffffffff82c0d000 52K RW NX pte current_kernel-0xffffffff82c0d000-0xffffffff82dc0000 1740K pte current_user:---[ High Kernel Mapping ]--- current_user-0xffffffff80000000-0xffffffff81000000 16M pmd current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte current_user-0xffffffff81e11000-0xffffffff82000000 1980K pte current_user-0xffffffff82000000-0xffffffff82400000 4M ro PSE GLB NX pmd current_user-0xffffffff82400000-0xffffffff82488000 544K ro NX pte current_user-0xffffffff82488000-0xffffffff82600000 1504K pte current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd [ tglx: Do not unmap on 32bit as there is only one mapping ] Fixes: `0f561fce4d` ("x86/pti: Enable global pages for shared areas") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20180802225831.5F6A2BFC@viggo.jf.intel.com	2018-08-06 20:54:16 +02:00
Robert P. J. Day	7dc084d625	MIPS: Remove obsolete MIPS checks for DST node "chosen@0" As there is precious little left in any DTS files referring to the node "/chosen@0" as opposed to "/chosen", remove the two checks for the former node name. [paul.burton@mips.com: The modified yamon-dt code only operates on arch/mips/boot/dts/mti/sead3.dts right now, and that uses chosen rather than chosen@0 anyway, so this should have no behavioural effect.] Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20131/ Cc: linux-mips@linux-mips.org	2018-08-06 09:50:33 -07:00
Uros Bizjak	fd8ca6dac9	KVM/x86: Use CC_SET()/CC_OUT in arch/x86/kvm/vmx.c Remove open-coded uses of set instructions to use CC_SET()/CC_OUT() in arch/x86/kvm/vmx.c. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> [Mark error paths as unlikely while touching this. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 18:18:41 +02:00
Wanpeng Li	aaffcfd1e8	KVM: X86: Implement PV IPIs in linux guest Implement paravirtual apic hooks to enable PV IPIs for KVM if the "send IPI" hypercall is available. The hypercall lets a guest send IPIs, with at most 128 destinations per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 17:59:22 +02:00
Wanpeng Li	d63bae079b	KVM: X86: Add kvm hypervisor init time platform setup callback Add kvm hypervisor init time platform setup callback which will be used to replace native apic hooks by pararvirtual hooks. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 17:59:21 +02:00
Wanpeng Li	4180bf1b65	KVM: X86: Implement "send IPI" hypercall Using hypercall to send IPIs by one vmexit instead of one by one for xAPIC/x2APIC physical mode and one vmexit per-cluster for x2APIC cluster mode. Intel guest can enter x2apic cluster mode when interrupt remmaping is enabled in qemu, however, latest AMD EPYC still just supports xapic mode which can get great improvement by Exit-less IPIs. This patchset lets a guest send multicast IPIs, with at most 128 destinations per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode. Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads, the VM is 80 vCPUs, IPI microbenchmark(https://lkml.org/lkml/2017/12/19/141): x2apic cluster mode, vanilla Dry-run: 0, 2392199 ns Self-IPI: 6907514, 15027589 ns Normal IPI: 223910476, 251301666 ns Broadcast IPI: 0, 9282161150 ns Broadcast lock: 0, 8812934104 ns x2apic cluster mode, pv-ipi Dry-run: 0, 2449341 ns Self-IPI: 6720360, 15028732 ns Normal IPI: 228643307, 255708477 ns Broadcast IPI: 0, 7572293590 ns => 22% performance boost Broadcast lock: 0, 8316124651 ns x2apic physical mode, vanilla Dry-run: 0, 3135933 ns Self-IPI: `8572670`, 17901757 ns Normal IPI: 226444334, 255421709 ns Broadcast IPI: 0, 19845070887 ns Broadcast lock: 0, 19827383656 ns x2apic physical mode, pv-ipi Dry-run: 0, 2446381 ns Self-IPI: 6788217, 15021056 ns Normal IPI: 219454441, 249583458 ns Broadcast IPI: 0, 7806540019 ns => 154% performance boost Broadcast lock: 0, 9143618799 ns Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 17:59:20 +02:00
Tianyu Lan	74fec5b9db	KVM/x86: Move X86_CR4_OSXSAVE check into kvm_valid_sregs() X86_CR4_OSXSAVE check belongs to sregs check and so move into kvm_valid_sregs(). Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 17:59:19 +02:00
Liang Chen	ee6268ba3a	KVM: x86: Skip pae_root shadow allocation if tdp enabled Considering the fact that the pae_root shadow is not needed when tdp is in use, skip the pae_root shadow page allocation to allow mmu creation even not being able to obtain memory from DMA32 zone when particular cgroup cpuset.mems or mempolicy control is applied. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 17:59:19 +02:00
Tianyu Lan	c2a4eadf77	KVM/MMU: Combine flushing remote tlb in mmu_set_spte() mmu_set_spte() flushes remote tlbs for drop_parent_pte/drop_spte() and set_spte() separately. This may introduce redundant flush. This patch is to combine these flushes and check flush request after calling set_spte(). Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Reviewed-by: Junaid Shahid <junaids@google.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-06 17:59:18 +02:00

... 6 7 8 9 10 ...

150436 Commits