Pull microcode fixes from Borislav Petkov:
- Two fixes hardening microcode data handling. (Quentin Casasnovas)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull RAS updates from Borislav Petkov:
"- Enable AMD thresholding IRQ by default if supported. (Aravind Gopalakrishnan)
- Unify mce_panic() message pattern. (Derek Che)
- A bit more involved simplification of the CMCI logic after yet another
report about race condition with the adaptive logic. (Borislav Petkov)
- ACPI APEI EINJ fleshing out of the user documentation. (Borislav Petkov)
- Minor cleanup. (Jan Beulich.)"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We setup APIC vectors for threshold errors if interrupt_capable.
However, we don't set interrupt_enable by default. Rework
threshold_restart_bank() so that when we set up lvt_offset, we also set
IntType to APIC and also enable thresholding interrupts for banks which
support it by default.
User is still allowed to disable interrupts through sysfs.
While at it, check if status is valid before we proceed to log error
using mce_log. This is because, in multi-node platforms, only the NBC
(Node Base Core, i.e. the first core in the node) has valid status info
in its MCA registers. So, the decoding of status values on the non-NBC
leads to noise on kernel logs like so:
EDAC DEBUG: amd64_inject_write_store: section=0x80000000 word_bits=0x10020001
[Hardware Error]: Corrected error, no action required.
[Hardware Error]: CPU:25 (15:2:0) MC4_STATUS[-|CE|-|-|-
[Hardware Error]: Corrected error, no action required.
[Hardware Error]: CPU:26 (15:2:0) MC4_STATUS[-|CE|-|-|-
<...>
WARNING: CPU: 25 PID: 0 at drivers/edac/amd64_edac.c:2147 decode_bus_error+0x1ba/0x2a0()
WARNING: CPU: 26 PID: 0 at drivers/edac/amd64_edac.c:2147 decode_bus_error+0x1ba/0x2a0()
Something is rotten in the state of Denmark.
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1422896561-7695-1-git-send-email-aravind.gopalakrishnan@amd.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
There is another mce_panic call with "Fatal machine check on current CPU" in
the same mce.c file, why not keep them all in same pattern
mce_panic("Fatal machine check on current CPU", &m, msg);
Signed-off-by: Derek Che <drc@yahoo-inc.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Initially, this started with the yet another report about a race
condition in the CMCI storm adaptive period length thing. Yes, we have
to admit, it is fragile and error prone. So let's simplify it.
The simpler logic is: now, after we enter storm mode, we go straight to
polling with CMCI_STORM_INTERVAL, i.e. once a second. We remain in storm
mode as long as we see errors being logged while polling.
Theoretically, if we see an uninterrupted error stream, we will remain
in storm mode indefinitely and keep polling the MSRs.
However, when the storm is actually a burst of errors, once we have
logged them all, we back out of it after ~5 mins of polling and no more
errors logged.
If we encounter an error during those 5 minutes, we reset the polling
interval to 5 mins.
Making machine_check_poll() return a bool and denoting whether it has
seen an error or not lets us simplify a bunch of code and move the storm
handling private to mce_intel.c.
Some minor cleanups while at it.
Reported-by: Calvin Owens <calvinowens@fb.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1417746575-23299-1-git-send-email-calvinowens@fb.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Pull ASLR and kASLR fixes from Borislav Petkov:
- Add a global flag announcing KASLR state so that relevant code can do
informed decisions based on its setting. (Jiri Kosina)
- Fix a stack randomization entropy decrease bug. (Hector Marco-Gisbert)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit:
e2b32e6785 ("x86, kaslr: randomize module base load address")
makes the base address for module to be unconditionally randomized in
case when CONFIG_RANDOMIZE_BASE is defined and "nokaslr" option isn't
present on the commandline.
This is not consistent with how choose_kernel_location() decides whether
it will randomize kernel load base.
Namely, CONFIG_HIBERNATION disables kASLR (unless "kaslr" option is
explicitly specified on kernel commandline), which makes the state space
larger than what module loader is looking at. IOW CONFIG_HIBERNATION &&
CONFIG_RANDOMIZE_BASE is a valid config option, kASLR wouldn't be applied
by default in that case, but module loader is not aware of that.
Instead of fixing the logic in module.c, this patch takes more generic
aproach. It introduces a new bootparam setup data_type SETUP_KASLR and
uses that to pass the information whether kaslr has been applied during
kernel decompression, and sets a global 'kaslr_enabled' variable
accordingly, so that any kernel code (module loading, livepatching, ...)
can make decisions based on its value.
x86 module loader is converted to make use of this flag.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Link: https://lkml.kernel.org/r/alpine.LNX.2.00.1502101411280.10719@pobox.suse.cz
[ Always dump correct kaslr status when panicking ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Pull FPU updates from Borislav Petkov:
"A round of updates to the FPU maze from Oleg and Rik. It should make
the code a bit more understandable/readable/streamlined and a preparation
for more cleanups and improvements in that area."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
unlazy_fpu()->__thread_fpu_end() doesn't look right if use_eager_fpu().
Unconditional __thread_fpu_end() is only correct if we know that this
thread can't return to user-mode and use FPU.
Fortunately it has only 2 callers. fpu_copy() checks use_eager_fpu(),
and init_fpu(current) can be only called by the coredumping thread via
regset->get(). But it is exported to modules, and imo this should be
fixed anyway.
And if we check use_eager_fpu() we can use __save_fpu() like fpu_copy()
and save_init_fpu() do.
- It seems that even !use_eager_fpu() case doesn't need the unconditional
__thread_fpu_end(), we only need it if __save_init_fpu() returns 0.
- It is still not clear to me if __save_init_fpu() can safely nest with
another save + restore from __kernel_fpu_begin(). If not, we can use
kernel_fpu_disable() to fix the race.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1423252925-14451-3-git-send-email-riel@redhat.com
Signed-off-by: Borislav Petkov <bp@suse.de>
The "else" branch clears ->fpu_counter as a remnant of the lazy FPU
usage counting:
e07e23e1fd ("[PATCH] non lazy "sleazy" fpu implementation")
However, switch_fpu_prepare() does this now so that else branch is
superfluous.
If we do use_eager_fpu(), then this has no effect. Otherwise, if we
actually wanted to prevent fpu preload after the context switch we would
need to reset it unconditionally, even if __thread_has_fpu().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1423252925-14451-2-git-send-email-riel@redhat.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Enabled probing of lar, lsl, popcnt, lddqu, prefetch insns.
They should be safe to probe, they throw no exceptions.
Enabled probing of 3-byte opcodes 0f 38-3f xx - these are
vector isns, so should be safe.
Enabled probing of many currently undefined 0f xx insns.
At the rate new vector instructions are getting added,
we don't want to constantly enable more bits.
We want to only occasionally *disable* ones which
for some reason can't be probed.
This includes 0f 24,26 opcodes, which are undefined
since Pentium. On 486, they were "mov to/from test register".
Explained more fully what 0f 78,79 opcodes are.
Explained what 0f ae opcode is. (It's unclear why we don't allow
probing it, but let's not change it for now).
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1423768732-32194-3-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This change fixes 1-byte opcode tables so that only insns
for which we have real reasons to disallow probing are marked
with unset bits.
To that end:
Set bits for all prefix bytes. Their setting is ignored anyway -
we check the bitmap against OPCODE1(insn), not against first
byte. Keeping them set to 0 only confuses code reader with
"why we don't support that opcode" question.
Thus: enable bytes c4,c5 in 64-bit mode (VEX prefixes).
Byte 62 (EVEX prefix) is not yet enabled since insn decoder
does not support that yet.
For 32-bit mode, enable probing of opcodes 63 (arpl) and d6
(salc). They don't require any special handling.
For 64-bit mode, disable 9a and ea - these undefined opcodes
were mistakenly left enabled.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1423768732-32194-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Haswell has a new feature that utilizes the existing LBR facility to
record call chains. To enable this feature, bits (JCC, NEAR_IND_JMP,
NEAR_REL_JMP, FAR_BRANCH, EN_CALLSTACK) in LBR_SELECT must be set to 1,
bits (NEAR_REL_CALL, NEAR-IND_CALL, NEAR_RET) must be cleared. Due to
a hardware bug of Haswell, this feature doesn't work well with
FREEZE_LBRS_ON_PMI.
When the call stack feature is enabled, the LBR stack will capture
unfiltered call data normally, but as return instructions are executed,
the last captured branch record is flushed from the on-chip registers
in a last-in first-out (LIFO) manner. Thus, branch information relative
to leaf functions will not be captured, while preserving the call stack
information of the main line execution path.
This patch defines a separate lbr_sel map for Haswell. The map contains
a new entry for the call stack feature.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: eranian@google.com
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/1415156173-10035-5-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When an interrupt is migrated away from a cpu it will stay
in its vector_irq array until smp_irq_move_cleanup_interrupt
succeeded. The cfg->move_in_progress flag is cleared already
when the IPI was sent.
When the interrupt is destroyed after migration its 'struct
irq_desc' is freed and the vector_irq arrays are cleaned up.
But since cfg->move_in_progress is already 0 the references
at cpus before the last migration will not be cleared. So
this would leave a reference to an already destroyed irq
alive.
When the cpu is taken down at this point, the
check_irq_vectors_for_cpu_disable() function finds a valid irq
number in the vector_irq array, but gets NULL for its
descriptor and dereferences it, causing a kernel panic.
This has been observed on real systems at shutdown. Add a
check to check_irq_vectors_for_cpu_disable() for a valid
'struct irq_desc' to prevent this issue.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: alnovak@suse.com
Cc: joro@8bytes.org
Link: http://lkml.kernel.org/r/20150204132754.GA10078@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Paravirt spinlock clears slowpath flag after doing unlock.
As explained by Linus currently it does:
prev = *lock;
add_smp(&lock->tickets.head, TICKET_LOCK_INC);
/* add_smp() is a full mb() */
if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
__ticket_unlock_slowpath(lock, prev);
which is *exactly* the kind of things you cannot do with spinlocks,
because after you've done the "add_smp()" and released the spinlock
for the fast-path, you can't access the spinlock any more. Exactly
because a fast-path lock might come in, and release the whole data
structure.
Linus suggested that we should not do any writes to lock after unlock(),
and we can move slowpath clearing to fastpath lock.
So this patch implements the fix with:
1. Moving slowpath flag to head (Oleg):
Unlocked locks don't care about the slowpath flag; therefore we can keep
it set after the last unlock, and clear it again on the first (try)lock.
-- this removes the write after unlock. note that keeping slowpath flag would
result in unnecessary kicks.
By moving the slowpath flag from the tail to the head ticket we also avoid
the need to access both the head and tail tickets on unlock.
2. use xadd to avoid read/write after unlock that checks the need for
unlock_kick (Linus):
We further avoid the need for a read-after-release by using xadd;
the prev head value will include the slowpath flag and indicate if we
need to do PV kicking of suspended spinners -- on modern chips xadd
isn't (much) more expensive than an add + load.
Result:
setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
benchmark overcommit %improve
kernbench 1x -0.13
kernbench 2x 0.02
dbench 1x -1.77
dbench 2x -0.63
[Jeremy: Hinted missing TICKET_LOCK_INC for kick]
[Oleg: Moved slowpath flag to head, ticket_equals idea]
[PeterZ: Added detailed changelog]
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: a.ryabinin@samsung.com
Cc: dave@stgolabs.net
Cc: hpa@zytor.com
Cc: jasowang@redhat.com
Cc: jeremy@goop.org
Cc: paul.gortmaker@windriver.com
Cc: riel@redhat.com
Cc: tglx@linutronix.de
Cc: waiman.long@hp.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20150215173043.GA7471@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull mcelog regression fix from Tony Luck:
"Fix regression - functions on the mce notifier chain should not be
able to decide that an event should not be logged"
* tag 'please-pull-fixmcelog' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
x86/mce: Fix regression. All error records should report via /dev/mcelog
Pull x86 perf updates from Ingo Molnar:
"This series tightens up RDPMC permissions: currently even highly
sandboxed x86 execution environments (such as seccomp) have permission
to execute RDPMC, which may leak various perf events / PMU state such
as timing information and other CPU execution details.
This 'all is allowed' RDPMC mode is still preserved as the
(non-default) /sys/devices/cpu/rdpmc=2 setting. The new default is
that RDPMC access is only allowed if a perf event is mmap-ed (which is
needed to correctly interpret RDPMC counter values in any case).
As a side effect of these changes CR4 handling is cleaned up in the
x86 code and a shadow copy of the CR4 value is added.
The extra CR4 manipulation adds ~ <50ns to the context switch cost
between rdpmc-capable and rdpmc-non-capable mms"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks
perf/x86: Only allow rdpmc if a perf_event is mapped
perf: Pass the event to arch_perf_update_userpage()
perf: Add pmu callbacks to track event mapping and unmapping
x86: Add a comment clarifying LDT context switching
x86: Store a per-cpu shadow copy of CR4
x86: Clean up cr4 manipulation
Pull tty/serial driver patches from Greg KH:
"Here's the big tty/serial driver update for 3.20-rc1. Nothing huge
here, just lots of driver updates and some core tty layer fixes as
well. All have been in linux-next with no reported issues"
* tag 'tty-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (119 commits)
serial: 8250: Fix UART_BUG_TXEN workaround
serial: driver for ETRAX FS UART
tty: remove unused variable sprop
serial: of-serial: fetch line number from DT
serial: samsung: earlycon support depends on CONFIG_SERIAL_SAMSUNG_CONSOLE
tty/serial: serial8250_set_divisor() can be static
tty/serial: Add Spreadtrum sc9836-uart driver support
Documentation: DT: Add bindings for Spreadtrum SoC Platform
serial: samsung: remove redundant interrupt enabling
tty: Remove external interface for tty_set_termios()
serial: omap: Fix RTS handling
serial: 8250_omap: Use UPSTAT_AUTORTS for RTS handling
serial: core: Rework hw-assisted flow control support
tty/serial: 8250_early: Add support for PXA UARTs
tty/serial: of_serial: add support for PXA/MMP uarts
tty/serial: of_serial: add DT alias ID handling
serial: 8250: Prevent concurrent updates to shadow registers
serial: 8250: Use canary to restart console after suspend
serial: 8250: Refactor XR17V35X divisor calculation
serial: 8250: Refactor divisor programming
...
This feature let us to detect accesses out of bounds of global variables.
This will work as for globals in kernel image, so for globals in modules.
Currently this won't work for symbols in user-specified sections (e.g.
__init, __read_mostly, ...)
The idea of this is simple. Compiler increases each global variable by
redzone size and add constructors invoking __asan_register_globals()
function. Information about global variable (address, size, size with
redzone ...) passed to __asan_register_globals() so we could poison
variable's redzone.
This patch also forces module_alloc() to return 8*PAGE_SIZE aligned
address making shadow memory handling (
kasan_module_alloc()/kasan_module_free() ) more simple. Such alignment
guarantees that each shadow page backing modules address space correspond
to only one module_alloc() allocation.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For instrumenting global variables KASan will shadow memory backing memory
for modules. So on module loading we will need to allocate memory for
shadow and map it at address in shadow that corresponds to the address
allocated in module_alloc().
__vmalloc_node_range() could be used for this purpose, except it puts a
guard hole after allocated area. Guard hole in shadow memory should be a
problem because at some future point we might need to have a shadow memory
at address occupied by guard hole. So we could fail to allocate shadow
for module_alloc().
Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
__vmalloc_node_range(). Add new parameter 'vm_flags' to
__vmalloc_node_range() function.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recently instrumentation of builtin functions calls was removed from GCC
5.0. To check the memory accessed by such functions, userspace asan
always uses interceptors for them.
So now we should do this as well. This patch declares
memset/memmove/memcpy as weak symbols. In mm/kasan/kasan.c we have our
own implementation of those functions which checks memory before accessing
it.
Default memset/memmove/memcpy now now always have aliases with '__'
prefix. For files that built without kasan instrumentation (e.g.
mm/slub.c) original mem* replaced (via #define) with prefixed variants,
cause we don't want to check memory accesses there.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds arch specific code for kernel address sanitizer.
16TB of virtual addressed used for shadow memory. It's located in range
[ffffec0000000000 - fffffc0000000000] between vmemmap and %esp fixup
stacks.
At early stage we map whole shadow region with zero page. Latter, after
pages mapped to direct mapping address range we unmap zero pages from
corresponding shadow (see kasan_map_shadow()) and allocate and map a real
shadow memory reusing vmemmap_populate() function.
Also replace __pa with __pa_nodebug before shadow initialized. __pa with
CONFIG_DEBUG_VIRTUAL=y make external function call (__phys_addr)
__phys_addr is instrumented, so __asan_load could be called before shadow
area initialized.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
* Unnecessary buffer size calculation and condition on the lenght
removed from intel_cacheinfo.c::show_shared_cpu_map_func().
* uv_nmi_nr_cpus_pr() got overly smart and implemented "..."
abbreviation if the output stretched over the predefined 1024 byte
buffer. Replaced with plain printk.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 5fcee53ce7.
It causes the suspend to fail on at least the Chromebook Pixel, possibly
other platforms too.
Joerg Roedel points out that the logic should probably have been
if (max_physical_apicid > 255 ||
!(IS_ENABLED(CONFIG_HYPERVISOR_GUEST) &&
hypervisor_x2apic_available())) {
instead, but since the code is not in any fast-path, so we can just live
without that optimization and just revert to the original code.
Acked-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__FUNCTION__ hasn't been treated as a string literal since gcc 3.4, so
this only helps people who only test-compile using 3.3 (compiler-gcc3.h
barks at anything older than that). Besides, there are almost no
occurrences of __FUNCTION__ left in the tree.
[akpm@linux-foundation.org: convert remaining __FUNCTION__ references]
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>