Commit Graph

7191 Commits

Author SHA1 Message Date
Andy Lutomirski
5a83d60c07 x86/fpu: Remove irq_ts_save() and irq_ts_restore()
Now that lazy FPU is gone, we don't use CR0.TS (except possibly in
KVM guest mode).  Remove irq_ts_save(), irq_ts_restore(), and all of
their callers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm list <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/70b9b9e7ba70659bedcb08aba63d0f9214f338f2.1477951965.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-01 07:47:54 +01:00
Ingo Molnar
c29c716662 Merge branch 'core/urgent' into x86/fpu, to merge fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-01 07:47:40 +01:00
Ingo Molnar
05b93c19d5 Merge branch 'linus' into x86/asm, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-01 07:41:06 +01:00
Fenghua Yu
4f341a5e48 x86/intel_rdt: Add scheduler hook
Hook the x86 scheduler code to update closid based on whether the current
task is assigned to a specific closid or running on a CPU assigned to a
specific closid.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-10-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:16 -06:00
Tony Luck
60ec2440c6 x86/intel_rdt: Add schemata file
Last of the per resource group files. Also mode 0644. This one shows
the resources available to the group. Syntax depends on whether the
"cdp" mount option was given. With code/data prioritization disabled
it is simply a list of masks for each cache domain. Initial value
allows access to all of the L3 cache on all domains. E.g. on a 2 socket
Broadwell:
        L3:0=fffff;1=fffff
With CDP enabled, separate masks for data and instructions are provided:
        L3DATA:0=fffff;1=fffff
        L3CODE:0=fffff;1=fffff

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-9-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:16 -06:00
Tony Luck
12e0110c11 x86/intel_rdt: Add cpus file
Now we populate each directory with a read/write (mode 0644) file
named "cpus". This is used to over-ride the resources available
to processes in the default resource group when running on specific
CPUs.  Each "cpus" file reads as a cpumask showing which CPUs belong
to this resource group. Initially all online CPUs are assigned to
the default group. They can be added to other groups by writing a
cpumask to the "cpus" file in the directory for the resource group
(which will remove them from the previous group to which they were
assigned). CPU online/offline operations will delete CPUs that go
offline from whatever group they are in and add new CPUs to the
default group.

If there are CPUs assigned to a group when the directory is removed,
they are returned to the default group.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-7-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:15 -06:00
Fenghua Yu
60cf5e101f x86/intel_rdt: Add mkdir to resctrl file system
Resource control groups are represented as directories in the resctrl
file system. The root directory describes the default resources available
to tasks that have not been assigned specific resources. Other directories
can be created at the root level to make new resource groups. It is not
permitted to make directories within other directories.

Hardware uses a CLOSID (Class of service ID) to determine which resource
limits are currently in effect. The exact number available is enumerated
by CPUID leaf 0x10, but on current implementations it is a small number.
We implement a simple bitmask allocator for CLOSIDs.

Each resource control group uses one CLOSID, which limits the total number
of directories that can be created.

Resource groups can be removed using rmdir.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-6-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:14 -06:00
Fenghua Yu
4e978d06de x86/intel_rdt: Add "info" files to resctrl file system
For the convenience of applications we make the decoded values of some
of the CPUID values available in read-only (0444) files.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-5-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:14 -06:00
Fenghua Yu
5ff193fbde x86/intel_rdt: Add basic resctrl filesystem support
Use kernfs as basis for our user interface filesystem. This patch
supports mount/umount, and one mount parameter "cdp" to enable code/data
prioritization (though all we do at this point is ensure that the system
can support CDP).  The file system is not populated yet in this patch.

[ tglx: Fixed up a few nits and added cdp handling in case of error ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-4-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:14 -06:00
Tony Luck
2264d9c74d x86/intel_rdt: Build structures for each resource based on cache topology
We use the cpu hotplug notifier to catch each cpu in turn and look at
its cache topology w.r.t each of the resource groups. As we discover
new resources, we initialize the bitmask array for each to the default
(full access) value.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477692289-37412-3-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30 19:10:13 -06:00
Dmitry Safonov
a01aa6c9f4 x86/prctl/uapi: Remove #ifdef for CHECKPOINT_RESTORE
As userspace knows nothing about kernel config, thus #ifdefs
around ABI prctl constants makes them invisible to userspace.

Let it be clean'n'simple: remove #ifdefs.

If kernel has CONFIG_CHECKPOINT_RESTORE disabled, sys_prctl()
will return -EINVAL for those prctls.

Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: oleg@redhat.com
Fixes: 2eefd87896 ("x86/arch_prctl/vdso: Add ARCH_MAP_VDSO_*")
Link: http://lkml.kernel.org/r/20161027141516.28447-2-dsafonov@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28 08:15:55 +02:00
Fenghua Yu
6b281569df x86/cqm: Share PQR_ASSOC related data between CQM and CAT
PQR_ASSOC MSR contains the RMID used for preformance monitoring of cache
occupancy and memory bandwidth. The upper 32bit of this MSR contain the
CLOSID for cache allocation. So we need to share the information between
the two facilities.

Move the rdt data structure declaration into the shared header file and
make the per cpu data structure containing the MSR values global.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477142405-32078-10-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-26 23:12:39 +02:00
Fenghua Yu
c1c7c3f9d6 x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID
Define struct rdt_resource to hold all the parameterized values for an RDT
resource and fill in the CPUID enumerated values from leaf 0x10 if
available. Hard code them for the MSR detected Haswells.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477142405-32078-9-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-26 23:12:39 +02:00
Fenghua Yu
113c60970c x86/intel_rdt: Add Haswell feature discovery
Some Haswell generation CPUs support RDT, but they don't enumerate this via
CPUID.  Use rdmsr_safe() and wrmsr_safe() to probe the MSRs on cpu model 63
(INTEL_FAM6_HASWELL_X)

Move the relevant defines into a common header file which is shared between
RDT/CQM and RDT/Allocation to avoid duplication.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "Borislav Petkov" <bp@suse.de>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477142405-32078-8-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-26 23:12:38 +02:00
Fenghua Yu
4ab1586488 x86/cpufeature: Add RDT CPUID feature bits
Check CPUID leaves for all the Resource Director Technology (RDT)
Cache Allocation Technology (CAT) bits.

Presence of allocation features:
  CPUID.(EAX=7H, ECX=0):EBX[bit 15]	X86_FEATURE_RDT_A

L2 and L3 caches are each separately enabled:
  CPUID.(EAX=10H, ECX=0):EBX[bit 1]	X86_FEATURE_CAT_L3
  CPUID.(EAX=10H, ECX=0):EBX[bit 2]	X86_FEATURE_CAT_L2

L3 cache may support independent control of allocation for
code and data (CDP = Code/Data Prioritization):
  CPUID.(EAX=10H, ECX=1):ECX[bit 2]	X86_FEATURE_CDP_L3

[ tglx: Fixed up Borislavs comments and moved the feature bits into a gap ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: "Borislav Petkov" <bp@suse.de>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Dave Hansen" <dave.hansen@intel.com>
Cc: "Shaohua Li" <shli@fb.com>
Cc: "Nilay Vaish" <nilayvaish@gmail.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1477142405-32078-5-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-26 23:12:38 +02:00
Dave Airlie
8ef4227615 x86/io: add interface to reserve io memtype for a resource range. (v1.1)
A recent change to the mm code in:
87744ab383 mm: fix cache mode tracking in vm_insert_mixed()

started enforcing checking the memory type against the registered list for
amixed pfn insertion mappings. It happens that the drm drivers for a number
of gpus relied on this being broken. Currently the driver only inserted
VRAM mappings into the tracking table when they came from the kernel,
and userspace mappings never landed in the table. This led to a regression
where all the mapping end up as UC instead of WC now.

I've considered a number of solutions but since this needs to be fixed
in fixes and not next, and some of the solutions were going to introduce
overhead that hadn't been there before I didn't consider them viable at
this stage. These mainly concerned hooking into the TTM io reserve APIs,
but these API have a bunch of fast paths I didn't want to unwind to add
this to.

The solution I've decided on is to add a new API like the arch_phys_wc
APIs (these would have worked but wc_del didn't take a range), and
use them from the drivers to add a WC compatible mapping to the table
for all VRAM on those GPUs. This means we can then create userspace
mapping that won't get degraded to UC.

v1.1: use CONFIG_X86_PAT + add some comments in io.h

Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: x86@kernel.org
Cc: mcgrof@suse.com
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-26 15:45:38 +10:00
Josh Poimboeuf
0ee1dd9f5e x86/dumpstack: Remove raw stack dump
For mostly historical reasons, the x86 oops dump shows the raw stack
values:

  ...
  [registers]
  Stack:
   ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0
   ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321
   0000000000000002 0000000000000000 0000000000000000 0000000000000000
  Call Trace:
  ...

This seems to be an artifact from long ago, and probably isn't needed
anymore.  It generally just adds noise to the dump, and it can be
actively harmful because it leaks kernel addresses.

Linus says:

  "The stack dump actually goes back to forever, and it used to be
   useful back in 1992 or so. But it used to be useful mainly because
   stacks were simpler and we didn't have very good call traces anyway. I
   definitely remember having used them - I just do not remember having
   used them in the last ten+ years.

   Of course, it's still true that if you can trigger an oops, you've
   likely already lost the security game, but since the stack dump is so
   useless, let's aim to just remove it and make games like the above
   harder."

This also removes the related 'kstack=' cmdline option and the
'kstack_depth_to_print' sysctl.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 18:40:37 +02:00
Josh Poimboeuf
bb5e5ce545 x86/dumpstack: Remove kernel text addresses from stack dump
Printing kernel text addresses in stack dumps is of questionable value,
especially now that address randomization is becoming common.

It can be a security issue because it leaks kernel addresses.  It also
affects the usefulness of the stack dump.  Linus says:

  "I actually spend time cleaning up commit messages in logs, because
  useless data that isn't actually information (random hex numbers) is
  actively detrimental.

  It makes commit logs less legible.

  It also makes it harder to parse dumps.

  It's not useful. That makes it actively bad.

  I probably look at more oops reports than most people. I have not
  found the hex numbers useful for the last five years, because they are
  just randomized crap.

  The stack content thing just makes code scroll off the screen etc, for
  example."

The only real downside to removing these addresses is that they can be
used to disambiguate duplicate symbol names.  However such cases are
rare, and the context of the stack dump should be enough to be able to
figure it out.

There's now a 'faddr2line' script which can be used to convert a
function address to a file name and line:

  $ ./scripts/faddr2line ~/k/vmlinux write_sysrq_trigger+0x51/0x60
  write_sysrq_trigger+0x51/0x60:
  write_sysrq_trigger at drivers/tty/sysrq.c:1098

Or gdb can be used:

  $ echo "list *write_sysrq_trigger+0x51" |gdb ~/k/vmlinux |grep "is in"
  (gdb) 0xffffffff815b5d83 is in driver_probe_device (/home/jpoimboe/git/linux/drivers/base/dd.c:378).

(But note that when there are duplicate symbol names, gdb will only show
the first symbol it finds.  faddr2line is recommended over gdb because
it handles duplicates and it also does function size checking.)

Here's an example of what a stack dump looks like after this change:

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: sysrq_handle_crash+0x45/0x80
  PGD 36bfa067 [   29.650644] PUD 7aca3067
  Oops: 0002 [#1] PREEMPT SMP
  Modules linked in: ...
  CPU: 1 PID: 786 Comm: bash Tainted: G            E   4.9.0-rc1+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
  task: ffff880078582a40 task.stack: ffffc90000ba8000
  RIP: 0010:sysrq_handle_crash+0x45/0x80
  RSP: 0018:ffffc90000babdc8 EFLAGS: 00010296
  RAX: ffff880078582a40 RBX: 0000000000000063 RCX: 0000000000000001
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000292
  RBP: ffffc90000babdc8 R08: 0000000b31866061 R09: 0000000000000000
  R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
  R13: 0000000000000007 R14: ffffffff81ee8680 R15: 0000000000000000
  FS:  00007ffb43869700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000007a3e9000 CR4: 00000000001406e0
  Stack:
   ffffc90000babe00 ffffffff81572d08 ffffffff81572bd5 0000000000000002
   0000000000000000 ffff880079606600 00007ffb4386e000 ffffc90000babe20
   ffffffff81573201 ffff880036a3fd00 fffffffffffffffb ffffc90000babe40
  Call Trace:
   __handle_sysrq+0x138/0x220
   ? __handle_sysrq+0x5/0x220
   write_sysrq_trigger+0x51/0x60
   proc_reg_write+0x42/0x70
   __vfs_write+0x37/0x140
   ? preempt_count_sub+0xa1/0x100
   ? __sb_start_write+0xf5/0x210
   ? vfs_write+0x183/0x1a0
   vfs_write+0xb8/0x1a0
   SyS_write+0x58/0xc0
   entry_SYSCALL_64_fastpath+0x1f/0xc2
  RIP: 0033:0x7ffb42f55940
  RSP: 002b:00007ffd33bb6b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007ffb42f55940
  RDX: 0000000000000002 RSI: 00007ffb4386e000 RDI: 0000000000000001
  RBP: 0000000000000011 R08: 00007ffb4321ea40 R09: 00007ffb43869700
  R10: 00007ffb43869700 R11: 0000000000000246 R12: 0000000000778a10
  R13: 00007ffd33bb5c00 R14: 0000000000000007 R15: 0000000000000010
  Code: 34 e8 d0 34 bc ff 48 c7 c2 3b 2b 57 81 be 01 00 00 00 48 c7 c7 e0 dd e5 81 e8 a8 55 ba ff c7 05 0e 3f de 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 e8 4c 49 bc ff 84 c0 75 c3 48 c7
  RIP: sysrq_handle_crash+0x45/0x80 RSP: ffffc90000babdc8
  CR2: 0000000000000000

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/69329cb29b8f324bb5fcea14d61d224807fb6488.1477405374.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 18:40:37 +02:00
Borislav Petkov
06b8534cb7 x86/microcode: Rework microcode loading
Yeah, I know, I know, this is a huuge patch and reviewing it is hard.

Sorry but this is the only way I could think of in which I can rewrite
the microcode patches loading procedure without breaking (knowingly) the
driver.

So maybe this patch is easier to review if one looks at the files after
the patch has been applied instead at the diff. Because then it becomes
pretty obvious:

* The BSP-loading path - load_ucode_bsp() is working independently from
  the AP path now and it doesn't save any pointers or patches anymore -
  it solely parses the builtin or initrd microcode and applies the patch.
  That's it.

This fixes the CONFIG_RANDOMIZE_MEMORY offset fun more solidly.

* The AP-loading path - load_ucode_ap() then goes and scans
  builtin/initrd *again* for the microcode patches but it caches them this
  time so that we don't have to do that scan on each AP but only once.

This simplifies the code considerably.

Then, when we save the microcode from the initrd/builtin, we go and
add the relevant patches to our own cache. The AMD side did do that
and now the Intel side does it too. So no more pointer copying and
blabla, we save the microcode patches ourselves and are independent from
initrd/builtin.

This whole conversion gives us other benefits like unifying the
initrd parsing into a single function: find_microcode_in_initrd() is
used by both.

The diffstat speaks for itself: 456 insertions(+), 695 deletions(-)

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161025095522.11964-12-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 12:28:59 +02:00
Borislav Petkov
8027923ab4 x86/microcode/intel: Remove intel_lib.c
Its functions are used in intel.c only now, so get rid of it. Make
functions static.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161025095522.11964-11-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 12:28:59 +02:00
Borislav Petkov
76bd11c23a x86/microcode/amd: Move private inlines to .c and mark local functions static
Make them all static as they're used in a single file now.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161025095522.11964-10-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 12:28:59 +02:00
Borislav Petkov
b3763a672d x86/microcode/amd: Hand down the CPU family
Will be needed in a following patch.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161025095522.11964-7-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 12:28:58 +02:00
Borislav Petkov
058dc49803 x86/microcode: Export the microcode cache linked list
It will be used by both drivers so move it to core.c.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161025095522.11964-6-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 12:28:58 +02:00
Borislav Petkov
f5bdfefbf9 x86/microcode: Remove one #ifdef clause
Move the function declaration to the other #ifdef CONFIG_MICROCODE
together with the other functions.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161025095522.11964-5-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 12:28:57 +02:00
Peter Zijlstra
890658b7ab locking/mutex: Kill arch specific code
Its all generic atomic_long_t stuff now.

Tested-by: Jason Low <jason.low2@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 11:31:51 +02:00
Josh Poimboeuf
946c191161 x86/entry/unwind: Create stack frames for saved interrupt registers
With frame pointers, when a task is interrupted, its stack is no longer
completely reliable because the function could have been interrupted
before it had a chance to save the previous frame pointer on the stack.
So the caller of the interrupted function could get skipped by a stack
trace.

This is problematic for live patching, which needs to know whether a
stack trace of a sleeping task can be relied upon.  There's currently no
way to detect if a sleeping task was interrupted by a page fault
exception or preemption before it went to sleep.

Another issue is that when dumping the stack of an interrupted task, the
unwinder has no way of knowing where the saved pt_regs registers are, so
it can't print them.

This solves those issues by encoding the pt_regs pointer in the frame
pointer on entry from an interrupt or an exception.

This patch also updates the unwinder to be able to decode it, because
otherwise the unwinder would be broken by this change.

Note that this causes a change in the behavior of the unwinder: each
instance of a pt_regs on the stack is now considered a "frame".  So
callers of unwind_get_return_address() will now get an occasional
'regs->ip' address that would have previously been skipped over.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8b9f84a21e39d249049e0547b559ff8da0df0988.1476973742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-21 09:26:03 +02:00
Heiko Carstens
c8061485a0 sched/core, x86: Make struct thread_info arch specific again
The following commit:

  c65eacbe29 ("sched/core: Allow putting thread_info into task_struct")

... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.

This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.

We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.

The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions.  This is not a
problem when keeping these members within thread_info.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: keescook@chromium.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-20 13:27:47 +02:00
Piotr Luc
8214899342 x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features
AVX512_4VNNIW  - Vector instructions for deep learning enhanced word
variable precision.
AVX512_4FMAPS - Vector instructions for deep learning floating-point
single precision.

These new instructions are to be used in future Intel Xeon & Xeon Phi
processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new
instructions are supported by a processor.

The spec can be found in the Intel Software Developer Manual (SDM) or in
the Instruction Set Extensions Programming Reference (ISE).

Define new feature flags to enumerate the new instructions in /proc/cpuinfo
accordingly to CPUID bits and add the required xsave extensions which are
required for proper operation.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20161018150111.29926-1-piotr.luc@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-19 17:37:13 +02:00
Linus Torvalds
0832881425 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes, plus hw-enablement changes:

   - fix persistent RAM handling
   - remove pkeys warning
   - remove duplicate macro
   - fix debug warning in irq handler
   - add new 'Knights Mill' CPU related constants and enable the perf bits"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Add Knights Mill CPUID
  perf/x86/intel/rapl: Add Knights Mill CPUID
  perf/x86/intel: Add Knights Mill CPUID
  x86/cpu/intel: Add Knights Mill to Intel family
  x86/e820: Don't merge consecutive E820_PRAM ranges
  pkeys: Remove easily triggered WARN
  x86: Remove duplicate rtit status MSR macro
  x86/smp: Add irq_enter/exit() in smp_reschedule_interrupt()
2016-10-18 09:59:04 -07:00
Josh Poimboeuf
55a76b59b5 locking/rwsem/x86: Add stack frame dependency for ____down_write()
Arnd reported the following objtool warning:

  kernel/locking/rwsem.o: warning: objtool: down_write_killable()+0x16: call without frame pointer save/setup

The warning means gcc placed the ____down_write() inline asm (and its
call instruction) before the frame pointer setup in
down_write_killable(), which breaks frame pointer convention and can
result in incorrect stack traces.

Force the stack frame to be created before the call instruction by
listing the stack pointer as an output operand in the inline asm
statement.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1188b7015f04baf361e59de499ee2d7272c59dce.1476393828.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-18 12:21:16 +02:00
Andy Lutomirski
e63650840e x86/fpu: Finish excising 'eagerfpu'
Now that eagerfpu= is gone, remove it from the docs and some
comments.  Also sync the changes to tools/.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cf430dd4481d41280e93ac6cf0def1007a67fc8e.1476740397.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-18 09:56:03 +02:00
Piotr Luc
0047f59834 x86/cpu/intel: Add Knights Mill to Intel family
Add CPUID of Knights Mill (KNM) processor to Intel family list.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161012180520.30976-1-piotr.luc@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-17 10:45:08 +02:00
Ingo Molnar
4d69f155d5 Merge tag 'v4.9-rc1' into x86/fpu, to resolve conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-16 13:04:34 +02:00
Rik van Riel
c474e50711 x86/fpu: Split old_fpu & new_fpu handling into separate functions
By moving all of the new_fpu state handling into switch_fpu_finish(),
the code can be simplified some more.

This gets rid of the prefetch, but given the size of the FPU register
state on modern CPUs, and the amount of work done by __switch_to()
inbetween both functions, the value of a single cache line prefetch
seems somewhat dubious anyway.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1476447331-21566-3-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-16 11:38:41 +02:00
Rik van Riel
317b622cb2 x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state()
The __{fpu,cpu}_invalidate_fpregs_state() functions can only be used
to invalidate a resource they control.  Document that, and change
the API a little bit to reflect that.

Go back to open coding the fpu_fpregs_owner_ctx write in the CPU
hotplug code, which should be the exception, and move __kernel_fpu_begin()
to this API.

This patch has no functional changes to the current code.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1476447331-21566-2-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-16 11:38:31 +02:00
Ingo Molnar
1d33369db2 Merge tag 'v4.9-rc1' into x86/urgent, to pick up updates
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-16 11:31:39 +02:00
Linus Torvalds
84d69848c9 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro.

   This does bring a regression, because genksyms no longer generates
   checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
   working on a patch to fix this.

   Plus, we are talking about functions like strcpy(), which rarely
   change prototypes.

 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick
   Piggin

 - fixdep speedup by Alexey Dobriyan.

 - preparatory work by Nick Piggin to allow architectures to build with
   -ffunction-sections, -fdata-sections and --gc-sections

 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell

 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-10-14 14:26:58 -07:00
Linus Torvalds
b6daa51b9a Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:

 - Nick improved generic implementations of percpu operations which
   modify the variable and return so that they calculate the physical
   address only once.

 - percpu_ref percpu <-> atomic mode switching improvements. The
   patchset was originally posted about a year ago but fell through the
   crack.

 - misc non-critical fixes.

* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  mm/percpu.c: fix potential memory leakage for pcpu_embed_first_chunk()
  mm/percpu.c: correct max_distance calculation for pcpu_embed_first_chunk()
  percpu: eliminate two sparse warnings
  percpu: improve generic percpu modify-return implementation
  percpu-refcount: init ->confirm_switch member properly
  percpu_ref: allow operation mode switching operations to be called concurrently
  percpu_ref: restructure operation mode switching
  percpu_ref: unify staggered atomic switching wait behavior
  percpu_ref: reorganize __percpu_ref_switch_to_atomic() and relocate percpu_ref_switch_to_atomic()
  percpu_ref: remove unnecessary RCU grace period for staggered atomic switching confirmation
2016-10-14 11:46:25 -07:00
Longpeng(Mike)
c836eeda3e x86: Remove duplicate rtit status MSR macro
The MSR_IA32_RTIT_STATUS is defined twice, so remove one.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: len.brown@intel.com
Cc: peterz@infradead.org
Cc: rafael.j.wysocki@intel.com
Cc: alexander.shishkin@linux.intel.com
Cc: ray.huang@amd.com
Cc: Aravind.Gopalakrishnan@amd.com
Cc: wu.wubin@huawei.com
Cc: srinivas.pandruvada@linux.intel.com
Cc: zhaoshenglong@huawei.com
Cc: vladimir_zapolskiy@mentor.com
Link: http://lkml.kernel.org/r/1476405740-80816-1-git-send-email-longpeng2@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-14 14:14:20 +02:00
Linus Torvalds
4cdf8dbe2d Merge branch 'work.uaccess2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull uaccess.h prepwork from Al Viro:
 "Preparations to tree-wide switch to use of linux/uaccess.h (which,
  obviously, will allow to start unifying stuff for real). The last step
  there, ie

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
            `git grep -l "$PATT"|grep -v ^include/linux/uaccess.h`

  is not taken here - I would prefer to do it once just before or just
  after -rc1.  However, everything should be ready for it"

* 'work.uaccess2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  remove a stray reference to asm/uaccess.h in docs
  sparc64: separate extable_64.h, switch elf_64.h to it
  score: separate extable.h, switch module.h to it
  mips: separate extable.h, switch module.h to it
  x86: separate extable.h, switch sections.h to it
  remove stray include of asm/uaccess.h from cacheflush.h
  mn10300: remove a bogus processor.h->uaccess.h include
  xtensa: split uaccess.h into C and asm sides
  bonding: quit messing with IOCTL
  kill __kernel_ds_p off
  mn10300: finish verify_area() off
  frv: move HAVE_ARCH_UNMAPPED_AREA to pgtable.h
  exceptions: detritus removal
2016-10-11 23:38:39 -07:00
Hidehiro Kawai
0ee59413c9 x86/panic: replace smp_send_stop() with kdump friendly version in panic path
Daniel Walker reported problems which happens when
crash_kexec_post_notifiers kernel option is enabled
(https://lkml.org/lkml/2015/6/24/44).

In that case, smp_send_stop() is called before entering kdump routines
which assume other CPUs are still online.  As the result, for x86, kdump
routines fail to save other CPUs' registers and disable virtualization
extensions.

To fix this problem, call a new kdump friendly function,
crash_smp_send_stop(), instead of the smp_send_stop() when
crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a weak
function, and it just call smp_send_stop().  Architecture codes should
override it so that kdump can work appropriately.  This patch only
provides x86-specific version.

For Xen's PV kernel, just keep the current behavior.

NOTES:

- Right solution would be to place crash_smp_send_stop() before
  __crash_kexec() invocation in all cases and remove smp_send_stop(), but
  we can't do that until all architectures implement own
  crash_smp_send_stop()

- crash_smp_send_stop()-like work is still needed by
  machine_crash_shutdown() because crash_kexec() can be called without
  entering panic()

Fixes: f06e5153f4 (kernel/panic.c: add "crash_kexec_post_notifiers" option)
Link: http://lkml.kernel.org/r/20160810080948.11028.15344.stgit@sysi4-13.yrl.intra.hitachi.co.jp
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Reported-by: Daniel Walker <dwalker@fifo99.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Daniel Walker <dwalker@fifo99.com>
Cc: Xunlei Pang <xpang@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: "Steven J. Hill" <steven.hill@cavium.com>
Cc: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:32 -07:00
Linus Torvalds
93c26d7dc0 Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull protection keys syscall interface from Thomas Gleixner:
 "This is the final step of Protection Keys support which adds the
  syscalls so user space can actually allocate keys and protect memory
  areas with them. Details and usage examples can be found in the
  documentation.

  The mm side of this has been acked by Mel"

* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pkeys: Update documentation
  x86/mm/pkeys: Do not skip PKRU register if debug registers are not used
  x86/pkeys: Fix pkeys build breakage for some non-x86 arches
  x86/pkeys: Add self-tests
  x86/pkeys: Allow configuration of init_pkru
  x86/pkeys: Default to a restrictive init PKRU
  pkeys: Add details of system call use to Documentation/
  generic syscalls: Wire up memory protection keys syscalls
  x86: Wire up protection keys system calls
  x86/pkeys: Allocation/free syscalls
  x86/pkeys: Make mprotect_key() mask off additional vm_flags
  mm: Implement new pkey_mprotect() system call
  x86/pkeys: Add fault handling for PF_PK page fault bit
2016-10-10 11:01:51 -07:00
Linus Torvalds
5fa0eb0b4d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 updates from Thomas Gleixner:
 "A pile of regression fixes and updates:

   - address the fallout of the patches which made the cpuid - nodeid
     relation permanent: Handling of invalid APIC ids and preventing
     pointless warning messages.

   - force eager FPU when protection keys are enabled. Protection keys
     are not generating FPU exceptions so they cannot work with the lazy
     FPU mechanism.

   - prevent force migration of interrupts which are not part of the CPU
     vector domain.

   - handle the fact that APIC ids are not updated in the ACPI/MADT
     tables on physical CPU hotplug

   - remove bash-isms from syscall table generator script

   - use the hypervisor supplied APIC frequency when running on VMware"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pkeys: Make protection keys an "eager" feature
  x86/apic: Prevent pointless warning messages
  x86/acpi: Prevent LAPIC id 0xff from being accounted
  arch/x86: Handle non enumerated CPU after physical hotplug
  x86/unwind: Fix oprofile module link error
  x86/vmware: Skip lapic calibration on VMware
  x86/syscalls: Remove bash-isms in syscall table generator
  x86/irq: Prevent force migration of irqs which are not in the vector domain
2016-10-10 10:59:07 -07:00
Dave Hansen
d4b05923f5 x86/pkeys: Make protection keys an "eager" feature
Our XSAVE features are divided into two categories: those that
generate FPU exceptions, and those that do not.  MPX and pkeys do
not generate FPU exceptions and thus can not be used lazily.  We
disable them when lazy mode is forced on.

We have a pair of masks to collect these two sets of features, but
XFEATURE_MASK_PKRU was added to the wrong mask: XFEATURE_MASK_LAZY.
Fix it by moving the feature to XFEATURE_MASK_EAGER.

Note: this only causes problem if you boot with lazy FPU mode
(eagerfpu=off) which is *not* the default.  It also only affects
hardware which is not currently publicly available.  It looks like
eager mode is going away, but we still need this patch applied
to any kernel that has protection keys and lazy mode, which is 4.6
through 4.8 at this point, and 4.9 if the lazy removal isn't sent
to Linus for 4.9.

Fixes: c8df400984 ("x86/fpu, x86/mm/pkeys: Add PKRU xsave fields and data structures")
Signed-off-by: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20161007162342.28A49813@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-08 12:26:44 +02:00
Chris Metcalf
6727ad9e20 nmi_backtrace: generate one-line reports for idle cpus
When doing an nmi backtrace of many cores, most of which are idle, the
output is a little overwhelming and very uninformative.  Suppress
messages for cpus that are idling when they are interrupted and just
emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

We do this by grouping all the cpuidle code together into a new
.cpuidle.text section, and then checking the address of the interrupted
PC to see if it lies within that section.

This commit suitably tags x86 and tile idle routines, and only adds in
the minimal framework for other architectures.

Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Tested-by: Petr Mladek <pmladek@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Chris Metcalf
9a01c3ed5c nmi_backtrace: add more trigger_*_cpu_backtrace() methods
Patch series "improvements to the nmi_backtrace code" v9.

This patch series modifies the trigger_xxx_backtrace() NMI-based remote
backtracing code to make it more flexible, and makes a few small
improvements along the way.

The motivation comes from the task isolation code, where there are
scenarios where we want to be able to diagnose a case where some cpu is
about to interrupt a task-isolated cpu.  It can be helpful to see both
where the interrupting cpu is, and also an approximation of where the
cpu that is being interrupted is.  The nmi_backtrace framework allows us
to discover the stack of the interrupted cpu.

I've tested that the change works as desired on tile, and build-tested
x86, arm, mips, and sparc64.  For x86 I confirmed that the generic
cpuidle stuff as well as the architecture-specific routines are in the
new cpuidle section.  For arm, mips, and sparc I just build-tested it
and made sure the generic cpuidle routines were in the new cpuidle
section, but I didn't attempt to figure out which the platform-specific
idle routines might be.  That might be more usefully done by someone
with platform experience in follow-up patches.

This patch (of 4):

Currently you can only request a backtrace of either all cpus, or all
cpus but yourself.  It can also be helpful to request a remote backtrace
of a single cpu, and since we want that, the logical extension is to
support a cpumask as the underlying primitive.

This change modifies the existing lib/nmi_backtrace.c code to take a
cpumask as its basic primitive, and modifies the linux/nmi.h code to use
the new "cpumask" method instead.

The existing clients of nmi_backtrace (arm and x86) are converted to
using the new cpumask approach in this change.

The other users of the backtracing API (sparc64 and mips) are converted
to use the cpumask approach rather than the all/allbutself approach.
The mips code ignored the "include_self" boolean but with this change it
will now also dump a local backtrace if requested.

Link: http://lkml.kernel.org/r/1472487169-14923-2-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Baoyou Xie
08ea8c07fb mm: move phys_mem_access_prot_allowed() declaration to pgtable.h
We get 1 warning when building kernel with W=1:

  drivers/char/mem.c:220:12: warning: no previous prototype for 'phys_mem_access_prot_allowed' [-Wmissing-prototypes]
   int __weak phys_mem_access_prot_allowed(struct file *file,

In fact, its declaration is spreading to several header files in
different architecture, but need to be declare in common header file.

So this patch moves phys_mem_access_prot_allowed() to pgtable.h.

Link: http://lkml.kernel.org/r/1473751597-12139-1-git-send-email-baoyou.xie@linaro.org
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:29 -07:00
Linus Torvalds
e6e3d8f8f4 Merge tag 'pci-v4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
 "Summary of PCI changes for the v4.9 merge window:

  Enumeration:
   - microblaze: Add multidomain support for procfs (Bharat Kumar Gogada)

  Resource management:
   - Ignore requested alignment for PROBE_ONLY and fixed resources (Yongji Xie)
   - Ignore requested alignment for VF BARs (Yongji Xie)

  PCI device hotplug:
   - Make core explicitly non-modular (Paul Gortmaker)

  PCIe native device hotplug:
   - Rename pcie_isr() locals for clarity (Bjorn Helgaas)
   - Return IRQ_NONE when we can't read interrupt status (Bjorn Helgaas)
   - Remove unnecessary guard (Bjorn Helgaas)
   - Clean up dmesg "Slot(%s)" messages (Bjorn Helgaas)
   - Remove useless pciehp_get_latch_status() calls (Bjorn Helgaas)
   - Clear attention LED on device add (Keith Busch)
   - Allow exclusive userspace control of indicators (Keith Busch)
   - Process all hotplug events before looking for new ones (Mayurkumar Patel)
   - Don't re-read Slot Status when queuing hotplug event (Mayurkumar Patel)
   - Don't re-read Slot Status when handling surprise event (Mayurkumar Patel)
   - Make explicitly non-modular (Paul Gortmaker)

  Power management:
   - Afford direct-complete to devices with non-standard PM (Lukas Wunner)
   - Query platform firmware for device power state (Lukas Wunner)
   - Recognize D3cold in pci_update_current_state() (Lukas Wunner)
   - Avoid unnecessary resume after direct-complete (Lukas Wunner)
   - Make explicitly non-modular (Paul Gortmaker)

  Virtualization:
   - Mark Atheros AR9580 to avoid bus reset (Maik Broemme)
   - Check for pci_setup_device() failure in pci_iov_add_virtfn() (Po Liu)

  MSI:
   - Enable PCI_MSI_IRQ_DOMAIN support for ARC (Joao Pinto)

  AER:
   - Remove aerdriver.nosourceid kernel parameter (Bjorn Helgaas)
   - Remove aerdriver.forceload kernel parameter (Bjorn Helgaas)
   - Fix aer_probe() kernel-doc comment (Cao jin)
   - Add bus flag to skip source ID matching (Jon Derrick)
   - Avoid memory allocation in interrupt handling path (Jon Derrick)
   - Cache capability position (Keith Busch)
   - Make explicitly non-modular (Paul Gortmaker)
   - Remove duplicate AER severity translation (Tyler Baicar)
   - Send correct severity to calculate AER severity (Tyler Baicar)

  Precision Time Measurement:
   - Add Precision Time Measurement (PTM) support (Jonathan Yong)
   - Add PTM clock granularity information (Bjorn Helgaas)
   - Add pci_enable_ptm() for drivers to enable PTM on endpoints (Bjorn Helgaas)

  Generic host bridge driver:
   - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
   - Make explicitly non-modular (Paul Gortmaker)

  Altera host bridge driver:
   - Remove redundant platform_get_resource() return value check (Bjorn Helgaas)
   - Poll for link training status after retraining the link (Ley Foon Tan)
   - Rework config accessors for use without a struct pci_bus (Ley Foon Tan)
   - Move retrain from fixup to altera_pcie_host_init() (Ley Foon Tan)
   - Make MSI explicitly non-modular (Paul Gortmaker)
   - Make explicitly non-modular (Paul Gortmaker)
   - Relax device number checking to allow SR-IOV (Po Liu)

  ARM Versatile host bridge driver:
   - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)

  Axis ARTPEC-6 host bridge driver:
   - Drop __init from artpec6_add_pcie_port() (Niklas Cassel)

  Freescale i.MX6 host bridge driver:
   - Make explicitly non-modular (Paul Gortmaker)

  Intel VMD host bridge driver:
   - Add quirk for AER to ignore source ID (Jon Derrick)
   - Allocate IRQ lists with correct MSI-X count (Jon Derrick)
   - Convert to use pci_alloc_irq_vectors() API (Jon Derrick)
   - Eliminate vmd_vector member from list type (Jon Derrick)
   - Eliminate index member from IRQ list (Jon Derrick)
   - Synchronize with RCU freeing MSI IRQ descs (Keith Busch)
   - Request userspace control of PCIe hotplug indicators (Keith Busch)
   - Move VMD driver to drivers/pci/host (Keith Busch)

  Marvell Aardvark host bridge driver:
   - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
   - Remove redundant dev_err call in advk_pcie_probe() (Wei Yongjun)

  Microsoft Hyper-V host bridge driver:
   - Use zero-length array in struct pci_packet (Dexuan Cui)
   - Use pci_function_description[0] in struct definitions (Dexuan Cui)
   - Remove the unused 'wrk' in struct hv_pcibus_device (Dexuan Cui)
   - Handle vmbus_sendpacket() failure in hv_compose_msi_msg() (Dexuan Cui)
   - Handle hv_pci_generic_compl() error case (Dexuan Cui)
   - Use list_move_tail() instead of list_del() + list_add_tail() (Wei Yongjun)

  NVIDIA Tegra host bridge driver:
   - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
   - Remove redundant _data suffix (Thierry Reding)
   - Use of_device_get_match_data() (Thierry Reding)

  Qualcomm host bridge driver:
   - Make explicitly non-modular (Paul Gortmaker)

  Renesas R-Car host bridge driver:
   - Consolidate register space lookup and ioremap (Bjorn Helgaas)
   - Don't disable/unprepare clocks on prepare/enable failure (Geert Uytterhoeven)
   - Add multi-MSI support (Grigory Kletsko)
   - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
   - Fix some checkpatch warnings (Sergei Shtylyov)
   - Try increasing PCIe link speed to 5 GT/s at boot (Sergei Shtylyov)

  Rockchip host bridge driver:
   - Add DT bindings for Rockchip PCIe controller (Shawn Lin)
   - Add Rockchip PCIe controller support (Shawn Lin)
   - Improve the deassert sequence of four reset pins (Shawn Lin)
   - Fix wrong transmitted FTS count (Shawn Lin)
   - Increase the Max Credit update interval (Rajat Jain)

  Samsung Exynos host bridge driver:
   - Make explicitly non-modular (Paul Gortmaker)

  ST Microelectronics SPEAr13xx host bridge driver:
   - Make explicitly non-modular (Paul Gortmaker)

  Synopsys DesignWare host bridge driver:
   - Return data directly from dw_pcie_readl_rc() (Bjorn Helgaas)
   - Exchange viewport of `MEMORYs' and `CFGs/IOs' (Dong Bo)
   - Check LTSSM training bit before deciding link is up (Jisheng Zhang)
   - Move link wait definitions to .c file (Joao Pinto)
   - Wait for iATU enable (Joao Pinto)
   - Add iATU Unroll feature (Joao Pinto)
   - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
   - Make explicitly non-modular (Paul Gortmaker)
   - Relax device number checking to allow SR-IOV (Po Liu)
   - Keep viewport fixed for IO transaction if num_viewport > 2 (Pratyush Anand)
   - Remove redundant platform_get_resource() return value check (Wei Yongjun)

  TI DRA7xx host bridge driver:
   - Make explicitly non-modular (Paul Gortmaker)

  TI Keystone host bridge driver:
   - Propagate request_irq() failure (Wei Yongjun)

  Xilinx AXI host bridge driver:
   - Keep both legacy and MSI interrupt domain references (Bharat Kumar Gogada)
   - Clear interrupt register for invalid interrupt (Bharat Kumar Gogada)
   - Clear correct MSI set bit (Bharat Kumar Gogada)
   - Dispose of MSI virtual IRQ (Bharat Kumar Gogada)
   - Make explicitly non-modular (Paul Gortmaker)
   - Relax device number checking to allow SR-IOV (Po Liu)

  Xilinx NWL host bridge driver:
   - Expand error logging (Bharat Kumar Gogada)
   - Enable all MSI interrupts using MSI mask (Bharat Kumar Gogada)
   - Make explicitly non-modular (Paul Gortmaker)

  Miscellaneous:
   - Drop CONFIG_KEXEC_CORE ifdeffery (Lukas Wunner)
   - portdrv: Make explicitly non-modular (Paul Gortmaker)
   - Make DPC explicitly non-modular (Paul Gortmaker)"

* tag 'pci-v4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (105 commits)
  x86/PCI: VMD: Move VMD driver to drivers/pci/host
  PCI: rockchip: Fix wrong transmitted FTS count
  PCI: rockchip: Improve the deassert sequence of four reset pins
  PCI: rockchip: Increase the Max Credit update interval
  PCI: rcar: Try increasing PCIe link speed to 5 GT/s at boot
  PCI/AER: Fix aer_probe() kernel-doc comment
  PCI: Ignore requested alignment for VF BARs
  PCI: Ignore requested alignment for PROBE_ONLY and fixed resources
  PCI: Avoid unnecessary resume after direct-complete
  PCI: Recognize D3cold in pci_update_current_state()
  PCI: Query platform firmware for device power state
  PCI: Afford direct-complete to devices with non-standard PM
  PCI/AER: Cache capability position
  PCI/AER: Avoid memory allocation in interrupt handling path
  x86/PCI: VMD: Request userspace control of PCIe hotplug indicators
  PCI: pciehp: Allow exclusive userspace control of indicators
  ACPI / APEI: Send correct severity to calculate AER severity
  PCI/AER: Remove duplicate AER severity translation
  x86/PCI: VMD: Synchronize with RCU freeing MSI IRQ descs
  x86/PCI: VMD: Eliminate index member from IRQ list
  ...
2016-10-07 11:46:37 -07:00
Rik van Riel
9ad93fe35a x86/fpu: Split old & new FPU code paths
Now that CR0.TS is no longer being manipulated, we can simplify
switch_fpu_prepare() by no longer nesting the handling of new_fpu
inside the two branches for the old_fpu.

Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1475627678-20788-10-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-07 11:14:43 +02:00
Rik van Riel
66f314efca x86/fpu: Remove __fpregs_(de)activate()
Now that fpregs_activate() and fpregs_deactivate() do nothing except
call the double underscored versions of themselves, we can get
rid of the double underscore version.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1475627678-20788-9-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-07 11:14:42 +02:00