Commit Graph

16 Commits

Author SHA1 Message Date
Borislav Petkov
154d744fbe x86/cpu: Restore AMD's DE_CFG MSR after resume
commit 2632daebafd04746b4b96c2f26a6021bc38f6209 upstream.

DE_CFG contains the LFENCE serializing bit; restore it on resume too.
This is relevant to older families due to the way they do S3.

Unify and correct naming while at it.
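
For illustration only (a minimal sketch, not the actual patch, which wires the
MSR into the existing resume-time MSR restore machinery), re-asserting the bit
with the unified names boils down to:

/* Hypothetical helper name; MSR_AMD64_DE_CFG* are the unified names. */
static void amd_restore_de_cfg(void)
{
	/* Re-set the LFENCE-serializing bit in DE_CFG after S3 resume */
	msr_set_bit(MSR_AMD64_DE_CFG,
		    MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT);
}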

Fixes: e4d0e84e49 ("x86/cpu/AMD: Make LFENCE a serializing instruction")
Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Reported-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-16 09:57:20 +01:00
Daniel Sneddon
509c2c9fe7 x86/speculation: Add RSB VM Exit protections
commit 2b1299322016731d56807aa49254a5ea3080b6b3 upstream.

tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as
documented for RET instructions after VM exits. Mitigate it with a new
one-entry RSB stuffing mechanism and a new LFENCE.

== Background ==

Indirect Branch Restricted Speculation (IBRS) was designed to help
mitigate Branch Target Injection and Speculative Store Bypass, i.e.
Spectre, attacks. IBRS prevents software run in less privileged modes
from affecting branch prediction in more privileged modes. IBRS requires
the MSR to be written on every privilege level change.

To overcome some of the performance issues of IBRS, Enhanced IBRS was
introduced.  eIBRS is an "always on" IBRS; in other words, it is turned
on once instead of writing the MSR on every privilege level change.
When eIBRS is enabled, more privileged modes should be protected from
less privileged modes, including protecting VMMs from guests.

== Problem ==

Here's a simplification of how guests are run on Linux' KVM:

void run_kvm_guest(void)
{
	// Prepare to run guest
	VMRESUME();
	// Clean up after guest runs
}

The execution flow for that would look something like this to the
processor:

1. Host-side: call run_kvm_guest()
2. Host-side: VMRESUME
3. Guest runs, does "CALL guest_function"
4. VM exit, host runs again
5. Host might make some "cleanup" function calls
6. Host-side: RET from run_kvm_guest()

Now, when back on the host, there are a couple of possible scenarios for
the post-guest activity the host needs to perform before executing host code:

* on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not
touched and Linux has to do a 32-entry stuffing.

* on eIBRS hardware, a VM exit with IBRS enabled, or restoring host
IBRS=1 shortly after VM exit, has a documented side effect of flushing
the RSB, except in this PBRSB situation, where the software needs to
stuff the last RSB entry "by hand".

IOW, with eIBRS supported, host RET instructions should no longer be
influenced by guest behavior after the host retires a single CALL
instruction.

However, if a RET is "unbalanced" with respect to CALLs after a VM exit,
as is the RET in #6, it might speculatively use the address of the
instruction after the CALL in #3 as an RSB prediction. This is a problem
since the (untrusted) guest controls this address.

Balanced CALL/RET instruction pairs such as in step #5 are not affected.

== Solution ==

The PBRSB issue affects a wide variety of Intel processors which
support eIBRS. But not all of them need mitigation. Today,
X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates
PBRSB, so systems which set RSB_VMEXIT - i.e., eIBRS systems which
enable legacy IBRS explicitly - need no further mitigation.

However, systems running with eIBRS alone (X86_FEATURE_IBRS_ENHANCED)
do not set RSB_VMEXIT, and most of them need a new mitigation.

Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE
which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT.

The lighter-weight mitigation performs a CALL instruction which is
immediately followed by a speculative execution barrier (INT3). This
steers speculative execution to the barrier -- just like a retpoline
-- which ensures that speculation can never reach an unbalanced RET.
Then, ensure this CALL is retired before continuing execution with an
LFENCE.

In other words, the window of exposure is opened at VM exit where RET
behavior is troublesome. While the window is open, force RSB predictions
sampling for RET targets to a dead end at the INT3. Close the window
with the LFENCE.
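
For illustration, a minimal x86-64 inline-asm sketch of that sequence (not the
kernel's actual macro, which is built from asm macros and alternatives) could
look like:

/* Hypothetical helper, for illustration only. */
static __always_inline void pbrsb_stuff_one_rsb_entry(void)
{
	asm volatile("call 1f\n\t"           /* add one RSB entry pointing at the INT3 */
		     "int3\n"                /* speculation consuming that entry dead-ends here */
		     "1: add $8, %%rsp\n\t"  /* drop the return address the CALL pushed */
		     "lfence"                /* ensure the CALL retires before continuing */
		     ::: "memory");
}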

There is a subset of eIBRS systems which are not vulnerable to PBRSB.
Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB.
Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.

  [ bp: Massage, incorporate review comments from Andy Cooper. ]

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-11 13:06:47 +02:00
Arnaldo Carvalho de Melo
3f93b8630a tools arch x86: Sync the msr-index.h copy with the kernel sources
commit 91d248c3b903b46a58cbc7e8d38d684d3e4007c2 upstream.

To pick up the changes from these csets:

  4ad3278df6fe2b08 ("x86/speculation: Disable RRSBA behavior")
  d7caac991feeef1b ("x86/cpu/amd: Add Spectral Chicken")

These cause no changes to tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

Just silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YtQTm9wsB3hxQWvy@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-25 11:26:54 +02:00
Pawan Gupta
eb38964b6f x86/speculation: Disable RRSBA behavior
commit 4ad3278df6fe2b0852b00d5757fc2ccd8e92c26e upstream.

Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.

The kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB underflow, RET target prediction may fall
back to alternate predictors. As a result, a RET's predicted target may
be influenced by branch history.

A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit 2).

For the spectre_v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, also set
RRSBA_DIS_S to protect RETs in the RSB-underflow case.
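
For illustration, the gist of the change is a check like the one below, using
the names the patch introduces (ARCH_CAP_RRSBA, SPEC_CTRL_RRSBA_DIS_S,
X86_FEATURE_RRSBA_CTRL); the surrounding helpers are the ones the spectre_v2
code already uses and may differ slightly between kernel versions:

static void __init spec_ctrl_disable_kernel_rrsba(void)
{
	/* RRSBA_DIS_S is only usable when RRSBA_CTRL is enumerated */
	if (!boot_cpu_has(X86_FEATURE_RRSBA_CTRL))
		return;

	/*
	 * Only needed if the CPU actually falls back to the alternate
	 * predictors on RSB underflow (RRSBA behavior).
	 */
	if (x86_read_arch_cap_msr() & ARCH_CAP_RRSBA) {
		x86_spec_ctrl_base |= SPEC_CTRL_RRSBA_DIS_S;
		write_spec_ctrl_current(x86_spec_ctrl_base, true);
	}
}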

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
[bwh: Backported to 5.15: adjust context in scattered.c]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-25 11:26:51 +02:00
Pawan Gupta
bde15fdcce KVM: x86/speculation: Disable Fill buffer clear within guests
commit 027bbb884be006b05d9c577d6401686053aa789e upstream

The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an
accurate indicator on all CPUs of whether the VERW instruction will
overwrite fill buffers. FB_CLEAR enumeration in
IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not
vulnerable to MDS/TAA, indicating that microcode does overwrite fill
buffers.

Guests running in VMM environments may not be aware of all the
capabilities/vulnerabilities of the host CPU. Specifically, a guest may
apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable
to MDS/TAA even when the physical CPU is not. On CPUs that enumerate
FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill
buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS
during VMENTER and resetting it on VMEXIT. For guests that enumerate
FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM
will not use FB_CLEAR_DIS.

Irrespective of guest state, the host overwrites CPU buffers before
VMENTER to protect itself from an MMIO-capable guest, as part of the
mitigation for MMIO Stale Data vulnerabilities.
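
For illustration, a simplified sketch of the toggling around guest entry/exit
(the helper names here are hypothetical; the real code caches the MSR value and
keys this off a per-VMX flag):

static void fb_clear_dis_set(void)
{
	u64 msr = __rdmsr(MSR_IA32_MCU_OPT_CTRL);

	/* Before VMENTER: tell microcode to skip VERW's fill buffer clear */
	native_wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr | FB_CLEAR_DIS);
}

static void fb_clear_dis_clear(void)
{
	u64 msr = __rdmsr(MSR_IA32_MCU_OPT_CTRL);

	/* After VMEXIT: restore the default VERW behavior for the host */
	native_wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr & ~FB_CLEAR_DIS);
}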

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-16 13:27:59 +02:00
Pawan Gupta
e66310bc96 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
commit 51802186158c74a0304f51ab963e7c2b3a2b046f upstream

Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For more details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst

Add the Processor MMIO Stale Data bug enumeration. A microcode update
adds new bits to MSR IA32_ARCH_CAPABILITIES; define them.
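
For reference, the new IA32_ARCH_CAPABILITIES bits are of the following shape
(bit positions as documented for Processor MMIO Stale Data; treat this as a
sketch of the definitions rather than the exact hunk):

#define ARCH_CAP_SBDR_SSDP_NO	BIT(13)	/* Not affected by Shared Buffers Data
					 * Read/Sideband Stale Data Propagator */
#define ARCH_CAP_FBSDP_NO	BIT(14)	/* Not affected by Fill Buffer Stale
					 * Data Propagator */
#define ARCH_CAP_PSDP_NO	BIT(15)	/* Not affected by Primary Stale Data
					 * Propagator */
#define ARCH_CAP_FB_CLEAR	BIT(17)	/* VERW overwrites fill buffers */
#define ARCH_CAP_FB_CLEAR_CTRL	BIT(18)	/* MCU_OPT_CTRL has FB_CLEAR_DIS */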

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-16 13:27:57 +02:00
Arnaldo Carvalho de Melo
32b734e09e tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  29dcc60f6a ("x86/boot/compressed/64: Add stage1 #VC handler")
  36e1be8ada ("perf/x86/amd/ibs: Fix raw sample data accumulation")
  59a854e2f3 ("perf/x86/intel: Support TopDown metrics on Ice Lake")
  7b2c05a15d ("perf/x86/intel: Generic support for hardware TopDown metrics")
  99e40204e0 ("x86/msr: Move the F15h MSRs where they belong")
  b57de6cd16 ("x86/sev-es: Add SEV-ES Feature Detection")
  ed7bde7a6d ("cpufreq: intel_pstate: Allow enable/disable energy efficiency")
  f0f2f9feb4 ("x86/msr-index: Define an IA32_PASID MSR")

These cause the following changes in tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-10-19 13:27:33.195274425 -0300
  +++ after	2020-10-19 13:27:44.144507610 -0300
  @@ -113,6 +113,8 @@
   	[0x00000309] = "CORE_PERF_FIXED_CTR0",
   	[0x0000030a] = "CORE_PERF_FIXED_CTR1",
   	[0x0000030b] = "CORE_PERF_FIXED_CTR2",
  +	[0x0000030c] = "CORE_PERF_FIXED_CTR3",
  +	[0x00000329] = "PERF_METRICS",
   	[0x00000345] = "IA32_PERF_CAPABILITIES",
   	[0x0000038d] = "CORE_PERF_FIXED_CTR_CTRL",
   	[0x0000038e] = "CORE_PERF_GLOBAL_STATUS",
  @@ -222,6 +224,7 @@
   	[0x00000774] = "HWP_REQUEST",
   	[0x00000777] = "HWP_STATUS",
   	[0x00000d90] = "IA32_BNDCFGS",
  +	[0x00000d93] = "IA32_PASID",
   	[0x00000da0] = "IA32_XSS",
   	[0x00000dc0] = "LBR_INFO_0",
   	[0x00000ffc] = "IA32_BNDCFGS_RSVD",
  @@ -279,6 +282,7 @@
   	[0xc0010115 - x86_AMD_V_KVM_MSRs_offset] = "VM_IGNNE",
   	[0xc0010117 - x86_AMD_V_KVM_MSRs_offset] = "VM_HSAVE_PA",
   	[0xc001011f - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VIRT_SPEC_CTRL",
  +	[0xc0010130 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SEV_ES_GHCB",
   	[0xc0010131 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SEV",
   	[0xc0010140 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_OSVW_ID_LENGTH",
   	[0xc0010141 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_OSVW_STATUS",
  $

Which causes these parts of tools/perf/ to be rebuilt:

  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  DESCEND  plugins
  GEN      /tmp/build/perf/python/perf.so
  INSTALL  trace_plugins
  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD       /tmp/build/perf/trace/beauty/perf-in.o
  LD       /tmp/build/perf/perf-in.o
  LINK     /tmp/build/perf/per

At some point these should just be tables read by perf on demand.

This addresses this perf tools build warning:

  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-03 08:36:30 -03:00
Arnaldo Carvalho de Melo
f815fe512c tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  d6a162a41b x86/msr-index: Add bunch of MSRs for Arch LBR
  ed7bde7a6d cpufreq: intel_pstate: Allow enable/disable energy efficiency
  99e40204e0 (tip/x86/cleanups) x86/msr: Move the F15h MSRs where they belong
  1068ed4547 x86/msr: Lift AMD family 0x15 power-specific MSRs
  5cde265384 (tag: perf-core-2020-06-01) perf/x86/rapl: Add AMD Fam17h RAPL support

Addressing this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

That makes the beautification scripts pick up some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-08-07 08:45:18.801298854 -0300
  +++ after	2020-08-07 08:45:28.654456422 -0300
  @@ -271,6 +271,8 @@
   	[0xc0010062 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PERF_CTL",
   	[0xc0010063 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PERF_STATUS",
   	[0xc0010064 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PSTATE_DEF_BASE",
  +	[0xc001007a - x86_AMD_V_KVM_MSRs_offset] = "F15H_CU_PWR_ACCUMULATOR",
  +	[0xc001007b - x86_AMD_V_KVM_MSRs_offset] = "F15H_CU_MAX_PWR_ACCUMULATOR",
   	[0xc0010112 - x86_AMD_V_KVM_MSRs_offset] = "K8_TSEG_ADDR",
   	[0xc0010113 - x86_AMD_V_KVM_MSRs_offset] = "K8_TSEG_MASK",
   	[0xc0010114 - x86_AMD_V_KVM_MSRs_offset] = "VM_CR",
  $

And this gets rebuilt:

  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  INSTALL  trace_plugins
  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD       /tmp/build/perf/trace/beauty/perf-in.o
  LD       /tmp/build/perf/perf-in.o
  LINK     /tmp/build/perf/perf

Now one can trace system-wide, asking to see backtraces to where those
MSRs are being read/written, with:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==F15H_CU_PWR_ACCUMULATOR || msr==F15H_CU_MAX_PWR_ACCUMULATOR"
  ^C#
  #

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==F15H_CU_PWR_ACCUMULATOR || msr==F15H_CU_MAX_PWR_ACCUMULATOR"
  Using CPUID GenuineIntel-6-8E-A
  0xc001007a
  0xc001007b
  New filter for msr:read_msr: (msr==0xc001007a || msr==0xc001007b) && (common_pid != 2448054 && common_pid != 2782)
  0xc001007a
  0xc001007b
  New filter for msr:write_msr: (msr==0xc001007a || msr==0xc001007b) && (common_pid != 2448054 && common_pid != 2782)
  mmap size 528384B
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-07 08:45:47 -03:00
Arnaldo Carvalho de Melo
25ca7e5c0b tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  7e5b3c267d ("x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation")

Addressing these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

With this, one will be able to use this new MSR in filters, by name,
e.g.:

  # perf trace -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
  ^C#

Using -v we can see how it sets up the tracepoint filters, converting
from the string in the filter to the numeric value:

  # perf trace -v -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
  Using CPUID GenuineIntel-6-8E-A
  0x123
  New filter for msr:read_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
  0x123
  New filter for msr:write_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
  0x123
  New filter for msr:rdpmc: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
  mmap size 528384B
  ^C#

The updating process shows how this affects tooling in more detail:

  $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  --- tools/arch/x86/include/asm/msr-index.h	2020-06-03 10:36:09.959910238 -0300
  +++ arch/x86/include/asm/msr-index.h	2020-06-17 10:04:20.235052901 -0300
  @@ -128,6 +128,10 @@
   #define TSX_CTRL_RTM_DISABLE		BIT(0)	/* Disable RTM feature */
   #define TSX_CTRL_CPUID_CLEAR		BIT(1)	/* Disable TSX enumeration */

  +/* SRBDS support */
  +#define MSR_IA32_MCU_OPT_CTRL		0x00000123
  +#define RNGDS_MITG_DIS			BIT(0)
  +
   #define MSR_IA32_SYSENTER_CS		0x00000174
   #define MSR_IA32_SYSENTER_ESP		0x00000175
   #define MSR_IA32_SYSENTER_EIP		0x00000176
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-06-17 10:05:49.653114752 -0300
  +++ after	2020-06-17 10:06:01.777258731 -0300
  @@ -51,6 +51,7 @@
   	[0x0000011e] = "IA32_BBL_CR_CTL3",
   	[0x00000120] = "IDT_MCR_CTRL",
   	[0x00000122] = "IA32_TSX_CTRL",
  +	[0x00000123] = "IA32_MCU_OPT_CTRL",
   	[0x00000140] = "MISC_FEATURES_ENABLES",
   	[0x00000174] = "IA32_SYSENTER_CS",
   	[0x00000175] = "IA32_SYSENTER_ESP",
  $

The related change to cpufeatures.h affects this:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

This shouldn't be affecting that 'perf bench' entry:

  $ find tools/perf/ -type f | xargs grep SRBDS
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Gross <mgross@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-17 13:21:26 -03:00
Arnaldo Carvalho de Melo
3b1f47d6e7 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  5cde265384 ("perf/x86/rapl: Add AMD Fam17h RAPL support")

Addressing this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

With this one will be able to use these new AMD MSRs in filters, by
name, e.g.:

   # perf trace -e msr:* --filter="msr==AMD_PKG_ENERGY_STATUS || msr==AMD_RAPL_POWER_UNIT"

Just like it is now possible with other MSRs:

  [root@five ~]# uname -a
  Linux five 5.5.17-200.fc31.x86_64 #1 SMP Mon Apr 13 15:29:42 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  [root@five ~]# grep 'model name' -m1 /proc/cpuinfo
  model name	: AMD Ryzen 5 3600X 6-Core Processor
  [root@five ~]#
  [root@five ~]# perf trace -e msr:*/max-stack=16/ --filter="msr==AMD_PERF_CTL" --max-events=2
       0.000 kworker/1:1-ev/2327824 msr:write_msr(msr: AMD_PERF_CTL, val: 2)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         [0xffffffffc01d71c3] ([acpi_cpufreq])
                                         [0] ([unknown])
                                         __cpufreq_driver_target ([kernel.kallsyms])
                                         od_dbs_update ([kernel.kallsyms])
                                         dbs_work_handler ([kernel.kallsyms])
                                         process_one_work ([kernel.kallsyms])
                                         worker_thread ([kernel.kallsyms])
                                         kthread ([kernel.kallsyms])
                                         ret_from_fork ([kernel.kallsyms])
       8.597 kworker/2:2-ev/2338099 msr:write_msr(msr: AMD_PERF_CTL, val: 2)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         [0] ([unknown])
                                         [0] ([unknown])
                                         __cpufreq_driver_target ([kernel.kallsyms])
                                         od_dbs_update ([kernel.kallsyms])
                                         dbs_work_handler ([kernel.kallsyms])
                                         process_one_work ([kernel.kallsyms])
                                         worker_thread ([kernel.kallsyms])
                                         kthread ([kernel.kallsyms])
                                         ret_from_fork ([kernel.kallsyms])
  [root@five ~]#

A longer explanation of what happens in the perf build process,
automatically, after this copy is synced with the kernel sources:

  $ make -C tools/perf O=/tmp/build/perf install-bin
  <SNIP>
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  <SNIP>
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $
  $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  --- tools/arch/x86/include/asm/msr-index.h	2020-06-02 10:46:36.217782288 -0300
  +++ arch/x86/include/asm/msr-index.h	2020-05-28 10:41:23.313794627 -0300
  @@ -301,6 +301,9 @@
   #define MSR_PP1_ENERGY_STATUS		0x00000641
   #define MSR_PP1_POLICY			0x00000642

  +#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
  +#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
  +
   /* Config TDP MSRs */
   #define MSR_CONFIG_TDP_NOMINAL		0x00000648
   #define MSR_CONFIG_TDP_LEVEL_1		0x00000649
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $
  $ make -C tools/perf O=/tmp/build/perf install-bin
  <SNIP>
    CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
    LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
    LD       /tmp/build/perf/trace/beauty/perf-in.o
    LD       /tmp/build/perf/perf-in.o
    LINK     /tmp/build/perf/perf
  <SNIP>
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-06-02 10:47:08.486334348 -0300
  +++ after	2020-06-02 10:47:33.075008948 -0300
  @@ -286,6 +286,8 @@
   	[0xc0010240 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTL",
   	[0xc0010241 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTR",
   	[0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC",
  +	[0xc0010299 - x86_AMD_V_KVM_MSRs_offset] = "AMD_RAPL_POWER_UNIT",
  +	[0xc001029b - x86_AMD_V_KVM_MSRs_offset] = "AMD_PKG_ENERGY_STATUS",
   	[0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL",
   	[0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN",
   };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 10:57:59 -03:00
Arnaldo Carvalho de Melo
bab1a501e6 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  6650cdd9a8 ("x86/split_lock: Enable split lock detection by kernel")

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Which causes these changes in tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-04-01 12:11:14.789344795 -0300
  +++ after	2020-04-01 12:11:56.907798879 -0300
  @@ -10,6 +10,7 @@
   	[0x00000029] = "KNC_EVNTSEL1",
   	[0x0000002a] = "IA32_EBL_CR_POWERON",
   	[0x0000002c] = "EBC_FREQUENCY_ID",
  +	[0x00000033] = "TEST_CTRL",
   	[0x00000034] = "SMI_COUNT",
   	[0x0000003a] = "IA32_FEAT_CTL",
   	[0x0000003b] = "IA32_TSC_ADJUST",
  @@ -27,6 +28,7 @@
   	[0x000000c2] = "IA32_PERFCTR1",
   	[0x000000cd] = "FSB_FREQ",
   	[0x000000ce] = "PLATFORM_INFO",
  +	[0x000000cf] = "IA32_CORE_CAPS",
   	[0x000000e2] = "PKG_CST_CONFIG_CONTROL",
   	[0x000000e7] = "IA32_MPERF",
   	[0x000000e8] = "IA32_APERF",
  $

  $ make -C tools/perf O=/tmp/build/perf install-bin
  <SNIP>
    CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
    LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
    LD       /tmp/build/perf/trace/beauty/perf-in.o
    LD       /tmp/build/perf/perf-in.o
    LINK     /tmp/build/perf/perf
  <SNIP>

Now one can do:

	perf trace -e msr:* --filter=msr==IA32_CORE_CAPS

or:

	perf trace -e msr:* --filter='msr==IA32_CORE_CAPS || msr==TEST_CTRL'

And see only those MSRs being accessed via:

  # perf trace -v -e msr:* --filter='msr==IA32_CORE_CAPS || msr==TEST_CTRL'
  New filter for msr:read_msr: (msr==0xcf || msr==0x33) && (common_pid != 8263 && common_pid != 23250)
  New filter for msr:write_msr: (msr==0xcf || msr==0x33) && (common_pid != 8263 && common_pid != 23250)
  New filter for msr:rdpmc: (msr==0xcf || msr==0x33) && (common_pid != 8263 && common_pid != 23250)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/20200401153325.GC12534@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-14 08:42:56 -03:00
Arnaldo Carvalho de Melo
d8e3ee2e2b tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes from these csets:

  21b5ee59ef ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF")

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ git diff
  diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
  index ebe1685e92dd..d5e517d1c3dd 100644
  --- a/tools/arch/x86/include/asm/msr-index.h
  +++ b/tools/arch/x86/include/asm/msr-index.h
  @@ -512,6 +512,8 @@
   #define MSR_K7_HWCR                    0xc0010015
   #define MSR_K7_HWCR_SMMLOCK_BIT                0
   #define MSR_K7_HWCR_SMMLOCK            BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
  +#define MSR_K7_HWCR_IRPERF_EN_BIT      30
  +#define MSR_K7_HWCR_IRPERF_EN          BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
   #define MSR_K7_FID_VID_CTL             0xc0010041
   #define MSR_K7_FID_VID_STATUS          0xc0010042
  $

These don't result in any change in tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

To silence this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-27 09:49:56 -03:00
Linus Torvalds
c0275ae758 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu-features updates from Ingo Molnar:
 "The biggest change in this cycle was a large series from Sean
  Christopherson to clean up the handling of VMX features. This both
  fixes bugs/inconsistencies and makes the code more coherent and
  future-proof.

  There are also two cleanups and a minor TSX syslog messages
  enhancement"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/cpu: Remove redundant cpu_detect_cache_sizes() call
  x86/cpu: Print "VMX disabled" error message iff KVM is enabled
  KVM: VMX: Allow KVM_INTEL when building for Centaur and/or Zhaoxin CPUs
  perf/x86: Provide stubs of KVM helpers for non-Intel CPUs
  KVM: VMX: Use VMX_FEATURE_* flags to define VMCS control bits
  KVM: VMX: Check for full VMX support when verifying CPU compatibility
  KVM: VMX: Use VMX feature flag to query BIOS enabling
  KVM: VMX: Drop initialization of IA32_FEAT_CTL MSR
  x86/cpufeatures: Add flag to track whether MSR IA32_FEAT_CTL is configured
  x86/cpu: Set synthetic VMX cpufeatures during init_ia32_feat_ctl()
  x86/cpu: Print VMX flags in /proc/cpuinfo using VMX_FEATURES_*
  x86/cpu: Detect VMX features on Intel, Centaur and Zhaoxin CPUs
  x86/vmx: Introduce VMX_FEATURES_*
  x86/cpu: Clear VMX feature flag if VMX is not fully enabled
  x86/zhaoxin: Use common IA32_FEAT_CTL MSR initialization
  x86/centaur: Use common IA32_FEAT_CTL MSR initialization
  x86/mce: WARN once if IA32_FEAT_CTL MSR is left unlocked
  x86/intel: Initialize IA32_FEAT_CTL MSR at boot
  tools/x86: Sync msr-index.h from kernel sources
  selftests, kvm: Replace manual MSR defs with common msr-index.h
  ...
2020-01-28 12:46:42 -08:00
Sean Christopherson
f6505c88bf tools/x86: Sync msr-index.h from kernel sources
Sync msr-index.h to pull in recent renames of the IA32_FEATURE_CONTROL
MSR definitions.  Update KVM's VMX selftest and turbostat accordingly.
Keep the full name in turbostat's output to avoid breaking someone's
workflow, e.g. if a script is looking for the full name.

While using the renamed defines is by no means necessary, do the sync
now to avoid leaving a landmine that will get stepped on the next time
msr-index.h needs to be refreshed for some other reason.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-4-sean.j.christopherson@intel.com
2020-01-13 17:42:57 +01:00
Arnaldo Carvalho de Melo
8122b047dd tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes from these csets:

  3f3c8be973 Merge tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
  4e3f77d841 ("xen/mcelog: add PPIN to record when available")
  db4d30fbb7 ("x86/bugs: Add ITLB_MULTIHIT bug infrastructure")
  1b42f01741 ("x86/speculation/taa: Add mitigation for TSX Async Abort")
  c2955f270a ("x86/msr: Add the IA32_TSX_CTRL MSR")

These are the changes in tooling that this update entails:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/before
  $
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/after
  $ diff -u /tmp/before /tmp/after
  --- /tmp/before	2019-12-02 11:54:44.371035723 -0300
  +++ /tmp/after	2019-12-02 11:55:31.847859784 -0300
  @@ -48,6 +48,7 @@
   	[0x00000119] = "IA32_BBL_CR_CTL",
   	[0x0000011e] = "IA32_BBL_CR_CTL3",
   	[0x00000120] = "IDT_MCR_CTRL",
  +	[0x00000122] = "IA32_TSX_CTRL",
   	[0x00000140] = "MISC_FEATURES_ENABLES",
   	[0x00000174] = "IA32_SYSENTER_CS",
   	[0x00000175] = "IA32_SYSENTER_ESP",
  @@ -283,4 +284,6 @@
   	[0xc0010240 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTL",
   	[0xc0010241 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTR",
   	[0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC",
  +	[0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL",
  +	[0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN",
   };
  $

  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD       /tmp/build/perf/trace/beauty/perf-in.o
  LD       /tmp/build/perf/perf-in.o

Now it is possible to use these strings when setting up filters for the msr:*
tracepoints, like:

  # perf trace -e msr:* --filter=msr==IA32_TSX_CTRL
  ^C[root@quaco ~]#

If we use an invalid operator, we can check what filter is put in
place:

  # perf trace -e msr:* --filter=msr=IA32_TSX_CTRL
  Failed to set filter "(msr=0x122) && (common_pid != 25976 && common_pid != 25860)" on event msr:read_msr with 22 (Invalid argument)

One can also use -v to see the tracepoints and their filters:

  # perf trace -v -e msr:* --filter=msr==IA32_TSX_CTRL
  Using CPUID GenuineIntel-6-8E-A
  New filter for msr:read_msr: (msr==0x122) && (common_pid != 26110 && common_pid != 25860)
  New filter for msr:write_msr: (msr==0x122) && (common_pid != 26110 && common_pid != 25860)
  New filter for msr:rdpmc: (msr==0x122) && (common_pid != 26110 && common_pid != 25860)
  mmap size 528384B
  ^C#

Better than having to keep looking up those numbers, and it works with
callchains as well, e.g. for something more common:

  # perf trace -e msr:*/max-stack=16/ --filter="msr==IA32_SPEC_CTRL" --max-events=2
       0.000 SCTP timer/6158 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         __switch_to_xtra ([kernel.kallsyms])
                                         __switch_to ([kernel.kallsyms])
                                         __sched_text_start ([kernel.kallsyms])
                                         schedule ([kernel.kallsyms])
                                         schedule_hrtimeout_range_clock ([kernel.kallsyms])
                                         poll_schedule_timeout.constprop.0 ([kernel.kallsyms])
                                         do_select ([kernel.kallsyms])
                                         core_sys_select ([kernel.kallsyms])
                                         kern_select ([kernel.kallsyms])
                                         __x64_sys_select ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64 ([kernel.kallsyms])
                                         __select (/usr/lib64/libc-2.29.so)
                                         [0] ([unknown])
       0.024 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         __switch_to_xtra ([kernel.kallsyms])
                                         __switch_to ([kernel.kallsyms])
                                         __sched_text_start ([kernel.kallsyms])
                                         schedule_idle ([kernel.kallsyms])
                                         do_idle ([kernel.kallsyms])
                                         cpu_startup_entry ([kernel.kallsyms])
                                         start_secondary ([kernel.kallsyms])
                                         [0x2000d4] ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineela Tummalapalli <vineela.tummalapalli@intel.com>
Link: https://lkml.kernel.org/n/tip-n1xd78fpd5lxn4q1brqi2jl6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02 12:03:49 -03:00
Arnaldo Carvalho de Melo
444e2ff34d tools arch x86: Grab a copy of the file containing the MSR numbers
We'll use it to generate a table and then use that to convert the
'msr' argument of the msr:{read,write}_msr tracepoints in things like
perf trace, perf script, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-y1f4s0y1s43d4drh7pd2huzn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00