Commit Graph

738556 Commits

Author SHA1 Message Date
Masahiro Yamada
fe852ac200 kbuild: simplify modname calculation
modname can be calculated much more simply.  If modname-multi is
empty, it is a single-used object.  So, modname = $(basetarget).
Otherwise, modname = $(modname-multi).

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Cao jin <caoj.fnst@cn.fujitsu.com>
2018-03-26 02:01:25 +09:00
Cao jin
c96a294eb6 kbuild: fix modname for composite modules
Commit cf4f21938e ("kbuild: Allow to specify composite modules
with modname-m") added modname-m support, but missed to update the
corresponding multi-objs-m & modname-multi definition.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:25 +09:00
Masahiro Yamada
aeacb019b6 kbuild: define KBUILD_MODNAME even if multiple modules share objects
Currently, KBUILD_MODNAME is defined only when $(modname) contains
just one word.  If an object is shared among multiple modules,
undefined KBUILD_MODNAME could cause a build error.  For example,
if CONFIG_DYNAMIC_DEBUG is enabled, any call of printk() populates
.modname, then fails to build due to undefined KBUILD_MODNAME.

Take the following code as an example:

  obj-m += foo.o
  obj-m += bar.o
  foo-objs := foo-bar-common.o foo-only.o
  bar-objs := foo-bar-common.o bar-only.o

In this case, there is room for argument what to define for
KBUILD_MODNAME when foo-bar-common.o is being compiled.
"foo", "bar", or what else?

One idea is to define colon-separated modules that share the object,
in this case, "bar:foo" (modules are sorted alphabetically by
$(sort ...)).

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Cao jin <caoj.fnst@cn.fujitsu.com>
2018-03-26 02:01:25 +09:00
Masahiro Yamada
8cd0e46d3f kbuild: remove unnecessary $(subst $(obj)/, , ...) in modname-multi
In the context ...

    $(obj)/%.s: $(src)/%.c FORCE
            $(call if_changed_dep,cc_s_c)

    $(obj)/%.i: $(src)/%.c FORCE
            $(call if_changed_dep,cpp_i_c)

    $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
            $(call cmd,force_checksrc)
            $(call if_changed_rule,cc_o_c)

    $(obj)/%.lst: $(src)/%.c FORCE
            $(call if_changed_dep,cc_lst_c)

'$*' returns the stem of the target (the part of '%'), so $(obj)/ has
already been ripped off.

$(subst $(obj)/,,$*.o) is the same as $*.o

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Cao jin <caoj.fnst@cn.fujitsu.com>
2018-03-26 02:01:24 +09:00
Michael Forney
a670b0b4ae kbuild: Use ls(1) instead of stat(1) to obtain file size
stat(1) is not standardized and different implementations have their own
(conflicting) flags for querying the size of a file.

ls(1) provides the same information (value of st.st_size) in the 5th
column, except when the file is a character or block device. This output
is standardized[0]. The -n option turns on -l, which writes lines
formatted like

  "%s %u %s %s %u %s %s\n", <file mode>, <number of links>,
      <owner name>, <group name>, <size>, <date and time>,
      <pathname>

but instead of writing the <owner name> and <group name>, it writes the
numeric owner and group IDs (this avoids /etc/passwd and /etc/group
lookups as well as potential field splitting issues).

The <size> field is specified as "the value that would be returned for
the file in the st_size field of struct stat".

To avoid duplicating logic in several locations in the tree, create
scripts/file-size.sh and update callers to use that instead of stat(1).

[0] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html#tag_20_73_10

Signed-off-by: Michael Forney <forney@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:24 +09:00
Masahiro Yamada
3fdc7d3fe4 kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS
If CONFIG_TRIM_UNUSED_KSYMS is enabled and the kernel is built from
a pristine state, the vmlinux is linked twice.

[1] A user runs 'make'

[2] First build with empty autoksyms.h

[3] adjust_autoksyms.sh updates autoksyms.h and recurses 'make vmlinux'

  --------(begin sub-make)--------
  [4] Second build with new autoksyms.h

  [5] link-vmlinux.sh is invoked because vmlinux is missing
  ---------(end sub-make)---------

[6] link-vmlinux.sh is invoked again despite vmlinux is up-to-date.

The reason of [6] is probably because Make already decided to update
vmlinux at the time of [2] because vmlinux was missing when Make
built up the dependency graph.

Because if_changed is implemented based on $?, this issue can be
narrowed down to how Make handles $?.

You can test it with the following simple code:

[Test Makefile]
  A: B
          @echo newer prerequisite: $?
          cp B A

  B: C
          cp C B
          touch A

[Result]
  $ rm -f A B
  $ touch C
  $ make
  cp C B
  touch A
  newer prerequisite: B
  cp B A

Here, 'A' has been touched in the recipe of 'B'.  So, the dependency
'A: B' has already been met before the recipe of 'A' is executed.
However, Make does not notice the fact that the recipe of 'B' also
updates 'A' as a side-effect.

The situation is similar in this case; the vmlinux has actually been
updated in the vmlinux_prereq target.  Make cannot predict this, so
judges the vmlinux is old.

link-vmlinux.sh is costly, so it is better to not run it when unneeded.
Split CONFIG_TRIM_UNUSED_KSYMS recursion to a dedicated target.

The reason of commit 2441e78b19 ("kbuild: better abstract vmlinux
sequential prerequisites") was to cater to CONFIG_BUILD_DOCSRC, but
it was later removed by commit 1848929251 ("samples: move blackfin
gptimers-example from Documentation").

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
2018-03-26 02:01:24 +09:00
Masahiro Yamada
fbfa9be990 kbuild: move include/config/ksym/* to include/ksym/*
The idea of using fixdep was inspired by Kconfig, but autoksyms
belongs to a different group.  So, I want to move those touched
files under include/config/ksym/ to include/ksym/.

The directory include/ksym/ can be removed by 'make clean' because
it is meaningless for the external module building.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
2018-03-26 02:01:23 +09:00
Masahiro Yamada
1f50b80a09 kbuild: move CONFIG_TRIM_UNUSED_KSYMS code unneeded for external module
The external module building does not need to parse this code because
KBUILD_MODULES is always set anyway.

Move this code inside the "ifeq ($(KBUILD_EXTMOD),) ... endif" block.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
2018-03-26 02:01:23 +09:00
Masahiro Yamada
07a422bb21 kbuild: restore autoksyms.h touch to the top Makefile
Commit d3fc425e81 ("kbuild: make sure autoksyms.h exists early")
moved the code that touches autoksyms.h to scripts/kconfig/Makefile
with obscure reason.

From Nicolas' comment [1], he did not seem to be sure about the root
cause.

I guess I figured it out, so here is a fix-up I think is more correct.
According to the error log in the original post [2], the build failed
in scripts/mod/devicetable-offsets.c

scripts/mod/Makefile is descended from scripts/Makefile, which is
invoked from the top-level Makefile by the 'scripts' target.

To build vmlinux and/or modules, Kbuild descend into $(vmlinux-dirs).
This depends on 'prepare' and 'scripts' as follows:

  $(vmlinux-dirs): prepare scripts

Because there is no dependency between 'prepare' and 'scripts', the
parallel building can execute them simultaneously.

'prepare' depends on 'prepare1', which touched autoksyms.h, while
'scripts' descends into script/, then scripts/mod/, which needs
<generated/autoksyms.h> if CONFIG_TRIM_UNUSED_KSYMS.  It was the
reason of the race.

I am not happy to have unrelated code in the Kconfig Makefile, so
getting it back to the top Makefile.

I removed the standalone test target because I want to use it to
create an empty autoksyms.h file.  Here is a little improvement;
unnecessary autoksyms.h is not created when CONFIG_TRIM_UNUSED_KSYMS
is disabled.

[1] https://lkml.org/lkml/2016/11/30/734
[2] https://lkml.org/lkml/2016/11/30/531

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
2018-03-26 02:01:22 +09:00
Masahiro Yamada
d8821622c8 kbuild: move 'scripts' target below
Just a trivial change to prepare for the next commit.
This target is still invisible from external module building.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:22 +09:00
Masahiro Yamada
baa16684b0 kbuild: remove wrong 'touch' in adjust_autoksyms.sh
The comment mentions it creates autoksyms.h in case it is missing,
but the actual code touches it when it does exists.

The build system creates it anyway because <linux/export.h> and
<asm-generic/export.h> need it.

The code would not have worked as intended, and people have not
noticed it.  This is a proof that we can simply remove it.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
2018-03-26 02:01:22 +09:00
Masahiro Yamada
ce99d0bf31 kbuild: clear LDFLAGS in the top Makefile
Currently LDFLAGS is not cleared, so same flags are accumulated in
LDFLAGS when the top Makefile is recursively invoked.

I found unneeded rebuild for ARCH=arm64 when CONFIG_TRIM_UNUSED_KSYMS
is enabled.  If include/generated/autoksyms.h is updated, the top
Makefile is recursively invoked, then arch/arm64/Makefile adds one
more '-maarch64linux'.  Due to the command line change, modules are
rebuilt needlessly.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
2018-03-26 02:01:21 +09:00
Masahiro Yamada
bd0f98eba6 kbuild: remove internally used LDFLAGS_vmlinux from kbuild.txt
Documentation/kbuild/makefiles.txt lists variables used in Makefile
whereas Documentation/kbuild/kbuild.txt describes user assignable
parameters given via environments or the command line.

The top Makefile and arch/*/Makefile accumulate proper linker flags to
LDFLAGS_vmlinux.  So, users can not override it from the command line.
Generally, per-file options are not supposed to be user-assignable.
Remove the misleading entry from kbuild.txt.

If we need a way to append user-specific flags for linking the kernel,
LDFLAGS_KERNEL would be a consistent choice because we already expose
LDFLAGS_MODULE counter-part to users.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:21 +09:00
Masahiro Yamada
35cd02bee6 kbuild: remove command line interface LDFLAGS_MODULE from makefiles.txt
Documentation/kbuild/makefiles.txt lists variables used in Makefile
whereas Documentation/kbuild/kbuild.txt describes user assignable
parameters given via environments or the command line.

LDFLAGS_MODULE is a command line interface, so it should be dropped
from makefiles.txt.

Some lines below in this file, it is clearly explained that
KBUILD_LDFLAGS_MODULE is the right one for the internal use:

    KBUILD_LDFLAGS_MODULE   Options for $(LD) when linking modules

        $(KBUILD_LDFLAGS_MODULE) is used to add arch-specific options
        used when linking modules. This is often a linker script.
        From commandline LDFLAGS_MODULE shall be used (see kbuild.txt).

Then, kbuild.txt explains LDFLAGS_MODULE, like follows:

    LDFLAGS_MODULE
    --------------------------------------------------
    Additional options used for $(LD) when linking modules.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:20 +09:00
Masahiro Yamada
0294e6f4a0 kbuild: simplify ld-option implementation
Currently, linker options are tested by the coordination of $(CC) and
$(LD) because $(LD) needs some object to link.

As commit 86a9df597c ("kbuild: fix linker feature test macros when
cross compiling with Clang") addressed, we need to make sure $(CC)
and $(LD) agree the underlying architecture of the passed object.

This could be a bit complex when we combine tools from different groups.
For example, we can use clang for $(CC), but we still need to rely on
GCC toolchain for $(LD).

So, I was searching for a way of standalone testing of linker options.
A trick I found is to use '-v'; this not only prints the version string,
but also tests if the given option is recognized.

If a given option is supported,

  $ aarch64-linux-gnu-ld -v --fix-cortex-a53-843419
  GNU ld (Linaro_Binutils-2017.11) 2.28.2.20170706
  $ echo $?
  0

If unsupported,

  $ aarch64-linux-gnu-ld -v --fix-cortex-a53-843419
  GNU ld (crosstool-NG linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) 2.23.1
  aarch64-linux-gnu-ld: unrecognized option '--fix-cortex-a53-843419'
  aarch64-linux-gnu-ld: use the --help option for usage information
  $ echo $?
  1

Gold works likewise.

  $ aarch64-linux-gnu-ld.gold -v --fix-cortex-a53-843419
  GNU gold (Linaro_Binutils-2017.11 2.28.2.20170706) 1.14
  masahiro@pug:~/ref/linux$ echo $?
  0
  $ aarch64-linux-gnu-ld.gold -v --fix-cortex-a53-999999
  GNU gold (Linaro_Binutils-2017.11 2.28.2.20170706) 1.14
  aarch64-linux-gnu-ld.gold: --fix-cortex-a53-999999: unknown option
  aarch64-linux-gnu-ld.gold: use the --help option for usage information
  $ echo $?
  1

LLD too.

  $ ld.lld -v --gc-sections
  LLD 7.0.0 (http://llvm.org/git/lld.git 4a0e4190e74cea19f8a8dc625ccaebdf8b5d1585) (compatible with GNU linkers)
  $ echo $?
  0
  $ ld.lld -v --fix-cortex-a53-843419
  LLD 7.0.0 (http://llvm.org/git/lld.git 4a0e4190e74cea19f8a8dc625ccaebdf8b5d1585) (compatible with GNU linkers)
  $ echo $?
  0
  $ ld.lld -v --fix-cortex-a53-999999
  ld.lld: error: unknown argument: --fix-cortex-a53-999999
  LLD 7.0.0 (http://llvm.org/git/lld.git 4a0e4190e74cea19f8a8dc625ccaebdf8b5d1585) (compatible with GNU linkers)
  $ echo $?
  1

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
2018-03-26 02:01:20 +09:00
Masahiro Yamada
22340a0653 kbuild: process mixture of clean/build targets one by one
Support parallel building of clean, config, and build targets in a
single command.

For example,

  make -j<N> clean all

or

  make -j<N> mrproper defconfig all

They should be handled one by one.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:20 +09:00
Nicholas Piggin
f49821ee32 kbuild: rename built-in.o to built-in.a
Incremental linking is gone, so rename built-in.o to built-in.a, which
is the usual extension for archive files.

This patch does two things, first is a simple search/replace:

git grep -l 'built-in\.o' | xargs sed -i 's/built-in\.o/built-in\.a/g'

The second is to invert nesting of nested text manipulations to avoid
filtering built-in.a out from libs-y2:

-libs-y2 := $(filter-out %.a, $(patsubst %/, %/built-in.a, $(libs-y)))
+libs-y2 := $(patsubst %/, %/built-in.a, $(filter-out %.a, $(libs-y)))

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:19 +09:00
Nicholas Piggin
6358d6e8b9 kbuild: remove incremental linking option
This removes the old `ld -r` incremental link option, which has not
been selected by any architecture since June 2017.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:19 +09:00
Michael Forney
1fe7d2bb24 kbuild: Improve portability of some sed invocations
* Use BREs where EREs aren't necessary.
* Pass -E instead of -r to use EREs. This will be standardized in the
  next POSIX revision[0]. GNU sed supports this since 4.2 (May 2009),
  and busybox since 1.22.0 (Jan 2014).
* Use the [:space:] character class instead of ` \t` in bracket
  expressions. In bracket expressions, POSIX says that <backslash> loses
  its special meaning, so a conforming implementation cannot expand \t
  to <tab>[1].
* In BREs, use interval expressions (\{n,m\}) instead of non-standard
  features like \+ and \?.
* Use a loop instead of -s flag.

There are still plenty of other cases of non-standard sed invocations
(use of ERE features in BREs, in-place editing), but this fixes some
core ones.

[0] http://austingroupbugs.net/view.php?id=528
[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05

Signed-off-by: Michael Forney <forney@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:18 +09:00
Sami Tolvanen
ae0c553c24 kbuild: add clang-version.sh
Based on gcc-version.sh, clang-version.sh prints out the correct
version of clang.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:18 +09:00
Linus Torvalds
0c8efd610b Linux 4.16-rc5 2018-03-11 17:25:09 -07:00
Linus Torvalds
ed58d66f60 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner:
 "Yet another pile of melted spectrum related updates:

   - Drop native vsyscall support finally as it causes more trouble than
     benefit.

   - Make microcode loading more robust. There were a few issues
     especially related to late loading which are now surfacing because
     late loading of the IB* microcodes addressing spectre issues has
     become more widely used.

   - Simplify and robustify the syscall handling in the entry code

   - Prevent kprobes on the entry trampoline code which lead to kernel
     crashes when the probe hits before CR3 is updated

   - Don't check microcode versions when running on hypervisors as they
     are considered as lying anyway.

   - Fix the 32bit objtool build and a coment typo"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kprobes: Fix kernel crash when probing .entry_trampoline code
  x86/pti: Fix a comment typo
  x86/microcode: Synchronize late microcode loading
  x86/microcode: Request microcode on the BSP
  x86/microcode/intel: Look into the patch cache first
  x86/microcode: Do not upload microcode if CPUs are offline
  x86/microcode/intel: Writeback and invalidate caches before updating microcode
  x86/microcode/intel: Check microcode revision before updating sibling threads
  x86/microcode: Get rid of struct apply_microcode_ctx
  x86/spectre_v2: Don't check microcode versions when running under hypervisors
  x86/vsyscall/64: Drop "native" vsyscalls
  x86/entry/64/compat: Save one instruction in entry_INT80_compat()
  x86/entry: Do not special-case clone(2) in compat entry
  x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscalls
  x86/syscalls: Use proper syscall definition for sys_ioperm()
  x86/entry: Remove stale syscall prototype
  x86/syscalls/32: Simplify $entry == $compat entries
  objtool: Fix 32-bit build
2018-03-11 14:59:23 -07:00
Linus Torvalds
1ad5daa653 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "Just a single fix which adds a missing Kconfig dependency to avoid
  unmet dependency warnings"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependency
2018-03-11 14:55:15 -07:00
Linus Torvalds
ebb3762e88 Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS fixes from Thomas Gleixner:
 "Two small fixes for RAS/MCE:

   - Serialize sysfs changes to avoid concurrent modificaiton of
     underlying data

   - Add microcode revision to Machine Check records. This should have
     been there forever, but now with the broken microcode versions in
     the wild it has become important"

* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/MCE: Serialize sysfs changes
  x86/MCE: Save microcode revision in machine check records
2018-03-11 14:52:41 -07:00
Linus Torvalds
8ad4424350 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Thomas Gleixner:
 "Another set of perf updates:

   - Fix a Skylake Uncore event format declaration

   - Prevent perf pipe mode from crahsing which was caused by a missing
     buffer allocation

   - Make the perf top popup message which tells the user that it uses
     fallback mode on older kernels a debug message.

   - Make perf context rescheduling work correcctly

   - Robustify the jump error drawing in perf browser mode so it does
     not try to create references to NULL initialized offset entries

   - Make trigger_on() robust so it does not enable the trigger before
     everything is set up correctly to handle it

   - Make perf auxtrace respect the --no-itrace option so it does not
     try to queue AUX data for decoding.

   - Prevent having different number of field separators in CVS output
     lines when a counter is not supported.

   - Make the perf kallsyms man page usage behave like it does for all
     other perf commands.

   - Synchronize the kernel headers"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix ctx_event_type in ctx_resched()
  perf tools: Fix trigger class trigger_on()
  perf auxtrace: Prevent decoding when --no-itrace
  perf stat: Fix CVS output format for non-supported counters
  tools headers: Sync x86's cpufeatures.h
  tools headers: Sync copy of kvm UAPI headers
  perf record: Fix crash in pipe mode
  perf annotate browser: Be more robust when drawing jump arrows
  perf top: Fix annoying fallback message on older kernels
  perf kallsyms: Fix the usage on the man page
  perf/x86/intel/uncore: Fix Skylake UPI event format
2018-03-11 14:49:49 -07:00
Linus Torvalds
02bf0ef028 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Thomas Gleixner:
 "rt_mutex_futex_unlock() grew a new irq-off call site, but the function
  assumes that its always called from irq enabled context.

  Use (un)lock_irqsafe() to handle the new call site correctly"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rtmutex: Make rt_mutex_futex_unlock() safe for irq-off callsites
2018-03-11 14:46:54 -07:00
Linus Torvalds
abeb75218a Merge tag 'dmaengine-fix-4.16-rc5' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
 "Two small fixes are for this cycle:

   - fix max_chunk_size for rcar-dmac for R-Car Gen3

   - fix clock resource of mv_xor_v2"

* tag 'dmaengine-fix-4.16-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: mv_xor_v2: Fix clock resource by adding a register clock
  dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
2018-03-11 13:07:14 -07:00
Linus Torvalds
d43be80a4a Merge tag 'gpio-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
 "This is a single GPIO fix for the v4.16 series affecting the Renesas
  driver, and fixes wakeup from external stuff"

* tag 'gpio-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: rcar: Use wakeup_path i.s.o. explicit clock handling
2018-03-11 13:05:15 -07:00
Gregory CLEMENT
3cd2c313f1 dmaengine: mv_xor_v2: Fix clock resource by adding a register clock
On the CP110 components which are present on the Armada 7K/8K SoC we need
to explicitly enable the clock for the registers. However it is not
needed for the AP8xx component, that's why this clock is optional.

With this patch both clock have now a name, but in order to be backward
compatible, the name of the first clock is not used. It allows to still
use this clock with a device tree using the old binding.

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2018-03-11 20:33:27 +05:30
Linus Torvalds
3266b5bd97 Merge tag 'kbuild-fixes-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - make fixdep parse kconfig.h to fix missing rebuild

 - replace hyphens with underscores in builtin DTB label names

 - fix typos

* tag 'kbuild-fixes-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Handle builtin dtb file names containing hyphens
  scripts/bloat-o-meter: fix typos in help
  fixdep: do not ignore kconfig.h
  fixdep: remove some false CONFIG_ matches
  fixdep: remove stale references to uml-config.h
2018-03-10 10:21:07 -08:00
Linus Torvalds
23b33acc5f Merge tag 'linux-watchdog-4.16-fixes-2' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:

 - f71808e_wdt: Fix magic close handling

 - sbsa: 32-bit read fix for WCV

 - hpwdt: Remove legacy NMI sourcing

* tag 'linux-watchdog-4.16-fixes-2' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: hpwdt: Remove legacy NMI sourcing.
  watchdog: sbsa: use 32-bit read for WCV
  watchdog: f71808e_wdt: Fix magic close handling
2018-03-10 10:17:59 -08:00
Linus Torvalds
91a262096e Merge tag 'for-linus-20180309' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - a xen-blkfront fix from Bhavesh with a multiqueue fix when
   detaching/re-attaching

 - a few important NVMe fixes, including a revert for a sysfs fix that
   caused some user space confusion

 - two bcache fixes by way of Michael Lyle

 - a loop regression fix, fixing an issue with lost writes on DAX.

* tag 'for-linus-20180309' of git://git.kernel.dk/linux-block:
  loop: Fix lost writes caused by missing flag
  nvme_fc: rework sqsize handling
  nvme-fabrics: Ignore nr_io_queues option for discovery controllers
  xen-blkfront: move negotiate_mq to cover all cases of new VBDs
  Revert "nvme: create 'slaves' and 'holders' entries for hidden controllers"
  bcache: don't attach backing with duplicate UUID
  bcache: fix crashes in duplicate cache device register
  nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
  nvme-pci: Fix EEH failure on ppc
2018-03-10 08:48:01 -08:00
Linus Torvalds
b3b25b1d9e Merge tag 'for-4.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:

 - Fix an uninitialized variable false warning in dm bufio

 - Fix DM's passthrough ioctl support to be race free against an
   underlying device being removed.

 - Fix corner-case of DM raid resync reporting if/when the raid becomes
   degraded during resync; otherwise automated raid repair will fail.

 - A few DM multipath fixes to make non-SCSI optimizations, that were
   introduced during the 4.16 merge, useful for all non-SCSI devices,
   rather than narrowly define this non-SCSI mode in terms of "nvme".

   This allows the removal of "queue_mode nvme" that really didn't need
   to be introduced. Instead DM core will internalize whether
   nvme-specific IO submission optimizations are doable and DM multipath
   will only do SCSI-specific device handler operations if SCSI is in
   use.

* tag 'for-4.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm table: allow upgrade from bio-based to specialized bio-based variant
  dm mpath: remove unnecessary NVMe branching in favor of scsi_dh checks
  dm table: fix "nvme" test
  dm raid: fix incorrect sync_ratio when degraded
  dm: use blkdev_get rather than bdgrab when issuing pass-through ioctl
  dm bufio: avoid false-positive Wmaybe-uninitialized warning
2018-03-10 08:45:44 -08:00
Linus Torvalds
2f64e70cd0 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Doug Ledford:

 - Various driver bug fixes in mlx5, mlx4, bnxt_re and qedr, ranging
   from bugs under load to bad error case handling

 - There in one largish patch fixing the locking in bnxt_re to avoid a
   machine hard lock situation

 - A few core bugs on error paths

 - A patch to reduce stack usage in the new CQ API

 - One mlx5 regression introduced in this merge window

 - There were new syzkaller scripts written for the RDMA subsystem and
   we are fixing issues found by the bot

 - One of the commits (aa0de36a40 “RDMA/mlx5: Fix integer overflow
   while resizing CQ”) is missing part of the commit log message and one
   of the SOB lines. The original patch was from Leon Romanovsky, and a
   cut-n-paste separator in the commit message confused patchworks which
   then put the end of message separator in the wrong place in the
   downloaded patch, and I didn’t notice in time. The patch made it into
   the official branch, and the only way to fix it in-place was to
   rebase. Given the pain that a rebase causes, and the fact that the
   patch has relevant tags for stable and syzkaller, a revert of the
   munged patch and a reapplication of the original patch with the log
   message intact was done.

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (25 commits)
  RDMA/mlx5: Fix integer overflow while resizing CQ
  Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"
  RDMA/ucma: Check that user doesn't overflow QP state
  RDMA/mlx5: Fix integer overflow while resizing CQ
  RDMA/ucma: Limit possible option size
  IB/core: Fix possible crash to access NULL netdev
  RDMA/bnxt_re: Avoid Hard lockup during error CQE processing
  RDMA/core: Reduce poll batch for direct cq polling
  IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()
  IB/mlx5: When not in dual port RoCE mode, use provided port as native
  IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE
  IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDs
  RDMA/qedr: Fix iWARP write and send with immediate
  RDMA/qedr: Fix kernel panic when running fio over NFSoRDMA
  RDMA/qedr: Fix iWARP connect with port mapper
  RDMA/qedr: Fix ipv6 destination address resolution
  IB/core : Add null pointer check in addr_resolve
  RDMA/bnxt_re: Fix the ib_reg failure cleanup
  RDMA/bnxt_re: Fix incorrect DB offset calculation
  RDMA/bnxt_re: Unconditionly fence non wire memory operations
  ...
2018-03-10 08:38:01 -08:00
Linus Torvalds
b3337a6c35 Merge tag 'platform-drivers-x86-v4.16-6' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fixes from Darren Hart:
 "Correct a module loading race condition between the DELL_SMBIOS
  backend modules and the first user by converting them to bool features
  of the DELL_SMBIOS driver. Fixup the resulting Kconfig dependency
  issue with DCDBAS"

* tag 'platform-drivers-x86-v4.16-6' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: dell-smbios: Resolve dependency error on DCDBAS
  platform/x86: Allow for SMBIOS backend defaults
  platform/x86: dell-smbios: Link all dell-smbios-* modules together
  platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
  platform/x86: dell-smbios: Correct some style warnings
2018-03-10 08:35:29 -08:00
Linus Torvalds
cdb06e9d8f Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
 "PPC:

   - Fix guest time accounting in the host

   - Fix large-page backing for radix guests on POWER9

   - Fix HPT guests on POWER9 backed by 2M or 1G pages

   - Compile fixes for some configs and gcc versions

  s390:

   - Fix random memory corruption when running as guest2 (e.g. KVM in
     LPAR) and starting guest3 (e.g. nested KVM) with many CPUs

   - Export forgotten io interrupt delivery statistics counter"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: fix memory overwrites when not using SCA entries
  KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN
  KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
  KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler
  KVM: s390: provide io interrupt kvm_stat
  KVM: PPC: Book3S: Fix compile error that occurs with some gcc versions
  KVM: PPC: Fix compile error that occurs when CONFIG_ALTIVEC=n
2018-03-09 16:59:19 -08:00
Linus Torvalds
39614481fb Merge tag 'for-linus-4.16a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
 "Just one fix for the correct error handling after a failed
  device_register()"

* tag 'for-linus-4.16a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: xenbus: use put_device() instead of kfree()
2018-03-09 16:54:18 -08:00
Linus Torvalds
4178802c77 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - The SMCCC firmware interface for the spectre variant 2 mitigation has
   been updated to allow the discovery of whether the CPU needs the
   workaround. This pull request relaxes the kernel check on the return
   value from firmware.

 - Fix the commit allowing changing from global to non-global page table
   entries which inadvertently disallowed other safe attribute changes.

 - Fix sleeping in atomic during the arm_perf_teardown_cpu() code.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery
  arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hook
  arm64: mm: fix thinko in non-global page table attribute check
2018-03-09 16:49:30 -08:00
Linus Torvalds
ed3c4dff8d Merge tag 'docs-4.16-fix' of git://git.lwn.net/linux
Pull Documentation build fix from Jonathan Corbet:
 "The Sphinx 1.7 release broke the build process for reasons that are
  mostly our fault.

  This is a single fix cherry-picked from docs-next that restores docs
  buildability for all supported Sphinx versions"

* tag 'docs-4.16-fix' of git://git.lwn.net/linux:
  Documentation/sphinx: Fix Directive import error
2018-03-09 16:45:57 -08:00
Linus Torvalds
cfc79ae844 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "8 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib/test_kmod.c: fix limit check on number of test devices created
  selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus
  mm/page_alloc: fix memmap_init_zone pageblock alignment
  mm/memblock.c: hardcode the end_pfn being -1
  mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT
  lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()
  bug: use %pB in BUG and stack protector failure
  hugetlb: fix surplus pages accounting
2018-03-09 16:42:25 -08:00
Luis R. Rodriguez
ac68b1b3b9 lib/test_kmod.c: fix limit check on number of test devices created
As reported by Dan the parentheses is in the wrong place, and since
unlikely() call returns either 0 or 1 it's never less than zero.  The
second issue is that signed integer overflows like "INT_MAX + 1" are
undefined behavior.

Since num_test_devs represents the number of devices, we want to stop
prior to hitting the max, and not rely on the wrap arround at all.  So
just cap at num_test_devs + 1, prior to assigning a new device.

Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org
Fixes: d9c6a72d6f ("kmod: add test driver to stress test the module loader")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:02 -08:00
Li Zhijian
0627be7d3c selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus
Fix userfaultfd_hugetlb on hosts which have more than 64 cpus.

  ---------------------------
  running userfaultfd_hugetlb
  ---------------------------
  invalid MiB
  Usage: <MiB> <bounces>
  [FAIL]

Via userfaultfd.c we can know, hugetlb_size needs to meet hugetlb_size
>= nr_cpus * hugepage_size.  hugepage_size is often 2M, so when host
cpus > 64, it requires more than 128M.

[zhijianx.li@intel.com: update changelog/comments and variable name]
 Link: http://lkml.kernel.org/r/20180302024356.83359-1-zhijianx.li@intel.com
 Link: http://lkml.kernel.org/r/20180303125027.81638-1-zhijianx.li@intel.com
Link: http://lkml.kernel.org/r/20180302024356.83359-1-zhijianx.li@intel.com
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: SeongJae Park <sj38.park@gmail.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Daniel Vacek
864b75f9d6 mm/page_alloc: fix memmap_init_zone pageblock alignment
Commit b92df1de5d ("mm: page_alloc: skip over regions of invalid pfns
where possible") introduced a bug where move_freepages() triggers a
VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
To fix this, simply align the skipped pfns in memmap_init_zone() the
same way as in move_freepages_block().

Seen in one of the RHEL reports:

  crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
  kernel BUG at mm/page_alloc.c:1389!
  invalid opcode: 0000 [#1] SMP
  --
  RIP: 0010:[<ffffffff8118833e>]  [<ffffffff8118833e>] move_freepages+0x15e/0x160
  RSP: 0018:ffff88054d727688  EFLAGS: 00010087
  --
  Call Trace:
   [<ffffffff811883b3>] move_freepages_block+0x73/0x80
   [<ffffffff81189e63>] __rmqueue+0x263/0x460
   [<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
   [<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
  --
  RIP  [<ffffffff8118833e>] move_freepages+0x15e/0x160
   RSP <ffff88054d727688>

  crash> page_init_bug -v | grep RAM
  <struct resource 0xffff88067fffd2f8>          1000 -        9bfff	System RAM (620.00 KiB)
  <struct resource 0xffff88067fffd3a0>        100000 -     430bffff	System RAM (  1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
  <struct resource 0xffff88067fffd410>      4b0c8000 -     4bf9cfff	System RAM ( 14.83 MiB = 15188.00 KiB)
  <struct resource 0xffff88067fffd480>      4bfac000 -     646b1fff	System RAM (391.02 MiB = 400408.00 KiB)
  <struct resource 0xffff88067fffd560>      7b788000 -     7b7fffff	System RAM (480.00 KiB)
  <struct resource 0xffff88067fffd640>     100000000 -    67fffffff	System RAM ( 22.00 GiB)

  crash> page_init_bug | head -6
  <struct resource 0xffff88067fffd560>      7b788000 -     7b7fffff	System RAM (480.00 KiB)
  <struct page 0xffffea0001ede200>   1fffff00000000  0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32          4096    1048575
  <struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
  <struct page 0xffffea0001ed8000>                0  0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA               1       4095
  <struct page 0xffffea0001edffc0>   1fffff00000400  0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32          4096    1048575
  BUG, zones differ!

Note that this range follows two not populated sections
68000000-77ffffff in this zone.  7b788000-7b7fffff is the first one
after a gap.  This makes memmap_init_zone() skip all the pfns up to the
beginning of this range.  But this range is not pageblock (2M) aligned.
In fact no range has to be.

  crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0001e00000  78000000                0        0  0 0
  ffffea0001ed7fc0  7b5ff000                0        0  0 0
  ffffea0001ed8000  7b600000                0        0  0 0	<<<<
  ffffea0001ede1c0  7b787000                0        0  0 0
  ffffea0001ede200  7b788000                0        0  1 1fffff00000000

Top part of page flags should contain nodeid and zonenr, which is not
the case for page ffffea0001ed8000 here (<<<<).

  crash> log | grep -o fffea0001ed[^\ ]* | sort -u
  fffea0001ed8000
  fffea0001eded20
  fffea0001edffc0

  crash> bt -r | grep -o fffea0001ed[^\ ]* | sort -u
  fffea0001ed8000
  fffea0001eded00
  fffea0001eded20
  fffea0001edffc0

Initialization of the whole beginning of the section is skipped up to
the start of the range due to the commit b92df1de5d.  Now any code
calling move_freepages_block() (like reusing the page from a freelist as
in this example) with a page from the beginning of the range will get
the page rounded down to start_page ffffea0001ed8000 and passed to
move_freepages() which crashes on assertion getting wrong zonenr.

  >         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));

Note, page_zone() derives the zone from page flags here.

From similar machine before commit b92df1de5d:

  crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  fffff73941e00000  78000000                0        0  1 1fffff00000000
  fffff73941ed7fc0  7b5ff000                0        0  1 1fffff00000000
  fffff73941ed8000  7b600000                0        0  1 1fffff00000000
  fffff73941edff80  7b7fe000                0        0  1 1fffff00000000
  fffff73941edffc0  7b7ff000 ffff8e67e04d3ae0     ad84  1 1fffff00020068 uptodate,lru,active,mappedtodisk

All the pages since the beginning of the section are initialized.
move_freepages()' not gonna blow up.

The same machine with this fix applied:

  crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0001e00000  78000000                0        0  0 0
  ffffea0001e00000  7b5ff000                0        0  0 0
  ffffea0001ed8000  7b600000                0        0  1 1fffff00000000
  ffffea0001edff80  7b7fe000                0        0  1 1fffff00000000
  ffffea0001edffc0  7b7ff000 ffff88017fb13720        8  2 1fffff00020068 uptodate,lru,active,mappedtodisk

At least the bare minimum of pages is initialized preventing the crash
as well.

Customers started to report this as soon as 7.4 (where b92df1de5d was
merged in RHEL) was released.  I remember reports from
September/October-ish times.  It's not easily reproduced and happens on
a handful of machines only.  I guess that's why.  But that does not make
it less serious, I think.

Though there actually is a report here:
  https://bugzilla.kernel.org/show_bug.cgi?id=196443

And there are reports for Fedora from July:
  https://bugzilla.redhat.com/show_bug.cgi?id=1473242
and CentOS:
  https://bugs.centos.org/view.php?id=13964
and we internally track several dozens reports for RHEL bug
  https://bugzilla.redhat.com/show_bug.cgi?id=1525121

Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.1520011945.git.neelx@redhat.com
Fixes: b92df1de5d ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Daniel Vacek
379b03b7fa mm/memblock.c: hardcode the end_pfn being -1
This is just a cleanup.  It aids handling the special end case in the
next commit.

[akpm@linux-foundation.org: make it work against current -linus, not against -mm]
[akpm@linux-foundation.org: make it work against current -linus, not against -mm some more]
Link: http://lkml.kernel.org/r/1ca478d4269125a99bcfb1ca04d7b88ac1aee924.1520011944.git.neelx@redhat.com
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Andrea Arcangeli
96312e6128 mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT
KVM is hanging during postcopy live migration with userfaultfd because
get_user_pages_unlocked is not capable to handle FOLL_NOWAIT.

Earlier FOLL_NOWAIT was only ever passed to get_user_pages.

Specifically faultin_page (the callee of get_user_pages_unlocked caller)
doesn't know that if FAULT_FLAG_RETRY_NOWAIT was set in the page fault
flags, when VM_FAULT_RETRY is returned, the mmap_sem wasn't actually
released (even if nonblocking is not NULL).  So it sets *nonblocking to
zero and the caller won't release the mmap_sem thinking it was already
released, but it wasn't because of FOLL_NOWAIT.

Link: http://lkml.kernel.org/r/20180302174343.5421-2-aarcange@redhat.com
Fixes: ce53053ce3 ("kvm: switch get_user_page_nowait() to get_user_pages_unlocked()")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Kees Cook
1b4cfe3c0a lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()
Commit b8347c2196 ("x86/debug: Handle warnings before the notifier
chain, to fix KGDB crash") changed the ordering of fixups, and did not
take into account the case of x86 processing non-WARN() and non-BUG()
exceptions.  This would lead to output of a false BUG line with no other
information.

In the case of a refcount exception, it would be immediately followed by
the refcount WARN(), producing very strange double-"cut here":

  lkdtm: attempting bad refcount_inc() overflow
  ------------[ cut here ]------------
  Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
  ------------[ cut here ]------------
  refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
  WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
  ...

In the prior ordering, exceptions were searched first:

   do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
   ...
                if (fixup_exception(regs, trapnr))
                        return 0;

  -               if (fixup_bug(regs, trapnr))
  -                       return 0;
  -

As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account
needing to search the exception list first, since that had already
happened.

So, instead of searching the exception list twice (once in
is_valid_bugaddr() and then again in fixup_exception()), just add a
simple sanity check to report_bug() that will immediately bail out if a
BUG() (or WARN()) entry is not found.

Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
Fixes: b8347c2196 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Kees Cook
0862ca422b bug: use %pB in BUG and stack protector failure
The BUG and stack protector reports were still using a raw %p.  This
changes it to %pB for more meaningful output.

Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast
Fixes: ad67b74d24 ("printk: hash addresses printed with %p")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Weinberger <richard.weinberger@gmail.com>,
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Michal Hocko
4704dea36d hugetlb: fix surplus pages accounting
Dan Rue has noticed that libhugetlbfs test suite fails counter test:

  # mount_point="/mnt/hugetlb/"
  # echo 200 > /proc/sys/vm/nr_hugepages
  # mkdir -p "${mount_point}"
  # mount -t hugetlbfs hugetlbfs "${mount_point}"
  # export LD_LIBRARY_PATH=/root/libhugetlbfs/libhugetlbfs-2.20/obj64
  # /root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters
  Starting testcase "/root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters", pid 3319
  Base pool size: 0
  Clean...
  FAIL    Line 326: Bad HugePages_Total: expected 0, actual 1

The bug was bisected to 0c397daea1 ("mm, hugetlb: further simplify
hugetlb allocation API").

The reason is that alloc_surplus_huge_page() misaccounts per node
surplus pages.  We should increase surplus_huge_pages_node rather than
nr_huge_pages_node which is already handled by alloc_fresh_huge_page.

Link: http://lkml.kernel.org/r/20180221191439.GM2231@dhcp22.suse.cz
Fixes: 0c397daea1 ("mm, hugetlb: further simplify hugetlb allocation API")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Dan Rue <dan.rue@linaro.org>
Tested-by: Dan Rue <dan.rue@linaro.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Leon Romanovsky
28e9091e31 RDMA/mlx5: Fix integer overflow while resizing CQ
The user can provide very large cqe_size which will cause to integer
overflow as it can be seen in the following UBSAN warning:

=======================================================================
UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/cq.c:1192:53
signed integer overflow:
64870 * 65536 cannot be represented in type 'int'
CPU: 0 PID: 267 Comm: syzkaller605279 Not tainted 4.15.0+ #90 Hardware
name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0xde/0x164
 ? dma_virt_map_sg+0x22c/0x22c
 ubsan_epilogue+0xe/0x81
 handle_overflow+0x1f3/0x251
 ? __ubsan_handle_negate_overflow+0x19b/0x19b
 ? lock_acquire+0x440/0x440
 mlx5_ib_resize_cq+0x17e7/0x1e40
 ? cyc2ns_read_end+0x10/0x10
 ? native_read_msr_safe+0x6c/0x9b
 ? cyc2ns_read_end+0x10/0x10
 ? mlx5_ib_modify_cq+0x220/0x220
 ? sched_clock_cpu+0x18/0x200
 ? lookup_get_idr_uobject+0x200/0x200
 ? rdma_lookup_get_uobject+0x145/0x2f0
 ib_uverbs_resize_cq+0x207/0x3e0
 ? ib_uverbs_ex_create_cq+0x250/0x250
 ib_uverbs_write+0x7f9/0xef0
 ? cyc2ns_read_end+0x10/0x10
 ? print_irqtrace_events+0x280/0x280
 ? ib_uverbs_ex_create_cq+0x250/0x250
 ? uverbs_devnode+0x110/0x110
 ? sched_clock_cpu+0x18/0x200
 ? do_raw_spin_trylock+0x100/0x100
 ? __lru_cache_add+0x16e/0x290
 __vfs_write+0x10d/0x700
 ? uverbs_devnode+0x110/0x110
 ? kernel_read+0x170/0x170
 ? sched_clock_cpu+0x18/0x200
 ? security_file_permission+0x93/0x260
 vfs_write+0x1b0/0x550
 SyS_write+0xc7/0x1a0
 ? SyS_read+0x1a0/0x1a0
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 entry_SYSCALL_64_fastpath+0x1e/0x8b
RIP: 0033:0x433549
RSP: 002b:00007ffe63bd1ea8 EFLAGS: 00000217
=======================================================================

Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 3.13
Fixes: bde51583f4 ("IB/mlx5: Add support for resize CQ")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09 18:10:48 -05:00
Doug Ledford
212a0cbc56 Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"
The original commit of this patch has a munged log message that is
missing several of the tags the original author intended to be on the
patch.  This was due to patchworks misinterpreting a cut-n-paste
separator line as an end of message line and munging the mbox that was
used to import the patch:

https://patchwork.kernel.org/patch/10264089/

The original patch will be reapplied with a fixed commit message so the
proper tags are applied.

This reverts commit aa0de36a40.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09 18:07:46 -05:00