Commit Graph

1362 Commits

Author SHA1 Message Date
Jason A. Donenfeld
ec25e386d3 random: fix sysctl documentation nits
commit 069c4ea6871c18bd368f27756e0f91ffb524a788 upstream.

A semicolon was missing, and the almost-alphabetical-but-not ordering
was confusing, so regroup these by category instead.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-30 09:33:40 +02:00
Jason A. Donenfeld
f3bc5eca83 random: treat bootloader trust toggle the same way as cpu trust toggle
commit d97c68d178fbf8aaaf21b69b446f2dfb13909316 upstream.

If CONFIG_RANDOM_TRUST_CPU is set, the RNG initializes using RDRAND.
But, the user can disable (or enable) this behavior by setting
`random.trust_cpu=0/1` on the kernel command line. This allows system
builders to do reasonable things while avoiding howls from tinfoil
hatters. (Or vice versa.)

CONFIG_RANDOM_TRUST_BOOTLOADER is basically the same thing, but regards
the seed passed via EFI or device tree, which might come from RDRAND or
a TPM or somewhere else. In order to allow distros to more easily enable
this while avoiding those same howls (or vice versa), this commit adds
the corresponding `random.trust_bootloader=0/1` toggle.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Graham Christensen <graham@grahamc.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://github.com/NixOS/nixpkgs/pull/165355
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-30 09:33:38 +02:00
Jason A. Donenfeld
17ad693cd2 random: remove ifdef'd out interrupt bench
commit 95e6060c20a7f5db60163274c5222a725ac118f9 upstream.

With tools like kbench9000 giving more finegrained responses, and this
basically never having been used ever since it was initially added,
let's just get rid of this. There *is* still work to be done on the
interrupt handler, but this really isn't the way it's being developed.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-30 09:33:34 +02:00
Jason A. Donenfeld
32d1d7ce3a random: always wake up entropy writers after extraction
commit 489c7fc44b5740d377e8cfdbf0851036e493af00 upstream.

Now that POOL_BITS == POOL_MIN_BITS, we must unconditionally wake up
entropy writers after every extraction. Therefore there's no point of
write_wakeup_threshold, so we can move it to the dustbin of unused
compatibility sysctls. While we're at it, we can fix a small comparison
where we were waking up after <= min rather than < min.

Cc: Theodore Ts'o <tytso@mit.edu>
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-30 09:33:31 +02:00
Guilherme G. Piccoli
415edc68b6 docs: sysctl/kernel: add missing bit to panic_print
commit a1ff1de00db21ecb956213f046b79741b64c6b65 upstream.

Patch series "Some improvements on panic_print".

This is a mix of a documentation fix with some additions to the
"panic_print" syscall / parameter.  The goal here is being able to collect
all CPUs backtraces during a panic event and also to enable "panic_print"
in a kdump event - details of the reasoning and design choices in the
patches.

This patch (of 3):

Commit de6da1e8bc ("panic: add an option to replay all the printk
message in buffer") added a new bit to the sysctl/kernel parameter
"panic_print", but the documentation was added only in
kernel-parameters.txt, not in the sysctl guide.

Fix it here by adding bit 5 to sysctl admin-guide documentation.

[rdunlap@infradead.org: fix table format warning]
  Link: https://lkml.kernel.org/r/20220109055635.6999-1-rdunlap@infradead.org

Link: https://lkml.kernel.org/r/20211109202848.610874-1-gpiccoli@igalia.com
Link: https://lkml.kernel.org/r/20211109202848.610874-2-gpiccoli@igalia.com
Fixes: de6da1e8bc ("panic: add an option to replay all the printk message in buffer")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Reviewed-by: Feng Tang <feng.tang@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-08 14:40:44 +02:00
Kim Phillips
e335384560 x86/speculation: Update link to AMD speculation whitepaper
commit e9b6013a7ce31535b04b02ba99babefe8a8599fa upstream.

Update the link to the "Software Techniques for Managing Speculation
on AMD Processors" whitepaper.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-11 12:11:50 +01:00
Peter Zijlstra
071e8b69d7 Documentation/hw-vuln: Update spectre doc
commit 5ad3eb1132453b9795ce5fd4572b1c18b292cca9 upstream.

Update the doc with the new fun.

  [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
[fllinden@amazon.com: backported to 5.10]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-11 12:11:49 +01:00
Lukas Bulwahn
5d38cbf66d Documentation: refer to config RANDOMIZE_BASE for kernel address-space randomization
commit 82ca67321f55a8d1da6ac3ed611da3c32818bb37 upstream.

The config RANDOMIZE_SLAB does not exist, the authors probably intended to
refer to the config RANDOMIZE_BASE, which provides kernel address-space
randomization. They probably just confused SLAB with BASE (these two
four-letter words coincidentally share three common letters), as they also
point out the config SLAB_FREELIST_RANDOM as further randomization within
the same sentence.

Fix the reference of the config for kernel address-space randomization to
the config that provides that.

Fixes: 6e88559470 ("Documentation: Add section about CPU vulnerabilities for Spectre")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20211230171940.27558-1-lukas.bulwahn@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27 10:54:29 +01:00
Daniel Borkmann
8c15bfb36a bpf: Add kconfig knob for disabling unpriv bpf by default
commit 08389d888287c3823f80b0216766b71e17f0aba5 upstream.

Add a kconfig knob which allows for unprivileged bpf to be disabled by default.
If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2.

This still allows a transition of 2 -> {0,1} through an admin. Similarly,
this also still keeps 1 -> {1} behavior intact, so that once set to permanently
disabled, it cannot be undone aside from a reboot.

We've also added extra2 with max of 2 for the procfs handler, so that an admin
still has a chance to toggle between 0 <-> 2.

Either way, as an additional alternative, applications can make use of CAP_BPF
that we added a while ago.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net
Cc: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-05 12:40:34 +01:00
Takashi Iwai
bb672eff74 Input: i8042 - add deferred probe support
[ Upstream commit 9222ba68c3f4065f6364b99cc641b6b019ef2d42 ]

We've got a bug report about the non-working keyboard on ASUS ZenBook
UX425UA.  It seems that the PS/2 device isn't ready immediately at
boot but takes some seconds to get ready.  Until now, the only
workaround is to defer the probe, but it's available only when the
driver is a module.  However, many distros, including openSUSE as in
the original report, build the PS/2 input drivers into kernel, hence
it won't work easily.

This patch adds the support for the deferred probe for i8042 stuff as
a workaround of the problem above.  When the deferred probe mode is
enabled and the device couldn't be probed, it'll be repeated with the
standard deferred probe mechanism.

The deferred probe mode is enabled either via the new option
i8042.probe_defer or via the quirk table entry.  As of this patch, the
quirk table contains only ASUS ZenBook UX425UA.

The deferred probe part is based on Fabio's initial work.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1190256
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Samuel Čavoj <samuel@cavoj.net>
Link: https://lore.kernel.org/r/20211117063757.11380-1-tiwai@suse.de

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-05 12:40:29 +01:00
Sean Christopherson
8008fc1d0b KVM: VMX: Fix stale docs for kvm-intel.emulate_invalid_guest_state
commit 0ff29701ffad9a5d5a24344d8b09f3af7b96ffda upstream.

Update the documentation for kvm-intel's emulate_invalid_guest_state to
rectify the description of KVM's default behavior, and to document that
the behavior and thus parameter only applies to L1.

Fixes: a27685c33a ("KVM: VMX: Emulate invalid guest state by default")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211207193006.120997-4-seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-29 12:26:05 +01:00
Juergen Gross
af7d25d785 xen/balloon: add late_initcall_sync() for initial ballooning done
commit 40fdea0284bb20814399da0484a658a96c735d90 upstream.

When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.

In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.

In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.

Warn on console if initial ballooning fails, panic() after stalling
for more than 3 minutes per default. Add a module parameter for
changing this timeout.

[boris: replaced pr_info() with pr_notice()]

Cc: <stable@vger.kernel.org>
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211102091944.17487-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18 14:03:49 +01:00
Leon Romanovsky
8ea3e622af docs: Fix infiniband uverbs minor number
[ Upstream commit 8d7e415d55610d503fdb8815344846b72d194a40 ]

Starting from the beginning of infiniband subsystem, the uverbs char
devices start from 192 as a minor number, see
commit bc38a6abdd ("[PATCH] IB uverbs: core implementation").

This patch updates the admin guide documentation to reflect it.

Fixes: 9d85025b04 ("docs-rst: create an user's manual book")
Link: https://lore.kernel.org/r/bad03e6bcde45550c01e12908a6fe7dfa4770703.1627477347.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18 13:40:11 +02:00
Paul E. McKenney
03a65c14ab clocksource: Retry clock read if long delays detected
[ Upstream commit db3a34e17433de2390eb80d436970edcebd0ca3e ]

When the clocksource watchdog marks a clock as unstable, this might be due
to that clock being unstable or it might be due to delays that happen to
occur between the reads of the two clocks.  Yes, interrupts are disabled
across those two reads, but there are no shortage of things that can delay
interrupts-disabled regions of code ranging from SMI handlers to vCPU
preemption.  It would be good to have some indication as to why the clock
was marked unstable.

Therefore, re-read the watchdog clock on either side of the read from the
clock under test.  If the watchdog clock shows an excessive time delta
between its pair of reads, the reads are retried.

The maximum number of retries is specified by a new kernel boot parameter
clocksource.max_cswd_read_retries, which defaults to three, that is, up to
four reads, one initial and up to three retries.  If more than one retry
was required, a message is printed on the console (the occasional single
retry is expected behavior, especially in guest OSes).  If the maximum
number of retries is exceeded, the clock under test will be marked
unstable.  However, the probability of this happening due to various sorts
of delays is quite small.  In addition, the reason (clock-read delays) for
the unstable marking will be apparent.

Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-1-paulmck@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14 16:56:01 +02:00
David Hildenbrand
8876cc237e drivers/base/memory: don't store phys_device in memory blocks
[ Upstream commit e9a2e48e8704c9d20a625c6f2357147d03ea7b97 ]

No need to store the value for each and every memory block, as we can
easily query the value at runtime.  Reshuffle the members to optimize the
memory layout.  Also, let's clarify what the interface once was used for
and why it's legacy nowadays.

"phys_device" was used on s390x in older versions of lsmem[2]/chmem[3],
back when they were still part of s390x-tools.  They were later replaced
by the variants in linux-utils.  For example, RHEL6 and RHEL7 contain
lsmem/chmem from s390-utils.  RHEL8 switched to versions from util-linux
on s390x [4].

"phys_device" was added with sysfs support for memory hotplug in commit
3947be1969 ("[PATCH] memory hotplug: sysfs and add/remove functions") in
2005.  It always returned 0.

s390x started returning something != 0 on some setups (if sclp.rzm is set
by HW) in 2010 via commit 57b552ba0b ("memory hotplug/s390: set
phys_device").

For s390x, it allowed for identifying which memory block devices belong to
the same storage increment (RZM).  Only if all memory block devices
comprising a single storage increment were offline, the memory could
actually be removed in the hypervisor.

Since commit e5d709bb5f ("s390/memory hotplug: provide
memory_block_size_bytes() function") in 2013 a memory block device spans
at least one storage increment - which is why the interface isn't really
helpful/used anymore (except by old lsmem/chmem tools).

There were once RFC patches to make use of "phys_device" in ACPI context;
however, the underlying problem could be solved using different interfaces
[1].

[1] https://patchwork.kernel.org/patch/2163871/
[2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem
[3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134

Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-17 17:06:25 +01:00
Dave Hansen
54683f81c8 mm/vmscan: restore zone_reclaim_mode ABI
commit 519983645a9f2ec339cabfa0c6ef7b09be985dd0 upstream.

I went to go add a new RECLAIM_* mode for the zone_reclaim_mode sysctl.
Like a good kernel developer, I also went to go update the
documentation.  I noticed that the bits in the documentation didn't
match the bits in the #defines.

The VM never explicitly checks the RECLAIM_ZONE bit.  The bit is,
however implicitly checked when checking 'node_reclaim_mode==0'.  The
RECLAIM_ZONE #define was removed in a cleanup.  That, by itself is fine.

But, when the bit was removed (bit 0) the _other_ bit locations also got
changed.  That's not OK because the bit values are documented to mean
one specific thing.  Users surely do not expect the meaning to change
from kernel to kernel.

The end result is that if someone had a script that did:

	sysctl vm.zone_reclaim_mode=1

it would have gone from enabling node reclaim for clean unmapped pages
to writing out pages during node reclaim after the commit in question.
That's not great.

Put the bits back the way they were and add a comment so something like
this is a bit harder to do again.  Update the documentation to make it
clear that the first bit is ignored.

Link: https://lkml.kernel.org/r/20210219172555.FF0CDF23@viggo.jf.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: 648b5cf368 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE")
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: "Tobin C. Harding" <tobin@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qian Cai <cai@lca.pw>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04 11:38:38 +01:00
Robin Murphy
1ea83d4891 perf/arm-cmn: Fix PMU instance naming
[ Upstream commit 79d7c3dca99fa96033695ddf5d495b775a3a137b ]

Although it's neat to avoid the suffix for the typical case of a
single PMU, it means systems with multiple CMN instances end up with
inconsistent naming. I think it also breaks perf tool's "uncore alias"
logic if the common instance prefix is also the full name of one.

Avoid any surprises by not trying to be clever and simply numbering
every instance, even when it might technically prove redundant.

Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/649a2281233f193d59240b13ed91b57337c77b32.1611839564.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04 11:37:44 +01:00
David Woodhouse
a8fddd4192 x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
[ Upstream commit b36b0fe96af13460278bf9b173beced1bd15f85d ]

It's useful to be able to test non-vector event channel delivery, to make
sure Linux will work properly on older Xen which doesn't have it.

It's also useful for those working on Xen and Xen-compatible hypervisors,
because there are guest kernels still in active use which use PCI INTX
even when vector delivery is available.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-27 11:55:01 +01:00
Mikulas Patocka
9cb683c3c4 dm integrity: conditionally disable "recalculate" feature
commit 5c02406428d5219c367c5f53457698c58bc5f917 upstream.

Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.

Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.

This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Daniel Glockner <dg@emlix.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:54:55 +01:00
Oliver Neukum
2440c1cb25 USB: UAS: introduce a quirk to set no_write_same
commit 8010622c86ca5bb44bc98492f5968726fc7c7a21 upstream.

UAS does not share the pessimistic assumption storage is making that
devices cannot deal with WRITE_SAME.  A few devices supported by UAS,
are reported to not deal well with WRITE_SAME. Those need a quirk.

Add it to the device that needs it.

Reported-by: David C. Partridge <david.partridge@perdrix.co.uk>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201209152639.9195-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-21 13:30:08 +01:00
Linus Torvalds
8a02ec8f35 Merge tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull bootconfig fixes from Steven Rostedt:
 "Have bootconfig size and checksum be little endian

  In case the bootconfig is created on one kind of endian machine, and
  then read on the other kind of endian kernel, the size and checksum
  will be incorrect. Instead, have both the size and checksum always be
  little endian and have the tool and the kernel convert it from little
  endian to or from the host endian"

* tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  docs: bootconfig: Add the endianness of fields
  tools/bootconfig: Store size and checksum in footer as le32
  bootconfig: Load size and checksum in the footer as le32
2020-12-02 12:09:36 -08:00
Linus Torvalds
ef6900acc8 Merge tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Use correct timestamp variable for ring buffer write stamp update

 - Fix up before stamp and write stamp when crossing ring buffer sub
   buffers

 - Keep a zero delta in ring buffer in slow path if cmpxchg fails

 - Fix trace_printk static buffer for archs that care

 - Fix ftrace record accounting for ftrace ops with trampolines

 - Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency

 - Remove WARN_ON in hwlat tracer that triggers on something that is OK

 - Make "my_tramp" trampoline in ftrace direct sample code global

 - Fixes in the bootconfig tool for better alignment management

* tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Always check to put back before stamp when crossing pages
  ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
  ftrace: Fix updating FTRACE_FL_TRAMP
  tracing: Fix alignment of static buffer
  tracing: Remove WARN_ON in start_thread()
  samples/ftrace: Mark my_tramp[12]? global
  ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next()
  ring-buffer: Update write stamp with the correct ts
  docs: bootconfig: Update file format on initrd image
  tools/bootconfig: Align the bootconfig applied initrd image size to 4
  tools/bootconfig: Fix to check the write failure correctly
  tools/bootconfig: Fix errno reference after printf()
2020-12-01 15:30:18 -08:00
Masami Hiramatsu
05227490c5 docs: bootconfig: Add the endianness of fields
Add a description about the endianness of the size and the checksum
fields. Those must be stored as le32 instead of u32. This will allow
us to apply bootconfig to the cross build initrd without caring
the endianness.

Link: https://lkml.kernel.org/r/160583936246.547349.10964204130590955409.stgit@devnote2

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30 23:22:11 -05:00
Linus Torvalds
dda3f4252e Merge tag 'powerpc-cve-2020-4788' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Fixes for CVE-2020-4788.

  From Daniel's cover letter:

  IBM Power9 processors can speculatively operate on data in the L1
  cache before it has been completely validated, via a way-prediction
  mechanism. It is not possible for an attacker to determine the
  contents of impermissible memory using this method, since these
  systems implement a combination of hardware and software security
  measures to prevent scenarios where protected data could be leaked.

  However these measures don't address the scenario where an attacker
  induces the operating system to speculatively execute instructions
  using data that the attacker controls. This can be used for example to
  speculatively bypass "kernel user access prevention" techniques, as
  discovered by Anthony Steinhauser of Google's Safeside Project. This
  is not an attack by itself, but there is a possibility it could be
  used in conjunction with side-channels or other weaknesses in the
  privileged code to construct an attack.

  This issue can be mitigated by flushing the L1 cache between privilege
  boundaries of concern.

  This patch series flushes the L1 cache on kernel entry (patch 2) and
  after the kernel performs any user accesses (patch 3). It also adds a
  self-test and performs some related cleanups"

* tag 'powerpc-cve-2020-4788' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: rename pnv|pseries_setup_rfi_flush to _setup_security_mitigations
  selftests/powerpc: refactor entry and rfi_flush tests
  selftests/powerpc: entry flush test
  powerpc: Only include kup-radix.h for 64-bit Book3S
  powerpc/64s: flush L1D after user accesses
  powerpc/64s: flush L1D on kernel entry
  selftests/powerpc: rfi_flush: disable entry flush if present
2020-11-19 11:32:31 -08:00
Masami Hiramatsu
fbc6e1c6e0 docs: bootconfig: Update file format on initrd image
To align the total file size, add padding null character when appending
the bootconfig to initrd image.

Link: https://lkml.kernel.org/r/160576522916.320071.4145530996151028855.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-19 08:55:44 -05:00
Nicholas Piggin
9a32a7e78b powerpc/64s: flush L1D after user accesses
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.

However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.

This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache after user accesses.

This is part of the fix for CVE-2020-4788.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19 23:47:18 +11:00
Nicholas Piggin
f79643787e powerpc/64s: flush L1D on kernel entry
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.

However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.

This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache on kernel entry.

This is part of the fix for CVE-2020-4788.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19 23:47:15 +11:00
Linus Torvalds
f786dfa374 Merge tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix the device links support in runtime PM, correct mistakes in
  the cpuidle documentation, fix the handling of policy limits changes
  in the schedutil cpufreq governor, fix assorted issues in the OPP
  (operating performance points) framework and make one janitorial
  change.

  Specifics:

   - Unify the handling of managed and stateless device links in the
     runtime PM framework and prevent runtime PM references to devices
     from being leaked after device link removal (Rafael Wysocki).

   - Fix two mistakes in the cpuidle documentation (Julia Lawall).

   - Prevent the schedutil cpufreq governor from missing policy limits
     updates in some cases (Viresh Kumar).

   - Prevent static OPPs from being dropped by mistake (Viresh Kumar).

   - Prevent helper function in the OPP framework from returning
     prematurely (Viresh Kumar).

   - Prevent opp_table_lock from being held too long during removal of
     OPP tables with no more active references (Viresh Kumar).

   - Drop redundant semicolon from the Intel RAPL power capping driver
     (Tom Rix)"

* tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: runtime: Resume the device earlier in __device_release_driver()
  PM: runtime: Drop pm_runtime_clean_up_links()
  PM: runtime: Drop runtime PM references to supplier on link removal
  powercap/intel_rapl: remove unneeded semicolon
  Documentation: PM: cpuidle: correct path name
  Documentation: PM: cpuidle: correct typo
  cpufreq: schedutil: Don't skip freq update if need_freq_update is set
  opp: Reduce the size of critical section in _opp_table_kref_release()
  opp: Fix early exit from dev_pm_opp_register_set_opp_helper()
  opp: Don't always remove static OPPs in _of_add_opp_table_v1()
2020-11-05 11:04:29 -08:00
Linus Torvalds
e6b0bd61a7 Merge tag 'docs-5.10-warnings' of git://git.lwn.net/linux
Pull documentation build warning fixes from Jonathan Corbet:
 "This contains a series of warning fixes from Mauro; once applied, the
  number of warnings from the once-noisy docs build process is nearly
  zero.

  Getting to this point has required a lot of work; once there,
  hopefully we can keep things that way.

  I have packaged this as a separate pull because it does a fair amount
  of reaching outside of Documentation/. The changes are all in comments
  and in code placement. It's all been in linux-next since last week"

* tag 'docs-5.10-warnings' of git://git.lwn.net/linux: (24 commits)
  docs: SafeSetID: fix a warning
  amdgpu: fix a few kernel-doc markup issues
  selftests: kselftest_harness.h: fix kernel-doc markups
  drm: amdgpu_dm: fix a typo
  gpu: docs: amdgpu.rst: get rid of wrong kernel-doc markups
  drm: amdgpu: kernel-doc: update some adev parameters
  docs: fs: api-summary.rst: get rid of kernel-doc include
  IB/srpt: docs: add a description for cq_size member
  locking/refcount: move kernel-doc markups to the proper place
  docs: lockdep-design: fix some warning issues
  MAINTAINERS: fix broken doc refs due to yaml conversion
  ice: docs fix a devlink info that broke a table
  crypto: sun8x-ce*: update entries to its documentation
  net: phy: remove kernel-doc duplication
  mm: pagemap.h: fix two kernel-doc markups
  blk-mq: docs: add kernel-doc description for a new struct member
  docs: userspace-api: add iommu.rst to the index file
  docs: hwmon: mp2975.rst: address some html build warnings
  docs: net: statistics.rst: remove a duplicated kernel-doc
  docs: kasan.rst: add two missing blank lines
  ...
2020-11-03 13:14:14 -08:00
Julia Lawall
23d18dcfc5 Documentation: PM: cpuidle: correct path name
cpu/ is needed before cpu<N>/

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-02 18:02:20 +01:00
Julia Lawall
6841ca15fe Documentation: PM: cpuidle: correct typo
cerainly -> certainly

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-02 18:00:08 +01:00
Mauro Carvalho Chehab
50865d041e docs: abi-testing.rst: enable --rst-sources when building docs
Now that ABI/testing documents were fixed, add --rst-sources to
the ABI/testing too.

Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/9242473fd3df785565bb6084b1b814cc15074fb2.1604042072.git.mchehab+huawei@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-30 13:07:02 +01:00
Mauro Carvalho Chehab
7832dea738 docs: ABI: don't escape ReST-incompatible chars from obsolete and removed
With just a single fix, the contents there can be parsed properly
without the need to escape any ReST incompatible stuff.

Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/472f4574b6aa2ff4de5a819db1a4a5c9a34f5168.1604042072.git.mchehab+huawei@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-30 13:07:02 +01:00
Mauro Carvalho Chehab
c12759ff2d docs: ABI: create a 2-depth index for ABI
That helps to identify what ABI files are adding titles.

Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/e267b36ae7f32bab2a86f1da6b40bb3e62c877d4.1604042072.git.mchehab+huawei@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-30 13:07:02 +01:00
Mauro Carvalho Chehab
642514df1b docs: ABI: make it parse ABI/stable as ReST-compatible files
Now that the stable ABI files are compatible with ReST,
parse them without converting complex descriptions as literal
blocks nor escaping special characters.

Please notice that escaping special characters will probably
be needed at descriptions, at least for the asterisk character.

Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/59ccbaa75ff05f23e701dd9a0bbe118e9343a553.1604042072.git.mchehab+huawei@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-30 13:07:02 +01:00
Mauro Carvalho Chehab
a7ee04b3ef docs: add ABI documentation to the admin-guide book
As we don't want a generic Sphinx extension to execute commands,
change the one proposed to Markus to call the abi_book.pl
script.

Use a script to parse the Documentation/ABI directory and output
it at the admin-guide.

Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/5651482b06500e69a1acdf92152f90a203e6521d.1604042072.git.mchehab+huawei@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-30 13:07:01 +01:00
Mauro Carvalho Chehab
afc74ce7b4 docs: SafeSetID: fix a warning
As reported by Sphinx 2.4.4:

	docs/Documentation/admin-guide/LSM/SafeSetID.rst:110: WARNING: Title underline too short.

	Note on GID policies and setgroups()
	==================

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/4afa281c170daabd1ce522653d5d5d5078ebd92c.1603791716.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28 11:42:02 -06:00
Mauro Carvalho Chehab
2644ccef6f docs: admin-guide: net.rst: add a missing blank line
There's a missing blank line after a literal block, which
causes this warning:

	Documentation/admin-guide/sysctl/net.rst:303: WARNING: Literal block ends without a blank line; unexpected unindent.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/b2545be4a4c71269d10278b5990c3e06c4b65f84.1603791716.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28 11:26:10 -06:00
Linus Torvalds
bd6aabc7ca Merge tag 'for-linus-5.10b-rc1c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:

 - a series for the Xen pv block drivers adding module parameters for
   better control of resource usge

 - a cleanup series for the Xen event driver

* tag 'for-linus-5.10b-rc1c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  Documentation: add xen.fifo_events kernel parameter description
  xen/events: unmask a fifo event channel only if it was masked
  xen/events: only register debug interrupt for 2-level events
  xen/events: make struct irq_info private to events_base.c
  xen: remove no longer used functions
  xen-blkfront: Apply changed parameter name to the document
  xen-blkfront: add a parameter for disabling of persistent grants
  xen-blkback: add a parameter for disabling of persistent grants
2020-10-25 10:55:35 -07:00
Linus Torvalds
81ecf91eab Merge tag 'safesetid-5.10' of git://github.com/micah-morton/linux
Pull SafeSetID updates from Micah Morton:
 "The changes are mostly contained to within the SafeSetID LSM, with the
  exception of a few 1-line changes to change some ns_capable() calls to
  ns_capable_setid() -- causing a flag (CAP_OPT_INSETID) to be set that
  is examined by SafeSetID code and nothing else in the kernel.

  The changes to SafeSetID internally allow for setting up GID
  transition security policies, as already existed for UIDs"

* tag 'safesetid-5.10' of git://github.com/micah-morton/linux:
  LSM: SafeSetID: Fix warnings reported by test bot
  LSM: SafeSetID: Add GID security policy handling
  LSM: Signal to SafeSetID when setting group IDs
2020-10-25 10:45:26 -07:00
Linus Torvalds
c80e42a496 Merge tag 'docs-5.10-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
 "A handful of late-arriving documentation fixes"

* tag 'docs-5.10-2' of git://git.lwn.net/linux:
  docs: Add two missing entries in vm sysctl index
  docs/vm: trivial fixes to several spelling mistakes
  docs: submitting-patches: describe preserving review/test tags
  Documentation: Chinese translation of Documentation/arm64/hugetlbpage.rst
  Documentation: x86: fix a missing word in x86_64/mm.rst.
  docs: driver-api: remove a duplicated index entry
  docs: lkdtm: Modernize and improve details
  docs: deprecated.rst: Expand str*cpy() replacement notes
  docs/cpu-load: format the example code.
2020-10-23 17:13:53 -07:00
Juergen Gross
1a89c1dc95 Documentation: add xen.fifo_events kernel parameter description
The kernel boot parameter xen.fifo_events isn't listed in
Documentation/admin-guide/kernel-parameters.txt. Add it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20201022094907.28560-6-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2020-10-23 05:41:25 -05:00
Fam Zheng
62af696471 docs: Add two missing entries in vm sysctl index
Both seem overlooked while adding the section in the main content.

Signed-off-by: Fam Zheng <famzheng@amazon.com>
Link: https://lore.kernel.org/r/20201022065403.3936070-1-fam@euphon.net
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-22 16:11:46 -06:00
Linus Torvalds
24717cfbbb Merge tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 "The one new feature this time, from Anna Schumaker, is READ_PLUS,
  which has the same arguments as READ but allows the server to return
  an array of data and hole extents.

  Otherwise it's a lot of cleanup and bugfixes"

* tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits)
  NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
  SUNRPC: fix copying of multiple pages in gss_read_proxy_verf()
  sunrpc: raise kernel RPC channel buffer size
  svcrdma: fix bounce buffers for unaligned offsets and multiple pages
  nfsd: remove unneeded break
  net/sunrpc: Fix return value for sysctl sunrpc.transports
  NFSD: Encode a full READ_PLUS reply
  NFSD: Return both a hole and a data segment
  NFSD: Add READ_PLUS hole segment encoding
  NFSD: Add READ_PLUS data support
  NFSD: Hoist status code encoding into XDR encoder functions
  NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
  NFSD: Remove the RETURN_STATUS() macro
  NFSD: Call NFSv2 encoders on error returns
  NFSD: Fix .pc_release method for NFSv2
  NFSD: Remove vestigial typedefs
  NFSD: Refactor nfsd_dispatch() error paths
  NFSD: Clean up nfsd_dispatch() variables
  NFSD: Clean up stale comments in nfsd_dispatch()
  NFSD: Clean up switch statement in nfsd_dispatch()
  ...
2020-10-22 09:44:27 -07:00
Hui Su
bb7a2c6362 docs/cpu-load: format the example code.
format the example code.

Signed-off-by: Hui Su <sh_def@163.com>
Link: https://lore.kernel.org/r/20201018170557.GA7670@rlk
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-21 15:07:30 -06:00
Linus Torvalds
4a5bb973fa Merge tag 'for-linus-5.10b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:

 - A single patch to fix the Xen security issue XSA-331 (malicious
   guests can DoS dom0 by triggering NULL-pointer dereferences or access
   to stale data).

 - A larger series to fix the Xen security issue XSA-332 (malicious
   guests can DoS dom0 by sending events at high frequency leading to
   dom0's vcpus being busy in IRQ handling for elongated times).

* tag 'for-linus-5.10b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/events: block rogue events for some time
  xen/events: defer eoi in case of excessive number of events
  xen/events: use a common cpu hotplug hook for event channels
  xen/events: switch user event channels to lateeoi model
  xen/pciback: use lateeoi irq binding
  xen/pvcallsback: use lateeoi irq binding
  xen/scsiback: use lateeoi irq binding
  xen/netback: use lateeoi irq binding
  xen/blkback: use lateeoi irq binding
  xen/events: add a new "late EOI" evtchn framework
  xen/events: fix race in evtchn_fifo_unmask()
  xen/events: add a proper barrier to 2-level uevent unmasking
  xen/events: avoid removing an event channel while handling it
2020-10-20 09:24:01 -07:00
Juergen Gross
e99502f762 xen/events: defer eoi in case of excessive number of events
In case rogue guests are sending events at high frequency it might
happen that xen_evtchn_do_upcall() won't stop processing events in
dom0. As this is done in irq handling a crash might be the result.

In order to avoid that, delay further inter-domain events after some
time in xen_evtchn_do_upcall() by forcing eoi processing into a
worker on the same cpu, thus inhibiting new events coming in.

The time after which eoi processing is to be delayed is configurable
via a new module parameter "event_loop_timeout" which specifies the
maximum event loop time in jiffies (default: 2, the value was chosen
after some tests showing that a value of 2 was the lowest with an
only slight drop of dom0 network throughput while multiple guests
performed an event storm).

How long eoi processing will be delayed can be specified via another
parameter "event_eoi_delay" (again in jiffies, default 10, again the
value was chosen after testing with different delay values).

This is part of XSA-332.

Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Wei Liu <wl@xen.org>
2020-10-20 10:22:16 +02:00
Linus Torvalds
bbe85027ce Merge tag 'xfs-5.10-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull more xfs updates from Darrick Wong:
 "The second large pile of new stuff for 5.10, with changes even more
  monumental than last week!

  We are formally announcing the deprecation of the V4 filesystem format
  in 2030. All users must upgrade to the V5 format, which contains
  design improvements that greatly strengthen metadata validation,
  supports reflink and online fsck, and is the intended vehicle for
  handling timestamps past 2038. We're also deprecating the old Irix
  behavioral tweaks in September 2025.

  Coming along for the ride are two design changes to the deferred
  metadata ops subsystem. One of the improvements is to retain correct
  logical ordering of tasks and subtasks, which is a more logical design
  for upper layers of XFS and will become necessary when we add atomic
  file range swaps and commits. The second improvement to deferred ops
  improves the scalability of the log by helping the log tail to move
  forward during long-running operations. This reduces log contention
  when there are a large number of threads trying to run transactions.

  In addition to that, this fixes numerous small bugs in log recovery;
  refactors logical intent log item recovery to remove the last
  remaining place in XFS where we could have nested transactions; fixes
  a couple of ways that intent log item recovery could fail in ways that
  wouldn't have happened in the regular commit paths; fixes a deadlock
  vector in the GETFSMAP implementation (which improves its performance
  by 20%); and fixes serious bugs in the realtime growfs, fallocate, and
  bitmap handling code.

  Summary:

   - Deprecate the V4 filesystem format, some disused mount options, and
     some legacy sysctl knobs now that we can support dates into the
     25th century. Note that removal of V4 support will not happen until
     the early 2030s.

   - Fix some probles with inode realtime flag propagation.

   - Fix some buffer handling issues when growing a rt filesystem.

   - Fix a problem where a BMAP_REMAP unmap call would free rt extents
     even though the purpose of BMAP_REMAP is to avoid freeing the
     blocks.

   - Strengthen the dabtree online scrubber to check hash values on
     child dabtree blocks.

   - Actually log new intent items created as part of recovering log
     intent items.

   - Fix a bug where quotas weren't attached to an inode undergoing bmap
     intent item recovery.

   - Fix a buffer overrun problem with specially crafted log buffer
     headers.

   - Various cleanups to type usage and slightly inaccurate comments.

   - More cleanups to the xattr, log, and quota code.

   - Don't run the (slower) shared-rmap operations on attr fork
     mappings.

   - Fix a bug where we failed to check the LSN of finobt blocks during
     replay and could therefore overwrite newer data with older data.

   - Clean up the ugly nested transaction mess that log recovery uses to
     stage intent item recovery in the correct order by creating a
     proper data structure to capture recovered chains.

   - Use the capture structure to resume intent item chains with the
     same log space and block reservations as when they were captured.

   - Fix a UAF bug in bmap intent item recovery where we failed to
     maintain our reference to the incore inode if the bmap operation
     needed to relog itself to continue.

   - Rearrange the defer ops mechanism to finish newly created subtasks
     of a parent task before moving on to the next parent task.

   - Automatically relog intent items in deferred ops chains if doing so
     would help us avoid pinning the log tail. This will help fix some
     log scaling problems now and will facilitate atomic file updates
     later.

   - Fix a deadlock in the GETFSMAP implementation by using an internal
     memory buffer to reduce indirect calls and copies to userspace,
     thereby improving its performance by ~20%.

   - Fix various problems when calling growfs on a realtime volume would
     not fully update the filesystem metadata.

   - Fix broken Kconfig asking about deprecated XFS when XFS is
     disabled"

* tag 'xfs-5.10-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (48 commits)
  xfs: fix Kconfig asking about XFS_SUPPORT_V4 when XFS_FS=n
  xfs: fix high key handling in the rt allocator's query_range function
  xfs: annotate grabbing the realtime bitmap/summary locks in growfs
  xfs: make xfs_growfs_rt update secondary superblocks
  xfs: fix realtime bitmap/summary file truncation when growing rt volume
  xfs: fix the indent in xfs_trans_mod_dquot
  xfs: do the ASSERT for the arguments O_{u,g,p}dqpp
  xfs: fix deadlock and streamline xfs_getfsmap performance
  xfs: limit entries returned when counting fsmap records
  xfs: only relog deferred intent items if free space in the log gets low
  xfs: expose the log push threshold
  xfs: periodically relog deferred intent items
  xfs: change the order in which child and parent defer ops are finished
  xfs: fix an incore inode UAF in xfs_bui_recover
  xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering
  xfs: clean up bmap intent item recovery checking
  xfs: xfs_defer_capture should absorb remaining transaction reservation
  xfs: xfs_defer_capture should absorb remaining block reservations
  xfs: proper replay of deferred ops queued during log recovery
  xfs: remove XFS_LI_RECOVERED
  ...
2020-10-19 14:38:46 -07:00
Linus Torvalds
41eea65e2a Merge tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:

 - Debugging for smp_call_function()

 - RT raw/non-raw lock ordering fixes

 - Strict grace periods for KASAN

 - New smp_call_function() torture test

 - Torture-test updates

 - Documentation updates

 - Miscellaneous fixes

[ This doesn't actually pull the tag - I've dropped the last merge from
  the RCU branch due to questions about the series.   - Linus ]

* tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  smp: Make symbol 'csd_bug_count' static
  kernel/smp: Provide CSD lock timeout diagnostics
  smp: Add source and destination CPUs to __call_single_data
  rcu: Shrink each possible cpu krcp
  rcu/segcblist: Prevent useless GP start if no CBs to accelerate
  torture: Add gdb support
  rcutorture: Allow pointer leaks to test diagnostic code
  rcutorture: Hoist OOM registry up one level
  refperf: Avoid null pointer dereference when buf fails to allocate
  rcutorture: Properly synchronize with OOM notifier
  rcutorture: Properly set rcu_fwds for OOM handling
  torture: Add kvm.sh --help and update help message
  rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
  torture: Update initrd documentation
  rcutorture: Replace HTTP links with HTTPS ones
  locktorture: Make function torture_percpu_rwsem_init() static
  torture: document --allcpus argument added to the kvm.sh script
  rcutorture: Output number of elapsed grace periods
  rcutorture: Remove KCSAN stubs
  rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
  ...
2020-10-18 14:34:50 -07:00
Linus Torvalds
54a4c789ca Merge tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull documentation updates from Mauro Carvalho Chehab:
 "A series of patches addressing warnings produced by make htmldocs.
  This includes:

   - kernel-doc markup fixes

   - ReST fixes

   - Updates at the build system in order to support newer versions of
     the docs build toolchain (Sphinx)

  After this series, the number of html build warnings should reduce
  significantly, and building with Sphinx 3.1 or later should now be
  supported (although it is still recommended to use Sphinx 2.4.4).

  As agreed with Jon, I should be sending you a late pull request by the
  end of the merge window addressing remaining issues with docs build,
  as there are a number of warning fixes that depends on pull requests
  that should be happening along the merge window.

  The end goal is to have a clean htmldocs build on Kernel 5.10.

  PS. It should be noticed that Sphinx 3.0 is not currently supported,
  as it lacks support for C domain namespaces. Such feature, needed in
  order to document uAPI system calls with Sphinx 3.x, was added only on
  Sphinx 3.1"

* tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (75 commits)
  PM / devfreq: remove a duplicated kernel-doc markup
  mm/doc: fix a literal block markup
  workqueue: fix a kernel-doc warning
  docs: virt: user_mode_linux_howto_v2.rst: fix a literal block markup
  Input: sparse-keymap: add a description for @sw
  rcu/tree: docs: document bkvcache new members at struct kfree_rcu_cpu
  nl80211: docs: add a description for s1g_cap parameter
  usb: docs: document altmode register/unregister functions
  kunit: test.h: fix a bad kernel-doc markup
  drivers: core: fix kernel-doc markup for dev_err_probe()
  docs: bio: fix a kerneldoc markup
  kunit: test.h: solve kernel-doc warnings
  block: bio: fix a warning at the kernel-doc markups
  docs: powerpc: syscall64-abi.rst: fix a malformed table
  drivers: net: hamradio: fix document location
  net: appletalk: Kconfig: Fix docs location
  dt-bindings: fix references to files converted to yaml
  memblock: get rid of a :c:type leftover
  math64.h: kernel-docs: Convert some markups into normal comments
  media: uAPI: buffer.rst: remove a left-over documentation
  ...
2020-10-16 15:02:21 -07:00