PSI accounts stalls for each cgroup separately and aggregates it at each
level of the hierarchy. This causes additional overhead with psi_avgs_work
being called for each cgroup in the hierarchy. psi_avgs_work has been
highly optimized, however on systems with large number of cgroups the
overhead becomes noticeable.
Systems which use PSI only at the system level could avoid this overhead
if PSI can be configured to skip per-cgroup stall accounting.
Add "cgroup_disable=pressure" kernel command-line option to allow
requesting system-wide only pressure stall accounting. When set, it
keeps system-wide accounting under /proc/pressure/ but skips accounting
for individual cgroups and does not expose PSI nodes in cgroup hierarchy.
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/patchwork/patch/1435705
(cherry picked from commit 3958e2d0c34e18c41b60dc01832bd670a59ef70f
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git tj)
Bug: 178872719
Bug: 191734423
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ifc8fbc52f9a1131d7c2668edbb44c525c76c3360
This reverts commit a7af91adc7 ("ANDROID: mm: oom_kill: reap memory of
a task that receives SIGKILL") as this functionality is moved to vendor
hook based approach.
Bug: 189803002
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Change-Id: Ica0d6df22fb81bf430e9b4c7d12b36ab7d44dab8
Right now only anonymous page faults are speculatively handled,
thus leaving out a large percentage of faults still requiring to
take mmap_sem. These were left out since there can be fault
handlers mainly in the fs layer which may use vma in unknown ways.
This patch enables speculative fault for ext4, f2fs and shmem. The
feature is disabled by default and enabled via allow_file_spec_access
kernel param.
Bug: 171954515
Change-Id: I0d23ebf299000e4ac5e2c71bc0b7fc9006e98da9
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This param allows forcing all dependencies to be treated as mandatory.
This will be useful for boards in which all optional dependencies like
IOMMUs and DMAs need to be treated as mandatory dependencies.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210205222644.2357303-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 19d0f5f6bff878277783fd98fef4ae2441d6a1d8)
Bug: 181264536
Change-Id: If9cc5918624082b0e5426c4c9dd7f6eb6bba8232
If the no_hash_pointers command line parameter is set, then
printk("%p") will print pointers as unhashed, which is useful for
debugging purposes. This change applies to any function that uses
vsprintf, such as print_hex_dump() and seq_buf_printf().
A large warning message is displayed if this option is enabled.
Unhashed pointers expose kernel addresses, which can be a security
risk.
Also update test_printf to skip the hashed pointer tests if the
command-line option is set.
Signed-off-by: Timur Tabi <timur@kernel.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210214161348.369023-4-timur@kernel.org
(cherry picked from commit 5ead723a20e0447bc7db33dc3070b420e5f80aa6)
Bug: 181049978
Change-Id: I06c5cfdc0b4c12f38c9109179f27bf2b54ac57e8
Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org>
Page replacement is handled in the Linux Kernel in one of two ways:
1) Asynchronously via kswapd
2) Synchronously, via direct reclaim
At page allocation time the allocating task is immediately given a page
from the zone free list allowing it to go right back to work doing
whatever it was doing; Probably directly or indirectly executing business
logic.
Just prior to satisfying the allocation, free pages is checked to see if
it has reached the zone low watermark and if so, kswapd is awakened.
Kswapd will start scanning pages looking for inactive pages to evict to
make room for new page allocations. The work of kswapd allows tasks to
continue allocating memory from their respective zone free list without
incurring any delay.
When the demand for free pages exceeds the rate that kswapd tasks can
supply them, page allocation works differently. Once the allocating task
finds that the number of free pages is at or below the zone min watermark,
the task will no longer pull pages from the free list. Instead, the task
will run the same CPU-bound routines as kswapd to satisfy its own
allocation by scanning and evicting pages. This is called a direct reclaim.
The time spent performing a direct reclaim can be substantial, often
taking tens to hundreds of milliseconds for small order0 allocations to
half a second or more for order9 huge-page allocations. In fact, kswapd is
not actually required on a linux system. It exists for the sole purpose of
optimizing performance by preventing direct reclaims.
When memory shortfall is sufficient to trigger direct reclaims, they can
occur in any task that is running on the system. A single aggressive
memory allocating task can set the stage for collateral damage to occur in
small tasks that rarely allocate additional memory. Consider the impact of
injecting an additional 100ms of latency when nscd allocates memory to
facilitate caching of a DNS query.
The presence of direct reclaims 10 years ago was a fairly reliable
indicator that too much was being asked of a Linux system. Kswapd was
likely wasting time scanning pages that were ineligible for eviction.
Adding RAM or reducing the working set size would usually make the problem
go away. Since then hardware has evolved to bring a new struggle for
kswapd. Storage speeds have increased by orders of magnitude while CPU
clock speeds stayed the same or even slowed down in exchange for more
cores per package. This presents a throughput problem for a single
threaded kswapd that will get worse with each generation of new hardware.
Test Details
NOTE: The tests below were run with shadow entries disabled. See the
associated patch and cover letter for details
The tests below were designed with the assumption that a kswapd bottleneck
is best demonstrated using filesystem reads. This way, the inactive list
will be full of clean pages, simplifying the analysis and allowing kswapd
to achieve the highest possible steal rate. Maximum steal rates for kswapd
are likely to be the same or lower for any other mix of page types on the
system.
Tests were run on a 2U Oracle X7-2L with 52 Intel Xeon Skylake 2GHz cores,
756GB of RAM and 8 x 3.6 TB NVMe Solid State Disk drives. Each drive has
an XFS file system mounted separately as /d0 through /d7. SSD drives
require multiple concurrent streams to show their potential, so I created
eleven 250GB zero-filled files on each drive so that I could test with
parallel reads.
The test script runs in multiple stages. At each stage, the number of dd
tasks run concurrently is increased by 2. I did not include all of the
test output for brevity.
During each stage dd tasks are launched to read from each drive in a round
robin fashion until the specified number of tasks for the stage has been
reached. Then iostat, vmstat and top are started in the background with 10
second intervals. After five minutes, all of the dd tasks are killed and
the iostat, vmstat and top output is parsed in order to report the
following:
CPU consumption
- sy - aggregate kernel mode CPU consumption from vmstat output. The value
doesn't tend to fluctuate much so I just grab the highest value.
Each sample is averaged over 10 seconds
- dd_cpu - for all of the dd tasks averaged across the top samples since
there is a lot of variation.
Throughput
- in Kbytes
- Command is iostat -x -d 10 -g total
This first test performs reads using O_DIRECT in order to show the maximum
throughput that can be obtained using these drives. It also demonstrates
how rapidly throughput scales as the number of dd tasks are increased.
The dd command for this test looks like this:
Command Used: dd iflag=direct if=/d${i}/$n of=/dev/null bs=4M
Test #1: Direct IO
dd sy dd_cpu throughput
6 0 2.33 14726026.40
10 1 2.95 19954974.80
16 1 2.63 24419689.30
22 1 2.63 25430303.20
28 1 2.91 26026513.20
34 1 2.53 26178618.00
40 1 2.18 26239229.20
46 1 1.91 26250550.40
52 1 1.69 26251845.60
58 1 1.54 26253205.60
64 1 1.43 26253780.80
70 1 1.31 26254154.80
76 1 1.21 26253660.80
82 1 1.12 26254214.80
88 1 1.07 26253770.00
90 1 1.04 26252406.40
Throughput was close to peak with only 22 dd tasks. Very little system CPU
was consumed as expected as the drives DMA directly into the user address
space when using direct IO.
In this next test, the iflag=direct option is removed and we only run the
test until the pgscan_kswapd from /proc/vmstat starts to increment. At
that point metrics are parsed and reported and the pagecache contents are
dropped prior to the next test. Lather, rinse, repeat.
Test #2: standard file system IO, no page replacement
dd sy dd_cpu throughput
6 2 28.78 5134316.40
10 3 31.40 8051218.40
16 5 34.73 11438106.80
22 7 33.65 14140596.40
28 8 31.24 16393455.20
34 10 29.88 18219463.60
40 11 28.33 19644159.60
46 11 25.05 20802497.60
52 13 26.92 22092370.00
58 13 23.29 22884881.20
64 14 23.12 23452248.80
70 15 22.40 23916468.00
76 16 22.06 24328737.20
82 17 20.97 24718693.20
88 16 18.57 25149404.40
90 16 18.31 25245565.60
Each read has to pause after the buffer in kernel space is populated while
those pages are added to the pagecache and copied into the user address
space. For this reason, more parallel streams are required to achieve peak
throughput. The copy operation consumes substantially more CPU than direct
IO as expected.
The next test measures throughput after kswapd starts running. This is the
same test only we wait for kswapd to wake up before we start collecting
metrics. The script actually keeps track of a few things that were not
mentioned earlier. It tracks direct reclaims and page scans by watching
the metrics in /proc/vmstat. CPU consumption for kswapd is tracked the
same way it is tracked for dd.
Since the test is 100% reads, you can assume that the page steal rate for
kswapd and direct reclaims is almost identical to the scan rate.
Test #3: 1 kswapd thread per node
dd sy dd_cpu kswapd0 kswapd1 throughput dr pgscan_kswapd pgscan_direct
10 4 26.07 28.56 27.03 7355924.40 0 459316976 0
16 7 34.94 69.33 69.66 10867895.20 0 872661643 0
22 10 36.03 93.99 99.33 13130613.60 489 1037654473 11268334
28 10 30.34 95.90 98.60 14601509.60 671 1182591373 15429142
34 14 34.77 97.50 99.23 16468012.00 10850 1069005644 249839515
40 17 36.32 91.49 97.11 17335987.60 18903 975417728 434467710
46 19 38.40 90.54 91.61 17705394.40 25369 855737040 582427973
52 22 40.88 83.97 83.70 17607680.40 31250 709532935 724282458
58 25 40.89 82.19 80.14 17976905.60 35060 657796473 804117540
64 28 41.77 73.49 75.20 18001910.00 39073 561813658 895289337
70 33 45.51 63.78 64.39 17061897.20 44523 379465571 1020726436
76 36 46.95 57.96 60.32 16964459.60 47717 291299464 1093172384
82 39 47.16 55.43 56.16 16949956.00 49479 247071062 1134163008
88 42 47.41 53.75 47.62 16930911.20 51521 195449924 1180442208
90 43 47.18 51.40 50.59 16864428.00 51618 190758156 1183203901
In the previous test where kswapd was not involved, the system-wide kernel
mode CPU consumption with 90 dd tasks was 16%. In this test CPU consumption
with 90 tasks is at 43%. With 52 cores, and two kswapd tasks (one per NUMA
node), kswapd can only be responsible for a little over 4% of the increase.
The rest is likely caused by 51,618 direct reclaims that scanned 1.2
billion pages over the five minute time period of the test.
Same test, more kswapd tasks:
Test #4: 4 kswapd threads per node
dd sy dd_cpu kswapd0 kswapd1 throughput dr pgscan_kswapd pgscan_direct
10 5 27.09 16.65 14.17 7842605.60 0 459105291 0
16 10 37.12 26.02 24.85 11352920.40 15 920527796 358515
22 11 36.94 37.13 35.82 13771869.60 0 1132169011 0
28 13 35.23 48.43 46.86 16089746.00 0 1312902070 0
34 15 33.37 53.02 55.69 18314856.40 0 1476169080 0
40 19 35.90 69.60 64.41 19836126.80 0 1629999149 0
46 22 36.82 88.55 57.20 20740216.40 0 1708478106 0
52 24 34.38 93.76 68.34 21758352.00 0 1794055559 0
58 24 30.51 79.20 82.33 22735594.00 0 1872794397 0
64 26 30.21 97.12 76.73 23302203.60 176 1916593721 4206821
70 33 32.92 92.91 92.87 23776588.00 3575 1817685086 85574159
76 37 31.62 91.20 89.83 24308196.80 4752 1812262569 113981763
82 29 25.53 93.23 92.33 24802791.20 306 2032093122 7350704
88 43 37.12 76.18 77.01 25145694.40 20310 1253204719 487048202
90 42 38.56 73.90 74.57 22516787.60 22774 1193637495 545463615
By increasing the number of kswapd threads, throughput increased by ~50%
while kernel mode CPU utilization decreased or stayed the same, likely due
to a decrease in the number of parallel tasks at any given time doing page
replacement.
Signed-off-by: Buddy Lumpkin <buddy.lumpkin@oracle.com>
Bug: 171351667
Link: https://lore.kernel.org/lkml/1522661062-39745-1-git-send-email-buddy.lumpkin@oracle.com
[charante@codeaurora.org]: Changes made to select number of kswapds through uapi
Change-Id: I8425cab7f40cbeaf65af0ea118c1a9ac7da0930e
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
In order to be able to disable Pointer Authentication at runtime,
whether it is for testing purposes, or to work around HW issues,
let's add support for overriding the ID_AA64ISAR1_EL1.{GPI,GPA,API,APA}
fields.
This is further mapped on the arm64.nopauth command-line alias.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Tested-by: Srinivas Ramana <sramana@codeaurora.org>
Link: https://lore.kernel.org/r/20210208095732.3267263-23-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit f8da5752fd1b25f1ecf78a79013e2dfd2b860589
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/cpufeature)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 175544340
Change-Id: I0913dff5ed1c5ee731ad6705e8b6cbde8b3e9a81
In order to be able to disable BTI at runtime, whether it is
for testing purposes, or to work around HW issues, let's add
support for overriding the ID_AA64PFR1_EL1.BTI field.
This is further mapped on the arm64.nobti command-line alias.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Tested-by: Srinivas Ramana <sramana@codeaurora.org>
Link: https://lore.kernel.org/r/20210208095732.3267263-21-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 93ad55b7852b324a3fd7d46910b88c81deb62357
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/cpufeature)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 175544340
Change-Id: Ib6a3e923a97120d6a438a7b46cb182a175d434bf
Admitedly, passing id_aa64mmfr1.vh=0 on the command-line isn't
that easy to understand, and it is likely that users would much
prefer write "kvm-arm.mode=nvhe", or "...=protected".
So here you go. This has the added advantage that we can now
always honor the "kvm-arm.mode=protected" option, even when
booting on a VHE system.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Link: https://lore.kernel.org/r/20210208095732.3267263-18-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 1945a067f351debcd2518d9f6039b1835de08dfd
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/cpufeature)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 175544340
Change-Id: I66d4a5bdb341db7ac6017bd6af97cdddb694d6c3
The PELT half-life is currently hard-coded to 32ms.
Create cmdline arg to enable switching to a PELT half-life of 8mS.
Bug: 177593580
Change-Id: I9f8cfc3d9554a500eec0f6a1b161f4155c296b4d
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
Add an early parameter that allows users to select the mode of operation
for KVM/arm64.
For now, the only supported value is "protected". By passing this flag
users opt into the hypervisor placing additional restrictions on the
host kernel. These allow the hypervisor to spawn guests whose state is
kept private from the host. Restrictions will include stage-2 address
translation to prevent host from accessing guest memory, filtering its
SMC calls, etc.
Without this parameter, the default behaviour remains selecting VHE/nVHE
based on hardware support and CONFIG_ARM64_VHE.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-2-dbrazdil@google.com
(cherry picked from commit d8b369c4e31430a4746571bcae45a98933827232)
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I1fe46bb18b40d0a1df41a600a07b848f82988ac6
Bug: 178098380
Test: atest VirtualizationHostTestCases on an EL2-enabled device
Changes in 5.10.11
scsi: target: tcmu: Fix use-after-free of se_cmd->priv
mtd: rawnand: gpmi: fix dst bit offset when extracting raw payload
mtd: rawnand: nandsim: Fix the logic when selecting Hamming soft ECC engine
i2c: tegra: Wait for config load atomically while in ISR
i2c: bpmp-tegra: Ignore unknown I2C_M flags
platform/x86: i2c-multi-instantiate: Don't create platform device for INT3515 ACPI nodes
platform/x86: ideapad-laptop: Disable touchpad_switch for ELAN0634
ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
ALSA: hda/realtek - Limit int mic boost on Acer Aspire E5-575T
ALSA: hda/via: Add minimum mute flag
crypto: xor - Fix divide error in do_xor_speed()
dm crypt: fix copy and paste bug in crypt_alloc_req_aead
ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
btrfs: don't get an EINTR during drop_snapshot for reloc
btrfs: do not double free backref nodes on error
btrfs: fix lockdep splat in btrfs_recover_relocation
btrfs: don't clear ret in btrfs_start_dirty_block_groups
btrfs: send: fix invalid clone operations when cloning from the same file and root
fs: fix lazytime expiration handling in __writeback_single_inode()
pinctrl: ingenic: Fix JZ4760 support
mmc: core: don't initialize block size from ext_csd if not present
mmc: sdhci-of-dwcmshc: fix rpmb access
mmc: sdhci-xenon: fix 1.8v regulator stabilization
mmc: sdhci-brcmstb: Fix mmc timeout errors on S5 suspend
dm: avoid filesystem lookup in dm_get_dev_t()
dm integrity: fix a crash if "recalculate" used without "internal_hash"
dm integrity: conditionally disable "recalculate" feature
drm/atomic: put state on error path
drm/syncobj: Fix use-after-free
drm/amdgpu: remove gpu info firmware of green sardine
drm/amd/display: DCN2X Find Secondary Pipe properly in MPO + ODM Case
drm/i915/gt: Prevent use of engine->wa_ctx after error
drm/i915: Check for rq->hwsp validity after acquiring RCU lock
ASoC: Intel: haswell: Add missing pm_ops
ASoC: rt711: mutex between calibration and power state changes
SUNRPC: Handle TCP socket sends with kernel_sendpage() again
HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad device
HID: sony: select CONFIG_CRC32
dm integrity: select CRYPTO_SKCIPHER
x86/hyperv: Fix kexec panic/hang issues
scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL
scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
scsi: qedi: Correct max length of CHAP secret
scsi: scsi_debug: Fix memleak in scsi_debug_init()
scsi: sd: Suppress spurious errors when WRITE SAME is being disabled
riscv: Fix kernel time_init()
riscv: Fix sifive serial driver
riscv: Enable interrupts during syscalls with M-Mode
HID: logitech-dj: add the G602 receiver
HID: Ignore battery for Elan touchscreen on ASUS UX550
clk: tegra30: Add hda clock default rates to clock driver
ALSA: hda/tegra: fix tegra-hda on tegra30 soc
riscv: cacheinfo: Fix using smp_processor_id() in preemptible
arm64: make atomic helpers __always_inline
xen: Fix event channel callback via INTX/GSI
x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
x86/xen: Fix xen_hvm_smp_init() when vector callback not available
dts: phy: fix missing mdio device and probe failure of vsc8541-01 device
dts: phy: add GPIO number and active state used for phy reset
riscv: defconfig: enable gpio support for HiFive Unleashed
drm/amdgpu/psp: fix psp gfx ctrl cmds
drm/amd/display: disable dcn10 pipe split by default
HID: logitech-hidpp: Add product ID for MX Ergo in Bluetooth mode
drm/amd/display: Fix to be able to stop crc calculation
drm/nouveau/bios: fix issue shadowing expansion ROMs
drm/nouveau/privring: ack interrupts the same way as RM
drm/nouveau/i2c/gm200: increase width of aux semaphore owner fields
drm/nouveau/mmu: fix vram heap sizing
drm/nouveau/kms/nv50-: fix case where notifier buffer is at offset 0
io_uring: flush timeouts that should already have expired
libperf tests: If a test fails return non-zero
libperf tests: Fail when failing to get a tracepoint id
RISC-V: Set current memblock limit
RISC-V: Fix maximum allowed phsyical memory for RV32
x86/xen: fix 'nopvspin' build error
nfsd: Fixes for nfsd4_encode_read_plus_data()
nfsd: Don't set eof on a truncated READ_PLUS
gpiolib: cdev: fix frame size warning in gpio_ioctl()
pinctrl: aspeed: g6: Fix PWMG0 pinctrl setting
pinctrl: mediatek: Fix fallback call path
RDMA/ucma: Do not miss ctx destruction steps in some cases
btrfs: print the actual offset in btrfs_root_name
scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression
scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEM
scsi: ufs: Fix tm request when non-fatal error happens
crypto: omap-sham - Fix link error without crypto-engine
bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attach
powerpc: Use the common INIT_DATA_SECTION macro in vmlinux.lds.S
powerpc: Fix alignment bug within the init sections
arm64: entry: remove redundant IRQ flag tracing
bpf: Reject too big ctx_size_in for raw_tp test run
drm/amdkfd: Fix out-of-bounds read in kdf_create_vcrat_image_cpu()
RDMA/umem: Avoid undefined behavior of rounddown_pow_of_two()
RDMA/cma: Fix error flow in default_roce_mode_store
printk: ringbuffer: fix line counting
printk: fix kmsg_dump_get_buffer length calulations
iov_iter: fix the uaccess area in copy_compat_iovec_from_user
i2c: octeon: check correct size of maximum RECV_LEN packet
drm/vc4: Unify PCM card's driver_name
platform/x86: intel-vbtn: Drop HP Stream x360 Convertible PC 11 from allow-list
platform/x86: hp-wmi: Don't log a warning on HPWMI_RET_UNKNOWN_COMMAND errors
gpio: sifive: select IRQ_DOMAIN_HIERARCHY rather than depend on it
ALSA: hda: Balance runtime/system PM if direct-complete is disabled
xsk: Clear pool even for inactive queues
selftests: net: fib_tests: remove duplicate log test
can: dev: can_restart: fix use after free bug
can: vxcan: vxcan_xmit: fix use after free bug
can: peak_usb: fix use after free bugs
perf evlist: Fix id index for heterogeneous systems
i2c: sprd: depend on COMMON_CLK to fix compile tests
iio: common: st_sensors: fix possible infinite loop in st_sensors_irq_thread
iio: ad5504: Fix setting power-down state
drivers: iio: temperature: Add delay after the addressed reset command in mlx90632.c
iio: adc: ti_am335x_adc: remove omitted iio_kfifo_free()
counter:ti-eqep: remove floor
powerpc/64s: fix scv entry fallback flush vs interrupt
cifs: do not fail __smb_send_rqst if non-fatal signals are pending
irqchip/mips-cpu: Set IPI domain parent chip
x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
x86/topology: Make __max_die_per_package available unconditionally
x86/mmx: Use KFPU_387 for MMX string operations
x86/setup: don't remove E820_TYPE_RAM for pfn 0
proc_sysctl: fix oops caused by incorrect command parameters
mm: memcg/slab: optimize objcg stock draining
mm: memcg: fix memcg file_dirty numa stat
mm: fix numa stats for thp migration
io_uring: iopoll requests should also wake task ->in_idle state
io_uring: fix SQPOLL IORING_OP_CLOSE cancelation state
io_uring: fix short read retries for non-reg files
intel_th: pci: Add Alder Lake-P support
stm class: Fix module init return on allocation failure
serial: mvebu-uart: fix tx lost characters at power off
ehci: fix EHCI host controller initialization sequence
USB: ehci: fix an interrupt calltrace error
usb: gadget: aspeed: fix stop dma register setting.
USB: gadget: dummy-hcd: Fix errors in port-reset handling
usb: udc: core: Use lock when write to soft_connect
usb: bdc: Make bdc pci driver depend on BROKEN
usb: cdns3: imx: fix writing read-only memory issue
usb: cdns3: imx: fix can't create core device the second time issue
xhci: make sure TRB is fully written before giving it to the controller
xhci: tegra: Delay for disabling LFPS detector
drivers core: Free dma_range_map when driver probe failed
driver core: Fix device link device name collision
driver core: Extend device_is_dependent()
drm/i915: s/intel_dp_sink_dpms/intel_dp_set_power/
drm/i915: Only enable DFP 4:4:4->4:2:0 conversion when outputting YCbCr 4:4:4
x86/entry: Fix noinstr fail
x86/cpu/amd: Set __max_die_per_package on AMD
cls_flower: call nla_ok() before nla_next()
netfilter: rpfilter: mask ecn bits before fib lookup
tools: gpio: fix %llu warning in gpio-event-mon.c
tools: gpio: fix %llu warning in gpio-watch.c
drm/i915/hdcp: Update CP property in update_pipe
sh: dma: fix kconfig dependency for G2_DMA
sh: Remove unused HAVE_COPY_THREAD_TLS macro
locking/lockdep: Cure noinstr fail
ASoC: SOF: Intel: fix page fault at probe if i915 init fails
octeontx2-af: Fix missing check bugs in rvu_cgx.c
net: dsa: mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext
selftests/powerpc: Fix exit status of pkey tests
sh_eth: Fix power down vs. is_opened flag ordering
nvme-pci: refactor nvme_unmap_data
nvme-pci: fix error unwind in nvme_map_data
cachefiles: Drop superfluous readpages aops NULL check
lightnvm: fix memory leak when submit fails
skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
kasan: fix incorrect arguments passing in kasan_add_zero_shadow
tcp: fix TCP socket rehash stats mis-accounting
net_sched: gen_estimator: support large ewma log
udp: mask TOS bits in udp_v4_early_demux()
ipv6: create multicast route with RTPROT_KERNEL
net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
net_sched: reject silly cell_log in qdisc_get_rtab()
ipv6: set multicast flag on the multicast route
net: mscc: ocelot: allow offloading of bridge on top of LAG
net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
net: dsa: b53: fix an off by one in checking "vlan->vid"
tcp: do not mess with cloned skbs in tcp_add_backlog()
tcp: fix TCP_USER_TIMEOUT with zero window
net: mscc: ocelot: Fix multicast to the CPU port
net: core: devlink: use right genl user_ptr when handling port param get/set
pinctrl: qcom: Allow SoCs to specify a GPIO function that's not 0
pinctrl: qcom: No need to read-modify-write the interrupt status
pinctrl: qcom: Properly clear "intr_ack_high" interrupts when unmasking
pinctrl: qcom: Don't clear pending interrupts when enabling
x86/sev: Fix nonistr violation
tty: implement write_iter
tty: fix up hung_up_tty_write() conversion
net: systemport: free dev before on error path
x86/sev-es: Handle string port IO to kernel memory properly
tcp: Fix potential use-after-free due to double kfree()
ASoC: SOF: Intel: hda: Avoid checking jack on system suspend
drm/i915/hdcp: Get conn while content_type changed
bpf: Local storage helpers should check nullness of owner ptr passed
kernfs: implement ->read_iter
kernfs: implement ->write_iter
kernfs: wire up ->splice_read and ->splice_write
interconnect: imx8mq: Use icc_sync_state
fs/pipe: allow sendfile() to pipe again
Commit 9bb48c82aced ("tty: implement write_iter") converted the tty layer to use write_iter. Fix the redirected_tty_write declaration also in n_tty and change the comparisons to use write_iter instead of write. also in n_tty and change the comparisons to use write_iter instead of write.
mm: fix initialization of struct page for holes in memory layout
Revert "mm: fix initialization of struct page for holes in memory layout"
Linux 5.10.11
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I502239d06f9bcd68b59149376f3c796c64de5942
[ Upstream commit b36b0fe96af13460278bf9b173beced1bd15f85d ]
It's useful to be able to test non-vector event channel delivery, to make
sure Linux will work properly on older Xen which doesn't have it.
It's also useful for those working on Xen and Xen-compatible hypervisors,
because there are guest kernels still in active use which use PCI INTX
even when vector delivery is available.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Free the pages parallely for a task that receives SIGKILL, from ULMK
process, using the oom_reaper. This freeing of pages will help to give
the pages to buddy system well advance.
Add the boot param, reap_mem_when_killed_by=, that configures the
process name, the kill signal to a process from which makes its memory
reaped by oom reaper.
As an example, when reap_mem_when_killed_by=lmkd, then all the processes
that receives the kill signal from lmkd is added to oom reaper.
Not initializing this param makes this feature disabled.
Change-Id: I21adb95de5e380a80d7eb0b87d9b5b553f52e28a
Bug: 171763461
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Changes in 5.10.2
ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()
ktest.pl: If size of log is too big to email, email error message
ktest.pl: Fix the logic for truncating the size of the log file for email
USB: legotower: fix logical error in recent commit
USB: dummy-hcd: Fix uninitialized array use in init()
USB: add RESET_RESUME quirk for Snapscan 1212
ALSA: usb-audio: Fix potential out-of-bounds shift
ALSA: usb-audio: Fix control 'access overflow' errors from chmap
xhci: Give USB2 ports time to enter U3 in bus suspend
usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
xhci-pci: Allow host runtime PM as default for Intel Alpine Ridge LP
xhci-pci: Allow host runtime PM as default for Intel Maple Ridge xHCI
USB: UAS: introduce a quirk to set no_write_same
USB: sisusbvga: Make console support depend on BROKEN
ALSA: pcm: oss: Fix potential out-of-bounds shift
serial: 8250_omap: Avoid FIFO corruption caused by MDR1 access
Linux 5.10.2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0dfd41a3ba5b102699ef78641fbe48ed16957a0f
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache after user accesses.
This is part of the fix for CVE-2020-4788.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache on kernel entry.
This is part of the fix for CVE-2020-4788.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Steps on the way to 5.10-rc1
Resolves conflicts in:
fs/userfaultfd.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie3fe3c818f1f6565cfd4fa551de72d2b72ef60af
Steps on the way to 5.10-rc1
Resolves merge issues in:
drivers/net/virtio_net.c
net/xfrm/xfrm_state.c
net/xfrm/xfrm_user.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3132e7802f25cb775eb02d0b3a03068da39a6fe2
Resolves conflicts in:
kernel/dma/mapping.c
Steps on the way to 5.10-rc1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I61292201a3ac4b92c39f330585692652cf985550
Pull more xen updates from Juergen Gross:
- a series for the Xen pv block drivers adding module parameters for
better control of resource usge
- a cleanup series for the Xen event driver
* tag 'for-linus-5.10b-rc1c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
Documentation: add xen.fifo_events kernel parameter description
xen/events: unmask a fifo event channel only if it was masked
xen/events: only register debug interrupt for 2-level events
xen/events: make struct irq_info private to events_base.c
xen: remove no longer used functions
xen-blkfront: Apply changed parameter name to the document
xen-blkfront: add a parameter for disabling of persistent grants
xen-blkback: add a parameter for disabling of persistent grants
Pull more xen updates from Juergen Gross:
- A single patch to fix the Xen security issue XSA-331 (malicious
guests can DoS dom0 by triggering NULL-pointer dereferences or access
to stale data).
- A larger series to fix the Xen security issue XSA-332 (malicious
guests can DoS dom0 by sending events at high frequency leading to
dom0's vcpus being busy in IRQ handling for elongated times).
* tag 'for-linus-5.10b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: block rogue events for some time
xen/events: defer eoi in case of excessive number of events
xen/events: use a common cpu hotplug hook for event channels
xen/events: switch user event channels to lateeoi model
xen/pciback: use lateeoi irq binding
xen/pvcallsback: use lateeoi irq binding
xen/scsiback: use lateeoi irq binding
xen/netback: use lateeoi irq binding
xen/blkback: use lateeoi irq binding
xen/events: add a new "late EOI" evtchn framework
xen/events: fix race in evtchn_fifo_unmask()
xen/events: add a proper barrier to 2-level uevent unmasking
xen/events: avoid removing an event channel while handling it
In case rogue guests are sending events at high frequency it might
happen that xen_evtchn_do_upcall() won't stop processing events in
dom0. As this is done in irq handling a crash might be the result.
In order to avoid that, delay further inter-domain events after some
time in xen_evtchn_do_upcall() by forcing eoi processing into a
worker on the same cpu, thus inhibiting new events coming in.
The time after which eoi processing is to be delayed is configurable
via a new module parameter "event_loop_timeout" which specifies the
maximum event loop time in jiffies (default: 2, the value was chosen
after some tests showing that a value of 2 was the lowest with an
only slight drop of dom0 network throughput while multiple guests
performed an event storm).
How long eoi processing will be delayed can be specified via another
parameter "event_eoi_delay" (again in jiffies, default 10, again the
value was chosen after testing with different delay values).
This is part of XSA-332.
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Wei Liu <wl@xen.org>
Pull RCU changes from Ingo Molnar:
- Debugging for smp_call_function()
- RT raw/non-raw lock ordering fixes
- Strict grace periods for KASAN
- New smp_call_function() torture test
- Torture-test updates
- Documentation updates
- Miscellaneous fixes
[ This doesn't actually pull the tag - I've dropped the last merge from
the RCU branch due to questions about the series. - Linus ]
* tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
smp: Make symbol 'csd_bug_count' static
kernel/smp: Provide CSD lock timeout diagnostics
smp: Add source and destination CPUs to __call_single_data
rcu: Shrink each possible cpu krcp
rcu/segcblist: Prevent useless GP start if no CBs to accelerate
torture: Add gdb support
rcutorture: Allow pointer leaks to test diagnostic code
rcutorture: Hoist OOM registry up one level
refperf: Avoid null pointer dereference when buf fails to allocate
rcutorture: Properly synchronize with OOM notifier
rcutorture: Properly set rcu_fwds for OOM handling
torture: Add kvm.sh --help and update help message
rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
torture: Update initrd documentation
rcutorture: Replace HTTP links with HTTPS ones
locktorture: Make function torture_percpu_rwsem_init() static
torture: document --allcpus argument added to the kvm.sh script
rcutorture: Output number of elapsed grace periods
rcutorture: Remove KCSAN stubs
rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
...
Merge more updates from Andrew Morton:
"155 patches.
Subsystems affected by this patch series: mm (dax, debug, thp,
readahead, page-poison, util, memory-hotplug, zram, cleanups), misc,
core-kernel, get_maintainer, MAINTAINERS, lib, bitops, checkpatch,
binfmt, ramfs, autofs, nilfs, rapidio, panic, relay, kgdb, ubsan,
romfs, and fault-injection"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
lib, uaccess: add failure injection to usercopy functions
lib, include/linux: add usercopy failure capability
ROMFS: support inode blocks calculation
ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang
sched.h: drop in_ubsan field when UBSAN is in trap mode
scripts/gdb/tasks: add headers and improve spacing format
scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command
kernel/relay.c: drop unneeded initialization
panic: dump registers on panic_on_warn
rapidio: fix the missed put_device() for rio_mport_add_riodev
rapidio: fix error handling path
nilfs2: fix some kernel-doc warnings for nilfs2
autofs: harden ioctl table
ramfs: fix nommu mmap with gaps in the page cache
mm: remove the now-unnecessary mmget_still_valid() hack
mm/gup: take mmap_lock in get_dump_page()
binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
coredump: rework elf/elf_fdpic vma_dump_size() into common helper
coredump: refactor page range dumping into common helper
coredump: let dump_emit() bail out on short writes
...
Patch series "add fault injection to user memory access", v3.
The goal of this series is to improve testing of fault-tolerance in usages
of user memory access functions, by adding support for fault injection.
syzkaller/syzbot are using the existing fault injection modes and will use
this particular feature also.
The first patch adds failure injection capability for usercopy functions.
The second changes usercopy functions to use this new failure capability
(copy_from_user, ...). The third patch adds get/put/clear_user failures
to x86.
This patch (of 3):
Add a failure injection capability to improve testing of fault-tolerance
in usages of user memory access functions.
Add CONFIG_FAULT_INJECTION_USERCOPY to enable faults in usercopy
functions. The should_fail_usercopy function is to be called by these
functions (copy_from_user, get_user, ...) in order to fail or not.
Signed-off-by: Albert van der Linde <alinde@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20200831171733.955393-1-alinde@google.com
Link: http://lkml.kernel.org/r/20200831171733.955393-2-alinde@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from Jakub Kicinski:
- Add redirect_neigh() BPF packet redirect helper, allowing to limit
stack traversal in common container configs and improving TCP
back-pressure.
Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
- Expand netlink policy support and improve policy export to user
space. (Ge)netlink core performs request validation according to
declared policies. Expand the expressiveness of those policies
(min/max length and bitmasks). Allow dumping policies for particular
commands. This is used for feature discovery by user space (instead
of kernel version parsing or trial and error).
- Support IGMPv3/MLDv2 multicast listener discovery protocols in
bridge.
- Allow more than 255 IPv4 multicast interfaces.
- Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
packets of TCPv6.
- In Multi-patch TCP (MPTCP) support concurrent transmission of data on
multiple subflows in a load balancing scenario. Enhance advertising
addresses via the RM_ADDR/ADD_ADDR options.
- Support SMC-Dv2 version of SMC, which enables multi-subnet
deployments.
- Allow more calls to same peer in RxRPC.
- Support two new Controller Area Network (CAN) protocols - CAN-FD and
ISO 15765-2:2016.
- Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
kernel problem.
- Add TC actions for implementing MPLS L2 VPNs.
- Improve nexthop code - e.g. handle various corner cases when nexthop
objects are removed from groups better, skip unnecessary
notifications and make it easier to offload nexthops into HW by
converting to a blocking notifier.
- Support adding and consuming TCP header options by BPF programs,
opening the doors for easy experimental and deployment-specific TCP
option use.
- Reorganize TCP congestion control (CC) initialization to simplify
life of TCP CC implemented in BPF.
- Add support for shipping BPF programs with the kernel and loading
them early on boot via the User Mode Driver mechanism, hence reusing
all the user space infra we have.
- Support sleepable BPF programs, initially targeting LSM and tracing.
- Add bpf_d_path() helper for returning full path for given 'struct
path'.
- Make bpf_tail_call compatible with bpf-to-bpf calls.
- Allow BPF programs to call map_update_elem on sockmaps.
- Add BPF Type Format (BTF) support for type and enum discovery, as
well as support for using BTF within the kernel itself (current use
is for pretty printing structures).
- Support listing and getting information about bpf_links via the bpf
syscall.
- Enhance kernel interfaces around NIC firmware update. Allow
specifying overwrite mask to control if settings etc. are reset
during update; report expected max time operation may take to users;
support firmware activation without machine reboot incl. limits of
how much impact reset may have (e.g. dropping link or not).
- Extend ethtool configuration interface to report IEEE-standard
counters, to limit the need for per-vendor logic in user space.
- Adopt or extend devlink use for debug, monitoring, fw update in many
drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
dpaa2-eth).
- In mlxsw expose critical and emergency SFP module temperature alarms.
Refactor port buffer handling to make the defaults more suitable and
support setting these values explicitly via the DCBNL interface.
- Add XDP support for Intel's igb driver.
- Support offloading TC flower classification and filtering rules to
mscc_ocelot switches.
- Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
fixed interval period pulse generator and one-step timestamping in
dpaa-eth.
- Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
offload.
- Add Lynx PHY/PCS MDIO module, and convert various drivers which have
this HW to use it. Convert mvpp2 to split PCS.
- Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
7-port Mediatek MT7531 IP.
- Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
and wcn3680 support in wcn36xx.
- Improve performance for packets which don't require much offloads on
recent Mellanox NICs by 20% by making multiple packets share a
descriptor entry.
- Move chelsio inline crypto drivers (for TLS and IPsec) from the
crypto subtree to drivers/net. Move MDIO drivers out of the phy
directory.
- Clean up a lot of W=1 warnings, reportedly the actively developed
subsections of networking drivers should now build W=1 warning free.
- Make sure drivers don't use in_interrupt() to dynamically adapt their
code. Convert tasklets to use new tasklet_setup API (sadly this
conversion is not yet complete).
* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
net, sockmap: Don't call bpf_prog_put() on NULL pointer
bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
bpf, sockmap: Add locking annotations to iterator
netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
net: fix pos incrementment in ipv6_route_seq_next
net/smc: fix invalid return code in smcd_new_buf_create()
net/smc: fix valid DMBE buffer sizes
net/smc: fix use-after-free of delayed events
bpfilter: Fix build error with CONFIG_BPFILTER_UMH
cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
bpf: Fix register equivalence tracking.
rxrpc: Fix loss of final ack on shutdown
rxrpc: Fix bundle counting for exclusive connections
netfilter: restore NF_INET_NUMHOOKS
ibmveth: Identify ingress large send packets.
ibmveth: Switch order of ibmveth_helper calls.
cxgb4: handle 4-tuple PEDIT to NAT mode translation
selftests: Add VRF route leaking tests
...
Pull dma-mapping updates from Christoph Hellwig:
- rework the non-coherent DMA allocator
- move private definitions out of <linux/dma-mapping.h>
- lower CMA_ALIGNMENT (Paul Cercueil)
- remove the omap1 dma address translation in favor of the common code
- make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
- support per-node DMA CMA areas (Barry Song)
- increase the default seg boundary limit (Nicolin Chen)
- misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
- various cleanups
* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
ARM/ixp4xx: add a missing include of dma-map-ops.h
dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
dma-direct: factor out a dma_direct_alloc_from_pool helper
dma-direct check for highmem pages in dma_direct_alloc_pages
dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
dma-mapping: move dma-debug.h to kernel/dma/
dma-mapping: remove <asm/dma-contiguous.h>
dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
dma-contiguous: remove dma_contiguous_set_default
dma-contiguous: remove dev_set_cma_area
dma-contiguous: remove dma_declare_contiguous
dma-mapping: split <linux/dma-mapping.h>
cma: decrease CMA_ALIGNMENT lower limit to 2
firewire-ohci: use dma_alloc_pages
dma-iommu: implement ->alloc_noncoherent
dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
dma-mapping: add a new dma_alloc_pages API
dma-mapping: remove dma_cache_sync
53c700: convert to dma_alloc_noncoherent
...
Pull documentation updates from Jonathan Corbet:
"As hoped, things calmed down for docs this cycle; fewer changes and
almost no conflicts at all. This includes:
- A reworked and expanded user-mode Linux document
- Some simplifications and improvements for submitting-patches.rst
- An emergency fix for (some) problems with Sphinx 3.x
- Some welcome automarkup improvements to automatically generate
cross-references to struct definitions and other documents
- The usual collection of translation updates, typo fixes, etc"
* tag 'docs-5.10' of git://git.lwn.net/linux: (81 commits)
gpiolib: Update indentation in driver.rst for code excerpts
Documentation/admin-guide: tainted-kernels: Fix typo occured
Documentation: better locations for sysfs-pci, sysfs-tagging
docs: programming-languages: refresh blurb on clang support
Documentation: kvm: fix a typo
Documentation: Chinese translation of Documentation/arm64/amu.rst
doc: zh_CN: index files in arm64 subdirectory
mailmap: add entry for <mstarovoitov@marvell.com>
doc: seq_file: clarify role of *pos in ->next()
docs: trace: ring-buffer-design.rst: use the new SPDX tag
Documentation: kernel-parameters: clarify "module." parameters
Fix references to nommu-mmap.rst
docs: rewrite admin-guide/sysctl/abi.rst
docs: fb: Remove vesafb scrollback boot option
docs: fb: Remove sstfb scrollback boot option
docs: fb: Remove matroxfb scrollback boot option
docs: fb: Remove framebuffer scrollback boot option
docs: replace the old User Mode Linux HowTo with a new one
Documentation/admin-guide: blockdev/ramdisk: remove use of "rdev"
Documentation/admin-guide: README & svga: remove use of "rdev"
...
Pull v5.10 RCU changes from Paul E. McKenney:
- Debugging for smp_call_function().
- Strict grace periods for KASAN. The point of this series is to find
RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD
Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is
further disabled by dfefault. Finally, the help text includes
a goodly list of scary caveats.
- New smp_call_function() torture test.
- Torture-test updates.
- Documentation updates.
- Miscellaneous fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On cuttlefish device, DCC registers are unavailable and cause kernel to
crash if those registers are probed. Introduce a module parameter
("hvc_dcc.enable") to enable DCC at the kernel commandline.
Bug: 169129589
Change-Id: I0218d9e64443c881d163e484712edf18e42975fd
Signed-off-by: Elliot Berman <eberman@codeaurora.org>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>