android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Greg Kroah-Hartman	578a3af78b	Merge 5.10.213 into android12-5.10-lts Changes in 5.10.213 mmc: mmci: stm32: use a buffer for unaligned DMA requests mmc: mmci: stm32: fix DMA API overlapping mappings warning lan78xx: Fix white space and style issues lan78xx: Add missing return code checks lan78xx: Fix partial packet errors on suspend/resume lan78xx: Fix race conditions in suspend/resume handling net: lan78xx: fix runtime PM count underflow on link stop ixgbe: {dis, en}able irqs in ixgbe_txrx_ring_{dis, en}able i40e: disable NAPI right after disabling irqs when handling xsk_pool tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string geneve: make sure to pull inner header in geneve_rx() net: ice: Fix potential NULL pointer dereference in ice_bridge_setlink() net/ipv6: avoid possible UAF in ip6_route_mpath_notify() cpumap: Zero-initialise xdp_rxq_info struct before running XDP program net/rds: fix WARNING in rds_conn_connect_if_down netfilter: nft_ct: fix l3num expectations with inet pseudo family netfilter: nf_conntrack_h323: Add protection for bmp length out of range netrom: Fix a data-race around sysctl_netrom_default_path_quality netrom: Fix a data-race around sysctl_netrom_obsolescence_count_initialiser netrom: Fix data-races around sysctl_netrom_network_ttl_initialiser netrom: Fix a data-race around sysctl_netrom_transport_timeout netrom: Fix a data-race around sysctl_netrom_transport_maximum_tries netrom: Fix a data-race around sysctl_netrom_transport_acknowledge_delay netrom: Fix a data-race around sysctl_netrom_transport_busy_delay netrom: Fix a data-race around sysctl_netrom_transport_requested_window_size netrom: Fix a data-race around sysctl_netrom_transport_no_activity_timeout netrom: Fix a data-race around sysctl_netrom_routing_control netrom: Fix a data-race around sysctl_netrom_link_fails_count netrom: Fix data-races around sysctl_net_busy_read selftests/mm: switch to bash from sh selftests: mm: fix map_hugetlb failure on 64K page size systems um: allow not setting extra rpaths in the linux binary xhci: remove extra loop in interrupt context xhci: prevent double-fetch of transfer and transfer event TRBs xhci: process isoc TD properly when there was a transaction error mid TD. xhci: handle isoc Babble and Buffer Overrun events properly serial: max310x: Use devm_clk_get_optional() to get the input clock serial: max310x: Try to get crystal clock rate from property serial: max310x: fail probe if clock crystal is unstable serial: max310x: Make use of device properties serial: max310x: use regmap methods for SPI batch operations serial: max310x: use a separate regmap for each port serial: max310x: prevent infinite while() loop in port startup net: Change sock_getsockopt() to take the sk ptr instead of the sock ptr bpf: net: Change sk_getsockopt() to take the sockptr_t argument lsm: make security_socket_getpeersec_stream() sockptr_t safe lsm: fix default return value of the socket_getpeersec_() hooks ext4: make ext4_es_insert_extent() return void ext4: refactor ext4_da_map_blocks() ext4: convert to exclusive lock while inserting delalloc extents Drivers: hv: vmbus: Add vmbus_requestor data structure for VMBus hardening hv_netvsc: Use vmbus_requestor to generate transaction IDs for VMBus hardening hv_netvsc: Wait for completion on request SWITCH_DATA_PATH hv_netvsc: Process NETDEV_GOING_DOWN on VF hot remove hv_netvsc: Make netvsc/VF binding check both MAC and serial number hv_netvsc: use netif_is_bond_master() instead of open code hv_netvsc: Register VF in netvsc_probe if NET_DEVICE_REGISTER missed mm/hugetlb: change hugetlb_reserve_pages() to type bool mm: hugetlb pages should not be reserved by shmat() if SHM_NORESERVE getrusage: add the "signal_struct sig" local variable getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand() getrusage: use __for_each_thread() getrusage: use sig->stats_lock rather than lock_task_sighand() serial: max310x: Unprepare and disable clock in error path Drivers: hv: vmbus: Drop error message when 'No request id available' regmap: allow to define reg_update_bits for no bus configuration regmap: Add bulk read/write callbacks into regmap_config serial: max310x: make accessing revision id interface-agnostic serial: max310x: implement I2C support serial: max310x: fix IO data corruption in batched operations Linux 5.10.213 Change-Id: I3450b2b1b545eeb2e3eb862f39d1846a31d17a0a Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2024-05-01 06:27:24 +00:00
Oleg Nesterov	f610023e67	getrusage: use sig->stats_lock rather than lock_task_sighand() [ Upstream commit f7ec1cd5cc7ef3ad964b677ba82b8b77f1c93009 ] lock_task_sighand() can trigger a hard lockup. If NR_CPUS threads call getrusage() at the same time and the process has NR_THREADS, spin_lock_irq will spin with irqs disabled O(NR_CPUS * NR_THREADS) time. Change getrusage() to use sig->stats_lock, it was specifically designed for this type of use. This way it runs lockless in the likely case. TODO: - Change do_task_stat() to use sig->stats_lock too, then we can remove spin_lock_irq(siglock) in wait_task_zombie(). - Turn sig->stats_lock into seqcount_rwlock_t, this way the readers in the slow mode won't exclude each other. See https://lore.kernel.org/all/20230913154907.GA26210@redhat.com/ - stats_lock has to disable irqs because ->siglock can be taken in irq context, it would be very nice to change __exit_signal() to avoid the siglock->stats_lock dependency. Link: https://lkml.kernel.org/r/20240122155053.GA26214@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dylan Hatch <dylanbhatch@google.com> Tested-by: Dylan Hatch <dylanbhatch@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-15 10:48:22 -04:00
Oleg Nesterov	9ca9786820	getrusage: use __for_each_thread() [ Upstream commit 13b7bc60b5353371460a203df6c38ccd38ad7a3a ] do/while_each_thread should be avoided when possible. Plus this change allows to avoid lock_task_sighand(), we can use rcu and/or sig->stats_lock instead. Link: https://lkml.kernel.org/r/20230909172629.GA20454@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: f7ec1cd5cc7e ("getrusage: use sig->stats_lock rather than lock_task_sighand()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-15 10:48:22 -04:00
Oleg Nesterov	21677f35e1	getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand() [ Upstream commit daa694e4137571b4ebec330f9a9b4d54aa8b8089 ] Patch series "getrusage: use sig->stats_lock", v2. This patch (of 2): thread_group_cputime() does its own locking, we can safely shift thread_group_cputime_adjusted() which does another for_each_thread loop outside of ->siglock protected section. This is also preparation for the next patch which changes getrusage() to use stats_lock instead of siglock, thread_group_cputime() takes the same lock. With the current implementation recursive read_seqbegin_or_lock() is fine, thread_group_cputime() can't enter the slow mode if the caller holds stats_lock, yet this looks more safe and better performance-wise. Link: https://lkml.kernel.org/r/20240122155023.GA26169@redhat.com Link: https://lkml.kernel.org/r/20240122155050.GA26205@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dylan Hatch <dylanbhatch@google.com> Tested-by: Dylan Hatch <dylanbhatch@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-15 10:48:22 -04:00
Oleg Nesterov	811415fe76	getrusage: add the "signal_struct *sig" local variable [ Upstream commit c7ac8231ace9b07306d0299969e42073b189c70a ] No functional changes, cleanup/preparation. Link: https://lkml.kernel.org/r/20230909172554.GA20441@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: daa694e41375 ("getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-15 10:48:22 -04:00
Greg Kroah-Hartman	4c20c2c837	Merge 5.10.179 into android12-5.10-lts Changes in 5.10.179 ARM: dts: rockchip: fix a typo error for rk3288 spdif node arm64: dts: qcom: ipq8074-hk01: enable QMP device, not the PHY node arm64: dts: meson-g12-common: specify full DMC range arm64: dts: imx8mm-evk: correct pmic clock source netfilter: br_netfilter: fix recent physdev match breakage regulator: fan53555: Explicitly include bits header net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg virtio_net: bugfix overflow inside xdp_linearize_page() sfc: Split STATE_READY in to STATE_NET_DOWN and STATE_NET_UP. sfc: Fix use-after-free due to selftest_work netfilter: nf_tables: fix ifdef to also consider nf_tables=m i40e: fix accessing vsi->active_filters without holding lock i40e: fix i40e_setup_misc_vector() error handling mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next() net: rpl: fix rpl header size calculation mlxsw: pci: Fix possible crash during initialization bpf: Fix incorrect verifier pruning due to missing register precision taints e1000e: Disable TSO on i219-LM card to increase speed f2fs: Fix f2fs_truncate_partial_nodes ftrace event Input: i8042 - add quirk for Fujitsu Lifebook A574/H selftests: sigaltstack: fix -Wuninitialized scsi: megaraid_sas: Fix fw_crash_buffer_show() scsi: core: Improve scsi_vpd_inquiry() checks net: dsa: b53: mmap: add phy ops s390/ptrace: fix PTRACE_GET_LAST_BREAK error handling nvme-tcp: fix a possible UAF when failing to allocate an io queue xen/netback: use same error messages for same errors powerpc/doc: Fix htmldocs errors xfs: drop submit side trans alloc for append ioends iio: light: tsl2772: fix reading proximity-diodes from device tree nilfs2: initialize unused bytes in segment summary blocks memstick: fix memory leak if card device is never registered kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() mmc: sdhci_am654: Set HIGH_SPEED_ENA for SDR12 and SDR25 mm/khugepaged: check again on anon uffd-wp during isolation sched/uclamp: Make task_fits_capacity() use util_fits_cpu() sched/uclamp: Fix fits_capacity() check in feec() sched/uclamp: Make select_idle_capacity() use util_fits_cpu() sched/uclamp: Make asym_fits_capacity() use util_fits_cpu() sched/uclamp: Make cpu_overutilized() use util_fits_cpu() sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit condition sched/fair: Detect capacity inversion sched/fair: Consider capacity inversion in util_fits_cpu() sched/uclamp: Fix a uninitialized variable warnings sched/fair: Fixes for capacity inversion detection MIPS: Define RUNTIME_DISCARD_EXIT in LD script docs: futex: Fix kernel-doc references after code split-up preparation purgatory: fix disabling debug info virtiofs: clean up error handling in virtio_fs_get_tree() virtiofs: split requests that exceed virtqueue size fuse: check s_root when destroying sb fuse: fix attr version comparison in fuse_read_update_size() fuse: always revalidate rename target dentry fuse: fix deadlock between atomic O_TRUNC and page invalidation Revert "ext4: fix use-after-free in ext4_xattr_set_entry" ext4: remove duplicate definition of ext4_xattr_ibody_inline_set() ext4: fix use-after-free in ext4_xattr_set_entry udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM). tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy(). dccp: Call inet6_destroy_sock() via sk->sk_destruct(). sctp: Call inet6_destroy_sock() via sk->sk_destruct(). pwm: meson: Explicitly set .polarity in .get_state() pwm: iqs620a: Explicitly set .polarity in .get_state() pwm: hibvt: Explicitly set .polarity in .get_state() iio: adc: at91-sama5d2_adc: fix an error code in at91_adc_allocate_trigger() ASoC: fsl_asrc_dma: fix potential null-ptr-deref ASN.1: Fix check for strdup() success Linux 5.10.179 Change-Id: I54e476aa9b199a4711a091c77583739ed82af5ad Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-06-16 09:49:29 +00:00
Ondrej Mosnacek	c61928fcca	kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() commit 659c0ce1cb9efc7f58d380ca4bb2a51ae9e30553 upstream. Linux Security Modules (LSMs) that implement the "capable" hook will usually emit an access denial message to the audit log whenever they "block" the current task from using the given capability based on their security policy. The occurrence of a denial is used as an indication that the given task has attempted an operation that requires the given access permission, so the callers of functions that perform LSM permission checks must take care to avoid calling them too early (before it is decided if the permission is actually needed to perform the requested operation). The __sys_setres[ug]id() functions violate this convention by first calling ns_capable_setid() and only then checking if the operation requires the capability or not. It means that any caller that has the capability granted by DAC (task's capability set) but not by MAC (LSMs) will generate a "denied" audit record, even if is doing an operation for which the capability is not required. Fix this by reordering the checks such that ns_capable_setid() is checked last and -EPERM is returned immediately if it returns false. While there, also do two small optimizations: * move the capability check before prepare_creds() and * bail out early in case of a no-op. Link: https://lkml.kernel.org/r/20230217162154.837549-1-omosnace@redhat.com Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-04-26 11:27:38 +02:00
Greg Kroah-Hartman	78da590924	Merge 5.10.165 into android12-5.10-lts Changes in 5.10.165 btrfs: fix trace event name typo for FLUSH_DELAYED_REFS pNFS/filelayout: Fix coalescing test for single DS selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID tools/virtio: initialize spinlocks in vring_test.c net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats RDMA/srp: Move large values to a new enum for gcc13 btrfs: always report error in run_one_delayed_ref() x86/asm: Fix an assembler warning with current binutils f2fs: let's avoid panic if extent_tree is not created wifi: brcmfmac: fix regression for Broadcom PCIe wifi devices wifi: mac80211: sdata can be NULL during AMPDU start Add exception protection processing for vd in axi_chan_handle_err function zonefs: Detect append writes at invalid locations nilfs2: fix general protection fault in nilfs_btree_insert() efi: fix userspace infinite retry read efivars after EFI runtime services page fault ALSA: hda/realtek - Turn on power early drm/i915/gt: Reset twice Bluetooth: hci_qca: Wait for timeout during suspend Bluetooth: hci_qca: Fix driver shutdown on closed serdev io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL io_uring: improve send/recv error handling io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly io_uring: add flag for disabling provided buffer recycling io_uring: support MSG_WAITALL for IORING_OP_SEND(MSG) io_uring: allow re-poll if we made progress io_uring: fix async accept on O_NONBLOCK sockets io_uring: check for valid register opcode earlier io_uring: lock overflowing for IOPOLL io_uring: fix CQ waiting timeout handling io_uring: ensure that cached task references are always put on exit io_uring: remove duplicated calls to io_kiocb_ppos io_uring: update kiocb->ki_pos at execution time io_uring: do not recalculate ppos unnecessarily io_uring/rw: defer fsnotify calls to task context xhci-pci: set the dma max_seg_size usb: xhci: Check endpoint is valid before dereferencing it xhci: Fix null pointer dereference when host dies xhci: Add update_hub_device override for PCI xHCI hosts xhci: Add a flag to disable USB3 lpm on a xhci root port level. usb: acpi: add helper to check port lpm capability using acpi _DSM xhci: Detect lpm incapable xHC USB3 roothub ports from ACPI tables prlimit: do_prlimit needs to have a speculation check USB: serial: option: add Quectel EM05-G (GR) modem USB: serial: option: add Quectel EM05-G (CS) modem USB: serial: option: add Quectel EM05-G (RS) modem USB: serial: option: add Quectel EC200U modem USB: serial: option: add Quectel EM05CN (SG) modem USB: serial: option: add Quectel EM05CN modem staging: vchiq_arm: fix enum vchiq_status return types USB: misc: iowarrior: fix up header size for USB_DEVICE_ID_CODEMERCS_IOW100 misc: fastrpc: Don't remove map on creater_process and device_release misc: fastrpc: Fix use-after-free race condition for maps usb: core: hub: disable autosuspend for TI TUSB8041 comedi: adv_pci1760: Fix PWM instruction handling mmc: sunxi-mmc: Fix clock refcount imbalance during unbind mmc: sdhci-esdhc-imx: correct the tuning start tap and step setting btrfs: fix race between quota rescan and disable leading to NULL pointer deref cifs: do not include page data when checking signature thunderbolt: Use correct function to calculate maximum USB3 link rate tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO buffer USB: gadgetfs: Fix race between mounting and unmounting USB: serial: cp210x: add SCALANCE LPE-9000 device id usb: host: ehci-fsl: Fix module alias usb: typec: altmodes/displayport: Add pin assignment helper usb: typec: altmodes/displayport: Fix pin assignment calculation usb: gadget: g_webcam: Send color matching descriptor per frame usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate() usb-storage: apply IGNORE_UAS only for HIKSEMI MD202 on RTL9210 dt-bindings: phy: g12a-usb2-phy: fix compatible string documentation dt-bindings: phy: g12a-usb3-pcie-phy: fix compatible string documentation serial: pch_uart: Pass correct sg to dma_unmap_sg() dmaengine: tegra210-adma: fix global intr clear serial: atmel: fix incorrect baudrate setup gsmi: fix null-deref in gsmi_get_variable mei: me: add meteor lake point M DID drm/i915: re-disable RC6p on Sandy Bridge drm/amd/display: Fix set scaling doesn's work drm/amd/display: Calculate output_color_space after pixel encoding adjustment drm/amd/display: Fix COLOR_SPACE_YCBCR2020_TYPE matrix arm64: efi: Execute runtime services from a dedicated stack efi: rt-wrapper: Add missing include Revert "drm/amdgpu: make display pinning more flexible (v2)" x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN tracing: Use alignof__(struct {type b;}) instead of offsetof() io_uring: io_kiocb_update_pos() should not touch file for non -1 offset io_uring/net: fix fast_iov assignment in io_setup_async_msg() net/ulp: use consistent error code when blocking ULP net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work() Revert "wifi: mac80211: fix memory leak in ieee80211_if_add()" soc: qcom: apr: Make qcom,protection-domain optional again Bluetooth: hci_qca: Wait for SSR completion during suspend Bluetooth: hci_qca: check for SSR triggered flag while suspend Bluetooth: hci_qca: Fixed issue during suspend mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma io_uring: Clean up a false-positive warning from GCC 9.3.0 io_uring: fix double poll leak on repolling io_uring/rw: ensure kiocb_end_write() is always called io_uring/rw: remove leftover debug statement Linux 5.10.165 Change-Id: Icb91157d9fa1b56cd79eedb8a9cc6118d0705244 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-02-16 16:43:59 +00:00
Greg Kroah-Hartman	9f8e45720e	prlimit: do_prlimit needs to have a speculation check commit 739790605705ddcf18f21782b9c99ad7d53a8c11 upstream. do_prlimit() adds the user-controlled resource value to a pointer that will subsequently be dereferenced. In order to help prevent this codepath from being used as a spectre "gadget" a barrier needs to be added after checking the range. Reported-by: Jordy Zomer <jordyzomer@google.com> Tested-by: Jordy Zomer <jordyzomer@google.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-24 07:19:58 +01:00
Greg Kroah-Hartman	33740c9227	Merge 5.10.69 into android12-5.10-lts Changes in 5.10.69 PCI: pci-bridge-emul: Add PCIe Root Capabilities Register PCI: aardvark: Fix reporting CRS value console: consume APC, DM, DCS s390/pci_mmio: fully validate the VMA before calling follow_pte() ARM: Qualify enabling of swiotlb_init() ARM: 9077/1: PLT: Move struct plt_entries definition to header ARM: 9078/1: Add warn suppress parameter to arm_gen_branch_link() ARM: 9079/1: ftrace: Add MODULE_PLTS support ARM: 9098/1: ftrace: MODULE_PLT: Fix build problem without DYNAMIC_FTRACE Revert "net/mlx5: Register to devlink ingress VLAN filter trap" sctp: validate chunk size in __rcv_asconf_lookup sctp: add param size validation for SCTP_PARAM_SET_PRIMARY staging: rtl8192u: Fix bitwise vs logical operator in TranslateRxSignalStuff819xUsb() coredump: fix memleak in dump_vma_snapshot() um: virtio_uml: fix memory leak on init failures dmaengine: acpi: Avoid comparison GSI with Linux vIRQ perf test: Fix bpf test sample mismatch reporting tools lib: Adopt memchr_inv() from kernel perf tools: Allow build-id with trailing zeros thermal/drivers/exynos: Fix an error code in exynos_tmu_probe() 9p/trans_virtio: Remove sysfs file on probe failure prctl: allow to setup brk for et_dyn executables nilfs2: use refcount_dec_and_lock() to fix potential UAF profiling: fix shift-out-of-bounds bugs PM: sleep: core: Avoid setting power.must_resume to false pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered pwm: mxs: Don't modify HW state in .probe() after the PWM chip was registered dmaengine: idxd: fix wq slot allocation index check platform/chrome: sensorhub: Add trace events for sample platform/chrome: cros_ec_trace: Fix format warnings ceph: allow ceph_put_mds_session to take NULL or ERR_PTR ceph: cancel delayed work instead of flushing on mdsc teardown Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh thermal/core: Fix thermal_cooling_device_register() prototype drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION() dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER parisc: Move pci_dev_is_behind_card_dino to where it is used iommu/amd: Relocate GAMSup check to early_enable_iommus dmaengine: idxd: depends on !UML dmaengine: sprd: Add missing MODULE_DEVICE_TABLE dmaengine: ioat: depends on !UML dmaengine: xilinx_dma: Set DMA mask for coherent APIs ceph: request Fw caps before updating the mtime in ceph_write_iter ceph: remove the capsnaps when removing caps ceph: lockdep annotations for try_nonblocking_invalidate btrfs: update the bdev time directly when closing btrfs: fix lockdep warning while mounting sprout fs nilfs2: fix memory leak in nilfs_sysfs_create_device_group nilfs2: fix NULL pointer in nilfs_##name##_attr_release nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group habanalabs: add validity check for event ID received from F/W pwm: img: Don't modify HW state in .remove() callback pwm: rockchip: Don't modify HW state in .remove() callback pwm: stm32-lp: Don't modify HW state in .remove() callback blk-throttle: fix UAF by deleteing timer in blk_throtl_exit() blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues rtc: rx8010: select REGMAP_I2C sched/idle: Make the idle timer expire in hard interrupt context drm/nouveau/nvkm: Replace -ENOSYS with -ENODEV Linux 5.10.69 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I982349e3a65b83e92e9b808154bf8c84d094f1d6	2021-09-30 18:36:17 +02:00
Cyrill Gorcunov	30417cbecc	prctl: allow to setup brk for et_dyn executables commit e1fbbd073137a9d63279f6bf363151a938347640 upstream. Keno Fischer reported that when a binray loaded via ld-linux-x the prctl(PR_SET_MM_MAP) doesn't allow to setup brk value because it lays before mm:end_data. For example a test program shows \| # ~/t \| \| start_code 401000 \| end_code 401a15 \| start_stack 7ffce4577dd0 \| start_data 403e10 \| end_data 40408c \| start_brk b5b000 \| sbrk(0) b5b000 and when executed via ld-linux \| # /lib64/ld-linux-x86-64.so.2 ~/t \| \| start_code 7fc25b0a4000 \| end_code 7fc25b0c4524 \| start_stack 7fffcc6b2400 \| start_data 7fc25b0ce4c0 \| end_data 7fc25b0cff98 \| start_brk 55555710c000 \| sbrk(0) 55555710c000 This of course prevent criu from restoring such programs. Looking into how kernel operates with brk/start_brk inside brk() syscall I don't see any problem if we allow to setup brk/start_brk without checking for end_data. Even if someone pass some weird address here on a purpose then the worst possible result will be an unexpected unmapping of existing vma (own vma, since prctl works with the callers memory) but test for RLIMIT_DATA is still valid and a user won't be able to gain more memory in case of expanding VMAs via new values shipped with prctl call. Link: https://lkml.kernel.org/r/20210121221207.GB2174@grain Fixes: `bbdc6076d2` ("binfmt_elf: move brk out of mmap when doing direct loader exec") Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Reported-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Andrey Vagin <avagin@gmail.com> Tested-by: Andrey Vagin <avagin@gmail.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-26 14:08:57 +02:00
Greg Kroah-Hartman	ae16b7c668	Revert "Add a reference to ucounts for each cred" This reverts commit `b2c4d9a33c` which is commit 905ae01c4ae2ae3df05bb141801b1db4b7d83c61 upstream. This commit should not have been applied to the 5.10.y stable tree, so revert it. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/r/87v93k4bl6.fsf@disp2133 Cc: Alexey Gladkov <legion@kernel.org> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-08 08:49:00 +02:00
Peter Collingbourne	50829b8901	BACKPORT: arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) This change introduces a prctl that allows the user program to control which PAC keys are enabled in a particular task. The main reason why this is useful is to enable a userspace ABI that uses PAC to sign and authenticate function pointers and other pointers exposed outside of the function, while still allowing binaries conforming to the ABI to interoperate with legacy binaries that do not sign or authenticate pointers. The idea is that a dynamic loader or early startup code would issue this prctl very early after establishing that a process may load legacy binaries, but before executing any PAC instructions. This change adds a small amount of overhead to kernel entry and exit due to additional required instruction sequences. On a DragonBoard 845c (Cortex-A75) with the powersave governor, the overhead of similar instruction sequences was measured as 4.9ns when simulating the common case where IA is left enabled, or 43.7ns when simulating the uncommon case where IA is disabled. These numbers can be seen as the worst case scenario, since in more realistic scenarios a better performing governor would be used and a newer chip would be used that would support PAC unlike Cortex-A75 and would be expected to be faster than Cortex-A75. On an Apple M1 under a hypervisor, the overhead of the entry/exit instruction sequences introduced by this patch was measured as 0.3ns in the case where IA is left enabled, and 33.0ns in the case where IA is disabled. Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://linux-review.googlesource.com/id/Ibc41a5e6a76b275efbaa126b31119dc197b927a5 Link: https://lore.kernel.org/r/d6609065f8f40397a4124654eb68c9f490b4d477.1616123271.git.pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 192536783 (cherry picked from commit 201698626fbca1cf1a3b686ba14cf2a056500716) Change-Id: Ic0a21c92a22575f9ec3599fb67bd2931a50b9f04 [quic_eberman@quicinc.com: Resolved merge conflict in arch/arm64/kernel/process.c] Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Signed-off-by: Peter Collingbourne <pcc@google.com>	2021-07-14 20:52:05 -07:00
Alexey Gladkov	b2c4d9a33c	Add a reference to ucounts for each cred [ Upstream commit 905ae01c4ae2ae3df05bb141801b1db4b7d83c61 ] For RLIMIT_NPROC and some other rlimits the user_struct that holds the global limit is kept alive for the lifetime of a process by keeping it in struct cred. Adding a pointer to ucounts in the struct cred will allow to track RLIMIT_NPROC not only for user in the system, but for user in the user_namespace. Updating ucounts may require memory allocation which may fail. So, we cannot change cred.ucounts in the commit_creds() because this function cannot fail and it should always return 0. For this reason, we modify cred.ucounts before calling the commit_creds(). Changelog v6: * Fix null-ptr-deref in is_ucounts_overlimit() detected by trinity. This error was caused by the fact that cred_alloc_blank() left the ucounts pointer empty. Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Alexey Gladkov <legion@kernel.org> Link: https://lkml.kernel.org/r/b37aaef28d8b9b0d757e07ba6dd27281bbe39259.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:55:48 +02:00
Frankie Chang	772beecd5b	ANDROID: Add vendor hooks when syscall prctl finished Add vendor hook when syscall prctl finished for vendor-specific tuning. Bug: 181819699 Signed-off-by: Frankie Chang <frankie.chang@mediatek.com> Change-Id: Ica42d80ab4b540045330e9c5b211e0e814eed0ff (cherry picked from commit d150b26653e7a3d15383a09384aace140b537ff4)	2021-03-08 16:18:51 +00:00
Greg Kroah-Hartman	67d3ed5765	Merge 'v5.10-rc1' into android-mainline Linux 5.10-rc1 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iace3fc84a00d3023c75caa086a266de17dc1847c	2020-10-29 06:32:38 +01:00
Greg Kroah-Hartman	338f8e80b4	Merge `0746c4a9f3` ("Merge branch 'i2c/for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux") into android-mainline Steps on the way to 5.10-rc1 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iec426c6de4a59a517e5fa575a9424b883d958f08	2020-10-28 21:20:34 +01:00
Greg Kroah-Hartman	05d2a661fd	Merge `54a4c789ca` ("Merge tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media") into android-mainline Steps on the way to 5.10-rc1 Resolves conflicts in: fs/userfaultfd.c Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ie3fe3c818f1f6565cfd4fa551de72d2b72ef60af	2020-10-26 09:23:33 +01:00
Rasmus Villemoes	986b9eacb2	kernel/sys.c: fix prototype of prctl_get_tid_address() tid_addr is not a "pointer to (pointer to int in userspace)"; it is in fact a "pointer to (pointer to int in userspace) in userspace". So sparse rightfully complains about passing a kernel pointer to put_user(). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-25 11:44:16 -07:00
Linus Torvalds	81ecf91eab	Merge tag 'safesetid-5.10' of git://github.com/micah-morton/linux Pull SafeSetID updates from Micah Morton: "The changes are mostly contained to within the SafeSetID LSM, with the exception of a few 1-line changes to change some ns_capable() calls to ns_capable_setid() -- causing a flag (CAP_OPT_INSETID) to be set that is examined by SafeSetID code and nothing else in the kernel. The changes to SafeSetID internally allow for setting up GID transition security policies, as already existed for UIDs" * tag 'safesetid-5.10' of git://github.com/micah-morton/linux: LSM: SafeSetID: Fix warnings reported by test bot LSM: SafeSetID: Add GID security policy handling LSM: Signal to SafeSetID when setting group IDs	2020-10-25 10:45:26 -07:00
Liao Pingfang	15ec0fcff6	kernel/sys.c: replace do_brk with do_brk_flags in comment of prctl_set_mm_map() Replace do_brk with do_brk_flags in comment of prctl_set_mm_map(), since do_brk was removed in following commit. Fixes: `bb177a732c` ("mm: do not bug_on on incorrect length in __mm_populate()") Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1600650751-43127-1-git-send-email-wang.yi59@zte.com.cn Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-16 11:11:19 -07:00
Thomas Cedeno	111767c1d8	LSM: Signal to SafeSetID when setting group IDs For SafeSetID to properly gate setgid() calls, it needs to know whether ns_capable() is being called from within a sys_setgid() function or is being called from elsewhere in the kernel. This allows SafeSetID to deny CAP_SETGID to restricted groups when they are attempting to use the capability for code paths other than updating GIDs (e.g. setting up userns GID mappings). This is the identical approach to what is currently done for CAP_SETUID. NOTE: We also add signaling to SafeSetID from the setgroups() syscall, as we have future plans to restrict a process' ability to set supplementary groups in addition to what is added in this series for restricting setting of the primary group. Signed-off-by: Thomas Cedeno <thomascedeno@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>	2020-10-13 09:17:34 -07:00
Greg Kroah-Hartman	f022c0602c	Merge 5.9-rc3 into android-mainline Linux 5.9-rc3 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic7758bc57a7d91861657388ddd015db5c5db5480	2020-08-31 19:51:25 +02:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Greg Kroah-Hartman	f04a11ac99	Merge `7b4ea9456d` ("Revert "x86/mm/64: Do not sync vmalloc/ioremap mappings"") into android-mainline Steps on the way to 5.9-rc1 Resolves conflicts in: drivers/irqchip/qcom-pdc.c include/linux/device.h net/xfrm/xfrm_state.c security/lsm_audit.c Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4aeb3d04f4717714a421721eb3ce690c099bb30a	2020-08-07 16:01:35 +02:00
Nicolas Viennot	227175b2c9	prctl: exe link permission error changed from -EINVAL to -EPERM This brings consistency with the rest of the prctl() syscall where -EPERM is returned when failing a capability check. Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com> Signed-off-by: Adrian Reber <areber@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20200719100418.2112740-7-areber@redhat.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2020-07-19 20:14:42 +02:00
Nicolas Viennot	ebd6de6812	prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe Originally, only a local CAP_SYS_ADMIN could change the exe link, making it difficult for doing checkpoint/restore without CAP_SYS_ADMIN. This commit adds CAP_CHECKPOINT_RESTORE in addition to CAP_SYS_ADMIN for permitting changing the exe link. The following describes the history of the /proc/self/exe permission checks as it may be difficult to understand what decisions lead to this point. * [1] May 2012: This commit introduces the ability of changing /proc/self/exe if the user is CAP_SYS_RESOURCE capable. In the related discussion [2], no clear thread model is presented for what could happen if the /proc/self/exe changes multiple times, or why would the admin be at the mercy of userspace. * [3] Oct 2014: This commit introduces a new API to change /proc/self/exe. The permission no longer checks for CAP_SYS_RESOURCE, but instead checks if the current user is root (uid=0) in its local namespace. In the related discussion [4] it is said that "Controlling exe_fd without privileges may turn out to be dangerous. At least things like tomoyo examine it for making policy decisions (see tomoyo_manager())." * [5] Dec 2016: This commit removes the restriction to change /proc/self/exe at most once. The related discussion [6] informs that the audit subsystem relies on the exe symlink, presumably audit_log_d_path_exe() in kernel/audit.c. * [7] May 2017: This commit changed the check from uid==0 to local CAP_SYS_ADMIN. No discussion. * [8] July 2020: A PoC to spoof any program's /proc/self/exe via ptrace is demonstrated Overall, the concrete points that were made to retain capability checks around changing the exe symlink is that tomoyo_manager() and audit_log_d_path_exe() uses the exe_file path. Christian Brauner said that relying on /proc/<pid>/exe being immutable (or guarded by caps) in a sake of security is a bit misleading. It can only be used as a hint without any guarantees of what code is being executed once execve() returns to userspace. Christian suggested that in the future, we could call audit_log() or similar to inform the admin of all exe link changes, instead of attempting to provide security guarantees via permission checks. However, this proposed change requires the understanding of the security implications in the tomoyo/audit subsystems. [1] `b32dfe3771` ("c/r: prctl: add ability to set new mm_struct::exe_file") [2] https://lore.kernel.org/patchwork/patch/292515/ [3] `f606b77f1a` ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation") [4] https://lore.kernel.org/patchwork/patch/479359/ [5] `3fb4afd9a5` ("prctl: remove one-shot limitation for changing exe link") [6] https://lore.kernel.org/patchwork/patch/697304/ [7] `4d28df6152` ("prctl: Allow local CAP_SYS_ADMIN changing exe_file") [8] https://github.com/nviennot/run_as_exe Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com> Signed-off-by: Adrian Reber <areber@redhat.com> Link: https://lore.kernel.org/r/20200719100418.2112740-6-areber@redhat.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2020-07-19 20:14:42 +02:00
Greg Kroah-Hartman	a33191d441	Merge 5.8-rc1 into android-mainline Linux 5.8-rc1 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I00f2168bc9b6fd8e48c7c0776088d2c6cb8e1629	2020-06-25 14:25:32 +02:00
Greg Kroah-Hartman	1da58ab555	ANDROID: fix up direct access to mmap_sem It's now being abstracted away, so fix up the ANDROID specific code that touches the lock to use the correct functions instead. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I9cedb62118cd1bba7af27deb11499d312d24d7fc	2020-06-24 18:17:12 +02:00
Greg Kroah-Hartman	a253db8915	Merge `ad57a1022f` ("Merge tag 'exfat-for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat") into android-mainline Steps on the way to 5.8-rc1. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4bc42f572167ea2f815688b4d1eb6124b6d260d4	2020-06-24 17:54:12 +02:00
Linus Torvalds	4a87b197c1	Merge tag 'LSM-add-setgid-hook-5.8-author-fix' of git://github.com/micah-morton/linux Pull SafeSetID update from Micah Morton: "Add additional LSM hooks for SafeSetID SafeSetID is capable of making allow/deny decisions for setuid calls on a system, and we want to add similar functionality for setgid calls. The work to do that is not yet complete, so probably won't make it in for v5.8, but we are looking to get this simple patch in for v5.8 since we have it ready. We are planning on the rest of the work for extending the SafeSetID LSM being merged during the v5.9 merge window" * tag 'LSM-add-setgid-hook-5.8-author-fix' of git://github.com/micah-morton/linux: security: Add LSM hooks to set*gid syscalls	2020-06-14 11:39:31 -07:00
Thomas Cedeno	39030e1351	security: Add LSM hooks to setgid syscalls The SafeSetID LSM uses the security_task_fix_setuid hook to filter setuid() syscalls according to its configured security policy. In preparation for adding analagous support in the LSM for set*gid() syscalls, we add the requisite hook here. Tested by putting print statements in the security_task_fix_setgid hook and seeing them get hit during kernel boot. Signed-off-by: Thomas Cedeno <thomascedeno@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>	2020-06-14 10:52:02 -07:00
Greg Kroah-Hartman	a64320f5b8	Merge `94709049fb` ("Merge branch 'akpm' (patches from Andrew)") into android-mainline Tiny merge resolutions along the way to 5.8-rc1. Change-Id: I24b3cca28ed36f32c92b6374dae5d7f006d3bced Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2020-06-12 14:39:46 +02:00
Greg Kroah-Hartman	e93a57129f	Merge `062ea674ae` ("Merge branch 'uaccess.__copy_to_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs") into android-mainline baby steps on the way to 5.8-rc1 Change-Id: I45a38d5752e33d1f9a27bce6055435811512ea87 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2020-06-12 13:55:55 +02:00
Michel Lespinasse	c1e8d7c6a7	mmap locking API: convert mmap_sem comments Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-09 09:39:14 -07:00
Michel Lespinasse	d8ed45c5dc	mmap locking API: use coccinelle to convert mmap_sem rwsem call sites This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock \| -down_write +mmap_write_lock \| -down_write_killable +mmap_write_lock_killable \| -down_write_trylock +mmap_write_trylock \| -up_write +mmap_write_unlock \| -downgrade_write +mmap_write_downgrade \| -down_read +mmap_read_lock \| -down_read_killable +mmap_read_lock_killable \| -down_read_trylock +mmap_read_trylock \| -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-09 09:39:14 -07:00
Linus Torvalds	94709049fb	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ...	2020-06-02 12:21:36 -07:00
NeilBrown	a37b0715dd	mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a daemon needs to write to one bdi (the final bdi) in order to free up writes queued to another bdi (the client bdi). The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty pages, so that it can still dirty pages after other processses have been throttled. The purpose of this is to avoid deadlock that happen when the PF_LESS_THROTTLE process must write for any dirty pages to be freed, but it is being thottled and cannot write. This approach was designed when all threads were blocked equally, independently on which device they were writing to, or how fast it was. Since that time the writeback algorithm has changed substantially with different threads getting different allowances based on non-trivial heuristics. This means the simple "add 25%" heuristic is no longer reliable. The important issue is not that the daemon needs a larger dirty page allowance, but that it needs a private dirty page allowance, so that dirty pages for the "client" bdi that it is helping to clear (the bdi for an NFS filesystem or loop block device etc) do not affect the throttling of the daemon writing to the "final" bdi. This patch changes the heuristic so that the task is not throttled when the bdi it is writing to has a dirty page count below below (or equal to) the free-run threshold for that bdi. This ensures it will always be able to have some pages in flight, and so will not deadlock. In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might still be throttled by global threshold, but that is acceptable as it is only the deadlock state that is interesting for this flag. This approach of "only throttle when target bdi is busy" is consistent with the other use of PF_LESS_THROTTLE in current_may_throttle(), were it causes attention to be focussed only on the target bdi. So this patch - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE, - removes the 25% bonus that that flag gives, and - If PF_LOCAL_THROTTLE is set, don't delay at all unless the global and the local free-run thresholds are exceeded. Note that previously realtime threads were treated the same as PF_LESS_THROTTLE threads. This patch does not change the behvaiour for real-time threads, so it is now different from the behaviour of nfsd and loop tasks. I don't know what is wanted for realtime. [akpm@linux-foundation.org: coding style fixes] Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Chuck Lever <chuck.lever@oracle.com> [nfsd] Cc: Christoph Hellwig <hch@lst.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Link: http://lkml.kernel.org/r/87ftbf7gs3.fsf@notabene.neil.brown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-02 10:59:08 -07:00
Al Viro	ce5155c4f8	compat sysinfo(2): don't bother with field-by-field copyout Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-04-25 18:06:05 -04:00
Greg Kroah-Hartman	ae56fd997e	Merge 5.6-rc6 into android-mainline Linux 5.6-rc6 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6c2d7aff44ad5a9b75030b72d34ca5dbd5ad3ceb	2020-03-16 08:09:43 +01:00
Cyril Hrubis	ecc421e05b	sys/sysinfo: Respect boottime inside time namespace The sysinfo() syscall includes uptime in seconds but has no correction for time namespaces which makes it inconsistent with the /proc/uptime inside of a time namespace. Add the missing time namespace adjustment call. Signed-off-by: Cyril Hrubis <chrubis@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dmitry Safonov <dima@arista.com> Link: https://lkml.kernel.org/r/20200303150638.7329-1-chrubis@suse.cz	2020-03-03 19:34:32 +01:00
Greg Kroah-Hartman	7881aee544	Merge `39bed42de2` ("Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma") into android-mainline Baby steps in the 5.6-rc1 merge cycle to make things easier to review and debug. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I0fa183764fd1adbde44e8181f0b3df6cff4da18b	2020-02-03 15:34:37 +00:00
Mike Christie	8d19f1c8e1	prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim There are several storage drivers like dm-multipath, iscsi, tcmu-runner, amd nbd that have userspace components that can run in the IO path. For example, iscsi and nbd's userspace deamons may need to recreate a socket and/or send IO on it, and dm-multipath's daemon multipathd may need to send SG IO or read/write IO to figure out the state of paths and re-set them up. In the kernel these drivers have access to GFP_NOIO/GFP_NOFS and the memalloc_*_save/restore functions to control the allocation behavior, but for userspace we would end up hitting an allocation that ended up writing data back to the same device we are trying to allocate for. The device is then in a state of deadlock, because to execute IO the device needs to allocate memory, but to allocate memory the memory layers want execute IO to the device. Here is an example with nbd using a local userspace daemon that performs network IO to a remote server. We are using XFS on top of the nbd device, but it can happen with any FS or other modules layered on top of the nbd device that can write out data to free memory. Here a nbd daemon helper thread, msgr-worker-1, is performing a write/sendmsg on a socket to execute a request. This kicks off a reclaim operation which results in a WRITE to the nbd device and the nbd thread calling back into the mm layer. [ 1626.609191] msgr-worker-1 D 0 1026 1 0x00004000 [ 1626.609193] Call Trace: [ 1626.609195] ? __schedule+0x29b/0x630 [ 1626.609197] ? wait_for_completion+0xe0/0x170 [ 1626.609198] schedule+0x30/0xb0 [ 1626.609200] schedule_timeout+0x1f6/0x2f0 [ 1626.609202] ? blk_finish_plug+0x21/0x2e [ 1626.609204] ? _xfs_buf_ioapply+0x2e6/0x410 [ 1626.609206] ? wait_for_completion+0xe0/0x170 [ 1626.609208] wait_for_completion+0x108/0x170 [ 1626.609210] ? wake_up_q+0x70/0x70 [ 1626.609212] ? __xfs_buf_submit+0x12e/0x250 [ 1626.609214] ? xfs_bwrite+0x25/0x60 [ 1626.609215] xfs_buf_iowait+0x22/0xf0 [ 1626.609218] __xfs_buf_submit+0x12e/0x250 [ 1626.609220] xfs_bwrite+0x25/0x60 [ 1626.609222] xfs_reclaim_inode+0x2e8/0x310 [ 1626.609224] xfs_reclaim_inodes_ag+0x1b6/0x300 [ 1626.609227] xfs_reclaim_inodes_nr+0x31/0x40 [ 1626.609228] super_cache_scan+0x152/0x1a0 [ 1626.609231] do_shrink_slab+0x12c/0x2d0 [ 1626.609233] shrink_slab+0x9c/0x2a0 [ 1626.609235] shrink_node+0xd7/0x470 [ 1626.609237] do_try_to_free_pages+0xbf/0x380 [ 1626.609240] try_to_free_pages+0xd9/0x1f0 [ 1626.609245] __alloc_pages_slowpath+0x3a4/0xd30 [ 1626.609251] ? ___slab_alloc+0x238/0x560 [ 1626.609254] __alloc_pages_nodemask+0x30c/0x350 [ 1626.609259] skb_page_frag_refill+0x97/0xd0 [ 1626.609274] sk_page_frag_refill+0x1d/0x80 [ 1626.609279] tcp_sendmsg_locked+0x2bb/0xdd0 [ 1626.609304] tcp_sendmsg+0x27/0x40 [ 1626.609307] sock_sendmsg+0x54/0x60 [ 1626.609308] ___sys_sendmsg+0x29f/0x320 [ 1626.609313] ? sock_poll+0x66/0xb0 [ 1626.609318] ? ep_item_poll.isra.15+0x40/0xc0 [ 1626.609320] ? ep_send_events_proc+0xe6/0x230 [ 1626.609322] ? hrtimer_try_to_cancel+0x54/0xf0 [ 1626.609324] ? ep_read_events_proc+0xc0/0xc0 [ 1626.609326] ? _raw_write_unlock_irq+0xa/0x20 [ 1626.609327] ? ep_scan_ready_list.constprop.19+0x218/0x230 [ 1626.609329] ? __hrtimer_init+0xb0/0xb0 [ 1626.609331] ? _raw_spin_unlock_irq+0xa/0x20 [ 1626.609334] ? ep_poll+0x26c/0x4a0 [ 1626.609337] ? tcp_tsq_write.part.54+0xa0/0xa0 [ 1626.609339] ? release_sock+0x43/0x90 [ 1626.609341] ? _raw_spin_unlock_bh+0xa/0x20 [ 1626.609342] __sys_sendmsg+0x47/0x80 [ 1626.609347] do_syscall_64+0x5f/0x1c0 [ 1626.609349] ? prepare_exit_to_usermode+0x75/0xa0 [ 1626.609351] entry_SYSCALL_64_after_hwframe+0x44/0xa9 This patch adds a new prctl command that daemons can use after they have done their initial setup, and before they start to do allocations that are in the IO path. It sets the PF_MEMALLOC_NOIO and PF_LESS_THROTTLE flags so both userspace block and FS threads can use it to avoid the allocation recursion and try to prevent from being throttled while writing out data to free up memory. Signed-off-by: Mike Christie <mchristi@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Masato Suzuki <masato.suzuki@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20191112001900.9206-1-mchristi@redhat.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2020-01-28 10:09:51 +01:00
Greg Kroah-Hartman	d3a196a371	Merge 5.5-rc1 into android-mainline Linux 5.5-rc1 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6f952ebdd40746115165a2f99bab340482f5c237	2019-12-09 12:12:00 +01:00
Joe Perches	5e1aada08c	kernel/sys.c: avoid copying possible padding bytes in copy_to_user Initialization is not guaranteed to zero padding bytes so use an explicit memset instead to avoid leaking any kernel content in any possible padding bytes. Link: http://lkml.kernel.org/r/dfa331c00881d61c8ee51577a082d8bebd61805c.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Dan Carpenter <error27@gmail.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-12-04 19:44:12 -08:00
Arnd Bergmann	bdd565f817	y2038: rusage: use __kernel_old_timeval There are two 'struct timeval' fields in 'struct rusage'. Unfortunately the definition of timeval is now ambiguous when used in user space with a libc that has a 64-bit time_t, and this also changes the 'rusage' definition in user space in a way that is incompatible with the system call interface. While there is no good solution to avoid all ambiguity here, change the definition in the kernel headers to be compatible with the kernel ABI, using __kernel_old_timeval as an unambiguous base type. In previous discussions, there was also a plan to add a replacement for rusage based on 64-bit timestamps and nanosecond resolution, i.e. 'struct __kernel_timespec'. I have patches for that as well, if anyone thinks we should do that. Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2019-11-15 14:38:29 +01:00
Greg Kroah-Hartman	bfa0399bc8	Merge Linus's 5.4-rc1-prerelease branch into android-mainline This merges Linus's tree as of commit `b41dae061b` ("Merge tag 'xfs-5.4-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux") into android-mainline. This "early" merge makes it easier to test and handle merge conflicts instead of having to wait until the "end" of the merge window and handle all 10000+ commits at once. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6bebf55e5e2353f814e3c87f5033607b1ae5d812	2019-09-20 16:07:54 -07:00
Linus Torvalds	7f2444d38f	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core timer updates from Thomas Gleixner: "Timers and timekeeping updates: - A large overhaul of the posix CPU timer code which is a preparation for moving the CPU timer expiry out into task work so it can be properly accounted on the task/process. An update to the bogus permission checks will come later during the merge window as feedback was not complete before heading of for travel. - Switch the timerqueue code to use cached rbtrees and get rid of the homebrewn caching of the leftmost node. - Consolidate hrtimer_init() + hrtimer_init_sleeper() calls into a single function - Implement the separation of hrtimers to be forced to expire in hard interrupt context even when PREEMPT_RT is enabled and mark the affected timers accordingly. - Implement a mechanism for hrtimers and the timer wheel to protect RT against priority inversion and live lock issues when a (hr)timer which should be canceled is currently executing the callback. Instead of infinitely spinning, the task which tries to cancel the timer blocks on a per cpu base expiry lock which is held and released by the (hr)timer expiry code. - Enable the Hyper-V TSC page based sched_clock for Hyper-V guests resulting in faster access to timekeeping functions. - Updates to various clocksource/clockevent drivers and their device tree bindings. - The usual small improvements all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits) posix-cpu-timers: Fix permission check regression posix-cpu-timers: Always clear head pointer on dequeue hrtimer: Add a missing bracket and hide `migration_base' on !SMP posix-cpu-timers: Make expiry_active check actually work correctly posix-timers: Unbreak CONFIG_POSIX_TIMERS=n build tick: Mark sched_timer to expire in hard interrupt context hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARD x86/hyperv: Hide pv_ops access for CONFIG_PARAVIRT=n posix-cpu-timers: Utilize timerqueue for storage posix-cpu-timers: Move state tracking to struct posix_cputimers posix-cpu-timers: Deduplicate rlimit handling posix-cpu-timers: Remove pointless comparisons posix-cpu-timers: Get rid of 64bit divisions posix-cpu-timers: Consolidate timer expiry further posix-cpu-timers: Get rid of zero checks rlimit: Rewrite non-sensical RLIMIT_CPU comment posix-cpu-timers: Respect INFINITY for hard RTTIME limit posix-cpu-timers: Switch thread group sampling to array posix-cpu-timers: Restructure expiry array posix-cpu-timers: Remove cputime_expires ...	2019-09-17 12:35:15 -07:00
Linus Torvalds	22331f8952	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu-feature updates from Ingo Molnar: - Rework the Intel model names symbols/macros, which were decades of ad-hoc extensions and added random noise. It's now a coherent, easy to follow nomenclature. - Add new Intel CPU model IDs: - "Tiger Lake" desktop and mobile models - "Elkhart Lake" model ID - and the "Lightning Mountain" variant of Airmont, plus support code - Add the new AVX512_VP2INTERSECT instruction to cpufeatures - Remove Intel MPX user-visible APIs and the self-tests, because the toolchain (gcc) is not supporting it going forward. This is the first, lowest-risk phase of MPX removal. - Remove X86_FEATURE_MFENCE_RDTSC - Various smaller cleanups and fixes * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) x86/cpu: Update init data for new Airmont CPU model x86/cpu: Add new Airmont variant to Intel family x86/cpu: Add Elkhart Lake to Intel family x86/cpu: Add Tiger Lake to Intel family x86: Correct misc typos x86/intel: Add common OPTDIFFs x86/intel: Aggregate microserver naming x86/intel: Aggregate big core graphics naming x86/intel: Aggregate big core mobile naming x86/intel: Aggregate big core client naming x86/cpufeature: Explain the macro duplication x86/ftrace: Remove mcount() declaration x86/PCI: Remove superfluous returns from void functions x86/msr-index: Move AMD MSRs where they belong x86/cpu: Use constant definitions for CPU models lib: Remove redundant ftrace flag removal x86/crash: Remove unnecessary comparison x86/bitops: Use __builtin_constant_p() directly instead of IS_IMMEDIATE() x86: Remove X86_FEATURE_MFENCE_RDTSC x86/mpx: Remove MPX APIs ...	2019-09-16 18:47:53 -07:00
Thomas Gleixner	2bbdbdae05	posix-cpu-timers: Get rid of zero checks Deactivation of the expiry cache is done by setting all clock caches to 0. That requires to have a check for zero in all places which update the expiry cache: if (cache == 0 \|\| new < cache) cache = new; Use U64_MAX as the deactivated value, which allows to remove the zero checks when updating the cache and reduces it to the obvious check: if (new < cache) cache = new; This also removes the weird workaround in do_prlimit() which was required to convert a RLIMIT_CPU value of 0 (immediate expiry) to 1 because handing in 0 to the posix CPU timer code would have effectively disarmed it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.275086128@linutronix.de	2019-08-28 11:50:40 +02:00

1 2 3 4 5 ...

413 Commits