Changes in 5.10.37
Bluetooth: verify AMP hci_chan before amp_destroy
bluetooth: eliminate the potential race condition when removing the HCI controller
net/nfc: fix use-after-free llcp_sock_bind/connect
io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
Revert "USB: cdc-acm: fix rounding error in TIOCSSERIAL"
usb: roles: Call try_module_get() from usb_role_switch_find_by_fwnode()
tty: moxa: fix TIOCSSERIAL jiffies conversions
tty: amiserial: fix TIOCSSERIAL permission check
USB: serial: usb_wwan: fix TIOCSSERIAL jiffies conversions
staging: greybus: uart: fix TIOCSSERIAL jiffies conversions
USB: serial: ti_usb_3410_5052: fix TIOCSSERIAL permission check
staging: fwserial: fix TIOCSSERIAL jiffies conversions
tty: moxa: fix TIOCSSERIAL permission check
staging: fwserial: fix TIOCSSERIAL permission check
drm: bridge: fix LONTIUM use of mipi_dsi_() functions
usb: typec: tcpm: Address incorrect values of tcpm psy for fixed supply
usb: typec: tcpm: Address incorrect values of tcpm psy for pps supply
usb: typec: tcpm: update power supply once partner accepts
usb: xhci-mtk: remove or operator for setting schedule parameters
usb: xhci-mtk: improve bandwidth scheduling with TT
ASoC: samsung: tm2_wm5110: check of of_parse return value
ASoC: Intel: kbl_da7219_max98927: Fix kabylake_ssp_fixup function
ASoC: tlv320aic32x4: Register clocks before registering component
ASoC: tlv320aic32x4: Increase maximum register in regmap
MIPS: pci-mt7620: fix PLL lock check
MIPS: pci-rt2880: fix slot 0 configuration
FDDI: defxx: Bail out gracefully with unassigned PCI resource for CSR
PCI: Allow VPD access for QLogic ISP2722
KVM: x86: Defer the MMU unload to the normal path on an global INVPCID
PCI: xgene: Fix cfg resource mapping
PCI: keystone: Let AM65 use the pci_ops defined in pcie-designware-host.c
PM / devfreq: Unlock mutex and free devfreq struct in error path
soc/tegra: regulators: Fix locking up when voltage-spread is out of range
iio: inv_mpu6050: Fully validate gyro and accel scale writes
iio:accel:adis16201: Fix wrong axis assignment that prevents loading
iio:adc:ad7476: Fix remove handling
sc16is7xx: Defer probe if device read fails
phy: cadence: Sierra: Fix PHY power_on sequence
misc: lis3lv02d: Fix false-positive WARN on various HP models
phy: ti: j721e-wiz: Invoke wiz_init() before of_platform_device_create()
misc: vmw_vmci: explicitly initialize vmci_notify_bm_set_msg struct
misc: vmw_vmci: explicitly initialize vmci_datagram payload
selinux: add proper NULL termination to the secclass_map permissions
x86, sched: Treat Intel SNC topology as default, COD as exception
async_xor: increase src_offs when dropping destination page
md/bitmap: wait for external bitmap writes to complete during tear down
md-cluster: fix use-after-free issue when removing rdev
md: split mddev_find
md: factor out a mddev_find_locked helper from mddev_find
md: md_open returns -EBUSY when entering racing area
md: Fix missing unused status line of /proc/mdstat
mt76: mt7615: use ieee80211_free_txskb() in mt7615_tx_token_put()
ipw2x00: potential buffer overflow in libipw_wx_set_encodeext()
cfg80211: scan: drop entry from hidden_list on overflow
rtw88: Fix array overrun in rtw_get_tx_power_params()
mt76: fix potential DMA mapping leak
FDDI: defxx: Make MMIO the configuration default except for EISA
drm/i915/gvt: Fix virtual display setup for BXT/APL
drm/i915/gvt: Fix vfio_edid issue for BXT/APL
drm/qxl: use ttm bo priorities
drm/panfrost: Clear MMU irqs before handling the fault
drm/panfrost: Don't try to map pages that are already mapped
drm/radeon: fix copy of uninitialized variable back to userspace
drm/dp_mst: Revise broadcast msg lct & lcr
drm/dp_mst: Set CLEAR_PAYLOAD_ID_TABLE as broadcast
drm: bridge/panel: Cleanup connector on bridge detach
drm/amd/display: Reject non-zero src_y and src_x for video planes
drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2
ALSA: hda/realtek: Re-order ALC882 Acer quirk table entries
ALSA: hda/realtek: Re-order ALC882 Sony quirk table entries
ALSA: hda/realtek: Re-order ALC882 Clevo quirk table entries
ALSA: hda/realtek: Re-order ALC269 HP quirk table entries
ALSA: hda/realtek: Re-order ALC269 Acer quirk table entries
ALSA: hda/realtek: Re-order ALC269 Dell quirk table entries
ALSA: hda/realtek: Re-order ALC269 ASUS quirk table entries
ALSA: hda/realtek: Re-order ALC269 Sony quirk table entries
ALSA: hda/realtek: Re-order ALC269 Lenovo quirk table entries
ALSA: hda/realtek: Re-order remaining ALC269 quirk table entries
ALSA: hda/realtek: Re-order ALC662 quirk table entries
ALSA: hda/realtek: Remove redundant entry for ALC861 Haier/Uniwill devices
ALSA: hda/realtek: ALC285 Thinkpad jack pin quirk is unreachable
ALSA: hda/realtek: Fix speaker amp on HP Envy AiO 32
KVM: s390: VSIE: correctly handle MVPG when in VSIE
KVM: s390: split kvm_s390_logical_to_effective
KVM: s390: fix guarded storage control register handling
s390: fix detection of vector enhancements facility 1 vs. vector packed decimal facility
KVM: s390: VSIE: fix MVPG handling for prefixing and MSO
KVM: s390: split kvm_s390_real_to_abs
KVM: s390: extend kvm_s390_shadow_fault to return entry pointer
KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with 64-bit
KVM: x86: Remove emulator's broken checks on CR0/CR3/CR4 loads
KVM: nSVM: Set the shadow root level to the TDP level for nested NPT
KVM: SVM: Don't strip the C-bit from CR2 on #PF interception
KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created
KVM: SVM: Inject #GP on guest MSR_TSC_AUX accesses if RDTSCP unsupported
KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch
KVM: nVMX: Truncate bits 63:32 of VMCS field on nested check in !64-bit
KVM: nVMX: Truncate base/index GPR value on address calc in !64-bit
KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read
KVM: Destroy I/O bus devices on unregister failure _after_ sync'ing SRCU
KVM: Stop looking for coalesced MMIO zones if the bus is destroyed
KVM: arm64: Fully zero the vcpu state on reset
KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
Revert "drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit"
Revert "i3c master: fix missing destroy_workqueue() on error in i3c_master_register"
ovl: fix missing revert_creds() on error path
Revert "drm/qxl: do not run release if qxl failed to init"
usb: gadget: pch_udc: Revert d3cb25a121 completely
Revert "tools/power turbostat: adjust for temperature offset"
firmware: xilinx: Fix dereferencing freed memory
firmware: xilinx: Add a blank line after function declaration
firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE)
fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER
crypto: sun8i-ss - fix result memory leak on error path
memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]
ARM: dts: exynos: correct fuel gauge interrupt trigger level on GT-I9100
ARM: dts: exynos: correct fuel gauge interrupt trigger level on Midas family
ARM: dts: exynos: correct MUIC interrupt trigger level on Midas family
ARM: dts: exynos: correct PMIC interrupt trigger level on Midas family
ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid X/U3 family
ARM: dts: exynos: correct PMIC interrupt trigger level on SMDK5250
ARM: dts: exynos: correct PMIC interrupt trigger level on Snow
ARM: dts: s5pv210: correct fuel gauge interrupt trigger level on Fascinate family
ARM: dts: renesas: Add mmc aliases into R-Car Gen2 board dts files
arm64: dts: renesas: Add mmc aliases into board dts files
x86/platform/uv: Set section block size for hubless architectures
serial: stm32: fix code cleaning warnings and checks
serial: stm32: add "_usart" prefix in functions name
serial: stm32: fix probe and remove order for dma
serial: stm32: Use of_device_get_match_data()
serial: stm32: fix startup by enabling usart for reception
serial: stm32: fix incorrect characters on console
serial: stm32: fix TX and RX FIFO thresholds
serial: stm32: fix a deadlock condition with wakeup event
serial: stm32: fix wake-up flag handling
serial: stm32: fix a deadlock in set_termios
serial: stm32: fix tx dma completion, release channel
serial: stm32: call stm32_transmit_chars locked
serial: stm32: fix FIFO flush in startup and set_termios
serial: stm32: add FIFO flush when port is closed
serial: stm32: fix tx_empty condition
usb: typec: tcpci: Check ROLE_CONTROL while interpreting CC_STATUS
usb: typec: tps6598x: Fix return value check in tps6598x_probe()
usb: typec: stusb160x: fix return value check in stusb160x_probe()
regmap: set debugfs_name to NULL after it is freed
spi: rockchip: avoid objtool warning
mtd: rawnand: fsmc: Fix error code in fsmc_nand_probe()
mtd: rawnand: brcmnand: fix OOB R/W with Hamming ECC
mtd: Handle possible -EPROBE_DEFER from parse_mtd_partitions()
mtd: rawnand: qcom: Return actual error code instead of -ENODEV
mtd: don't lock when recursively deleting partitions
mtd: maps: fix error return code of physmap_flash_remove()
ARM: dts: stm32: fix usart 2 & 3 pinconf to wake up with flow control
arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity
arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical timer
arm64: dts: qcom: sdm845: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: sm8150: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: sm8250: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: db845c: fix correct powerdown pin for WSA881x
crypto: sun8i-ss - Fix memory leak of object d when dma_iv fails to map
spi: stm32: drop devres version of spi_register_master
regulator: bd9576: Fix return from bd957x_probe()
arm64: dts: renesas: r8a77980: Fix vin4-7 endpoint binding
spi: stm32: Fix use-after-free on unbind
x86/microcode: Check for offline CPUs before requesting new microcode
devtmpfs: fix placement of complete() call
usb: gadget: pch_udc: Replace cpu_to_le32() by lower_32_bits()
usb: gadget: pch_udc: Check if driver is present before calling ->setup()
usb: gadget: pch_udc: Check for DMA mapping error
usb: gadget: pch_udc: Initialize device pointer before use
usb: gadget: pch_udc: Provide a GPIO line used on Intel Minnowboard (v1)
crypto: ccp - fix command queuing to TEE ring buffer
crypto: qat - don't release uninitialized resources
crypto: qat - ADF_STATUS_PF_RUNNING should be set after adf_dev_init
fotg210-udc: Fix DMA on EP0 for length > max packet size
fotg210-udc: Fix EP0 IN requests bigger than two packets
fotg210-udc: Remove a dubious condition leading to fotg210_done
fotg210-udc: Mask GRP2 interrupts we don't handle
fotg210-udc: Don't DMA more than the buffer can take
fotg210-udc: Complete OUT requests on short packets
usb: gadget: s3c: Fix incorrect resources releasing
usb: gadget: s3c: Fix the error handling path in 's3c2410_udc_probe()'
dt-bindings: serial: stm32: Use 'type: object' instead of false for 'additionalProperties'
mtd: require write permissions for locking and badblock ioctls
arm64: dts: renesas: r8a779a0: Fix PMU interrupt
bus: qcom: Put child node before return
soundwire: bus: Fix device found flag correctly
phy: ti: j721e-wiz: Delete "clk_div_sel" clk provider during cleanup
phy: marvell: ARMADA375_USBCLUSTER_PHY should not default to y, unconditionally
arm64: dts: mediatek: fix reset GPIO level on pumpkin
NFSD: Fix sparse warning in nfs4proc.c
NFSv4.2: fix copy stateid copying for the async copy
crypto: poly1305 - fix poly1305_core_setkey() declaration
crypto: qat - fix error path in adf_isr_resource_alloc()
usb: gadget: aspeed: fix dma map failure
USB: gadget: udc: fix wrong pointer passed to IS_ERR() and PTR_ERR()
drivers: nvmem: Fix voltage settings for QTI qfprom-efuse
driver core: platform: Declare early_platform_cleanup() prototype
memory: pl353: fix mask of ECC page_size config register
soundwire: stream: fix memory leak in stream config error path
m68k: mvme147,mvme16x: Don't wipe PCC timer config bits
firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool
firmware: qcom_scm: Reduce locking section for __get_convention()
firmware: qcom_scm: Workaround lack of "is available" call on SC7180
iio: adc: Kconfig: make AD9467 depend on ADI_AXI_ADC symbol
mtd: rawnand: gpmi: Fix a double free in gpmi_nand_init
irqchip/gic-v3: Fix OF_BAD_ADDR error handling
staging: comedi: tests: ni_routes_test: Fix compilation error
staging: rtl8192u: Fix potential infinite loop
staging: fwserial: fix TIOCSSERIAL implementation
staging: fwserial: fix TIOCGSERIAL implementation
staging: greybus: uart: fix unprivileged TIOCCSERIAL
soc: qcom: pdr: Fix error return code in pdr_register_listener
PM / devfreq: Use more accurate returned new_freq as resume_freq
clocksource/drivers/timer-ti-dm: Fix posted mode status check order
clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped
clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe()
spi: Fix use-after-free with devm_spi_alloc_*
spi: fsl: add missing iounmap() on error in of_fsl_spi_probe()
soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
soc: qcom: mdt_loader: Detect truncated read of segments
PM: runtime: Replace inline function pm_runtime_callbacks_present()
cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration
ACPI: CPPC: Replace cppc_attr with kobj_attribute
crypto: allwinner - add missing CRYPTO_ prefix
crypto: sun8i-ss - Fix memory leak of pad
crypto: sa2ul - Fix memory leak of rxd
crypto: qat - Fix a double free in adf_create_ring
cpufreq: armada-37xx: Fix setting TBG parent for load levels
clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
cpufreq: armada-37xx: Fix the AVS value for load L1
clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
cpufreq: armada-37xx: Fix driver cleanup when registration failed
cpufreq: armada-37xx: Fix determining base CPU frequency
spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible
spi: spi-zynqmp-gqspi: add mutex locking for exec_op
spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality
spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op
spi: fsl-lpspi: Fix PM reference leak in lpspi_prepare_xfer_hardware()
usb: gadget: r8a66597: Add missing null check on return from platform_get_resource
USB: cdc-acm: fix unprivileged TIOCCSERIAL
USB: cdc-acm: fix TIOCGSERIAL implementation
tty: actually undefine superseded ASYNC flags
tty: fix return value for unsupported ioctls
tty: Remove dead termiox code
tty: fix return value for unsupported termiox ioctls
serial: core: return early on unsupported ioctls
firmware: qcom-scm: Fix QCOM_SCM configuration
node: fix device cleanups in error handling code
crypto: chelsio - Read rxchannel-id from firmware
usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
m68k: Add missing mmap_read_lock() to sys_cacheflush()
spi: spi-zynqmp-gqspi: Fix missing unlock on error in zynqmp_qspi_exec_op()
memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
memory: samsung: exynos5422-dmc: handle clk_set_parent() failure
security: keys: trusted: fix TPM2 authorizations
platform/x86: pmc_atom: Match all Beckhoff Automation baytrail boards with critclk_systems DMI table
ARM: dts: aspeed: Rainier: Fix humidity sensor bus address
Drivers: hv: vmbus: Use after free in __vmbus_open()
spi: spi-zynqmp-gqspi: fix clk_enable/disable imbalance issue
spi: spi-zynqmp-gqspi: fix hang issue when suspend/resume
spi: spi-zynqmp-gqspi: fix use-after-free in zynqmp_qspi_exec_op
spi: spi-zynqmp-gqspi: return -ENOMEM if dma_map_single fails
x86/platform/uv: Fix !KEXEC build failure
hwmon: (pmbus/pxe1610) don't bail out when not all pages are active
Drivers: hv: vmbus: Increase wait time for VMbus unload
PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check
usb: dwc2: Fix host mode hibernation exit with remote wakeup flow.
usb: dwc2: Fix hibernation between host and device modes.
ttyprintk: Add TTY hangup callback.
serial: omap: don't disable rs485 if rts gpio is missing
serial: omap: fix rs485 half-duplex filtering
xen-blkback: fix compatibility bug with single page rings
soc: aspeed: fix a ternary sign expansion bug
drm/tilcdc: send vblank event when disabling crtc
drm/stm: Fix bus_flags handling
drm/amd/display: Fix off by one in hdmi_14_process_transaction()
drm/mcde/panel: Inverse misunderstood flag
sched/fair: Fix shift-out-of-bounds in load_balance()
afs: Fix updating of i_mode due to 3rd party change
rcu: Remove spurious instrumentation_end() in rcu_nmi_enter()
media: vivid: fix assignment of dev->fbuf_out_flags
media: saa7134: use sg_dma_len when building pgtable
media: saa7146: use sg_dma_len when building pgtable
media: omap4iss: return error code when omap4iss_get() failed
media: rkisp1: rsz: crash fix when setting src format
media: aspeed: fix clock handling logic
drm/probe-helper: Check epoch counter in output_poll_execute()
media: venus: core: Fix some resource leaks in the error path of 'venus_probe()'
media: platform: sunxi: sun6i-csi: fix error return code of sun6i_video_start_streaming()
media: m88ds3103: fix return value check in m88ds3103_probe()
media: docs: Fix data organization of MEDIA_BUS_FMT_RGB101010_1X30
media: [next] staging: media: atomisp: fix memory leak of object flash
media: atomisp: Fixed error handling path
media: m88rs6000t: avoid potential out-of-bounds reads on arrays
media: atomisp: Fix use after free in atomisp_alloc_css_stat_bufs()
drm/amdkfd: fix build error with AMD_IOMMU_V2=m
of: overlay: fix for_each_child.cocci warnings
x86/kprobes: Fix to check non boostable prefixes correctly
selftests: fix prepending $(OUTPUT) to $(TEST_PROGS)
pata_arasan_cf: fix IRQ check
pata_ipx4xx_cf: fix IRQ check
sata_mv: add IRQ checks
ata: libahci_platform: fix IRQ check
seccomp: Fix CONFIG tests for Seccomp_filters
nvme-tcp: block BH in sk state_change sk callback
nvmet-tcp: fix incorrect locking in state_change sk callback
clk: imx: Fix reparenting of UARTs not associated with stdout
power: supply: bq25980: Move props from battery node
nvme: retrigger ANA log update if group descriptor isn't found
media: i2c: imx219: Move out locking/unlocking of vflip and hflip controls from imx219_set_stream
media: i2c: imx219: Balance runtime PM use-count
media: v4l2-ctrls.c: fix race condition in hdl->requests list
vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
vfio/pci: Move VGA and VF initialization to functions
vfio/pci: Re-order vfio_pci_probe()
vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback
clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable
drm: xlnx: zynqmp: fix a memset in zynqmp_dp_train()
clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE
drm/amd/display: use GFP_ATOMIC in dcn20_resource_construct
drm/radeon: Fix a missing check bug in radeon_dp_mst_detect()
clk: uniphier: Fix potential infinite loop
scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()
scsi: pm80xx: Fix potential infinite loop
scsi: ufs: ufshcd-pltfrm: Fix deferred probing
scsi: hisi_sas: Fix IRQ checks
scsi: jazz_esp: Add IRQ check
scsi: sun3x_esp: Add IRQ check
scsi: sni_53c710: Add IRQ check
scsi: ibmvfc: Fix invalid state machine BUG_ON()
mailbox: sprd: Introduce refcnt when clients requests/free channels
mfd: stm32-timers: Avoid clearing auto reload register
nvmet-tcp: fix a segmentation fault during io parsing error
nvme-pci: don't simple map sgl when sgls are disabled
media: cedrus: Fix H265 status definitions
HSI: core: fix resource leaks in hsi_add_client_from_dt()
x86/events/amd/iommu: Fix sysfs type mismatch
perf/amd/uncore: Fix sysfs type mismatch
io_uring: fix overflows checks in provide buffers
sched/debug: Fix cgroup_path[] serialization
drivers/block/null_blk/main: Fix a double free in null_init.
xsk: Respect device's headroom and tailroom on generic xmit path
HID: plantronics: Workaround for double volume key presses
perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of printed chars
ASoC: Intel: boards: sof-wm8804: add check for PLL setting
ASoC: Intel: Skylake: Compile when any configuration is selected
RDMA/mlx5: Fix mlx5 rates to IB rates map
wilc1000: write value to WILC_INTR2_ENABLE register
KVM: x86/mmu: Retry page faults that hit an invalid memslot
Bluetooth: avoid deadlock between hci_dev->lock and socket lock
net: lapbether: Prevent racing when checking whether the netif is running
libbpf: Add explicit padding to bpf_xdp_set_link_opts
bpftool: Fix maybe-uninitialized warnings
iommu: Check dev->iommu in iommu_dev_xxx functions
iommu/vt-d: Reject unsupported page request modes
selftests/bpf: Re-generate vmlinux.h and BPF skeletons if bpftool changed
libbpf: Add explicit padding to btf_dump_emit_type_decl_opts
powerpc/fadump: Mark fadump_calculate_reserve_size as __init
powerpc/prom: Mark identical_pvr_fixup as __init
MIPS: fix local_irq_{disable,enable} in asmmacro.h
ima: Fix the error code for restoring the PCR value
inet: use bigger hash table for IP ID generation
pinctrl: pinctrl-single: remove unused parameter
pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is not zero
MIPS: loongson64: fix bug when PAGE_SIZE > 16KB
ASoC: wm8960: Remove bitclk relax condition in wm8960_configure_sysclk
iommu/arm-smmu-v3: add bit field SFM into GERROR_ERR_MASK
RDMA/mlx5: Fix drop packet rule in egress table
IB/isert: Fix a use after free in isert_connect_request
powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration
MIPS/bpf: Enable bpf_probe_read{, str}() on MIPS again
gpio: guard gpiochip_irqchip_add_domain() with GPIOLIB_IRQCHIP
ALSA: core: remove redundant spin_lock pair in snd_card_disconnect
net: phy: lan87xx: fix access to wrong register of LAN87xx
udp: never accept GSO_FRAGLIST packets
powerpc/pseries: Only register vio drivers if vio bus exists
net/tipc: fix missing destroy_workqueue() on error in tipc_crypto_start()
bug: Remove redundant condition check in report_bug
RDMA/core: Fix corrupted SL on passive side
nfc: pn533: prevent potential memory corruption
net: hns3: Limiting the scope of vector_ring_chain variable
mips: bmips: fix syscon-reboot nodes
iommu/vt-d: Don't set then clear private data in prq_event_thread()
iommu: Fix a boundary issue to avoid performance drop
iommu/vt-d: Report right snoop capability when using FL for IOVA
iommu/vt-d: Report the right page fault address
iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
iommu/vt-d: Remove WO permissions on second-level paging entries
iommu/vt-d: Invalidate PASID cache when root/context entry changed
ALSA: usb-audio: Add error checks for usb_driver_claim_interface() calls
HID: lenovo: Use brightness_set_blocking callback for setting LEDs brightness
HID: lenovo: Fix lenovo_led_set_tp10ubkbd() error handling
HID: lenovo: Check hid_get_drvdata() returns non NULL in lenovo_event()
HID: lenovo: Map mic-mute button to KEY_F20 instead of KEY_MICMUTE
KVM: arm64: Initialize VCPU mdcr_el2 before loading it
ASoC: simple-card: fix possible uninitialized single_cpu local variable
liquidio: Fix unintented sign extension of a left shift of a u16
IB/hfi1: Use kzalloc() for mmu_rb_handler allocation
powerpc/64s: Fix pte update for kernel memory on radix
powerpc/perf: Fix PMU constraint check for EBB events
powerpc: iommu: fix build when neither PCI or IBMVIO is set
mac80211: bail out if cipher schemes are invalid
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
xfs: fix return of uninitialized value in variable error
rtw88: Fix an error code in rtw_debugfs_set_rsvd_page()
mt7601u: fix always true expression
mt76: mt7615: fix tx skb dma unmap
mt76: mt7915: fix tx skb dma unmap
mt76: mt7915: fix aggr len debugfs node
mt76: mt7615: fix mib stats counter reporting to mac80211
mt76: mt7915: fix mib stats counter reporting to mac80211
mt76: mt7663s: make all of packets 4-bytes aligned in sdio tx aggregation
mt76: mt7663s: fix the possible device hang in high traffic
KVM: PPC: Book3S HV P9: Restore host CTRL SPR after guest exit
ovl: invalidate readdir cache on changes to dir with origin
RDMA/qedr: Fix error return code in qedr_iw_connect()
IB/hfi1: Fix error return code in parse_platform_config()
RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()
cxgb4: Fix unintentional sign extension issues
net: thunderx: Fix unintentional sign extension issue
RDMA/srpt: Fix error return code in srpt_cm_req_recv()
RDMA/rtrs-clt: destroy sysfs after removing session from active list
i2c: cadence: fix reference leak when pm_runtime_get_sync fails
i2c: img-scb: fix reference leak when pm_runtime_get_sync fails
i2c: imx-lpi2c: fix reference leak when pm_runtime_get_sync fails
i2c: imx: fix reference leak when pm_runtime_get_sync fails
i2c: omap: fix reference leak when pm_runtime_get_sync fails
i2c: sprd: fix reference leak when pm_runtime_get_sync fails
i2c: stm32f7: fix reference leak when pm_runtime_get_sync fails
i2c: xiic: fix reference leak when pm_runtime_get_sync fails
i2c: cadence: add IRQ check
i2c: emev2: add IRQ check
i2c: jz4780: add IRQ check
i2c: mlxbf: add IRQ check
i2c: rcar: make sure irq is not threaded on Gen2 and earlier
i2c: rcar: protect against supurious interrupts on V3U
i2c: rcar: add IRQ check
i2c: sh7760: add IRQ check
powerpc/xive: Drop check on irq_data in xive_core_debug_show()
powerpc/xive: Fix xmon command "dxi"
ASoC: ak5558: correct reset polarity
net/mlx5: Fix bit-wise and with zero
net/packet: make packet_fanout.arr size configurable up to 64K
net/packet: remove data races in fanout operations
drm/i915/gvt: Fix error code in intel_gvt_init_device()
iommu/amd: Put newline after closing bracket in warning
perf beauty: Fix fsconfig generator
drm/amd/pm: fix error code in smu_set_power_limit()
MIPS: pci-legacy: stop using of_pci_range_to_resource
powerpc/pseries: extract host bridge from pci_bus prior to bus removal
powerpc/smp: Reintroduce cpu_core_mask
KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid
rtlwifi: 8821ae: upgrade PHY and RF parameters
wlcore: fix overlapping snprintf arguments in debugfs
i2c: sh7760: fix IRQ error path
i2c: mediatek: Fix wrong dma sync flag
mwl8k: Fix a double Free in mwl8k_probe_hw
netfilter: nft_payload: fix C-VLAN offload support
netfilter: nftables_offload: VLAN id needs host byteorder in flow dissector
netfilter: nftables_offload: special ethertype handling for VLAN
vsock/vmci: log once the failed queue pair allocation
libbpf: Initialize the bpf_seq_printf parameters array field by field
net: ethernet: ixp4xx: Set the DMA masks explicitly
gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check
RDMA/cxgb4: add missing qpid increment
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
ALSA: usb: midi: don't return -ENOMEM when usb_urb_ep_type_check fails
sfc: ef10: fix TX queue lookup in TX event handling
vsock/virtio: free queued packets when closing socket
net: marvell: prestera: fix port event handling on init
net: davinci_emac: Fix incorrect masking of tx and rx error channel
mt76: mt7615: fix memleak when mt7615_unregister_device()
crypto: ccp: Detect and reject "invalid" addresses destined for PSP
nfp: devlink: initialize the devlink port attribute "lanes"
net: stmmac: fix TSO and TBS feature enabling during driver open
net: renesas: ravb: Fix a stuck issue when a lot of frames are received
net: phy: intel-xway: enable integrated led functions
RDMA/rxe: Fix a bug in rxe_fill_ip_info()
RDMA/core: Add CM to restrack after successful attachment to a device
powerpc/64: Fix the definition of the fixmap area
ath9k: Fix error check in ath9k_hw_read_revisions() for PCI devices
ath10k: Fix a use after free in ath10k_htc_send_bundle
ath10k: Fix ath10k_wmi_tlv_op_pull_peer_stats_info() unlock without lock
wlcore: Fix buffer overrun by snprintf due to incorrect buffer size
powerpc/perf: Fix the threshold event selection for memory events in power10
powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')
net: phy: marvell: fix m88e1011_set_downshift
net: phy: marvell: fix m88e1111_set_downshift
net: enetc: fix link error again
bnxt_en: fix ternary sign extension bug in bnxt_show_temp()
ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb
selftests: net: mirror_gre_vlan_bridge_1q: Make an FDB entry static
selftests: mlxsw: Remove a redundant if statement in tc_flower_scale test
bnxt_en: Fix RX consumer index logic in the error path.
KVM: VMX: Intercept FS/GS_BASE MSR accesses for 32-bit KVM
net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
selftests/bpf: Fix field existence CO-RE reloc tests
selftests/bpf: Fix core_reloc test runner
bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds
RDMA/siw: Fix a use after free in siw_alloc_mr
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
net: bridge: mcast: fix broken length + header check for MRDv6 Adv.
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
perf tools: Change fields type in perf_record_time_conv
perf jit: Let convert_timestamp() to be backwards-compatible
perf session: Add swap operation for event TIME_CONV
ia64: fix EFI_DEBUG build
kfifo: fix ternary sign extension bugs
mm/sl?b.c: remove ctor argument from kmem_cache_flags
mm: memcontrol: slab: fix obtain a reference to a freeing memcg
mm/sparse: add the missing sparse_buffer_fini() in error branch
mm/memory-failure: unnecessary amount of unmapping
afs: Fix speculative status fetches
bpf: Fix alu32 const subreg bound tracking on bitwise operations
bpf, ringbuf: Deny reserve of buffers larger than ringbuf
bpf: Prevent writable memory-mapping of read-only ringbuf pages
arm64: Remove arm64_dma32_phys_limit and its uses
net: Only allow init netns to set default tcp cong to a restricted algo
smp: Fix smp_call_function_single_async prototype
Revert "net/sctp: fix race condition in sctp_destroy_sock"
sctp: delay auto_asconf init until binding the first addr
Linux 5.10.37
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5bee89c285d9dd72de967b0e70d96951ae4e06ae
[ Upstream commit ad789f84c9a145f8a18744c0387cec22ec51651e ]
The handling of sysrq key can be activated by echoing the key to
/proc/sysrq-trigger or via the magic key sequence typed into a terminal
that is connected to the system in some way (serial, USB or other mean).
In the former case, the handling is done in a user context. In the
latter case, it is likely to be in an interrupt context.
Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is
taken with interrupt disabled for the whole duration of the calls to
print_*_stats() and print_rq() which could last for the quite some time
if the information dump happens on the serial console.
If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic
depending on the actually serial console implementation of the
system.
The purpose of sched_debug_lock is to serialize the use of the global
cgroup_path[] buffer in print_cpu(). The rests of the printk calls don't
need serialization from sched_debug_lock.
Calling printk() with interrupt disabled can still be problematic if
multiple instances are running. Allocating a stack buffer of PATH_MAX
bytes is not feasible because of the limited size of the kernel stack.
The solution implemented in this patch is to allow only one caller at a
time to use the full size group_path[], while other simultaneous callers
will have to use shorter stack buffers with the possibility of path
name truncation. A "..." suffix will be printed if truncation may have
happened. The cgroup path name is provided for informational purpose
only, so occasional path name truncation should not be a big problem.
Fixes: efe25c2c7b ("sched: Reinstate group names in /proc/sched_debug")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210415195426.6677-1-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Export sched_feat_keys to check Sched feature is enabled or not
from vendor modules.
Bug: 173559623
Change-Id: Id03483149f39bfc3e3a18ea56736a84d824a53f7
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Reading /proc/sys/kernel/sched_domain/cpu*/domain0/flags mutliple times
with small reads causes oopses with slub corruption issues because the kfree is
free'ing an offset from a previous allocation. Fix this by adding in a new
pointer 'buf' for the allocation and kfree and use the temporary pointer tmp
to handle memory copies of the buf offsets.
Fixes: 5b9f8ff7b3 ("sched/debug: Output SD flag names rather than their values")
Reported-by: Jeff Bastian <jbastian@redhat.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20201029151103.373410-1-colin.king@canonical.com
The last sd_flag_debug shuffle inadvertently moved its definition within
an #ifdef CONFIG_SYSCTL region. While CONFIG_SYSCTL is indeed required to
produce the sched domain ctl interface (which uses sd_flag_debug to output
flag names), it isn't required to run any assertion on the sched_domain
hierarchy itself.
Move the definition of sd_flag_debug to a CONFIG_SCHED_DEBUG region of
topology.c.
Now at long last we have:
- sd_flag_debug declared in include/linux/sched/topology.h iff
CONFIG_SCHED_DEBUG=y
- sd_flag_debug defined in kernel/sched/topology.c, conditioned by:
- CONFIG_SCHED_DEBUG, with an explicit #ifdef block
- CONFIG_SMP, as a requirement to compile topology.c
With this change, all symbols pertaining to SD flag metadata (with the
exception of __SD_FLAG_CNT) are now defined exclusively within topology.c
Fixes: 8fca9494d4 ("sched/topology: Move sd_flag_debug out of linux/sched/topology.h")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200908184956.23369-1-valentin.schneider@arm.com
Decoding the output of /proc/sys/kernel/sched_domain/cpu*/domain*/flags has
always been somewhat annoying, as one needs to go fetch the bit -> name
mapping from the source code itself. This encoding can be saved in a script
somewhere, but that isn't safe from flags being added, removed or even
shuffled around.
What matters for debugging purposes is to get *which* flags are set in a
given domain, their associated value is pretty much meaningless.
Make the sd flags debug file output flag names.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200817113003.20802-7-valentin.schneider@arm.com
Writing to the sysctl of a sched_domain->flags directly updates the value of
the field, and goes nowhere near update_top_cache_domain(). This means that
the cached domain pointers can end up containing stale data (e.g. the
domain pointed to doesn't have the relevant flag set anymore).
Explicit domain walks that check for flags will be affected by
the write, but this won't be in sync with the cached pointers which will
still point to the domains that were cached at the last sched_domain
build.
In other words, writing to this interface is playing a dangerous game. It
could be made to trigger an update of the cached sched_domain pointers when
written to, but this does not seem to be worth the trouble. Make it
read-only.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200415210512.805-3-valentin.schneider@arm.com
Now that runnable_load_avg has been removed, we can replace it by a new
signal that will highlight the runnable pressure on a cfs_rq. This signal
track the waiting time of tasks on rq and can help to better define the
state of rqs.
At now, only util_avg is used to define the state of a rq:
A rq with more that around 80% of utilization and more than 1 tasks is
considered as overloaded.
But the util_avg signal of a rq can become temporaly low after that a task
migrated onto another rq which can bias the classification of the rq.
When tasks compete for the same rq, their runnable average signal will be
higher than util_avg as it will include the waiting time and we can use
this signal to better classify cfs_rqs.
The new runnable_avg will track the runnable time of a task which simply
adds the waiting time to the running time. The runnable _avg of cfs_rq
will be the /Sum of se's runnable_avg and the runnable_avg of group entity
will follow the one of the rq similarly to util_avg.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: "Dietmar Eggemann <dietmar.eggemann@arm.com>"
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Link: https://lore.kernel.org/r/20200224095223.13361-9-mgorman@techsingularity.net
Lengthy output of sysrq-t may take a lot of time on slow serial console
with lots of processes and CPUs.
So we need to reset NMI-watchdog to avoid spurious lockup messages, and
we also reset softlockup watchdogs on all other CPUs since another CPU
might be blocked waiting for us to process an IPI or stop_machine.
Add to sysrq_sched_debug_show() as what we did in show_state_filter().
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20191226085224.48942-1-liwei391@huawei.com
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
register_sched_domain_sysctl() copies the cpu_possible_mask into
sd_sysctl_cpus, but only if sd_sysctl_cpus hasn't already been
allocated (ie, CONFIG_CPUMASK_OFFSTACK is set). However, when
CONFIG_CPUMASK_OFFSTACK is not set, sd_sysctl_cpus is left
uninitialized (all zeroes) and the kernel may fail to initialize
sched_domain sysctl entries for all possible CPUs.
This is visible to the user if the kernel is booted with maxcpus=n, or
if ACPI tables have been modified to leave CPUs offline, and then
checking for missing /proc/sys/kernel/sched_domain/cpu* entries.
Fix this by separating the allocation and initialization, and adding a
flag to initialize the possible CPU entries while system booting only.
Tested-by: Syuuichirou Ishii <ishii.shuuichir@jp.fujitsu.com>
Tested-by: Tarumizu, Kohei <tarumizu.kohei@jp.fujitsu.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190129151245.5073-1-msys.mizuma@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".
The jump label is controlled by HAVE_JUMP_LABEL, which is defined
like this:
#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
# define HAVE_JUMP_LABEL
#endif
We can improve this by testing 'asm goto' support in Kconfig, then
make JUMP_LABEL depend on CC_HAS_ASM_GOTO.
Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
match to the real kernel capability.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Pull x86 timer updates from Thomas Gleixner:
"Early TSC based time stamping to allow better boot time analysis.
This comes with a general cleanup of the TSC calibration code which
grew warts and duct taping over the years and removes 250 lines of
code. Initiated and mostly implemented by Pavel with help from various
folks"
* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
x86/kvmclock: Mark kvm_get_preset_lpj() as __init
x86/tsc: Consolidate init code
sched/clock: Disable interrupts when calling generic_sched_clock_init()
timekeeping: Prevent false warning when persistent clock is not available
sched/clock: Close a hole in sched_clock_init()
x86/tsc: Make use of tsc_calibrate_cpu_early()
x86/tsc: Split native_calibrate_cpu() into early and late parts
sched/clock: Use static key for sched_clock_running
sched/clock: Enable sched clock early
sched/clock: Move sched clock initialization and merge with generic clock
x86/tsc: Use TSC as sched clock early
x86/tsc: Initialize cyc2ns when tsc frequency is determined
x86/tsc: Calibrate tsc only once
ARM/time: Remove read_boot_clock64()
s390/time: Remove read_boot_clock64()
timekeeping: Default boot time offset to local_clock()
timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()
s390/time: Add read_persistent_wall_and_boot_offset()
x86/xen/time: Output xen sched_clock time from 0
x86/xen/time: Initialize pv xen time in init_hypervisor_platform()
...
Variants of proc_create{,_data} that directly take a struct seq_operations
argument and drastically reduces the boilerplate code in the callers.
All trivial callers converted over.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull scheduler updates from Ingo Molnar:
"The main scheduler changes in this cycle were:
- NUMA balancing improvements (Mel Gorman)
- Further load tracking improvements (Patrick Bellasi)
- Various NOHZ balancing cleanups and optimizations (Peter Zijlstra)
- Improve blocked load handling, in particular we can now reduce and
eventually stop periodic load updates on 'very idle' CPUs. (Vincent
Guittot)
- On isolated CPUs offload the final 1Hz scheduler tick as well, plus
related cleanups and reorganization. (Frederic Weisbecker)
- Core scheduler code cleanups (Ingo Molnar)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
sched/core: Update preempt_notifier_key to modern API
sched/cpufreq: Rate limits for SCHED_DEADLINE
sched/fair: Update util_est only on util_avg updates
sched/cpufreq/schedutil: Use util_est for OPP selection
sched/fair: Use util_est in LB and WU paths
sched/fair: Add util_est on top of PELT
sched/core: Remove TASK_ALL
sched/completions: Use bool in try_wait_for_completion()
sched/fair: Update blocked load when newly idle
sched/fair: Move idle_balance()
sched/nohz: Merge CONFIG_NO_HZ_COMMON blocks
sched/fair: Move rebalance_domains()
sched/nohz: Optimize nohz_idle_balance()
sched/fair: Reduce the periodic update duration
sched/nohz: Stop NOHZ stats when decayed
sched/cpufreq: Provide migration hint
sched/nohz: Clean up nohz enter/exit
sched/fair: Update blocked load from NEWIDLE
sched/fair: Add NOHZ stats balancing
sched/fair: Restructure nohz_balance_kick()
...
Scheduler debug stats include newlines that display out of alignment
when prefixed by timestamps. For example, the dmesg utility:
% echo t > /proc/sysrq-trigger
% dmesg
...
[ 83.124251]
runnable tasks:
S task PID tree-key switches prio wait-time
sum-exec sum-sleep
-----------------------------------------------------------------------------------------------------------
At the same time, some syslog utilities (like rsyslog by default) don't
like the additional newlines control characters, saving lines like this
to /var/log/messages:
Mar 16 16:02:29 localhost kernel: #012runnable tasks:#012 S task PID tree-key ...
^^^^ ^^^^
Clean these up by moving newline characters to their own SEQ_printf
invocation. This leaves the /proc/sched_debug unchanged, but brings the
entire output into alignment when prefixed:
% echo t > /proc/sysrq-trigger
% dmesg
...
[ 62.410368] runnable tasks:
[ 62.410368] S task PID tree-key switches prio wait-time sum-exec sum-sleep
[ 62.410369] -----------------------------------------------------------------------------------------------------------
[ 62.410369] I kworker/u12:0 5 1932.215593 332 120 0.000000 3.621252 0.000000 0 0 /
and no escaped control characters from rsyslog in /var/log/messages:
Mar 16 16:15:06 localhost kernel: runnable tasks:
Mar 16 16:15:06 localhost kernel: S task PID tree-key ...
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1521484555-8620-3-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When the SEQ_printf() macro prints to the console, it runs a simple
printk() without KERN_CONT "continued" line printing. The result of
this is oddly wrapped task info, for example:
% echo t > /proc/sysrq-trigger
% dmesg
...
runnable tasks:
...
[ 29.608611] I
[ 29.608613] rcu_sched 8 3252.013846 4087 120
[ 29.608614] 0.000000 29.090111 0.000000
[ 29.608615] 0 0
[ 29.608616] /
Modify SEQ_printf to use pr_cont() for expected one-line results:
% echo t > /proc/sysrq-trigger
% dmesg
...
runnable tasks:
...
[ 106.716329] S cpuhp/5 37 2006.315026 14 120 0.000000 0.496893 0.000000 0 0 /
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1521484555-8620-2-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The util_avg signal computed by PELT is too variable for some use-cases.
For example, a big task waking up after a long sleep period will have its
utilization almost completely decayed. This introduces some latency before
schedutil will be able to pick the best frequency to run a task.
The same issue can affect task placement. Indeed, since the task
utilization is already decayed at wakeup, when the task is enqueued in a
CPU, this can result in a CPU running a big task as being temporarily
represented as being almost empty. This leads to a race condition where
other tasks can be potentially allocated on a CPU which just started to run
a big task which slept for a relatively long period.
Moreover, the PELT utilization of a task can be updated every [ms], thus
making it a continuously changing value for certain longer running
tasks. This means that the instantaneous PELT utilization of a RUNNING
task is not really meaningful to properly support scheduler decisions.
For all these reasons, a more stable signal can do a better job of
representing the expected/estimated utilization of a task/cfs_rq.
Such a signal can be easily created on top of PELT by still using it as
an estimator which produces values to be aggregated on meaningful
events.
This patch adds a simple implementation of util_est, a new signal built on
top of PELT's util_avg where:
util_est(task) = max(task::util_avg, f(task::util_avg@dequeue))
This allows to remember how big a task has been reported by PELT in its
previous activations via f(task::util_avg@dequeue), which is the new
_task_util_est(struct task_struct*) function added by this patch.
If a task should change its behavior and it runs longer in a new
activation, after a certain time its util_est will just track the
original PELT signal (i.e. task::util_avg).
The estimated utilization of cfs_rq is defined only for root ones.
That's because the only sensible consumer of this signal are the
scheduler and schedutil when looking for the overall CPU utilization
due to FAIR tasks.
For this reason, the estimated utilization of a root cfs_rq is simply
defined as:
util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued)
where:
cfs_rq::util_est::enqueued = sum(_task_util_est(task))
for each RUNNABLE task on that root cfs_rq
It's worth noting that the estimated utilization is tracked only for
objects of interests, specifically:
- Tasks: to better support tasks placement decisions
- root cfs_rqs: to better support both tasks placement decisions as
well as frequencies selection
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Do the following cleanups and simplifications:
- sched/sched.h already includes <asm/paravirt.h>, so no need to
include it in sched/core.c again.
- order the <linux/sched/*.h> headers alphabetically
- add all <linux/sched/*.h> headers to kernel/sched/sched.h
- remove all unnecessary includes from the .c files that
are already included in kernel/sched/sched.h.
Finally, make all scheduler .c files use a single common header:
#include "sched.h"
... which now contains a union of the relied upon headers.
This makes the various .c files easier to read and easier to handle.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A good number of small style inconsistencies have accumulated
in the scheduler core, so do a pass over them to harmonize
all these details:
- fix speling in comments,
- use curly braces for multi-line statements,
- remove unnecessary parentheses from integer literals,
- capitalize consistently,
- remove stray newlines,
- add comments where necessary,
- remove invalid/unnecessary comments,
- align structure definitions and other data types vertically,
- add missing newlines for increased readability,
- fix vertical tabulation where it's misaligned,
- harmonize preprocessor conditional block labeling
and vertical alignment,
- remove line-breaks where they uglify the code,
- add newline after local variable definitions,
No change in functionality:
md5:
1191fa0a890cfa8132156d2959d7e9e2 built-in.o.before.asm
1191fa0a890cfa8132156d2959d7e9e2 built-in.o.after.asm
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The load balancer uses runnable_load_avg as load indicator. For
!cgroup this is:
runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq
That is, a direct sum of all runnable tasks on that runqueue. As
opposed to load_avg, which is a sum of all tasks on the runqueue,
which includes a blocked component.
However, in the cgroup case, this comes apart since the group entities
are always runnable, even if most of their constituent entities are
blocked.
Therefore introduce a runnable_weight which for task entities is the
same as the regular weight, but for group entities is a fraction of
the entity weight and represents the runnable part of the group
runqueue.
Then propagate this load through the PELT hierarchy to arrive at an
effective runnable load avgerage -- which we should not confuse with
the canonical runnable load average.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When an entity migrates in (or out) of a runqueue, we need to add (or
remove) its contribution from the entire PELT hierarchy, because even
non-runnable entities are included in the load average sums.
In order to do this we have some propagation logic that updates the
PELT tree, however the way it 'propagates' the runnable (or load)
change is (more or less):
tg->weight * grq->avg.load_avg
ge->avg.load_avg = ------------------------------
tg->load_avg
But that is the expression for ge->weight, and per the definition of
load_avg:
ge->avg.load_avg := ge->weight * ge->avg.runnable_avg
That destroys the runnable_avg (by setting it to 1) we wanted to
propagate.
Instead directly propagate runnable_sum.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since on wakeup migration we don't hold the rq->lock for the old CPU
we cannot update its state. Instead we add the removed 'load' to an
atomic variable and have the next update on that CPU collect and
process it.
Currently we have 2 atomic variables; which already have the issue
that they can be read out-of-sync. Also, two atomic ops on a single
cacheline is already more expensive than an uncontended lock.
Since we want to add more, convert the thing over to an explicit
cacheline with a lock in.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull scheduler fixes from Ingo Molnar:
"Three CPU hotplug related fixes and a debugging improvement"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Add debugfs knob for "sched_debug"
sched/core: WARN() when migrating to an offline CPU
sched/fair: Plug hole between hotplug and active_load_balance()
sched/fair: Avoid newidle balance for !active CPUs
Currently we unconditionally destroy all sysctl bits and regenerate
them after we've rebuild the domains (even if that rebuild is a
no-op).
And since we unconditionally (re)build the sysctl for all possible
CPUs, onlining all CPUs gets us O(n^2) time. Instead change this to
only rebuild the bits for CPUs we've actually installed new domains
on.
Reported-by: Ofer Levi(SW) <oferle@mellanox.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>