Commit Graph

177 Commits

Author SHA1 Message Date
Liujie Xie
a61d61bab7 ANDROID: vendor_hooks: Add hooks to record the time of the process in various states
These hooks will do the following works:
a) record the time of the process in various states
b) Make corresponding optimization strategies in different hooks

Bug: 205938967

Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: Ia3c47bbf0aadd17337ce18fd910343b1b8c3ef93
2021-11-11 22:15:49 +08:00
Greg Kroah-Hartman
2df0fb4a4b Merge 5.10.50 into android12-5.10-lts
Changes in 5.10.50
	Bluetooth: hci_qca: fix potential GPF
	Bluetooth: btqca: Don't modify firmware contents in-place
	Bluetooth: Remove spurious error message
	ALSA: usb-audio: fix rate on Ozone Z90 USB headset
	ALSA: usb-audio: Fix OOB access at proc output
	ALSA: firewire-motu: fix stream format for MOTU 8pre FireWire
	ALSA: usb-audio: scarlett2: Fix wrong resume call
	ALSA: intel8x0: Fix breakage at ac97 clock measurement
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 450 G8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 445 G8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 630 G8
	ALSA: hda/realtek: Add another ALC236 variant support
	ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook x360 830 G8
	ALSA: hda/realtek: Improve fixup for HP Spectre x360 15-df0xxx
	ALSA: hda/realtek: Fix bass speaker DAC mapping for Asus UM431D
	ALSA: hda/realtek: Apply LED fixup for HP Dragonfly G1, too
	ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 830 G8 Notebook PC
	media: dvb-usb: fix wrong definition
	Input: usbtouchscreen - fix control-request directions
	net: can: ems_usb: fix use-after-free in ems_usb_disconnect()
	usb: gadget: eem: fix echo command packet response issue
	usb: renesas-xhci: Fix handling of unknown ROM state
	USB: cdc-acm: blacklist Heimann USB Appset device
	usb: dwc3: Fix debugfs creation flow
	usb: typec: Add the missed altmode_id_remove() in typec_register_altmode()
	xhci: solve a double free problem while doing s4
	gfs2: Fix underflow in gfs2_page_mkwrite
	gfs2: Fix error handling in init_statfs
	ntfs: fix validity check for file name attribute
	selftests/lkdtm: Avoid needing explicit sub-shell
	copy_page_to_iter(): fix ITER_DISCARD case
	iov_iter_fault_in_readable() should do nothing in xarray case
	Input: joydev - prevent use of not validated data in JSIOCSBTNMAP ioctl
	crypto: nx - Fix memcpy() over-reading in nonce
	crypto: ccp - Annotate SEV Firmware file names
	arm_pmu: Fix write counter incorrect in ARMv7 big-endian mode
	ARM: dts: ux500: Fix LED probing
	ARM: dts: at91: sama5d4: fix pinctrl muxing
	btrfs: send: fix invalid path for unlink operations after parent orphanization
	btrfs: compression: don't try to compress if we don't have enough pages
	btrfs: clear defrag status of a root if starting transaction fails
	ext4: cleanup in-core orphan list if ext4_truncate() failed to get a transaction handle
	ext4: fix kernel infoleak via ext4_extent_header
	ext4: fix overflow in ext4_iomap_alloc()
	ext4: return error code when ext4_fill_flex_info() fails
	ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit
	ext4: remove check for zero nr_to_scan in ext4_es_scan()
	ext4: fix avefreec in find_group_orlov
	ext4: use ext4_grp_locked_error in mb_find_extent
	can: bcm: delay release of struct bcm_op after synchronize_rcu()
	can: gw: synchronize rcu operations before removing gw job entry
	can: isotp: isotp_release(): omit unintended hrtimer restart on socket release
	can: j1939: j1939_sk_init(): set SOCK_RCU_FREE to call sk_destruct() after RCU is done
	can: peak_pciefd: pucan_handle_status(): fix a potential starvation issue in TX path
	mac80211: remove iwlwifi specific workaround that broke sta NDP tx
	SUNRPC: Fix the batch tasks count wraparound.
	SUNRPC: Should wake up the privileged task firstly.
	bus: mhi: Wait for M2 state during system resume
	mm/gup: fix try_grab_compound_head() race with split_huge_page()
	perf/smmuv3: Don't trample existing events with global filter
	KVM: nVMX: Handle split-lock #AC exceptions that happen in L2
	KVM: PPC: Book3S HV: Workaround high stack usage with clang
	KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs
	KVM: x86/mmu: Use MMU's role to detect CR4.SMEP value in nested NPT walk
	s390/cio: dont call css_wait_for_slow_path() inside a lock
	s390: mm: Fix secure storage access exception handling
	f2fs: Prevent swap file in LFS mode
	clk: agilex/stratix10/n5x: fix how the bypass_reg is handled
	clk: agilex/stratix10: remove noc_clk
	clk: agilex/stratix10: fix bypass representation
	rtc: stm32: Fix unbalanced clk_disable_unprepare() on probe error path
	iio: frequency: adf4350: disable reg and clk on error in adf4350_probe()
	iio: light: tcs3472: do not free unallocated IRQ
	iio: ltr501: mark register holding upper 8 bits of ALS_DATA{0,1} and PS_DATA as volatile, too
	iio: ltr501: ltr559: fix initialization of LTR501_ALS_CONTR
	iio: ltr501: ltr501_read_ps(): add missing endianness conversion
	iio: accel: bma180: Fix BMA25x bandwidth register values
	serial: mvebu-uart: fix calculation of clock divisor
	serial: sh-sci: Stop dmaengine transfer in sci_stop_tx()
	serial_cs: Add Option International GSM-Ready 56K/ISDN modem
	serial_cs: remove wrong GLOBETROTTER.cis entry
	ath9k: Fix kernel NULL pointer dereference during ath_reset_internal()
	ssb: sdio: Don't overwrite const buffer if block_write fails
	rsi: Assign beacon rate settings to the correct rate_info descriptor field
	rsi: fix AP mode with WPA failure due to encrypted EAPOL
	tracing/histograms: Fix parsing of "sym-offset" modifier
	tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
	seq_buf: Make trace_seq_putmem_hex() support data longer than 8
	powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
	loop: Fix missing discard support when using LOOP_CONFIGURE
	evm: Execute evm_inode_init_security() only when an HMAC key is loaded
	evm: Refuse EVM_ALLOW_METADATA_WRITES only if an HMAC key is loaded
	fuse: Fix crash in fuse_dentry_automount() error path
	fuse: Fix crash if superblock of submount gets killed early
	fuse: Fix infinite loop in sget_fc()
	fuse: ignore PG_workingset after stealing
	fuse: check connected before queueing on fpq->io
	fuse: reject internal errno
	thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure
	spi: Make of_register_spi_device also set the fwnode
	Add a reference to ucounts for each cred
	staging: media: rkvdec: fix pm_runtime_get_sync() usage count
	media: marvel-ccic: fix some issues when getting pm_runtime
	media: mdk-mdp: fix pm_runtime_get_sync() usage count
	media: s5p: fix pm_runtime_get_sync() usage count
	media: am437x: fix pm_runtime_get_sync() usage count
	media: sh_vou: fix pm_runtime_get_sync() usage count
	media: mtk-vcodec: fix PM runtime get logic
	media: s5p-jpeg: fix pm_runtime_get_sync() usage count
	media: sunxi: fix pm_runtime_get_sync() usage count
	media: sti/bdisp: fix pm_runtime_get_sync() usage count
	media: exynos4-is: fix pm_runtime_get_sync() usage count
	media: exynos-gsc: fix pm_runtime_get_sync() usage count
	spi: spi-loopback-test: Fix 'tx_buf' might be 'rx_buf'
	spi: spi-topcliff-pch: Fix potential double free in pch_spi_process_messages()
	spi: omap-100k: Fix the length judgment problem
	regulator: uniphier: Add missing MODULE_DEVICE_TABLE
	sched/core: Initialize the idle task with preemption disabled
	hwrng: exynos - Fix runtime PM imbalance on error
	crypto: nx - add missing MODULE_DEVICE_TABLE
	media: sti: fix obj-$(config) targets
	media: cpia2: fix memory leak in cpia2_usb_probe
	media: cobalt: fix race condition in setting HPD
	media: hevc: Fix dependent slice segment flags
	media: pvrusb2: fix warning in pvr2_i2c_core_done
	media: imx: imx7_mipi_csis: Fix logging of only error event counters
	crypto: qat - check return code of qat_hal_rd_rel_reg()
	crypto: qat - remove unused macro in FW loader
	crypto: qce: skcipher: Fix incorrect sg count for dma transfers
	arm64: perf: Convert snprintf to sysfs_emit
	sched/fair: Fix ascii art by relpacing tabs
	media: i2c: ov2659: Use clk_{prepare_enable,disable_unprepare}() to set xvclk on/off
	media: bt878: do not schedule tasklet when it is not setup
	media: em28xx: Fix possible memory leak of em28xx struct
	media: hantro: Fix .buf_prepare
	media: cedrus: Fix .buf_prepare
	media: v4l2-core: Avoid the dangling pointer in v4l2_fh_release
	media: bt8xx: Fix a missing check bug in bt878_probe
	media: st-hva: Fix potential NULL pointer dereferences
	crypto: hisilicon/sec - fixup 3des minimum key size declaration
	Makefile: fix GDB warning with CONFIG_RELR
	media: dvd_usb: memory leak in cinergyt2_fe_attach
	memstick: rtsx_usb_ms: fix UAF
	mmc: sdhci-sprd: use sdhci_sprd_writew
	mmc: via-sdmmc: add a check against NULL pointer dereference
	spi: meson-spicc: fix a wrong goto jump for avoiding memory leak.
	spi: meson-spicc: fix memory leak in meson_spicc_probe
	crypto: shash - avoid comparing pointers to exported functions under CFI
	media: dvb_net: avoid speculation from net slot
	media: siano: fix device register error path
	media: imx-csi: Skip first few frames from a BT.656 source
	hwmon: (max31790) Report correct current pwm duty cycles
	hwmon: (max31790) Fix pwmX_enable attributes
	drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
	KVM: PPC: Book3S HV: Fix TLB management on SMT8 POWER9 and POWER10 processors
	btrfs: fix error handling in __btrfs_update_delayed_inode
	btrfs: abort transaction if we fail to update the delayed inode
	btrfs: sysfs: fix format string for some discard stats
	btrfs: don't clear page extent mapped if we're not invalidating the full page
	btrfs: disable build on platforms having page size 256K
	locking/lockdep: Fix the dep path printing for backwards BFS
	lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage()
	KVM: s390: get rid of register asm usage
	regulator: mt6358: Fix vdram2 .vsel_mask
	regulator: da9052: Ensure enough delay time for .set_voltage_time_sel
	media: Fix Media Controller API config checks
	ACPI: video: use native backlight for GA401/GA502/GA503
	HID: do not use down_interruptible() when unbinding devices
	EDAC/ti: Add missing MODULE_DEVICE_TABLE
	ACPI: processor idle: Fix up C-state latency if not ordered
	hv_utils: Fix passing zero to 'PTR_ERR' warning
	lib: vsprintf: Fix handling of number field widths in vsscanf
	Input: goodix - platform/x86: touchscreen_dmi - Move upside down quirks to touchscreen_dmi.c
	platform/x86: touchscreen_dmi: Add an extra entry for the upside down Goodix touchscreen on Teclast X89 tablets
	platform/x86: touchscreen_dmi: Add info for the Goodix GT912 panel of TM800A550L tablets
	ACPI: EC: Make more Asus laptops use ECDT _GPE
	block_dump: remove block_dump feature in mark_inode_dirty()
	blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter
	blk-mq: clear stale request in tags->rq[] before freeing one request pool
	fs: dlm: cancel work sync othercon
	random32: Fix implicit truncation warning in prandom_seed_state()
	open: don't silently ignore unknown O-flags in openat2()
	drivers: hv: Fix missing error code in vmbus_connect()
	fs: dlm: fix memory leak when fenced
	ACPICA: Fix memory leak caused by _CID repair function
	ACPI: bus: Call kobject_put() in acpi_init() error path
	ACPI: resources: Add checks for ACPI IRQ override
	block: fix race between adding/removing rq qos and normal IO
	platform/x86: asus-nb-wmi: Revert "Drop duplicate DMI quirk structures"
	platform/x86: asus-nb-wmi: Revert "add support for ASUS ROG Zephyrus G14 and G15"
	platform/x86: toshiba_acpi: Fix missing error code in toshiba_acpi_setup_keyboard()
	nvme-pci: fix var. type for increasing cq_head
	nvmet-fc: do not check for invalid target port in nvmet_fc_handle_fcp_rqst()
	EDAC/Intel: Do not load EDAC driver when running as a guest
	PCI: hv: Add check for hyperv_initialized in init_hv_pci_drv()
	cifs: improve fallocate emulation
	ACPI: EC: trust DSDT GPE for certain HP laptop
	clocksource: Retry clock read if long delays detected
	clocksource: Check per-CPU clock synchronization when marked unstable
	tpm_tis_spi: add missing SPI device ID entries
	ACPI: tables: Add custom DSDT file as makefile prerequisite
	HID: wacom: Correct base usage for capacitive ExpressKey status bits
	cifs: fix missing spinlock around update to ses->status
	mailbox: qcom: Use PLATFORM_DEVID_AUTO to register platform device
	block: fix discard request merge
	kthread_worker: fix return value when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
	ia64: mca_drv: fix incorrect array size calculation
	writeback, cgroup: increment isw_nr_in_flight before grabbing an inode
	spi: Allow to have all native CSs in use along with GPIOs
	spi: Avoid undefined behaviour when counting unused native CSs
	media: venus: Rework error fail recover logic
	media: s5p_cec: decrement usage count if disabled
	media: hantro: do a PM resume earlier
	crypto: ixp4xx - dma_unmap the correct address
	crypto: ixp4xx - update IV after requests
	crypto: ux500 - Fix error return code in hash_hw_final()
	sata_highbank: fix deferred probing
	pata_rb532_cf: fix deferred probing
	media: I2C: change 'RST' to "RSET" to fix multiple build errors
	sched/uclamp: Fix wrong implementation of cpu.uclamp.min
	sched/uclamp: Fix locking around cpu_util_update_eff()
	kbuild: Fix objtool dependency for 'OBJECT_FILES_NON_STANDARD_<obj> := n'
	pata_octeon_cf: avoid WARN_ON() in ata_host_activate()
	evm: fix writing <securityfs>/evm overflow
	x86/elf: Use _BITUL() macro in UAPI headers
	crypto: sa2ul - Fix leaks on failure paths with sa_dma_init()
	crypto: sa2ul - Fix pm_runtime enable in sa_ul_probe()
	crypto: ccp - Fix a resource leak in an error handling path
	media: rc: i2c: Fix an error message
	pata_ep93xx: fix deferred probing
	locking/lockdep: Reduce LOCKDEP dependency list
	media: rkvdec: Fix .buf_prepare
	media: exynos4-is: Fix a use after free in isp_video_release
	media: au0828: fix a NULL vs IS_ERR() check
	media: tc358743: Fix error return code in tc358743_probe_of()
	media: gspca/gl860: fix zero-length control requests
	m68k: atari: Fix ATARI_KBD_CORE kconfig unmet dependency warning
	media: siano: Fix out-of-bounds warnings in smscore_load_firmware_family2()
	regulator: fan53880: Fix vsel_mask setting for FAN53880_BUCK
	crypto: nitrox - fix unchecked variable in nitrox_register_interrupts
	crypto: omap-sham - Fix PM reference leak in omap sham ops
	crypto: x86/curve25519 - fix cpu feature checking logic in mod_exit
	crypto: sm2 - remove unnecessary reset operations
	crypto: sm2 - fix a memory leak in sm2
	mmc: usdhi6rol0: fix error return code in usdhi6_probe()
	arm64: consistently use reserved_pg_dir
	arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
	media: subdev: remove VIDIOC_DQEVENT_TIME32 handling
	media: s5p-g2d: Fix a memory leak on ctx->fh.m2m_ctx
	hwmon: (lm70) Use device_get_match_data()
	hwmon: (lm70) Revert "hwmon: (lm70) Add support for ACPI"
	hwmon: (max31722) Remove non-standard ACPI device IDs
	hwmon: (max31790) Fix fan speed reporting for fan7..12
	KVM: nVMX: Sync all PGDs on nested transition with shadow paging
	KVM: nVMX: Ensure 64-bit shift when checking VMFUNC bitmap
	KVM: nVMX: Don't clobber nested MMU's A/D status on EPTP switch
	KVM: x86/mmu: Fix return value in tdp_mmu_map_handle_target_level()
	perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
	KVM: arm64: Don't zero the cycle count register when PMCR_EL0.P is set
	regulator: hi655x: Fix pass wrong pointer to config.driver_data
	btrfs: clear log tree recovering status if starting transaction fails
	x86/sev: Make sure IRQs are disabled while GHCB is active
	x86/sev: Split up runtime #VC handler for correct state tracking
	sched/rt: Fix RT utilization tracking during policy change
	sched/rt: Fix Deadline utilization tracking during policy change
	sched/uclamp: Fix uclamp_tg_restrict()
	lockdep: Fix wait-type for empty stack
	lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING
	spi: spi-sun6i: Fix chipselect/clock bug
	crypto: nx - Fix RCU warning in nx842_OF_upd_status
	psi: Fix race between psi_trigger_create/destroy
	media: v4l2-async: Clean v4l2_async_notifier_add_fwnode_remote_subdev
	media: video-mux: Skip dangling endpoints
	PM / devfreq: Add missing error code in devfreq_add_device()
	ACPI: PM / fan: Put fan device IDs into separate header file
	block: avoid double io accounting for flush request
	nvme-pci: look for StorageD3Enable on companion ACPI device instead
	ACPI: sysfs: Fix a buffer overrun problem with description_show()
	mark pstore-blk as broken
	clocksource/drivers/timer-ti-dm: Save and restore timer TIOCP_CFG
	extcon: extcon-max8997: Fix IRQ freeing at error path
	ACPI: APEI: fix synchronous external aborts in user-mode
	blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled()
	blk-wbt: make sure throttle is enabled properly
	ACPI: Use DEVICE_ATTR_<RW|RO|WO> macros
	ACPI: bgrt: Fix CFI violation
	cpufreq: Make cpufreq_online() call driver->offline() on errors
	blk-mq: update hctx->dispatch_busy in case of real scheduler
	ocfs2: fix snprintf() checking
	dax: fix ENOMEM handling in grab_mapping_entry()
	mm/debug_vm_pgtable/basic: add validation for dirtiness after write protect
	mm/debug_vm_pgtable/basic: iterate over entire protection_map[]
	mm/debug_vm_pgtable: ensure THP availability via has_transparent_hugepage()
	swap: fix do_swap_page() race with swapoff
	mm/shmem: fix shmem_swapin() race with swapoff
	mm: memcg/slab: properly set up gfp flags for objcg pointer array
	mm: page_alloc: refactor setup_per_zone_lowmem_reserve()
	mm/page_alloc: fix counting of managed_pages
	xfrm: xfrm_state_mtu should return at least 1280 for ipv6
	drm/bridge/sii8620: fix dependency on extcon
	drm/bridge: Fix the stop condition of drm_bridge_chain_pre_enable()
	drm/amd/dc: Fix a missing check bug in dm_dp_mst_detect()
	drm/ast: Fix missing conversions to managed API
	video: fbdev: imxfb: Fix an error message
	net: mvpp2: Put fwnode in error case during ->probe()
	net: pch_gbe: Propagate error from devm_gpio_request_one()
	pinctrl: renesas: r8a7796: Add missing bias for PRESET# pin
	pinctrl: renesas: r8a77990: JTAG pins do not have pull-down capabilities
	drm/vmwgfx: Mark a surface gpu-dirty after the SVGA3dCmdDXGenMips command
	drm/vmwgfx: Fix cpu updates of coherent multisample surfaces
	net: qrtr: ns: Fix error return code in qrtr_ns_init()
	clk: meson: g12a: fix gp0 and hifi ranges
	net: ftgmac100: add missing error return code in ftgmac100_probe()
	drm: rockchip: set alpha_en to 0 if it is not used
	drm/rockchip: cdn-dp-core: add missing clk_disable_unprepare() on error in cdn_dp_grf_write()
	drm/rockchip: dsi: move all lane config except LCDC mux to bind()
	drm/rockchip: lvds: Fix an error handling path
	drm/rockchip: cdn-dp: fix sign extension on an int multiply for a u64 result
	mptcp: fix pr_debug in mptcp_token_new_connect
	mptcp: generate subflow hmac after mptcp_finish_join()
	RDMA/srp: Fix a recently introduced memory leak
	RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its stats
	RDMA/rtrs: Do not reset hb_missed_max after re-connection
	RDMA/rtrs-srv: Fix memory leak of unfreed rtrs_srv_stats object
	RDMA/rtrs-srv: Fix memory leak when having multiple sessions
	RDMA/rtrs-clt: Check if the queue_depth has changed during a reconnection
	RDMA/rtrs-clt: Fix memory leak of not-freed sess->stats and stats->pcpu_stats
	ehea: fix error return code in ehea_restart_qps()
	clk: tegra30: Use 300MHz for video decoder by default
	xfrm: remove the fragment check for ipv6 beet mode
	net/sched: act_vlan: Fix modify to allow 0
	RDMA/core: Sanitize WQ state received from the userspace
	drm/pl111: depend on CONFIG_VEXPRESS_CONFIG
	RDMA/rxe: Fix failure during driver load
	drm/pl111: Actually fix CONFIG_VEXPRESS_CONFIG depends
	drm/vc4: hdmi: Fix error path of hpd-gpios
	clk: vc5: fix output disabling when enabling a FOD
	drm: qxl: ensure surf.data is ininitialized
	tools/bpftool: Fix error return code in do_batch()
	ath10k: go to path err_unsupported when chip id is not supported
	ath10k: add missing error return code in ath10k_pci_probe()
	wireless: carl9170: fix LEDS build errors & warnings
	ieee802154: hwsim: Fix possible memory leak in hwsim_subscribe_all_others
	clk: imx8mq: remove SYS PLL 1/2 clock gates
	wcn36xx: Move hal_buf allocation to devm_kmalloc in probe
	ssb: Fix error return code in ssb_bus_scan()
	brcmfmac: fix setting of station info chains bitmask
	brcmfmac: correctly report average RSSI in station info
	brcmfmac: Fix a double-free in brcmf_sdio_bus_reset
	brcmsmac: mac80211_if: Fix a resource leak in an error handling path
	cw1200: Revert unnecessary patches that fix unreal use-after-free bugs
	ath11k: Fix an error handling path in ath11k_core_fetch_board_data_api_n()
	ath10k: Fix an error code in ath10k_add_interface()
	ath11k: send beacon template after vdev_start/restart during csa
	netlabel: Fix memory leak in netlbl_mgmt_add_common
	RDMA/mlx5: Don't add slave port to unaffiliated list
	netfilter: nft_exthdr: check for IPv6 packet before further processing
	netfilter: nft_osf: check for TCP packet before further processing
	netfilter: nft_tproxy: restrict support to TCP and UDP transport protocols
	RDMA/rxe: Fix qp reference counting for atomic ops
	selftests/bpf: Whitelist test_progs.h from .gitignore
	xsk: Fix missing validation for skb and unaligned mode
	xsk: Fix broken Tx ring validation
	bpf: Fix libelf endian handling in resolv_btfids
	RDMA/rtrs-srv: Set minimal max_send_wr and max_recv_wr
	samples/bpf: Fix Segmentation fault for xdp_redirect command
	samples/bpf: Fix the error return code of xdp_redirect's main()
	mt76: fix possible NULL pointer dereference in mt76_tx
	mt76: mt7615: fix NULL pointer dereference in tx_prepare_skb()
	net: ethernet: aeroflex: fix UAF in greth_of_remove
	net: ethernet: ezchip: fix UAF in nps_enet_remove
	net: ethernet: ezchip: fix error handling
	vrf: do not push non-ND strict packets with a source LLA through packet taps again
	net: sched: add barrier to ensure correct ordering for lockless qdisc
	tls: prevent oversized sendfile() hangs by ignoring MSG_MORE
	netfilter: nf_tables_offload: check FLOW_DISSECTOR_KEY_BASIC in VLAN transfer logic
	pkt_sched: sch_qfq: fix qfq_change_class() error path
	xfrm: Fix xfrm offload fallback fail case
	iwlwifi: increase PNVM load timeout
	rtw88: 8822c: fix lc calibration timing
	vxlan: add missing rcu_read_lock() in neigh_reduce()
	ip6_tunnel: fix GRE6 segmentation
	net/ipv4: swap flow ports when validating source
	net: ti: am65-cpsw-nuss: Fix crash when changing number of TX queues
	tc-testing: fix list handling
	ieee802154: hwsim: Fix memory leak in hwsim_add_one
	ieee802154: hwsim: avoid possible crash in hwsim_del_edge_nl()
	bpf: Fix null ptr deref with mixed tail calls and subprogs
	drm/msm: Fix error return code in msm_drm_init()
	drm/msm/dpu: Fix error return code in dpu_mdss_init()
	mac80211: remove iwlwifi specific workaround NDPs of null_response
	net: bcmgenet: Fix attaching to PYH failed on RPi 4B
	ipv6: exthdrs: do not blindly use init_net
	can: j1939: j1939_sk_setsockopt(): prevent allocation of j1939 filter for optlen == 0
	bpf: Do not change gso_size during bpf_skb_change_proto()
	i40e: Fix error handling in i40e_vsi_open
	i40e: Fix autoneg disabling for non-10GBaseT links
	i40e: Fix missing rtnl locking when setting up pf switch
	Revert "ibmvnic: remove duplicate napi_schedule call in open function"
	ibmvnic: set ltb->buff to NULL after freeing
	ibmvnic: free tx_pool if tso_pool alloc fails
	RDMA/cma: Protect RMW with qp_mutex
	net: macsec: fix the length used to copy the key for offloading
	net: phy: mscc: fix macsec key length
	net: atlantic: fix the macsec key length
	ipv6: fix out-of-bound access in ip6_parse_tlv()
	e1000e: Check the PCIm state
	net: dsa: sja1105: fix NULL pointer dereference in sja1105_reload_cbs()
	bpfilter: Specify the log level for the kmsg message
	RDMA/cma: Fix incorrect Packet Lifetime calculation
	gve: Fix swapped vars when fetching max queues
	Revert "be2net: disable bh with spin_lock in be_process_mcc"
	Bluetooth: mgmt: Fix slab-out-of-bounds in tlv_data_is_valid
	Bluetooth: Fix not sending Set Extended Scan Response
	Bluetooth: Fix Set Extended (Scan Response) Data
	Bluetooth: Fix handling of HCI_LE_Advertising_Set_Terminated event
	clk: actions: Fix UART clock dividers on Owl S500 SoC
	clk: actions: Fix SD clocks factor table on Owl S500 SoC
	clk: actions: Fix bisp_factor_table based clocks on Owl S500 SoC
	clk: actions: Fix AHPPREDIV-H-AHB clock chain on Owl S500 SoC
	clk: qcom: clk-alpha-pll: fix CAL_L write in alpha_pll_fabia_prepare
	clk: si5341: Wait for DEVICE_READY on startup
	clk: si5341: Avoid divide errors due to bogus register contents
	clk: si5341: Check for input clock presence and PLL lock on startup
	clk: si5341: Update initialization magic
	writeback: fix obtain a reference to a freeing memcg css
	net: lwtunnel: handle MTU calculation in forwading
	net: sched: fix warning in tcindex_alloc_perfect_hash
	net: tipc: fix FB_MTU eat two pages
	RDMA/mlx5: Don't access NULL-cleared mpi pointer
	RDMA/core: Always release restrack object
	MIPS: Fix PKMAP with 32-bit MIPS huge page support
	staging: fbtft: Rectify GPIO handling
	staging: fbtft: Don't spam logs when probe is deferred
	ASoC: rt5682: Disable irq on shutdown
	rcu: Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread()
	serial: fsl_lpuart: don't modify arbitrary data on lpuart32
	serial: fsl_lpuart: remove RTSCTS handling from get_mctrl()
	serial: 8250_omap: fix a timeout loop condition
	tty: nozomi: Fix a resource leak in an error handling function
	mwifiex: re-fix for unaligned accesses
	iio: adis_buffer: do not return ints in irq handlers
	iio: adis16400: do not return ints in irq handlers
	iio: adis16475: do not return ints in irq handlers
	iio: accel: bma180: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: bma220: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: hid: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: kxcjk-1013: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: mxc4005: Fix overread of data and alignment issue.
	iio: accel: stk8312: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: stk8ba50: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: ti-ads1015: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: vf610: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: gyro: bmg160: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: humidity: am2315: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: srf08: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: pulsed-light: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: as3935: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: magn: hmc5843: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: magn: bmc150: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: light: isl29125: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: light: tcs3414: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: light: tcs3472: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: chemical: atlas: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: cros_ec_sensors: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	iio: potentiostat: lmp91000: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	ASoC: rk3328: fix missing clk_disable_unprepare() on error in rk3328_platform_probe()
	ASoC: hisilicon: fix missing clk_disable_unprepare() on error in hi6210_i2s_startup()
	backlight: lm3630a_bl: Put fwnode in error case during ->probe()
	ASoC: rsnd: tidyup loop on rsnd_adg_clk_query()
	Input: hil_kbd - fix error return code in hil_dev_connect()
	perf scripting python: Fix tuple_set_u64()
	mtd: partitions: redboot: seek fis-index-block in the right node
	mtd: rawnand: arasan: Ensure proper configuration for the asserted target
	staging: mmal-vchiq: Fix incorrect static vchiq_instance.
	char: pcmcia: error out if 'num_bytes_read' is greater than 4 in set_protocol()
	firmware: stratix10-svc: Fix a resource leak in an error handling path
	tty: nozomi: Fix the error handling path of 'nozomi_card_init()'
	leds: class: The -ENOTSUPP should never be seen by user space
	leds: lm3532: select regmap I2C API
	leds: lm36274: Put fwnode in error case during ->probe()
	leds: lm3692x: Put fwnode in any case during ->probe()
	leds: lm3697: Don't spam logs when probe is deferred
	leds: lp50xx: Put fwnode in error case during ->probe()
	scsi: FlashPoint: Rename si_flags field
	scsi: iscsi: Flush block work before unblock
	mfd: mp2629: Select MFD_CORE to fix build error
	mfd: rn5t618: Fix IRQ trigger by changing it to level mode
	fsi: core: Fix return of error values on failures
	fsi: scom: Reset the FSI2PIB engine for any error
	fsi: occ: Don't accept response from un-initialized OCC
	fsi/sbefifo: Clean up correct FIFO when receiving reset request from SBE
	fsi/sbefifo: Fix reset timeout
	visorbus: fix error return code in visorchipset_init()
	iommu/amd: Fix extended features logging
	s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK
	s390: enable HAVE_IOREMAP_PROT
	s390: appldata depends on PROC_SYSCTL
	selftests: splice: Adjust for handler fallback removal
	iommu/dma: Fix IOVA reserve dma ranges
	ASoC: max98373-sdw: use first_hw_init flag on resume
	ASoC: rt1308-sdw: use first_hw_init flag on resume
	ASoC: rt5682-sdw: use first_hw_init flag on resume
	ASoC: rt700-sdw: use first_hw_init flag on resume
	ASoC: rt711-sdw: use first_hw_init flag on resume
	ASoC: rt715-sdw: use first_hw_init flag on resume
	ASoC: rt5682: fix getting the wrong device id when the suspend_stress_test
	ASoC: rt5682-sdw: set regcache_cache_only false before reading RT5682_DEVICE_ID
	ASoC: mediatek: mtk-btcvsd: Fix an error handling path in 'mtk_btcvsd_snd_probe()'
	usb: gadget: f_fs: Fix setting of device and driver data cross-references
	usb: dwc2: Don't reset the core after setting turnaround time
	eeprom: idt_89hpesx: Put fwnode in matching case during ->probe()
	eeprom: idt_89hpesx: Restore printing the unsupported fwnode name
	thunderbolt: Bond lanes only when dual_link_port != NULL in alloc_dev_default()
	iio: adc: at91-sama5d2: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: hx711: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: mxs-lradc: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	iio: magn: rm3100: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	iio: light: vcnl4000: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	ASoC: fsl_spdif: Fix error handler with pm_runtime_enable
	staging: gdm724x: check for buffer overflow in gdm_lte_multi_sdu_pkt()
	staging: gdm724x: check for overflow in gdm_lte_netif_rx()
	staging: rtl8712: fix error handling in r871xu_drv_init
	staging: rtl8712: fix memory leak in rtl871x_load_fw_cb
	coresight: core: Fix use of uninitialized pointer
	staging: mt7621-dts: fix pci address for PCI memory range
	serial: 8250: Actually allow UPF_MAGIC_MULTIPLIER baud rates
	iio: light: vcnl4035: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: isl29501: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	ASoC: cs42l42: Correct definition of CS42L42_ADC_PDN_MASK
	of: Fix truncation of memory sizes on 32-bit platforms
	mtd: rawnand: marvell: add missing clk_disable_unprepare() on error in marvell_nfc_resume()
	habanalabs: Fix an error handling path in 'hl_pci_probe()'
	scsi: mpt3sas: Fix error return value in _scsih_expander_add()
	soundwire: stream: Fix test for DP prepare complete
	phy: uniphier-pcie: Fix updating phy parameters
	phy: ti: dm816x: Fix the error handling path in 'dm816x_usb_phy_probe()
	extcon: sm5502: Drop invalid register write in sm5502_reg_data
	extcon: max8997: Add missing modalias string
	powerpc/powernv: Fix machine check reporting of async store errors
	ASoC: atmel-i2s: Fix usage of capture and playback at the same time
	configfs: fix memleak in configfs_release_bin_file
	ASoC: Intel: sof_sdw: add SOF_RT715_DAI_ID_FIX for AlderLake
	ASoC: fsl_spdif: Fix unexpected interrupt after suspend
	leds: as3645a: Fix error return code in as3645a_parse_node()
	leds: ktd2692: Fix an error handling path
	selftests/ftrace: fix event-no-pid on 1-core machine
	serial: 8250: 8250_omap: Disable RX interrupt after DMA enable
	serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs
	powerpc: Offline CPU in stop_this_cpu()
	powerpc/papr_scm: Properly handle UUID types and API
	powerpc/64s: Fix copy-paste data exposure into newly created tasks
	powerpc/papr_scm: Make 'perf_stats' invisible if perf-stats unavailable
	ALSA: firewire-lib: Fix 'amdtp_domain_start()' when no AMDTP_OUT_STREAM stream is found
	serial: mvebu-uart: do not allow changing baudrate when uartclk is not available
	serial: mvebu-uart: correctly calculate minimal possible baudrate
	arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART
	vfio/pci: Handle concurrent vma faults
	mm/pmem: avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled
	mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
	mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()
	mm/huge_memory.c: don't discard hugepage if other processes are mapping it
	mm/hugetlb: use helper huge_page_order and pages_per_huge_page
	mm/hugetlb: remove redundant check in preparing and destroying gigantic page
	hugetlb: remove prep_compound_huge_page cleanup
	include/linux/huge_mm.h: remove extern keyword
	mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
	mm/z3fold: use release_z3fold_page_locked() to release locked z3fold page
	lib/math/rational.c: fix divide by zero
	selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
	selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
	selftests/vm/pkeys: refill shadow register after implicit kernel write
	perf llvm: Return -ENOMEM when asprintf() fails
	csky: fix syscache.c fallthrough warning
	csky: syscache: Fixup duplicate cache flush
	exfat: handle wrong stream entry size in exfat_readdir()
	scsi: fc: Correct RHBA attributes length
	scsi: target: cxgbit: Unmap DMA buffer before calling target_execute_cmd()
	mailbox: qcom-ipcc: Fix IPCC mbox channel exhaustion
	fscrypt: don't ignore minor_hash when hash is 0
	fscrypt: fix derivation of SipHash keys on big endian CPUs
	tpm: Replace WARN_ONCE() with dev_err_once() in tpm_tis_status()
	erofs: fix error return code in erofs_read_superblock()
	block: return the correct bvec when checking for gaps
	io_uring: fix blocking inline submission
	mmc: block: Disable CMDQ on the ioctl path
	mmc: vub3000: fix control-request direction
	media: exynos4-is: remove a now unused integer
	scsi: core: Retry I/O for Notify (Enable Spinup) Required error
	crypto: qce - fix error return code in qce_skcipher_async_req_handle()
	s390: preempt: Fix preempt_count initialization
	cred: add missing return error code when set_cred_ucounts() failed
	iommu/dma: Fix compile warning in 32-bit builds
	powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
	Linux 5.10.50

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iec4eab24ea8eb5a6d79739a1aec8432d93a8f82c
2021-07-14 17:35:23 +02:00
Vincent Donnefort
c576472a05 sched/rt: Fix RT utilization tracking during policy change
[ Upstream commit fecfcbc288e9f4923f40fd23ca78a6acdc7fdf6c ]

RT keeps track of the utilization on a per-rq basis with the structure
avg_rt. This utilization is updated during task_tick_rt(),
put_prev_task_rt() and set_next_task_rt(). However, when the current
running task changes its policy, set_next_task_rt() which would usually
take care of updating the utilization when the rq starts running RT tasks,
will not see a such change, leaving the avg_rt structure outdated. When
that very same task will be dequeued later, put_prev_task_rt() will then
update the utilization, based on a wrong last_update_time, leading to a
huge spike in the RT utilization signal.

The signal would eventually recover from this issue after few ms. Even if
no RT tasks are run, avg_rt is also updated in __update_blocked_others().
But as the CPU capacity depends partly on the avg_rt, this issue has
nonetheless a significant impact on the scheduler.

Fix this issue by ensuring a load update when a running task changes
its policy to RT.

Fixes: 371bf427 ("sched/rt: Add rt_rq utilization tracking")
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/1624271872-211872-2-git-send-email-vincent.donnefort@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14 16:56:09 +02:00
J. Avila
bb41438152 ANDROID: sched/rt: Only enable RT sync for SMP targets
The rt sync wakeup support has a condition which relies on a field that
exists only when CONFIG_SMP is defined, causing a compilation issue.
Since sync wakeup has no real meaning on a non-SMP system, we can just
drop the CONFIG_RT_GROUP_SCHED part of the #ifdef.

Fixes: da5f3cd378 ("ANDROID: sched/rt: Add support for rt sync wakeups")
Signed-off-by: J. Avila <elavila@google.com>
Change-Id: I9b95304408d323b0c1017bd33746ecfbb2b35808
2021-03-01 18:50:27 +00:00
J. Avila
da5f3cd378 ANDROID: sched/rt: Add support for rt sync wakeups
Some rt tasks undergo sync wakeup. Currently, these tasks will be placed
on other, often sleeping or otherwise idle CPUs, which can lead to
unnecessary power hits.

Bug: 157906395
Change-Id: I48864d0847bbe4f7813c842032880ad3f3b8b06b
Signed-off-by: J. Avila <elavila@google.com>
2021-02-25 18:27:15 +00:00
Pavankumar Kondeti
4b5f345c73 ANDROID: sched: Add restrict vendor hooks for balance_rt()
Add rvh called android_rvh_sched_balance_rt to influence
balance_rt() from vendor modules.

Bug: 178572414
Change-Id: I555c8ebcf5a3a5d8e3ab881ab9aa507f325285c2
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2021-01-28 08:48:23 +00:00
Rick Yiu
fb03f7bef8 ANDROID: sched: Export symbol for vendor RT hook funcion
Export task_may_not_preempt.

Bug: 174030348
Signed-off-by: Rick Yiu <rickyiu@google.com>
Change-Id: I71b50f876306811f008414096043b883dc43b4d5
Signed-off-by: Will McVicker <willmcvicker@google.com>
2021-01-12 12:57:38 -08:00
Satya Durga Srinivasu Prabhala
ebce8ec6bd ANDROID: sched: rt: rearrange invocation of find_lowest_rq() vendor hook
Right now, invocation of find_lowest_rq() vendor hook is made before
error checks and also, cpupri_find() isn't exported either. It would be
appropriate to move invocation of find_lowest_rq() vendor hook after
error checks are done & calling cpupri_find().

Bug: 173559623
Change-Id: I298dffd39be0451b0b154930ace4e16763c6e78d
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2020-11-20 09:46:29 +00:00
John Dias
8d19443b0b ANDROID: sched: avoid migrating when softint on tgt cpu should be short
The scheduling change to avoid putting RT threads on cores that
are handling softint's was catching cases where there was no reason
to believe the softint would take a long time, resulting in unnecessary
migration overhead. This patch reduces the migration to cases where
the core has a softint that is actually likely to take a long time,
as opposed to the RCU, SCHED, and TIMER softints that are rather quick.

Bug: 31752786
Bug: 168521633
Change-Id: Ib4e179f1e15c736b2fdba31070494e357e9fbbe2
Signed-off-by: John Dias <joaodias@google.com>
[elavila: Amend commit text for AOSP, port to mainline]
Signed-off-by: J. Avila <elavila@google.com>
2020-11-10 19:07:11 +00:00
John Dias
3adfd8e344 ANDROID: sched: avoid placing RT threads on cores handling softirqs
In certain audio use cases, scheduling RT threads on cores that are
handling softirqs can lead to glitches. Prevent this behavior.

Bug: 31501544
Bug: 168521633
Change-Id: I99dd7aaa12c11270b28dbabea484bcc8fb8ba0c1
Signed-off-by: John Dias <joaodias@google.com>
[elavila: Port to mainline, amend commit text]
Signed-off-by: J. Avila <elavila@google.com>
2020-11-10 19:07:11 +00:00
Sai Harshini Nimmala
bf3d991a7d ANDROID: sched: Add trace hook for rt throttle dump
Create a trace hook when RT tasks are throttled. This allows
vendors to debug long RT runs.

Bug: 172264047
Change-Id: I534959f8e8d714463aac2f9f1c5627d2e735f543
Signed-off-by: Sai Harshini Nimmala <snimmala@codeaurora.org>
2020-11-05 19:50:27 +00:00
Greg Kroah-Hartman
67d3ed5765 Merge 'v5.10-rc1' into android-mainline
Linux 5.10-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iace3fc84a00d3023c75caa086a266de17dc1847c
2020-10-29 06:32:38 +01:00
Joe Perches
33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Greg Kroah-Hartman
00d6a8a7ee Merge e4cbce4d13 ("Merge tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") into android-mainline
Baby steps for 5.9-rc1

Resolves some kernel/sched/ merge issues.

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I88cf5411ac7251f9795d9c50cb18b0df5bf0bcd6
2020-08-07 14:17:39 +02:00
Park Bumgyu
16330b3560 ANDROID: Add vendor hooks to the scheduler
Add vendor hooks for vendor-specific scheduling.

  android_rvh_select_task_rq_rt:
    To perform vendor-specific RT task placement.

  android_rvh_select_fallback_rq:
    To restrict cpu usage.

  android_rvh_scheduler_tick:
    To collect periodic scheduling information and to schedule tasks.

  android_rvh_enqueue_tas/android_rvh_dequeue_task:
    For vendor to be aware of the task schedule in/out.

  android_rvh_can_migrate_task:
    To limit task migration based on vendor requirements.

  android_rvh_find_lowest_rq:
    To find the lowest rq for RT task with vendor-specific way.

Bug: 155241766

Change-Id: I926458b0a911d564e5932e200125b12406c2deee
Signed-off-by: Park Bumgyu <bumgyu.park@samsung.com>
2020-07-17 14:38:05 +00:00
Steven Rostedt (VMware)
a87e749e8f sched: Remove struct sched_class::next field
Now that the sched_class descriptors are defined in order via the linker
script vmlinux.lds.h, there's no reason to have a "next" pointer to the
previous priroity structure. The order of the sturctures can be aligned as
an array, and used to index and find the next sched_class descriptor.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191219214558.845353593@goodmis.org
2020-06-25 13:45:44 +02:00
Steven Rostedt (VMware)
590d697963 sched: Force the address order of each sched class descriptor
In order to make a micro optimization in pick_next_task(), the order of the
sched class descriptor address must be in the same order as their priority
to each other. That is:

 &idle_sched_class < &fair_sched_class < &rt_sched_class <
 &dl_sched_class < &stop_sched_class

In order to guarantee this order of the sched class descriptors, add each
one into their own data section and force the order in the linker script.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/157675913272.349305.8936736338884044103.stgit@localhost.localdomain
2020-06-25 13:45:43 +02:00
Linus Torvalds
cb8e59cc87 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

 2) Add GSO partial support to igc, from Sasha Neftin.

 3) Several cleanups and improvements to r8169 from Heiner Kallweit.

 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

 5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin.

 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

 8) Add sriov and vf support to hinic, from Luo bin.

 9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

21) Add BPF iterators, from Yonghang Song.

22) Add cable test infrastructure, including ethool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
  selftests: net: ip_defrag: ignore EPERM
  net_failover: fixed rollback in net_failover_open()
  Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
  Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
  vmxnet3: allow rx flow hash ops only when rss is enabled
  hinic: add set_channels ethtool_ops support
  selftests/bpf: Add a default $(CXX) value
  tools/bpf: Don't use $(COMPILE.c)
  bpf, selftests: Use bpf_probe_read_kernel
  s390/bpf: Use bcr 0,%0 as tail call nop filler
  s390/bpf: Maintain 8-byte stack alignment
  selftests/bpf: Fix verifier test
  selftests/bpf: Fix sample_cnt shared between two threads
  bpf, selftests: Adapt cls_redirect to call csum_level helper
  bpf: Add csum_level helper for fixing up csum levels
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
  crypto/chtls: IPv6 support for inline TLS
  Crypto/chcr: Fixes a coccinile check error
  Crypto/chcr: Fixes compilations warnings
  ...
2020-06-03 16:27:18 -07:00
Huaixin Chang
d505b8af58 sched: Defend cfs and rt bandwidth quota against overflow
When users write some huge number into cpu.cfs_quota_us or
cpu.rt_runtime_us, overflow might happen during to_ratio() shifts of
schedulable checks.

to_ratio() could be altered to avoid unnecessary internal overflow, but
min_cfs_quota_period is less than 1 << BW_SHIFT, so a cutoff would still
be needed. Set a cap MAX_BW for cfs_quota_us and rt_runtime_us to
prevent overflow.

Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ben Segall <bsegall@google.com>
Link: https://lkml.kernel.org/r/20200425105248.60093-1-changhuaixin@linux.alibaba.com
2020-05-19 20:34:14 +02:00
Christoph Hellwig
32927393dc sysctl: pass kernel pointers to ->proc_handler
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from  userspace in common code.  This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.

As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-27 02:07:40 -04:00
Qais Yousef
d94a9df490 sched/rt: Remove unnecessary push for unfit tasks
In task_woken_rt() and switched_to_rto() we try trigger push-pull if the
task is unfit.

But the logic is found lacking because if the task was the only one
running on the CPU, then rt_rq is not in overloaded state and won't
trigger a push.

The necessity of this logic was under a debate as well, a summary of
the discussion can be found in the following thread:

  https://lore.kernel.org/lkml/20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com/

Remove the logic for now until a better approach is agreed upon.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-6-qais.yousef@arm.com
2020-03-06 12:57:29 +01:00
Qais Yousef
98ca645f82 sched/rt: Allow pulling unfitting task
When implemented RT Capacity Awareness; the logic was done such that if
a task was running on a fitting CPU, then it was sticky and we would try
our best to keep it there.

But as Steve suggested, to adhere to the strict priority rules of RT
class; allow pulling an RT task to unfitting CPU to ensure it gets a
chance to run ASAP.

LINK: https://lore.kernel.org/lkml/20200203111451.0d1da58f@oasis.local.home/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-5-qais.yousef@arm.com
2020-03-06 12:57:28 +01:00
Qais Yousef
a1bd02e1f2 sched/rt: Optimize cpupri_find() on non-heterogenous systems
By introducing a new cpupri_find_fitness() function that takes the
fitness_fn as an argument and only called when asym_system static key is
enabled.

cpupri_find() is now a wrapper function that calls cpupri_find_fitness()
passing NULL as a fitness_fn, hence disabling the logic that handles
fitness by default.

LINK: https://lore.kernel.org/lkml/c0772fca-0a4b-c88d-fdf2-5715fcf8447b@arm.com/
Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-4-qais.yousef@arm.com
2020-03-06 12:57:27 +01:00
Qais Yousef
b28bc1e002 sched/rt: Re-instate old behavior in select_task_rq_rt()
When RT Capacity Aware support was added, the logic in select_task_rq_rt
was modified to force a search for a fitting CPU if the task currently
doesn't run on one.

But if the search failed, and the search was only triggered to fulfill
the fitness request; we could end up selecting a new CPU unnecessarily.

Fix this and re-instate the original behavior by ensuring we bail out
in that case.

This behavior change only affected asymmetric systems that are using
util_clamp to implement capacity aware. None asymmetric systems weren't
affected.

LINK: https://lore.kernel.org/lkml/20200218041620.GD28029@codeaurora.org/
Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-3-qais.yousef@arm.com
2020-03-06 12:57:27 +01:00
Konstantin Khlebnikov
b4fb015eef sched/rt: Optimize checking group RT scheduler constraints
Group RT scheduler contains protection against setting zero runtime for
cgroup with RT tasks. Right now function tg_set_rt_bandwidth() iterates
over all CPU cgroups and calls tg_has_rt_tasks() for any cgroup which
runtime is zero (not only for changed one). Default RT runtime is zero,
thus tg_has_rt_tasks() will is called for almost at CPU cgroups.

This protection already is slightly racy: runtime limit could be changed
between cpu_cgroup_can_attach() and cpu_cgroup_attach() because changing
cgroup attribute does not lock cgroup_mutex while attach does not lock
rt_constraints_mutex. Changing task scheduler class also races with
changing rt runtime: check in __sched_setscheduler() isn't protected.

Function tg_has_rt_tasks() iterates over all threads in the system.
This gives NR_CGROUPS * NR_TASKS operations under single tasklist_lock
locked for read tg_set_rt_bandwidth(). Any concurrent attempt of locking
tasklist_lock for write (for example fork) will stuck with disabled irqs.

This patch makes two optimizations:
1) Remove locking tasklist_lock and iterate only tasks in cgroup
2) Call tg_has_rt_tasks() iff rt runtime changes from non-zero to zero

All changed code is under CONFIG_RT_GROUP_SCHED.

Testcase:

 # mkdir /sys/fs/cgroup/cpu/test{1..10000}
 # echo 0 | tee /sys/fs/cgroup/cpu/test*/cpu.rt_runtime_us

At the same time without patch fork time will be >100ms:

 # perf trace -e clone --duration 100 stress-ng --fork 1

Also remote ping will show timings >100ms caused by irq latency.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/157996383820.4651.11292439232549211693.stgit@buzz
2020-01-28 21:37:09 +01:00
Qais Yousef
804d402fb6 sched/rt: Make RT capacity-aware
Capacity Awareness refers to the fact that on heterogeneous systems
(like Arm big.LITTLE), the capacity of the CPUs is not uniform, hence
when placing tasks we need to be aware of this difference of CPU
capacities.

In such scenarios we want to ensure that the selected CPU has enough
capacity to meet the requirement of the running task. Enough capacity
means here that capacity_orig_of(cpu) >= task.requirement.

The definition of task.requirement is dependent on the scheduling class.

For CFS, utilization is used to select a CPU that has >= capacity value
than the cfs_task.util.

	capacity_orig_of(cpu) >= cfs_task.util

DL isn't capacity aware at the moment but can make use of the bandwidth
reservation to implement that in a similar manner CFS uses utilization.
The following patchset implements that:

https://lore.kernel.org/lkml/20190506044836.2914-1-luca.abeni@santannapisa.it/

	capacity_orig_of(cpu)/SCHED_CAPACITY >= dl_deadline/dl_runtime

For RT we don't have a per task utilization signal and we lack any
information in general about what performance requirement the RT task
needs. But with the introduction of uclamp, RT tasks can now control
that by setting uclamp_min to guarantee a minimum performance point.

ATM the uclamp value are only used for frequency selection; but on
heterogeneous systems this is not enough and we need to ensure that the
capacity of the CPU is >= uclamp_min. Which is what implemented here.

	capacity_orig_of(cpu) >= rt_task.uclamp_min

Note that by default uclamp.min is 1024, which means that RT tasks will
always be biased towards the big CPUs, which make for a better more
predictable behavior for the default case.

Must stress that the bias acts as a hint rather than a definite
placement strategy. For example, if all big cores are busy executing
other RT tasks we can't guarantee that a new RT task will be placed
there.

On non-heterogeneous systems the original behavior of RT should be
retained. Similarly if uclamp is not selected in the config.

[ mingo: Minor edits to comments. ]

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191009104611.15363-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-12-25 10:42:10 +01:00
Peter Zijlstra
a0e813f26e sched/core: Further clarify sched_class::set_next_task()
It turns out there really is something special to the first
set_next_task() invocation. In specific the 'change' pattern really
should not cause balance callbacks.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: ktkhai@virtuozzo.com
Cc: mgorman@suse.de
Cc: qais.yousef@arm.com
Cc: qperret@google.com
Cc: rostedt@goodmis.org
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Fixes: f95d4eaee6 ("sched/{rt,deadline}: Fix set_next_task vs pick_next_task")
Link: https://lkml.kernel.org/r/20191108131909.775434698@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-11 08:35:21 +01:00
Peter Zijlstra
98c2f700ed sched/core: Simplify sched_class::pick_next_task()
Now that the indirect class call never uses the last two arguments of
pick_next_task(), remove them.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: ktkhai@virtuozzo.com
Cc: mgorman@suse.de
Cc: qais.yousef@arm.com
Cc: qperret@google.com
Cc: rostedt@goodmis.org
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20191108131909.660595546@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-11 08:35:20 +01:00
Peter Zijlstra
6e2df0581f sched: Fix pick_next_task() vs 'change' pattern race
Commit 67692435c4 ("sched: Rework pick_next_task() slow-path")
inadvertly introduced a race because it changed a previously
unexplored dependency between dropping the rq->lock and
sched_class::put_prev_task().

The comments about dropping rq->lock, in for example
newidle_balance(), only mentions the task being current and ->on_cpu
being set. But when we look at the 'change' pattern (in for example
sched_setnuma()):

	queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
	running = task_current(rq, p); /* rq->curr == p */

	if (queued)
		dequeue_task(...);
	if (running)
		put_prev_task(...);

	/* change task properties */

	if (queued)
		enqueue_task(...);
	if (running)
		set_next_task(...);

It becomes obvious that if we do this after put_prev_task() has
already been called on @p, things go sideways. This is exactly what
the commit in question allows to happen when it does:

	prev->sched_class->put_prev_task(rq, prev, rf);
	if (!rq->nr_running)
		newidle_balance(rq, rf);

The newidle_balance() call will drop rq->lock after we've called
put_prev_task() and that allows the above 'change' pattern to
interleave and mess up the state.

Furthermore, it turns out we lost the RT-pull when we put the last DL
task.

Fix both problems by extracting the balancing from put_prev_task() and
doing a multi-class balance() pass before put_prev_task().

Fixes: 67692435c4 ("sched: Rework pick_next_task() slow-path")
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Quentin Perret <qperret@google.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
2019-11-08 22:34:14 +01:00
Linus Torvalds
7f2444d38f Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core timer updates from Thomas Gleixner:
 "Timers and timekeeping updates:

   - A large overhaul of the posix CPU timer code which is a preparation
     for moving the CPU timer expiry out into task work so it can be
     properly accounted on the task/process.

     An update to the bogus permission checks will come later during the
     merge window as feedback was not complete before heading of for
     travel.

   - Switch the timerqueue code to use cached rbtrees and get rid of the
     homebrewn caching of the leftmost node.

   - Consolidate hrtimer_init() + hrtimer_init_sleeper() calls into a
     single function

   - Implement the separation of hrtimers to be forced to expire in hard
     interrupt context even when PREEMPT_RT is enabled and mark the
     affected timers accordingly.

   - Implement a mechanism for hrtimers and the timer wheel to protect
     RT against priority inversion and live lock issues when a (hr)timer
     which should be canceled is currently executing the callback.
     Instead of infinitely spinning, the task which tries to cancel the
     timer blocks on a per cpu base expiry lock which is held and
     released by the (hr)timer expiry code.

   - Enable the Hyper-V TSC page based sched_clock for Hyper-V guests
     resulting in faster access to timekeeping functions.

   - Updates to various clocksource/clockevent drivers and their device
     tree bindings.

   - The usual small improvements all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
  posix-cpu-timers: Fix permission check regression
  posix-cpu-timers: Always clear head pointer on dequeue
  hrtimer: Add a missing bracket and hide `migration_base' on !SMP
  posix-cpu-timers: Make expiry_active check actually work correctly
  posix-timers: Unbreak CONFIG_POSIX_TIMERS=n build
  tick: Mark sched_timer to expire in hard interrupt context
  hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARD
  x86/hyperv: Hide pv_ops access for CONFIG_PARAVIRT=n
  posix-cpu-timers: Utilize timerqueue for storage
  posix-cpu-timers: Move state tracking to struct posix_cputimers
  posix-cpu-timers: Deduplicate rlimit handling
  posix-cpu-timers: Remove pointless comparisons
  posix-cpu-timers: Get rid of 64bit divisions
  posix-cpu-timers: Consolidate timer expiry further
  posix-cpu-timers: Get rid of zero checks
  rlimit: Rewrite non-sensical RLIMIT_CPU comment
  posix-cpu-timers: Respect INFINITY for hard RTTIME limit
  posix-cpu-timers: Switch thread group sampling to array
  posix-cpu-timers: Restructure expiry array
  posix-cpu-timers: Remove cputime_expires
  ...
2019-09-17 12:35:15 -07:00
Thomas Gleixner
3a245c0f11 posix-cpu-timers: Move expiry cache into struct posix_cputimers
The expiry cache belongs into the posix_cputimers container where the other
cpu timers information is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20190821192921.014444012@linutronix.de
2019-08-28 11:50:35 +02:00
Peter Zijlstra
67692435c4 sched: Rework pick_next_task() slow-path
Avoid the RETRY_TASK case in the pick_next_task() slow path.

By doing the put_prev_task() early, we get the rt/deadline pull done,
and by testing rq->nr_running we know if we need newidle_balance().

This then gives a stable state to pick a task from.

Since the fast-path is fair only; it means the other classes will
always have pick_next_task(.prev=NULL, .rf=NULL) and we can simplify.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lwe@gmail.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: mingo@kernel.org
Cc: Phil Auld <pauld@redhat.com>
Cc: Julien Desfossez <jdesfossez@digitalocean.com>
Cc: Nishanth Aravamudan <naravamudan@digitalocean.com>
Link: https://lkml.kernel.org/r/aa34d24b36547139248f32a30138791ac6c02bd6.1559129225.git.vpillai@digitalocean.com
2019-08-08 09:09:31 +02:00
Peter Zijlstra
5f2a45fc9e sched: Allow put_prev_task() to drop rq->lock
Currently the pick_next_task() loop is convoluted and ugly because of
how it can drop the rq->lock and needs to restart the picking.

For the RT/Deadline classes, it is put_prev_task() where we do
balancing, and we could do this before the picking loop. Make this
possible.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Aaron Lu <aaron.lwe@gmail.com>
Cc: mingo@kernel.org
Cc: Phil Auld <pauld@redhat.com>
Cc: Julien Desfossez <jdesfossez@digitalocean.com>
Cc: Nishanth Aravamudan <naravamudan@digitalocean.com>
Link: https://lkml.kernel.org/r/e4519f6850477ab7f3d257062796e6425ee4ba7c.1559129225.git.vpillai@digitalocean.com
2019-08-08 09:09:31 +02:00
Peter Zijlstra
03b7fad167 sched: Add task_struct pointer to sched_class::set_curr_task
In preparation of further separating pick_next_task() and
set_curr_task() we have to pass the actual task into it, while there,
rename the thing to better pair with put_prev_task().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lwe@gmail.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: mingo@kernel.org
Cc: Phil Auld <pauld@redhat.com>
Cc: Julien Desfossez <jdesfossez@digitalocean.com>
Cc: Nishanth Aravamudan <naravamudan@digitalocean.com>
Link: https://lkml.kernel.org/r/a96d1bcdd716db4a4c5da2fece647a1456c0ed78.1559129225.git.vpillai@digitalocean.com
2019-08-08 09:09:31 +02:00
Peter Zijlstra
f95d4eaee6 sched/{rt,deadline}: Fix set_next_task vs pick_next_task
Because pick_next_task() implies set_curr_task() and some of the
details haven't mattered too much, some of what _should_ be in
set_curr_task() ended up in pick_next_task, correct this.

This prepares the way for a pick_next_task() variant that does not
affect the current state; allowing remote picking.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lwe@gmail.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: mingo@kernel.org
Cc: Phil Auld <pauld@redhat.com>
Cc: Julien Desfossez <jdesfossez@digitalocean.com>
Cc: Nishanth Aravamudan <naravamudan@digitalocean.com>
Link: https://lkml.kernel.org/r/38c61d5240553e043c27c5e00b9dd0d184dd6081.1559129225.git.vpillai@digitalocean.com
2019-08-08 09:09:30 +02:00
Sebastian Andrzej Siewior
d5096aa65a sched: Mark hrtimers to expire in hard interrupt context
The scheduler related hrtimers need to expire in hard interrupt context
even on PREEMPT_RT enabled kernels. Mark then as such.

No functional change.

[ tglx: Split out from larger combo patch. Add changelog. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185753.077004842@linutronix.de
2019-08-01 20:51:19 +02:00
Patrick Bellasi
982d9cdc22 sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks
Each time a frequency update is required via schedutil, a frequency is
selected to (possibly) satisfy the utilization reported by each
scheduling class and irqs. However, when utilization clamping is in use,
the frequency selection should consider userspace utilization clamping
hints.  This will allow, for example, to:

 - boost tasks which are directly affecting the user experience
   by running them at least at a minimum "requested" frequency

 - cap low priority tasks not directly affecting the user experience
   by running them only up to a maximum "allowed" frequency

These constraints are meant to support a per-task based tuning of the
frequency selection thus supporting a fine grained definition of
performance boosting vs energy saving strategies in kernel space.

Add support to clamp the utilization of RUNNABLE FAIR and RT tasks
within the boundaries defined by their aggregated utilization clamp
constraints.

Do that by considering the max(min_util, max_util) to give boosted tasks
the performance they need even when they happen to be co-scheduled with
other capped tasks.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alessio Balsini <balsini@android.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lkml.kernel.org/r/20190621084217.8167-10-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:23:48 +02:00
Sebastian Andrzej Siewior
3bd3706251 sched/core: Provide a pointer to the valid CPU mask
In commit:

  4b53a3412d ("sched/core: Remove the tsk_nr_cpus_allowed() wrapper")

the tsk_nr_cpus_allowed() wrapper was removed. There was not
much difference in !RT but in RT we used this to implement
migrate_disable(). Within a migrate_disable() section the CPU mask is
restricted to single CPU while the "normal" CPU mask remains untouched.

As an alternative implementation Ingo suggested to use:

	struct task_struct {
		const cpumask_t		*cpus_ptr;
		cpumask_t		cpus_mask;
        };
with
	t->cpus_ptr = &t->cpus_mask;

In -RT we then can switch the cpus_ptr to:

	t->cpus_ptr = &cpumask_of(task_cpu(p));

in a migration disabled region. The rules are simple:

 - Code that 'uses' ->cpus_allowed would use the pointer.
 - Code that 'modifies' ->cpus_allowed would use the direct mask.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190423142636.14347-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-03 11:49:37 +02:00
Konstantin Khlebnikov
1a010e29cf sched/rt: Check integer overflow at usec to nsec conversion
Example of unhandled overflows:

 # echo 18446744073709651 > cpu.rt_runtime_us
 # cat cpu.rt_runtime_us
 99

 # echo 18446744073709900 > cpu.rt_period_us
 # cat cpu.rt_period_us
 348

After this patch they will fail with -EINVAL.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/155125501739.293431.5252197504404771496.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 13:42:09 +02:00
Vincent Guittot
2312729688 sched/fair: Update scale invariance of PELT
The current implementation of load tracking invariance scales the
contribution with current frequency and uarch performance (only for
utilization) of the CPU. One main result of this formula is that the
figures are capped by current capacity of CPU. Another one is that the
load_avg is not invariant because not scaled with uarch.

The util_avg of a periodic task that runs r time slots every p time slots
varies in the range :

    U * (1-y^r)/(1-y^p) * y^i < Utilization < U * (1-y^r)/(1-y^p)

with U is the max util_avg value = SCHED_CAPACITY_SCALE

At a lower capacity, the range becomes:

    U * C * (1-y^r')/(1-y^p) * y^i' < Utilization <  U * C * (1-y^r')/(1-y^p)

with C reflecting the compute capacity ratio between current capacity and
max capacity.

so C tries to compensate changes in (1-y^r') but it can't be accurate.

Instead of scaling the contribution value of PELT algo, we should scale the
running time. The PELT signal aims to track the amount of computation of
tasks and/or rq so it seems more correct to scale the running time to
reflect the effective amount of computation done since the last update.

In order to be fully invariant, we need to apply the same amount of
running time and idle time whatever the current capacity. Because running
at lower capacity implies that the task will run longer, we have to ensure
that the same amount of idle time will be applied when system becomes idle
and no idle time has been "stolen". But reaching the maximum utilization
value (SCHED_CAPACITY_SCALE) means that the task is seen as an
always-running task whatever the capacity of the CPU (even at max compute
capacity). In this case, we can discard this "stolen" idle times which
becomes meaningless.

In order to achieve this time scaling, a new clock_pelt is created per rq.
The increase of this clock scales with current capacity when something
is running on rq and synchronizes with clock_task when rq is idle. With
this mechanism, we ensure the same running and idle time whatever the
current capacity. This also enables to simplify the pelt algorithm by
removing all references of uarch and frequency and applying the same
contribution to utilization and loads. Furthermore, the scaling is done
only once per update of clock (update_rq_clock_task()) instead of during
each update of sched_entities and cfs/rt/dl_rq of the rq like the current
implementation. This is interesting when cgroup are involved as shown in
the results below:

On a hikey (octo Arm64 platform).
Performance cpufreq governor and only shallowest c-state to remove variance
generated by those power features so we only track the impact of pelt algo.

each test runs 16 times:

	./perf bench sched pipe
	(higher is better)
	kernel	tip/sched/core     + patch
	        ops/seconds        ops/seconds         diff
	cgroup
	root    59652(+/- 0.18%)   59876(+/- 0.24%)    +0.38%
	level1  55608(+/- 0.27%)   55923(+/- 0.24%)    +0.57%
	level2  52115(+/- 0.29%)   52564(+/- 0.22%)    +0.86%

	hackbench -l 1000
	(lower is better)
	kernel	tip/sched/core     + patch
	        duration(sec)      duration(sec)        diff
	cgroup
	root    4.453(+/- 2.37%)   4.383(+/- 2.88%)     -1.57%
	level1  4.859(+/- 8.50%)   4.830(+/- 7.07%)     -0.60%
	level2  5.063(+/- 9.83%)   4.928(+/- 9.66%)     -2.66%

Then, the responsiveness of PELT is improved when CPU is not running at max
capacity with this new algorithm. I have put below some examples of
duration to reach some typical load values according to the capacity of the
CPU with current implementation and with this patch. These values has been
computed based on the geometric series and the half period value:

  Util (%)     max capacity  half capacity(mainline)  half capacity(w/ patch)
  972 (95%)    138ms         not reachable            276ms
  486 (47.5%)  30ms          138ms                     60ms
  256 (25%)    13ms           32ms                     26ms

On my hikey (octo Arm64 platform) with schedutil governor, the time to
reach max OPP when starting from a null utilization, decreases from 223ms
with current scale invariance down to 121ms with the new algorithm.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: patrick.bellasi@arm.com
Cc: pjt@google.com
Cc: pkondeti@codeaurora.org
Cc: quentin.perret@arm.com
Cc: rjw@rjwysocki.net
Cc: srinivas.pandruvada@linux.intel.com
Cc: thara.gopinath@linaro.org
Link: https://lkml.kernel.org/r/1548257214-13745-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 09:13:21 +01:00
Yangtao Li
9ebc605381 sched/core: Remove unnecessary unlikely() in push_*_task()
WARN_ON() already contains an unlikely(), so it's not necessary to
use WARN_ON(1).

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181103172602.1917-1-tiny.windzz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-11 15:16:57 +01:00
Muchun Song
ff1cdc94de sched/core: Introduce set_next_task() helper for better code readability
When we pick the next task, we will do the following for the task:

  1) p->se.exec_start = rq_clock_task(rq);
  2) dequeue_pushable(_dl)_task(rq, p);

When we call set_curr_task(), we also need to do the same thing
above. In rt.c, the code at 1) is in the _pick_next_task_rt()
and the code at 2) is in the pick_next_task_rt(). If we put two
operations in one function, maybe better. So, we introduce a new
function set_next_task(), which is responsible for doing the above.

By introducing the function we can get rid of calling the
dequeue_pushable(_dl)_task() directly(We can call set_next_task())
in pick_next_task() and have better code readability and reuse.
In set_curr_task_rt(), we also can call set_next_task().

Do this things such that we end up with:

  static struct task_struct *pick_next_task(struct rq *rq,
  					    struct task_struct *prev,
  					    struct rq_flags *rf)
  {
  	/* do something else ... */

  	put_prev_task(rq, prev);

  	/* pick next task p */

  	set_next_task(rq, p);

  	/* do something else ... */
  }

put_prev_task() can match set_next_task(), which can make the
code more readable.

Signed-off-by: Muchun Song <smuchun@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181026131743.21786-1-smuchun@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-04 00:59:24 +01:00
Muchun Song
a68d75081a sched/rt: Update comment in pick_next_task_rt()
Commit:

  f4ebcbc0d7 ("sched/rt: Substract number of tasks of throttled queues from rq->nr_running")

added a new rt_rq->rt_queued field, which is used to indicate the status of
rq->rt enqueue or dequeue. So, the ->rt_nr_running check was removed and we
now check ->rt_queued instead.

Fix the comment in pick_next_task_rt() as well, which was still referencing
the old logic.

Signed-off-by: Muchun Song <smuchun@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181027030517.23292-1-smuchun@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-29 07:18:04 +01:00
Ingo Molnar
4765096f4f Merge branch 'sched/urgent' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-25 11:29:58 +02:00
Hailong Liu
f3d133ee0a sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE
NO_RT_RUNTIME_SHARE feature is used to prevent a CPU borrow enough
runtime with a spin-rt-task.

However, if RT_RUNTIME_SHARE feature is enabled and rt_rq has borrowd
enough rt_runtime at the beginning, rt_runtime can't be restored to
its initial bandwidth rt_runtime after we disable RT_RUNTIME_SHARE.

E.g. on my PC with 4 cores, procedure to reproduce:
1) Make sure  RT_RUNTIME_SHARE is enabled
 cat /sys/kernel/debug/sched_features
  GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY
  CACHE_HOT_BUDDY WAKEUP_PREEMPTION NO_HRTICK NO_DOUBLE_TICK
  LB_BIAS NONTASK_CAPACITY TTWU_QUEUE NO_SIS_AVG_CPU SIS_PROP
  NO_WARN_DOUBLE_CLOCK RT_PUSH_IPI RT_RUNTIME_SHARE NO_LB_MIN
  ATTACH_AGE_LOAD WA_IDLE WA_WEIGHT WA_BIAS
2) Start a spin-rt-task
 ./loop_rr &
3) set affinity to the last cpu
 taskset -p 8 $pid_of_loop_rr
4) Observe that last cpu have borrowed enough runtime.
 cat /proc/sched_debug | grep rt_runtime
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 900.000000
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 1000.000000
5) Disable RT_RUNTIME_SHARE
 echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features
6) Observe that rt_runtime can not been restored
 cat /proc/sched_debug | grep rt_runtime
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 900.000000
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 1000.000000

This patch help to restore rt_runtime after we disable
RT_RUNTIME_SHARE.

Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/1531874815-39357-1-git-send-email-liu.hailong6@zte.com.cn
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-25 11:29:08 +02:00
Vincent Guittot
523e979d31 sched/core: Use PELT for scale_rt_capacity()
The utilization of the CPU by RT, DL and IRQs are now tracked with
PELT so we can use these metrics instead of rt_avg to evaluate the remaining
capacity available for CFS class.

scale_rt_capacity() behavior has been changed and now returns the remaining
capacity available for CFS instead of a scaling factor because RT, DL and
IRQ provide now absolute utilization value.

The same formula as schedutil is used:

  IRQ util_avg + (1 - IRQ util_avg / max capacity ) * /Sum rq util_avg

but the implementation is different because it doesn't return the same value
and doesn't benefit of the same optimization.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: claudio@evidence.eu.com
Cc: daniel.lezcano@linaro.org
Cc: dietmar.eggemann@arm.com
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: luca.abeni@santannapisa.it
Cc: patrick.bellasi@arm.com
Cc: quentin.perret@arm.com
Cc: rjw@rjwysocki.net
Cc: valentin.schneider@arm.com
Cc: viresh.kumar@linaro.org
Link: http://lkml.kernel.org/r/1530200714-4504-10-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-16 00:16:25 +02:00
Vincent Guittot
371bf42732 sched/rt: Add rt_rq utilization tracking
schedutil governor relies on cfs_rq's util_avg to choose the OPP when CFS
tasks are running. When the CPU is overloaded by CFS and RT tasks, CFS tasks
are preempted by RT tasks and in this case util_avg reflects the remaining
capacity but not what CFS want to use. In such case, schedutil can select a
lower OPP whereas the CPU is overloaded. In order to have a more accurate
view of the utilization of the CPU, we track the utilization of RT tasks.
Only util_avg is correctly tracked but not load_avg and runnable_load_avg
which are useless for rt_rq.

rt_rq uses rq_clock_task and cfs_rq uses cfs_rq_clock_task but they are
the same at the root group level, so the PELT windows of the util_sum are
aligned.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: claudio@evidence.eu.com
Cc: daniel.lezcano@linaro.org
Cc: dietmar.eggemann@arm.com
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: luca.abeni@santannapisa.it
Cc: patrick.bellasi@arm.com
Cc: quentin.perret@arm.com
Cc: rjw@rjwysocki.net
Cc: valentin.schneider@arm.com
Cc: viresh.kumar@linaro.org
Link: http://lkml.kernel.org/r/1530200714-4504-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-15 23:51:20 +02:00
Vincent Guittot
296b2ffe7f sched/rt: Fix call to cpufreq_update_util()
With commit:

  8f111bc357 ("cpufreq/schedutil: Rewrite CPUFREQ_RT support")

the schedutil governor uses rq->rt.rt_nr_running to detect whether an
RT task is currently running on the CPU and to set frequency to max
if necessary.

cpufreq_update_util() is called in enqueue/dequeue_top_rt_rq() but
rq->rt.rt_nr_running has not been updated yet when dequeue_top_rt_rq() is
called so schedutil still considers that an RT task is running when the
last task is dequeued. The update of rq->rt.rt_nr_running happens later
in dequeue_rt_stack().

In fact, we can take advantage of the sequence that the dequeue then
re-enqueue rt entities when a rt task is enqueued or dequeued;
As a result enqueue_top_rt_rq() is always called when a task is
enqueued or dequeued and also when groups are throttled or unthrottled.
The only place that not use enqueue_top_rt_rq() is when root rt_rq is
throttled.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: juri.lelli@redhat.com
Cc: patrick.bellasi@arm.com
Cc: viresh.kumar@linaro.org
Fixes: 8f111bc357 ('cpufreq/schedutil: Rewrite CPUFREQ_RT support')
Link: http://lkml.kernel.org/r/1530021202-21695-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-03 09:17:28 +02:00
Kees Cook
6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Mathieu Malaterre
f6a3463063 sched/debug: Move the print_rt_rq() and print_dl_rq() declarations to kernel/sched/sched.h
In the following commit:

  6b55c9654f ("sched/debug: Move print_cfs_rq() declaration to kernel/sched/sched.h")

the print_cfs_rq() prototype was added to <kernel/sched/sched.h>,
right next to the prototypes for print_cfs_stats(), print_rt_stats()
and print_dl_stats().

Finish this previous commit and also move related prototypes for
print_rt_rq() and print_dl_rq().

Remove existing extern declarations now that they not needed anymore.

Silences the following GCC warning, triggered by W=1:

  kernel/sched/debug.c:573:6: warning: no previous prototype for ‘print_rt_rq’ [-Wmissing-prototypes]
  kernel/sched/debug.c:603:6: warning: no previous prototype for ‘print_dl_rq’ [-Wmissing-prototypes]

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180516195348.30426-1-malat@debian.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-18 09:05:14 +02:00