Commit Graph

72 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
5dadf6321c Merge 5.10.111 into android12-5.10-lts
Changes in 5.10.111
	ubifs: Rectify space amount budget for mkdir/tmpfile operations
	gfs2: Check for active reservation in gfs2_release
	gfs2: Fix gfs2_release for non-writers regression
	gfs2: gfs2_setattr_size error path fix
	rtc: wm8350: Handle error for wm8350_register_irq
	KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
	KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
	drm: Add orientation quirk for GPD Win Max
	ath5k: fix OOB in ath5k_eeprom_read_pcal_info_5111
	drm/amd/display: Add signal type check when verify stream backends same
	drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj
	usb: gadget: tegra-xudc: Do not program SPARAM
	usb: gadget: tegra-xudc: Fix control endpoint's definitions
	ptp: replace snprintf with sysfs_emit
	powerpc: dts: t104xrdb: fix phy type for FMAN 4/5
	ath11k: fix kernel panic during unload/load ath11k modules
	ath11k: mhi: use mhi_sync_power_up()
	bpf: Make dst_port field in struct bpf_sock 16-bit wide
	scsi: mvsas: Replace snprintf() with sysfs_emit()
	scsi: bfa: Replace snprintf() with sysfs_emit()
	power: supply: axp20x_battery: properly report current when discharging
	mt76: dma: initialize skip_unmap in mt76_dma_rx_fill
	cfg80211: don't add non transmitted BSS to 6GHz scanned channels
	libbpf: Fix build issue with llvm-readelf
	ipv6: make mc_forwarding atomic
	powerpc: Set crashkernel offset to mid of RMA region
	drm/amdgpu: Fix recursive locking warning
	PCI: aardvark: Fix support for MSI interrupts
	iommu/arm-smmu-v3: fix event handling soft lockup
	usb: ehci: add pci device support for Aspeed platforms
	PCI: endpoint: Fix alignment fault error in copy tests
	tcp: Don't acquire inet_listen_hashbucket::lock with disabled BH.
	PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
	power: supply: axp288-charger: Set Vhold to 4.4V
	iwlwifi: mvm: Correctly set fragmented EBS
	ipv4: Invalidate neighbour for broadcast address upon address addition
	dm ioctl: prevent potential spectre v1 gadget
	dm: requeue IO if mapping table not yet available
	drm/amdkfd: make CRAT table missing message informational only
	scsi: pm8001: Fix pm80xx_pci_mem_copy() interface
	scsi: pm8001: Fix pm8001_mpi_task_abort_resp()
	scsi: pm8001: Fix task leak in pm8001_send_abort_all()
	scsi: pm8001: Fix tag leaks on error
	scsi: pm8001: Fix memory leak in pm8001_chip_fw_flash_update_req()
	mt76: mt7615: Fix assigning negative values to unsigned variable
	scsi: aha152x: Fix aha152x_setup() __setup handler return value
	scsi: hisi_sas: Free irq vectors in order for v3 HW
	net/smc: correct settings of RMB window update limit
	mips: ralink: fix a refcount leak in ill_acc_of_setup()
	macvtap: advertise link netns via netlink
	tuntap: add sanity checks about msg_controllen in sendmsg
	Bluetooth: Fix not checking for valid hdev on bt_dev_{info,warn,err,dbg}
	Bluetooth: use memset avoid memory leaks
	bnxt_en: Eliminate unintended link toggle during FW reset
	PCI: endpoint: Fix misused goto label
	MIPS: fix fortify panic when copying asm exception handlers
	powerpc/secvar: fix refcount leak in format_show()
	scsi: libfc: Fix use after free in fc_exch_abts_resp()
	can: isotp: set default value for N_As to 50 micro seconds
	net: account alternate interface name memory
	net: limit altnames to 64k total
	net: sfp: add 2500base-X quirk for Lantech SFP module
	usb: dwc3: omap: fix "unbalanced disables for smps10_out1" on omap5evm
	xtensa: fix DTC warning unit_address_format
	MIPS: ingenic: correct unit node address
	Bluetooth: Fix use after free in hci_send_acl
	netlabel: fix out-of-bounds memory accesses
	ceph: fix memory leak in ceph_readdir when note_last_dentry returns error
	init/main.c: return 1 from handled __setup() functions
	minix: fix bug when opening a file with O_DIRECT
	clk: si5341: fix reported clk_rate when output divider is 2
	staging: vchiq_core: handle NULL result of find_service_by_handle
	phy: amlogic: meson8b-usb2: Use dev_err_probe()
	staging: wfx: fix an error handling in wfx_init_common()
	w1: w1_therm: fixes w1_seq for ds28ea00 sensors
	NFSv4.2: fix reference count leaks in _nfs42_proc_copy_notify()
	NFSv4: Protect the state recovery thread against direct reclaim
	xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
	clk: ti: Preserve node in ti_dt_clocks_register()
	clk: Enforce that disjoints limits are invalid
	SUNRPC/call_alloc: async tasks mustn't block waiting for memory
	SUNRPC/xprt: async tasks mustn't block waiting for memory
	SUNRPC: remove scheduling boost for "SWAPPER" tasks.
	NFS: swap IO handling is slightly different for O_DIRECT IO
	NFS: swap-out must always use STABLE writes.
	x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy
	serial: samsung_tty: do not unlock port->lock for uart_write_wakeup()
	virtio_console: eliminate anonymous module_init & module_exit
	jfs: prevent NULL deref in diFree
	SUNRPC: Fix socket waits for write buffer space
	NFS: nfsiod should not block forever in mempool_alloc()
	NFS: Avoid writeback threads getting stuck in mempool_alloc()
	parisc: Fix CPU affinity for Lasi, WAX and Dino chips
	parisc: Fix patch code locking and flushing
	mm: fix race between MADV_FREE reclaim and blkdev direct IO read
	Revert "hv: utils: add PTP_1588_CLOCK to Kconfig to fix build"
	drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire()
	Drivers: hv: vmbus: Fix potential crash on module unload
	Revert "NFSv4: Handle the special Linux file open access mode"
	NFSv4: fix open failure with O_ACCMODE flag
	scsi: zorro7xx: Fix a resource leak in zorro7xx_remove_one()
	net/tls: fix slab-out-of-bounds bug in decrypt_internal
	ice: Clear default forwarding VSI during VSI release
	net: ipv4: fix route with nexthop object delete warning
	net: stmmac: Fix unset max_speed difference between DT and non-DT platforms
	drm/imx: imx-ldb: Check for null pointer after calling kmemdup
	drm/imx: Fix memory leak in imx_pd_connector_get_modes
	bnxt_en: reserve space inside receive page for skb_shared_info
	sfc: Do not free an empty page_ring
	RDMA/mlx5: Don't remove cache MRs when a delay is needed
	IB/rdmavt: add lock to call to rvt_error_qp to prevent a race condition
	dpaa2-ptp: Fix refcount leak in dpaa2_ptp_probe
	ice: Set txq_teid to ICE_INVAL_TEID on ring creation
	ice: Do not skip not enabled queues in ice_vc_dis_qs_msg
	ipv6: Fix stats accounting in ip6_pkt_drop
	ice: synchronize_rcu() when terminating rings
	net: openvswitch: don't send internal clone attribute to the userspace.
	net: openvswitch: fix leak of nested actions
	rxrpc: fix a race in rxrpc_exit_net()
	net: phy: mscc-miim: reject clause 45 register accesses
	qede: confirm skb is allocated before using
	spi: bcm-qspi: fix MSPI only access with bcm_qspi_exec_mem_op()
	bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
	drbd: Fix five use after free bugs in get_initial_state
	io_uring: don't touch scm_fp_list after queueing skb
	SUNRPC: Handle ENOMEM in call_transmit_status()
	SUNRPC: Handle low memory situations in call_status()
	SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec()
	iommu/omap: Fix regression in probe for NULL pointer dereference
	perf: arm-spe: Fix perf report --mem-mode
	perf tools: Fix perf's libperf_print callback
	perf session: Remap buf if there is no space for event
	arm64: Add part number for Arm Cortex-A78AE
	Revert "mmc: sdhci-xenon: fix annoying 1.8V regulator warning"
	mmc: mmci: stm32: correctly check all elements of sg list
	mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete
	lz4: fix LZ4_decompress_safe_partial read out of bound
	mm/mremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
	mm/mempolicy: fix mpol_new leak in shared_policy_replace
	io_uring: fix race between timeout flush and removal
	x86/pm: Save the MSR validity status at context setup
	x86/speculation: Restore speculation related MSRs during S3 resume
	btrfs: fix qgroup reserve overflow the qgroup limit
	btrfs: prevent subvol with swapfile from being deleted
	arm64: patch_text: Fixup last cpu should be master
	RDMA/hfi1: Fix use-after-free bug for mm struct
	gpio: Restrict usage of GPIO chip irq members before initialization
	ata: sata_dwc_460ex: Fix crash due to OOB write
	perf: qcom_l2_pmu: fix an incorrect NULL check on list iterator
	irqchip/gic-v3: Fix GICR_CTLR.RWP polling
	drm/amdgpu/smu10: fix SoC/fclk units in auto mode
	drm/nouveau/pmu: Add missing callbacks for Tegra devices
	drm/amdkfd: Create file descriptor after client is added to smi_clients list
	perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
	perf python: Fix probing for some clang command line options
	tools build: Filter out options and warnings not supported by clang
	tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
	dmaengine: Revert "dmaengine: shdma: Fix runtime PM imbalance on error"
	ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
	mm: don't skip swap entry even if zap_details specified
	cgroup: Use open-time credentials for process migraton perm checks
	selftests/cgroup: Fix build on older distros
	selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
	selftests: cgroup: Test open-time credential usage for migration checks
	selftests: cgroup: Test open-time cgroup namespace usage for migration checks
	arm64: module: remove (NOLOAD) from linker script
	Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
	irqchip/gic, gic-v3: Prevent GSI to SGI translations
	mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
	powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit
	Linux 5.10.111

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9b4c1d30ae226b865494df03d871db2a2b9281c7
2022-04-21 14:27:41 +02:00
Tejun Heo
4665722d36 cgroup: Use open-time credentials for process migraton perm checks
commit 1756d7994ad85c2479af6ae5a9750b92324685af upstream.

cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes both the cgroup2 and cgroup1 process migration interfaces
use the credentials saved at the time of open (file->f_cred) instead of
current's.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Fixes: 187fe84067 ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy")
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
[OP: apply original __cgroup_procs_write() changes to cgroup_threads_write()
and cgroup_procs_write(), as the refactoring commit da70862efe006 ("cgroup:
cgroup.{procs,threads} factor out common parts") is not present in 5.10-stable]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-13 21:01:10 +02:00
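The difference between write-time and open-time credential checks in commit 4665722d36 can be illustrated with a small userspace model. This is only a sketch: `struct model_cred`, `struct model_file`, the euid-based policy, and every function name below are hypothetical stand-ins, not the kernel's types or its actual permission logic. Only the idea of consulting `f_cred` instead of the writer's credentials comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel objects; not the real kernel types. */
struct model_cred {
    unsigned int euid;
};

struct model_file {
    const struct model_cred *f_cred; /* credentials captured at open() time */
};

/* Toy policy (an assumption): only euid 0 may migrate processes. */
bool model_may_migrate(const struct model_cred *c)
{
    return c->euid == 0;
}

/* Old behaviour: the write is judged by whoever performs it. */
bool model_check_with_writer_cred(const struct model_file *f,
                                  const struct model_cred *writer)
{
    (void)f;
    return model_may_migrate(writer);
}

/* New behaviour: the write is judged by whoever opened the file. */
bool model_check_with_open_cred(const struct model_file *f,
                                const struct model_cred *writer)
{
    (void)writer;
    return model_may_migrate(f->f_cred);
}

/* An unprivileged process opens the file and tricks a privileged
 * process into writing a PID to the resulting fd. */
int confused_deputy_demo(void)
{
    struct model_cred unprivileged = { .euid = 1000 };
    struct model_cred privileged = { .euid = 0 };
    struct model_file f = { .f_cred = &unprivileged };

    /* Write-time check: the privileged writer is trusted, so the trick works. */
    assert(model_check_with_writer_cred(&f, &privileged));
    /* Open-time check: the opener's credentials decide, so it is refused. */
    assert(!model_check_with_open_cred(&f, &privileged));
    return 0;
}
```

In this model the outcome depends only on which credential pointer the check consults, which is the change the commit message describes.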
Greg Kroah-Hartman
51790ed529 Merge 5.10.109 into android12-5.10-lts
Changes in 5.10.109
	nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION
	net: ipv6: fix skb_over_panic in __ip6_append_data
	exfat: avoid incorrectly releasing for root inode
	cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
	cgroup: Use open-time cgroup namespace for process migration perm checks
	cgroup-v1: Correct privileges check in release_agent writes
	tpm: Fix error handling in async work
	staging: fbtft: fb_st7789v: reset display before initialization
	llc: fix netdevice reference leaks in llc_ui_bind()
	ASoC: sti: Fix deadlock via snd_pcm_stop_xrun() call
	ALSA: oss: Fix PCM OSS buffer allocation overflow
	ALSA: usb-audio: add mapping for new Corsair Virtuoso SE
	ALSA: hda/realtek: Add quirk for Clevo NP70PNJ
	ALSA: hda/realtek: Add quirk for Clevo NP50PNJ
	ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
	ALSA: hda/realtek: Add quirk for ASUS GA402
	ALSA: pcm: Fix races among concurrent hw_params and hw_free calls
	ALSA: pcm: Fix races among concurrent read/write and buffer changes
	ALSA: pcm: Fix races among concurrent prepare and hw_params/hw_free calls
	ALSA: pcm: Fix races among concurrent prealloc proc writes
	ALSA: pcm: Add stream lock during PCM reset ioctl operations
	ALSA: usb-audio: Add mute TLV for playback volumes on RODE NT-USB
	ALSA: cmipci: Restore aux vol on suspend/resume
	ALSA: pci: fix reading of swapped values from pcmreg in AC97 codec
	drivers: net: xgene: Fix regression in CRC stripping
	netfilter: nf_tables: initialize registers in nft_do_chain()
	ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board
	ACPI: battery: Add device HID and quirk for Microsoft Surface Go 3
	ACPI: video: Force backlight native for Clevo NL5xRU and NL5xNU
	crypto: qat - disable registration of algorithms
	Revert "ath: add support for special 0x0 regulatory domain"
	rcu: Don't deboost before reporting expedited quiescent state
	mac80211: fix potential double free on mesh join
	tpm: use try_get_ops() in tpm-space.c
	wcn36xx: Differentiate wcn3660 from wcn3620
	nds32: fix access_ok() checks in get/put_user
	llc: only change llc->dev when bind() succeeds
	Linux 5.10.109

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ifd757f0ec4ba643f7cbaf78aa899d3c159c4b877
2022-03-28 10:10:32 +02:00
Michal Koutný
ea21245cdc cgroup-v1: Correct privileges check in release_agent writes
commit 467a726b754f474936980da793b4ff2ec3e382a7 upstream.

The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.

The commit 24f600856418 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid a possible confused deputy, the capability of the
opener must be checked.

Fixes: 24f600856418 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa@cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-28 09:57:08 +02:00
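The two conditions listed in commit ea21245cdc, (a) the owning user_ns of the cgroup_ns and (b) capabilities in init_user_ns held by the opener, can be modelled in userspace roughly as follows. All types and names here are hypothetical simplifications. In particular, the capability is reduced to a single boolean, which is an assumption made for illustration and not how the kernel represents capabilities.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel objects. */
struct model_user_ns {
    int level;
};

struct model_cgroup_ns {
    const struct model_user_ns *user_ns; /* owning user namespace */
};

struct model_opener_cred {
    bool cap_sys_admin_in_init_ns; /* assumed flag standing in for a capability check */
};

const struct model_user_ns model_init_user_ns = { .level = 0 };

/* Both conditions must hold before release_agent may be written. */
bool model_may_set_release_agent(const struct model_cgroup_ns *cgns,
                                 const struct model_opener_cred *opener)
{
    /* (a) the cgroup namespace must be owned by the initial user namespace */
    if (cgns->user_ns != &model_init_user_ns)
        return false;
    /* (b) the opener must hold the capability in the initial user namespace */
    return opener->cap_sys_admin_in_init_ns;
}

int release_agent_demo(void)
{
    struct model_user_ns child = { .level = 1 };
    struct model_cgroup_ns init_cgns = { .user_ns = &model_init_user_ns };
    struct model_cgroup_ns child_cgns = { .user_ns = &child };
    struct model_opener_cred admin = { .cap_sys_admin_in_init_ns = true };
    struct model_opener_cred plain = { .cap_sys_admin_in_init_ns = false };

    assert(model_may_set_release_agent(&init_cgns, &admin));
    assert(!model_may_set_release_agent(&child_cgns, &admin)); /* fails (a) */
    assert(!model_may_set_release_agent(&init_cgns, &plain));  /* fails (b) */
    return 0;
}
```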
Tejun Heo
f28364fe38 cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
commit 0d2b5955b36250a9428c832664f2079cbf723bec upstream.

of->priv is currently used by each interface file implementation to store
private information. This patch collects the current two private data usages
into struct cgroup_file_ctx which is allocated and freed by the common path.
This allows generic private data which applies to multiple files, which will
be used in the following patch.

Note that cgroup_procs iterator is now embedded as procs.iter in the new
cgroup_file_ctx so that it doesn't need to be allocated and freed
separately.

v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in
    cgroup_file_ctx as suggested by Linus.

v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too.
    Converted. Didn't change to embedded allocation as cgroup1 pidlists get
    stored for caching.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
[mkoutny: v5.10: modify cgroup.pressure handlers, adjust context]
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-28 09:57:07 +02:00
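The allocation pattern described in commit f28364fe38, one context struct allocated and freed by the common open and release path with the procs iterator embedded in it, might look like this in a userspace model. Every name is a hypothetical stand-in. Only `procs.iter` and the use of a `priv` pointer are taken from the commit message, and the `psi.trigger` member is an assumed example of a second private-data user.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical iterator type standing in for the kernel's task iterator. */
struct model_task_iter {
    int position;
};

/* One context per open file, shared by all interface-file handlers. */
struct model_file_ctx {
    struct {
        void *trigger;               /* assumed second private-data user */
    } psi;
    struct {
        struct model_task_iter iter; /* embedded, so no separate allocation */
    } procs;
};

struct model_open_file {
    void *priv;
};

/* Common open path: allocate the zeroed context once. */
int model_file_open(struct model_open_file *of)
{
    struct model_file_ctx *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return -1;
    of->priv = ctx;
    return 0;
}

/* Common release path: free the context, and the embedded iterator with it. */
void model_file_release(struct model_open_file *of)
{
    free(of->priv);
    of->priv = NULL;
}

int file_ctx_demo(void)
{
    struct model_open_file of = { .priv = NULL };
    struct model_file_ctx *ctx;

    if (model_file_open(&of) != 0)
        return -1;
    ctx = of.priv;
    ctx->procs.iter.position = 3;    /* a handler uses the embedded iterator */
    assert(ctx->psi.trigger == NULL);
    assert(ctx->procs.iter.position == 3);
    model_file_release(&of);
    assert(of.priv == NULL);
    return 0;
}
```

Embedding the iterator means individual handlers no longer manage their own allocation. The message notes that cgroup1 pidlists were deliberately not embedded because they are stored for caching.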
Greg Kroah-Hartman
26d02dc8ef Merge 5.10.97 into android12-5.10-lts
Changes in 5.10.97
	PCI: pciehp: Fix infinite loop in IRQ handler upon power fault
	net: ipa: fix atomic update in ipa_endpoint_replenish()
	net: ipa: use a bitmap for endpoint replenish_enabled
	net: ipa: prevent concurrent replenish
	Revert "drivers: bus: simple-pm-bus: Add support for probing simple bus only devices"
	KVM: x86: Forcibly leave nested virt when SMM state is toggled
	psi: Fix uaf issue when psi trigger is destroyed while being polled
	x86/mce: Add Xeon Sapphire Rapids to list of CPUs that support PPIN
	x86/cpu: Add Xeon Icelake-D to list of CPUs that support PPIN
	drm/vc4: hdmi: Make sure the device is powered with CEC
	cgroup-v1: Require capabilities to set release_agent
	net/mlx5e: Fix handling of wrong devices during bond netevent
	net/mlx5: Use del_timer_sync in fw reset flow of halting poll
	net/mlx5: E-Switch, Fix uninitialized variable modact
	ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
	net: amd-xgbe: ensure to reset the tx_timer_active flag
	net: amd-xgbe: Fix skb data length underflow
	fanotify: Fix stale file descriptor in copy_event_to_user()
	net: sched: fix use-after-free in tc_new_tfilter()
	rtnetlink: make sure to refresh master_dev/m_ops in __rtnl_newlink()
	cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask()
	af_packet: fix data-race in packet_setsockopt / packet_setsockopt
	tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
	Linux 5.10.97

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I428a930b475ba1b15d4b1ad05dde7df36cec6405
2022-02-08 10:08:24 +01:00
Eric W. Biederman
1fc3444cda cgroup-v1: Require capabilities to set release_agent
commit 24f6008564183aa120d07c03d9289519c2fe02af upstream.

The cgroup release_agent is called with call_usermodehelper.  The function
call_usermodehelper starts the release_agent with a full set of capabilities.
Therefore require capabilities when setting the release_agent.

Reported-by: Tabitha Sable <tabitha.c.sable@gmail.com>
Tested-by: Tabitha Sable <tabitha.c.sable@gmail.com>
Fixes: 81a6a5cdd2 ("Task Control Groups: automatic userspace notification of idle cgroups")
Cc: stable@vger.kernel.org # v2.6.24+
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-05 12:37:55 +01:00
Greg Kroah-Hartman
1afedcdcf8 Merge 5.10.55 into android12-5.10-lts
Changes in 5.10.55
	tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include
	io_uring: fix link timeout refs
	KVM: x86: determine if an exception has an error code only when injecting it.
	af_unix: fix garbage collect vs MSG_PEEK
	workqueue: fix UAF in pwq_unbound_release_workfn()
	cgroup1: fix leaked context root causing sporadic NULL deref in LTP
	net/802/mrp: fix memleak in mrp_request_join()
	net/802/garp: fix memleak in garp_request_join()
	net: annotate data race around sk_ll_usec
	sctp: move 198 addresses from unusable to private scope
	rcu-tasks: Don't delete holdouts within trc_inspect_reader()
	rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader()
	ipv6: allocate enough headroom in ip6_finish_output2()
	drm/ttm: add a check against null pointer dereference
	hfs: add missing clean-up in hfs_fill_super
	hfs: fix high memory mapping in hfs_bnode_read
	hfs: add lock nesting notation to hfs_find_init
	firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
	firmware: arm_scmi: Fix range check for the maximum number of pending messages
	cifs: fix the out of range assignment to bit fields in parse_server_interfaces
	iomap: remove the length variable in iomap_seek_data
	iomap: remove the length variable in iomap_seek_hole
	ARM: dts: versatile: Fix up interrupt controller node names
	ipv6: ip6_finish_output2: set sk into newly allocated nskb
	Linux 5.10.55

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2d673bdde784b3689af73289305091dbd4ead042
2021-07-31 08:51:04 +02:00
Paul Gortmaker
df34f88862 cgroup1: fix leaked context root causing sporadic NULL deref in LTP
commit 1e7107c5ef44431bc1ebbd4c353f1d7c22e5f2ec upstream.

Richard reported sporadic (roughly one in 10 or so) null dereferences and
other strange behaviour for a set of automated LTP tests.  Things like:

   BUG: kernel NULL pointer dereference, address: 0000000000000008
   #PF: supervisor read access in kernel mode
   #PF: error_code(0x0000) - not-present page
   PGD 0 P4D 0
   Oops: 0000 [#1] PREEMPT SMP PTI
   CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1
   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
   RIP: 0010:kernfs_sop_show_path+0x1b/0x60

...or these others:

   RIP: 0010:do_mkdirat+0x6a/0xf0
   RIP: 0010:d_alloc_parallel+0x98/0x510
   RIP: 0010:do_readlinkat+0x86/0x120

There were other less common instances of some kind of a general scribble
but the common theme was mount and cgroup and a dubious dentry triggering
the NULL dereference.  I was only able to reproduce it under qemu by
replicating Richard's setup as closely as possible - I never did get it
to happen on bare metal, even while keeping everything else the same.

In commit 71d883c37e ("cgroup_do_mount(): massage calling conventions")
we see this as a part of the overall change:

   --------------
           struct cgroup_subsys *ss;
   -       struct dentry *dentry;

   [...]

   -       dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root,
   -                                CGROUP_SUPER_MAGIC, ns);

   [...]

   -       if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
   -               struct super_block *sb = dentry->d_sb;
   -               dput(dentry);
   +       ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
   +       if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
   +               struct super_block *sb = fc->root->d_sb;
   +               dput(fc->root);
                   deactivate_locked_super(sb);
                   msleep(10);
                   return restart_syscall();
           }
   --------------

In changing from the local "*dentry" variable to using fc->root, we now
export/leave that dentry pointer in the file context after doing the dput()
in the unlikely "is_dying" case.   With LTP doing a crazy amount of back to
back mount/unmount [testcases/bin/cgroup_regression_5_1.sh] the unlikely
becomes slightly likely and then bad things happen.

A fix would be to not leave the stale reference in fc->root as follows:

   --------------
                  dput(fc->root);
  +               fc->root = NULL;
                  deactivate_locked_super(sb);
   --------------

...but then we are just open-coding a duplicate of fc_drop_locked() so we
simply use that instead.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org      # v5.1+
Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Fixes: 71d883c37e ("cgroup_do_mount(): massage calling conventions")
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-31 08:16:11 +02:00
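The stale-pointer hazard described in commit df34f88862 can be reproduced in miniature with a userspace model. This is a sketch under stated assumptions: the `model_` types and functions are hypothetical, a bare counter stands in for the dentry reference count, and the later teardown function is an assumed simplification of whatever cleanup eventually runs on the context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the refcount is a plain counter. */
struct model_dentry {
    int refcount;
};

struct model_fs_context {
    struct model_dentry *root;
};

void model_dput(struct model_dentry *d)
{
    if (d)
        d->refcount--;
}

/* Buggy pattern: the reference is dropped but the pointer stays behind. */
void model_drop_root_buggy(struct model_fs_context *fc)
{
    model_dput(fc->root);
}

/* Fixed pattern: drop the reference and clear the pointer together. */
void model_drop_root_fixed(struct model_fs_context *fc)
{
    model_dput(fc->root);
    fc->root = NULL;
}

/* Later teardown (assumed): releases whatever root is still recorded. */
void model_put_fs_context(struct model_fs_context *fc)
{
    if (fc->root) {
        model_dput(fc->root);
        fc->root = NULL;
    }
}

int stale_root_demo(void)
{
    struct model_dentry d1 = { .refcount = 1 };
    struct model_fs_context fc1 = { .root = &d1 };
    struct model_dentry d2 = { .refcount = 1 };
    struct model_fs_context fc2 = { .root = &d2 };

    model_drop_root_buggy(&fc1);
    model_put_fs_context(&fc1);
    assert(d1.refcount == -1); /* the stale pointer caused a second put */

    model_drop_root_fixed(&fc2);
    model_put_fs_context(&fc2);
    assert(d2.refcount == 0);  /* exactly one put */
    return 0;
}
```

In the model the extra put only drives a counter negative. In the kernel the equivalent would touch an object that may already have been freed, which is consistent with the sporadic NULL dereferences reported.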
Greg Kroah-Hartman
51ab149d5f Merge 5.10.52 into android12-5.10-lts
Changes in 5.10.52
	certs: add 'x509_revocation_list' to gitignore
	cifs: handle reconnect of tcon when there is no cached dfs referral
	KVM: mmio: Fix use-after-free Read in kvm_vm_ioctl_unregister_coalesced_mmio
	KVM: x86: Use guest MAXPHYADDR from CPUID.0x8000_0008 iff TDP is enabled
	KVM: x86/mmu: Do not apply HPA (memory encryption) mask to GPAs
	KVM: nSVM: Check the value written to MSR_VM_HSAVE_PA
	KVM: X86: Disable hardware breakpoints unconditionally before kvm_x86->run()
	scsi: core: Fix bad pointer dereference when ehandler kthread is invalid
	scsi: zfcp: Report port fc_security as unknown early during remote cable pull
	tracing: Do not reference char * as a string in histograms
	drm/i915/gtt: drop the page table optimisation
	drm/i915/gt: Fix -EDEADLK handling regression
	cgroup: verify that source is a string
	fbmem: Do not delete the mode that is still in use
	drm/dp_mst: Do not set proposed vcpi directly
	drm/dp_mst: Avoid to mess up payload table by ports in stale topology
	drm/dp_mst: Add missing drm parameters to recently added call to drm_dbg_kms()
	drm/ingenic: Fix non-OSD mode
	drm/ingenic: Switch IPU plane to type OVERLAY
	Revert "drm/ast: Remove reference to struct drm_device.pdev"
	net: bridge: multicast: fix PIM hello router port marking race
	net: bridge: multicast: fix MRD advertisement router port marking race
	leds: tlc591xx: fix return value check in tlc591xx_probe()
	ASoC: Intel: sof_sdw: add mutual exclusion between PCH DMIC and RT715
	dmaengine: fsl-qdma: check dma_set_mask return value
	scsi: arcmsr: Fix the wrong CDB payload report to IOP
	srcu: Fix broken node geometry after early ssp init
	rcu: Reject RCU_LOCKDEP_WARN() false positives
	tty: serial: fsl_lpuart: fix the potential risk of division or modulo by zero
	serial: fsl_lpuart: disable DMA for console and fix sysrq
	misc/libmasm/module: Fix two use after free in ibmasm_init_one
	misc: alcor_pci: fix null-ptr-deref when there is no PCI bridge
	ASoC: intel/boards: add missing MODULE_DEVICE_TABLE
	partitions: msdos: fix one-byte get_unaligned()
	iio: gyro: fxa21002c: Balance runtime pm + use pm_runtime_resume_and_get().
	iio: magn: bmc150: Balance runtime pm + use pm_runtime_resume_and_get()
	ALSA: usx2y: Avoid camelCase
	ALSA: usx2y: Don't call free_pages_exact() with NULL address
	Revert "ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro"
	usb: common: usb-conn-gpio: fix NULL pointer dereference of charger
	w1: ds2438: fixing bug that would always get page0
	scsi: arcmsr: Fix doorbell status being updated late on ARC-1886
	scsi: hisi_sas: Propagate errors in interrupt_init_v1_hw()
	scsi: lpfc: Fix "Unexpected timeout" error in direct attach topology
	scsi: lpfc: Fix crash when lpfc_sli4_hba_setup() fails to initialize the SGLs
	scsi: core: Cap scsi_host cmd_per_lun at can_queue
	ALSA: ac97: fix PM reference leak in ac97_bus_remove()
	tty: serial: 8250: serial_cs: Fix a memory leak in error handling path
	scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
	scsi: core: Fixup calling convention for scsi_mode_sense()
	scsi: scsi_dh_alua: Check for negative result value
	fs/jfs: Fix missing error code in lmLogInit()
	scsi: megaraid_sas: Fix resource leak in case of probe failure
	scsi: megaraid_sas: Early detection of VD deletion through RaidMap update
	scsi: megaraid_sas: Handle missing interrupts while re-enabling IRQs
	scsi: iscsi: Add iscsi_cls_conn refcount helpers
	scsi: iscsi: Fix conn use after free during resets
	scsi: iscsi: Fix shost->max_id use
	scsi: qedi: Fix null ref during abort handling
	scsi: qedi: Fix race during abort timeouts
	scsi: qedi: Fix TMF session block/unblock use
	scsi: qedi: Fix cleanup session block/unblock use
	mfd: da9052/stmpe: Add and modify MODULE_DEVICE_TABLE
	mfd: cpcap: Fix cpcap dmamask not set warnings
	ASoC: img: Fix PM reference leak in img_i2s_in_probe()
	fsi: Add missing MODULE_DEVICE_TABLE
	serial: tty: uartlite: fix console setup
	s390/sclp_vt220: fix console name to match device
	s390: disable SSP when needed
	selftests: timers: rtcpie: skip test if default RTC device does not exist
	ALSA: sb: Fix potential double-free of CSP mixer elements
	powerpc/ps3: Add dma_mask to ps3_dma_region
	iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails
	iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation
	ASoC: soc-pcm: fix the return value in dpcm_apply_symmetry()
	gpio: zynq: Check return value of pm_runtime_get_sync
	gpio: zynq: Check return value of irq_get_irq_data
	scsi: storvsc: Correctly handle multiple flags in srb_status
	ALSA: ppc: fix error return code in snd_pmac_probe()
	selftests/powerpc: Fix "no_handler" EBB selftest
	gpio: pca953x: Add support for the On Semi pca9655
	powerpc/mm/book3s64: Fix possible build error
	ASoC: soc-core: Fix the error return code in snd_soc_of_parse_audio_routing()
	habanalabs/gaudi: set the correct cpu_id on MME2_QM failure
	habanalabs: remove node from list before freeing the node
	s390/processor: always inline stap() and __load_psw_mask()
	s390/ipl_parm: fix program check new psw handling
	s390/mem_detect: fix diag260() program check new psw handling
	s390/mem_detect: fix tprot() program check new psw handling
	Input: hideep - fix the uninitialized use in hideep_nvm_unlock()
	ALSA: bebob: add support for ToneWeal FW66
	ALSA: usb-audio: scarlett2: Fix 18i8 Gen 2 PCM Input count
	ALSA: usb-audio: scarlett2: Fix data_mutex lock
	ALSA: usb-audio: scarlett2: Fix scarlett2_*_ctl_put() return values
	usb: gadget: f_hid: fix endianness issue with descriptors
	usb: gadget: hid: fix error return code in hid_bind()
	powerpc/boot: Fixup device-tree on little endian
	ASoC: Intel: kbl_da7219_max98357a: shrink platform_id below 20 characters
	backlight: lm3630a: Fix return code of .update_status() callback
	ALSA: hda: Add IRQ check for platform_get_irq()
	ALSA: usb-audio: scarlett2: Fix 6i6 Gen 2 line out descriptions
	ALSA: firewire-motu: fix detection for S/PDIF source on optical interface in v2 protocol
	leds: turris-omnia: add missing MODULE_DEVICE_TABLE
	staging: rtl8723bs: fix macro value for 2.4Ghz only device
	intel_th: Wait until port is in reset before programming it
	i2c: core: Disable client irq on reboot/shutdown
	phy: intel: Fix for warnings due to EMMC clock 175Mhz change in FIP
	lib/decompress_unlz4.c: correctly handle zero-padding around initrds.
	kcov: add __no_sanitize_coverage to fix noinstr for all architectures
	power: supply: sc27xx: Add missing MODULE_DEVICE_TABLE
	power: supply: sc2731_charger: Add missing MODULE_DEVICE_TABLE
	pwm: spear: Don't modify HW state in .remove callback
	PCI: ftpci100: Rename macro name collision
	power: supply: ab8500: Avoid NULL pointers
	PCI: hv: Fix a race condition when removing the device
	power: supply: max17042: Do not enforce (incorrect) interrupt trigger type
	power: reset: gpio-poweroff: add missing MODULE_DEVICE_TABLE
	ARM: 9087/1: kprobes: test-thumb: fix for LLVM_IAS=1
	PCI/P2PDMA: Avoid pci_get_slot(), which may sleep
	NFSv4: Fix delegation return in cases where we have to retry
	PCI: pciehp: Ignore Link Down/Up caused by DPC
	watchdog: Fix possible use-after-free in wdt_startup()
	watchdog: sc520_wdt: Fix possible use-after-free in wdt_turnoff()
	watchdog: Fix possible use-after-free by calling del_timer_sync()
	watchdog: imx_sc_wdt: fix pretimeout
	watchdog: iTCO_wdt: Account for rebooting on second timeout
	x86/fpu: Return proper error codes from user access functions
	remoteproc: core: Fix cdev remove and rproc del
	PCI: tegra: Add missing MODULE_DEVICE_TABLE
	orangefs: fix orangefs df output.
	ceph: remove bogus checks and WARN_ONs from ceph_set_page_dirty
	drm/gma500: Add the missed drm_gem_object_put() in psb_user_framebuffer_create()
	NFS: nfs_find_open_context() may only select open files
	power: supply: charger-manager: add missing MODULE_DEVICE_TABLE
	power: supply: ab8500: add missing MODULE_DEVICE_TABLE
	drm/amdkfd: fix sysfs kobj leak
	pwm: img: Fix PM reference leak in img_pwm_enable()
	pwm: tegra: Don't modify HW state in .remove callback
	ACPI: AMBA: Fix resource name in /proc/iomem
	ACPI: video: Add quirk for the Dell Vostro 3350
	PCI: rockchip: Register IRQ handlers after device and data are ready
	virtio-blk: Fix memory leak among suspend/resume procedure
	virtio_net: Fix error handling in virtnet_restore()
	virtio_console: Assure used length from device is limited
	f2fs: atgc: fix to set default age threshold
	NFSD: Fix TP_printk() format specifier in nfsd_clid_class
	x86/signal: Detect and prevent an alternate signal stack overflow
	f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfs
	f2fs: compress: fix to disallow temp extension
	remoteproc: k3-r5: Fix an error message
	PCI/sysfs: Fix dsm_label_utf16s_to_utf8s() buffer overrun
	power: supply: rt5033_battery: Fix device tree enumeration
	NFSv4: Initialise connection to the server in nfs4_alloc_client()
	NFSv4: Fix an Oops in pnfs_mark_request_commit() when doing O_DIRECT
	misc: alcor_pci: fix inverted branch condition
	um: fix error return code in slip_open()
	um: fix error return code in winch_tramp()
	ubifs: Fix off-by-one error
	ubifs: journal: Fix error return code in ubifs_jnl_write_inode()
	watchdog: aspeed: fix hardware timeout calculation
	watchdog: jz4740: Fix return value check in jz4740_wdt_probe()
	SUNRPC: prevent port reuse on transports which don't request it.
	nfs: fix acl memory leak of posix_acl_create()
	ubifs: Set/Clear I_LINKABLE under i_lock for whiteout inode
	PCI: iproc: Fix multi-MSI base vector number allocation
	PCI: iproc: Support multi-MSI only on uniprocessor kernel
	f2fs: fix to avoid adding tab before doc section
	x86/fpu: Fix copy_xstate_to_kernel() gap handling
	x86/fpu: Limit xstate copy size in xstateregs_set()
	PCI: intel-gw: Fix INTx enable
	pwm: imx1: Don't disable clocks at device remove time
	PCI: tegra194: Fix tegra_pcie_ep_raise_msi_irq() ill-defined shift
	vdpa/mlx5: Fix umem sizes assignments on VQ create
	vdpa/mlx5: Fix possible failure in umem size calculation
	virtio_net: move tx vq operation under tx queue lock
	nvme-tcp: can't set sk_user_data without write_lock
	nfsd: Reduce contention for the nfsd_file nf_rwsem
	ALSA: isa: Fix error return code in snd_cmi8330_probe()
	vdpa/mlx5: Clear vq ready indication upon device reset
	NFSv4/pnfs: Fix the layout barrier update
	NFSv4/pnfs: Fix layoutget behaviour after invalidation
	NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times
	hexagon: handle {,SOFT}IRQENTRY_TEXT in linker script
	hexagon: use common DISCARDS macro
	ARM: dts: gemini-rut1xx: remove duplicate ethernet node
	reset: RESET_BRCMSTB_RESCAL should depend on ARCH_BRCMSTB
	reset: RESET_INTEL_GW should depend on X86
	reset: a10sr: add missing of_match_table reference
	ARM: exynos: add missing of_node_put for loop iteration
	ARM: dts: exynos: fix PWM LED max brightness on Odroid XU/XU3
	ARM: dts: exynos: fix PWM LED max brightness on Odroid HC1
	ARM: dts: exynos: fix PWM LED max brightness on Odroid XU4
	memory: stm32-fmc2-ebi: add missing of_node_put for loop iteration
	memory: atmel-ebi: add missing of_node_put for loop iteration
	reset: brcmstb: Add missing MODULE_DEVICE_TABLE
	memory: pl353: Fix error return code in pl353_smc_probe()
	ARM: dts: sun8i: h3: orangepi-plus: Fix ethernet phy-mode
	rtc: fix snprintf() checking in is_rtc_hctosys()
	arm64: dts: renesas: v3msk: Fix memory size
	ARM: dts: r8a7779, marzen: Fix DU clock names
	arm64: dts: ti: j7200-main: Enable USB2 PHY RX sensitivity workaround
	arm64: dts: renesas: Add missing opp-suspend properties
	arm64: dts: renesas: r8a7796[01]: Fix OPP table entry voltages
	ARM: dts: stm32: Connect PHY IRQ line on DH STM32MP1 SoM
	ARM: dts: stm32: Rework LAN8710Ai PHY reset on DHCOM SoM
	arm64: dts: qcom: trogdor: Add no-hpd to DSI bridge node
	firmware: tegra: Fix error return code in tegra210_bpmp_init()
	firmware: arm_scmi: Reset Rx buffer to max size during async commands
	dt-bindings: i2c: at91: fix example for scl-gpios
	ARM: dts: BCM5301X: Fixup SPI binding
	reset: bail if try_module_get() fails
	arm64: dts: renesas: r8a779a0: Drop power-domains property from GIC node
	arm64: dts: ti: k3-j721e-main: Fix external refclk input to SERDES
	memory: fsl_ifc: fix leak of IO mapping on probe failure
	memory: fsl_ifc: fix leak of private memory on probe failure
	arm64: dts: allwinner: a64-sopine-baseboard: change RGMII mode to TXID
	ARM: dts: dra7: Fix duplicate USB4 target module node
	ARM: dts: am335x: align ti,pindir-d0-out-d1-in property with dt-shema
	ARM: dts: am437x: align ti,pindir-d0-out-d1-in property with dt-shema
	thermal/drivers/sprd: Add missing MODULE_DEVICE_TABLE
	ARM: dts: imx6q-dhcom: Fix ethernet reset time properties
	ARM: dts: imx6q-dhcom: Fix ethernet plugin detection problems
	ARM: dts: imx6q-dhcom: Add gpios pinctrl for i2c bus recovery
	thermal/drivers/rcar_gen3_thermal: Fix coefficient calculations
	firmware: turris-mox-rwtm: fix reply status decoding function
	firmware: turris-mox-rwtm: report failures better
	firmware: turris-mox-rwtm: fail probing when firmware does not support hwrng
	firmware: turris-mox-rwtm: show message about HWRNG registration
	arm64: dts: rockchip: Re-add regulator-boot-on, regulator-always-on for vdd_gpu on rk3399-roc-pc
	arm64: dts: rockchip: Re-add regulator-always-on for vcc_sdio for rk3399-roc-pc
	scsi: be2iscsi: Fix an error handling path in beiscsi_dev_probe()
	sched/uclamp: Ignore max aggregation if rq is idle
	jump_label: Fix jump_label_text_reserved() vs __init
	static_call: Fix static_call_text_reserved() vs __init
	mips: always link byteswap helpers into decompressor
	mips: disable branch profiling in boot/decompress.o
	MIPS: vdso: Invalid GIC access through VDSO
	scsi: scsi_dh_alua: Fix signedness bug in alua_rtpg()
	seq_file: disallow extremely large seq buffer allocations
	Linux 5.10.52

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic1b04661728db8b0e060ca6935783e15a22210da
2021-07-20 16:36:53 +02:00
Christian Brauner
811763e3be cgroup: verify that source is a string
commit 3b0462726e7ef281c35a7a4ae33e93ee2bc9975b upstream.

The following sequence can be used to trigger a UAF:

    int fscontext_fd = fsopen("cgroup", 0);
    int fd_null = open("/dev/null", O_RDONLY);
    fsconfig(fscontext_fd, FSCONFIG_SET_FD, "source", NULL, fd_null);
    close_range(3, ~0U, 0);

The cgroup v1 specific fs parser expects a string for the "source"
parameter.  However, it is perfectly legitimate to e.g.  specify a file
descriptor for the "source" parameter.  The fs parser doesn't know what
a filesystem allows there.  So it's a bug to assume that "source" is
always of type fs_value_is_string when it can reasonably also be
fs_value_is_file.

This assumption in the cgroup code causes a UAF because struct
fs_parameter uses a union for the actual value.  Access to that union is
guarded by the param->type member.  Since the cgroup parameter parser
didn't check param->type but unconditionally moved param->string into
fc->source, a close on the fscontext_fd would trigger a UAF during
put_fs_context(), which frees fc->source and thereby frees the file
stashed in param->file, causing a UAF during a close of the fd_null.

Fix this by verifying that param->type is actually a string and report
an error if not.
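
The check described above can be sketched in plain C. This is only an
illustration, not the kernel patch: enum value_type, struct param,
struct context and set_source() are hypothetical stand-ins for
fs_value_type, fs_parameter, fs_context and the cgroup v1 parser.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's fs_value_type. */
enum value_type { VALUE_IS_STRING, VALUE_IS_FILE };

/* Hypothetical stand-in for struct fs_parameter. */
struct param {
    const char *key;
    enum value_type type;   /* tells which union member is valid */
    union {
        char *string;
        void *file;
    };
};

/* Hypothetical stand-in for struct fs_context. */
struct context {
    char *source;
};

/*
 * Move the "source" string into the context only when the union
 * really holds a string. Returns 0 on success, -EINVAL otherwise.
 */
int set_source(struct context *ctx, struct param *p)
{
    if (p->type != VALUE_IS_STRING)
        return -EINVAL;
    ctx->source = p->string;
    p->string = NULL;       /* ownership moved to ctx */
    return 0;
}
```

Without the type test, a file pointer stored in the union would be
treated as a string and later freed as one.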

In follow-up patches I'll add a new generic helper that can be used here
and by other filesystems instead of this error-prone copy-pasta fix.
But fixing it in here first makes backporting it to stable a lot
easier.

Fixes: 8d2451f499 ("cgroup1: switch to option-by-option parsing")
Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@kernel.org>
Cc: syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20 16:05:36 +02:00
Greg Kroah-Hartman
82658bfd88 Merge 5.10.44 into android12-5.10-lts
Changes in 5.10.44
	proc: Track /proc/$pid/attr/ opener mm_struct
	ASoC: max98088: fix ni clock divider calculation
	ASoC: amd: fix for pcm_read() error
	spi: Fix spi device unregister flow
	spi: spi-zynq-qspi: Fix stack violation bug
	bpf: Forbid trampoline attach for functions with variable arguments
	net/nfc/rawsock.c: fix a permission check bug
	usb: cdns3: Fix runtime PM imbalance on error
	ASoC: Intel: bytcr_rt5640: Add quirk for the Glavey TM800A550L tablet
	ASoC: Intel: bytcr_rt5640: Add quirk for the Lenovo Miix 3-830 tablet
	vfio-ccw: Reset FSM state to IDLE inside FSM
	vfio-ccw: Serialize FSM IDLE state with I/O completion
	ASoC: sti-sas: add missing MODULE_DEVICE_TABLE
	spi: sprd: Add missing MODULE_DEVICE_TABLE
	usb: chipidea: udc: assign interrupt number to USB gadget structure
	isdn: mISDN: netjet: Fix crash in nj_probe:
	bonding: init notify_work earlier to avoid uninitialized use
	netlink: disable IRQs for netlink_lock_table()
	net: mdiobus: get rid of a BUG_ON()
	cgroup: disable controllers at parse time
	wq: handle VM suspension in stall detection
	net/qla3xxx: fix schedule while atomic in ql_sem_spinlock
	RDS tcp loopback connection can hang
	net:sfc: fix non-freed irq in legacy irq mode
	scsi: bnx2fc: Return failure if io_req is already in ABTS processing
	scsi: vmw_pvscsi: Set correct residual data length
	scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq
	scsi: target: qla2xxx: Wait for stop_phase1 at WWN removal
	net: macb: ensure the device is available before accessing GEMGXL control registers
	net: appletalk: cops: Fix data race in cops_probe1
	net: dsa: microchip: enable phy errata workaround on 9567
	nvme-fabrics: decode host pathing error for connect
	MIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER
	dm verity: fix require_signatures module_param permissions
	bnx2x: Fix missing error code in bnx2x_iov_init_one()
	nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME
	nvmet: fix false keep-alive timeout when a controller is torn down
	powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c controllers
	powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers
	spi: Don't have controller clean up spi device before driver unbind
	spi: Cleanup on failure of initial setup
	i2c: mpc: Make use of i2c_recover_bus()
	i2c: mpc: implement erratum A-004447 workaround
	ALSA: seq: Fix race of snd_seq_timer_open()
	ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun()
	ALSA: hda/realtek: headphone and mic don't work on an Acer laptop
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8
	spi: bcm2835: Fix out-of-bounds access with more than 4 slaves
	Revert "ACPI: sleep: Put the FACS table after using it"
	drm: Fix use-after-free read in drm_getunique()
	drm: Lock pointer access in drm_master_release()
	perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server
	KVM: X86: MMU: Use the correct inherited permissions to get shadow page
	kvm: avoid speculation-based attacks from out-of-range memslot accesses
	staging: rtl8723bs: Fix uninitialized variables
	async_xor: check src_offs is not NULL before updating it
	btrfs: return value from btrfs_mark_extent_written() in case of error
	btrfs: promote debugging asserts to full-fledged checks in validate_super
	cgroup1: don't allow '\n' in renaming
	ftrace: Do not blindly read the ip address in ftrace_bug()
	mmc: renesas_sdhi: abort tuning when timeout detected
	mmc: renesas_sdhi: Fix HS400 on R-Car M3-W+
	USB: f_ncm: ncm_bitrate (speed) is unsigned
	usb: f_ncm: only first packet of aggregate needs to start timer
	usb: pd: Set PD_T_SINK_WAIT_CAP to 310ms
	usb: dwc3-meson-g12a: fix usb2 PHY glue init when phy0 is disabled
	usb: dwc3: meson-g12a: Disable the regulator in the error handling path of the probe
	usb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL
	usb: dwc3: ep0: fix NULL pointer exception
	usb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling
	usb: typec: wcove: Use LE to CPU conversion when accessing msg->header
	usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
	usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
	usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
	usb: gadget: f_fs: Ensure io_completion_wq is idle during unbind
	USB: serial: ftdi_sio: add NovaTech OrionMX product ID
	USB: serial: omninet: add device id for Zyxel Omni 56K Plus
	USB: serial: quatech2: fix control-request directions
	USB: serial: cp210x: fix alternate function for CP2102N QFN20
	usb: gadget: eem: fix wrong eem header operation
	usb: fix various gadgets null ptr deref on 10gbps cabling.
	usb: fix various gadget panics on 10gbps cabling
	usb: typec: tcpm: cancel vdm and state machine hrtimer when unregister tcpm port
	usb: typec: tcpm: cancel frs hrtimer when unregister tcpm port
	regulator: core: resolve supply for boot-on/always-on regulators
	regulator: max77620: Use device_set_of_node_from_dev()
	regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837
	regulator: fan53880: Fix missing n_voltages setting
	regulator: bd71828: Fix .n_voltages settings
	regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks
	phy: usb: Fix misuse of IS_ENABLED
	usb: dwc3: gadget: Disable gadget IRQ during pullup disable
	usb: typec: mux: Fix copy-paste mistake in typec_mux_match
	drm/mcde: Fix off by 10^3 in calculation
	drm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650
	drm/msm/a6xx: update/fix CP_PROTECT initialization
	drm/msm/a6xx: avoid shadow NULL reference in failure path
	RDMA/ipoib: Fix warning caused by destroying non-initial netns
	RDMA/mlx4: Do not map the core_clock page to user space unless enabled
	ARM: cpuidle: Avoid orphan section warning
	vmlinux.lds.h: Avoid orphan section with !SMP
	tools/bootconfig: Fix error return code in apply_xbc()
	phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
	ASoC: core: Fix Null-point-dereference in fmt_single_name()
	ASoC: meson: gx-card: fix sound-dai dt schema
	phy: ti: Fix an error code in wiz_probe()
	gpio: wcd934x: Fix shift-out-of-bounds error
	perf: Fix data race between pin_count increment/decrement
	sched/fair: Keep load_avg and load_sum synced
	sched/fair: Make sure to update tg contrib for blocked load
	sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling
	x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs
	KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message
	IB/mlx5: Fix initializing CQ fragments buffer
	NFS: Fix a potential NULL dereference in nfs_get_client()
	NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()
	perf session: Correct buffer copying when peeking events
	kvm: fix previous commit for 32-bit builds
	NFS: Fix use-after-free in nfs4_init_client()
	NFSv4: Fix second deadlock in nfs4_evict_inode()
	NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error.
	scsi: core: Fix error handling of scsi_host_alloc()
	scsi: core: Fix failure handling of scsi_add_host_with_dma()
	scsi: core: Put .shost_dev in failure path if host state changes to RUNNING
	scsi: core: Only put parent device if host state differs from SHOST_CREATED
	tracing: Correct the length check which causes memory corruption
	proc: only require mm_struct for writing
	Linux 5.10.44

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic64172b4e72ccb54d96000b3065dd8b33aa9fef5
2021-06-16 13:14:03 +02:00
Alexander Kuznetsov
74d3b20b1b cgroup1: don't allow '\n' in renaming
commit b7e24eb1caa5f8da20d405d262dba67943aedc42 upstream.

cgroup_mkdir() has a restriction on newline usage in names:
$ mkdir $'/sys/fs/cgroup/cpu/test\ntest2'
mkdir: cannot create directory
'/sys/fs/cgroup/cpu/test\ntest2': Invalid argument

But in cgroup1_rename() such a check is missing.
This allows us to make /proc/<pid>/cgroup unparsable:
$ mkdir /sys/fs/cgroup/cpu/test
$ mv /sys/fs/cgroup/cpu/test $'/sys/fs/cgroup/cpu/test\ntest2'
$ echo $$ > $'/sys/fs/cgroup/cpu/test\ntest2'
$ cat /proc/self/cgroup
11:pids:/
10:freezer:/
9:hugetlb:/
8:cpuset:/
7:blkio:/user.slice
6:memory:/user.slice
5:net_cls,net_prio:/
4:perf_event:/
3:devices:/user.slice
2:cpu,cpuacct:/test
test2
1:name=systemd:/
0::/
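
A minimal sketch of the kind of check that both the mkdir and the
rename path need. check_cgroup_name() is a hypothetical helper used for
illustration only and is not taken from the kernel source.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: reject names containing a newline, since
 * /proc/<pid>/cgroup is a line-oriented format and an embedded '\n'
 * would split one entry across two lines.
 */
int check_cgroup_name(const char *name)
{
    if (strchr(name, '\n'))
        return -EINVAL;
    return 0;
}
```

Calling the same helper from every path that can set a name keeps the
two operations from drifting apart again.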

Signed-off-by: Alexander Kuznetsov <wwfq@yandex-team.ru>
Reported-by: Andrey Krasichkov <buglloc@yandex-team.ru>
Acked-by: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16 12:01:40 +02:00
Pavankumar Kondeti
29203f8c8f ANDROID: cgroup: Add android_rvh_cgroup_force_kthread_migration
In Android GKI, CONFIG_FAIR_GROUP_SCHED is enabled [1] to help
prioritize important work. Given that the CPU shares of the root cgroup
can't be changed, leaving tasks inside the root cgroup gives them a
higher share than the tasks inside important cgroups. This is mitigated
by moving all tasks inside the root cgroup to a different cgroup after
Android is booted. However, there are many kernel tasks stuck in the
root cgroup after the boot.

It is possible to relax kernel threads and kworkers migrations under
certain scenarios. However, the patch [2] posted upstream was not
accepted. Hence add a restricted vendor hook to notify modules when a
kernel thread is requested for cgroup migration. The modules can relax
the restrictions forced by the kernel and allow the cgroup migration.

[1] f08f049de1
[2] https://lore.kernel.org/lkml/1617714261-18111-1-git-send-email-pkondeti@codeaurora.org

Bug: 184594949
Change-Id: I445a170ba797c8bece3b4b59b7a42cdd85438f1f
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
2021-05-04 20:13:09 +00:00
Frankie Chang
02a9f884d5 ANDROID: cgroup: Add vendor hook to the cgroup
Add a vendor hook after attaching a task to a cgroup to
recognize the group_id for performance tuning.

Bug: 181917687

Signed-off-by: Frankie Chang <frankie.chang@mediatek.com>
Change-Id: I603afa3d893dd575a7dcb97f83bd9eacb8315bab
(cherry picked from commit de089a37a3d248608a1d5855a4ae82ebad3ec2ab)
2021-03-08 16:17:15 +00:00
Greg Kroah-Hartman
b129c98dc6 Merge 5.10.17 into android12-5.10
Changes in 5.10.17
	objtool: Fix seg fault with Clang non-section symbols
	Revert "dts: phy: add GPIO number and active state used for phy reset"
	gpio: mxs: GPIO_MXS should not default to y unconditionally
	gpio: ep93xx: fix BUG_ON port F usage
	gpio: ep93xx: Fix single irqchip with multi gpiochips
	tracing: Do not count ftrace events in top level enable output
	tracing: Check length before giving out the filter buffer
	drm/i915: Fix overlay frontbuffer tracking
	arm/xen: Don't probe xenbus as part of an early initcall
	cgroup: fix psi monitor for root cgroup
	Revert "drm/amd/display: Update NV1x SR latency values"
	drm/i915/tgl+: Make sure TypeC FIA is powered up when initializing it
	drm/dp_mst: Don't report ports connected if nothing is attached to them
	dmaengine: move channel device_node deletion to driver
	tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
	tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
	soc: ti: omap-prm: Fix boot time errors for rst_map_012 bits 0 and 1
	arm64: dts: rockchip: Fix PCIe DT properties on rk3399
	arm64: dts: qcom: sdm845: Reserve LPASS clocks in gcc
	ARM: OMAP2+: Fix suspcious RCU usage splats for omap_enter_idle_coupled
	arm64: dts: rockchip: remove interrupt-names property from rk3399 vdec node
	platform/x86: hp-wmi: Disable tablet-mode reporting by default
	arm64: dts: rockchip: Disable display for NanoPi R2S
	ovl: perform vfs_getxattr() with mounter creds
	cap: fix conversions on getxattr
	ovl: skip getxattr of security labels
	scsi: lpfc: Fix EEH encountering oops with NVMe traffic
	x86/split_lock: Enable the split lock feature on another Alder Lake CPU
	nvme-pci: ignore the subsysem NQN on Phison E16
	drm/amd/display: Fix DPCD translation for LTTPR AUX_RD_INTERVAL
	drm/amd/display: Add more Clock Sources to DCN2.1
	drm/amd/display: Release DSC before acquiring
	drm/amd/display: Fix dc_sink kref count in emulated_link_detect
	drm/amd/display: Free atomic state after drm_atomic_commit
	drm/amd/display: Decrement refcount of dc_sink before reassignment
	riscv: virt_addr_valid must check the address belongs to linear mapping
	bfq-iosched: Revert "bfq: Fix computation of shallow depth"
	ARM: dts: lpc32xx: Revert set default clock rate of HCLK PLL
	kallsyms: fix nonconverging kallsyms table with lld
	ARM: ensure the signal page contains defined contents
	ARM: kexec: fix oops after TLB are invalidated
	ubsan: implement __ubsan_handle_alignment_assumption
	Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs"
	x86/efi: Remove EFI PGD build time checks
	lkdtm: don't move ctors to .rodata
	KVM: x86: cleanup CR3 reserved bits checks
	cgroup-v1: add disabled controller check in cgroup1_parse_param()
	dmaengine: idxd: fix misc interrupt completion
	ath9k: fix build error with LEDS_CLASS=m
	mt76: dma: fix a possible memory leak in mt76_add_fragment()
	drm/vc4: hvs: Fix buffer overflow with the dlist handling
	dmaengine: idxd: check device state before issue command
	bpf: Unbreak BPF_PROG_TYPE_KPROBE when kprobe is called via do_int3
	bpf: Check for integer overflow when using roundup_pow_of_two()
	netfilter: xt_recent: Fix attempt to update deleted entry
	selftests: netfilter: fix current year
	netfilter: nftables: fix possible UAF over chains from packet path in netns
	netfilter: flowtable: fix tcp and udp header checksum update
	xen/netback: avoid race in xenvif_rx_ring_slots_available()
	net: hdlc_x25: Return meaningful error code in x25_open
	net: ipa: set error code in gsi_channel_setup()
	hv_netvsc: Reset the RSC count if NVSP_STAT_FAIL in netvsc_receive()
	net: enetc: initialize the RFS and RSS memories
	selftests: txtimestamp: fix compilation issue
	net: stmmac: set TxQ mode back to DCB after disabling CBS
	ibmvnic: Clear failover_pending if unable to schedule
	netfilter: conntrack: skip identical origin tuple in same zone only
	scsi: scsi_debug: Fix a memory leak
	x86/build: Disable CET instrumentation in the kernel for 32-bit too
	net: dsa: felix: implement port flushing on .phylink_mac_link_down
	net: hns3: add a check for queue_id in hclge_reset_vf_queue()
	net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
	net: hns3: add a check for index in hclge_get_rss_key()
	firmware_loader: align .builtin_fw to 8
	drm/sun4i: tcon: set sync polarity for tcon1 channel
	drm/sun4i: dw-hdmi: always set clock rate
	drm/sun4i: Fix H6 HDMI PHY configuration
	drm/sun4i: dw-hdmi: Fix max. frequency for H6
	clk: sunxi-ng: mp: fix parent rate change flag check
	i2c: stm32f7: fix configuration of the digital filter
	h8300: fix PREEMPTION build, TI_PRE_COUNT undefined
	scripts: set proper OpenSSL include dir also for sign-file
	x86/pci: Create PCI/MSI irqdomain after x86_init.pci.arch_init()
	arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page
	rxrpc: Fix clearance of Tx/Rx ring when releasing a call
	udp: fix skb_copy_and_csum_datagram with odd segment sizes
	net: dsa: call teardown method on probe failure
	cpufreq: ACPI: Extend frequency tables to cover boost frequencies
	cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not there
	net: gro: do not keep too many GRO packets in napi->rx_list
	net: fix iteration for sctp transport seq_files
	net/vmw_vsock: fix NULL pointer dereference
	net/vmw_vsock: improve locking in vsock_connect_timeout()
	net: watchdog: hold device global xmit lock during tx disable
	bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state
	switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT
	vsock/virtio: update credit only if socket is not closed
	vsock: fix locking in vsock_shutdown()
	net/rds: restrict iovecs length for RDS_CMSG_RDMA_ARGS
	net/qrtr: restrict user-controlled length in qrtr_tun_write_iter()
	ovl: expand warning in ovl_d_real()
	kcov, usb: only collect coverage from __usb_hcd_giveback_urb in softirq
	Linux 5.10.17

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id0300681f52b51d3f466f1e66ec3a6c25f65f4d3
2021-02-18 11:21:01 +01:00
Chen Zhou
3e53d64e9a cgroup-v1: add disabled controller check in cgroup1_parse_param()
[ Upstream commit 61e960b07b637f0295308ad91268501d744c21b5 ]

When mounting a cgroup hierarchy with a disabled controller in cgroup v1,
all available controllers will be attached.
For example, boot with cgroup_no_v1=cpu or cgroup_disable=cpu, and then
mount with "mount -t cgroup -ocpu cpu /sys/fs/cgroup/cpu", then all
enabled controllers will be attached except cpu.

Fix this by adding a disabled controller check in cgroup1_parse_param().
If the specified controller is disabled, return an error with the message
"Disabled controller xx" rather than attaching all the other enabled
controllers.
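
The idea can be sketched as follows. This is a simplified model and not
the kernel code: struct controller, select_controller() and the choice
of -EINVAL for both failure cases are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical controller table entry. */
struct controller {
    const char *name;
    int enabled;
};

/*
 * Look up a controller requested as a mount option and set its bit
 * in *mask. A disabled controller is an explicit error instead of
 * being silently skipped.
 */
int select_controller(const struct controller *tbl, size_t n,
                      const char *name, unsigned *mask)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) != 0)
            continue;
        if (!tbl[i].enabled)
            return -EINVAL;     /* "Disabled controller" case */
        *mask |= 1u << i;
        return 0;
    }
    return -EINVAL;             /* unknown controller name */
}
```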

Fixes: f5dfb5315d ("cgroup: take options parsing into ->parse_monolithic()")
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Reviewed-by: Zefan Li <lizefan.x@bytedance.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-17 11:02:25 +01:00
Greg Kroah-Hartman
9cf2ceaffd Merge 5.10.5 into android12-5.10
Changes in 5.10.5
	net/sched: sch_taprio: reset child qdiscs before freeing them
	mptcp: fix security context on server socket
	ethtool: fix error paths in ethnl_set_channels()
	ethtool: fix string set id check
	md/raid10: initialize r10_bio->read_slot before use.
	drm/amd/display: Add get_dig_frontend implementation for DCEx
	io_uring: close a small race gap for files cancel
	jffs2: Allow setting rp_size to zero during remounting
	jffs2: Fix NULL pointer dereference in rp_size fs option parsing
	spi: dw-bt1: Fix undefined devm_mux_control_get symbol
	opp: fix memory leak in _allocate_opp_table
	opp: Call the missing clk_put() on error
	scsi: block: Fix a race in the runtime power management code
	mm/hugetlb: fix deadlock in hugetlb_cow error path
	mm: memmap defer init doesn't work as expected
	lib/zlib: fix inflating zlib streams on s390
	io_uring: don't assume mm is constant across submits
	io_uring: use bottom half safe lock for fixed file data
	io_uring: add a helper for setting a ref node
	io_uring: fix io_sqe_files_unregister() hangs
	uapi: move constants from <linux/kernel.h> to <linux/const.h>
	tools headers UAPI: Sync linux/const.h with the kernel headers
	cgroup: Fix memory leak when parsing multiple source parameters
	zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c
	scsi: cxgb4i: Fix TLS dependency
	Bluetooth: hci_h5: close serdev device and free hu in h5_close
	fbcon: Disable accelerated scrolling
	reiserfs: add check for an invalid ih_entry_count
	misc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doorbells()
	media: gp8psk: initialize stats at power control logic
	f2fs: fix shift-out-of-bounds in sanity_check_raw_super()
	ALSA: seq: Use bool for snd_seq_queue internal flags
	ALSA: rawmidi: Access runtime->avail always in spinlock
	bfs: don't use WARNING: string when it's just info.
	ext4: check for invalid block size early when mounting a file system
	fcntl: Fix potential deadlock in send_sig{io, urg}()
	io_uring: check kthread stopped flag when sq thread is unparked
	rtc: sun6i: Fix memleak in sun6i_rtc_clk_init
	module: set MODULE_STATE_GOING state when a module fails to load
	quota: Don't overflow quota file offsets
	rtc: pl031: fix resource leak in pl031_probe
	powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()
	i3c master: fix missing destroy_workqueue() on error in i3c_master_register
	NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode
	f2fs: avoid race condition for shrinker count
	f2fs: fix race of pending_pages in decompression
	module: delay kobject uevent until after module init call
	powerpc/64: irq replay remove decrementer overflow check
	fs/namespace.c: WARN if mnt_count has become negative
	watchdog: rti-wdt: fix reference leak in rti_wdt_probe
	um: random: Register random as hwrng-core device
	um: ubd: Submit all data segments atomically
	NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow
	ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails
	drm/amd/display: updated wm table for Renoir
	tick/sched: Remove bogus boot "safety" check
	s390: always clear kernel stack backchain before calling functions
	io_uring: remove racy overflow list fast checks
	ALSA: pcm: Clear the full allocated memory at hw_params
	dm verity: skip verity work if I/O error when system is shutting down
	ext4: avoid s_mb_prefetch to be zero in individual scenarios
	device-dax: Fix range release
	Linux 5.10.5

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2b481bfac06bafdef2cf3cc1ac2c2a4ddf9913dc
2021-01-10 12:19:03 +01:00
Qinglang Miao
bf81221a40 cgroup: Fix memory leak when parsing multiple source parameters
commit 2d18e54dd8662442ef5898c6bdadeaf90b3cebbc upstream.

A memory leak is found in cgroup1_parse_param() when multiple source
parameters overwrite fc->source in the fs_context struct without freeing
the previous value.

unreferenced object 0xffff888100d930e0 (size 16):
  comm "mount", pid 520, jiffies 4303326831 (age 152.783s)
  hex dump (first 16 bytes):
    74 65 73 74 6c 65 61 6b 00 00 00 00 00 00 00 00  testleak........
  backtrace:
    [<000000003e5023ec>] kmemdup_nul+0x2d/0xa0
    [<00000000377dbdaa>] vfs_parse_fs_string+0xc0/0x150
    [<00000000cb2b4882>] generic_parse_monolithic+0x15a/0x1d0
    [<000000000f750198>] path_mount+0xee1/0x1820
    [<0000000004756de2>] do_mount+0xea/0x100
    [<0000000094cafb0a>] __x64_sys_mount+0x14b/0x1f0

Fix this bug by permitting a single source parameter and rejecting with
an error all subsequent ones.
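
A minimal sketch of the "first source wins, later ones are rejected"
rule. struct mount_ctx and add_source() are hypothetical names, and the
copy via malloc() stands in for the kernel's own string handling.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the part of fs_context that holds the source. */
struct mount_ctx {
    char *source;
};

/*
 * Store a copy of the source string. A second call fails with -EINVAL
 * instead of overwriting (and leaking) the first allocation.
 */
int add_source(struct mount_ctx *ctx, const char *value)
{
    char *copy;

    if (ctx->source)
        return -EINVAL;
    copy = malloc(strlen(value) + 1);
    if (!copy)
        return -ENOMEM;
    strcpy(copy, value);
    ctx->source = copy;
    return 0;
}
```

Rejecting the duplicate keeps a single owner for the allocation, so the
existing free on context teardown remains sufficient.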

Fixes: 8d2451f499 ("cgroup1: switch to option-by-option parsing")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Reviewed-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-06 14:56:51 +01:00
Greg Kroah-Hartman
34ed0e2946 Merge 5364abc579 ("Merge tag 'arc-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc") into android-mainline
Steps along the 5.7-rc1 merge.

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib9f87147ac3d81985496818b0c61bdd086140eed
2020-04-08 09:25:42 +02:00
Greg Kroah-Hartman
ae56fd997e Merge 5.6-rc6 into android-mainline
Linux 5.6-rc6

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6c2d7aff44ad5a9b75030b72d34ca5dbd5ad3ceb
2020-03-16 08:09:43 +01:00
Tejun Heo
e7b20d9796 cgroup: Restructure release_agent_path handling
cgrp->root->release_agent_path is protected by both cgroup_mutex and
release_agent_path_lock and readers can hold either one. The
dual-locking scheme was introduced while breaking a locking dependency
issue around cgroup_mutex but doesn't make sense anymore given that
the only remaining reader which uses cgroup_mutex is
cgroup1_release_agent().

This patch updates cgroup1_release_agent() to use
release_agent_path_lock so that release_agent_path is always protected
only by release_agent_path_lock.

While at it, convert strlen() based empty string checks to direct
tests on the first character as suggested by Linus.
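
The empty-string test mentioned above amounts to the following sketch.
agent_path_is_empty() is a hypothetical helper name, not a function
from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: a NUL-terminated string is empty exactly when
 * its first byte is the terminator, so there is no need to scan the
 * whole string with strlen().
 */
bool agent_path_is_empty(const char *path)
{
    return path[0] == '\0';
}
```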

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-12 16:44:35 -04:00
Tycho Andersen
2e5383d790 cgroup1: don't call release_agent when it is ""
Older (and maybe current) versions of systemd set release_agent to "" when
shutting down, but do not set notify_on_release to 0.

Since 64e90a8acb ("Introduce STATIC_USERMODEHELPER to mediate
call_usermodehelper()"), we filter out such calls when the user mode helper
path is "". However, when used in conjunction with an actual (i.e. non "")
STATIC_USERMODEHELPER, the path is never "", so the real usermode helper
will be called with argv[0] == "".

Let's avoid this by not invoking the release_agent when it is "".

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tejun Heo <tj@kernel.org>
2020-03-04 11:53:33 -05:00
Vasily Averin
db8dd96972 cgroup-v1: cgroup_pidlist_next should update position index
If a seq_file .next function does not change the position index,
a read after an lseek can generate unexpected output.
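
The rule can be sketched with a heavily simplified userspace iterator. This is not the kernel's seq_file API; struct pid_list and the pl_*() functions are hypothetical stand-ins that only show why the "next" step should advance the position even when it has nothing more to return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pidlist; not the kernel's data structure. */
struct pid_list {
	const int *pids;
	size_t len;
};

/* start: return the entry at *pos, or NULL when *pos is past the end. */
static const int *pl_start(const struct pid_list *l, size_t *pos)
{
	return *pos < l->len ? &l->pids[*pos] : NULL;
}

/*
 * next: always advance *pos, even when there is no further entry.
 * If *pos were left unchanged at the end, a later start() at the saved
 * position would hand out the last entry a second time.
 */
static const int *pl_next(const struct pid_list *l, size_t *pos)
{
	++*pos;
	return *pos < l->len ? &l->pids[*pos] : NULL;
}

/* Walk from *pos to the end; returns the number of entries visited. */
static size_t pl_walk(const struct pid_list *l, size_t *pos)
{
	size_t n = 0;
	const int *p;

	assert(l != NULL && pos != NULL);
	for (p = pl_start(l, pos); p != NULL; p = pl_next(l, pos))
		n++;
	return n;
}
```

After a full walk the saved position points past the last entry, so restarting from it yields nothing rather than a repeated line.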

 # mount | grep cgroup
 # dd if=/mnt/cgroup.procs bs=1  # normal output
...
1294
1295
1296
1304
1382
584+0 records in
584+0 records out
584 bytes copied

dd: /mnt/cgroup.procs: cannot skip to specified offset
83  <<< generates end of last line
1383  <<< ... and whole last line once again
0+1 records in
0+1 records out
8 bytes copied

dd: /mnt/cgroup.procs: cannot skip to specified offset
1386  <<< generates last line anyway
0+1 records in
0+1 records out
5 bytes copied

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12 16:53:35 -05:00
Greg Kroah-Hartman
aa601dde64 Merge c9d35ee049 ("Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs") into android-mainline
Tiny steps to deal with merge issues in sdcardfs due to fs param passing
api changes.

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I03ba8763e8cc324c25fb6316c363b59957103474
2020-02-10 08:39:09 -08:00
Al Viro
58c025f0e8 cgroup1: switch to use of errorfc() et.al.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07 14:48:43 -05:00
Al Viro
d7167b1499 fs_parse: fold fs_parameter_desc/fs_parameter_spec
The former contains nothing but a pointer to an array of the latter...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07 14:48:37 -05:00
Eric Sandeen
96cafb9ccb fs_parser: remove fs_parameter_description name field
Unused now.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07 14:48:36 -05:00
Al Viro
fbc2d1686d get rid of cg_invalf()
pointless alias for invalf()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07 14:48:31 -05:00
Dmitry Torokhov
a88f616760 CHROMIUM: cgroups: relax permissions on moving tasks between cgroups
Android expects system_server to be able to move tasks between different
cgroups/cpusets, but does not want to be running as root. Let's relax
permission check so that processes can move other tasks if they have
CAP_SYS_NICE in the affected task's user namespace.

BUG=b:31790445,chromium:647994
Bug: 147109865
TEST=Boot android container, examine logcat

Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/394927
Reviewed-by: Ricky Zhou <rickyz@chromium.org>
[AmitP: Refactored original changes to align with upstream commit
        201af4c0fa ("cgroup: move cgroup files under kernel/cgroup/")]
Change-Id: Ia919c66ab6ed6a6daf7c4cf67feb38b13b1ad09b
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry picked from commit ec54762b84a1d06de188bc846655305d3f7acf75)
2020-01-07 01:56:09 +00:00
Michal Koutný
9a3284fad4 cgroup: Optimize single thread migration
Users who migrate threads between cgroups report a performance drop
after d59cfc09c3 ("sched, cgroup: replace
signal_struct->group_rwsem with a global percpu_rwsem"). The effect is
more pronounced on machines with more CPUs.

The migration is affected by forking noise happening in the background:
after the mentioned commit, a migrating thread must wait for all
(forking) processes on the system, not only those of its threadgroup.

There are several places that need to synchronize with migration:
	a) do_exit,
	b) de_thread,
	c) copy_process,
	d) cgroup_update_dfl_csses,
	e) parallel migration (cgroup_{proc,thread}s_write).

In the case of self-migrating thread, we relax the synchronization on
cgroup_threadgroup_rwsem to avoid the cost of waiting. d) and e) are
excluded with cgroup_mutex, c) does not matter in case of single thread
migration and the executing thread cannot exec(2) or exit(2) while it is
writing into cgroup.threads. In case of do_exit because of signal
delivery, we either exit before the migration or finish the migration
(of not yet PF_EXITING thread) and die afterwards.

This patch handles only the case of self-migration by writing "0" into
cgroup.threads. For simplicity, we always take cgroup_threadgroup_rwsem
with numeric PIDs.
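
From userspace, the self-migration case described above amounts to a thread writing "0" into the target group's cgroup.threads file. A minimal sketch follows; migrate_self_sketch() is a hypothetical helper, and the path passed to it (for example something under a cgroup2 mount) is an assumption that depends on the system.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "0" (meaning "the calling thread") into the given cgroup.threads
 * file. Returns 0 on success and -1 if the file cannot be opened or written.
 */
static int migrate_self_sketch(const char *threads_path)
{
	FILE *f;
	int ok;

	assert(threads_path != NULL);
	f = fopen(threads_path, "w");
	if (f == NULL)
		return -1;
	ok = fputs("0", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Writing a numeric TID instead of "0" would take the ordinary path, which per the text above still acquires cgroup_threadgroup_rwsem.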

This change improves the performance of migration-dependent workloads
to a level similar to the per-signal_struct state.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-10-07 07:11:53 -07:00
Marc Koderer
653a23ca7e Use kvmalloc in cgroups-v1
Instead of using its own logic for k-/vmalloc, rely on
kvmalloc, which does essentially the same thing.

Signed-off-by: Marc Koderer <marc@koderer.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-08-07 11:37:58 -07:00
Thomas Gleixner
457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Roman Gushchin
aade7f9efb cgroup: implement __cgroup_task_count() helper
The helper is identical to the existing cgroup_task_count()
except it doesn't take the css_set_lock by itself, assuming
that the caller does.
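
The split between a helper that expects the lock to be held and a wrapper that takes the lock itself can be sketched in userspace as follows. The names set_lock, nr_tasks, task_count_locked() and task_count() are hypothetical, and a pthread mutex stands in for css_set_lock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the lock and the state it protects. */
static pthread_mutex_t set_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_tasks;

/* Lock-free variant: the caller must already hold set_lock. */
static int task_count_locked(void)
{
	assert(nr_tasks >= 0);
	return nr_tasks;
}

/* Locking wrapper for callers that do not hold set_lock. */
static int task_count(void)
{
	int n;

	pthread_mutex_lock(&set_lock);
	n = task_count_locked();
	pthread_mutex_unlock(&set_lock);
	return n;
}

/* A caller that already holds the lock uses the lock-free helper directly. */
static int add_task_and_count(void)
{
	int n;

	pthread_mutex_lock(&set_lock);
	nr_tasks++;
	n = task_count_locked();
	pthread_mutex_unlock(&set_lock);
	return n;
}
```

Calling task_count() from inside add_task_and_count() would try to take the same non-recursive lock twice, which is the situation the lock-free helper avoids.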

Also, move cgroup_task_count() implementation into
kernel/cgroup/cgroup.c, as there is nothing specific to cgroup v1.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
2019-04-19 11:26:48 -07:00
David Howells
06a2ae56b5 vfs: Add some logging to the core users of the fs_context log
Add some logging to the core users of the fs_context log so that
information can be extracted from them as to the reason for failure.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:38 -05:00
Al Viro
cca8f32714 cgroup: store a reference to cgroup_ns into cgroup_fs_context
... and trim cgroup_do_mount() arguments (renaming it to cgroup_do_get_tree())

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:34 -05:00
Al Viro
6678889f07 cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:33 -05:00
Al Viro
71d883c37e cgroup_do_mount(): massage calling conventions
pass it fs_context instead of fs_type/flags/root triple, have
it return int instead of dentry and make it deal with setting
fc->root.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:33 -05:00
Al Viro
cf6299b1d0 cgroup: stash cgroup_root reference into cgroup_fs_context
Note that this reference is *NOT* contributing to refcount of
cgroup_root in question and is valid only until cgroup_do_mount()
returns.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:32 -05:00
Al Viro
8d2451f499 cgroup1: switch to option-by-option parsing
[dhowells should be the author - it's carved out of his patch]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:31 -05:00
Al Viro
f5dfb5315d cgroup: take options parsing into ->parse_monolithic()
Store the results in cgroup_fs_context.  There's a nasty twist caused
by the enabling/disabling subsystems - we can't do the checks sensitive
to that until cgroup_mutex gets grabbed.  Frankly, these checks are
complete bullshit (e.g. all,none combination is accepted if all subsystems
are disabled; so's cpusets,none and all,cpusets when cpusets is disabled,
etc.), but touching that would be a userland-visible behaviour change ;-/

So we do parsing in ->parse_monolithic() and have the consistency checks
done in check_cgroupfs_options(), with the latter called (on already parsed
options) from cgroup1_get_tree() and cgroup1_reconfigure().

Freeing the strdup'ed strings is done from fs_context destructor, which
somewhat simplifies the life for cgroup1_{get_tree,reconfigure}().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:30 -05:00
Al Viro
7feeef5869 cgroup: fold cgroup1_mount() into cgroup1_get_tree()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:30 -05:00
Al Viro
90129625d9 cgroup: start switching to fs_context
Unfortunately, cgroup is tangled into kernfs infrastructure.
To avoid converting all kernfs-based filesystems at once,
we need to untangle the remount part of things, instead of
having it go through kernfs_sop_remount_fs().  Fortunately,
it's not hard to do.

This commit just gets cgroup/cgroup1 to use fs_context to
deliver options on mount and remount paths.  Parsing those
is going to be done in the next commits; for now we do
pretty much what legacy case does.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:29 -05:00
Al Viro
35ac118424 cgroup: saner refcounting for cgroup_root
* make the reference from superblock to cgroup_root counting -
do cgroup_put() in cgroup_kill_sb() whether we'd done
percpu_ref_kill() or not; matching grab is done when we allocate
a new root.  That gives the same refcounting rules for all callers
of cgroup_do_mount() - a reference to cgroup_root has been grabbed
by caller and it either is transferred to new superblock or dropped.

* have cgroup_kill_sb() treat an already killed refcount as "just
don't bother killing it, then".

* after successful cgroup_do_mount() have cgroup1_mount() recheck
if we'd raced with mount/umount from somebody else and cgroup_root
got killed.  In that case we drop the superblock and bugger off
with -ERESTARTSYS, same as if we'd found it in the list already
dying.

* don't bother with delayed initialization of refcount - it's
unreliable and not needed.  No need to prevent attempts to bump
the refcount if we find cgroup_root of another mount in progress -
sget will reuse an existing superblock just fine and if the
other sb manages to die before we get there, we'll catch
that immediately after cgroup_do_mount().

* don't bother with kernfs_pin_sb() - no need for doing that
either.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-17 11:53:02 -05:00
Tejun Heo
3fc9c12d27 cgroup: Add named hierarchy disabling to cgroup_no_v1 boot param
It can be useful to inhibit all cgroup1 hierarchies especially during
transition and for debugging.  cgroup_no_v1 can block hierarchies with
controllers which leaves out the named hierarchies.  Expand it to
cover the named hierarchies so that "cgroup_no_v1=all,named" disables
all cgroup1 hierarchies.
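
A rough userspace sketch of how a comma-separated value such as "all,named" could be split into flags is shown below. It is not the kernel's parser: struct no_v1_sketch and parse_no_v1_sketch() are hypothetical, and individual controller names are simply ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type: which broad classes were requested. */
struct no_v1_sketch {
	bool all;	/* "all": every controller-backed hierarchy */
	bool named;	/* "named": named hierarchies without controllers */
};

static struct no_v1_sketch parse_no_v1_sketch(const char *arg)
{
	struct no_v1_sketch r = { false, false };
	char buf[128];
	char *tok;

	assert(arg != NULL);
	/* Work on a bounded copy because strtok() modifies its input. */
	strncpy(buf, arg, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
		if (strcmp(tok, "all") == 0)
			r.all = true;
		else if (strcmp(tok, "named") == 0)
			r.named = true;
		/* Anything else would be a controller name; ignored here. */
	}
	return r;
}
```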

Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Marcin Pawlowski <mpawlowski@fb.com>
2018-12-28 10:34:12 -08:00
Steven Rostedt (VMware)
e4f8d81c73 cgroup/tracing: Move taking of spin lock out of trace event handlers
It is unwise to take spin locks from the handlers of trace events,
mainly because doing so can introduce lockups: it takes locks
in places that are normally not tested. Worse yet, because trace events
are tucked away in the include/trace/events/ directory, locks that are
taken there are forgotten about.

As a general rule, I tell people never to take any locks in a trace
event handler.

Several cgroup trace event handlers call cgroup_path() which eventually
takes the kernfs_rename_lock spinlock. This injects the spinlock in the
code without people realizing it. It also can cause issues for the
PREEMPT_RT patch, as the spinlock becomes a mutex, and the trace event
handlers are called with preemption disabled.

By moving the calculation of cgroup_path() out of the trace event
handlers and into a macro (guarded by
trace_cgroup_##type##_enabled()), we can place the cgroup path
into a string and pass that to the trace event. Not only does this
remove the taking of the spinlock from the trace event handler, but
it also means that cgroup_path() only needs to be called once (it
is currently called twice: once to get the length to reserve the
buffer for, and once again to get the path itself).
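
The shape of that change can be sketched in userspace C. Everything here is a hypothetical stand-in: trace_enabled plays the role of the per-event "enabled" test, build_path() plays the role of cgroup_path(), and TRACE_GROUP() is an invented macro name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool trace_enabled;	/* hypothetical "is this event on?" flag */
static int path_calls;		/* counts how often the path is computed */
static char last_event[128];	/* what the handler last recorded */

/* Stand-in for cgroup_path(): imagine this takes a lock internally. */
static void build_path(char *buf, size_t len, int id)
{
	path_calls++;
	snprintf(buf, len, "/group/%d", id);
}

/* The handler receives a ready-made string and takes no locks. */
static void trace_handler(const char *event, const char *path)
{
	snprintf(last_event, sizeof(last_event), "%s:%s", event, path);
}

/* Build the path once, outside the handler, and only when enabled. */
#define TRACE_GROUP(event, id)						\
	do {								\
		if (trace_enabled) {					\
			char path_[64];					\
			build_path(path_, sizeof(path_), (id));		\
			trace_handler((event), path_);			\
		}							\
	} while (0)
```

With the event disabled the path is never computed, and with it enabled the path is computed exactly once per event.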

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-07-11 10:48:47 -07:00
Kees Cook
42bc47b353 treewide: Use array_size() in vmalloc()
The vmalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

        vmalloc(a * b)

with:
        vmalloc(array_size(a, b))

as well as handling cases of:

        vmalloc(a * b * c)

with:

        vmalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

        vmalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.
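
To show why the wrapping matters, here is a userspace sketch of an overflow-checked size computation. It assumes, as I understand the kernel helpers, that the result saturates to SIZE_MAX on overflow so that the allocation then fails. The *_sketch names are hypothetical, and __builtin_mul_overflow() is a GCC/Clang builtin rather than standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating multiply: SIZE_MAX if a * b does not fit in size_t. */
static size_t array_size_sketch(size_t a, size_t b)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Three-factor variant: saturate if either multiplication overflows. */
static size_t array3_size_sketch(size_t a, size_t b, size_t c)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	if (__builtin_mul_overflow(bytes, c, &bytes))
		return SIZE_MAX;
	return bytes;
}
```

A plain `a * b` would silently wrap around and request a buffer that is too small; the saturated value is instead too large for any allocator to satisfy.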

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  vmalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  vmalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  vmalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
  vmalloc(
-	sizeof(TYPE) * (COUNT_ID)
+	array_size(COUNT_ID, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT_ID
+	array_size(COUNT_ID, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * (COUNT_CONST)
+	array_size(COUNT_CONST, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT_CONST
+	array_size(COUNT_CONST, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT_ID)
+	array_size(COUNT_ID, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT_ID
+	array_size(COUNT_ID, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT_CONST)
+	array_size(COUNT_CONST, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT_CONST
+	array_size(COUNT_CONST, sizeof(THING))
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

  vmalloc(
-	SIZE * COUNT
+	array_size(COUNT, SIZE)
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  vmalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  vmalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  vmalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  vmalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  vmalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  vmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  vmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  vmalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  vmalloc(C1 * C2 * C3, ...)
|
  vmalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
  vmalloc(C1 * C2, ...)
|
  vmalloc(
-	E1 * E2
+	array_size(E1, E2)
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Kees Cook
6da2ec5605 treewide: kmalloc() -> kmalloc_array()
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:

        kmalloc(a * b, gfp)

with:
        kmalloc_array(a, b, gfp)

as well as handling cases of:

        kmalloc(a * b * c, gfp)

with:

        kmalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kmalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kmalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().
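
The benefit of a two-factor allocator can be sketched in userspace: the count and the element size are passed separately, so the allocator can reject a product that would overflow. alloc_array_sketch() is a hypothetical analogue built on malloc(), not the kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Refuse the request when n * size would overflow size_t, instead of
 * allocating a buffer that is shorter than the caller expects.
 */
static void *alloc_array_sketch(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}
```

A caller writes alloc_array_sketch(count, sizeof(*p)) rather than malloc(count * sizeof(*p)) and checks for NULL as usual.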

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kmalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kmalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kmalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kmalloc
+ kmalloc_array
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kmalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kmalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kmalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kmalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kmalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kmalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kmalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kmalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kmalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kmalloc(C1 * C2 * C3, ...)
|
  kmalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kmalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kmalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kmalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kmalloc(sizeof(THING) * C2, ...)
|
  kmalloc(sizeof(TYPE) * C2, ...)
|
  kmalloc(C1 * C2 * C3, ...)
|
  kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Christoph Hellwig
3f3942aca6 proc: introduce proc_create_single{,_data}
Variants of proc_create{,_data} that directly take a seq_file show
callback and drastically reduce the boilerplate code in the callers.

All trivial callers converted over.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16 07:23:35 +02:00
Prateek Sood
116d2f7496 cgroup: Fix deadlock in cpu hotplug path
A deadlock occurs during cgroup migration from the cpu hotplug path when
a task T is being moved from the source to the destination cgroup.

kworker/0:0
cpuset_hotplug_workfn()
   cpuset_hotplug_update_tasks()
      hotplug_update_tasks_legacy()
        remove_tasks_in_empty_cpuset()
          cgroup_transfer_tasks() // stuck in iterator loop
            cgroup_migrate()
              cgroup_migrate_add_task()

In cgroup_migrate_add_task() it checks for PF_EXITING flag of task T.
Task T will not migrate to destination cgroup. css_task_iter_start()
will keep pointing to task T in loop waiting for task T cg_list node
to be removed.

Task T
do_exit()
  exit_signals() // sets PF_EXITING
  exit_task_namespaces()
    switch_task_namespaces()
      free_nsproxy()
        put_mnt_ns()
          drop_collected_mounts()
            namespace_unlock()
              synchronize_rcu()
                _synchronize_rcu_expedited()
                  schedule_work() // on cpu0 low priority worker pool
                  wait_event() // waiting for work item to execute

Task T inserted a work item in the worklist of cpu0 low priority
worker pool. It is waiting for expedited grace period work item
to execute. This work item will only be executed once kworker/0:0
completes execution of cpuset_hotplug_workfn().

kworker/0:0 ==> Task T ==>kworker/0:0

In case of PF_EXITING task being migrated from source to destination
cgroup, migrate next available task in source cgroup.
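
The fix can be pictured as a transfer loop that skips exiting tasks instead of waiting on them. The sketch below is a userspace simplification: struct task_sketch and transfer_tasks_sketch() are hypothetical, and the `exiting` field stands in for the PF_EXITING flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_sketch {
	int pid;
	bool exiting;	/* stand-in for PF_EXITING */
	bool migrated;
};

/*
 * Move every non-exiting task and return how many were moved.
 * Exiting tasks are skipped rather than retried, so the loop always
 * makes progress and terminates.
 */
static size_t transfer_tasks_sketch(struct task_sketch *tasks, size_t n)
{
	size_t moved = 0;
	size_t i;

	assert(tasks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (tasks[i].exiting)
			continue;
		tasks[i].migrated = true;
		moved++;
	}
	return moved;
}
```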

Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-12-19 05:38:47 -08:00