26604a533208b7551def7e01df965bc4dda05cfa
229 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
d1e180148e |
BACKPORT: cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock
Bringing up a CPU may involve creating and destroying tasks which requires read-locking threadgroup_rwsem, so threadgroup_rwsem nests inside cpus_read_lock(). However, cpuset's ->attach(), which may be called with thredagroup_rwsem write-locked, also wants to disable CPU hotplug and acquires cpus_read_lock(), leading to a deadlock. Fix it by guaranteeing that ->attach() is always called with CPU hotplug disabled and removing cpus_read_lock() call from cpuset_attach(). Bug: 242685775 Change-Id: Ib14746f8e361eac8a1cfb88ae920488d1155d904 Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-and-tested-by: Imran Khan <imran.f.khan@oracle.com> Reported-and-tested-by: Xuewen Yan <xuewen.yan@unisoc.com> Fixes: 05c7b7a92cc8 ("cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug") Cc: stable@vger.kernel.org # v5.17+ Link: https://lore.kernel.org/lkml/YvrWaml3F+x9Dk+T@slm.duckdns.org/ Link: https://lore.kernel.org/lkml/20220705123705.764-1-xuewen.yan@unisoc.com/ (cherry picked from commit 4f7e7236435ca0abe005c674ebd6892c6e83aeb3 https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-6.0-fixes) Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> |
||
![]() |
ef04c4095d |
UPSTREAM: cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree
cgroup_update_dfl_csses() write-lock the threadgroup_rwsem as updating the csses can trigger process migrations. However, if the subtree doesn't contain any tasks, there aren't gonna be any cgroup migrations. This condition can be trivially detected by testing whether mgctx.preloaded_src_csets is empty. Elide write-locking threadgroup_rwsem if the subtree is empty. After this optimization, the usage pattern of creating a cgroup, enabling the necessary controllers, and then seeding it with CLONE_INTO_CGROUP and then removing the cgroup after it becomes empty doesn't need to write-lock threadgroup_rwsem at all. Bug: 242685775 Change-Id: Ifc96030fc7b0655ecd85ef19c52c9ed97e910ffb Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> (cherry picked from commit 671c11f0619e5ccb380bcf0f062f69ba95fc974a https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git master) Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> |
||
![]() |
9c2a5eef8f |
Merge tag 'android12-5.10.117_r00' into 'android12-5.10'
This is the merge of the upstream LTS release of 5.10.117 into the android12-5.10 branch. It contains the following commits: |
||
![]() |
e8fce59434 |
BACKPORT: FROMGIT: cgroup: Use separate src/dst nodes when preloading css_sets for migration
Each cset (css_set) is pinned by its tasks. When we're moving tasks around
across csets for a migration, we need to hold the source and destination
csets to ensure that they don't go away while we're moving tasks about. This
is done by linking cset->mg_preload_node on either the
mgctx->preloaded_dst_csets or mgctx->preloaded_dst_csets list. Using the
same cset->mg_preload_node for both the src and dst lists was deemed okay as
a cset can't be both the source and destination at the same time.
Unfortunately, this overloading becomes problematic when multiple tasks are
involved in a migration and some of them are identity noop migrations while
others are actually moving across cgroups. For example, this can happen with
the following sequence on cgroup1:
#1> mkdir -p /sys/fs/cgroup/misc/a/b
#2> echo $$ > /sys/fs/cgroup/misc/a/cgroup.procs
#3> RUN_A_COMMAND_WHICH_CREATES_MULTIPLE_THREADS &
#4> PID=$!
#5> echo $PID > /sys/fs/cgroup/misc/a/b/tasks
#6> echo $PID > /sys/fs/cgroup/misc/a/cgroup.procs
the process including the group leader back into a. In this final migration,
non-leader threads would be doing identity migration while the group leader
is doing an actual one.
After #3, let's say the whole process was in cset A, and that after #4, the
leader moves to cset B. Then, during #6, the following happens:
1. cgroup_migrate_add_src() is called on B for the leader.
2. cgroup_migrate_add_src() is called on A for the other threads.
3. cgroup_migrate_prepare_dst() is called. It scans the src list.
3. It notices that B wants to migrate to A, so it tries to A to the dst
list but realizes that its ->mg_preload_node is already busy.
4. and then it notices A wants to migrate to A as it's an identity
migration, it culls it by list_del_init()'ing its ->mg_preload_node and
putting references accordingly.
5. The rest of migration takes place with B on the src list but nothing on
the dst list.
This means that A isn't held while migration is in progress. If all tasks
leave A before the migration finishes and the incoming task pins it, the
cset will be destroyed leading to use-after-free.
This is caused by overloading cset->mg_preload_node for both src and dst
preload lists. We wanted to exclude the cset from the src list but ended up
inadvertently excluding it from the dst list too.
This patch fixes the issue by separating out cset->mg_preload_node into
->mg_src_preload_node and ->mg_dst_preload_node, so that the src and dst
preloadings don't interfere with each other.
Bug: 236582926
Change-Id: Ieaf1c0c8fc23753570897fd6e48a54335ab939ce
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reported-by: shisiyuan <shisiyuan19870131@gmail.com>
Link: http://lkml.kernel.org/r/1654187688-27411-1-git-send-email-shisiyuan@xiaomi.com
Link: https://lore.kernel.org/lkml/Yh+RGIJ0f3nrqIiN@slm.duckdns.org/#t
Fixes:
|
||
![]() |
5dadf6321c |
Merge 5.10.111 into android12-5.10-lts
Changes in 5.10.111 ubifs: Rectify space amount budget for mkdir/tmpfile operations gfs2: Check for active reservation in gfs2_release gfs2: Fix gfs2_release for non-writers regression gfs2: gfs2_setattr_size error path fix rtc: wm8350: Handle error for wm8350_register_irq KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs KVM: x86/emulator: Emulate RDPID only if it is enabled in guest drm: Add orientation quirk for GPD Win Max ath5k: fix OOB in ath5k_eeprom_read_pcal_info_5111 drm/amd/display: Add signal type check when verify stream backends same drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj usb: gadget: tegra-xudc: Do not program SPARAM usb: gadget: tegra-xudc: Fix control endpoint's definitions ptp: replace snprintf with sysfs_emit powerpc: dts: t104xrdb: fix phy type for FMAN 4/5 ath11k: fix kernel panic during unload/load ath11k modules ath11k: mhi: use mhi_sync_power_up() bpf: Make dst_port field in struct bpf_sock 16-bit wide scsi: mvsas: Replace snprintf() with sysfs_emit() scsi: bfa: Replace snprintf() with sysfs_emit() power: supply: axp20x_battery: properly report current when discharging mt76: dma: initialize skip_unmap in mt76_dma_rx_fill cfg80211: don't add non transmitted BSS to 6GHz scanned channels libbpf: Fix build issue with llvm-readelf ipv6: make mc_forwarding atomic powerpc: Set crashkernel offset to mid of RMA region drm/amdgpu: Fix recursive locking warning PCI: aardvark: Fix support for MSI interrupts iommu/arm-smmu-v3: fix event handling soft lockup usb: ehci: add pci device support for Aspeed platforms PCI: endpoint: Fix alignment fault error in copy tests tcp: Don't acquire inet_listen_hashbucket::lock with disabled BH. PCI: pciehp: Add Qualcomm quirk for Command Completed erratum power: supply: axp288-charger: Set Vhold to 4.4V iwlwifi: mvm: Correctly set fragmented EBS ipv4: Invalidate neighbour for broadcast address upon address addition dm ioctl: prevent potential spectre v1 gadget dm: requeue IO if mapping table not yet available drm/amdkfd: make CRAT table missing message informational only scsi: pm8001: Fix pm80xx_pci_mem_copy() interface scsi: pm8001: Fix pm8001_mpi_task_abort_resp() scsi: pm8001: Fix task leak in pm8001_send_abort_all() scsi: pm8001: Fix tag leaks on error scsi: pm8001: Fix memory leak in pm8001_chip_fw_flash_update_req() mt76: mt7615: Fix assigning negative values to unsigned variable scsi: aha152x: Fix aha152x_setup() __setup handler return value scsi: hisi_sas: Free irq vectors in order for v3 HW net/smc: correct settings of RMB window update limit mips: ralink: fix a refcount leak in ill_acc_of_setup() macvtap: advertise link netns via netlink tuntap: add sanity checks about msg_controllen in sendmsg Bluetooth: Fix not checking for valid hdev on bt_dev_{info,warn,err,dbg} Bluetooth: use memset avoid memory leaks bnxt_en: Eliminate unintended link toggle during FW reset PCI: endpoint: Fix misused goto label MIPS: fix fortify panic when copying asm exception handlers powerpc/secvar: fix refcount leak in format_show() scsi: libfc: Fix use after free in fc_exch_abts_resp() can: isotp: set default value for N_As to 50 micro seconds net: account alternate interface name memory net: limit altnames to 64k total net: sfp: add 2500base-X quirk for Lantech SFP module usb: dwc3: omap: fix "unbalanced disables for smps10_out1" on omap5evm xtensa: fix DTC warning unit_address_format MIPS: ingenic: correct unit node address Bluetooth: Fix use after free in hci_send_acl netlabel: fix out-of-bounds memory accesses ceph: fix memory leak in ceph_readdir when note_last_dentry returns error init/main.c: return 1 from handled __setup() functions minix: fix bug when opening a file with O_DIRECT clk: si5341: fix reported clk_rate when output divider is 2 staging: vchiq_core: handle NULL result of find_service_by_handle phy: amlogic: meson8b-usb2: Use dev_err_probe() staging: wfx: fix an error handling in wfx_init_common() w1: w1_therm: fixes w1_seq for ds28ea00 sensors NFSv4.2: fix reference count leaks in _nfs42_proc_copy_notify() NFSv4: Protect the state recovery thread against direct reclaim xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 clk: ti: Preserve node in ti_dt_clocks_register() clk: Enforce that disjoints limits are invalid SUNRPC/call_alloc: async tasks mustn't block waiting for memory SUNRPC/xprt: async tasks mustn't block waiting for memory SUNRPC: remove scheduling boost for "SWAPPER" tasks. NFS: swap IO handling is slightly different for O_DIRECT IO NFS: swap-out must always use STABLE writes. x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy serial: samsung_tty: do not unlock port->lock for uart_write_wakeup() virtio_console: eliminate anonymous module_init & module_exit jfs: prevent NULL deref in diFree SUNRPC: Fix socket waits for write buffer space NFS: nfsiod should not block forever in mempool_alloc() NFS: Avoid writeback threads getting stuck in mempool_alloc() parisc: Fix CPU affinity for Lasi, WAX and Dino chips parisc: Fix patch code locking and flushing mm: fix race between MADV_FREE reclaim and blkdev direct IO read Revert "hv: utils: add PTP_1588_CLOCK to Kconfig to fix build" drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire() Drivers: hv: vmbus: Fix potential crash on module unload Revert "NFSv4: Handle the special Linux file open access mode" NFSv4: fix open failure with O_ACCMODE flag scsi: zorro7xx: Fix a resource leak in zorro7xx_remove_one() net/tls: fix slab-out-of-bounds bug in decrypt_internal ice: Clear default forwarding VSI during VSI release net: ipv4: fix route with nexthop object delete warning net: stmmac: Fix unset max_speed difference between DT and non-DT platforms drm/imx: imx-ldb: Check for null pointer after calling kmemdup drm/imx: Fix memory leak in imx_pd_connector_get_modes bnxt_en: reserve space inside receive page for skb_shared_info sfc: Do not free an empty page_ring RDMA/mlx5: Don't remove cache MRs when a delay is needed IB/rdmavt: add lock to call to rvt_error_qp to prevent a race condition dpaa2-ptp: Fix refcount leak in dpaa2_ptp_probe ice: Set txq_teid to ICE_INVAL_TEID on ring creation ice: Do not skip not enabled queues in ice_vc_dis_qs_msg ipv6: Fix stats accounting in ip6_pkt_drop ice: synchronize_rcu() when terminating rings net: openvswitch: don't send internal clone attribute to the userspace. net: openvswitch: fix leak of nested actions rxrpc: fix a race in rxrpc_exit_net() net: phy: mscc-miim: reject clause 45 register accesses qede: confirm skb is allocated before using spi: bcm-qspi: fix MSPI only access with bcm_qspi_exec_mem_op() bpf: Support dual-stack sockets in bpf_tcp_check_syncookie drbd: Fix five use after free bugs in get_initial_state io_uring: don't touch scm_fp_list after queueing skb SUNRPC: Handle ENOMEM in call_transmit_status() SUNRPC: Handle low memory situations in call_status() SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec() iommu/omap: Fix regression in probe for NULL pointer dereference perf: arm-spe: Fix perf report --mem-mode perf tools: Fix perf's libperf_print callback perf session: Remap buf if there is no space for event arm64: Add part number for Arm Cortex-A78AE Revert "mmc: sdhci-xenon: fix annoying 1.8V regulator warning" mmc: mmci: stm32: correctly check all elements of sg list mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete lz4: fix LZ4_decompress_safe_partial read out of bound mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0) mm/mempolicy: fix mpol_new leak in shared_policy_replace io_uring: fix race between timeout flush and removal x86/pm: Save the MSR validity status at context setup x86/speculation: Restore speculation related MSRs during S3 resume btrfs: fix qgroup reserve overflow the qgroup limit btrfs: prevent subvol with swapfile from being deleted arm64: patch_text: Fixup last cpu should be master RDMA/hfi1: Fix use-after-free bug for mm struct gpio: Restrict usage of GPIO chip irq members before initialization ata: sata_dwc_460ex: Fix crash due to OOB write perf: qcom_l2_pmu: fix an incorrect NULL check on list iterator irqchip/gic-v3: Fix GICR_CTLR.RWP polling drm/amdgpu/smu10: fix SoC/fclk units in auto mode drm/nouveau/pmu: Add missing callbacks for Tegra devices drm/amdkfd: Create file descriptor after client is added to smi_clients list perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13 perf python: Fix probing for some clang command line options tools build: Filter out options and warnings not supported by clang tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts dmaengine: Revert "dmaengine: shdma: Fix runtime PM imbalance on error" ubsan: remove CONFIG_UBSAN_OBJECT_SIZE mm: don't skip swap entry even if zap_details specified cgroup: Use open-time credentials for process migraton perm checks selftests/cgroup: Fix build on older distros selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 selftests: cgroup: Test open-time credential usage for migration checks selftests: cgroup: Test open-time cgroup namespace usage for migration checks arm64: module: remove (NOLOAD) from linker script Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb() irqchip/gic, gic-v3: Prevent GSI to SGI translations mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit Linux 5.10.111 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I9b4c1d30ae226b865494df03d871db2a2b9281c7 |
||
![]() |
4665722d36 |
cgroup: Use open-time credentials for process migraton perm checks
commit 1756d7994ad85c2479af6ae5a9750b92324685af upstream.
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes both cgroup2 and cgroup1 process migration interfaces to
use the credentials saved at the time of open (file->f_cred) instead of
current's.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Fixes:
|
||
![]() |
51790ed529 |
Merge 5.10.109 into android12-5.10-lts
Changes in 5.10.109 nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION net: ipv6: fix skb_over_panic in __ip6_append_data exfat: avoid incorrectly releasing for root inode cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv cgroup: Use open-time cgroup namespace for process migration perm checks cgroup-v1: Correct privileges check in release_agent writes tpm: Fix error handling in async work staging: fbtft: fb_st7789v: reset display before initialization llc: fix netdevice reference leaks in llc_ui_bind() ASoC: sti: Fix deadlock via snd_pcm_stop_xrun() call ALSA: oss: Fix PCM OSS buffer allocation overflow ALSA: usb-audio: add mapping for new Corsair Virtuoso SE ALSA: hda/realtek: Add quirk for Clevo NP70PNJ ALSA: hda/realtek: Add quirk for Clevo NP50PNJ ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671 ALSA: hda/realtek: Add quirk for ASUS GA402 ALSA: pcm: Fix races among concurrent hw_params and hw_free calls ALSA: pcm: Fix races among concurrent read/write and buffer changes ALSA: pcm: Fix races among concurrent prepare and hw_params/hw_free calls ALSA: pcm: Fix races among concurrent prealloc proc writes ALSA: pcm: Add stream lock during PCM reset ioctl operations ALSA: usb-audio: Add mute TLV for playback volumes on RODE NT-USB ALSA: cmipci: Restore aux vol on suspend/resume ALSA: pci: fix reading of swapped values from pcmreg in AC97 codec drivers: net: xgene: Fix regression in CRC stripping netfilter: nf_tables: initialize registers in nft_do_chain() ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board ACPI: battery: Add device HID and quirk for Microsoft Surface Go 3 ACPI: video: Force backlight native for Clevo NL5xRU and NL5xNU crypto: qat - disable registration of algorithms Revert "ath: add support for special 0x0 regulatory domain" rcu: Don't deboost before reporting expedited quiescent state mac80211: fix potential double free on mesh join tpm: use try_get_ops() in tpm-space.c wcn36xx: Differentiate wcn3660 from wcn3620 nds32: fix access_ok() checks in get/put_user llc: only change llc->dev when bind() succeeds Linux 5.10.109 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ifd757f0ec4ba643f7cbaf78aa899d3c159c4b877 |
||
![]() |
824a950c3f |
cgroup: Use open-time cgroup namespace for process migration perm checks
commit e57457641613fef0d147ede8bd6a3047df588b95 upstream.
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's cgroup namespace which is
a potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes cgroup remember the cgroup namespace at the time of open
and uses it for migration permission checks instad of current's. Note that
this only applies to cgroup2 as cgroup1 doesn't have namespace support.
This also fixes a use-after-free bug on cgroupns reported in
https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Note that backporting this fix also requires the preceding patch.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Fixes:
|
||
![]() |
f28364fe38 |
cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
commit 0d2b5955b36250a9428c832664f2079cbf723bec upstream. of->priv is currently used by each interface file implementation to store private information. This patch collects the current two private data usages into struct cgroup_file_ctx which is allocated and freed by the common path. This allows generic private data which applies to multiple files, which will be used to in the following patch. Note that cgroup_procs iterator is now embedded as procs.iter in the new cgroup_file_ctx so that it doesn't need to be allocated and freed separately. v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in cgroup_file_ctx as suggested by Linus. v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too. Converted. Didn't change to embedded allocation as cgroup1 pidlists get stored for caching. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Michal Koutný <mkoutny@suse.com> [mkoutny: v5.10: modify cgroup.pressure handlers, adjust context] Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
26d02dc8ef |
Merge 5.10.97 into android12-5.10-lts
Changes in 5.10.97 PCI: pciehp: Fix infinite loop in IRQ handler upon power fault net: ipa: fix atomic update in ipa_endpoint_replenish() net: ipa: use a bitmap for endpoint replenish_enabled net: ipa: prevent concurrent replenish Revert "drivers: bus: simple-pm-bus: Add support for probing simple bus only devices" KVM: x86: Forcibly leave nested virt when SMM state is toggled psi: Fix uaf issue when psi trigger is destroyed while being polled x86/mce: Add Xeon Sapphire Rapids to list of CPUs that support PPIN x86/cpu: Add Xeon Icelake-D to list of CPUs that support PPIN drm/vc4: hdmi: Make sure the device is powered with CEC cgroup-v1: Require capabilities to set release_agent net/mlx5e: Fix handling of wrong devices during bond netevent net/mlx5: Use del_timer_sync in fw reset flow of halting poll net/mlx5: E-Switch, Fix uninitialized variable modact ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback net: amd-xgbe: ensure to reset the tx_timer_active flag net: amd-xgbe: Fix skb data length underflow fanotify: Fix stale file descriptor in copy_event_to_user() net: sched: fix use-after-free in tc_new_tfilter() rtnetlink: make sure to refresh master_dev/m_ops in __rtnl_newlink() cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask() af_packet: fix data-race in packet_setsockopt / packet_setsockopt tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() Linux 5.10.97 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I428a930b475ba1b15d4b1ad05dde7df36cec6405 |
||
![]() |
d4e4e61d4a |
psi: Fix uaf issue when psi trigger is destroyed while being polled
commit a06247c6804f1a7c86a2e5398a4c1f1db1471848 upstream.
With write operation on psi files replacing old trigger with a new one,
the lifetime of its waitqueue is totally arbitrary. Overwriting an
existing trigger causes its waitqueue to be freed and pending poll()
will stumble on trigger->event_wait which was destroyed.
Fix this by disallowing to redefine an existing psi trigger. If a write
operation is used on a file descriptor with an already existing psi
trigger, the operation will fail with EBUSY error.
Also bypass a check for psi_disabled in the psi_trigger_destroy as the
flag can be flipped after the trigger is created, leading to a memory
leak.
Fixes:
|
||
![]() |
c553d9a246 |
Merge 5.10.80 into android12-5.10-lts
Changes in 5.10.80
xhci: Fix USB 3.1 enumeration issues by increasing roothub power-on-good delay
usb: xhci: Enable runtime-pm by default on AMD Yellow Carp platform
binder: use euid from cred instead of using task
binder: use cred instead of task for selinux checks
binder: use cred instead of task for getsecid
Input: iforce - fix control-message timeout
Input: elantench - fix misreporting trackpoint coordinates
Input: i8042 - Add quirk for Fujitsu Lifebook T725
libata: fix read log timeout value
ocfs2: fix data corruption on truncate
scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()
scsi: qla2xxx: Fix kernel crash when accessing port_speed sysfs file
scsi: qla2xxx: Fix use after free in eh_abort path
mmc: mtk-sd: Add wait dma stop done flow
mmc: dw_mmc: Dont wait for DRTO on Write RSP error
exfat: fix incorrect loading of i_blocks for large files
parisc: Fix set_fixmap() on PA1.x CPUs
parisc: Fix ptrace check on syscall return
tpm: Check for integer overflow in tpm2_map_response_body()
firmware/psci: fix application of sizeof to pointer
crypto: s5p-sss - Add error handling in s5p_aes_probe()
media: rkvdec: Do not override sizeimage for output format
media: ite-cir: IR receiver stop working after receive overflow
media: rkvdec: Support dynamic resolution changes
media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers
media: v4l2-ioctl: Fix check_ext_ctrls
ALSA: hda/realtek: Fix mic mute LED for the HP Spectre x360 14
ALSA: hda/realtek: Add a quirk for HP OMEN 15 mute LED
ALSA: hda/realtek: Add quirk for Clevo PC70HS
ALSA: hda/realtek: Headset fixup for Clevo NH77HJQ
ALSA: hda/realtek: Add a quirk for Acer Spin SP513-54N
ALSA: hda/realtek: Add quirk for ASUS UX550VE
ALSA: hda/realtek: Add quirk for HP EliteBook 840 G7 mute LED
ALSA: ua101: fix division by zero at probe
ALSA: 6fire: fix control and bulk message timeouts
ALSA: line6: fix control and interrupt message timeouts
ALSA: usb-audio: Line6 HX-Stomp XL USB_ID for 48k-fixed quirk
ALSA: usb-audio: Add registration quirk for JBL Quantum 400
ALSA: hda: Free card instance properly at probe errors
ALSA: synth: missing check for possible NULL after the call to kstrdup
ALSA: timer: Fix use-after-free problem
ALSA: timer: Unconditionally unlink slave instances, too
ext4: fix lazy initialization next schedule time computation in more granular unit
ext4: ensure enough credits in ext4_ext_shift_path_extents
ext4: refresh the ext4_ext_path struct after dropping i_data_sem.
fuse: fix page stealing
x86/sme: Use #define USE_EARLY_PGTABLE_L5 in mem_encrypt_identity.c
x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
x86/irq: Ensure PI wakeup handler is unregistered before module unload
ASoC: soc-core: fix null-ptr-deref in snd_soc_del_component_unlocked()
ALSA: hda/realtek: Fixes HP Spectre x360 15-eb1xxx speakers
cavium: Return negative value when pci_alloc_irq_vectors() fails
scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
scsi: qla2xxx: Fix unmap of already freed sgl
mISDN: Fix return values of the probe function
cavium: Fix return values of the probe function
sfc: Export fibre-specific supported link modes
sfc: Don't use netif_info before net_device setup
hyperv/vmbus: include linux/bitops.h
ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
reset: socfpga: add empty driver allowing consumers to probe
mmc: winbond: don't build on M68K
drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
fcnal-test: kill hanging ping/nettest binaries on cleanup
bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
bpf: Prevent increasing bpf_jit_limit above max
gpio: mlxbf2.c: Add check for bgpio_init failure
xen/netfront: stop tx queues during live migration
nvmet-tcp: fix a memory leak when releasing a queue
spi: spl022: fix Microwire full duplex mode
net: multicast: calculate csum of looped-back and forwarded packets
watchdog: Fix OMAP watchdog early handling
drm: panel-orientation-quirks: Add quirk for GPD Win3
block: schedule queue restart after BLK_STS_ZONE_RESOURCE
nvmet-tcp: fix header digest verification
r8169: Add device 10ec:8162 to driver r8169
vmxnet3: do not stop tx queues after netif_device_detach()
nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
net/smc: Fix smc_link->llc_testlink_time overflow
net/smc: Correct spelling mistake to TCPF_SYN_RECV
rds: stop using dmapool
btrfs: clear MISSING device status bit in btrfs_close_one_device
btrfs: fix lost error handling when replaying directory deletes
btrfs: call btrfs_check_rw_degradable only if there is a missing device
KVM: VMX: Unregister posted interrupt wakeup handler on hardware unsetup
ia64: kprobes: Fix to pass correct trampoline address to the handler
selinux: fix race condition when computing ocontext SIDs
hwmon: (pmbus/lm25066) Add offset coefficients
regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled
regulator: dt-bindings: samsung,s5m8767: correct s5m8767,pmic-buck-default-dvs-idx property
EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
mwifiex: fix division by zero in fw download path
ath6kl: fix division by zero in send path
ath6kl: fix control-message timeout
ath10k: fix control-message timeout
ath10k: fix division by zero in send path
PCI: Mark Atheros QCA6174 to avoid bus reset
rtl8187: fix control-message timeouts
evm: mark evm_fixmode as __ro_after_init
ifb: Depend on netfilter alternatively to tc
wcn36xx: Fix HT40 capability for 2Ghz band
wcn36xx: Fix tx_status mechanism
wcn36xx: Fix (QoS) null data frame bitrate/modulation
PM: sleep: Do not let "syscore" devices runtime-suspend during system transitions
mwifiex: Read a PCI register after writing the TX ring write pointer
mwifiex: Try waking the firmware until we get an interrupt
libata: fix checking of DMA state
wcn36xx: handle connection loss indication
rsi: fix occasional initialisation failure with BT coex
rsi: fix key enabled check causing unwanted encryption for vap_id > 0
rsi: fix rate mask set leading to P2P failure
rsi: Fix module dev_oper_mode parameter description
perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
signal: Remove the bogus sigkill_pending in ptrace_stop
memory: renesas-rpc-if: Correct QSPI data transfer in Manual mode
signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
soc: fsl: dpio: replace smp_processor_id with raw_smp_processor_id
soc: fsl: dpio: use the combined functions to protect critical zone
mtd: rawnand: socrates: Keep the driver compatible with on-die ECC engines
power: supply: max17042_battery: Prevent int underflow in set_soc_threshold
power: supply: max17042_battery: use VFSOC for capacity when no rsns
KVM: arm64: Extract ESR_ELx.EC only
KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in use
can: j1939: j1939_tp_cmd_recv(): ignore abort message in the BAM transport
can: j1939: j1939_can_recv(): ignore messages with invalid source address
powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found
ring-buffer: Protect ring_buffer_reset() from reentrancy
serial: core: Fix initializing and restoring termios speed
ifb: fix building without CONFIG_NET_CLS_ACT
ALSA: mixer: oss: Fix racy access to slots
ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume
xen/balloon: add late_initcall_sync() for initial ballooning done
ovl: fix use after free in struct ovl_aio_req
PCI: pci-bridge-emul: Fix emulation of W1C bits
PCI: cadence: Add cdns_plat_pcie_probe() missing return
PCI: aardvark: Do not clear status bits of masked interrupts
PCI: aardvark: Fix checking for link up via LTSSM state
PCI: aardvark: Do not unmask unused interrupts
PCI: aardvark: Fix reporting Data Link Layer Link Active
PCI: aardvark: Fix configuring Reference clock
PCI: aardvark: Fix return value of MSI domain .alloc() method
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge
PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge
quota: check block number when reading the block in quota file
quota: correct error number in free_dqentry()
pinctrl: core: fix possible memory leak in pinctrl_enable()
coresight: cti: Correct the parameter for pm_runtime_put
iio: dac: ad5446: Fix ad5622_write() return value
iio: ad5770r: make devicetree property reading consistent
USB: serial: keyspan: fix memleak on probe errors
serial: 8250: fix racy uartclk update
most: fix control-message timeouts
USB: iowarrior: fix control-message timeouts
USB: chipidea: fix interrupt deadlock
power: supply: max17042_battery: Clear status bits in interrupt handler
dma-buf: WARN on dmabuf release with pending attachments
drm: panel-orientation-quirks: Update the Lenovo Ideapad D330 quirk (v2)
drm: panel-orientation-quirks: Add quirk for KD Kurio Smart C15200 2-in-1
drm: panel-orientation-quirks: Add quirk for the Samsung Galaxy Book 10.6
Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()
Bluetooth: fix use-after-free error in lock_sock_nested()
drm/panel-orientation-quirks: add Valve Steam Deck
rcutorture: Avoid problematic critical section nesting on PREEMPT_RT
platform/x86: wmi: do not fail if disabling fails
MIPS: lantiq: dma: add small delay after reset
MIPS: lantiq: dma: reset correct number of channel
locking/lockdep: Avoid RCU-induced noinstr fail
net: sched: update default qdisc visibility after Tx queue cnt changes
rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
smackfs: Fix use-after-free in netlbl_catmap_walk()
ath11k: Align bss_chan_info structure with firmware
x86: Increase exception stack sizes
mwifiex: Run SET_BSS_MODE when changing from P2P to STATION vif-type
mwifiex: Properly initialize private structure on interface type changes
fscrypt: allow 256-bit master keys with AES-256-XTS
drm/amdgpu: Fix MMIO access page fault
ath11k: Avoid reg rules update during firmware recovery
ath11k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
ath11k: Change DMA_FROM_DEVICE to DMA_TO_DEVICE when map reinjected packets
ath10k: high latency fixes for beacon buffer
media: mt9p031: Fix corrupted frame after restarting stream
media: netup_unidvb: handle interrupt properly according to the firmware
media: atomisp: Fix error handling in probe
media: stm32: Potential NULL pointer dereference in dcmi_irq_thread()
media: uvcvideo: Set capability in s_param
media: uvcvideo: Return -EIO for control errors
media: uvcvideo: Set unique vdev name based in type
media: s5p-mfc: fix possible null-pointer dereference in s5p_mfc_probe()
media: s5p-mfc: Add checking to s5p_mfc_probe().
media: imx: set a media_device bus_info string
media: mceusb: return without resubmitting URB in case of -EPROTO error.
ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
rtw88: fix RX clock gate setting while fifo dump
brcmfmac: Add DMI nvram filename quirk for Cyberbook T116 tablet
media: rcar-csi2: Add checking to rcsi2_start_receiver()
ipmi: Disable some operations during a panic
fs/proc/uptime.c: Fix idle time reporting in /proc/uptime
ACPICA: Avoid evaluating methods too early during system resume
media: ipu3-imgu: imgu_fmt: Handle properly try
media: ipu3-imgu: VIDIOC_QUERYCAP: Fix bus_info
media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte()
net-sysfs: try not to restart the syscall if it will fail eventually
tracefs: Have tracefs directories not set OTH permission bits by default
ath: dfs_pattern_detector: Fix possible null-pointer dereference in channel_detector_create()
mmc: moxart: Fix reference count leaks in moxart_probe
iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
ACPI: battery: Accept charges over the design capacity as full
drm/amdkfd: fix resume error when iommu disabled in Picasso
net: phy: micrel: make *-skew-ps check more lenient
leaking_addresses: Always print a trailing newline
drm/msm: prevent NULL dereference in msm_gpu_crashstate_capture()
block: bump max plugged deferred size from 16 to 32
md: update superblock after changing rdev flags in state_store
memstick: r592: Fix a UAF bug when removing the driver
lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
lib/xz: Validate the value before assigning it to an enum variable
workqueue: make sysfs of unbound kworker cpumask more clever
tracing/cfi: Fix cmp_entries_* functions signature mismatch
mt76: mt7915: fix an off-by-one bound check
mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
block: remove inaccurate requeue check
media: allegro: ignore interrupt if mailbox is not initialized
nvmet: fix use-after-free when a port is removed
nvmet-rdma: fix use-after-free when a port is removed
nvmet-tcp: fix use-after-free when a port is removed
nvme: drop scan_lock and always kick requeue list when removing namespaces
PM: hibernate: Get block device exclusively in swsusp_check()
selftests: kvm: fix mismatched fclose() after popen()
selftests/bpf: Fix perf_buffer test on system with offline cpus
iwlwifi: mvm: disable RX-diversity in powersave
smackfs: use __GFP_NOFAIL for smk_cipso_doi()
ARM: clang: Do not rely on lr register for stacktrace
gre/sit: Don't generate link-local addr if addr_gen_mode is IN6_ADDR_GEN_MODE_NONE
gfs2: Cancel remote delete work asynchronously
gfs2: Fix glock_hash_walk bugs
ARM: 9136/1: ARMv7-M uses BE-8, not BE-32
vrf: run conntrack only in context of lower/physdev for locally generated packets
net: annotate data-race in neigh_output()
ACPI: AC: Quirk GK45 to skip reading _PSR
btrfs: reflink: initialize return value to 0 in btrfs_extent_same()
btrfs: do not take the uuid_mutex in btrfs_rm_device
spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe()
wcn36xx: Correct band/freq reporting on RX
x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
drm/amd/display: dcn20_resource_construct reduce scope of FPU enabled
selftests/core: fix conflicting types compile error for close_range()
parisc: fix warning in flush_tlb_all
task_stack: Fix end_of_stack() for architectures with upwards-growing stack
erofs: don't trigger WARN() when decompression fails
parisc/unwind: fix unwinder when CONFIG_64BIT is enabled
parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
netfilter: conntrack: set on IPS_ASSURED if flows enters internal stream state
selftests/bpf: Fix strobemeta selftest regression
Bluetooth: fix init and cleanup of sco_conn.timeout_work
rcu: Fix existing exp request check in sync_sched_exp_online_cleanup()
MIPS: lantiq: dma: fix burst length for DEU
objtool: Add xen_start_kernel() to noreturn list
x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
objtool: Fix static_call list generation
drm/v3d: fix wait for TMU write combiner flush
virtio-gpu: fix possible memory allocation failure
lockdep: Let lock_is_held_type() detect recursive read as read
net: net_namespace: Fix undefined member in key_remove_domain()
cgroup: Make rebind_subsystems() disable v2 controllers all at once
wcn36xx: Fix Antenna Diversity Switching
wilc1000: fix possible memory leak in cfg_scan_result()
Bluetooth: btmtkuart: fix a memleak in mtk_hci_wmt_sync
crypto: caam - disable pkc for non-E SoCs
rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()
net: dsa: rtl8366rb: Fix off-by-one bug
ath11k: fix some sleeping in atomic bugs
ath11k: Avoid race during regd updates
ath11k: fix packet drops due to incorrect 6 GHz freq value in rx status
ath11k: Fix memory leak in ath11k_qmi_driver_event_work
ath10k: Fix missing frame timestamp for beacon/probe-resp
ath10k: sdio: Add missing BH locking around napi_schdule()
drm/ttm: stop calling tt_swapin in vm_access
arm64: mm: update max_pfn after memory hotplug
drm/amdgpu: fix warning for overflow check
media: em28xx: add missing em28xx_close_extension
media: cxd2880-spi: Fix a null pointer dereference on error handling path
media: dvb-usb: fix ununit-value in az6027_rc_query
media: v4l2-ioctl: S_CTRL output the right value
media: TDA1997x: handle short reads of hdmi info frame.
media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()'
media: radio-wl1273: Avoid card name truncation
media: si470x: Avoid card name truncation
media: tm6000: Avoid card name truncation
media: cx23885: Fix snd_card_free call on null card pointer
kprobes: Do not use local variable when creating debugfs file
crypto: ecc - fix CRYPTO_DEFAULT_RNG dependency
cpuidle: Fix kobject memory leaks in error paths
media: em28xx: Don't use ops->suspend if it is NULL
ath9k: Fix potential interrupt storm on queue reset
PM: EM: Fix inefficient states detection
EDAC/amd64: Handle three rank interleaving mode
rcu: Always inline rcu_dynticks_task*_{enter,exit}()
netfilter: nft_dynset: relax superfluous check on set updates
media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()
crypto: qat - detect PFVF collision after ACK
crypto: qat - disregard spurious PFVF interrupts
hwrng: mtk - Force runtime pm ops for sleep ops
b43legacy: fix a lower bounds test
b43: fix a lower bounds test
gve: Recover from queue stall due to missed IRQ
mmc: sdhci-omap: Fix NULL pointer exception if regulator is not configured
mmc: sdhci-omap: Fix context restore
memstick: avoid out-of-range warning
memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host()
net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE
hwmon: Fix possible memleak in __hwmon_device_register()
hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff
ath10k: fix max antenna gain unit
kernel/sched: Fix sched_fork() access an invalid sched_task_group
tcp: switch orphan_count to bare per-cpu counters
drm/msm: potential error pointer dereference in init()
drm/msm: uninitialized variable in msm_gem_import()
net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
media: ir_toy: assignment to be16 should be of correct type
mmc: mxs-mmc: disable regulator on error and in the remove function
platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning
mt76: mt7615: fix endianness warning in mt7615_mac_write_txwi
mt76: mt76x02: fix endianness warnings in mt76x02_mac.c
mt76: mt7915: fix possible infinite loop release semaphore
mt76: mt7915: fix sta_rec_wtbl tag len
mt76: mt7915: fix muar_idx in mt7915_mcu_alloc_sta_req()
rsi: stop thread firstly in rsi_91x_init() error handling
mwifiex: Send DELBA requests according to spec
net: enetc: unmap DMA in enetc_send_cmd()
phy: micrel: ksz8041nl: do not use power down mode
nvme-rdma: fix error code in nvme_rdma_setup_ctrl
PM: hibernate: fix sparse warnings
clocksource/drivers/timer-ti-dm: Select TIMER_OF
x86/sev: Fix stack type check in vc_switch_off_ist()
drm/msm: Fix potential NULL dereference in DPU SSPP
smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c
KVM: selftests: Fix nested SVM tests when built with clang
bpftool: Avoid leaking the JSON writer prepared for program metadata
libbpf: Fix BTF data layout checks and allow empty BTF
libbpf: Allow loading empty BTFs
libbpf: Fix overflow in BTF sanity checks
libbpf: Fix BTF header parsing checks
s390/gmap: don't unconditionally call pte_unmap_unlock() in __gmap_zap()
KVM: s390: pv: avoid double free of sida page
KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
irq: mips: avoid nested irq_enter()
tpm: fix Atmel TPM crash caused by too frequent queries
tpm_tis_spi: Add missing SPI ID
libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
spi: spi-rpc-if: Check return value of rpcif_sw_init()
samples/kretprobes: Fix return value if register_kretprobe() failed
KVM: s390: Fix handle_sske page fault handling
libertas_tf: Fix possible memory leak in probe and disconnect
libertas: Fix possible memory leak in probe and disconnect
wcn36xx: add proper DMA memory barriers in rx path
wcn36xx: Fix discarded frames due to wrong sequence number
drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
selftests: bpf: Convert sk_lookup ctx access tests to PROG_TEST_RUN
selftests/bpf: Fix fd cleanup in sk_lookup test
net: amd-xgbe: Toggle PLL settings during rate change
net: phylink: avoid mvneta warning when setting pause parameters
crypto: pcrypt - Delay write to padata->info
selftests/bpf: Fix fclose/pclose mismatch in test_progs
udp6: allow SO_MARK ctrl msg to affect routing
ibmvnic: don't stop queue in xmit
ibmvnic: Process crqs after enabling interrupts
cgroup: Fix rootcg cpu.stat guest double counting
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
of: unittest: fix EXPECT text for gpio hog errors
iio: st_sensors: Call st_sensors_power_enable() from bus drivers
iio: st_sensors: disable regulators after device unregistration
RDMA/rxe: Fix wrong port_cap_flags
ARM: dts: BCM5301X: Fix memory nodes names
clk: mvebu: ap-cpu-clk: Fix a memory leak in error handling paths
ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc()
arm64: dts: rockchip: Fix GPU register width for RK3328
ARM: dts: qcom: msm8974: Add xo_board reference clock to DSI0 PHY
RDMA/bnxt_re: Fix query SRQ failure
arm64: dts: ti: k3-j721e-main: Fix "max-virtual-functions" in PCIe EP nodes
arm64: dts: ti: k3-j721e-main: Fix "bus-range" upto 256 bus number for PCIe
arm64: dts: meson-g12a: Fix the pwm regulator supply properties
arm64: dts: meson-g12b: Fix the pwm regulator supply properties
bus: ti-sysc: Fix timekeeping_suspended warning on resume
ARM: dts: at91: tse850: the emac<->phy interface is rmii
scsi: dc395: Fix error case unwinding
MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
JFS: fix memleak in jfs_mount
arm64: dts: qcom: msm8916: Fix Secondary MI2S bit clock
arm64: dts: renesas: beacon: Fix Ethernet PHY mode
arm64: dts: qcom: pm8916: Remove wrong reg-names for rtc@6000
ALSA: hda: Reduce udelay() at SKL+ position reporting
ALSA: hda: Release controller display power during shutdown/reboot
ALSA: hda: Fix hang during shutdown due to link reset
ALSA: hda: Use position buffer for SKL+ again
soundwire: debugfs: use controller id and link_id for debugfs
scsi: pm80xx: Fix misleading log statement in pm8001_mpi_get_nvmd_resp()
driver core: Fix possible memory leak in device_link_add()
arm: dts: omap3-gta04a4: accelerometer irq fix
ASoC: SOF: topology: do not power down primary core during topology removal
soc/tegra: Fix an error handling path in tegra_powergate_power_up()
memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe
clk: at91: check pmc node status before registering syscore ops
video: fbdev: chipsfb: use memset_io() instead of memset()
powerpc: Refactor is_kvm_guest() declaration to new header
powerpc: Rename is_kvm_guest() to check_kvm_guest()
powerpc: Reintroduce is_kvm_guest() as a fast-path check
powerpc: Fix is_kvm_guest() / kvm_para_available()
powerpc: fix unbalanced node refcount in check_kvm_guest()
serial: 8250_dw: Drop wrong use of ACPI_PTR()
usb: gadget: hid: fix error code in do_config()
power: supply: rt5033_battery: Change voltage values to µV
power: supply: max17040: fix null-ptr-deref in max17040_probe()
scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn()
RDMA/mlx4: Return missed an error if device doesn't support steering
usb: musb: select GENERIC_PHY instead of depending on it
staging: most: dim2: do not double-register the same device
staging: ks7010: select CRYPTO_HASH/CRYPTO_MICHAEL_MIC
pinctrl: renesas: checker: Fix off-by-one bug in drive register check
ARM: dts: stm32: Reduce DHCOR SPI NOR frequency to 50 MHz
ARM: dts: stm32: fix SAI sub nodes register range
ARM: dts: stm32: fix AV96 board SAI2 pin muxing on stm32mp15
ASoC: cs42l42: Correct some register default values
ASoC: cs42l42: Defer probe if request_threaded_irq() returns EPROBE_DEFER
soc: qcom: rpmhpd: Provide some missing struct member descriptions
soc: qcom: rpmhpd: Make power_on actually enable the domain
usb: typec: STUSB160X should select REGMAP_I2C
iio: adis: do not disabe IRQs in 'adis_init()'
scsi: ufs: Refactor ufshcd_setup_clocks() to remove skip_ref_clk
scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
serial: imx: fix detach/attach of serial console
usb: dwc2: drd: fix dwc2_force_mode call in dwc2_ovr_init
usb: dwc2: drd: fix dwc2_drd_role_sw_set when clock could be disabled
usb: dwc2: drd: reset current session before setting the new one
firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available()
soc: qcom: apr: Add of_node_put() before return
pinctrl: equilibrium: Fix function addition in multiple groups
phy: qcom-qusb2: Fix a memory leak on probe
phy: ti: gmii-sel: check of_get_address() for failure
phy: qcom-snps: Correct the FSEL_MASK
serial: xilinx_uartps: Fix race condition causing stuck TX
clk: at91: sam9x60-pll: use DIV_ROUND_CLOSEST_ULL
HID: u2fzero: clarify error check and length calculations
HID: u2fzero: properly handle timeouts in usb_submit_urb
powerpc/44x/fsp2: add missing of_node_put
ASoC: cs42l42: Disable regulators if probe fails
ASoC: cs42l42: Use device_property API instead of of_property
ASoC: cs42l42: Correct configuring of switch inversion from ts-inv
virtio_ring: check desc == NULL when using indirect with packed
mips: cm: Convert to bitfield API to fix out-of-bounds access
power: supply: bq27xxx: Fix kernel crash on IRQ handler register error
apparmor: fix error check
rpmsg: Fix rpmsg_create_ept return when RPMSG config is not defined
nfsd: don't alloc under spinlock in rpc_parse_scope_id
i2c: mediatek: fixing the incorrect register offset
NFS: Fix dentry verifier races
pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds
drm/plane-helper: fix uninitialized variable reference
PCI: aardvark: Don't spam about PIO Response Status
PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated bridge
opp: Fix return in _opp_add_static_v2()
NFS: Fix deadlocks in nfs_scan_commit_list()
fs: orangefs: fix error return code of orangefs_revalidate_lookup()
mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare()
PCI: uniphier: Serialize INTx masking/unmasking and fix the bit operation
mtd: core: don't remove debugfs directory if device is in use
remoteproc: Fix a memory leak in an error handling path in 'rproc_handle_vdev()'
rtc: rv3032: fix error handling in rv3032_clkout_set_rate()
dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro
NFS: Fix up commit deadlocks
NFS: Fix an Oops in pnfs_mark_request_commit()
Fix user namespace leak
auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string
auxdisplay: ht16k33: Connect backlight to fbdev
auxdisplay: ht16k33: Fix frame buffer device blanking
soc: fsl: dpaa2-console: free buffer before returning from dpaa2_console_read
netfilter: nfnetlink_queue: fix OOB when mac header was cleared
dmaengine: dmaengine_desc_callback_valid(): Check for `callback_result`
signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)
m68k: set a default value for MEMORY_RESERVE
watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT
ar7: fix kernel builds for compiler test
scsi: qla2xxx: Changes to support FCP2 Target
scsi: qla2xxx: Relogin during fabric disturbance
scsi: qla2xxx: Fix gnl list corruption
scsi: qla2xxx: Turn off target reset during issue_lip
NFSv4: Fix a regression in nfs_set_open_stateid_locked()
i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()'
xen-pciback: Fix return in pm_ctrl_init()
net: davinci_emac: Fix interrupt pacing disable
ethtool: fix ethtool msg len calculation for pause stats
openrisc: fix SMP tlb flush NULL pointer dereference
net: vlan: fix a UAF in vlan_dev_real_dev()
ice: Fix replacing VF hardware MAC to existing MAC filter
ice: Fix not stopping Tx queues for VFs
ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
drm/nouveau/svm: Fix refcount leak bug and missing check against null bug
net: phy: fix duplex out of sync problem while changing settings
bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed
mfd: core: Add missing of_node_put for loop iteration
can: mcp251xfd: mcp251xfd_chip_start(): fix error handling for mcp251xfd_chip_rx_int_enable()
mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
zram: off by one in read_block_state()
perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
llc: fix out-of-bound array index in llc_sk_dev_hash()
nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
bpf, sockmap: Remove unhash handler for BPF sockmap usage
bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
gve: Fix off by one in gve_tx_timeout()
seq_file: fix passing wrong private data
net/sched: sch_taprio: fix undefined behavior in ktime_mono_to_any
net: hns3: fix kernel crash when unload VF while it is being reset
net: hns3: allow configure ETS bandwidth of all TCs
net: stmmac: allow a tc-taprio base-time of zero
vsock: prevent unnecessary refcnt inc for nonblocking connect
net/smc: fix sk_refcnt underflow on linkdown and fallback
cxgb4: fix eeprom len when diagnostics not implemented
selftests/net: udpgso_bench_rx: fix port argument
ARM: 9155/1: fix early early_iounmap()
ARM: 9156/1: drop cc-option fallbacks for architecture selection
parisc: Fix backtrace to always include init funtion names
MIPS: Fix assembly error from MIPSr2 code used within MIPS_ISA_ARCH_LEVEL
x86/mce: Add errata workaround for Skylake SKX37
posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
irqchip/sifive-plic: Fixup EOI failed when masked
f2fs: should use GFP_NOFS for directory inodes
net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE
9p/net: fix missing error check in p9_check_errors
memcg: prohibit unconditional exceeding the limit of dying tasks
powerpc/lib: Add helper to check if offset is within conditional branch range
powerpc/bpf: Validate branch ranges
powerpc/security: Add a helper to query stf_barrier type
powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC
mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks
mm, oom: do not trigger out_of_memory from the #PF
mfd: dln2: Add cell for initializing DLN2 ADC
video: backlight: Drop maximum brightness override for brightness zero
s390/cio: check the subchannel validity for dev_busid
s390/tape: fix timer initialization in tape_std_assign()
s390/ap: Fix hanging ioctl caused by orphaned replies
s390/cio: make ccw_device_dma_* more robust
mtd: rawnand: ams-delta: Keep the driver compatible with on-die ECC engines
mtd: rawnand: xway: Keep the driver compatible with on-die ECC engines
mtd: rawnand: mpc5121: Keep the driver compatible with on-die ECC engines
mtd: rawnand: gpio: Keep the driver compatible with on-die ECC engines
mtd: rawnand: pasemi: Keep the driver compatible with on-die ECC engines
mtd: rawnand: orion: Keep the driver compatible with on-die ECC engines
mtd: rawnand: plat_nand: Keep the driver compatible with on-die ECC engines
mtd: rawnand: au1550nd: Keep the driver compatible with on-die ECC engines
powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
drm/sun4i: Fix macros in sun8i_csc.h
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
PCI: aardvark: Fix PCIe Max Payload Size setting
SUNRPC: Partial revert of commit
|
||
![]() |
ba8ba76885 |
cgroup: Make rebind_subsystems() disable v2 controllers all at once
[ Upstream commit 7ee285395b211cad474b2b989db52666e0430daf ]
It was found that the following warning was displayed when remounting
controllers from cgroup v2 to v1:
[ 8042.997778] WARNING: CPU: 88 PID: 80682 at kernel/cgroup/cgroup.c:3130 cgroup_apply_control_disable+0x158/0x190
:
[ 8043.091109] RIP: 0010:cgroup_apply_control_disable+0x158/0x190
[ 8043.096946] Code: ff f6 45 54 01 74 39 48 8d 7d 10 48 c7 c6 e0 46 5a a4 e8 7b 67 33 00 e9 41 ff ff ff 49 8b 84 24 e8 01 00 00 0f b7 40 08 eb 95 <0f> 0b e9 5f ff ff ff 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3
[ 8043.115692] RSP: 0018:ffffba8a47c23d28 EFLAGS: 00010202
[ 8043.120916] RAX: 0000000000000036 RBX: ffffffffa624ce40 RCX: 000000000000181a
[ 8043.128047] RDX: ffffffffa63c43e0 RSI: ffffffffa63c43e0 RDI: ffff9d7284ee1000
[ 8043.135180] RBP: ffff9d72874c5800 R08: ffffffffa624b090 R09: 0000000000000004
[ 8043.142314] R10: ffffffffa624b080 R11: 0000000000002000 R12: ffff9d7284ee1000
[ 8043.149447] R13: ffff9d7284ee1000 R14: ffffffffa624ce70 R15: ffffffffa6269e20
[ 8043.156576] FS: 00007f7747cff740(0000) GS:ffff9d7a5fc00000(0000) knlGS:0000000000000000
[ 8043.164663] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8043.170409] CR2: 00007f7747e96680 CR3: 0000000887d60001 CR4: 00000000007706e0
[ 8043.177539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8043.184673] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8043.191804] PKRU: 55555554
[ 8043.194517] Call Trace:
[ 8043.196970] rebind_subsystems+0x18c/0x470
[ 8043.201070] cgroup_setup_root+0x16c/0x2f0
[ 8043.205177] cgroup1_root_to_use+0x204/0x2a0
[ 8043.209456] cgroup1_get_tree+0x3e/0x120
[ 8043.213384] vfs_get_tree+0x22/0xb0
[ 8043.216883] do_new_mount+0x176/0x2d0
[ 8043.220550] __x64_sys_mount+0x103/0x140
[ 8043.224474] do_syscall_64+0x38/0x90
[ 8043.228063] entry_SYSCALL_64_after_hwframe+0x44/0xae
It was caused by the fact that rebind_subsystem() disables
controllers to be rebound one by one. If more than one disabled
controllers are originally from the default hierarchy, it means that
cgroup_apply_control_disable() will be called multiple times for the
same default hierarchy. A controller may be killed by css_kill() in
the first round. In the second round, the killed controller may not be
completely dead yet leading to the warning.
To avoid this problem, we collect all the ssid's of controllers that
needed to be disabled from the default hierarchy and then disable them
in one go instead of one by one.
Fixes:
|
||
![]() |
a739489620 |
Merge 5.10.77 into android12-5.10-lts
Changes in 5.10.77 ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned ARM: 9134/1: remove duplicate memcpy() definition ARM: 9138/1: fix link warning with XIP + frame-pointer ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype ARM: 9141/1: only warn about XIP address when not compile testing io_uring: don't take uring_lock during iowq cancel powerpc/bpf: Fix BPF_MOD when imm == 1 arm64: Avoid premature usercopy failure ext4: fix possible UAF when remounting r/o a mmp-protected file system usbnet: sanity check for maxpacket usbnet: fix error return code in usbnet_probe() Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode" pinctrl: amd: disable and mask interrupts on probe ata: sata_mv: Fix the error handling of mv_chip_id() tipc: fix size validations for the MSG_CRYPTO type nfc: port100: fix using -ERRNO as command type mask Revert "net: mdiobus: Fix memory leak in __mdiobus_register" net/tls: Fix flipped sign in tls_err_abort() calls mmc: vub300: fix control-message timeouts mmc: cqhci: clear HALT state after CQE enable mmc: mediatek: Move cqhci init behind ungate clock mmc: dw_mmc: exynos: fix the finding clock sample value mmc: sdhci: Map more voltage level to SDHCI_POWER_330 mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit ocfs2: fix race between searching chunks and release journal_head from buffer_head nvme-tcp: fix H2CData PDU send accounting (again) cfg80211: scan: fix RCU in cfg80211_add_nontrans_list() cfg80211: fix management registrations locking net: lan78xx: fix division by zero in send path mm, thp: bail out early in collapse_file for writeback page drm/ttm: fix memleak in ttm_transfered_destroy drm/amdgpu: fix out of bounds write cgroup: Fix memory leak caused by missing cgroup_bpf_offline riscv, bpf: Fix potential NULL dereference tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function bpf: Fix potential race in tail call compatibility check bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch() IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields IB/hfi1: Fix abba locking issue with sc_disable() nvmet-tcp: fix data digest pointer calculation nvme-tcp: fix data digest pointer calculation nvme-tcp: fix possible req->offset corruption octeontx2-af: Display all enabled PF VF rsrc_alloc entries. RDMA/mlx5: Set user priority for DCT arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node reset: brcmstb-rescal: fix incorrect polarity of status bit regmap: Fix possible double-free in regcache_rbtree_exit() net: batman-adv: fix error handling net-sysfs: initialize uid and gid before calling net_ns_get_ownership cfg80211: correct bridge/4addr mode check net: Prevent infinite while loop in skb_tx_hash() RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string gpio: xgs-iproc: fix parsing of ngpios property nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST mlxsw: pci: Recycle received packet upon allocation failure net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails net: ethernet: microchip: lan743x: Fix dma allocation failure by using dma_set_mask_and_coherent net: nxp: lpc_eth.c: avoid hang when bringing interface down net/tls: Fix flipped sign in async_wait.err assignment phy: phy_ethtool_ksettings_get: Lock the phy for consistency phy: phy_ethtool_ksettings_set: Move after phy_start_aneg phy: phy_start_aneg: Add an unlocked version phy: phy_ethtool_ksettings_set: Lock the PHY while changing settings sctp: use init_tag from inithdr for ABORT chunk sctp: fix the processing for INIT_ACK chunk sctp: fix the processing for COOKIE_ECHO chunk sctp: add vtag check in sctp_sf_violation sctp: add vtag check in sctp_sf_do_8_5_1_E_sa sctp: add vtag check in sctp_sf_ootb lan743x: fix endianness when accessing descriptors KVM: s390: clear kicked_mask before sleeping again KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu scsi: ufs: ufs-exynos: Correct timeout value setting registers riscv: fix misalgned trap vector base address riscv: Fix asan-stack clang build perf script: Check session->header.env.arch before using it Linux 5.10.77 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4cd89af4d20b7a8a1a6d9906233d1aaf026659a8 |
||
![]() |
01599bf7cc |
cgroup: Fix memory leak caused by missing cgroup_bpf_offline
commit 04f8ef5643bcd8bcde25dfdebef998aea480b2ba upstream. When enabling CONFIG_CGROUP_BPF, kmemleak can be observed by running the command as below: $mount -t cgroup -o none,name=foo cgroup cgroup/ $umount cgroup/ unreferenced object 0xc3585c40 (size 64): comm "mount", pid 425, jiffies 4294959825 (age 31.990s) hex dump (first 32 bytes): 01 00 00 80 84 8c 28 c0 00 00 00 00 00 00 00 00 ......(......... 00 00 00 00 00 00 00 00 6c 43 a0 c3 00 00 00 00 ........lC...... backtrace: [<e95a2f9e>] cgroup_bpf_inherit+0x44/0x24c [<1f03679c>] cgroup_setup_root+0x174/0x37c [<ed4b0ac5>] cgroup1_get_tree+0x2c0/0x4a0 [<f85b12fd>] vfs_get_tree+0x24/0x108 [<f55aec5c>] path_mount+0x384/0x988 [<e2d5e9cd>] do_mount+0x64/0x9c [<208c9cfe>] sys_mount+0xfc/0x1f4 [<06dd06e0>] ret_fast_syscall+0x0/0x48 [<a8308cb3>] 0xbeb4daa8 This is because that since the commit |
||
![]() |
b1a6760ddf |
Merge branch 'android12-5.10' into android12-5.10-lts
Sync up with android12-5.10 for the following commits: |
||
![]() |
f41a95eadc |
ANDROID: Export memcg functions to allow module to add new files
Export cgroup_add_legacy_cftypes and a helper function to allow vendor module to expose additional files in the memory cgroup hierarchy. Bug: 192052083 Signed-off-by: Liujie Xie <xieliujie@oppo.com> Change-Id: Ie2b936b3e77c7ab6d740d1bb6d70e03c70a326a7 |
||
![]() |
bc9699030e |
Merge branch 'android12-5.10' into android12-5.10-lts
Sync up with android12-5.10 for the following commits: |
||
![]() |
c7186c2c46 |
FROMGIT: cgroup: make per-cgroup pressure stall tracking configurable
PSI accounts stalls for each cgroup separately and aggregates it at each level of the hierarchy. This causes additional overhead with psi_avgs_work being called for each cgroup in the hierarchy. psi_avgs_work has been highly optimized, however on systems with large number of cgroups the overhead becomes noticeable. Systems which use PSI only at the system level could avoid this overhead if PSI can be configured to skip per-cgroup stall accounting. Add "cgroup_disable=pressure" kernel command-line option to allow requesting system-wide only pressure stall accounting. When set, it keeps system-wide accounting under /proc/pressure/ but skips accounting for individual cgroups and does not expose PSI nodes in cgroup hierarchy. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/patchwork/patch/1435705 (cherry picked from commit 3958e2d0c34e18c41b60dc01832bd670a59ef70f https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git tj) Bug: 178872719 Bug: 191734423 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Ifc8fbc52f9a1131d7c2668edbb44c525c76c3360 |
||
![]() |
82658bfd88 |
Merge 5.10.44 into android12-5.10-lts
Changes in 5.10.44 proc: Track /proc/$pid/attr/ opener mm_struct ASoC: max98088: fix ni clock divider calculation ASoC: amd: fix for pcm_read() error spi: Fix spi device unregister flow spi: spi-zynq-qspi: Fix stack violation bug bpf: Forbid trampoline attach for functions with variable arguments net/nfc/rawsock.c: fix a permission check bug usb: cdns3: Fix runtime PM imbalance on error ASoC: Intel: bytcr_rt5640: Add quirk for the Glavey TM800A550L tablet ASoC: Intel: bytcr_rt5640: Add quirk for the Lenovo Miix 3-830 tablet vfio-ccw: Reset FSM state to IDLE inside FSM vfio-ccw: Serialize FSM IDLE state with I/O completion ASoC: sti-sas: add missing MODULE_DEVICE_TABLE spi: sprd: Add missing MODULE_DEVICE_TABLE usb: chipidea: udc: assign interrupt number to USB gadget structure isdn: mISDN: netjet: Fix crash in nj_probe: bonding: init notify_work earlier to avoid uninitialized use netlink: disable IRQs for netlink_lock_table() net: mdiobus: get rid of a BUG_ON() cgroup: disable controllers at parse time wq: handle VM suspension in stall detection net/qla3xxx: fix schedule while atomic in ql_sem_spinlock RDS tcp loopback connection can hang net:sfc: fix non-freed irq in legacy irq mode scsi: bnx2fc: Return failure if io_req is already in ABTS processing scsi: vmw_pvscsi: Set correct residual data length scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq scsi: target: qla2xxx: Wait for stop_phase1 at WWN removal net: macb: ensure the device is available before accessing GEMGXL control registers net: appletalk: cops: Fix data race in cops_probe1 net: dsa: microchip: enable phy errata workaround on 9567 nvme-fabrics: decode host pathing error for connect MIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER dm verity: fix require_signatures module_param permissions bnx2x: Fix missing error code in bnx2x_iov_init_one() nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME nvmet: fix false keep-alive timeout when a controller is torn down powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c controllers powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers spi: Don't have controller clean up spi device before driver unbind spi: Cleanup on failure of initial setup i2c: mpc: Make use of i2c_recover_bus() i2c: mpc: implement erratum A-004447 workaround ALSA: seq: Fix race of snd_seq_timer_open() ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun() ALSA: hda/realtek: headphone and mic don't work on an Acer laptop ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2 ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8 ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8 ALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8 spi: bcm2835: Fix out-of-bounds access with more than 4 slaves Revert "ACPI: sleep: Put the FACS table after using it" drm: Fix use-after-free read in drm_getunique() drm: Lock pointer access in drm_master_release() perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server KVM: X86: MMU: Use the correct inherited permissions to get shadow page kvm: avoid speculation-based attacks from out-of-range memslot accesses staging: rtl8723bs: Fix uninitialized variables async_xor: check src_offs is not NULL before updating it btrfs: return value from btrfs_mark_extent_written() in case of error btrfs: promote debugging asserts to full-fledged checks in validate_super cgroup1: don't allow '\n' in renaming ftrace: Do not blindly read the ip address in ftrace_bug() mmc: renesas_sdhi: abort tuning when timeout detected mmc: renesas_sdhi: Fix HS400 on R-Car M3-W+ USB: f_ncm: ncm_bitrate (speed) is unsigned usb: f_ncm: only first packet of aggregate needs to start timer usb: pd: Set PD_T_SINK_WAIT_CAP to 310ms usb: dwc3-meson-g12a: fix usb2 PHY glue init when phy0 is disabled usb: dwc3: meson-g12a: Disable the regulator in the error handling path of the probe usb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL usb: dwc3: ep0: fix NULL pointer exception usb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling usb: typec: wcove: Use LE to CPU conversion when accessing msg->header usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe() usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource() usb: gadget: f_fs: Ensure io_completion_wq is idle during unbind USB: serial: ftdi_sio: add NovaTech OrionMX product ID USB: serial: omninet: add device id for Zyxel Omni 56K Plus USB: serial: quatech2: fix control-request directions USB: serial: cp210x: fix alternate function for CP2102N QFN20 usb: gadget: eem: fix wrong eem header operation usb: fix various gadgets null ptr deref on 10gbps cabling. usb: fix various gadget panics on 10gbps cabling usb: typec: tcpm: cancel vdm and state machine hrtimer when unregister tcpm port usb: typec: tcpm: cancel frs hrtimer when unregister tcpm port regulator: core: resolve supply for boot-on/always-on regulators regulator: max77620: Use device_set_of_node_from_dev() regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837 regulator: fan53880: Fix missing n_voltages setting regulator: bd71828: Fix .n_voltages settings regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks phy: usb: Fix misuse of IS_ENABLED usb: dwc3: gadget: Disable gadget IRQ during pullup disable usb: typec: mux: Fix copy-paste mistake in typec_mux_match drm/mcde: Fix off by 10^3 in calculation drm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650 drm/msm/a6xx: update/fix CP_PROTECT initialization drm/msm/a6xx: avoid shadow NULL reference in failure path RDMA/ipoib: Fix warning caused by destroying non-initial netns RDMA/mlx4: Do not map the core_clock page to user space unless enabled ARM: cpuidle: Avoid orphan section warning vmlinux.lds.h: Avoid orphan section with !SMP tools/bootconfig: Fix error return code in apply_xbc() phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe() ASoC: core: Fix Null-point-dereference in fmt_single_name() ASoC: meson: gx-card: fix sound-dai dt schema phy: ti: Fix an error code in wiz_probe() gpio: wcd934x: Fix shift-out-of-bounds error perf: Fix data race between pin_count increment/decrement sched/fair: Keep load_avg and load_sum synced sched/fair: Make sure to update tg contrib for blocked load sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message IB/mlx5: Fix initializing CQ fragments buffer NFS: Fix a potential NULL dereference in nfs_get_client() NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode() perf session: Correct buffer copying when peeking events kvm: fix previous commit for 32-bit builds NFS: Fix use-after-free in nfs4_init_client() NFSv4: Fix second deadlock in nfs4_evict_inode() NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error. scsi: core: Fix error handling of scsi_host_alloc() scsi: core: Fix failure handling of scsi_add_host_with_dma() scsi: core: Put .shost_dev in failure path if host state changes to RUNNING scsi: core: Only put parent device if host state differs from SHOST_CREATED tracing: Correct the length check which causes memory corruption proc: only require mm_struct for writing Linux 5.10.44 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic64172b4e72ccb54d96000b3065dd8b33aa9fef5 |
||
![]() |
5ca472d40e |
cgroup: disable controllers at parse time
[ Upstream commit 45e1ba40837ac2f6f4d4716bddb8d44bd7e4a251 ] This patch effectively reverts the commit |
||
![]() |
2bb462a3af |
ANDROID: cgroup: add vendor hook to cgroup .attach()
Add a vendor hook when a set of tasks change the cgroups in the controller for performance tuning. Bug: 187972571 Signed-off-by: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com> Change-Id: If0ea3d089e077a4650507c69b6e60952d102fa1d |
||
![]() |
29203f8c8f |
ANDROID: cgroup: Add android_rvh_cgroup_force_kthread_migration
In Android GKI, CONFIG_FAIR_GROUP_SCHED is enabled [1] to help
prioritize important work. Given that CPU shares of root cgroup
can't be changed, leaving the tasks inside root cgroup will give
them higher share compared to the other tasks inside important
cgroups. This is mitigated by moving all tasks inside root cgroup to
a different cgroup after Android is booted. However, there are many
kernel tasks stuck in the root cgroup after the boot.
It is possible to relax kernel threads and kworkers migrations under
certain scenarios. However the patch [2] posted at upstream is not
accepted. Hence add a restricted vendor hook to notify modules when a
kernel thread is requested for cgroup migration. The modules can relax
the restrictions forced by the kernel and allow the cgroup migration.
[1]
|
||
![]() |
b129c98dc6 |
Merge 5.10.17 into android12-5.10
Changes in 5.10.17 objtool: Fix seg fault with Clang non-section symbols Revert "dts: phy: add GPIO number and active state used for phy reset" gpio: mxs: GPIO_MXS should not default to y unconditionally gpio: ep93xx: fix BUG_ON port F usage gpio: ep93xx: Fix single irqchip with multi gpiochips tracing: Do not count ftrace events in top level enable output tracing: Check length before giving out the filter buffer drm/i915: Fix overlay frontbuffer tracking arm/xen: Don't probe xenbus as part of an early initcall cgroup: fix psi monitor for root cgroup Revert "drm/amd/display: Update NV1x SR latency values" drm/i915/tgl+: Make sure TypeC FIA is powered up when initializing it drm/dp_mst: Don't report ports connected if nothing is attached to them dmaengine: move channel device_node deletion to driver tmpfs: disallow CONFIG_TMPFS_INODE64 on s390 tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha soc: ti: omap-prm: Fix boot time errors for rst_map_012 bits 0 and 1 arm64: dts: rockchip: Fix PCIe DT properties on rk3399 arm64: dts: qcom: sdm845: Reserve LPASS clocks in gcc ARM: OMAP2+: Fix suspcious RCU usage splats for omap_enter_idle_coupled arm64: dts: rockchip: remove interrupt-names property from rk3399 vdec node platform/x86: hp-wmi: Disable tablet-mode reporting by default arm64: dts: rockchip: Disable display for NanoPi R2S ovl: perform vfs_getxattr() with mounter creds cap: fix conversions on getxattr ovl: skip getxattr of security labels scsi: lpfc: Fix EEH encountering oops with NVMe traffic x86/split_lock: Enable the split lock feature on another Alder Lake CPU nvme-pci: ignore the subsysem NQN on Phison E16 drm/amd/display: Fix DPCD translation for LTTPR AUX_RD_INTERVAL drm/amd/display: Add more Clock Sources to DCN2.1 drm/amd/display: Release DSC before acquiring drm/amd/display: Fix dc_sink kref count in emulated_link_detect drm/amd/display: Free atomic state after drm_atomic_commit drm/amd/display: Decrement refcount of dc_sink before reassignment riscv: virt_addr_valid must check the address belongs to linear mapping bfq-iosched: Revert "bfq: Fix computation of shallow depth" ARM: dts: lpc32xx: Revert set default clock rate of HCLK PLL kallsyms: fix nonconverging kallsyms table with lld ARM: ensure the signal page contains defined contents ARM: kexec: fix oops after TLB are invalidated ubsan: implement __ubsan_handle_alignment_assumption Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs" x86/efi: Remove EFI PGD build time checks lkdtm: don't move ctors to .rodata KVM: x86: cleanup CR3 reserved bits checks cgroup-v1: add disabled controller check in cgroup1_parse_param() dmaengine: idxd: fix misc interrupt completion ath9k: fix build error with LEDS_CLASS=m mt76: dma: fix a possible memory leak in mt76_add_fragment() drm/vc4: hvs: Fix buffer overflow with the dlist handling dmaengine: idxd: check device state before issue command bpf: Unbreak BPF_PROG_TYPE_KPROBE when kprobe is called via do_int3 bpf: Check for integer overflow when using roundup_pow_of_two() netfilter: xt_recent: Fix attempt to update deleted entry selftests: netfilter: fix current year netfilter: nftables: fix possible UAF over chains from packet path in netns netfilter: flowtable: fix tcp and udp header checksum update xen/netback: avoid race in xenvif_rx_ring_slots_available() net: hdlc_x25: Return meaningful error code in x25_open net: ipa: set error code in gsi_channel_setup() hv_netvsc: Reset the RSC count if NVSP_STAT_FAIL in netvsc_receive() net: enetc: initialize the RFS and RSS memories selftests: txtimestamp: fix compilation issue net: stmmac: set TxQ mode back to DCB after disabling CBS ibmvnic: Clear failover_pending if unable to schedule netfilter: conntrack: skip identical origin tuple in same zone only scsi: scsi_debug: Fix a memory leak x86/build: Disable CET instrumentation in the kernel for 32-bit too net: dsa: felix: implement port flushing on .phylink_mac_link_down net: hns3: add a check for queue_id in hclge_reset_vf_queue() net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx() net: hns3: add a check for index in hclge_get_rss_key() firmware_loader: align .builtin_fw to 8 drm/sun4i: tcon: set sync polarity for tcon1 channel drm/sun4i: dw-hdmi: always set clock rate drm/sun4i: Fix H6 HDMI PHY configuration drm/sun4i: dw-hdmi: Fix max. frequency for H6 clk: sunxi-ng: mp: fix parent rate change flag check i2c: stm32f7: fix configuration of the digital filter h8300: fix PREEMPTION build, TI_PRE_COUNT undefined scripts: set proper OpenSSL include dir also for sign-file x86/pci: Create PCI/MSI irqdomain after x86_init.pci.arch_init() arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page rxrpc: Fix clearance of Tx/Rx ring when releasing a call udp: fix skb_copy_and_csum_datagram with odd segment sizes net: dsa: call teardown method on probe failure cpufreq: ACPI: Extend frequency tables to cover boost frequencies cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not there net: gro: do not keep too many GRO packets in napi->rx_list net: fix iteration for sctp transport seq_files net/vmw_vsock: fix NULL pointer dereference net/vmw_vsock: improve locking in vsock_connect_timeout() net: watchdog: hold device global xmit lock during tx disable bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT vsock/virtio: update credit only if socket is not closed vsock: fix locking in vsock_shutdown() net/rds: restrict iovecs length for RDS_CMSG_RDMA_ARGS net/qrtr: restrict user-controlled length in qrtr_tun_write_iter() ovl: expand warning in ovl_d_real() kcov, usb: only collect coverage from __usb_hcd_giveback_urb in softirq Linux 5.10.17 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Id0300681f52b51d3f466f1e66ec3a6c25f65f4d3 |
||
![]() |
e72a65802a |
cgroup: fix psi monitor for root cgroup
commit 385aac1519417b89cb91b77c22e4ca21db563cd0 upstream. Fix NULL pointer dereference when adding new psi monitor to the root cgroup. PSI files for root cgroup was introduced in |
||
![]() |
3975c98baf |
ANDROID: cgroup: Export functions used by modules
Below symbols would be used by vendor modules for tracking tasks in various cgroups 1. cgroup_taskset_first 2. cgroup_taskset_next Bug: 175045928 Change-Id: I6d89148eb4c71174a02a27acab196ff940be9082 Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org> (cherry picked from commit 7333bd73c055e8ebe9862316eaaff691e0eace27) |
||
![]() |
5a920a6503 |
ANDROID: Sched: Export scheduler symbols needed by vendor modules
Need to export internal scheduler symbols to facilitate vendor module with scheduler based value-adds. Bug: 173725277 Change-Id: I021f09097dfc1480abcc998cc8e05e75b2ee828b Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> |
||
![]() |
65026da59c |
cgroup: Zero sized write should be no-op
Do not report failure on zero sized writes, and handle them as no-op. There's issues for example in case of writev() when there's iovec containing zero buffer as a first one. It's expected writev() on below example to successfully perform the write to specified writable cgroup file expecting integer value, and to return 2. For now it's returning value -1, and skipping the write: int writetest(int fd) { const char *buf1 = ""; const char *buf2 = "1\n"; struct iovec iov[2] = { { .iov_base = (void*)buf1, .iov_len = 0 }, { .iov_base = (void*)buf2, .iov_len = 2 } }; return writev(fd, iov, 2); } This patch fixes the issue by checking if there's nothing to write, and handling the write as no-op by just returning 0. Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
95d325185c |
cgroup: remove redundant kernfs_activate in cgroup_setup_root()
This step is already done in rebind_subsystems(). Not necessary to do it again. Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
ad0f75e5f5 |
cgroup: fix cgroup_sk_alloc() for sk_clone_lock()
When we clone a socket in sk_clone_lock(), its sk_cgrp_data is copied, so the cgroup refcnt must be taken too. And, unlike the sk_alloc() path, sock_update_netprioidx() is not called here. Therefore, it is safe and necessary to grab the cgroup refcnt even when cgroup_sk_alloc is disabled. sk_clone_lock() is in BH context anyway, the in_interrupt() would terminate this function if called there. And for sk_alloc() skcd->val is always zero. So it's safe to factor out the code to make it more readable. The global variable 'cgroup_sk_alloc_disabled' is used to determine whether to take these reference counts. It is impossible to make the reference counting correct unless we save this bit of information in skcd->val. So, add a new bit there to record whether the socket has already taken the reference counts. This obviously relies on kmalloc() to align cgroup pointers to at least 4 bytes, ARCH_KMALLOC_MINALIGN is certainly larger than that. This bug seems to be introduced since the beginning, commit |
||
![]() |
4a7e89c5ec |
Merge branch 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: "Just two patches: one to add system-level cpu.stat to the root cgroup for convenience and a trivial comment update" * 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: add cpu.stat file to root cgroup cgroup: Remove stale comments |
||
![]() |
936f2a70f2 |
cgroup: add cpu.stat file to root cgroup
Currently, the root cgroup does not have a cpu.stat file. Add one which is consistent with /proc/stat to capture global cpu statistics that might not fall under cgroup accounting. We haven't done this in the past because the data are already presented in /proc/stat and we didn't want to add overhead from collecting root cgroup stats when cgroups are configured, but no cgroups have been created. By keeping the data consistent with /proc/stat, I think we avoid the first problem, while improving the usability of cgroups stats. We avoid the second problem by computing the contents of cpu.stat from existing data collected for /proc/stat anyway. Signed-off-by: Boris Burkov <boris@bur.io> Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
6b6ebb3474 |
cgroup: Remove stale comments
- The default root is where we can create v2 cgroups. - The __DEVEL__sane_behavior mount option has been removed long long ago. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
f9d041271c |
bpf: Refactor bpf_link update handling
Make bpf_link update support more generic by making it into another bpf_link_ops methods. This allows generic syscall handling code to be agnostic to various conditionally compiled features (e.g., the case of CONFIG_CGROUP_BPF). This also allows to keep link type-specific code to remain static within respective code base. Refactor existing bpf_cgroup_link code and take advantage of this. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200429001614.1544-2-andriin@fb.com |
||
![]() |
d883600523 |
Merge branch 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: - Christian extended clone3 so that processes can be spawned into cgroups directly. This is not only neat in terms of semantics but also avoids grabbing the global cgroup_threadgroup_rwsem for migration. - Daniel added !root xattr support to cgroupfs. Userland already uses xattrs on cgroupfs for bookkeeping. This will allow delegated cgroups to support such usages. - Prateek tried to make cpuset hotplug handling synchronous but that led to possible deadlock scenarios. Reverted. - Other minor changes including release_agent_path handling cleanup. * 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup-v1: Document the cpuset_v2_mode mount option Revert "cpuset: Make cpuset hotplug synchronous" cgroupfs: Support user xattrs kernfs: Add option to enable user xattrs kernfs: Add removed_size out param for simple_xattr_set kernfs: kvmalloc xattr value instead of kmalloc cgroup: Restructure release_agent_path handling selftests/cgroup: add tests for cloning into cgroups clone3: allow spawning processes into cgroups cgroup: add cgroup_may_write() helper cgroup: refactor fork helpers cgroup: add cgroup_get_from_file() helper cgroup: unify attach permission checking cpuset: Make cpuset hotplug synchronous cgroup.c: Use built-in RCU list checking kselftest/cgroup: add cgroup destruction test cgroup: Clean up css_set task traversal |
||
![]() |
8a931f8013 |
mm: memcontrol: recursive memory.low protection
Right now, the effective protection of any given cgroup is capped by its own explicit memory.low setting, regardless of what the parent says. The reasons for this are mostly historical and ease of implementation: to make delegation of memory.low safe, effective protection is the min() of all memory.low up the tree. Unfortunately, this limitation makes it impossible to protect an entire subtree from another without forcing the user to make explicit protection allocations all the way to the leaf cgroups - something that is highly undesirable in real life scenarios. Consider memory in a data center host. At the cgroup top level, we have a distinction between system management software and the actual workload the system is executing. Both branches are further subdivided into individual services, job components etc. We want to protect the workload as a whole from the system management software, but that doesn't mean we want to protect and prioritize individual workload wrt each other. Their memory demand can vary over time, and we'd want the VM to simply cache the hottest data within the workload subtree. Yet, the current memory.low limitations force us to allocate a fixed amount of protection to each workload component in order to get protection from system management software in general. This results in very inefficient resource distribution. Another concern with mandating downward allocation is that, as the complexity of the cgroup tree grows, it gets harder for the lower levels to be informed about decisions made at the host-level. Consider a container inside a namespace that in turn creates its own nested tree of cgroups to run multiple workloads. It'd be extremely difficult to configure memory.low parameters in those leaf cgroups that on one hand balance pressure among siblings as the container desires, while also reflecting the host-level protection from e.g. rpm upgrades, that lie beyond one or more delegation and namespacing points in the tree. It's highly unusual from a cgroup interface POV that nested levels have to be aware of and reflect decisions made at higher levels for them to be effective. To enable such use cases and scale configurability for complex trees, this patch implements a resource inheritance model for memory that is similar to how the CPU and the IO controller implement work-conserving resource allocations: a share of a resource allocated to a subree always applies to the entire subtree recursively, while allowing, but not mandating, children to further specify distribution rules. That means that if protection is explicitly allocated among siblings, those configured shares are being followed during page reclaim just like they are now. However, if the memory.low set at a higher level is not fully claimed by the children in that subtree, the "floating" remainder is applied to each cgroup in the tree in proportion to its size. Since reclaim pressure is applied in proportion to size as well, each child in that tree gets the same boost, and the effect is neutral among siblings - with respect to each other, they behave as if no memory control was enabled at all, and the VM simply balances the memory demands optimally within the subtree. But collectively those cgroups enjoy a boost over the cgroups in neighboring trees. E.g. a leaf cgroup with a memory.low setting of 0 no longer means that it's not getting a share of the hierarchically assigned resource, just that it doesn't claim a fixed amount of it to protect from its siblings. This allows us to recursively protect one subtree (workload) from another (system management), while letting subgroups compete freely among each other - without having to assign fixed shares to each leaf, and without nested groups having to echo higher-level settings. The floating protection composes naturally with fixed protection. Consider the following example tree: A A: low = 2G / \ A1: low = 1G A1 A2 A2: low = 0G As outside pressure is applied to this tree, A1 will enjoy a fixed protection from A2 of 1G, but the remaining, unclaimed 1G from A is split evenly among A1 and A2, coming out to 1.5G and 0.5G. There is a slight risk of regressing theoretical setups where the top-level cgroups don't know about the true budgeting and set bogusly high "bypass" values that are meaningfully allocated down the tree. Such setups would rely on unclaimed protection to be discarded, and distributing it would change the intended behavior. Be safe and hide the new behavior behind a mount option, 'memory_recursiveprot'. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Chris Down <chris@chrisdown.name> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Link: http://lkml.kernel.org/r/20200227195606.46212-4-hannes@cmpxchg.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
![]() |
0c991ebc8c |
bpf: Implement bpf_prog replacement for an active bpf_cgroup_link
Add new operation (LINK_UPDATE), which allows to replace active bpf_prog from under given bpf_link. Currently this is only supported for bpf_cgroup_link, but will be extended to other kinds of bpf_links in follow-up patches. For bpf_cgroup_link, implemented functionality matches existing semantics for direct bpf_prog attachment (including BPF_F_REPLACE flag). User can either unconditionally set new bpf_prog regardless of which bpf_prog is currently active under given bpf_link, or, optionally, can specify expected active bpf_prog. If active bpf_prog doesn't match expected one, no changes are performed, old bpf_link stays intact and attached, operation returns a failure. cgroup_bpf_replace() operation is resolving race between auto-detachment and bpf_prog update in the same fashion as it's done for bpf_link detachment, except in this case update has no way of succeeding because of target cgroup marked as dying. So in this case error is returned. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330030001.2312810-3-andriin@fb.com |
||
![]() |
af6eea5743 |
bpf: Implement bpf_link-based cgroup BPF program attachment
Implement new sub-command to attach cgroup BPF programs and return FD-based bpf_link back on success. bpf_link, once attached to cgroup, cannot be replaced, except by owner having its FD. Cgroup bpf_link supports only BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI attachments can be freely intermixed. To prevent bpf_cgroup_link from keeping cgroup alive past the point when no BPF program can be executed, implement auto-detachment of link. When cgroup_bpf_release() is called, all attached bpf_links are forced to release cgroup refcounts, but they leave bpf_link otherwise active and allocated, as well as still owning underlying bpf_prog. This is because user-space might still have FDs open and active, so bpf_link as a user-referenced object can't be freed yet. Once last active FD is closed, bpf_link will be freed and underlying bpf_prog refcount will be dropped. But cgroup refcount won't be touched, because cgroup is released already. The inherent race between bpf_cgroup_link release (from closing last FD) and cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So the only additional check required is when bpf_cgroup_link attempts to detach itself from cgroup. At that time we need to check whether there is still cgroup associated with that link. And if not, exit with success, because bpf_cgroup_link was already successfully detached. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com |
||
![]() |
38aca3071c |
cgroupfs: Support user xattrs
This patch turns on xattr support for cgroupfs. This is useful for letting non-root owners of delegated subtrees attach metadata to cgroups. One use case is for subtree owners to tell a userspace out of memory killer to bias away from killing specific subtrees. Tests: [/sys/fs/cgroup]# for i in $(seq 0 130); \ do setfattr workload.slice -n user.name$i -v wow; done setfattr: workload.slice: No space left on device setfattr: workload.slice: No space left on device setfattr: workload.slice: No space left on device [/sys/fs/cgroup]# for i in $(seq 0 130); \ do setfattr workload.slice --remove user.name$i; done setfattr: workload.slice: No such attribute setfattr: workload.slice: No such attribute setfattr: workload.slice: No such attribute [/sys/fs/cgroup]# for i in $(seq 0 130); \ do setfattr workload.slice -n user.name$i -v wow; done setfattr: workload.slice: No space left on device setfattr: workload.slice: No space left on device setfattr: workload.slice: No space left on device `seq 0 130` is inclusive, and 131 - 128 = 3, which is the number of errors we expect to see. [/data]# cat testxattr.c #include <sys/types.h> #include <sys/xattr.h> #include <stdio.h> #include <stdlib.h> int main() { char name[256]; char *buf = malloc(64 << 10); if (!buf) { perror("malloc"); return 1; } for (int i = 0; i < 4; ++i) { snprintf(name, 256, "user.bigone%d", i); if (setxattr("/sys/fs/cgroup/system.slice", name, buf, 64 << 10, 0)) { printf("setxattr failed on iteration=%d\n", i); return 1; } } return 0; } [/data]# ./a.out setxattr failed on iteration=2 [/data]# ./a.out setxattr failed on iteration=0 [/sys/fs/cgroup]# setfattr -x user.bigone0 system.slice/ [/sys/fs/cgroup]# setfattr -x user.bigone1 system.slice/ [/data]# ./a.out setxattr failed on iteration=2 Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Acked-by: Chris Down <chris@chrisdown.name> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
1b51f69461 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller: "It looks like a decent sized set of fixes, but a lot of these are one liner off-by-one and similar type changes: 1) Fix netlink header pointer to calcular bad attribute offset reported to user. From Pablo Neira Ayuso. 2) Don't double clear PHY interrupts when ->did_interrupt is set, from Heiner Kallweit. 3) Add missing validation of various (devlink, nl802154, fib, etc.) attributes, from Jakub Kicinski. 4) Missing *pos increments in various netfilter seq_next ops, from Vasily Averin. 5) Missing break in of_mdiobus_register() loop, from Dajun Jin. 6) Don't double bump tx_dropped in veth driver, from Jiang Lidong. 7) Work around FMAN erratum A050385, from Madalin Bucur. 8) Make sure ARP header is pulled early enough in bonding driver, from Eric Dumazet. 9) Do a cond_resched() during multicast processing of ipvlan and macvlan, from Mahesh Bandewar. 10) Don't attach cgroups to unrelated sockets when in interrupt context, from Shakeel Butt. 11) Fix tpacket ring state management when encountering unknown GSO types. From Willem de Bruijn. 12) Fix MDIO bus PHY resume by checking mdio_bus_phy_may_suspend() only in the suspend context. From Heiner Kallweit" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits) net: systemport: fix index check to avoid an array out of bounds access tc-testing: add ETS scheduler to tdc build configuration net: phy: fix MDIO bus PM PHY resuming net: hns3: clear port base VLAN when unload PF net: hns3: fix RMW issue for VLAN filter switch net: hns3: fix VF VLAN table entries inconsistent issue net: hns3: fix "tc qdisc del" failed issue taprio: Fix sending packets without dequeueing them net: mvmdio: avoid error message for optional IRQ net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register net: memcg: fix lockdep splat in inet_csk_accept() s390/qeth: implement smarter resizing of the RX buffer pool s390/qeth: refactor buffer pool code s390/qeth: use page pointers to manage RX buffer pool seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed net/packet: tpacket_rcv: do not increment ring index on drop sxgbe: Fix off by one in samsung driver strncpy size arg net: caif: Add lockdep expression to RCU traversal primitive MAINTAINERS: remove Sathya Perla as Emulex NIC maintainer ... |
||
![]() |
a09833f7cd | Merge branch 'for-5.6-fixes' into for-5.7 | ||
![]() |
e876ecc67d |
cgroup: memcg: net: do not associate sock with unrelated cgroup
We are testing network memory accounting in our setup and noticed inconsistent network memory usage and often unrelated cgroups network usage correlates with testing workload. On further inspection, it seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in irq context specially for cgroup v1. mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context and kind of assumes that this can only happen from sk_clone_lock() and the source sock object has already associated cgroup. However in cgroup v1, where network memory accounting is opt-in, the source sock can be unassociated with any cgroup and the new cloned sock can get associated with unrelated interrupted cgroup. Cgroup v2 can also suffer if the source sock object was created by process in the root cgroup or if sk_alloc() is called in irq context. The fix is to just do nothing in interrupt. WARNING: Please note that about half of the TCP sockets are allocated from the IRQ context, so, memory used by such sockets will not be accouted by the memcg. The stack trace of mem_cgroup_sk_alloc() from IRQ-context: CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1 Hardware name: ... Call Trace: <IRQ> dump_stack+0x57/0x75 mem_cgroup_sk_alloc+0xe9/0xf0 sk_clone_lock+0x2a7/0x420 inet_csk_clone_lock+0x1b/0x110 tcp_create_openreq_child+0x23/0x3b0 tcp_v6_syn_recv_sock+0x88/0x730 tcp_check_req+0x429/0x560 tcp_v6_rcv+0x72d/0xa40 ip6_protocol_deliver_rcu+0xc9/0x400 ip6_input+0x44/0xd0 ? ip6_protocol_deliver_rcu+0x400/0x400 ip6_rcv_finish+0x71/0x80 ipv6_rcv+0x5b/0xe0 ? ip6_sublist_rcv+0x2e0/0x2e0 process_backlog+0x108/0x1e0 net_rx_action+0x26b/0x460 __do_softirq+0x104/0x2a6 do_softirq_own_stack+0x2a/0x40 </IRQ> do_softirq.part.19+0x40/0x50 __local_bh_enable_ip+0x51/0x60 ip6_finish_output2+0x23d/0x520 ? ip6table_mangle_hook+0x55/0x160 __ip6_finish_output+0xa1/0x100 ip6_finish_output+0x30/0xd0 ip6_output+0x73/0x120 ? __ip6_finish_output+0x100/0x100 ip6_xmit+0x2e3/0x600 ? ipv6_anycast_cleanup+0x50/0x50 ? inet6_csk_route_socket+0x136/0x1e0 ? skb_free_head+0x1e/0x30 inet6_csk_xmit+0x95/0xf0 __tcp_transmit_skb+0x5b4/0xb20 __tcp_send_ack.part.60+0xa3/0x110 tcp_send_ack+0x1d/0x20 tcp_rcv_state_process+0xe64/0xe80 ? tcp_v6_connect+0x5d1/0x5f0 tcp_v6_do_rcv+0x1b1/0x3f0 ? tcp_v6_do_rcv+0x1b1/0x3f0 __release_sock+0x7f/0xd0 release_sock+0x30/0xa0 __inet_stream_connect+0x1c3/0x3b0 ? prepare_to_wait+0xb0/0xb0 inet_stream_connect+0x3b/0x60 __sys_connect+0x101/0x120 ? __sys_getsockopt+0x11b/0x140 __x64_sys_connect+0x1a/0x20 do_syscall_64+0x51/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The stack trace of mem_cgroup_sk_alloc() from IRQ-context: Fixes: |
||
![]() |
190ecb190a |
cgroup: fix psi_show() crash on 32bit ino archs
Similar to the commit |
||
![]() |
ef2c41cf38 |
clone3: allow spawning processes into cgroups
This adds support for creating a process in a different cgroup than its parent. Callers can limit and account processes and threads right from the moment they are spawned: - A service manager can directly spawn new services into dedicated cgroups. - A process can be directly created in a frozen cgroup and will be frozen as well. - The initial accounting jitter experienced by process supervisors and daemons is eliminated with this. - Threaded applications or even thread implementations can choose to create a specific cgroup layout where each thread is spawned directly into a dedicated cgroup. This feature is limited to the unified hierarchy. Callers need to pass a directory file descriptor for the target cgroup. The caller can choose to pass an O_PATH file descriptor. All usual migration restrictions apply, i.e. there can be no processes in inner nodes. In general, creating a process directly in a target cgroup adheres to all migration restrictions. One of the biggest advantages of this feature is that CLONE_INTO_GROUP does not need to grab the write side of the cgroup cgroup_threadgroup_rwsem. This global lock makes moving tasks/threads around super expensive. With clone3() this lock is avoided. Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: cgroups@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
f3553220d4 |
cgroup: add cgroup_may_write() helper
Add a cgroup_may_write() helper which we can use in the CLONE_INTO_CGROUP patch series to verify that we can write to the destination cgroup. Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
5a5cf5cb30 |
cgroup: refactor fork helpers
This refactors the fork helpers so they can be easily modified in the next patches. The patch just moves the cgroup threadgroup rwsem grab and release into the helpers. They don't need to be directly exposed in fork.c. Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
17703097f3 |
cgroup: add cgroup_get_from_file() helper
Add a helper cgroup_get_from_file(). The helper will be used in subsequent patches to retrieve a cgroup while holding a reference to the struct file it was taken from. Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
6df970e4f5 |
cgroup: unify attach permission checking
The core codepaths to check whether a process can be attached to a cgroup are the same for threads and thread-group leaders. Only a small piece of code verifying that source and destination cgroup are in the same domain differentiates the thread permission checking from thread-group leader permission checking. Since cgroup_migrate_vet_dst() only matters cgroup2 - it is a noop on cgroup1 - we can move it out of cgroup_attach_task(). All checks can now be consolidated into a new helper cgroup_attach_permissions() callable from both cgroup_procs_write() and cgroup_threads_write(). Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: cgroups@vger.kernel.org Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
3010c5b9f5 |
cgroup.c: Use built-in RCU list checking
list_for_each_entry_rcu has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Even though the function css_next_child() already checks if cgroup_mutex or rcu_read_lock() is held using cgroup_assert_mutex_or_rcu_locked(), there is a need to pass cond to list_for_each_entry_rcu() to avoid false positive lockdep warning. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
f43caa2adc |
cgroup: Clean up css_set task traversal
css_task_iter stores pointer to head of each iterable list, this dates
back to commit
|